Tom Honermann wrote:
This keeps neglecting the basic fact that there are implementations and ecosystems that cannot adopt what you are suggesting. Not now, not in the near term, probably never.

Would you please give one concrete example of such an implementation or an ecosystem, and how translating Unicode literals to the _ordinary_ literal encoding on stream insertion would be a problem there?

Any EBCDIC-based system like z/OS.
C++ code can't distinguish between literals and non-literals
(except for UDLs, but that is irrelevant here), but I don't think
you intended to constrain the question to Unicode literals.
UTF-8 solves problems with mojibake. It does not solve problems
with translations. Let's go back to a variation of an example I
gave earlier that uses a hypothetical message catalog similar to
GNU gettext() to provide translations of strings in UTF-8 in
char8_t.

    std::cout << u8msg("In the month of ") << std::chrono::August << "\n";

Say the ordinary literal encoding is IBM-1047. Translation to the ordinary literal encoding will limit the output to characters representable in that encoding; any other characters would presumably be replaced with substitution characters. If the program is run in an IBM-1047 environment, there is no problem. Now run that program in an environment with a Japanese locale using code page 954 (euc-jp). The message catalog lookup would produce a UTF-8 string that probably uses characters not in IBM-1047. Conversion to code page 954 will likely preserve those characters while conversion to IBM-1047 definitely would not.

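The repertoire argument can be sketched in a few lines of C++. This is only an illustration under loud assumptions, not a real transcoder: IBM-1047 is modelled as "exactly the ISO 8859-1 repertoire", euc-jp is approximated by a few Unicode ranges rather than its actual tables, and the names decode_utf8, in_ibm1047, in_eucjp_approx and lost are all hypothetical helpers invented for this sketch. The UTF-8 decoder assumes well-formed input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Decode UTF-8 into code points. Illustration only: assumes well-formed
// input and performs no validation.
std::vector<char32_t> decode_utf8(const std::string& s) {
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        // Number of continuation bytes implied by the lead byte.
        int extra = b < 0x80 ? 0 : b < 0xE0 ? 1 : b < 0xF0 ? 2 : 3;
        char32_t cp = extra == 0 ? b : b & (0x3F >> extra);
        for (int k = 1; k <= extra; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// Assumption: IBM-1047 is treated as covering U+0000..U+00FF only.
bool in_ibm1047(char32_t cp) { return cp <= 0xFF; }

// Assumption: a rough stand-in for euc-jp (ASCII, kana, and the main
// CJK ideograph block), not the real code page 954 tables.
bool in_eucjp_approx(char32_t cp) {
    return cp < 0x80 || (cp >= 0x3040 && cp <= 0x30FF) ||
           (cp >= 0x4E00 && cp <= 0x9FFF);
}

// Count the code points a conversion would have to replace with
// substitution characters in the given target repertoire.
template <class Pred>
std::size_t lost(const std::string& utf8, Pred representable) {
    std::size_t n = 0;
    for (char32_t cp : decode_utf8(utf8))
        if (!representable(cp)) ++n;
    return n;
}
```

Taking the Japanese month name for August, the digit 8 followed by U+6708, as the bytes "8\xE6\x9C\x88": lost(msg, in_ibm1047) reports one code point that would be substituted, while lost(msg, in_eucjp_approx) reports none. The untranslated "In the month of " loses nothing in either model, which is why the problem only shows up once the message catalog returns a translation.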
Tom.