SG16 will hold a telecon on Wednesday, May 26th at 19:30 UTC (timezone conversion).

The agenda is:

Since we did not get to discuss P2295R3 at our last telecon, it will again retain the top spot on the agenda followed by P2093R6.  Thus, the agenda looks much the same as for the last telecon (I dropped P2348R0 for now; we won't realistically get to it).

With regard to P2093R6, the current status is unchanged: LEWG has referred the paper back to SG16 for further discussion; please see the LEWG meeting minutes here.  Specifically, LEWG would benefit from additional analysis of previously deferred questions regarding character encoding concerns, transcoding requirements (or the lack thereof), and the ensuing consequences (or lack thereof).

  1. How errors in transcoding should be handled.  E.g., when transcoding from UTF-8 to a UTF-16 based console interface and the UTF-8 input is not well-formed.
  2. The choice to base behavior on the compile-time choice of literal encoding.  An implication of the current proposal is that a program that contains only ASCII characters in string literals will change behavior depending on whether the literal encoding is UTF-8 or ASCII (or some other ASCII-derived encoding).
  3. Whether transcoding to the console interface encoding should be performed when the literal encoding is not UTF-8.
  4. What the implications are for future support of std::print("{} {} {} {}", L"Wide text", u8"UTF-8 text", u"UTF-16 text", U"UTF-32 text").
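
To make question 1 concrete, here is a minimal sketch of one possible error-handling policy: transcode UTF-8 to UTF-16 and substitute U+FFFD for each byte that does not begin a well-formed sequence.  The function name utf8_to_utf16_lossy is hypothetical and is not taken from P2093R6.  Replacing per offending byte is an assumption made for illustration; other policies (one replacement per maximal subpart, throwing, or passing the bytes through untouched) are equally in scope for the discussion.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper for illustration only; not part of P2093R6.
// Transcodes UTF-8 to UTF-16, emitting U+FFFD for each byte that does not
// begin a well-formed UTF-8 sequence.
std::u16string utf8_to_utf16_lossy(std::string_view in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::size_t len = 0;  // expected sequence length; 0 means invalid lead byte
        char32_t cp = 0;      // code point being assembled
        char32_t min = 0;     // smallest code point allowed for this length
        if (b0 < 0x80)                { len = 1; cp = b0;        min = 0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }

        // The sequence must fit in the input and use only continuation bytes.
        bool ok = len != 0 && i + len <= in.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (b & 0x3F);
        }
        // Reject overlong forms, surrogates, and values above U+10FFFF.
        if (ok && (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)))
            ok = false;

        if (!ok) {
            out.push_back(u'\uFFFD');  // replace the offending byte and resync
            ++i;
            continue;
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // encode as a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

With this policy, the well-formed UTF-8 bytes E4 B8 8B E5 8D 88 transcode to U+4E0B U+5348, while a stray byte such as 0xA4 followed by 0x55 becomes U+FFFD followed by 'U'.  Whether that silent substitution is acceptable for a console write is exactly the question.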

At our last telecon, we focused on how to handle ill-formed inputs, but did not much discuss how such inputs arise.  Now that LWG3547 has been effectively (though not officially) resolved by P2372R1, we have a concrete example of how the std::print() facility itself can produce ill-formed input (assuming that std::print() transcodes all inputs using the same encoding).  I would like to start with this example as I think it is fundamental to how we choose to answer the above questions.

std::print("{:L%p}\n", std::chrono::system_clock::now().time_since_epoch());

At issue is the encoding used by chrono formatters specified with the L option to request a locale-specific form.  The example above contains the %p specifier with the L option.  In a Chinese locale the desired translation of "PM" is "下午", but the locale will provide the translation in the locale encoding.  As specified in P2093R6, if the literal encoding is UTF-8, then std::print() will expect the translation to be provided in UTF-8, but if the locale is not UTF-8-based (e.g., Big5; perhaps Shift-JIS for the Japanese 午後 translation), then the result is mojibake.

These are possible directions we can investigate to resolve the encoding concerns.

Tom.