Re: [SG16] Agenda for the 2021-04-28 SG16 telecon

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 26 Apr 2021 12:18:24 -0400
On 4/19/21 10:58 AM, Tom Honermann via SG16 wrote:
> SG16 will hold a telecon on Wednesday, April 28th at 19:30 UTC
> (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20210428T193000&p1=1440&p2=tz_pdt&p3=tz_mdt&p4=tz_cdt&p5=tz_edt&p6=tz_cest>).
> The agenda is:
> * P2093R5: Formatted output <https://wg21.link/p2093r5>
> * P2348R0: Whitespaces Wording Revamp
> <https://isocpp.org/files/papers/P2348R0.pdf>
> LEWG discussed P2093R5 at their 2021-04-06 telecon and decided to
> refer the paper back to SG16 for further discussion. LEWG meeting
> minutes are available here
> <https://wiki.edg.com/bin/view/Wg21telecons2021/P2093#Library-Evolution-2021-04-06>;
> please review them prior to the telecon. LEWG reviewed the list of
> prior SG16 deferred questions posted to them here
> <http://lists.isocpp.org/lib-ext/2021/03/18189.php>. Of those, they
> established consensus on an answer for #2 (they agreed not to block
> std::print() on a proposal for underlying terminal facilities), but
> referred the rest back to us. My interpretation of their actions is
> that LEWG would like a revision of the paper to address these concerns
> based on SG16 input (e.g., discuss design options and SG16 consensus
> or lack thereof). We'll therefore focus on these questions at this
> telecon.
> Hubert provided the following very interesting example usage.
> std::print("{:%r}\n",
> std::chrono::system_clock::now().time_since_epoch());
> At issue is the encoding used by locale sensitive chrono formatters.
> Search [time.format] <http://eel.is/c++draft/time.format> for "locale"
> to find example chrono format specifiers that are locale dependent.
> The example above contains the %r specifier and is locale sensitive
> because AM/PM designations may be localized. In a Chinese locale the
> desired translation of "PM" is "下午", but the locale will provide the
> translation in the locale encoding. As specified in P2093R5, if the
> execution (literal) encoding is UTF-8, then std::print() will expect
> the translation to be provided in UTF-8, but if the locale is not
> UTF-8-based (e.g., Big5; perhaps Shift-JIS for the Japanese 午後
> translation), then the result is mojibake. This is a good example of
> how locale conflates translation and character encoding.
> Addressing the above will be our first order of business. Please
> reserve some time to independently think about this problem (ignore
> responses to this message for a few days if you need to). I am
> explicitly not listing possible approaches to address this concern in
> this message so as to avoid adding (further) bias in any specific
> direction. I suspect the previously deferred SG16 questions will be
> easier to answer once this concern is resolved.
Now that we've all had some time to think about this issue, here are
some possible directions we can pursue to resolve it. These are
presented in no particular order.

  * Specialize std::locale facets
    <https://en.cppreference.com/w/cpp/locale/locale> and related I/O
    manipulators like std::put_time()
    <https://en.cppreference.com/w/cpp/io/manip/put_time> for char8_t.
    This would allow std::print() to opt in to the UTF-8/char8_t facets
    and I/O manipulators when the literal encoding is UTF-8.
  * When the literal encoding is UTF-8, stipulate that running the
    program in a non-UTF-8 based locale is non-conforming. This would
    effectively require MSVC programmers who build code with the
    /utf-8 option to also force selection of a UTF-8 code page via an
    application manifest, and to require Windows 10 version 1903 or later.
  * When the literal encoding is UTF-8, specify that non-UTF-8 based
    locale dependent translations be implicitly transcoded (such
    transcoding should never result in errors except perhaps for memory
    allocation failures).
  * Drop the special case handling for the literal encoding being UTF-8
    and specify that, when bypassing a stream to write directly to the
    console, the output be implicitly transcoded from the current
    locale dependent encoding (whatever it is) to the console encoding.
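
As a rough illustration of the third option (implicit transcoding of
locale-provided text to UTF-8), here is a minimal sketch. It is not
taken from P2093R5. The helper name latin1_to_utf8 is hypothetical, and
ISO-8859-1 is used only as a stand-in for the locale encoding because
it can be transcoded without lookup tables; Big5 or Shift-JIS would
need real conversion tables from a transcoding library.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (an assumption, not part of any proposal):
// transcode text from ISO-8859-1 to UTF-8. Every ISO-8859-1 byte maps
// to the Unicode code point with the same value, so the conversion
// cannot fail on any input.
std::string latin1_to_utf8(std::string_view in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: two bytes per input byte
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII is encoded identically in UTF-8.
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF encode as two bytes: 110000xx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

Under this approach, a locale-dependent translation such as the AM/PM
designation would be passed through a conversion like this before being
merged into the UTF-8 formatted output, so the result stays well-formed
UTF-8 regardless of the locale encoding.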

Please feel free to comment on these, or additional, approaches before
our meeting on Wednesday.

I think it would benefit LEWG if a revision of the paper presented each
of these possibilities, the consequences, and the rationale (and
hopefully SG16 consensus) for the proposed direction.


> I do not intend to time limit discussion of P2093R5 as I believe this
> is an important matter to resolve. If we are able to complete
> discussion of P2093R5, then we'll discuss P2348R0.
> Tom.

Received on 2021-04-26 11:18:50