SG16

Subject: Re: Agenda for the 2021-04-28 SG16 telecon
From: Victor Zverovich (victor.zverovich_at_[hidden])
Date: 2021-04-27 10:23:00


> Would it not be useful to be able to format dates and times in a locale
> independent manner though (and have that be the default)?

It would be useful. Having thought more about it and read the LWG issue
thread, I now agree that %r should be locale-independent by default.

- Victor

On Tue, Apr 27, 2021 at 8:16 AM Tom Honermann <tom_at_[hidden]> wrote:

> On 4/27/21 10:52 AM, Victor Zverovich via SG16 wrote:
>
> Dear Unicoders,
>
> Thanks, Tom, for putting together a detailed list of options. Just want to
> add that print is a completely wrong abstraction level to try to address
> this (we should address it but it has little to do with P2093). The root
> cause is a mismatch between literal and locale encoding and it should be
> addressed on the formatting level in cases where a locale is used.
>
> Yes, I agree, 100%. I think the only reason the problem is seen as more
> relevant for std::print() is because the proposal intends to process the
> formatted output (e.g., transcode it from UTF-8 to native console encoding)
> when the literal encoding is UTF-8 whereas std::format() just dumps the
> bytes and produces mojibake (thus making it someone else's problem).
>
> Here's an example from another thread that illustrates this:
>
> std::cout << std::format("时间 {:%r}\n",
> std::chrono::system_clock::now().time_since_epoch());
>
> I think this belongs to a separate small (but important) paper unless the
> resolution is so trivial that it can be a drive-by fix in P2093.
>
> A separate paper works for me though, per above, I think std::print() is
> arguably more impacted by the issue than std::format() is.
>
>
> One more option is to give a runtime error when trying to use (via 'L' or
> other means) a locale with the encoding incompatible with the literal
> encoding. I'd either go with that or do transcoding. Dropping UTF-8
> handling is the least desirable option in my opinion and will basically
> render the feature useless for me as a user.
>
> Agreed.
>
>
> I mostly agree with Corentin except that '%r' can be considered as an
> explicit locale opt-in similar to 'L'.
>
> Would it not be useful to be able to format dates and times in a locale
> independent manner though (and have that be the default)?
>
> Tom.
>
>
> Cheers,
> Victor
>
> On Tue, Apr 27, 2021 at 4:11 AM Corentin Jabot via SG16 <
> sg16_at_[hidden]> wrote:
>
>>
>>
>> On Tue, Apr 27, 2021 at 12:57 PM Jean-Marc Bourguet via SG16 <
>> sg16_at_[hidden]> wrote:
>>
>>> I'm probably too much of a Unix guy, but having
>>>
>>> prog
>>>
>>> and
>>>
>>> prog | more
>>>
>>> or
>>>
>>> prog > file; cat file
>>>
>>> displaying different things is not something that meets my expectations.
>>> The difference in buffering behaviour is already hard enough to explain.
>>> Piping to more is far too common for it to behave differently than direct
>>> output.
>>>
>>> Yours,
>>>
>> Either way you are printing out the same content.
>> Except in one case it is _rendered_ correctly and in the other it might
>> not.
>> This will also not affect Linux; it addresses a very Windows-specific
>> problem in which the encoding the console assumes by default is not the
>> execution encoding.
>> This is explained in more detail in Victor's paper
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2093r5.html#unicode
>>
>>
>>
>>
>
>
>


