> Would it not be useful to be able to format dates and times in a locale independent manner though (and have that be the default)?

It would be useful. Having thought more about it and read the LWG issue thread, I now agree that %r should be locale-independent by default.

- Victor



On Tue, Apr 27, 2021 at 8:16 AM Tom Honermann <tom@honermann.net> wrote:
On 4/27/21 10:52 AM, Victor Zverovich via SG16 wrote:
Dear Unicoders,

Thanks, Tom, for putting together a detailed list of options. Just want to add that print is completely the wrong abstraction level at which to address this (we should address it, but it has little to do with P2093). The root cause is a mismatch between the literal and locale encodings, and it should be addressed at the formatting level in cases where a locale is used.

Yes, I agree, 100%.  I think the only reason the problem is seen as more relevant for std::print() is that the proposal intends to process the formatted output (e.g., transcode it from UTF-8 to the native console encoding) when the literal encoding is UTF-8, whereas std::format() just dumps the bytes and produces mojibake (thus making it someone else's problem).

Here's an example from another thread that illustrates this:

  std::cout << std::format("时间 {:%r}\n", std::chrono::system_clock::now().time_since_epoch());
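To make the failure mode concrete: when the literal is UTF-8 and the locale supplies text in some other encoding, the combined bytes may not be valid UTF-8 at all. The sketch below is illustrative only. `is_valid_utf8` is a hypothetical helper, not something from std::format or P2093, and it only checks the byte sequence; it does not know which encoding the locale actually used.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns true if `s` is well-formed UTF-8.
// Rejects stray continuation bytes, truncated sequences, overlong
// encodings, surrogates, and code points above U+10FFFF.
bool is_valid_utf8(std::string_view s) {
    // Smallest code point that legitimately needs a sequence of length 1..4.
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (b < 0x80)                { len = 1; cp = b; }
        else if ((b & 0xE0) == 0xC0) { len = 2; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; }
        else return false;                        // invalid lead byte
        if (i + len > s.size()) return false;     // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false; // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp[len]) return false;                // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;    // surrogate
        if (cp > 0x10FFFF) return false;                   // out of range
        i += len;
    }
    return true;
}
```

With this helper, the UTF-8 bytes of the literal followed by an ASCII time, "\xE6\x97\xB6\xE9\x97\xB4 10:15:30", pass the check, while the same string with the made-up bytes "\xCF\xCF" appended (standing in for locale-supplied text in a legacy encoding) does not. Some legacy byte sequences can happen to be valid UTF-8, so a check like this detects only part of the problem.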

I think this belongs to a separate small (but important) paper unless the resolution is so trivial that it can be a drive-by fix in P2093.
A separate paper works for me though, per above, I think std::print() is arguably more impacted by the issue than std::format() is.

One more option is to give a runtime error when trying to use (via 'L' or other means) a locale with the encoding incompatible with the literal encoding. I'd either go with that or do transcoding. Dropping UTF-8 handling is the least desirable option in my opinion and will basically render the feature useless for me as a user.
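A rough sketch of what the runtime-error option could look like, under stated assumptions: the encoding names are passed in as strings (how an implementation would actually obtain the locale's encoding is left open here), and `normalize_encoding_name`, `encodings_match`, and `require_matching_encoding` are hypothetical names, not proposed wording. Treating "incompatible" as "names differ after normalization" is also a simplification.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper: lower-case the name and drop punctuation so that
// "UTF-8", "utf8", and "Utf_8" all compare equal.
std::string normalize_encoding_name(std::string_view name) {
    std::string out;
    for (char ch : name) {
        unsigned char c = static_cast<unsigned char>(ch);
        if (std::isalnum(c)) out.push_back(static_cast<char>(std::tolower(c)));
    }
    return out;
}

// Simplified notion of compatibility: the normalized names are identical.
bool encodings_match(std::string_view literal_enc, std::string_view locale_enc) {
    return normalize_encoding_name(literal_enc) == normalize_encoding_name(locale_enc);
}

// Throws when a locale-dependent specifier would mix two encodings.
void require_matching_encoding(std::string_view literal_enc, std::string_view locale_enc) {
    if (!encodings_match(literal_enc, locale_enc)) {
        throw std::runtime_error("locale encoding '" + std::string(locale_enc) +
                                 "' does not match literal encoding '" +
                                 std::string(literal_enc) + "'");
    }
}
```

A formatter would call something like `require_matching_encoding("UTF-8", locale_encoding)` only on the paths that actually consult the locale, so locale-independent formatting is unaffected.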

Agreed.


I mostly agree with Corentin except that '%r' can be considered as an explicit locale opt-in similar to 'L'.

Would it not be useful to be able to format dates and times in a locale independent manner though (and have that be the default)?

Tom.


Cheers,
Victor

On Tue, Apr 27, 2021 at 4:11 AM Corentin Jabot via SG16 <sg16@lists.isocpp.org> wrote:


On Tue, Apr 27, 2021 at 12:57 PM Jean-Marc Bourguet via SG16 <sg16@lists.isocpp.org> wrote:

I'm probably too much of a Unix guy, but having

prog

and

prog | more

or

prog > file; cat file

displaying different things is not something that meets my expectations. The difference in buffering behaviour is already hard enough to explain. Piping to more is far too common for it to behave differently from direct output.

Yours,

Either way you are printing out the same content. 
Except in one case it is _rendered_ correctly and in the other it might not.
This will also not affect Linux; it addresses a very Windows-specific problem, where the encoding the console assumes by default is not the execution encoding.
This is explained in more detail in Victor's paper http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2093r5.html#unicode
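For readers following the prog versus prog | more point, the decision being discussed hinges on whether the output stream refers to a terminal. Below is a minimal sketch of that detection step only; `refers_to_terminal` is a hypothetical name, and on Windows the sketch uses `_isatty`, which reports any character device. As far as I understand the paper, it relies on a console-specific check rather than this, so treat the Windows branch as an approximation.

```cpp
#include <cstdio>
#ifdef _WIN32
#include <io.h>      // _isatty, _fileno
#else
#include <unistd.h>  // isatty
#endif

// Hypothetical helper: true if `stream` is attached to a terminal-like
// device, false for regular files and pipes.
bool refers_to_terminal(std::FILE* stream) {
#ifdef _WIN32
    return _isatty(_fileno(stream)) != 0;
#else
    return isatty(fileno(stream)) != 0;
#endif
}
```

The bytes handed to a pipe or file stay the same either way; only the terminal path would get special handling so that the text is rendered correctly.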

 