Re: Follow up on SG16 review of P2996R2 (Reflection for C++26)

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 29 Apr 2024 14:58:53 -0400

On 4/29/24 1:22 PM, Peter Dimov wrote:
> Corentin Jabot wrote:
>> If someone wants to support
>>
>> cout << u8"", please write a paper.
>> I am working on cout << std::format("", u8""),

Thank you for doing that, Corentin! I'm looking forward to the paper;
particularly with regard to encoding selection (I presume the UTF-8
argument will be transcoded to the ordinary/wide literal encoding) and
how transcoding errors will be handled.

Wait, it should be done by now, right? You said you were going to work
on it this last weekend! 😉😂

>> but you might think this is
>> insufficient, in which case please be the change you want to see in the world
>> (changing iostream, how hard can it be?)!
> I'm not entirely sure that cout << std::format("{}", u8"...") is that much easier
> to specify and support.
>
> But I'll be glad to be proven wrong, of course. :-)

There is a relevant SO comment
<https://stackoverflow.com/questions/58878651/what-is-the-printf-formatting-character-for-char8-t/58895428#58895428>.

std::format() and std::print(), to some extent, improve the likelihood
that an implementation-selected encoding will be a good match for the
programmer's intent. This is because:

 1. std::format() and std::print() are not implicitly locale-dependent;
    that rules out selection of a locale-dependent execution encoding.
 2. std::format() returns a std::string; that rules out selection of an
    I/O-dependent encoding.
 3. std::print() writes to an I/O stream, but has special behavior for
    writes to a terminal; that rules out selection of a terminal
    encoding (as unnecessary, at least in important cases).
 4. std::format() and std::print() are both strongly associated with the
    ordinary/wide literal encoding.
 5. std::format() and std::print() should have the same behavior (other
    than that std::print(...) may produce a better result than std::cout
    << std::format(...) when the output is directed to a terminal).
 6. std::format() and std::print() have additional guarantees when the
    ordinary/wide literal encoding is a UTF encoding.

Due to those characteristics, we have good motivation for implicit use
of the ordinary/wide literal encoding as the target for transcoding for
std::format() and std::print().

For std::cout, it is less clear (at least to me) that the same guidance
applies; particularly because of the implicit use of locale data and an
imbued std::codecvt facet.

Tom.

Received on 2024-04-29 18:58:55