Re: Follow up on SG16 review of P2996R2 (Reflection for C++26)

From: Peter Dimov <pdimov_at_[hidden]>
Date: Mon, 29 Apr 2024 23:11:21 +0300
Tom Honermann wrote:
> I'm not entirely sure that cout << std::format("{}", u8"...") is that much
> easier to specify and support.
>
> But I'll be glad to be proven wrong, of course. :-)
>
> There is a relevant SO comment:
> <https://stackoverflow.com/questions/58878651/what-is-the-printf-formatting-character-for-char8-t/58895428#58895428>
>
> std::format() and std::print(), to some extent, improve the likelihood that an
> implementation selected encoding will be a good match for the programmer's
> intent. This is because:
>
> 1. std::format() and std::print() are not implicitly locale dependent; that
> rules out selection of a locale dependent execution encoding.
> 2. std::format() returns a std::string; that rules out selection of an I/O
> dependent encoding.
> 3. std::print() writes to an I/O stream, but has special behavior for writes
> to a terminal; that rules out selection of a terminal encoding (as unnecessary,
> at least in important cases).
> 4. std::format() and std::print() are both strongly associated with the
> ordinary/wide literal encoding.
> 5. std::format() and std::print() should have the same behavior (other than
> that std::print(...) may produce a better result than std::cout <<
> std::format(...) when the output is directed to a terminal).
> 6. std::format() and std::print() have additional guarantees when the
> ordinary/wide literal encoding is a UTF encoding.
>
>
> Due to those characteristics, we have good motivation for implicit use of the
> ordinary/wide literal encoding as the target for transcoding for std::format()
> and std::print().

I'm afraid that I don't quite understand.

What does std::format( "{}", u8"..." ) actually do? I suppose it transcodes
the UTF-8 input into the narrow literal encoding (replacing unrepresentable
characters with '?' instead of throwing, I presume, or it would not be very
usable)?

And then we just fall back to std::cout << "...", where the "..." is in the
narrow literal encoding and hence, we assume, works, more or less.

And we don't want to make std::cout << u8"..." do that, because it can,
in principle, do better?

But let me get back to your list.

> 1. std::format() and std::print() are not implicitly locale dependent; that
> rules out selection of a locale dependent execution encoding.

Where is the locale-dependent execution encoding in std::cout << u8"..."?

> 2. std::format() returns a std::string; that rules out selection of an I/O
> dependent encoding.

Same question. Where is the I/O dependent encoding in std::cout << u8"..."
(that is not also present in std::cout << some_std_string)?

> 3. std::print() writes to an I/O stream, but has special behavior for writes
> to a terminal; that rules out selection of a terminal encoding (as unnecessary,
> at least in important cases).

This doesn't apply here, because we're using std::format.

> 5. std::format() and std::print() should have the same behavior (other than
> that std::print(...) may produce a better result than std::cout <<
> std::format(...) when the output is directed to a terminal).

OK... but this isn't relevant.

> 6. std::format() and std::print() have additional guarantees when the
> ordinary/wide literal encoding is a UTF encoding.

What additional guarantees, and how do they help here?

Received on 2024-04-29 20:11:26