Re: Follow up on SG16 review of P2996R2 (Reflection for C++26)

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 29 Apr 2024 15:10:39 -0400
On 4/29/24 1:14 PM, Victor Zverovich wrote:
> > cout << u8"", please write a paper.
>
> This is extremely misleading. Nobody needs to write any additional
> papers to support cout (or format) output: the existing paper makes it
> work already and I strongly support the approach chosen in the paper
> and would be opposed to only supporting char8_t regardless of whether
> it is formattable or not.

The proposal in the current paper revision has the unfortunate
consequence of providing a disappointing experience for the common and
forward looking case where the ordinary literal encoding is UTF-8. As a
user, I would be disappointed if, on a modern Windows, Linux, or macOS
system, I was presented a name like "\U{1D6DB}" because a programmer
named their identifier "𝛛". I think we can do better than that.

Here is another option:

 1. Require that consumers of names like data_member_spec() accept
    UCN-like encoded characters (UCN-like because these wouldn't
    actually be UCNs).
 2. Require that producers of names like name_of() only generate a
    UCN-like encoded character when the associated (ordinary/wide
    literal) encoding lacks representation for the character.
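
As a rough illustration of the two requirements above, here is a minimal sketch of a producer and a matching consumer. Nothing in it comes from P2996 or from any implementation: the function names encode_name(), decode_name(), and representable() are hypothetical, the target encoding is modeled as 7-bit ASCII purely for simplicity, and the escape is spelled \u{HEX} as an assumption.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for "the ordinary literal encoding can represent c".
// Assumption: the encoding is modeled as 7-bit ASCII.
constexpr bool representable(char32_t c) { return c < 0x80; }

// Producer side: emit a code point directly when it is representable,
// otherwise emit a UCN-like escape of the form \u{HEX}.
std::string encode_name(std::u32string_view name) {
    static constexpr char digits[] = "0123456789ABCDEF";
    std::string out;
    for (char32_t c : name) {
        if (representable(c)) {
            out += static_cast<char>(c);
            continue;
        }
        out += "\\u{";
        bool started = false;  // suppress leading zeros
        for (int shift = 28; shift >= 0; shift -= 4) {
            unsigned d = (c >> shift) & 0xF;
            if (d != 0 || started || shift == 0) {
                out += digits[d];
                started = true;
            }
        }
        out += '}';
    }
    return out;
}

// Consumer side: accept the same escapes and recover the code points.
// Anything that is not a well-formed escape is copied through unchanged.
std::u32string decode_name(std::string_view text) {
    std::u32string out;
    for (std::size_t i = 0; i < text.size();) {
        if (text.substr(i, 3) == "\\u{") {
            std::size_t close = text.find('}', i + 3);
            if (close != std::string_view::npos && close > i + 3) {
                char32_t value = 0;
                bool ok = true;
                for (std::size_t j = i + 3; j < close && ok; ++j) {
                    char ch = text[j];
                    unsigned d = 0;
                    if (ch >= '0' && ch <= '9') d = ch - '0';
                    else if (ch >= 'A' && ch <= 'F') d = ch - 'A' + 10;
                    else if (ch >= 'a' && ch <= 'f') d = ch - 'a' + 10;
                    else ok = false;
                    if (ok) {
                        value = value * 16 + d;
                        if (value > 0x10FFFF) ok = false;  // outside Unicode
                    }
                }
                if (ok) {
                    out += value;
                    i = close + 1;
                    continue;
                }
            }
        }
        out += static_cast<char32_t>(static_cast<unsigned char>(text[i]));
        ++i;
    }
    return out;
}
```

Because a backslash cannot appear in an identifier, the escape is unambiguous in this sketch, and a name that needs no escapes passes through both functions unchanged. With a UTF-8 ordinary literal encoding, representable() would be true for every code point, so no escape would ever be generated.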

Tom.

>
> - Victor
>
> On Mon, Apr 29, 2024 at 10:08 AM Corentin Jabot
> <corentinjabot_at_[hidden]> wrote:
>
>
>
> On Sun, Apr 28, 2024 at 8:36 PM Tom Honermann via SG16
> <sg16_at_[hidden]> wrote:
>
>
>> On Apr 28, 2024, at 12:57 PM, Victor Zverovich
>> <victor.zverovich_at_[hidden]> wrote:
>>
>> 
>> > Is support for std::cout specifically required or would
>> support for std::format() and std::print() suffice? If
>> std::cout support is specifically required, what motivates
>> that requirement?
>>
>> Not an author, but it would be very novel and potentially
>> surprising to users to not support std::cout for such
>> a fundamental thing as a reflected name.
>
> There will always be some way to print a name. The question is
> what machinations will be required to do so and what quality
> result will be achieved.
>
>
> If someone wants to support
>
> cout << u8"", please write a paper.
> I am working on cout << std::format("", u8""), but you might think
> this is insufficient, in which case please be the change you want
> to see in the world (changing iostream, how hard can it be?)!
>
>
> We have no way to ensure that a name will actually be rendered
> as a recognizable sequence of glyphs; we can’t make an EBCDIC
> or CP437 terminal display all valid identifiers no matter what
> lengths we go to. std::print() does an objectively better job
> (on Windows) than std::cout is capable of (without the
> application first calling a number of Windows specific
> functions that we cannot do on its behalf), but it can’t
> overcome font limitations and does nothing for EBCDIC-based
> systems.
>
> We can provide a best effort default behavior, but regardless,
> some subset of users will likely have to apply their own
> transformations to satisfy their needs subject to the
> restrictions of their environment.
>
> We don’t have a lot of precedent to draw on here. I think the
> closest we have is the conditional display of “μs” vs “us” in
> https://eel.is/c++draft/time.duration.io#1.5.
>
> Tom.
>
>>
>> - Victor
>>
>> On Sun, Apr 28, 2024 at 8:49 AM Tom Honermann via SG16
>> <sg16_at_[hidden]> wrote:
>>
>> SG16 reviewed P2996R2 (Reflection for C++26)
>> <http://wg21.link/p2996r2> during its 2024-04-24 meeting
>> and will continue review during the 2024-05-08 SG16
>> meeting. I am working on the meeting summary for the
>> previous meeting now and hope to publish it in the next
>> few days. In the meantime, I wanted to get some
>> discussion going to help prepare for our next review (in
>> ~10 days).
>>
>> Daveed's presentation slides are available here
>> <https://docs.google.com/presentation/d/1XYGCTXfnxWyWio8UmLdskl4D4Z7HvKkHHIVmBgT19f8/edit?usp=sharing>
>> for anyone that would like to review them. They appear to
>> have had some minor updates since the SG16 review (e.g.,
>> slide 10 is new).
>>
>> Slide 6
>> <https://docs.google.com/presentation/d/1XYGCTXfnxWyWio8UmLdskl4D4Z7HvKkHHIVmBgT19f8/edit#slide=id.g2cf22bb8e94_0_16>
>> lists some requirements for the design. These include:
>>
>> * Round-tripping must work (e.g., names returned by
>> name_of() must be valid input to data_member_spec()
>> via data_member_options_t::name).
>> * Output to std::cout must work reasonably (e.g.,
>> std::cout << name_of(^int)).
>> * Some text may not be source-like text
>> (std::meta::display_name_of).
>>
>> I would like to further clarify these requirements.
>> P2996R2 authors, please answer the following questions.
>>
>> Are names returned by qualified_name_of() required to be
>> round-trippable?
>>
>> Is support for std::cout specifically required or would
>> support for std::format() and std::print() suffice? If
>> std::cout support is specifically required, what
>> motivates that requirement?
>>
>> Are all of name_of(), qualified_name_of(), and
>> display_name_of() required to return text (perhaps not
>> source-like text, but content that is nevertheless text)?
>> In other words, can they be guaranteed to provide
>> well-formed text in some encoding?
>>
>> During the meeting, we briefly discussed use of an opaque
>> type for the return type of name_of() and friends. I
>> would like to see further exploration of this idea prior
>> to our next review. I'm envisioning a type something like
>> the following (this particular formulation follows
>> existing precedent established by the
>> std::filesystem::path native format observers,
>> [fs.path.native.obs]
>> <http://eel.is/c++draft/fs.path.native.obs>).
>>
>> class name {
>>   std::string_view /internal-representation/; // exposition only.
>>   name(/* unspecified */);
>> public:
>>   constexpr std::string string() const;       // ordinary literal encoding.
>>   constexpr std::wstring wstring() const;     // wide literal encoding.
>>   constexpr std::u8string u8string() const;   // UTF-8.
>>   constexpr std::u16string u16string() const; // UTF-16.
>>   constexpr std::u32string u32string() const; // UTF-32.
>> };
>>
>> The intent is that the data accessed by the
>> /internal-representation/ member has static storage
>> duration; perhaps a string literal. The observers would
>> then provide access to the name in the above encodings.
>> If I'm not mistaken, this should enable use of these
>> names in std::basic_string objects during constant
>> evaluation (so long as the object's lifetime is
>> appropriately constrained) and run-time while only
>> requiring static persistence of the internal representation.
>>
>> Support for all five of the standard specified encodings
>> is not necessarily required. SG16 can provide a
>> recommendation for LEWG. The paper should discuss the
>> pros, cons, and implementation costs for support of each
>> encoding. Given existing precedent, lack of support for
>> any given encoding should be motivated.
>>
>> Note that the above type suffices to provide support for
>> printing names via std::cout, std::format(), and
>> std::print(). An iostream insertion operator and/or a
>> std::formatter specialization could be defined to enable
>> printing names without having to call one of the member
>> functions.
>>
>> If I understand the intent correctly (it would be helpful
>> to clarify this in a revision of the paper), names
>> returned by name_of() do not reflect a scope but do not
>> necessarily reflect an identifier either. For example,
>> something like "operator bool" might be returned.
>> Tangentially, I think the formatting of names accepted by
>> data_member_spec() needs to be rigorously specified for
>> programs to be portable.
>>
>> We will need to make a decision regarding how characters
>> that lack representation in the ordinary and wide literal
>> encodings are to be handled. We have a few options.
>>
>> 1. Don't provide the above string() and wstring() member
>> functions.
>> 2. Constrain the above string() and wstring() member
>> functions so that they are not callable (e.g., do
>> not participate in overload resolution...) if the
>> associated literal encoding is unable to represent
>> all characters that might appear in an identifier.
>> 3. Specify that the above string() and wstring() member
>> functions fail constant evaluation or throw an
>> exception (or similar error handling) if the name
>> uses a character that is not representable in the
>> associated literal encoding.
>> 4. Specify a way to encode non-representable characters
>> in the names returned by the above string() and
>> wstring() member functions and specify that
>> data_member_spec() accepts such encoded names.
>>
>> If data_member_spec() is modified to accept names
>> specified by a class like name above, then it will be
>> necessary to specify a way to construct an object of that
>> type with a name constructed during constant evaluation.
>> Assuming the /internal-representation/ format is
>> unspecified, this will require means to construct that
>> format in a buffer with a lifetime that matches the
>> lifetime of the constructed object.
>>
>> Tom.
>>
>> --
>> SG16 mailing list
>> SG16_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>

Received on 2024-04-29 19:10:46