Re: Follow up on SG16 review of P2996R2 (Reflection for C++26)

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 29 Apr 2024 17:56:46 -0400
On 4/29/24 12:01 PM, Daveed Vandevoorde wrote:
>
>
>> On Apr 28, 2024, at 11:49 AM, Tom Honermann <tom_at_[hidden]> wrote:
>>
>> SG16 reviewed P2996R2 (Reflection for C++26)
>> <http://wg21.link/p2996r2> during its 2024-04-24 meeting and will
>> continue review during the 2024-05-08 SG16 meeting. I am working on
>> the meeting summary for the previous meeting now and hope to publish
>> it in the next few days. In the meantime, I wanted to get some
>> discussion going to help prepare for our next review (in ~10 days).
>>
>> Daveed's presentation slides are available here
>> <https://docs.google.com/presentation/d/1XYGCTXfnxWyWio8UmLdskl4D4Z7HvKkHHIVmBgT19f8/edit?usp=sharing>
>> for anyone that would like to review them. They appear to have had
>> some minor updates since the SG16 review (e.g., slide 10 is new).
>>
>> Slide 6
>> <https://docs.google.com/presentation/d/1XYGCTXfnxWyWio8UmLdskl4D4Z7HvKkHHIVmBgT19f8/edit#slide=id.g2cf22bb8e94_0_16>
>> lists some requirements for the design. These include:
>>
>>   * Round-tripping must work (e.g., names returned by name_of() must
>>     be valid input to data_member_spec() via
>>     data_member_options_t::name).
>>   * Output to std::cout must work reasonably (e.g., std::cout <<
>>     name_of(^int)).
>>   * Some text may not be source-like text (std::meta::display_name_of).
>>
>> I would like to further clarify these requirements. P2996R2 authors,
>> please answer the following questions.
>>
>> Are names returned by qualified_name_of() required to be round-trippable?
>>
>
> No.
>>
>> Is support for std::cout specifically required or would support for
>> std::format() and std::print() suffice? If std::cout support is
>> specifically required, what motivates that requirement?
>>
>
> Although *any* reasonably reliable form of output would suffice
> (including printf), having `std::cout << name_of(refl)` just work is
> quite desirable, if only for didactic purposes.
Thanks. I agree having that just work is desirable and I think we can
get there if name_of() returns a dedicated type as opposed to
std::string_view.
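
As an illustration of how a dedicated return type could keep `std::cout << name_of(refl)` working, here is a minimal sketch. It is not taken from P2996: the class `printable_name`, its constructor, and the helper `to_text()` are hypothetical, and the sketch assumes the stored bytes are already in the ordinary literal encoding so that no transcoding is needed. A std::formatter specialization could be added along the same lines; it is not shown here.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical stand-in for a dedicated type returned by name_of().
class printable_name {
    // Assumption: refers to text with static storage duration that is
    // already encoded in the ordinary literal encoding.
    std::string_view repr_;

public:
    constexpr explicit printable_name(std::string_view repr) : repr_(repr) {}

    // Observer modeled on the std::filesystem::path native format observers.
    std::string string() const { return std::string(repr_); }

    // The insertion operator lets callers stream the name directly,
    // without calling one of the conversion members first.
    friend std::ostream& operator<<(std::ostream& os, const printable_name& n) {
        return os << n.string();
    }
};

// Hypothetical helper: renders a name through the insertion operator.
inline std::string to_text(const printable_name& n) {
    std::ostringstream os;
    os << n;
    return os.str();
}
```

Because the type owns the insertion operator, the library is free to decide how the text is transcoded for a narrow stream, which a plain std::string_view return type cannot do.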
>
>> Are all of name_of(), qualified_name_of(), and display_name_of()
>> required to return text (perhaps not source-like text, but content
>> that is nevertheless text)? In other words, can they be guaranteed to
>> provide well-formed text in some encoding?
>>
>
> Yes, I think so.
Excellent.
>
>> During the meeting, we briefly discussed use of an opaque type for
>> the return type of name_of() and friends. I would like to see further
>> exploration of this idea prior to our next review. I'm envisioning a
>> type something like the following (this particular formulation
>> follows existing precedent established by the std::filesystem::path
>> native format observers, [fs.path.native.obs]
>> <http://eel.is/c++draft/fs.path.native.obs>).
>>
>> class name {
>>   std::string_view /internal-representation/; // exposition only.
>>   name(/* unspecified */);
>> public:
>>   constexpr std::string string() const;       // ordinary literal encoding.
>>   constexpr std::wstring wstring() const;     // wide literal encoding.
>>   constexpr std::u8string u8string() const;   // UTF-8.
>>   constexpr std::u16string u16string() const; // UTF-16.
>>   constexpr std::u32string u32string() const; // UTF-32.
>> };
>>
>> The intent is that the data accessed by the /internal-representation/
>> member has static storage duration; perhaps a string literal. The
>> observers would then provide access to the name in the above
>> encodings. If I'm not mistaken, this should enable use of these names
>> in std::basic_string objects during constant evaluation (so long as
>> the object's lifetime is appropriately constrained) and run-time
>> while only requiring static persistence of the internal representation.
>>
>
> I think that option has some interest, but not with the
> internal-representation having static storage duration and not with
> all encodings.
Can you elaborate? Why not give the internal-representation static
storage duration and convert on demand? Is it because persisting a
particular encoded result produced during constant evaluation to
run-time is awkward and could potentially increase static storage (I
would expect the linker to discard representations that are unreferenced
at run-time)? I don't have much experience with reflection use cases, so
this is a good opportunity for me to learn something.
> Instead, the internal representation could be “reflection-like”, and
> the member “conversion” functions could produce string/u8string for
> ephemeral results, string_view/u8string_view for persistent ones
> (i.e., the latter would create a static storage duration literal-like
> representation from the compiler-internal representation on demand),
> and perhaps even a char const* for easy interaction with C interfaces.

This sounds like what I was trying to achieve with the above.

What is the motivation for persisting a particular encoded result? If a
converted result is used to generate a name that is consumed by
data_member_spec() or similar, I would expect the compiler magic that
powers those consumers to perform any necessary persistence; likely in
the internal-representation format. Likewise, if a particular encoded
result is needed at run-time, it can be produced at run-time from the
persisted internal-representation (with commensurate overhead of course).
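
The convert-on-demand idea can be sketched as follows. This is only an illustration under stated assumptions, not a proposed definition: the sketch assumes the internal representation is well-formed UTF-8 with static storage duration (the thread leaves the format unspecified), the constructor is hypothetical, and only the UTF-32 observer is shown. The members are not constexpr here to keep the sketch compilable as C++17.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Sketch of a reference-like name type; only repr_ has to persist.
class name {
    // Assumption: well-formed UTF-8 with static storage duration.
    std::string_view repr_;

public:
    constexpr explicit name(std::string_view repr) : repr_(repr) {}

    // Decodes the persisted representation to UTF-32 each time it is called.
    // Relies on the input being well-formed; no error handling is attempted.
    std::u32string u32string() const {
        std::u32string out;
        for (std::size_t i = 0; i < repr_.size();) {
            const unsigned lead = static_cast<unsigned char>(repr_[i]);
            // Sequence length is determined by the lead byte.
            const std::size_t len =
                lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
            // Keep only the payload bits of the lead byte.
            char32_t cp = static_cast<char32_t>(
                len == 1 ? lead : (lead & (0xFFu >> (len + 1))));
            // Each continuation byte contributes six more bits.
            for (std::size_t k = 1; k < len; ++k) {
                const unsigned next = static_cast<unsigned char>(repr_[i + k]);
                cp = static_cast<char32_t>((cp << 6) | (next & 0x3Fu));
            }
            out.push_back(cp);
            i += len;
        }
        return out;
    }
};
```

The returned std::u32string is an ordinary temporary, so a result produced during constant evaluation or at run-time does not itself need static storage.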

How would you define the class?

>
>> Support for all five of the standard specified encodings is not
>> necessarily required. SG16 can provide a recommendation for LEWG. The
>> paper should discuss the pros, cons, and implementation costs for
>> support of each encoding. Given existing precedent, lack of support
>> for any given encoding should be motivated.
>>
>> Note that the above type suffices to provide support for printing
>> names via std::cout, std::format(), and std::print(). An iostream
>> insertion operator and/or a std::formatter specialization could be
>> defined to enable printing names without having to call one of the
>> member functions.
>>
>> If I understand the intent correctly (it would be helpful to clarify
>> this in a revision of the paper), names returned by name_of() do not
>> reflect a scope but do not necessarily reflect an identifier either.
>> For example, something like "operator bool" might be returned.
>> Tangentially, I think the formatting of names accepted by
>> data_member_spec() needs to be rigorously specified for programs to
>> be portable.
>>
>
> Right. For now, since we only propose the synthesis of data members
> in P2996, something like `operator+` or `operator bool` is not
> possible on the input side. But eventually, I’m sure those things
> will be proposed.
>
>> We will need to make a decision regarding how characters that lack
>> representation in the ordinary and wide literal encodings are to be
>> handled. We have a few options.
>>
>>  1. Don't provide the above string() and wstring() member functions.
>>  2. Constrain the above string() and wstring() member functions so
>>     that they are not callable (e.g., do not participate in
>>     overload resolution...) if the associated literal encoding is
>>     unable to represent all characters that might appear in an
>>     identifier.
>>  3. Specify that the above string() and wstring() member functions
>>     fail constant evaluation or throw an exception (or similar error
>>     handling) if the name uses a character that is not representable
>>     in the associated literal encoding.
>>  4. Specify a way to encode non-representable characters in the names
>>     returned by the above string() and wstring() member functions and
>>     specify that data_member_spec() accepts such encoded names.
>>
>
> I much prefer option 4 here, and I would like the encoding to look
> like UCNs. Second is probably option 3. I think option 2 is
> impractical altogether.

In retrospect, I agree that option 2 seems weird at best. I don't favor
option 1 either.

I offered an additional option in response to one of Peter's messages.

5. Require that consumers of names like data_member_spec() accept
UCN-like encoded characters (UCN-like because these wouldn't actually be
UCNs) and require that producers of names like name_of() only generate a
UCN-like encoded character when the associated (ordinary/wide literal)
encoding lacks representation for the character.

This would enable strictly portable names (e.g., a name provided
for consumption could contain only basic literal character set
characters with UCN-like escapes for other characters regardless of
whether the associated literal encoding has support for additional
characters; much like UCNs) while avoiding a requirement to escape
characters outside the basic literal character set.
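
The producer side of such a scheme might look like the sketch below. Everything here is an assumption made for illustration: the function name `escape_unrepresentable()` is hypothetical, the `\uXXXX` / `\UXXXXXXXX` spelling is one possible UCN-like syntax (the thread does not fix one), and "representable" is approximated as "ASCII" in place of a query against the actual ordinary or wide literal encoding.

```cpp
#include <string>
#include <string_view>

// Hypothetical producer-side helper: characters the target encoding can
// represent are emitted unchanged, all others as UCN-like escapes.
inline std::string escape_unrepresentable(std::u32string_view name) {
    constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    for (char32_t c : name) {
        // Assumption: only ASCII is representable in the target encoding.
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
            continue;
        }
        // Four hex digits for the BMP, eight otherwise, mirroring UCNs.
        const int digits = c <= 0xFFFF ? 4 : 8;
        out += digits == 4 ? "\\u" : "\\U";
        for (int shift = (digits - 1) * 4; shift >= 0; shift -= 4)
            out.push_back(hex[(c >> shift) & 0xF]);
    }
    return out;
}
```

A consumer such as data_member_spec() would need the inverse mapping, and would accept both the escaped and the directly encoded spelling of a character that the literal encoding can represent.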

>
>>
>> If data_member_spec() is modified to accept names specified by a
>> class like name above, then it will be necessary to specify a way to
>> construct an object of that type with a name constructed during
>> constant evaluation. Assuming the /internal-representation/ format is
>> unspecified, this will require means to construct that format in a
>> buffer with a lifetime that matches the lifetime of the constructed
>> object.
>>
> Yes, sort of. That “buffer” need not be something visible to the
> language. We have a reflection interface called “reflect_value” that
> can also be used for that purpose.

I'm envisioning a change to data_member_options_t to store an (optional)
object of type name instead of string_view. That would require a way to
construct an object of type name given an appropriately encoded name.
The version of name defined above is a reference-like type. Are you
suggesting that there would be consteval constructors that utilize
compiler magic like that needed for reflect_value() provided for this
purpose? If so, that makes sense to me.

Tom.

>
> Daveed
>
>

Received on 2024-04-29 21:56:50