Re: Agenda for the 2024-05-08 SG16 meeting

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 6 May 2024 14:34:27 -0400

On 5/6/24 1:22 PM, Tom Honermann via SG16 wrote:
>
> SG16 will hold a meeting on Wednesday, May 8th, at 19:30 UTC (timezone
> conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20240508T193000&p1=1440&p2=tz_pdt&p3=tz_mdt&p4=tz_cdt&p5=tz_edt&p6=tz_cest>).
>
> The agenda follows.
>
> * D3258R0: Formatting of charN_t <https://wg21.link/d3258r0>.
> * P2996R2: Reflection for C++26 <http://wg21.link/p2996r2>.
>
> D3258R0 was hastily produced by Corentin following the review of
> P2996R2 during the 2024-04-24 SG16 meeting
> <https://github.com/sg16-unicode/sg16-meetings/#april-24th-2024> with
> the goal of providing a convenient solution for printing UTF-8 text
> held in char8_t-based storage. It proposes extending std::format() and
> std::print() to support formatting arguments of Unicode character type
> (characters and strings of char8_t, char16_t, or char32_t type). It
> does not propose a solution for iostreams. We won't poll this paper
> during this meeting for two reasons: 1) the paper is hot off the press
> and I don't expect everyone to have already read it and internalized
> all the implications, and 2) I'm going to limit discussion of it to
> the first half of the meeting so that we continue to make progress on
> P2996. The intent in discussing it, particularly with the P2996
> authors present, is to build a sense of whether it suffices to at
> least minimally address the printing requirements posed by the P2996
> authors; we may take a poll on that point.
>
> Our recent review of P2996R2 was constructive but not conclusive.
> We'll continue discussion with a goal of establishing consensus on the
> following points. Please review the meeting summary from the last
> review
> <https://github.com/sg16-unicode/sg16-meetings/#april-24th-2024> as
> well as the ensuing "Follow up on SG16 review of P2996R2" discussion
> on the SG16 mailing list
> <https://lists.isocpp.org/sg16/2024/04/index.php> prior to the meeting.
>
> 1. The character type(s) and encoding(s) used for names produced and
>    consumed by reflection interfaces. My sense is that we're leaning
>    in the following direction (not unanimously though):
>     1. Names will be produced and consumed in both the ordinary
>        literal encoding via type char and UTF-8 via type char8_t.
>     2. Production of names that contain characters that are not
>        representable in the ordinary literal encoding will produce a
>        string that contains a UCN-like escape sequence for such
>        characters.
>     3. Consumption of names in the ordinary literal encoding will
>        accept a UCN-like escape sequence for characters not in the
>        basic literal character set that may lack representation in
>        the ordinary literal encoding.
> 2. The use of a distinct type for names (e.g., a type that stores
>    names in an internal representation and exposes them via char and
>    char8_t interfaces).
> 3. Unicode NFC requirements (see below).
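
To make the "UCN-like escape sequence" idea in item 1.2 concrete, here is a
minimal sketch. It is not taken from P2996: the function name escape_name,
the \u{...} spelling of the escape, and the assumption that the ordinary
literal encoding can represent only ASCII are all hypothetical choices made
for illustration.

```cpp
#include <string>

// Hypothetical helper, not part of P2996. Renders a name given as Unicode
// code points into a char string. ASSUMPTION: only the ASCII subset is
// representable in the ordinary literal encoding; a real implementation
// would ask the actual encoding instead.
std::string escape_name(const std::u32string& name) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (char32_t c : name) {
        if (c < 0x80) {
            // Representable: copy the character through unchanged.
            out += static_cast<char>(c);
        } else {
            // Not representable: emit \u{X...} with no leading zeros.
            // The escape syntax itself is an assumption of this sketch.
            out += "\\u{";
            bool started = false;
            for (int shift = 20; shift >= 0; shift -= 4) {
                unsigned digit = (c >> shift) & 0xF;
                if (digit != 0 || started || shift == 0) {
                    out += hex[digit];
                    started = true;
                }
            }
            out += '}';
        }
    }
    return out;
}
```

Because the escaped form is produced without substituting a "similar"
character, a consumer that accepts the same escape syntax (item 1.3) could
recover the original code points exactly.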
>
> We briefly discussed Unicode normalization form C (NFC) last time.
> Following adoption of P1949R7 (C++ Identifier Syntax using Unicode
> Standard Annex 31) <https://wg21.link/p1949r7> as a DR for C++23,
> identifiers are required to be written in NFC. Conversion to the
> ordinary literal encoding could result in names that are not in NFC.
> It will presumably be necessary for P2996 to specify that, for
> round-trip purposes, conversion to the ordinary literal encoding will
> not perform character substitutions (e.g., UCN-like escape sequences
> will be generated instead). Likewise, it will be necessary to specify
> how names that do not conform to NFC will be handled by reflection
> interfaces that consume user provided names. Note that current
> compiler releases exhibit implementation divergence with respect to
> enforcement of the NFC requirement (https://godbolt.org/z/E35r1K7hE;
> gcc does diagnose, Clang and EDG do not, MSVC does not yet implement
> P1949R7).
>
Thank you to Robin for pointing out an error in my use of Compiler
Explorer linked above; I neglected to add the /source-charset:utf-8
option for MSVC, so the source code wasn't interpreted correctly.
Corrected at https://godbolt.org/z/x1nxGfrYq; MSVC does not diagnose.
According to the MSVC documentation
<https://learn.microsoft.com/en-us/cpp/overview/visual-cpp-language-conformance>,
P1949R7 is not yet implemented (Clang and EDG both document it as
implemented, but fail to diagnose).
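
For anyone less familiar with the NFC issue, the following minimal sketch
shows why it matters. The variable and function names are hypothetical and
chosen only for illustration. The two strings would typically render
identically, yet they are different code point sequences, and only the
precomposed spelling is in NFC.

```cpp
#include <string>

// U+00E9 LATIN SMALL LETTER E WITH ACUTE: precomposed, which is the NFC form.
const std::u32string precomposed = U"\u00E9";

// U+0065 LATIN SMALL LETTER E followed by U+0301 COMBINING ACUTE ACCENT:
// canonically equivalent to the above, but not in NFC.
const std::u32string decomposed = U"e\u0301";

// Plain code point comparison; no normalization is performed here.
bool same_code_points(const std::u32string& a, const std::u32string& b) {
    return a == b;
}
```

An identifier spelled the second way is what the NFC requirement from
P1949R7 rules out, and a reflection interface that consumes user-provided
names has to decide what to do when handed such a spelling.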

Tom.

> Finally, and as a separable issue that can be discussed at another
> time, I think we should discuss differentiating between names and
> identifiers in the reflection interfaces. This isn't an issue for
> data_member_spec() since data members are always identifiers (or are
> unnamed; that is another interesting case, but isn't an SG16 concern),
> but could be an issue for a hypothetical function_spec() or
> member_function_spec() interface used for named functions,
> constructors and destructors, overloaded operators, conversion
> operators, user-defined literals, etc. Distinguishing between names
> and identifiers would avoid the need to parse, e.g., operator bool or
> ""_udl, when consuming names.
>
> Tom.
>
>

Received on 2024-05-06 18:34:33