Re: [isocpp-sg16] Agenda for the 2026-02-25 SG16 meeting

From: Hubert Tong <hubert.reinterpretcast_at_[hidden]>
Date: Wed, 25 Feb 2026 17:14:40 -0500
On Wed, Feb 25, 2026 at 4:53 PM Tom Honermann via SG16 <
sg16_at_[hidden]> wrote:

> On 2/25/26 3:10 PM, Corentin Jabot via SG16 wrote:
>
>
>
> On Wed, Feb 25, 2026, 20:31 Tom Honermann via SG16 <sg16_at_[hidden]>
> wrote:
>
>> While re-reading the papers today, I encountered a couple of questions
>> related to lexing of interpolated literals and handling of digraphs and
>> UCNs. If time permits today, we can discuss these examples. I'm using the
>> syntax from P3412R3 below, but I think the questions apply to P2951R0 as
>> well.
>>
>> P3412R3 section 7.1 prompted me to think of these lambda examples
>> concerning lexical scanning for ',' and ':'. Note that angle brackets are
>> not used for bracket matching. These might be worth adding as examples.
>>
>> - f"{[]<int,int>{}}" // lambda must be enclosed in
>> parentheses.
>> - f"{[] post (r:r>0) { return 1; }}" // lambda must be enclosed in
>> parentheses.
>>
> Ignore the examples above. They are nonsense that I put together too
> quickly without thinking things through. I was trying to find
> counterexamples to the following claim from section 7.1 of P3412R3
> <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3412r3.pdf>. I
> failed. Thanks to Barry for correcting me offline.
>
> Note that lambdas, which may contain any type of code including for
> instance goto labels, always contain this code inside matched braces, so
> any colons will be ignored when detecting the expression-field end. The
> same goes for *statement-expressions* of gcc and *blocks* of Clang.
>
>
>>
>> It makes sense for digraphs to not be recognized as part of an f-literal;
>> that is consistent with string literals. Within string literals, digraphs
>> aren't needed because UCNs can be used to specify characters that are in
>> the basic character set but not necessarily available in the input source
>> file encoding. Normally, UCNs are not permitted to match members of the
>> basic character set in language syntax. What about in extraction fields?
>> Should the following be well-formed?
>>
>> - f"The answer is {boolVar ? 17 \u003A 42}"; // U+003A is ':'
>>
>> There appears to be a tension with regard to lexical scanning of
>> f-literals and parsing of extraction fields. Can macros allow for use of
>> digraphs and UCNs in extraction fields? Should these be well-formed?
>>
>> - #define COLON :
>> f"The answer is {boolVar ? 17 COLON 42}";
>> - #define LEFT_SQUARE_BRACKET <:
>> #define RIGHT_SQUARE_BRACKET :>
>> f"The size is {sizeof int LEFT_SQUARE_BRACKET 42
>> RIGHT_SQUARE_BRACKET}";
>>
>> Tom.
>>
> When parsing an embedded expression, it should be parsed as an expression
> (digraphs, no ASCII UCNs, etc.). When parsing the string literal outside of
> embedded expression fragments, the rules of string literals should apply
> (no digraphs, etc.).
>
> That matches what I was thinking and what led me to ask the question.
>
>
> Anything else would be a great implementation burden and weird from a user
> standpoint.
>
> Whether macros expand seems like a LEWG question (if we follow the model
> of these being nested expressions, then macro expansion should happen for
> consistency, but it depends on when an implementation can actually parse
> these fragments).
>
> Yes, that is a LEWG question. But it is relevant to SG16 with regard to
> the ability to specify the '[', ']', '{', '}', '#', and '##' tokens in
> extraction fields in source files that have an encoding that doesn't
> support them. If macro expansion doesn't occur, then we presumably need to
> support use of digraphs or UCNs in some way.
>
I think you guys meant EWG (not LEWG). If there is no '{' or '}' in the
source file encoding, then I think the only answer is that the feature
cannot be used with such source files.

-- HT

Received on 2026-02-25 22:15:11
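
The claim quoted above from section 7.1 of P3412R3 (a ':' inside matched braces is ignored when looking for the end of an expression field, and angle brackets do not take part in bracket matching) can be illustrated with a small C++ sketch. This is not the paper's specified algorithm. The function name find_field_end is hypothetical, and the sketch makes several simplifying assumptions: it takes the text that follows the opening '{' of a field, it skips ordinary string and character literals, it treats "::" as a scope operator and not as a terminator, and it does not treat the ':' of a conditional expression specially. It also works on raw characters, so digraphs, UCNs, and macro names get no special handling.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not taken from P3412R3 or P2951R0.
// Returns the index of the first ':' or '}' found at nesting depth zero,
// or std::string_view::npos if there is none.
std::size_t find_field_end(std::string_view s) {
    int depth = 0;  // nesting depth of (), [], and {}; '<' and '>' are ignored
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char c = s[i];
        if (c == '"' || c == '\'') {
            // Skip a string or character literal, honoring backslash escapes.
            const char quote = c;
            ++i;
            while (i < s.size() && s[i] != quote) {
                if (s[i] == '\\') ++i;  // step over the escaped character
                ++i;
            }
            continue;  // the loop increment moves past the closing quote
        }
        if (c == '(' || c == '[' || c == '{') {
            ++depth;
        } else if (c == ')' || c == ']') {
            if (depth > 0) --depth;
        } else if (c == '}') {
            if (depth == 0) return i;  // closes the expression field
            --depth;
        } else if (c == ':' && depth == 0) {
            if (i + 1 < s.size() && s[i + 1] == ':') {
                ++i;  // assumption: "::" is a scope operator, keep scanning
            } else {
                return i;  // start of a format specifier
            }
        }
    }
    return std::string_view::npos;
}
```

Under these assumptions, a goto label inside a lambda body, as in "[]{ a: return 1; }()}", does not end the field, because its ':' sits at depth one. Likewise the ',' and '<' '>' in "[]<int,int>{}}" play no part in the scan.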