<div dir="ltr"><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Wed, Feb 25, 2026 at 4:53 PM Tom Honermann via SG16 &lt;<a href="mailto:sg16@lists.isocpp.org">sg16@lists.isocpp.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  
    
  
  <div>
    <div>On 2/25/26 3:10 PM, Corentin Jabot via
      SG16 wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="auto">
        <div><br>
          <br>
          <div class="gmail_quote">
            <div dir="ltr" class="gmail_attr">On Wed, Feb 25, 2026,
              20:31 Tom Honermann via SG16 &lt;<a href="mailto:sg16@lists.isocpp.org" target="_blank">sg16@lists.isocpp.org</a>&gt;
              wrote:<br>
            </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
              <div>
                <p>While re-reading the papers today, I encountered a
                  couple of questions related to lexing of interpolated
                  literals and handling of digraphs and UCNs. If time
                  permits today, we can discuss these examples. I&#39;m
                  using the syntax from P3412R3 below, but I think the
                  questions apply to P2951R0 as well.</p>
                <p>P3412R3 section 7.1 prompted me to think of these
                  lambda examples concerning lexical scanning for &#39;,&#39;
                  and &#39;:&#39;. Note that angle brackets are not used for
                  bracket matching. These might be worth adding as
                  examples.</p>
                <ul>
                  <li><font face="monospace">f&quot;{[]&lt;int,int&gt;{}}&quot;   
                                   // lambda must be enclosed in
                      parentheses.</font><br>
                  </li>
                  <li><font face="monospace">f&quot;{[] post (r:r&gt;0)
                      { return 1; }}&quot; // lambda must be enclosed in
                      parentheses.</font></li>
                </ul>
              </div>
            </blockquote>
          </div>
        </div>
      </div>
    </blockquote>
    <p>Ignore the examples above. They are nonsense that I put together
      too quickly without thinking things through. I was trying to find
      counterexamples to the following claim from section 7.1 of <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3412r3.pdf" target="_blank">P3412R3</a>.
      I failed. Thanks to Barry for correcting me offline.</p>
    <blockquote>
      <p>Note that lambdas, which may contain any type of code including
        for instance goto labels, always contain this code inside
        matched braces, so any colons will be ignored when detecting the
        expression-field end. The same goes for <i>statement-expressions</i>
        of gcc and <i>blocks</i> of Clang.</p>
    </blockquote>
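    <p>A minimal sketch of the bracket-matching idea in the quoted claim,
      assuming a far simpler scanner than whatever P3412R3 actually
      specifies. The helper name <font face="monospace">find_field_end</font>
      is hypothetical, and the sketch deliberately ignores
      <font face="monospace">::</font>, the conditional operator, nested
      literals, and comments; it also does not check that the kinds of
      bracket match. It only shows why a colon inside a lambda body is not
      seen as the end of the expression field.</p>

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not taken from P3412R3.
// `field` is the text that follows the opening '{' of an extraction field.
// Returns the index of the first ':' or '}' found outside any (), [] or {}
// nesting, or npos if none is found or an unmatched ')' or ']' is hit first.
inline std::size_t find_field_end(std::string_view field) {
    int depth = 0;  // current nesting depth of (, [ and {
    for (std::size_t i = 0; i < field.size(); ++i) {
        const char c = field[i];
        if (c == '(' || c == '[' || c == '{') {
            ++depth;
        } else if (c == ')' || c == ']' || c == '}') {
            if (depth == 0) {
                // An unmatched '}' closes the field; any other unmatched
                // closer is treated as malformed input in this sketch.
                return c == '}' ? i : std::string_view::npos;
            }
            --depth;
        } else if (c == ':' && depth == 0) {
            return i;  // top-level colon: start of the format specifier
        }
    }
    return std::string_view::npos;
}
```

    <p>With this sketch, the label colon in
      <font face="monospace">[]{ goto done; done: return 1; }():x}</font>
      is skipped because it sits at depth 1, and the scan stops at the
      colon that follows the call parentheses.</p>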
    <blockquote type="cite">
      <div dir="auto">
        <div>
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
              <div>
                <p>It makes sense for digraphs to not be recognized as
                  part of an f-literal; that is consistent with string
                  literals. Within string literals, digraphs aren&#39;t
                  needed because UCNs can be used to specify characters
                  that are in the basic character set but not
                  necessarily available in the input source file
                  encoding. Normally, UCNs are not permitted to match
                  members of the basic character set in language syntax.
                  What about in extraction fields? Should the following
                  be well-formed?</p>
                <ul>
                  <li><font face="monospace">f&quot;The answer is {boolVar ?
                      17 \u003A 42}&quot;; // U+003A is &#39;:&#39;</font></li>
                </ul>
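                <p>The existing string-literal behaviour described above
                  can be checked in today&#39;s C++, independently of
                  either f-literal proposal. The sketch below assumes C++11
                  or later, where a UCN naming a basic character is
                  permitted inside a string literal; the function names are
                  illustrative only.</p>

```cpp
#include <cstddef>
#include <cstring>

// Outside literals, digraphs are alternative tokens:
// <: :> <% %> stand for [ ] { }.
inline int third_element() {
    int values<:3:> = <%1, 2, 3%>;
    return values<:2:>;
}

// Inside a string literal, "<:" is just two ordinary characters
// (plus the terminating null), not a '['.
static_assert(sizeof("<:") == 3, "digraphs are not recognized in string literals");

inline std::size_t digraph_text_length() {
    return std::strlen("<:");
}

// Inside a string literal, \u003A is accepted and denotes ':'.
inline bool ucn_matches_colon() {
    return std::strcmp("17 \u003A 42", "17 : 42") == 0;
}
```

                <p>The open question in this thread is which of these two
                  sets of rules applies between the braces of an extraction
                  field.</p>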
                <p>There appears to be a tension with regard to lexical
                  scanning of f-literals and parsing of extraction
                  fields. Can macros allow for use of digraphs and UCNs
                  in extraction fields? Should these be well-formed?</p>
                <ul>
                  <li><font face="monospace">#define COLON :<br>
                      f&quot;The answer is {boolVar ? 17 COLON 42}&quot;;</font></li>
                  <li><font face="monospace">#define LEFT_SQUARE_BRACKET
                      &lt;:<br>
                      #define RIGHT_SQUARE_BRACKET :&gt;<br>
                      f&quot;The size is {sizeof(int LEFT_SQUARE_BRACKET 42
                      RIGHT_SQUARE_BRACKET)}&quot;;</font></li>
                </ul>
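                <p>For comparison, the same macros are already well-formed
                  in ordinary expressions outside any f-literal, since
                  macro replacement happens before the expression is
                  parsed. A minimal sketch in standard C++ follows; the
                  function names <font face="monospace">pick</font> and
                  <font face="monospace">array_size</font> are illustrative
                  and do not come from either paper.</p>

```cpp
#include <cstddef>

#define COLON :
#define LEFT_SQUARE_BRACKET <:
#define RIGHT_SQUARE_BRACKET :>

// The macro supplies the ':' of the conditional operator.
inline int pick(bool boolVar) {
    return boolVar ? 17 COLON 42;
}

// The macros expand to the digraphs <: and :>, which are tokens
// equivalent to [ and ], so this is sizeof(int[42]) / sizeof(int).
inline std::size_t array_size() {
    return sizeof(int LEFT_SQUARE_BRACKET 42 RIGHT_SQUARE_BRACKET) / sizeof(int);
}
```

                <p>Whether the same expansion should take place inside an
                  extraction field is the question being asked here.</p>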
                <p>Tom.</p>
              </div>
            </blockquote>
          </div>
        </div>
        <div dir="auto">When parsing an embedded expression, it should
          be parsed as an expression (digraphs are recognized, UCNs naming
          ASCII characters are not, etc.). When
          parsing the string literal outside of embedded expression
          fragments, the rules of string literals should apply (no
          digraphs, etc.). </div>
      </div>
    </blockquote>
    <p>That matches what I was thinking and what led me to ask the
      question.</p>
    <blockquote type="cite">
      <div dir="auto">
        <div dir="auto"><br>
        </div>
        <div dir="auto">Anything else would be a great implementation
          burden and weird from a user standpoint.</div>
        <div dir="auto"><br>
        </div>
        <div dir="auto">Whether macros expand seems like an LEWG
          question (if we follow the model of these being nested
          expressions, then macro expansion should happen for
          consistency, but it depends on when an implementation can actually
          parse these fragments).</div>
      </div>
    </blockquote>
    <p>Yes, that is a LEWG question. But it is relevant to SG16 with
      regard to the ability to specify the &#39;[&#39;, &#39;]&#39;, &#39;{&#39;, &#39;}&#39;, &#39;#&#39;, and
      &#39;##&#39; tokens in extraction fields in source files that have an
      encoding that doesn&#39;t support them. If macro expansion doesn&#39;t
      occur, then we presumably need to support use of digraphs or UCNs
      in some way.</p></div></blockquote>I think you guys meant EWG (not LEWG). If there is no &#39;{&#39; or &#39;}&#39; in the source file encoding, then I think the only answer is that the feature cannot be used with such source files.</div><div class="gmail_quote gmail_quote_container"><br></div><div class="gmail_quote gmail_quote_container">-- HT</div></div>

