<div dir="ltr">Just to clarify: I wouldn&#39;t say that P2216 introduces a new concept. If you look at P0645 and the discussions around it, locale-independent formatting has always been a part of the design, and it covers both format string parsing and formatting. Moreover, the ability to do compile-time checks via constexpr formatter&lt;T&gt;::parse functions was added per LEWG request back in R1 of P0645, although admittedly it could be better specified. I agree that if there are any concerns, they can be addressed via LWG issues.<div><br></div><div>&gt; I am not aware of implementation experience for this paper in environments where characters significant to the interpretation of the format string are not locale-invariant.<br><br>There is implementation experience with statically detecting UTF-8 in the std::format implementation in Microsoft/STL: <a href="https://github.com/microsoft/STL/issues/1820">https://github.com/microsoft/STL/issues/1820</a>. P2216 is also fully implemented in the fmt library, and compile-time checks have been available there in a different form (because of the lack of consteval) for several years.<br><div><br></div><div>Cheers,</div><div>Victor</div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Jun 6, 2021 at 10:13 PM Tom Honermann &lt;<a href="mailto:tom@honermann.net">tom@honermann.net</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <div>On 6/6/21 8:15 PM, Hubert Tong wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div>Just hoping to clarify my own understanding of this, and
          perhaps making an observation in case this is helpful to
          others... I know that some of this has been discussed in
          specific circles.<br>
        </div>
        <div><br>
        </div>
        <div>It is my understanding that this paper introduces a concept
          of compile-time conversion of strings associated with the
          `char` type (and also the same for `wchar_t`) into a &quot;format
          string&quot;. I believe this necessitates attaching semantics to
          the &quot;character&quot; values (at compile time) to recognize
          characters such as the left brace (`{`). It is observed that
          the coded value of the left brace character is locale-variant
          in certain environments. Thus, the paper is establishing that
          format strings are not parsed based on locale. I am aware that
          the intent is to specify that the interpretation is based on
          the encoding used for literals; however, at this time, the
          wording does not indicate that intent.<br>
        </div>
      </div>
    </blockquote>
    Thank you for raising this issue, Hubert.  I agree that there is a
    wording oversight here.  This concern seems like it can be addressed
    via an LWG issue.<br>
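    <p>To make the point about recognizing the left brace at compile
      time concrete, here is a minimal sketch. The helper name
      count_replacement_fields is hypothetical and appears in neither
      P2216 nor any implementation; the sketch ignores nested fields
      and only shows that comparing against the character literals
      '{' and '}' ties parsing to the literal encoding rather than to
      a run-time locale.</p>

```cpp
// Hypothetical helper for illustration; not part of std::format or P2216.
// Counts replacement fields in a null-terminated format string and returns
// -1 when braces are unbalanced. Nested fields such as "{:{}}" are not handled.
// '{' and '}' are character literals, so the values compared against are the
// code units chosen by the literal encoding at translation time.
constexpr int count_replacement_fields(const char* fmt) {
    int fields = 0;
    for (int i = 0; fmt[i] != '\0'; ++i) {
        if (fmt[i] == '{') {
            if (fmt[i + 1] == '{') { ++i; continue; }  // "{{" is an escaped brace
            while (fmt[i] != '}') {                    // scan to the closing brace
                if (fmt[i] == '\0') return -1;         // field was never closed
                ++i;
            }
            ++fields;
        } else if (fmt[i] == '}') {
            if (fmt[i + 1] == '}') { ++i; continue; }  // "}}" is an escaped brace
            return -1;                                 // stray closing brace
        }
    }
    return fields;
}

// The checks run entirely at compile time.
static_assert(count_replacement_fields("{} + {} = {}") == 3, "three fields");
static_assert(count_replacement_fields("{{literal}}") == 0, "escapes only");
static_assert(count_replacement_fields("{:8") == -1, "unterminated field");
```

    <p>Because the function is constexpr, a malformed literal can be
      rejected with static_assert before any locale is consulted.</p>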
    <blockquote type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>I am not aware of implementation experience for this paper
          in environments where characters significant to the
          interpretation of the format string are not locale-invariant.
          There is, however, reason to believe that an implementation
          can be realistically deployed to such environments while
          giving some ability of the user to choose the text encoding
          under which format strings are parsed. As it is, the paper
          uses <i>format-string</i> and <i>wformat-string</i> as
          exposition-only types in the signature of the `format`
          functions. It is possible for an implementation to version
          these functions (across translation unit boundaries) through
          embedding the text encoding information into these types. A
          mechanism such as std::text_encoding::literal().mib() from
          P1885R5 (which has not yet advanced to plenary) could be used.</div>
      </div>
    </blockquote>
    <p>That mechanism does not appear to be an option for the <tt>vformat_to()</tt>
      or <tt>vformat()</tt> overloads since their signatures do not
      include <i>format-string</i> or <i>wformat-string</i>.  It
      doesn&#39;t look to me like that information can be smuggled through
      the types of <tt>basic_format_args</tt> or <tt>basic_format_context</tt>
      either, though perhaps they could be used to store a value that
      indicates the literal encoding.<br>
    </p>
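    <p>As a rough sketch of the "store a value" alternative, assuming
      a type-erased argument store with room for one extra field:
      every name below (literal_encoding, erased_format_args,
      left_brace, opens_replacement_field) is hypothetical and not
      taken from the standard, P2216, or P1885. Only two encodings are
      modeled, UTF-8 and EBCDIC code page 1047, where the left brace
      is 0x7B and 0xC0 respectively.</p>

```cpp
// Every name here is hypothetical; nothing below is taken from the standard,
// P2216, or P1885.
enum class literal_encoding { utf8, ebcdic_1047 };

// Stand-in for a type-erased argument store that also records, as a value,
// the literal encoding in effect where the format string was written.
struct erased_format_args {
    literal_encoding encoding;
};

// Code unit that represents '{' in each of the two modeled encodings.
constexpr unsigned char left_brace(literal_encoding e) {
    return e == literal_encoding::utf8 ? 0x7B : 0xC0;
}

// A vformat-style parser would consult the stored value at run time,
// since the string_view parameter itself carries no encoding information.
constexpr bool opens_replacement_field(unsigned char unit, erased_format_args args) {
    return unit == left_brace(args.encoding);
}

static_assert(opens_replacement_field(0x7B, {literal_encoding::utf8}),
              "0x7B is the left brace in UTF-8");
static_assert(opens_replacement_field(0xC0, {literal_encoding::ebcdic_1047}),
              "0xC0 is the left brace in EBCDIC 1047");
static_assert(!opens_replacement_field(0x7B, {literal_encoding::ebcdic_1047}),
              "0x7B is not a brace in EBCDIC 1047");
```

    <p>The trade-off in this sketch is that the encoding becomes a
      run-time value rather than part of a type, so a mismatch cannot
      be diagnosed through overload resolution or linkage.</p>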
    <p>[P1885 is currently scheduled for consideration by LEWG during
      its 2021-08-03 telecon]</p>
    <blockquote type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>The above approach is perhaps more limiting than strictly
          necessary upon extensions that allow the translation of string
          literals to be changed within a translation unit. It is noted
          that P1885R5 exposes the literal encoding as a consteval
          function, which is compatible with context-sensitive
          evaluation by the implementation. In case there is an appetite
          to allow for such context-sensitivity for format strings, it
          is probably the case that updating the text to allow for
          exposition-only extra parameters in the signature is purely a
          specification matter and does not affect implementations where
          such scenarios do not occur. It is also rather likely that
          implementations which do employ such extra parameters are
          conforming anyway (because the extra parameters are only
          observable when a user applies an extension). Nevertheless,
          the paper may be just the beginning of a number of changes
          that are candidates for being considered retroactive to C++20.<br>
        </div>
      </div>
    </blockquote>
    <p>It looks to me like construction of a <tt>basic_format_context</tt>
      specialization is effectively unspecified due to the lack of
      constructors and the presence of exposition-only data members. 
      Perhaps more of it can be specified as exposition-only.</p>
    <p>Incidentally, I think the specification of <tt>basic_format_context</tt>
      may be missing an exposition-only <tt>std::locale</tt> data
      member corresponding to any locale passed to a formatting function.<br>
    </p>
    <blockquote type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>TL;DR: The paper sets a direction (but does not actually
          spell out that it does) of using the encoding associated with
          literal translation for parsing format strings. The work
          around improving the management and handling of said encodings
          is still ongoing; therefore, where this paper leads us is not
          as clear as it could be given additional time. Nevertheless,
          it is probably the case that further incremental improvements
          can be made on top of this paper without compatibility
          breakage for implementations that choose to deploy earlier. In
          certain environments, quality-of-implementation around this
          paper may be dependent on additional improvements to the
          specification. Since this paper is being considered to be
          retroactive to C++20, it is reasonable to expect that
          improvements of the aforementioned kind would also be
          considered for retroactive inclusion as they are discovered.<br>
        </div>
      </div>
    </blockquote>
    <p>I agree, though the words &quot;probably the case&quot; give me pause. 
      Regardless, at least for me, this does not translate to a desire
      to delay adopting this paper.<br>
    </p>
    <p>Tom.<br>
    </p>
    <p><br>
    </p>
  </div>

</blockquote></div>

