<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, May 25, 2021 at 5:19 PM Tom Honermann &lt;<a href="mailto:tom@honermann.net">tom@honermann.net</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <div>On 5/25/21 10:36 AM, Corentin Jabot via
      SG16 wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div dir="ltr"><br>
        </div>
        <br>
        <div class="gmail_quote">
          <div dir="ltr" class="gmail_attr">On Tue, May 25, 2021 at 3:08
            PM Tom Honermann via SG16 &lt;<a href="mailto:sg16@lists.isocpp.org" target="_blank">sg16@lists.isocpp.org</a>&gt;
            wrote:<br>
          </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
            <div>
              <div>
                <div>Reminder that this meeting is taking place
                  tomorrow.  The agenda remains the same.<br>
                </div>
                <div><br>
                </div>
                <div>Tom.<br>
                </div>
              </div>
              <div><br>
              </div>
              <div>On 5/16/21 5:23 PM, Tom Honermann via SG16 wrote:<br>
              </div>
              <blockquote type="cite">
                <p> </p>
                <div lang="x-unicode">
                  <p>SG16 will hold a telecon on Wednesday, May 26th at
                    19:30 UTC (<a href="https://www.timeanddate.com/worldclock/converter.html?iso=20210526T193000&amp;p1=1440&amp;p2=tz_pdt&amp;p3=tz_mdt&amp;p4=tz_cdt&amp;p5=tz_edt&amp;p6=tz_cest" target="_blank">timezone
                      conversion</a>).</p>
                  <p>The agenda is:</p>
                  <ul>
                    <li><a href="https://wg21.link/p2295r3" rel="nofollow" target="_blank">P2295R3: Support for
                        UTF-8 as a portable source file encoding</a></li>
                    <ul>
                      <li>Review updates intended to address prior SG16
                        feedback.</li>
                    </ul>
                    <li><a href="https://wg21.link/p2093r6" rel="nofollow" target="_blank">P2093R6: Formatted output</a></li>
                    <ul>
                      <li>Discuss locale-dependent character encoding
                        concerns.<br>
                      </li>
                    </ul>
                  </ul>
                  <p>Since we did not get to discuss P2295R3 at our last
                    telecon, it will again retain the top spot on the
                    agenda followed by P2093R6.  Thus, the agenda looks
                    much the same as for the last telecon (I dropped <a href="https://wg21.link/p2348r0" target="_blank">P2348R0</a> for now; we
                    won&#39;t realistically get to it).<br>
                  </p>
                </div>
              </blockquote>
            </div>
          </blockquote>
          <div><br>
          </div>
          <div>I will try to be there; no promises, though.</div>
        </div>
      </div>
    </blockquote>
    <p>Thanks for letting me know.  If you are unable to attend, and if
      you don&#39;t object, we&#39;ll still review P2295R3 and carefully record
      any requested changes so that we can keep making progress on this
      paper.<br></p></div></blockquote><div><br></div><div>Given my disagreement with some recent suggestions, that might be counterproductive!</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>
    </p>
    <p>Tom.<br>
    </p>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_quote">
          <div>Btw I would love feedback on P2348. The paper is little
            more than wording, so email might be as good an avenue for
            such feedback, or a better one :)</div>
          <div><br>
          </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
            <div>
              <blockquote type="cite">
                <div lang="x-unicode">
                  <p> </p>
                  <p>With regard to <a href="https://wg21.link/p2093r6" target="_blank">P2093R6</a>,
                    the current status is unchanged; LEWG has referred
                    the paper back to SG16 for further discussion;
                    please see the LEWG meeting minutes <a href="https://wiki.edg.com/bin/view/Wg21telecons2021/P2093#Library-Evolution-2021-04-06" target="_blank">here</a>. 
                    Specifically, LEWG would benefit from additional
                    analysis of <a href="http://lists.isocpp.org/lib-ext/2021/03/18189.php" target="_blank">previously
                      deferred questions</a> regarding character
                    encoding concerns, transcoding requirements (or the
                    lack thereof), and the ensuing consequences (or lack
                    thereof).<br>
                  </p>
                  <ol>
                    <li>How errors in transcoding should be handled. 
                      E.g., when transcoding from UTF-8 to a UTF-16
                      based console interface and the UTF-8 input is not
                      well-formed.</li>
                    <li>The choice to base behavior on the compile-time
                      choice of literal encoding.  An implication of the
                      current proposal is that a program that contains
                      only ASCII characters in string literals will
                      change behavior depending on whether the literal
                      encoding is UTF-8 vs ASCII (or some other ASCII
                      derived encoding).</li>
                    <li>Whether transcoding to the console interface
                      encoding should be performed when the literal
                      encoding is not UTF-8.</li>
                    <li>What the implications are for future support of
                      <tt>std::print(&quot;{} {} {} {}&quot;, L&quot;Wide
                        text&quot;, u8&quot;UTF-8 text&quot;, u&quot;UTF-16 text&quot;, U&quot;UTF-32
                        text&quot;)</tt>.<br>
                    </li>
                  </ol>
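                  <p>Question 1 can be made concrete with a minimal
                    sketch of one possible policy: replace ill-formed
                    UTF-8 with U+FFFD while transcoding to UTF-16.  This
                    is only an illustration, not something P2093R6
                    specifies; the names <tt>Decoded</tt>, <tt>decode_utf8</tt>,
                    <tt>utf16_length_lossy</tt>, and <tt>to_utf16_lossy</tt>
                    are hypothetical, and replacing each offending byte
                    individually is a simplifying assumption.</p>

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper type for illustration; not part of P2093R6.
struct Decoded {
    char32_t code_point;  // U+FFFD if the input at this position was ill-formed
    std::size_t length;   // number of bytes consumed (always >= 1)
    bool ok;              // false if a replacement was substituted
};

constexpr char32_t replacement = 0xFFFD;

// Decode one code point from s starting at pos (precondition: pos < s.size()).
// On any error, consume exactly one byte and report U+FFFD (assumed policy).
constexpr Decoded decode_utf8(std::string_view s, std::size_t pos) {
    const unsigned char b0 = static_cast<unsigned char>(s[pos]);
    if (b0 < 0x80) return {b0, 1, true};  // ASCII
    std::size_t len = 0;
    char32_t cp = 0;
    char32_t min = 0;  // smallest code point allowed for this length
    if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
    else return {replacement, 1, false};  // stray continuation or invalid lead
    if (s.size() - pos < len) return {replacement, 1, false};  // truncated
    for (std::size_t j = 1; j < len; ++j) {
        const unsigned char b = static_cast<unsigned char>(s[pos + j]);
        if ((b & 0xC0) != 0x80) return {replacement, 1, false};
        cp = (cp << 6) | (b & 0x3F);
    }
    // Reject overlong forms, surrogates, and values above U+10FFFF.
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return {replacement, 1, false};
    return {cp, len, true};
}

// Number of UTF-16 code units the lossy transcoding produces.
constexpr std::size_t utf16_length_lossy(std::string_view s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size();) {
        const Decoded d = decode_utf8(s, i);
        n += d.code_point > 0xFFFF ? 2 : 1;
        i += d.length;
    }
    return n;
}

// Transcode UTF-8 to UTF-16, substituting U+FFFD for each ill-formed byte.
std::u16string to_utf16_lossy(std::string_view s) {
    std::u16string out;
    out.reserve(utf16_length_lossy(s));
    for (std::size_t i = 0; i < s.size();) {
        const Decoded d = decode_utf8(s, i);
        if (d.code_point > 0xFFFF) {
            // Encode as a surrogate pair.
            const char32_t v = d.code_point - 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(d.code_point));
        }
        i += d.length;
    }
    return out;
}
```

                  <p>Under this policy, <tt>to_utf16_lossy(&quot;\xA4\x55&quot;)</tt>
                    yields U+FFFD followed by <tt>U</tt>.  Throwing an
                    exception or dropping the offending bytes are
                    equally plausible policies, which is exactly the
                    open question.</p>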
                  <p>At our last telecon, we focused on how to handle
                    ill-formed inputs, but did not much discuss how such
                    inputs arise.  Now that <a href="https://cplusplus.github.io/LWG/issue3547" target="_blank">LWG3547</a>
                    has been effectively (though not officially)
                    resolved by <a href="https://wg21.link/p2372r1" target="_blank">P2372R1</a>,
                    we have a concrete example of how the <tt>std::print()</tt>
                    facility itself can produce ill-formed input
                    (assuming that <tt>std::print()</tt> transcodes all
                    inputs using the same encoding).  I would like to
                    start with this example as I think it is fundamental
                    to how we choose to answer the above questions.<br>
                  </p>
                  <blockquote>
                    <p><tt>std::print(&quot;{:L%p}\n&quot;,
                        std::chrono::system_clock::now().time_since_epoch());</tt></p>
                  </blockquote>
                  <p>At issue is the encoding used by chrono formatters
                    specified with the <tt>L</tt> option to request a
                    locale specific form.  The example above contains
                    the <tt>%p</tt> specifier with the <tt>L</tt>
                    option.  In a Chinese locale the desired translation
                    of &quot;PM&quot; is &quot;下午&quot;, but the locale will provide the
                    translation in the locale encoding.  As specified in
                    P2093R6, if the literal encoding is UTF-8, then <tt>std::print()</tt>
                    will expect the translation to be provided in UTF-8,
                    but if the locale is not UTF-8-based (e.g., Big5;
                    perhaps Shift-JIS for the Japanese 午後 translation),
                    then the result is mojibake.</p>
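                  <p>The mismatch can be checked mechanically.  The
                    sketch below is a minimal UTF-8 well-formedness test
                    applied to &quot;下午&quot; in both encodings.  The
                    function and variable names are hypothetical, and the
                    Big5 byte values are an assumption not taken from
                    P2093R6 or this thread.</p>

```cpp
#include <cstddef>
#include <string_view>

// Returns true if s is well-formed UTF-8: no stray continuation bytes,
// truncated sequences, overlong forms, surrogates, or values above U+10FFFF.
constexpr bool is_well_formed_utf8(std::string_view s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len = 1;
        char32_t cp = b0;
        char32_t min = 0;  // smallest code point allowed for this length
        if (b0 < 0x80)                { /* ASCII, nothing more to read */ }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
        else return false;  // continuation byte or invalid lead byte
        if (s.size() - i < len) return false;  // truncated sequence
        for (std::size_t j = 1; j < len; ++j) {
            const unsigned char b = static_cast<unsigned char>(s[i + j]);
            if ((b & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += len;
    }
    return true;
}

// "下午" encoded as UTF-8 (U+4E0B U+5348).
constexpr std::string_view pm_utf8 = "\xE4\xB8\x8B\xE5\x8D\x88";
// The same text in Big5 (assumed byte values, for illustration only).
constexpr std::string_view pm_big5 = "\xA4\x55\xA4\xC8";

static_assert(is_well_formed_utf8(pm_utf8));
static_assert(!is_well_formed_utf8(pm_big5));
```

                  <p>The assumed Big5 bytes begin with 0xA4, which in
                    UTF-8 can only be a continuation byte, so a consumer
                    that expects UTF-8 has to reject, replace, or
                    misrender them.</p>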
                  <p>These are possible directions we can investigate to
                    resolve the encoding concerns.</p>
                  <ul>
                    <li>Specialize <a href="https://en.cppreference.com/w/cpp/locale/locale" target="_blank"><tt>std::locale</tt>
                        facets</a> and related I/O manipulators like <a href="https://en.cppreference.com/w/cpp/io/manip/put_time" target="_blank"><tt>std::put_time()</tt></a>
                      for <tt>char8_t</tt>.  This would allow <tt>std::print()</tt>
                      to, when the literal encoding is UTF-8, opt-in to
                      use of the UTF-8/<tt>char8_t</tt> facets and I/O
                      manipulators.<br>
                    </li>
                    <li>When the literal encoding is UTF-8, stipulate
                      that running the program in a non-UTF-8 based
                      locale is non-conforming.  This would effectively
                      require MSVC programmers who build code with the
                      <tt>/utf-8</tt> option to also <a href="https://docs.microsoft.com/en-us/windows/uwp/design/globalizing/use-utf8-code-page" target="_blank">force
                        selection of a UTF-8 code page via a manifest</a>
                      and to require use of Windows 10 version 1903 or later.<br>
                    </li>
                    <li>When the literal encoding is UTF-8, specify that
                      locale-dependent translations in a non-UTF-8
                      encoding be implicitly transcoded to UTF-8 (such transcoding should
                      never result in errors except perhaps for memory
                      allocation failures).<br>
                    </li>
                    <li>Drop the special case handling for the literal
                      encoding being UTF-8 and specify that, when
                      bypassing a stream to write directly to the
                      console, the output be implicitly transcoded
                      from the current locale-dependent encoding
                      (whatever it is) to the console encoding (UTF-8). </li>
                  </ul>
                  <p>Tom.</p>
                </div>
                <br>
              </blockquote>
              <p><br>
              </p>
            </div>
            -- <br>
            SG16 mailing list<br>
            <a href="mailto:SG16@lists.isocpp.org" target="_blank">SG16@lists.isocpp.org</a><br>
            <a href="https://lists.isocpp.org/mailman/listinfo.cgi/sg16" rel="noreferrer" target="_blank">https://lists.isocpp.org/mailman/listinfo.cgi/sg16</a><br>
          </blockquote>
        </div>
      </div>
      <br>
    </blockquote>
    <p><br>
    </p>
  </div>

</blockquote></div></div>

