<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>SG16 will hold a telecon on Wednesday, October 12th, at 19:30 UTC
      (<a
href="https://www.timeanddate.com/worldclock/converter.html?iso=20221012T193000&amp;p1=1440&amp;p2=tz_pdt&amp;p3=tz_mdt&amp;p4=tz_cdt&amp;p5=tz_edt&amp;p6=tz_cest"
        moz-do-not-send="true">timezone conversion</a>).</p>
    <p>The agenda is:</p>
    <ul>
      <li>A presentation by Michael Kuperstein regarding i18n and l10n
        and existing practice
        in the industry.</li>
      <li>NB comment processing.</li>
    </ul>
    <p>INCITS has made US NB comments available to its members. I
      reviewed the list and identified the following as ones that I
      believe SG16 should establish a position on. There are other
      comments that are related to papers SG16 has previously discussed,
      but in those cases, I believe the concerns raised do not require
      SG16 input.</p>
    <p>Due to duplicated comments in the list of US comments, it is
      possible that the comment identifiers below will change.</p>
    <h1>US-2: <a moz-do-not-send="true"
        href="http://eel.is/c++draft/defns.multibyte">[defns.multibyte]</a></h1>
    <p> The notion of an "execution character set" is no longer given
      prominence in the Draft standard, aside from some notes about its
      relationship to the concept as defined by C, and clarifying that
      certain character encodings are unrelated to this character set.
      This makes it a questionable choice for use in the definition of
      "multibyte character".
    </p>
    <p><b>Proposed change:</b></p>
    <p>Change the definition of "multibyte character" to use a character
      encoding with a more definite specification given by the Standard.</p>
    <h1>US-38: <a moz-do-not-send="true"
        href="https://eel.is/c++draft/format.string.escaped">[format.string.escaped]</a></h1>
    <p>The subject subclause describes how characters or strings are
      "escaped" to be formatted more suitably "for debugging or for
      logging".<br>
      <br>
      The actual suitability for debugging or for logging depends on the
      needs of the application, and there is a conflict between
      formatting for human readability of the textual content and
      formatting for clarity and fidelity of encoding nuances. Indeed,
      for the latter, there can still be (for stateful encodings) a
      conflict between formatting for human visual inspection versus
      formatting for machine consumption of the output sequence as a C++
      string/character literal.<br>
      <br>
      The current design introduces extensions to the API and to the
      format string syntax that assume that there is one specific
      default that should be chosen "for debugging or for logging". The
      reasoning behind the chosen default and the extensibility of the
      current design do not appear to be sufficiently explored.<br>
      <br>
      Note 1:<br>
      An example, for Unicode encodings, of a choice between
      prioritizing human readability of the textual content and
      prioritizing visual clarity of encoding nuances is the treatment of
      characters having Unicode property Grapheme_Extend=Yes. The
      current design favors visual clarity of encoding nuances by
      outputting such characters as escape sequences.<br>
      <br>
      Note 2:<br>
      For stateful encodings, the lack of return to the initial shift
      state at the end of the sequence cannot be represented using a C++
      string/character literal unless a prior shift sequence from the
      initial shift state is rendered via escape sequence(s). It is not
      clear that scanning forward is always an option (nor is
      it clear that doing so is desirable).<br>
    </p>
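    <p>To make the trade-off in Note 1 concrete, here is a minimal sketch
      of the two priorities. It is not the wording or any implementation
      of [format.string.escaped]: the names <code>escape_sketch</code>,
      <code>is_grapheme_extend_subset</code>, and <code>append_utf8</code>
      are hypothetical, the output is assumed to be UTF-8, and only the
      Combining Diacritical Marks block (U+0300 to U+036F) stands in for
      the full set of Grapheme_Extend=Yes characters.</p>

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the Unicode property Grapheme_Extend=Yes.
// Assumption: only U+0300..U+036F (Combining Diacritical Marks) is covered;
// the real property contains many more code points.
bool is_grapheme_extend_subset(char32_t c) {
    return c >= 0x0300 && c <= 0x036F;
}

// Append the UTF-8 encoding of c to out (assumes c is a valid scalar value).
void append_utf8(std::string& out, char32_t c) {
    if (c < 0x80) {
        out += static_cast<char>(c);
    } else if (c < 0x800) {
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (c >> 18));
        out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
}

// Hypothetical helper that quotes and escapes text.
// escape_extenders == true  favors visual clarity of encoding nuances:
//                           combining marks become \u{...} escapes.
// escape_extenders == false favors human readability:
//                           combining marks are emitted as-is.
std::string escape_sketch(const std::u32string& text, bool escape_extenders) {
    std::string out = "\"";
    for (char32_t c : text) {
        if (c == U'"') {
            out += "\\\"";
        } else if (c == U'\\') {
            out += "\\\\";
        } else if (c == U'\n') {
            out += "\\n";
        } else if (escape_extenders && is_grapheme_extend_subset(c)) {
            char buf[16];
            std::snprintf(buf, sizeof buf, "\\u{%x}", static_cast<unsigned>(c));
            out += buf;
        } else {
            append_utf8(out, c);
        }
    }
    out += '"';
    return out;
}
```

    <p>With this sketch, <code>escape_sketch(U"e\u0301", true)</code>
      yields the eleven characters <code>"e\u{301}"</code>, which makes
      the decomposed encoding visible, while passing <code>false</code>
      yields the letter followed by the raw combining accent, which a
      terminal would typically render as a single accented letter. A
      single fixed default can only serve one of these two readers.</p>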
    <p><b>Proposed change:</b></p>
    <p>Narrow the purported scope and affirm the design choices of the
      default behavior:<br>
      Modify "logging" to "technical logging" and spell out the
      priorities in order in the description (this has the benefit of
      clearly communicating intention and providing guidance for
      implementation choices).<br>
    </p>
    <ol>
      <li>The output is intended to be a C++ string/character literal
        that reproduces the encoded sequence. (This seems to be taken
        for granted and not made explicit in the current draft.)</li>
      <li>Prefer visually distinguishing between different methods of
        encoding "equivalent" textual content.<br>
      </li>
    </ol>
    <p>Make any adjustments necessary to the API or the format string
      syntax associated with "escaped" strings to allow for future
      additions for alternative escaping.</p>
    <h1>US-64: <a moz-do-not-send="true"
        href="https://eel.is/c++draft/uaxid.pattern">[uaxid.pattern]</a></h1>
    <p>The Unicode Consortium has clarified that the pattern whitespace and
      pattern syntax rules apply to the lexing and parsing of computer
      languages.</p>
    <p><b>Proposed change:</b></p>
    <p>Replace with "UAX#31 describes how formal languages such as
      computer languages should describe and implement their use of
      whitespace and syntactically significant characters during the
      processes of lexing and parsing. C++ does not claim conformance
      with this requirement." <br>
    </p>
    <p>Tom.<br>
    </p>
  </body>
</html>

