sg16: [SG16] Agenda for the 2021-03-24 SG16 telecon

From: Tom Honermann <tom_at_[hidden]>
Date: Tue, 16 Mar 2021 10:59:34 -0400

SG16 will hold a telecon on Wednesday, March 24th at 19:30 UTC (timezone
conversion
<https://www.timeanddate.com/worldclock/converter.html?iso=20210324T193000&p1=1440&p2=tz_pdt&p3=tz_mdt&p4=tz_cdt&p5=tz_edt&p6=tz_cet>).

*For participants in North America, please note that daylight saving
time went into effect this past weekend, so this telecon will start one
hour later in local time than our last telecon (Mexico doesn't observe
DST until April 4th).*

The agenda is:

  * Continue discussion from the last telecon concerning:
      o D2314R1: Character sets and encodings
        <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html>
      o D2297R1: Wording improvements for encodings and character sets
        <https://isocpp.org/files/papers/D2297R1.pdf>
  * Discuss priorities and goals for C++23.

For D2314R1
<https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html> and
D2297R1 <https://isocpp.org/files/papers/D2297R1.pdf>, discussion will
be limited to new information that might help to break the stalemate
regarding use of an abstract character set or UCS scalar values as the
specification tool for describing translation. If consensus is not
reached, we'll poll forwarding D2314R1
<https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html> with
direction that EWG and/or CWG choose the wording mechanism.

Per P1000 <https://wg21.link/p1000>, papers targeting C++23 must be
forwarded by EWG/LEWG to CWG/LWG by the February 2022 meeting
(Portland). However, the deadline for initial papers proposing new
language features is around November 2021. Time is running short, and
competition for time in EWG/LEWG will increase.

The following lists the current state of SG16-related papers and our
C++23 effort to date. This is presented as food for thought. What story
does this tell? How will that story be received by the C++ community?
What should we do with our remaining time to either strengthen or change
that story? What can we realistically do to bring more direct benefits
to the C++ community? It may be interesting to review what we were
thinking about during our March 13th, 2019 telecon
<https://github.com/sg16-unicode/sg16-meetings/blob/master/README-2019.md#march-13th-2019>.

These papers have been accepted for C++23:

  * P2029 <https://wg21.link/p2029>: Proposed resolution for core issues
    411, 1656, and 2333; numeric and universal character escapes in
    character and string literals

These papers have been approved by EWG and are in the pipeline for CWG:

  * P1949 <https://wg21.link/p1949>: C++ Identifier Syntax using Unicode
    Standard Annex 31
  * P2201 <https://wg21.link/p2201>: Mixed string literal concatenation
  * P2223 <https://wg21.link/p2223>: Trimming whitespaces before line
    splicing

These papers have been approved by SG16 and are in the pipeline for
EWG/LEWG:

  * P1885 <https://wg21.link/p1885>: Naming Text Encodings to Demystify Them
  * P2093 <https://wg21.link/p2093>: Formatted output
  * P2246 <https://wg21.link/p2246>: Character encoding of diagnostic text
  * P2316 <https://wg21.link/p2316>: Consistent character literal encoding

These papers are in the pipeline for EWG/LEWG, but require a revision to
make progress:

  * P2071 <https://wg21.link/p2071>: Named universal character escapes

These papers are currently active in SG16:

  * D2314R1
    <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html>:
    Character sets and encodings
  * D2297R1 <https://isocpp.org/files/papers/D2297R1.pdf>: Wording
    improvements for encodings and character sets

With the above summary of our work to date in mind, the following lists
provide some options for what we could work on next.

These are existing papers available for SG16 to prioritize (some of
these, such as P1629, are awaiting revisions):

  * P1628 <https://wg21.link/p1628>: Unicode character properties
  * P1629 <https://wg21.link/p1629>: Standard Text Encoding
  * P1729 <https://wg21.link/p1729>: Text Parsing
  * P1859 <https://wg21.link/p1859>: Standard terminology for execution
    character set encodings
  * P1953 <https://wg21.link/p1953>: Unicode Identifiers And Reflection
  * P2295 <https://wg21.link/p2295>: Correct UTF-8 handling during phase
    1 of translation

And finally, here are some ideas that have been discussed, but for which
we do not currently have papers:

  * UTF-8 as a portable source file encoding (the paper Tom started and
    has long intended to complete).
  * Requiring wchar_t to represent all members of the execution wide
    character set does not match existing practice
    <https://github.com/sg16-unicode/sg16/issues/9>
  * WG21 P1854: Source to Execution encoding conversion should not lead
    to loss of information <https://github.com/sg16-unicode/sg16/issues/50>
  * Deprecate std::regex <https://github.com/sg16-unicode/sg16/issues/57>
  * Make wide multicharacter character literals ill-formed
    <https://github.com/sg16-unicode/sg16/issues/65>
  * Improve portable ingestion of command-line arguments
    <https://github.com/sg16-unicode/sg16/issues/66>
  * Alias barriers; a replacement for the ICU hack
    <https://github.com/sg16-unicode/sg16/issues/67>

Our efforts will need to be balanced with any effort expended to align
C23 with changes made for C++20 and C++23:

  * WG14 N2231: char8_t: A type for UTF-8 characters and strings
    <https://github.com/sg16-unicode/sg16/issues/5>
  * WG14: Make char16_t/char32_t string literals be UTF-16/32
    <https://github.com/sg16-unicode/sg16/issues/54>
  * WG14: Improve support for Unicode characters in identifiers
    <https://github.com/sg16-unicode/sg16/issues/56>
  * WG14: numerical & universal character escapes in char & string
    literals <https://github.com/sg16-unicode/sg16/issues/63>
  * WG14: Trimming whitespace before line splicing
    <https://github.com/sg16-unicode/sg16/issues/64>

Tom.

Received on 2021-03-16 09:59:37