On Tue, Mar 16, 2021 at 10:59 AM Tom Honermann via SG16 <sg16@lists.isocpp.org> wrote:

SG16 will hold a telecon on Wednesday, March 24th at 19:30 UTC (timezone conversion).

For participants in North America, please note that daylight saving time went into effect this past weekend, so this telecon will start one hour later than our last telecon (Mexico doesn't observe DST until April 4th).

The agenda is:

Continue discussion from the last telecon concerning:

D2314R1: Character sets and encodings

D2297R1: Wording improvements for encodings and character sets

Discuss priorities and goals for C++23.

For D2314R1 and D2297R1, discussion will be limited to new information that might help to break the stalemate regarding use of an abstract character set or UCS scalar values as the specification tool for describing translation. If consensus is not reached, we'll poll forwarding D2314R1 with direction that EWG and/or CWG choose the wording mechanism.

Per P1000, papers targeting C++23 must be forwarded by EWG/LEWG to CWG/LWG by the February, 2022 meeting (Portland). However, the deadline for initial papers proposing new language features is ~November, 2021. Time is running short, and competition for time in EWG/LEWG will increase.

The following lists the current state of SG16 related papers and our C++23 effort to date. This is presented as food for thought. What story does this tell? How will that story be received by the C++ community? What should we do with our remaining time to either strengthen or change that story? What can we realistically do to bring more direct benefits to the C++ community? It may be interesting to review what we were thinking about during our March 13th, 2019 telecon.

These papers have been accepted for C++23:

P2029: Proposed resolution for core issues 411, 1656, and 2333; numeric and universal character escapes in character and string literals

These papers have been approved by EWG and are in the pipeline for CWG:

P1949: C++ Identifier Syntax using Unicode Standard Annex 31

P2201: Mixed string literal concatenation

P2223: Trimming whitespaces before line splicing

These papers have been approved by SG16 and are in the pipeline for EWG/LEWG:

P1885: Naming Text Encodings to Demystify Them

P2093: Formatted output

P2246: Character encoding of diagnostic text

P2316: Consistent character literal encoding

These papers are in the pipeline for EWG/LEWG, but require a revision to make progress:

P2071: Named universal character escapes

These papers are currently active in SG16:

D2314R1: Character sets and encodings

D2297R1: Wording improvements for encodings and character sets

With the above summary of what we have been doing in mind, the following lists provide some options for what we could work on next.

These are existing papers available for SG16 to prioritize (some of these, such as P1629, are awaiting revisions):

P1628: Unicode character properties

P1629: Standard Text Encoding

P1729: Text Parsing

P1859: Standard terminology for execution character set encodings

P1953: Unicode Identifiers And Reflection

P2295: Correct UTF-8 handling during phase 1 of translation

And finally, here are some ideas that have been discussed, but that we do not currently have papers covering:

UTF-8 as a portable source file encoding (the paper Tom started and has long intended to complete).

Requiring wchar_t to represent all members of the execution wide character set does not match existing practice

WG21 P1854: Source to Execution encoding conversion should not lead to loss of information

Deprecate std::regex

Make wide multicharacter character literals ill-formed

Improve portable ingestion of command-line arguments

Alias barriers; a replacement for the ICU hack

Our efforts will need to be balanced with any effort expended to align C23 with changes made for C++20 and C++23:

WG14 N2231: char8_t: A type for UTF-8 characters and strings

WG14: Make char16_t/char32_t string literals be UTF-16/32

WG14: Improve support for Unicode characters in identifiers

WG14: numerical & universal character escapes in char & string literals

WG14: Trimming whitespace before line splicing

Tom.
