SG16

Subject: Re: Agenda for the 2021-03-24 SG16 telecon
From: Tom Honermann (tom_at_[hidden])
Date: 2021-03-23 17:42:32


Reminder that this telecon will be taking place tomorrow.

Please review the summary from our last telecon
<https://github.com/sg16-unicode/sg16-meetings#march-10th-2021>.

With regard to C++23 goals and priorities, what I plan to do is to
briefly review suggestions for things we could work on (including items
listed below, others already discussed in email, and any suggestions
offered during the meeting).  An item that is nominated for
prioritization by someone present and for which a champion willing to
progress it is identified will become a candidate for prioritization
polling.  I expect this will yield ~12 candidates.  We'll then poll
those with the expectation of identifying up to six items to prioritize
with good consensus.  To poll, everyone will get 4 votes (maybe more,
maybe fewer, depending on the item count) to cast for their top 4
candidates.  If anyone has concerns about this, please feel free to
express them and suggest an alternative method.

The WG14-related items that we are tracking (listed earlier in this
email thread) will not be included in the polling above.  That is only
because I don't anticipate spending SG16 time on them; champions are
encouraged to progress them through SG22 and WG14 channels.

Here is an updated list of items to consider.

  * P1628 <https://wg21.link/p1628>: Unicode character properties
  * P1629 <https://wg21.link/p1629>: Standard Text Encoding
  * P1729 <https://wg21.link/p1729>: Text Parsing
  * P1859 <https://wg21.link/p1859>: Standard terminology for execution
    character set encodings
  * P1953 <https://wg21.link/p1953>: Unicode Identifiers And Reflection
  * P2071 <https://wg21.link/p2071>: Named universal character escapes
  * P2295 <https://wg21.link/p2295>: Correct UTF-8 handling during phase
    1 of translation
  * Requiring wchar_t to represent all members of the execution wide
    character set does not match existing practice
    <https://github.com/sg16-unicode/sg16/issues/9>
  * WG21 P1854: Source to Execution encoding conversion should not lead
    to loss of information <https://github.com/sg16-unicode/sg16/issues/50>
  * std::to_chars/std::from_chars overloads for char8_t
    <https://github.com/sg16-unicode/sg16/issues/38>
  * Publish an SG16 library design guidelines paper
    <https://github.com/sg16-unicode/sg16/issues/53>
  * Deprecate std::regex <https://github.com/sg16-unicode/sg16/issues/57>
  * Make wide multicharacter character literals ill-formed
    <https://github.com/sg16-unicode/sg16/issues/65>
  * Improve portable ingestion of command-line arguments
    <https://github.com/sg16-unicode/sg16/issues/66>
  * Alias barriers; a replacement for the ICU hack
    <https://github.com/sg16-unicode/sg16/issues/67>
  * Support for UTF encodings in std::format() and std::print()
    <https://github.com/sg16-unicode/sg16/issues/68>
  * Specify what constitutes white-space characters
    <https://github.com/sg16-unicode/sg16/issues/69>
  * Specify what constitutes a new-line
    <https://github.com/sg16-unicode/sg16/issues/70>
  * A portable mechanism to specify source file encoding
    <https://github.com/sg16-unicode/sg16/issues/71>

Tom.

On 3/16/21 10:59 AM, Tom Honermann via SG16 wrote:
>
> SG16 will hold a telecon on Wednesday, March 24th at 19:30 UTC
> (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20210324T193000&p1=1440&p2=tz_pdt&p3=tz_mdt&p4=tz_cdt&p5=tz_edt&p6=tz_cet>).
>
> *For participants in North America, please note that daylight saving
> time went into effect this past weekend, so this telecon will start
> one hour later than our last telecon (Mexico doesn't observe DST until
> April 4th).*
>
> The agenda is:
>
> * Continue discussion from the last telecon concerning:
>     o D2314R1: Character sets and encodings
>       <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html>
>     o D2297R1: Wording improvements for encodings and character sets
>       <https://isocpp.org/files/papers/D2297R1.pdf>
> * Discuss priorities and goals for C++23.
>
> For D2314R1
> <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html> and
> D2297R1 <https://isocpp.org/files/papers/D2297R1.pdf>, discussion will
> be limited to new information that might help to break the stalemate
> regarding use of an abstract character set or UCS scalar values as the
> specification tool for describing translation.  If consensus is not
> reached, we'll poll forwarding D2314R1
> <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html> with
> direction that EWG and/or CWG choose the wording mechanism.
>
> Per P1000 <https://wg21.link/p1000>, papers targeting C++23 must be
> forwarded by EWG/LEWG to CWG/LWG by the February, 2022 meeting
> (Portland).  However, the deadline for initial papers proposing new
> language features is ~November, 2021.  Time is running short, and
> competition for time in EWG/LEWG will increase.
>
> The following lists the current state of SG16-related papers and our
> C++23 effort to date.  This is presented as food for thought.  What
> story does this tell?  How will that story be received by the C++
> community?  What should we do with our remaining time to either
> strengthen or change that story?  What can we realistically do to
> bring more direct benefits to the C++ community?  It may be
> interesting to review what we were thinking about during our March
> 13th, 2019 telecon
> <https://github.com/sg16-unicode/sg16-meetings/blob/master/README-2019.md#march-13th-2019>.
>
> These papers have been accepted for C++23:
>
> * P2029 <https://wg21.link/p2029>: Proposed resolution for core
> issues 411, 1656, and 2333; numeric and universal character
> escapes in character and string literals
>
> These papers have been approved by EWG and are in the pipeline for CWG:
>
> * P1949 <https://wg21.link/p1949>: C++ Identifier Syntax using
> Unicode Standard Annex 31
> * P2201 <https://wg21.link/p2201>: Mixed string literal concatenation
> * P2223 <https://wg21.link/p2223>: Trimming whitespaces before line
> splicing
>
> These papers have been approved by SG16 and are in the pipeline for
> EWG/LEWG:
>
> * P1885 <https://wg21.link/p1885>: Naming Text Encodings to
> Demystify Them
> * P2093 <https://wg21.link/p2093>: Formatted output
> * P2246 <https://wg21.link/p2246>: Character encoding of diagnostic text
> * P2316 <https://wg21.link/p2316>: Consistent character literal encoding
>
> These papers are in the pipeline for EWG/LEWG, but require a revision
> to make progress:
>
> * P2071 <https://wg21.link/p2071>: Named universal character escapes
>
> These papers are currently active in SG16:
>
> * D2314R1
> <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html>:
> Character sets and encodings
> * D2297R1 <https://isocpp.org/files/papers/D2297R1.pdf>: Wording
> improvements for encodings and character sets
>
> With the above summary of what we have been doing in mind, the
> following lists provide some options for what we could work on next.
>
> These are existing papers available for SG16 to prioritize (some of
> these, such as P1629, are awaiting revisions):
>
> * P1628 <https://wg21.link/p1628>: Unicode character properties
> * P1629 <https://wg21.link/p1629>: Standard Text Encoding
> * P1729 <https://wg21.link/p1729>: Text Parsing
> * P1859 <https://wg21.link/p1859>: Standard terminology for
> execution character set encodings
> * P1953 <https://wg21.link/p1953>: Unicode Identifiers And Reflection
> * P2295 <https://wg21.link/p2295>: Correct UTF-8 handling during
> phase 1 of translation
>
> And finally, here are some ideas that have been discussed, but that we
> do not currently have papers covering:
>
> * UTF-8 as a portable source file encoding (the paper Tom started
> and has long intended to complete).
> * Requiring wchar_t to represent all members of the execution wide
> character set does not match existing practice
> <https://github.com/sg16-unicode/sg16/issues/9>
> * WG21 P1854: Source to Execution encoding conversion should not
> lead to loss of information
> <https://github.com/sg16-unicode/sg16/issues/50>
> * Deprecate std::regex <https://github.com/sg16-unicode/sg16/issues/57>
> * Make wide multicharacter character literals ill-formed
> <https://github.com/sg16-unicode/sg16/issues/65>
> * Improve portable ingestion of command-line arguments
> <https://github.com/sg16-unicode/sg16/issues/66>
> * Alias barriers; a replacement for the ICU hack
> <https://github.com/sg16-unicode/sg16/issues/67>
>
> Our efforts will need to be balanced with any effort expended to align
> C23 with changes made for C++20 and C++23:
>
> * WG14 N2231: char8_t: A type for UTF-8 characters and strings
> <https://github.com/sg16-unicode/sg16/issues/5>
> * WG14: Make char16_t/char32_t string literals be UTF-16/32
> <https://github.com/sg16-unicode/sg16/issues/54>
> * WG14: Improve support for Unicode characters in identifiers
> <https://github.com/sg16-unicode/sg16/issues/56>
> * WG14: numerical & universal character escapes in char & string
> literals <https://github.com/sg16-unicode/sg16/issues/63>
> * WG14: Trimming whitespace before line splicing
> <https://github.com/sg16-unicode/sg16/issues/64>
>
> Tom.
>
>


