sg16: Re: [SG16] Agenda for the 2021-03-24 SG16 telecon

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 17 Mar 2021 09:22:04 -0400

On 3/17/21 5:23 AM, Corentin Jabot wrote:
>
>
> On Tue, Mar 16, 2021 at 3:59 PM Tom Honermann via SG16
> <sg16_at_[hidden]> wrote:
>
> SG16 will hold a telecon on Wednesday, March 24th at 19:30 UTC
> (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20210324T193000&p1=1440&p2=tz_pdt&p3=tz_mdt&p4=tz_cdt&p5=tz_edt&p6=tz_cet>).
>
> *For participants in North America, please note that daylight
> savings time went into effect this past weekend, so this telecon
> will start one hour later than our last telecon (Mexico doesn't
> observe DST until April 4th).*
>
> The agenda is:
>
> * Continue discussion from the last telecon concerning:
> o D2314R1: Character sets and encodings
> <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html>
> o D2297R1: Wording improvements for encodings and character
> sets <https://isocpp.org/files/papers/D2297R1.pdf>
> * Discuss priorities and goals for C++23.
>
> For D2314R1
> <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html> and
> D2297R1 <https://isocpp.org/files/papers/D2297R1.pdf>, discussion
> will be limited to new information that might help to break the
> stalemate regarding use of an abstract character set or UCS scalar
> values as the specification tool for describing translation. If
> consensus is not reached, we'll poll forwarding D2314R1
> <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html> with
> direction that EWG and/or CWG choose the wording mechanism.
>
> Per P1000 <https://wg21.link/p1000>, papers targeting C++23 must
> be forwarded by EWG/LEWG to CWG/LWG by the February, 2022 meeting
> (Portland). However, the deadline for initial papers proposing
> new language features is ~November, 2021. Time is running short,
> and competition for time in EWG/LEWG will increase.
>
> The following lists the current state of SG16 related papers and
> our C++23 effort to date. This is presented as food for thought.
> What story does this tell? How will that story be received by the
> C++ community? What should we do with our remaining time to
> either strengthen or change that story? What can we realistically
> do to bring more direct benefits to the C++ community? It may be
> interesting to review what we were thinking about during our March
> 13th, 2019 telecon
> <https://github.com/sg16-unicode/sg16-meetings/blob/master/README-2019.md#march-13th-2019>.
>
> These papers have been accepted for C++23:
>
> * P2029 <https://wg21.link/p2029>: Proposed resolution for core
> issues 411, 1656, and 2333; numeric and universal character
> escapes in character and string literals
>
> These papers have been approved by EWG and are in the pipeline for
> CWG:
>
> * P1949 <https://wg21.link/p1949>: C++ Identifier Syntax using
> Unicode Standard Annex 31
> * P2201 <https://wg21.link/p2201>: Mixed string literal
> concatenation
> * P2223 <https://wg21.link/p2223>: Trimming whitespaces before
> line splicing
>
> These papers have been approved by SG16 and are in the pipeline
> for EWG/LEWG:
>
> * P1885 <https://wg21.link/p1885>: Naming Text Encodings to
> Demystify Them
> * P2093 <https://wg21.link/p2093>: Formatted output
> * P2246 <https://wg21.link/p2246>: Character encoding of
> diagnostic text
> * P2316 <https://wg21.link/p2316>: Consistent character literal
> encoding
>
> These papers are in the pipeline for EWG/LEWG, but require a
> revision to make progress:
>
> * P2071 <https://wg21.link/p2071>: Named universal character escapes
>
>
> I would like us to make progress on that! AFAIK, there isn't a lot of
> work remaining, right?

I need to review notes, but from what I remember, only minor updates are
needed to the paper; doing that is on my plate and it is realistic that
I could get to it soon.

Implementing it in a compiler would help to reduce some concerns. I'm
afraid I won't have time to do that for a while though.

> These papers are currently active in SG16:
>
> * D2314R1
> <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html>:
> Character sets and encodings
> * D2297R1 <https://isocpp.org/files/papers/D2297R1.pdf>: Wording
> improvements for encodings and character sets
>
> With that summary of what we have been doing above in mind, the
> following lists provide some options for what we could work on next.
>
> These are existing papers available for SG16 to prioritize (some
> of these, such as P1629, are awaiting revisions):
>
> * P1628 <https://wg21.link/p1628>: Unicode character properties
>
> As the author, I do not expect to do further work on this in the C++23
> cycle.

That matches my expectations; thanks for confirming.
>
> * P1629 <https://wg21.link/p1629>: Standard Text Encoding
> * P1729 <https://wg21.link/p1729>: Text Parsing
> * P1859 <https://wg21.link/p1859>: Standard terminology for
> execution character set encodings
>
> This is mostly superseded by 2314/2297 - we should make sure the
> directions are consistent
>
> * P1953 <https://wg21.link/p1953>: Unicode Identifiers And
> Reflection
>
> This is ending progress in SG-7

That isn't the effect I would expect this paper to have on SG-7.
"pending" on the other hand... ;)

> * P2295 <https://wg21.link/p2295>: Correct UTF-8 handling during
> phase 1 of translation
>
> Expect a revision of that soon
>
> And finally, here are some ideas that have been discussed, but
> that we do not currently have papers covering:
>
> * UTF-8 as a portable source file encoding (the paper Tom
> started and has long intended to complete).
>
> See also P2295
Yes, clearly related.
>
> * Requiring wchar_t to represent all members of the execution
> wide character set does not match existing practice
> <https://github.com/sg16-unicode/sg16/issues/9>
>
> Please let's investigate that!
I had started a paper on this a while back. Yet another unfinished
paper. I'd like to see this done, but it will have no meaningful impact
on the C++ community, so we should consider that when prioritizing.
>
> * WG21 P1854: Source to Execution encoding conversion should not
> lead to loss of information
> <https://github.com/sg16-unicode/sg16/issues/50>
>
> Expect further work on that in the coming months
>
> * Deprecate std::regex
> <https://github.com/sg16-unicode/sg16/issues/57>
> * Make wide multicharacter character literals ill-formed
> <https://github.com/sg16-unicode/sg16/issues/65>
>
> I'll write a paper
>
> * Improve portable ingestion of command-line arguments
> <https://github.com/sg16-unicode/sg16/issues/66>
> * Alias barriers; a replacement for the ICU hack
> <https://github.com/sg16-unicode/sg16/issues/67>
>
> This seems very important - the char8_t adoption story isn't great
> right now.

I agree, and providing this would be useful for the story we tell, but I
suspect it won't impact actual adoption.

Tom.

> Our efforts will need to be balanced with any effort expended to
> align C23 with changes made for C++20 and C++23:
>
> * WG14 N2231: char8_t: A type for UTF-8 characters and strings
> <https://github.com/sg16-unicode/sg16/issues/5>
> * WG14: Make char16_t/char32_t string literals be UTF-16/32
> <https://github.com/sg16-unicode/sg16/issues/54>
> * WG14: Improve support for Unicode characters in identifiers
> <https://github.com/sg16-unicode/sg16/issues/56>
> * WG14: numerical & universal character escapes in char & string
> literals <https://github.com/sg16-unicode/sg16/issues/63>
> * WG14: Trimming whitespace before line splicing
> <https://github.com/sg16-unicode/sg16/issues/64>
>
> Tom.
>

Received on 2021-03-17 08:22:08