sg16: Re: [SG16] Agenda for the 2021-03-24 SG16 telecon

From: Corentin Jabot <corentinjabot_at_[hidden]>
Date: Thu, 18 Mar 2021 10:32:48 +0100

On Wed, Mar 17, 2021 at 2:22 PM Tom Honermann <tom_at_[hidden]> wrote:

> On 3/17/21 5:23 AM, Corentin Jabot wrote:
>
>
>
> On Tue, Mar 16, 2021 at 3:59 PM Tom Honermann via SG16 <
> sg16_at_[hidden]> wrote:
>
>> SG16 will hold a telecon on Wednesday, March 24th at 19:30 UTC (timezone
>> conversion
>> <https://www.timeanddate.com/worldclock/converter.html?iso=20210324T193000&p1=1440&p2=tz_pdt&p3=tz_mdt&p4=tz_cdt&p5=tz_edt&p6=tz_cet>
>> ).
>>
>> *For participants in North America, please note that daylight saving
>> time went into effect this past weekend, so this telecon will start one
>> hour later than our last telecon (Mexico doesn't observe DST until April
>> 4th).*
>>
>> The agenda is:
>>
>> - Continue discussion from the last telecon concerning:
>> - D2314R1: Character sets and encodings
>> <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html>
>> - D2297R1: Wording improvements for encodings and character sets
>> <https://isocpp.org/files/papers/D2297R1.pdf>
>> - Discuss priorities and goals for C++23.
>>
>> For D2314R1 <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html>
>> and D2297R1 <https://isocpp.org/files/papers/D2297R1.pdf>, discussion
>> will be limited to new information that might help to break the stalemate
>> regarding use of an abstract character set or UCS scalar values as the
>> specification tool for describing translation. If consensus is not
>> reached, we'll poll forwarding D2314R1
>> <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html> with
>> direction that EWG and/or CWG choose the wording mechanism.
>>
>> Per P1000 <https://wg21.link/p1000>, papers targeting C++23 must be
>> forwarded by EWG/LEWG to CWG/LWG by the February, 2022 meeting (Portland).
>> However, the deadline for initial papers proposing new language features is
>> ~November, 2021. Time is running short, and competition for time in
>> EWG/LEWG will increase.
>>
>> The following lists the current state of SG16 related papers and our
>> C++23 effort to date. This is presented as food for thought. What story
>> does this tell? How will that story be received by the C++ community?
>> What should we do with our remaining time to either strengthen or change
>> that story? What can we realistically do to bring more direct benefits to
>> the C++ community? It may be interesting to review what we were
>> thinking about during our March 13th, 2019 telecon
>> <https://github.com/sg16-unicode/sg16-meetings/blob/master/README-2019.md#march-13th-2019>
>> .
>>
>> These papers have been accepted for C++23:
>>
>> - P2029 <https://wg21.link/p2029>: Proposed resolution for core
>> issues 411, 1656, and 2333; numeric and universal character escapes in
>> character and string literals
>>
>> These papers have been approved by EWG and are in the pipeline for CWG:
>>
>> - P1949 <https://wg21.link/p1949>: C++ Identifier Syntax using
>> Unicode Standard Annex 31
>> - P2201 <https://wg21.link/p2201>: Mixed string literal concatenation
>> - P2223 <https://wg21.link/p2223>: Trimming whitespaces before line
>> splicing
>>
>> These papers have been approved by SG16 and are in the pipeline for
>> EWG/LEWG:
>>
>> - P1885 <https://wg21.link/p1885>: Naming Text Encodings to Demystify
>> Them
>> - P2093 <https://wg21.link/p2093>: Formatted output
>> - P2246 <https://wg21.link/p2246>: Character encoding of diagnostic
>> text
>> - P2316 <https://wg21.link/p2316>: Consistent character literal
>> encoding
>>
>> These papers are in the pipeline for EWG/LEWG, but require a revision to
>> make progress:
>>
>> - P2071 <https://wg21.link/p2071>: Named universal character escapes
>>
>>
> I would like us to make progress on that! Afaik there isn't a lot of work
> remaining, right?
>
> I need to review notes, but from what I remember, only minor updates are
> needed to the paper; doing that is on my plate and it is realistic that I
> could get to it soon.
>
> Implementing it in a compiler would help to reduce some concerns. I'm
> afraid I won't have time to do that for a while though.
>
>
>
>>
>> These papers are currently active in SG16:
>>
>> - D2314R1
>> <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r1.html>:
>> Character sets and encodings
>> - D2297R1 <https://isocpp.org/files/papers/D2297R1.pdf>: Wording
>> improvements for encodings and character sets
>>
>> With that summary of what we have been doing above in mind, the following
>> lists provide some options for what we could work on next.
>>
>> These are existing papers available for SG16 to prioritize: (Some of
>> these, such as P1629, are awaiting revisions).
>>
>> - P1628 <https://wg21.link/p1628>: Unicode character properties
>>
>> As the author, I do not expect to do further work on this in the C++23 cycle.
>
> That matches my expectations, thanks for confirming.
>
>
>> - P1629 <https://wg21.link/p1629>: Standard Text Encoding
>> - P1729 <https://wg21.link/p1729>: Text Parsing
>> - P1859 <https://wg21.link/p1859>: Standard terminology for execution
>> character set encodings
>>
>> This is mostly superseded by 2314/2297 - we should make sure the
> directions are consistent
>
>
>>
>> - P1953 <https://wg21.link/p1953>: Unicode Identifiers And Reflection
>>
>> This is ending progress in SG-7
>
> That isn't the effect I would expect this paper to have on SG-7. "pending"
> on the other hand... ;)
>
>
>
>>
>> - P2295 <https://wg21.link/p2295>: Correct UTF-8 handling during
>> phase 1 of translation
>>
>> Expect a revision of that soon
>
>>
>> And finally, here are some ideas that have been discussed, but that we do
>> not currently have papers covering:
>>
>> - UTF-8 as a portable source file encoding (the paper Tom started and
>> has long intended to complete).
>>
>> See also P2295
>
> Yes, clearly related.
>
>
>
>>
>> - Requiring wchar_t to represent all members of the execution wide
>> character set does not match existing practice
>> <https://github.com/sg16-unicode/sg16/issues/9>
>>
>> Please let's investigate that!
>
> I had started a paper on this a while back. Yet another unfinished
> paper. I'd like to see this done, but it will have no meaningful impact on
> the C++ community, so we should consider that when prioritizing.
>
>
>> - WG21 P1854: Source to Execution encoding conversion should not lead
>> to loss of information
>> <https://github.com/sg16-unicode/sg16/issues/50>
>>
>> Expect further work on that in the coming months
>
>>
>> - Deprecate std::regex
>> <https://github.com/sg16-unicode/sg16/issues/57>
>> - Make wide multicharacter character literals ill-formed
>> <https://github.com/sg16-unicode/sg16/issues/65>
>>
>> I'll write a paper
>
>>
>> - Improve portable ingestion of command-line arguments
>> <https://github.com/sg16-unicode/sg16/issues/66>
>> - Alias barriers; a replacement for the ICU hack
>> <https://github.com/sg16-unicode/sg16/issues/67>
>>
>> This seems very important - the char8_t adoption story isn't great right
> now.
>
> I agree, and providing this would be useful for the story we tell, but I
> suspect it won't impact actual adoption.
>

Speaking of which, could we possibly support format with UTF format strings
in the C++23 cycle? I don't think it would be that much work.

> Tom.
>
>
>>
>> Our efforts will need to be balanced with any effort expended to align
>> C23 with changes made for C++20 and C++23:
>>
>> - WG14 N2231: char8_t: A type for UTF-8 characters and strings
>> <https://github.com/sg16-unicode/sg16/issues/5>
>> - WG14: Make char16_t/char32_t string literals be UTF-16/32
>> <https://github.com/sg16-unicode/sg16/issues/54>
>> - WG14: Improve support for Unicode characters in identifiers
>> <https://github.com/sg16-unicode/sg16/issues/56>
>> - WG14: numerical & universal character escapes in char & string
>> literals <https://github.com/sg16-unicode/sg16/issues/63>
>> - WG14: Trimming whitespace before line splicing
>> <https://github.com/sg16-unicode/sg16/issues/64>
>>
>> Tom.
>> --
>> SG16 mailing list
>> SG16_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>
>
>

Received on 2021-03-18 04:33:02