Re: Agenda for the 2024-02-21 SG16 meeting

From: Tom Honermann <tom_at_[hidden]>
Date: Tue, 20 Feb 2024 17:44:43 -0500
This is your friendly reminder that this meeting is taking place tomorrow.

Tom.

On 2/19/24 10:56 PM, Tom Honermann via SG16 wrote:
>
> SG16 will hold a meeting on Wednesday, February 21st, at 19:30 UTC
> (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20240221T193000&p1=1440&p2=tz_pst&p3=tz_mst&p4=tz_cst&p5=tz_est&p6=tz_cet>).
>
> The agenda follows.
>
> * CWG 2843: Undated reference to Unicode makes C++ a moving target
>   <https://cplusplus.github.io/CWG/issues/2843.html>
>     o Identify updates needed for UAX #31 changes in Unicode 15.1.0.
> * LWG 4043: "ASCII" is not a registered character encoding
>   <https://wg21.link/lwg4043>
> * LWG 4044: Confusing requirements for std::print on POSIX platforms
>   <https://wg21.link/lwg4044>
>
> We reached consensus to recommend Unicode 15.1.0 as the minimum
> Unicode version and normative reference during the 2024-02-07 SG16
> meeting
> <https://github.com/sg16-unicode/sg16-meetings?tab=readme-ov-file#february-7th-2024>.
> I thought this last discussion brought this issue to a conclusion for
> us, but an email sent to the WG14 mailing list by Joseph Myers (on
> 2024-02-13 with subject "D.2.1 and UAX#31 revision 39") reminded me of
> an earlier email Corentin sent to the SG16 mailing list
> <https://lists.isocpp.org/sg16/2024/01/4041.php> (on 2024-01-06 with
> subject "UAX Profiles"). Changes made to UAX #31 (Unicode Identifiers
> and Syntax) <https://unicode.org/reports/tr31/> for Unicode 15.1.0
> will require us to make a decision regarding accepting new character
> allowances in identifiers or adopting a profile to retain the Unicode
> 15.0.0 allowances. In either case, changes to annex E (Conformance
> with UAX #31) <http://eel.is/c++draft/uaxid> will be required to
> reflect that rule UAX31-R1a (Restricted Format Characters) has been
> removed <https://www.unicode.org/reports/tr31/tr31-39.html#R1a>.
>
> A summary of the UAX #31 changes for Unicode 15.1.0 is provided in the
> "Modifications" section
> <https://www.unicode.org/reports/tr31/tr31-39.html#Modifications>. A
> diff of the changes relative to 15.0.0
> <https://www.unicode.org/reports/tr31/tr31-38.html> is also available.
> My understanding of the changes is that U+200C (ZERO WIDTH NON-JOINER)
> and U+200D (ZERO WIDTH JOINER) have been added to XID_continue
> <https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5B%3AXID_continue%3A%5D&g=&i=>
> to allow for characters that native speakers of some languages (e.g.,
> Persian) would expect to be able to use in identifiers. Spoofing
> concerns, including those that depend on the presence of the ZWNJ and
> ZWJ characters, remain the subject matter of UTS #39 (Unicode Security
> Mechanisms) <https://unicode.org/reports/tr39/>. I expect that Robin,
> Corentin, and Steve will be able to provide more details of the change
> and its motivation. As I understand things, our choices will be to:
>
> 1. Accept the changes to XID_continue, or
> 2. Reject the changes to XID_continue by adjusting the profile
>    specified in [uaxid.def.general]
>    <http://eel.is/c++draft/uaxid.def.general>, possibly by including
>    the Default-Ignorable Exclusion Profile
>    <https://www.unicode.org/reports/tr31/tr31-39.html#Default_Ignorable_Exclusion_Profile>,
>    though that would exclude many code points
>    <https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5B%3AXID_continue%3A%5D%26%5B%3ADefault_Ignorable_Code_Point%3A%5D&g=&i=>
>    beyond ZWNJ and ZWJ.
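
The two choices above can be sketched as a filter layered over a property lookup. This is only an illustration under stated assumptions: ZwPolicy, toy_xid_continue, and allowed_in_identifier_tail are hypothetical names, the lookup is a toy that covers ASCII plus the two code points in question, and the "reject" branch removes only ZWNJ and ZWJ rather than applying the full Default-Ignorable Exclusion Profile.

```cpp
#include <cassert>

// Hypothetical names throughout; none of this is standard wording or a real API.
enum class ZwPolicy { accept, reject };

constexpr char32_t zwnj = 0x200C; // ZERO WIDTH NON-JOINER
constexpr char32_t zwj  = 0x200D; // ZERO WIDTH JOINER

// Toy stand-in for an XID_continue lookup: ASCII letters, digits, '_',
// plus ZWNJ and ZWJ. A real implementation would consult the Unicode
// Character Database instead.
constexpr bool toy_xid_continue(char32_t c) {
    return (c >= U'a' && c <= U'z') || (c >= U'A' && c <= U'Z') ||
           (c >= U'0' && c <= U'9') || c == U'_' || c == zwnj || c == zwj;
}

// Choice 1 (accept) defers entirely to the property lookup.
// Choice 2 (reject) filters out ZWNJ and ZWJ before the lookup.
constexpr bool allowed_in_identifier_tail(char32_t c, ZwPolicy policy) {
    if (policy == ZwPolicy::reject && (c == zwnj || c == zwj)) {
        return false;
    }
    return toy_xid_continue(c);
}

// Small usage check covering both choices.
inline void check_policies() {
    assert(allowed_in_identifier_tail(zwj, ZwPolicy::accept));
    assert(!allowed_in_identifier_tail(zwj, ZwPolicy::reject));
    assert(allowed_in_identifier_tail(U'x', ZwPolicy::reject));
}
```

Under either policy, ordinary identifier characters are unaffected; the policies differ only on the two zero-width code points.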
>
> Once consensus for a direction is established, a volunteer will be
> needed to draft wording changes for [uaxid]
> <http://eel.is/c++draft/uaxid>.
>
> LWG 4043 was recently filed by Jonathan Wakely. It reports a
> straightforward concern: the set of encodings recognized by
> std::text_encoding does not include "ASCII" despite that name being
> unambiguous and recognized by common encoding libraries. The proposed
> resolution is to add "ASCII" to the set of aliases for the
> IANA-specified "US-ASCII" encoding even though the IANA character
> set registry
> <https://www.iana.org/assignments/character-sets/character-sets.xhtml>
> does not list it.
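
To make the gap concrete, here is a rough sketch of alias matching that does not use std::text_encoding itself. Everything in it is an assumption for illustration: normalize and names_us_ascii are hypothetical helpers, the normalization (drop non-alphanumerics, lowercase) is a simplification of the loose matching the standard specifies, and the alias table is only a subset of the names registered for US-ASCII.

```cpp
#include <array>
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Simplified name normalization (an assumption, not the standard's exact
// rule): keep only alphanumerics and fold them to lowercase.
inline std::string normalize(std::string_view name) {
    std::string out;
    for (unsigned char ch : name) {
        if (std::isalnum(ch)) {
            out.push_back(static_cast<char>(std::tolower(ch)));
        }
    }
    return out;
}

// A subset of the names registered for US-ASCII; note that plain "ASCII"
// is not among them.
constexpr std::array<std::string_view, 4> registered_aliases{
    "US-ASCII", "ANSI_X3.4-1968", "ISO646-US", "csASCII"};

// Returns true if `name` identifies US-ASCII. The flag models the proposed
// resolution of adding "ASCII" as an extra alias.
inline bool names_us_ascii(std::string_view name, bool with_proposed_alias) {
    const std::string key = normalize(name);
    for (std::string_view alias : registered_aliases) {
        if (normalize(alias) == key) {
            return true;
        }
    }
    return with_proposed_alias && key == normalize("ASCII");
}
```

With the flag off, "ASCII" fails to match because loose comparison against "US-ASCII" or "csASCII" still sees different letters; with the flag on, it matches.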
>
> LWG 4044 was also recently filed by Jonathan Wakely while working to
> implement std::print() in libstdc++. Jonathan's initial implementation
> attempted to do what the C++ standard wording stated and detected
> ill-formed code units written to a stream that is directed to a
> terminal so that they could be diagnosed. He found that the overhead
> of calling isatty() on Linux to determine if a stream is directed to a
> terminal was prohibitively expensive and started questioning why the
> standard was directing him to do this. In private correspondence, it
> was clarified that the intent of the "native Unicode API" terminology
> was to generically refer to the Windows WriteConsoleW() function and
> that there is no need to do anything special on POSIX systems. That
> discussion also questioned what it means to diagnose invalid code
> units written to a console at run-time. Jonathan has been kind enough
> to draft a proposed resolution to clarify the intent.
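
For reference, the POSIX terminal check discussed above looks roughly like the following. This is a minimal sketch, not libstdc++'s code: is_terminal is a hypothetical helper, and it assumes a POSIX system where fileno() and isatty() are available.

```cpp
#include <cassert>
#include <stdio.h>   // FILE, fileno (POSIX)
#include <unistd.h>  // isatty (POSIX)

// Hypothetical helper: reports whether `stream` is directed to a terminal.
// isatty() returns 1 for a terminal and 0 otherwise.
inline bool is_terminal(FILE* stream) {
    return stream != nullptr && isatty(fileno(stream)) == 1;
}
```

A caller would use it as is_terminal(stdout). Per the clarified intent, nothing like this is needed on POSIX systems, since the "native Unicode API" wording targets the Windows console.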
>
> Tom.
>
>

Received on 2024-02-20 22:44:46