[SG16] Draft of d2071 - Named escape sequences

From: Steve Downey <sdowney_at_[hidden]>
Date: Tue, 19 Oct 2021 22:47:14 -0400
Attached is a draft of D2071, Named universal character escapes. It
provides wording rebased on EWG's last direction. In light of P2290,
Delimited Escape Sequences, it also provides alternative wording that
specifies these as UCNs rather than escapes. Finally, it emphasizes that we
are choosing to ignore how the Unicode Standard recommends that names be
matched against the Unicode database, and provides a wording alternative
for that as well.
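
To make the matching question concrete, here is a minimal sketch of loose
name matching in the spirit of what I understand to be the relevant Unicode
rule (UAX #44, UAX44-LM2: ignore case, whitespace, underscores, and medial
hyphens). The helper names loose_key and loosely_equal are hypothetical and
are not taken from the paper or from any implementation, and the sketch
deliberately omits the special case that rule makes for U+1180 HANGUL
JUNGSEONG O-E.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not from the paper): reduce a character name to a
// key for loose comparison. Drops whitespace, underscores, and medial
// hyphens, and upper-cases everything else.
// Simplification: the U+1180 HANGUL JUNGSEONG O-E exception is not handled.
std::string loose_key(std::string_view name) {
    auto is_alnum = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0;
    };
    auto is_space = [](char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    };

    std::string key;
    for (std::size_t i = 0; i < name.size(); ++i) {
        const char c = name[i];
        if (is_space(c) || c == '_') {
            continue;  // whitespace and underscores never matter
        }
        if (c == '-') {
            // A hyphen is treated as medial when it sits directly between
            // two letters or digits, as in "ZERO-WIDTH".
            const bool medial = i > 0 && i + 1 < name.size() &&
                                is_alnum(name[i - 1]) && is_alnum(name[i + 1]);
            if (medial) {
                continue;
            }
        }
        key.push_back(static_cast<char>(
            std::toupper(static_cast<unsigned char>(c))));
    }
    return key;
}

// Hypothetical helper: two spellings name the same character under loose
// matching when their keys are equal.
bool loosely_equal(std::string_view a, std::string_view b) {
    return loose_key(a) == loose_key(b);
}
```

Under this sketch "zero-width space" and "ZERO_WIDTH_SPACE" both match
"ZERO WIDTH SPACE", while "TIBETAN LETTER -A" stays distinct from
"TIBETAN LETTER A" because its hyphen is not medial. Exact matching would
instead compare the spelling as written against the database name.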

The wording has not been reviewed, but it doesn't change significantly from
past versions, except that the changes to 5.13.5 [lex.string] paragraph 14
<http://eel.is/c++draft/lex.string#14> no longer seem to be necessary,
since I believe the new machinery takes care of everything.

Unless something has changed, we have a slot in EWG on the 7th for this
paper. I can do more wording work after this, although the primary goal, in
my opinion, is to get design agreement for C++23.


On Tue, Oct 19, 2021 at 3:32 PM Tom Honermann via SG16 <
sg16_at_[hidden]> wrote:

> My apologies for being so late in communicating this agenda for tomorrow's
> SG16 telecon.
>
> *Please note that there has been a schedule change.* Tomorrow's telecon
> was originally scheduled for 2021-10-27 but was moved forward a week to
> avoid conflicts with CppCon. The shared calendar has been updated (which
> triggered the sending of new meeting invitations).
>
> SG16 will hold a telecon on Wednesday, October *20th* (not the 27th) at
> 19:30 UTC (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20211020T193000&p1=1440&p2=tz_pdt&p3=tz_mdt&p4=tz_cdt&p5=tz_edt&p6=tz_cest>
> ).
>
> The agenda is:
>
>
> - D2071R1: Named universal character escapes
>   - Add named escape sequences to *universal-character-name* so that
>     these escape sequences can be used everywhere, not just in string
>     literals.
>   - Use Unicode rules for matching names rather than requiring exact
>     case-sensitive names.
> - P1885R8: Naming Text Encodings to Demystify Them
>   <https://wg21.link/p1885r8>
>   - Continue discussions of issues raised on the LEWG and SG16
>     mailing lists.
>   - Prohibit mapping to IANA encodings when CHAR_BIT is not 8?
>   - Address special cases for IANA mapping purposes:
>     - Is UTF-16 valid for ordinary strings when CHAR_BIT is >= 16?
>     - Is UTF-16 valid for wide strings when CHAR_BIT is >= 16 and
>       sizeof(wchar_t) is 1?
>     - Is the underlying representation of a wide string required to
>       match an encoding scheme for the encoding form when
>       sizeof(wchar_t) is not 1?
>     - Limit mapping of wide strings when sizeof(wchar_t) is not 1 to
>       other, unknown, and the UCS/UTF variants?
>
> A draft of D2071R1 is not yet available, but is expected to be sent to the
> SG16 mailing list later today. That draft will address EWG feedback from
> Prague <https://wiki.edg.com/bin/view/Wg21prague/P2071R0-EWG> except that
> it will specify relaxed checking of character names for implementation
> reasons (the prototype implementation that Corentin did demonstrated how
> relaxed checking enables a smaller database of valid character names).
>
> P1885 is back on the agenda to settle questions that have continued to be
> raised on the LEWG and SG16 mailing lists. Corentin indicated intent to
> change the encoding querying functions to always return unknown when
> CHAR_BIT is not 8; we'll discuss the ramifications of that intent.
> Concerns about intended behavior for wide strings continue to be raised, so
> we'll discuss and poll various approaches for dealing with them.
>
> Tom.
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>

Received on 2021-10-19 21:47:31