sg16: Re: [SG16] Updated draft revision: D2029R3 (Proposed resolution for core issues 411, 1656, and 2333; numeric and universal character escapes in character and string literals)

From: Corentin Jabot <corentinjabot_at_[hidden]>
Date: Mon, 24 Aug 2020 08:57:45 +0200

On Mon, Aug 24, 2020, 06:13 Tom Honermann via SG16 <sg16_at_[hidden]>
wrote:

> Perhaps it is time to re-title this paper to "Rewrite [lex.ccon] and
> [lex.string]" :)
>
> An update of D2029R3 is available at
> https://rawgit.com/sg16-unicode/sg16/master/papers/d2029r3.html. This
> addresses the feedback provided at the July 20th, 2020, core issues
> processing telecon. The relevant changes include:
>
> - Changed the introductory and proposed resolution overview to remove
> incorrect uses of "narrowing integer conversion" in prose intended to
> describe something more like an "integral conversion".
> - [lex.ccon]pY: Added footnotes explaining that, for nonencodable
> character literals and multicharacter literals, the associated character
> encoding may be used solely to determine encodability and not to actually
> encode values.
> - [lex.ccon]pZ: Merged Z.1 into the introductory text to make the
> short circuiting behavior clear and renumbered the remaining subparagraphs.
> - [lex.ccon]pZ: Specified that the determination of a character
> literal value during translation phase 4 uses the range of representable
> values of the character literal's type in translation phase 7.
> - [lex.string]pX: Added normative wording stating that the *n* that
> appears in the string literal array type corresponds to the number of
> encoded code units.
> - [lex.string]pZ: Elevated notes related to stateful character
> encodings to normative encouragement.
>
> I believe this is ready for CWG review again. Jens, Richard, and Hubert,
> if you can provide any additional change requests before the next core
> issues processing telecon, I would appreciate it. I'm hoping that the next
> CWG review will be the last one!
>

Y1 and Y3 read as tautological and don't make sense to me.
They assume that there can be multiple encodings in the same string or
character literal, which will not be the case when the string is consumed.
And of course, things that cannot be encoded cannot be encoded.
Saying that the value of nonencodable code points was
implementation-defined was much clearer.
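
For reference, the [lex.string]pX change quoted above (the *n* in the array
type is the number of encoded code units) can be sketched as follows. This is
only an illustration, assuming a C++11 or later compiler; code_units is a
hypothetical helper, not something from the paper. It uses only u8, u, and U
literals because their encodings are fixed; ordinary and wide literals depend
on the implementation's execution encodings, so they are left out.

```cpp
#include <cstddef>

// Hypothetical helper: the array bound of a string literal, minus the
// terminating null, i.e. the number of encoded code units.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) {
    return N - 1;
}

// U+00E9 is one code point but two UTF-8 code units (0xC3 0xA9).
static_assert(code_units(u8"\u00E9") == 2, "UTF-8: two code units");

// U+20AC is three UTF-8 code units (0xE2 0x82 0xAC).
static_assert(code_units(u8"\u20AC") == 3, "UTF-8: three code units");

// U+1F600 is outside the BMP, so UTF-16 needs a surrogate pair.
static_assert(code_units(u"\U0001F600") == 2, "UTF-16: surrogate pair");

// In UTF-32 every code point is a single code unit.
static_assert(code_units(U"\U0001F600") == 1, "UTF-32: one code unit");
```

In each case the array bound counts code units plus the terminating null, not
characters as written in the source.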

> Tom.

Received on 2020-08-24 02:01:25