sg16: Re: [SG16] Updated draft revision: D2029R3 (Proposed resolution for core issues 411, 1656, and 2333; numeric and universal character escapes in character and string literals)

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 24 Aug 2020 18:54:41 -0400

On 8/24/20 2:57 AM, Corentin Jabot via SG16 wrote:
>
>
> On Mon, Aug 24, 2020, 06:13 Tom Honermann via SG16
> <sg16_at_[hidden]> wrote:
>
> Perhaps it is time to re-title this paper to "Rewrite [lex.ccon]
> and [lex.string]" :)
>
> An update of D2029R3 is available at
> https://rawgit.com/sg16-unicode/sg16/master/papers/d2029r3.html.
> This addresses the feedback provided at the July 20th, 2020, core
> issues processing telecon. The relevant changes include:
>
> * Changed the introductory and proposed resolution overview to
> remove incorrect uses of "narrowing integer conversion" in
> prose intended to describe something more like an "integral
> conversion".
> * [lex.ccon]pY: Added footnotes explaining that, for
> nonencodable character literals and multicharacter literals,
> the associated character encoding may be used solely to
> determine encodability and not to actually encode values.
> * [lex.ccon]pZ: Merged Z.1 into the introductory text to make
> the short circuiting behavior clear and renumbered the
> remaining subparagraphs.
> * [lex.ccon]pZ: Specified that the determination of a character
> literal value during translation phase 4 uses the range of
> representable values of the character literal's type in
> translation phase 7.
> * [lex.string]pX: Added normative wording stating that the /n/
> that appears in the string literal array type corresponds to
> the number of encoded code units.
> * [lex.string]pZ: Elevated notes related to stateful character
> encodings to normative encouragement.
>
> I believe this is ready for CWG review again. Jens, Richard, and
> Hubert, if you can provide any additional change requests before
> the next core issues processing telecon, I would appreciate it.
> I'm hoping that the next CWG review will be the last one!
>
>
>
> Y1 and Y3 are very tautological and don't make sense to me.
> It assumes that there are multiple encodings in the same string or
> character, which will not be the case when the string is consumed. And
> of course, things that cannot be encoded, cannot be encoded.
> Saying that the value of non encodable codepoints was
> implementation-defined was much clearer.

An associated character encoding is needed to determine if a given
literal is encodable or not (to distinguish between normal and
non-encodable character literals). The table does not address the value
of a character literal; later paragraphs establish that as
implementation-defined for these cases. The footnotes were requested
during CWG review to help clarify why these can't be
implementation-defined (at least for non-encodable character literals).
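
As a rough illustration of the literal kinds under discussion, the C++
sketch below checks the type of an ordinary and a multicharacter literal,
and shows that the array bound of a string literal counts encoded code
units rather than characters. The helper name code_unit_count is
hypothetical and introduced only for this example. The multicharacter
literal 'ab' is conditionally-supported, so a compiler may warn about it.
A non-encodable literal is left out because implementations differ on
whether they accept one.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: returns the array bound of a string literal, that is,
// the number of encoded code units including the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_unit_count(const CharT (&)[N]) {
    return N;
}

// An ordinary character literal holding one encodable character has type char.
static_assert(std::is_same<decltype('a'), char>::value,
              "ordinary character literal has type char");

// A multicharacter literal is conditionally-supported and has type int;
// its value is implementation-defined, so only the type is checked here.
static_assert(std::is_same<decltype('ab'), int>::value,
              "multicharacter literal has type int");

// U+00E9 needs two UTF-8 code units, so the bound is 2 + 1 (null).
static_assert(code_unit_count(u8"\u00e9") == 3,
              "UTF-8: two code units plus terminator");

// U+1F600 needs a surrogate pair in UTF-16, so the bound is 2 + 1 (null).
static_assert(code_unit_count(u"\U0001F600") == 3,
              "UTF-16: surrogate pair plus terminator");

// The same character is a single UTF-32 code unit, so the bound is 1 + 1.
static_assert(code_unit_count(U"\U0001F600") == 2,
              "UTF-32: one code unit plus terminator");
```

The checks are written as static_assert so that they are evaluated during
translation, which is where these properties of literals are determined.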

Jens has suggested new wording for the footnotes that I think is much
better than what I came up with. Do his suggested updates resolve your
concern?

Tom.

> Tom.

Received on 2020-08-24 17:58:10