Subject: Re: Updated draft revision: D2029R3 (Proposed resolution for core issues 411, 1656, and 2333; numeric and universal character escapes in character and string literals)
From: Tom Honermann (tom_at_[hidden])
Date: 2020-08-24 17:54:41

On 8/24/20 2:57 AM, Corentin Jabot via SG16 wrote:
> On Mon, Aug 24, 2020, 06:13 Tom Honermann via SG16
> <sg16_at_[hidden] <mailto:sg16_at_[hidden]>> wrote:
> Perhaps it is time to re-title this paper to "Rewrite [lex.ccon]
> and [lex.string]" :)
> An update of D2029R3 is available at
> https://rawgit.com/sg16-unicode/sg16/master/papers/d2029r3.html.
> This addresses the feedback provided at the July 20th, 2020, core
> issues processing telecon.  The relevant changes include:
> * Changed the introductory and proposed resolution overview to
> remove incorrect uses of "narrowing integer conversion" in
> prose intended to describe something more like an "integral
> conversion".
> * [lex.ccon]pY: Added footnotes explaining that, for
> nonencodable character literals and multicharacter literals,
> the associated character encoding may be used solely to
> determine encodability and not to actually encode values.
> * [lex.ccon]pZ: Merged Z.1 into the introductory text to make
> the short circuiting behavior clear and renumbered the
> remaining subparagraphs.
> * [lex.ccon]pZ: Specified that the determination of a character
> literal value during translation phase 4 uses the range of
> representable values of the character literal's type in
> translation phase 7.
> * [lex.string]pX: Added normative wording stating that the /n/
> that appears in the string literal array type corresponds to
> the number of encoded code units.
> * [lex.string]pZ: Elevated notes related to stateful character
> encodings to normative encouragement.
> I believe this is ready for CWG review again.  Jens, Richard, and
> Hubert, if you can provide any additional change requests before
> the next core issues processing telecon, I would appreciate it. 
> I'm hoping that the next CWG review will be the last one!
>
> Y1 and Y3 are very tautological and don't make sense to me.
> They assume that there are multiple encodings in the same string or
> character, which will not be the case when the string is consumed. And
> of course, things that cannot be encoded cannot be encoded.
> Saying that the value of non-encodable code points was
> implementation-defined was much clearer.

An associated character encoding is needed to determine if a given
literal is encodable or not (to distinguish between normal and
non-encodable character literals).  The table does not address the value
of a character literal; later paragraphs establish that as
implementation-defined for these cases.  The footnotes were requested
during CWG review to help clarify why these can't be
implementation-defined (at least for non-encodable character literals).
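
To make the encodability point concrete, here is a minimal sketch of the
kind of check an implementation has to perform before it can classify a
character literal. It assumes UTF-8, UTF-16, and UTF-32 as the associated
character encodings and treats "encodable" as "fits in a single code
unit". The function names are hypothetical and do not come from the paper;
an implementation with a different ordinary literal encoding would answer
differently. It needs C++17 for the message-less static_assert.

```cpp
// Sketch only: assumes UTF-8/UTF-16/UTF-32 as the associated encodings.
// All names here are hypothetical illustrations, not wording from D2029R3.

// True if c is a Unicode scalar value (in range and not a surrogate).
constexpr bool is_scalar_value(char32_t c) {
    return c <= 0x10FFFF && !(c >= 0xD800 && c <= 0xDFFF);
}

// One UTF-8 code unit can only represent U+0000 through U+007F.
constexpr bool fits_one_utf8_code_unit(char32_t c) {
    return c <= 0x7F;
}

// One UTF-16 code unit can only represent scalar values up to U+FFFF.
constexpr bool fits_one_utf16_code_unit(char32_t c) {
    return is_scalar_value(c) && c <= 0xFFFF;
}

// One UTF-32 code unit can represent any scalar value.
constexpr bool fits_one_utf32_code_unit(char32_t c) {
    return is_scalar_value(c);
}

// The same character can be encodable for one literal kind and not another.
static_assert(fits_one_utf8_code_unit(U'A'));
static_assert(!fits_one_utf8_code_unit(U'\u00E9'));
static_assert(fits_one_utf16_code_unit(U'\u00E9'));
static_assert(!fits_one_utf16_code_unit(U'\U0001F600'));
static_assert(fits_one_utf32_code_unit(U'\U0001F600'));
```

The answer depends only on which encoding is associated with the literal
kind, which is why the encoding has to be named even in cases where it is
not then used to produce the value.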

Jens has suggested new wording for them that I think is much better than
what I came up with.  Do his suggested updates resolve your concern?
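
As an aside on the [lex.string]pX item in the change list quoted above
(the /n/ in the array type counts encoded code units), the following
sketch shows what that means for the UTF-8, UTF-16, and UTF-32 string
literal kinds, whose encodings are fixed. The helper name array_bound is
hypothetical. Ordinary and wide literals are left out because their
encodings are implementation-defined.

```cpp
#include <cstddef>

// Hypothetical helper: returns the array bound N of a string literal.
template <typename T, std::size_t N>
constexpr std::size_t array_bound(const T (&)[N]) {
    return N;
}

// U+00E9 is two UTF-8 code units; the terminating null adds one more.
static_assert(array_bound(u8"\u00E9") == 3);

// U+00E9 is a single UTF-16 code unit, plus the terminating null.
static_assert(array_bound(u"\u00E9") == 2);

// U+1F600 needs a surrogate pair in UTF-16: two code units plus null.
static_assert(array_bound(u"\U0001F600") == 3);

// Any scalar value is a single UTF-32 code unit, plus the terminating null.
static_assert(array_bound(U"\U0001F600") == 2);
```

In each case the bound follows the number of code units, not the number
of characters written in the source.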


> Tom.