On Mon, Aug 24, 2020, 06:13 Tom Honermann via SG16 <email@example.com> wrote:
Perhaps it is time to re-title this paper to "Rewrite [lex.ccon] and [lex.string]" :)
An update of D2029R3 is available at https://rawgit.com/sg16-unicode/sg16/master/papers/d2029r3.html. This addresses the feedback provided at the July 20th, 2020, core issues processing telecon. The relevant changes include:
- Changed the introductory and proposed resolution overview to remove incorrect uses of "narrowing integer conversion" in prose intended to describe something more like an "integral conversion".
- [lex.ccon]pY: Added footnotes explaining that, for nonencodable character literals and multicharacter literals, the associated character encoding may be used solely to determine encodability and not to actually encode values.
- [lex.ccon]pZ: Merged Z.1 into the introductory text to make the short circuiting behavior clear and renumbered the remaining subparagraphs.
- [lex.ccon]pZ: Specified that the determination of a character literal value during translation phase 4 uses the range of representable values of the character literal's type in translation phase 7.
- [lex.string]pX: Added normative wording stating that the n that appears in the string literal array type corresponds to the number of encoded code units.
- [lex.string]pZ: Elevated notes related to stateful character encodings to normative encouragement.
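As an illustration of the [lex.string]pX item above (the array bound n counts encoded code units, not characters), here is a minimal sketch. It is not taken from the paper; `code_units` is a hypothetical helper introduced only for this example. It uses only `u8`, `u`, and `U` literals, whose encodings are fixed as UTF-8, UTF-16, and UTF-32, so the counts do not depend on the implementation's ordinary literal encoding.

```cpp
#include <cstddef>

// Hypothetical helper (not from the paper): the number of code units in a
// string literal, i.e. the array bound n minus the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) {
    return N - 1;
}

// U+00E9 is one character but two UTF-8 code units.
static_assert(code_units(u8"\u00E9") == 2, "UTF-8: two code units");
// The same character is a single UTF-16 code unit.
static_assert(code_units(u"\u00E9") == 1, "UTF-16: one code unit");
// U+1F600 lies outside the BMP: a surrogate pair in UTF-16,
// four code units in UTF-8, and one code unit in UTF-32.
static_assert(code_units(u"\U0001F600") == 2, "UTF-16: surrogate pair");
static_assert(code_units(u8"\U0001F600") == 4, "UTF-8: four code units");
static_assert(code_units(U"\U0001F600") == 1, "UTF-32: one code unit");
```

An ordinary literal such as "\u00E9" is deliberately left out, since its code unit count depends on the implementation's ordinary literal encoding.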
I believe this is ready for CWG review again. Jens, Richard, and Hubert, if you can provide any additional change requests before the next core issues processing telecon, I would appreciate it. I'm hoping that the next CWG review will be the last one!
Y1 and Y3 are very tautological and don't make sense to me. They assume that there are multiple encodings in the same string or character, which will not be the case when the string is consumed. And of course, things that cannot be encoded cannot be encoded. Saying that the value of non-encodable code points was implementation-defined was much clearer.
An associated character encoding is needed to determine if a
given literal is encodable or not (to distinguish between normal
and non-encodable character literals). The table does not address
the value of a character literal; later paragraphs establish that
as implementation-defined for these cases. The footnotes were
requested during CWG review to help clarify why these can't be
implementation-defined (at least for non-encodable character
Jens has suggested new wording for them that I think is much
better than what I came up with. Do his suggested updates resolve