sg16: Re: [SG16] [isocpp-core] Updated draft revision: D2029R2 (Proposed resolution for core issues 411, 1656, and 2333; numeric and universal character escapes in character and string literals)

From: Richard Smith <richardsmith_at_[hidden]>
Date: Tue, 14 Jul 2020 00:23:42 -0700

On Mon, Jul 13, 2020 at 9:03 PM Tom Honermann via Core <
core_at_[hidden]> wrote:

> On 7/8/20 1:54 PM, Tom Honermann wrote:
>
> On 7/8/20 6:43 AM, Alisdair Meredith wrote:
>
> Minor nit: I dislike normatively stating that a null character is
> appended after string concatenation in two places. I do like
> the addition of this directly to the phase 6 wording, so suggest
> that the original in [lex.string]p12 with its extra flowery language
> be demoted to a note.
>
> That seems reasonable to me, I'll do so.
>
> After looking at this again, I elected to go in a different direction.
>
> [lex.phases] describes at a high level what is to be done in each phase
> and more-or-less defers to other sections for elaboration. From this lens,
> changing the normative text in [lex.string] into a note felt like the wrong
> direction. Instead, I chose to update the wording in [lex.string] to read
> a little nicer and to omit the flowery language. I then updated
> [lex.phases] to be less precise and to explicitly direct the reader to
> [lex.string] for details. I hope this acceptably satisfies the (very
> reasonable) concern about the previous normative duplication.
>
> This paper has now been submitted for the upcoming mailing and can be
> found at https://isocpp.org/files/papers/P2029R2.html. The previous
> links to the draft will no longer work.
>
Apologies for not looking through this earlier.

"""
conditional-escape-sequence-char:
any member of the basic source character set other than u, U, x, and
the members of octal-digit and simple-escape-sequence-char
"""

I don't like talking about "members of" grammar productions. How about:

any member of the basic source character set that is not an
*octal-digit*, a *simple-escape-sequence-char*, or u, U, or x
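
As an illustration only (this is not wording from the paper), the suggested
phrasing can be read as a classification of the character that follows a
backslash. The sketch below assumes that simple-escape-sequence-char is the
set ' " ? \ a b f n r t v and that the input is already known to be a member
of the basic source character set; EscapeKind and classify_escape_char are
hypothetical names.

```cpp
#include <string_view>

// Hypothetical categories for the character that follows a backslash.
enum class EscapeKind { Simple, Octal, Hex, Universal, Conditional };

// Assumes c is a member of the basic source character set; no check is made.
constexpr EscapeKind classify_escape_char(char c) {
    // Assumed contents of simple-escape-sequence-char.
    constexpr std::string_view simple = "'\"?\\abfnrtv";
    if (c >= '0' && c <= '7') return EscapeKind::Octal;       // octal-digit
    if (simple.find(c) != std::string_view::npos) return EscapeKind::Simple;
    if (c == 'x') return EscapeKind::Hex;
    if (c == 'u' || c == 'U') return EscapeKind::Universal;
    // Everything else: not an octal-digit, not a
    // simple-escape-sequence-char, and not u, U, or x.
    return EscapeKind::Conditional;
}

static_assert(classify_escape_char('n') == EscapeKind::Simple);
static_assert(classify_escape_char('7') == EscapeKind::Octal);
static_assert(classify_escape_char('x') == EscapeKind::Hex);
static_assert(classify_escape_char('U') == EscapeKind::Universal);
static_assert(classify_escape_char('q') == EscapeKind::Conditional);
```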

5.13.3/Z.2.1:
"""
— If v does not exceed the range of the character-literal's type, then the
value is v.
"""

What does "the range of the character-literal's type" mean? Do you mean the
range of representable values? Or do you mean [0,0xFFFF] for char16_t and
[0,0x10FFFF] for char32_t?
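
To make the two readings concrete, here is a minimal sketch. The helper
names are hypothetical, and the bounds for the second reading are taken
directly from the question above. For char32_t the two readings disagree:
0x110000 is representable in the type but lies outside [0,0x10FFFF].

```cpp
#include <limits>

// Reading 1: "range" means the values representable in the type.
template <typename CharT>
constexpr bool representable_in(unsigned long long v) {
    return v <= static_cast<unsigned long long>(
                    std::numeric_limits<CharT>::max());
}

// Reading 2: "range" means the fixed bounds named in the question.
constexpr bool within_named_char16_range(unsigned long long v) {
    return v <= 0xFFFF;
}
constexpr bool within_named_char32_range(unsigned long long v) {
    return v <= 0x10FFFF;
}

// char16_t can always hold 0xFFFF, so both readings accept it.
static_assert(representable_in<char16_t>(0xFFFF));
static_assert(within_named_char16_range(0xFFFF));

// char32_t is at least 32 bits wide, so 0x110000 is representable
// (reading 1) even though it exceeds 0x10FFFF (reading 2).
static_assert(representable_in<char32_t>(0x110000));
static_assert(!within_named_char32_range(0x110000));
```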

5.13.3/Z.2.2:
"""
— Otherwise, if the character-literal's encoding-prefix is absent or L,
then the value is implementation-defined.
"""

I appreciate that your wording reflects the behavior of the prior wording,
but while we're here: do we really want '\xff' to have an
implementation-defined value rather than being required to be (char)0xff
(assuming 'char' is signed and 8-bit)? Now that we guarantee two's complement,
perhaps we should just say you always get the result of converting the
given value to char / wchar_t? (Similarly in 5.3.15/Z.2.)
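
A minimal sketch of what the "just convert the value" rule would yield,
assuming the C++20 rule that conversion to a signed integer type is modular.
escape_value_as_char is a hypothetical helper, not anything from the paper;
it only models the conversion, not the literal itself.

```cpp
#include <climits>
#include <limits>

// Hypothetical helper: the value a numeric escape would have if the rule
// were simply "convert v to char".
constexpr char escape_value_as_char(unsigned v) {
    return static_cast<char>(v);
}

// The bit pattern 0xff survives the conversion whether char is signed or
// unsigned.
static_assert(static_cast<unsigned char>(escape_value_as_char(0xff)) == 0xff);

// Where char is signed and 8-bit, the converted value is -1.
static_assert(!(std::numeric_limits<char>::is_signed && CHAR_BIT == 8) ||
              escape_value_as_char(0xff) == -1);
```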

Thanks!

> Tom.
>
> In the normative text, AFAICT, in C++20 wide multi character
> literals must be supported, with an implementation-defined value,
> but after this paper they will be conditionally supported. I don’t
> see that design change addressed in the front matter. Same
> applies to non-encodable wide characters.
>
> That is addressed in the "Proposed resolution overview" section. I can
> add a statement about this to the introduction if you like.
>
> I've been under the impression that the lack of conditionally-supported
> for these is an oversight. My understanding (and someone please correct me
> if I'm mistaken; I don't recall where I was informed of this) is that, in
> the C standard, implementation-defined includes an allowance for rejecting
> the code as ill-formed, but in the C++ standard, implementation-defined
> implies well-formed; hence the addition of conditionally-supported. If
> that understanding is correct, then the updated wording corrects alignment
> with the intent of the C standard.
>
> (I thought this also applied to ordinary multi character literals,
> but it turns out they are already conditionally supported.)
>
> Yup, in [lex.ccon]p1 <http://eel.is/c++draft/lex.ccon#1.sentence-4>.
>
> Tom.
>
> AlisdairM
>
>
> On Jul 7, 2020, at 16:33, Tom Honermann via Core <core_at_[hidden]> wrote:
>
> An update of D2029R2 (Proposed resolution for core issues 411, 1656, and 2333; numeric and universal character escapes in character and string literals) is now available at https://rawgit.com/sg16-unicode/sg16/master/papers/d2029r2.html. This addresses the feedback provided on the core mailing list in the thread starting at https://lists.isocpp.org/core/2020/06/9455.php.
>
> Wording review feedback prior to the next Core issues processing teleconference would be much appreciated!
>
> Tom.
>
> _______________________________________________
> Core mailing list
> Core_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/core
> Link to this post: http://lists.isocpp.org/core/2020/07/9545.php
>
>
>
> _______________________________________________
> Core mailing list
> Core_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/core
> Link to this post: http://lists.isocpp.org/core/2020/07/9570.php
>

Received on 2020-07-14 02:27:10