sg16: Re: [SG16] Feedback on P1854: Conversion to literal encoding should not lead to loss of meaning

From: Corentin <corentin.jabot_at_[hidden]>
Date: Sat, 6 Nov 2021 09:17:13 +0100

On Sat, Nov 6, 2021 at 3:05 AM Hubert Tong <hubert.reinterpretcast_at_[hidden]>
wrote:

> The current R2 draft has this:
>
>> A multicharacter literal shall not have an encoding prefix. Each
>> character represented by a *basic-c-char* or a *universal-character-name*
>> in a multicharacter literal shall be encodable as a single code unit in the
>> narrow literal encoding.
>
>
> The above does not provide a restriction on *conditional-escape-sequence*s
> and *numeric-escape-sequence*s in multicharacter literals. We presumably
> only want to allow ones that are valid as the sole *c-char* in a
> *character-literal* with no encoding prefix. Indeed, that general
> description may be sufficient for all forms of *c-char*.
>

Why should it?
My only goal is to forbid multicharacter literals that are visually
indistinguishable from single-character literals, in scenarios where
multiple code points result in a single glyph.
Given the implementation-defined nature of multicharacter literals, I do not
think adding further restrictions on *numeric-escape-sequence*s has any value
in this scenario. What would be gained, or what pitfall avoided, by further
restriction?
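
To illustrate the scenario, here is a minimal sketch that assumes UTF-8 as
the narrow literal encoding. The helper `code_units` is a hypothetical name
used only for this example and is not part of the paper. It counts code
units at compile time to show that a precomposed character and a base
character followed by a combining mark can render as the same glyph while
being different sequences.

```cpp
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// excluding the terminating null character.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }

// Precomposed U+00E9 (e with acute): one code point, two UTF-8 code units.
static_assert(code_units(u8"\u00e9") == 2, "");

// 'e' followed by U+0301 COMBINING ACUTE ACCENT: two code points,
// three UTF-8 code units, typically rendered as the same single glyph.
static_assert(code_units(u8"e\u0301") == 3, "");

// Under the quoted R2 wording, assuming a UTF-8 narrow literal encoding:
//   'ab'        // fine: each character fits in a single code unit
//   'e\u0301'   // ill-formed: U+0301 does not fit in a single code unit
```

The second case is the one that looks like a single-character literal in
source but is in fact a multicharacter literal.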

>
> Also, the title of the paper is not particularly helpful in terms of
> indicating what it proposes. I think something like "Support only
> straightforward multicharacter literals and encodable string literals"
> would be better.
>
> -- HT
>

Received on 2021-11-06 03:17:26