Date: Mon, 27 Jan 2025 22:01:38 +0100
On Mon, Jan 27, 2025 at 9:31 PM Alisdair Meredith <alisdairm_at_[hidden]> wrote:
>
>
> On Jan 27, 2025, at 3:22 PM, Corentin Jabot via SG16 <
> sg16_at_[hidden]> wrote:
>
> On Mon, Jan 27, 2025 at 7:00 PM Tom Honermann via SG16 <
> sg16_at_[hidden]> wrote:
>
>>
>> - Handling of escape sequences needs more specification. *This does
>> not appear to have been addressed.*
>>
>> Meh. Should we want restrictions, they would be the same for
> static_assert, and I tend to see it as purely QOI:
> https://compiler-explorer.com/z/nresK1qrd
>
>
> My thoughts are that we treat the supplied message like an unevaluated
> string.
>
> Quoting [lex.string.uneval]:
>
> > Each universal-character-name and each simple-escape-sequence in an
> unevaluated-string is replaced by the
> > member of the translation character set it denotes. An
> unevaluated-string that contains a numeric-escape-
> > sequence or a conditional-escape-sequence is ill-formed.
>
> My suggestion would be to add “and is processed as an unevaluated-string”
> to the end of p2 of the library wording.
>
> That should handle making numeric and conditional escape sequences
> ill-formed, and map the universal names and simple escape sequences to
> their corresponding sequence of UTF-8 encoded code points.
>
No, unevaluated strings are always UTF-8, and are lexical constructs (they
are constrained string literals), so we can't talk about
unevaluated-strings in library wording.
What Barry proposes takes an expression, which is post phase 6 (i.e., the
string is converted to the literal encoding (not necessarily UTF-8), escape
sequences are replaced by their respective control characters, numeric
escape sequences are converted to code units and injected, etc.).
So I should have been clearer that the question is really whether we
want to restrict control/non-printable characters rather than escape
sequences. Sorry about that.
>
> AlisdairM
>
>
Received on 2025-01-27 21:01:58