Thank you for the comments on this thread. There hasn't been much engagement, so we'll discuss this proposal further in a regular SG16 meeting, hopefully the one scheduled for next week.

On Tuesday, LEWG approved forwarding the paper contingent on further SG16 review.

Tom.

On 1/27/25 4:01 PM, Corentin Jabot via SG16 wrote:


On Mon, Jan 27, 2025 at 9:31 PM Alisdair Meredith <alisdairm@me.com> wrote:


On Jan 27, 2025, at 3:22 PM, Corentin Jabot via SG16 <sg16@lists.isocpp.org> wrote:

On Mon, Jan 27, 2025 at 7:00 PM Tom Honermann via SG16 <sg16@lists.isocpp.org> wrote:
  • Handling of escape sequences needs more specification. This does not appear to have been addressed.
Meh. Should we want restrictions, they would be the same as for static_assert, and I tend to see it as purely QOI.

My thoughts are that we treat the supplied message like an unevaluated string.

Quoting [lex.string.uneval]:

> Each universal-character-name and each simple-escape-sequence in an unevaluated-string is replaced by the
> member of the translation character set it denotes. An unevaluated-string that contains a numeric-escape-
> sequence or a conditional-escape-sequence is ill-formed.

My suggestion would be to add “and is processed as an unevaluated-string.” to the end of p2 of the library wording.

That should handle making numeric and conditional escape sequences ill-formed, and map the universal names and simple escape sequences to their corresponding sequence of UTF-8 encoded code points.
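To make the quoted rule concrete, here is a small sketch of how it applies to the string-literal form of static_assert, whose message is an unevaluated-string in the C++26 wording. The name `evaluated` is made up for illustration. The commented-out line is the case the rule rejects; compilers that predate unevaluated-string support may still accept it.

```cpp
// The string-literal message of static_assert is an unevaluated-string.
// A simple-escape-sequence and a universal-character-name are both accepted.
static_assert(true, "tab:\t bullet:\u2022");

// A numeric-escape-sequence in an unevaluated-string is ill-formed under
// [lex.string.uneval], so this line is left commented out:
// static_assert(true, "\x01");

// For contrast (illustrative name): the same spelling in an ordinary string
// literal is fine, and after phase 6 it is just a code unit with value 1.
constexpr char evaluated[] = "\x01";
static_assert(sizeof(evaluated) == 2, "one code unit plus the terminating NUL");
```

The contrast in the last two lines is the reason the rule cannot be applied mechanically to a message that arrives as an already-evaluated expression.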

No, unevaluated strings are always UTF-8, and are lexical constructs (they are constrained string literals), so we can't talk about unevaluated-strings in library wording.
What Barry proposes takes an expression, which is post phase 6 (i.e., the string is converted to the literal encoding (not necessarily UTF-8), escape sequences are replaced by their respective control characters, numeric escape sequences are converted to code units and injected, etc.).

So I should have been clearer that the question is really whether we want to restrict control/non-printable characters rather than escape sequences. Sorry about that.
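As a rough sketch of what a restriction on characters, as opposed to escape sequences, would have to look at: once the message is an evaluated expression, only code units remain, so any check has to inspect those. The helper `has_control_code_unit` below is hypothetical and not taken from any proposal, and it assumes an ASCII-compatible ordinary literal encoding.

```cpp
#include <string_view>

// Hypothetical helper, not part of any proposal: reports whether a message
// contains a code unit in the C0 control range (below 0x20) or DEL (0x7F).
// Assumes an ASCII-compatible ordinary literal encoding.
constexpr bool has_control_code_unit(std::string_view msg) {
    for (char c : msg) {
        const auto u = static_cast<unsigned char>(c);
        if (u < 0x20 || u == 0x7F) {
            return true;
        }
    }
    return false;
}

// After phase 6 the escape sequences are gone; only code units remain.
static_assert(!has_control_code_unit("plain message"), "no control code units");
static_assert(has_control_code_unit("line one\nline two"), "simple escape became 0x0A");
static_assert(has_control_code_unit("\x01"), "numeric escape became 0x01");
```

Note that this sketch flags newline and tab as well, which a real restriction would probably want to allow. It also cannot tell whether a code unit came from a simple escape, a numeric escape, or a computed string, which is why the question ends up being about characters.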

 

AlisdairM