sg16: Re: [SG16] Comments on P2361R3 Unevaluated strings

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Wed, 3 Nov 2021 22:56:35 +0100

On 03/11/2021 19.29, Jens Maurer via SG16 wrote:
>
> Just a few random things I ran across.

Some more:

The "asm declaration" wording should probably say

"The string-literal is an unevaluated operand."

otherwise we allocate a static object for it.
Hm... But the rules in [lex.string] on how to interpret
a string-literal only apply if it is actually
evaluated. Maybe that needs more fixing in
[lex.string] to decouple the creation of the code
unit sequence from the actual static object.
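
To illustrate the distinction, a minimal C++ sketch. It
assumes C++14 or later; the names c_linkage_function and
old_api are made up for this example. The asm declaration
itself is left out because it is conditionally-supported.

```cpp
// An evaluated string-literal denotes an array object with static
// storage duration; its size includes the terminating '\0'.
static_assert(sizeof("abc") == 4, "three characters plus '\\0'");

// In the contexts below the string is consumed during translation
// only, so no array object is needed for it.

// 1. Linkage specification (hypothetical function name).
extern "C" int c_linkage_function(int x) { return x + 1; }

// 2. Attribute argument (hypothetical function name).
[[deprecated("use a newer API instead")]] int old_api();

// 3. static_assert message.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
```

None of the three strings in the second half can be
observed at run time, which is why no static object
should have to be allocated for them.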

And in the added text,
"encoding-prefix" needs to be hyphenated and
italicized (it's a grammar non-terminal).

Language linkage:

Previously, we were talking about a string-literal,
so referring to that using "C" or "C++" (with quotation
marks) made sense.

Now, we define language linkage as a sequence of
characters, and the new wording reads as if the
quotation marks are part of the language linkage.

[over.literal]
The change marks seem broken; "unevaluated-string" is
struck out.

Also, unevaluated-strings don't have a trailing "\0"
(because that is added only when we create a string
literal object), so
"other than the implicit terminating '\0'"
seems misleading / wrong.
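
A small C++ sketch of why the '\0' remark does not fit.
The suffix _km is hypothetical and only serves as an
example of a literal-operator-id.

```cpp
// The "" in a literal-operator-id is pure syntax: no string
// literal object is created for it, so there is no '\0' to exclude.
constexpr long double operator""_km(long double v) { return v * 1000.0L; }

// By contrast, an evaluated empty string-literal denotes an array
// of one char, which is the terminating '\0'.
static_assert(sizeof("") == 1, "only the terminating null character");

// The literal operator itself works as usual.
static_assert(1.5_km == 1500.0L, "1.5 km is 1500 m");
```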

[cpp.pragma.op]
As originally specified, it seems that escape sequences
such as \n would be left alone (as two characters)
after destringizing. The new specification seems to
want to defer to [lex.string] for interpretation of those
escape sequences, but fails. ("semantic constraints"
doesn't mean "is interpreted as".) Is it intentional
that the new wording for _Pragma now maps "\n"?

If so, instead of talking about deleting double-quotes,
maybe it would be better to talk about the sequence of
translation character set elements that the string-literal
represents.
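
For reference, a sketch of destringizing as I read the
original wording: delete an L prefix if present, delete
the leading and trailing double-quotes, replace \" by a
double-quote and \\ by a single backslash. The function
destringize is a hypothetical helper for illustration,
not an existing API; malformed input is returned as is.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: applies the original destringizing rules to
// the source spelling of a string-literal (quotes included).
std::string destringize(const std::string& lit) {
    std::size_t begin = 0;
    if (!lit.empty() && lit[0] == 'L') begin = 1;  // drop the L prefix

    // Assumption: anything not of the form "..." is returned unchanged.
    if (lit.size() < begin + 2 || lit[begin] != '"' || lit.back() != '"')
        return lit;

    std::string out;
    // Walk the characters strictly between the two double-quotes.
    for (std::size_t i = begin + 1; i + 1 < lit.size(); ++i) {
        const bool has_next = i + 2 < lit.size();
        if (lit[i] == '\\' && has_next &&
            (lit[i + 1] == '"' || lit[i + 1] == '\\')) {
            out += lit[i + 1];  // \" becomes ", \\ becomes a single backslash
            ++i;                // skip the escaped character
        } else {
            out += lit[i];      // everything else, including \n, is kept
        }
    }
    return out;
}
```

Under these rules the spelling "a\nb" yields the four
characters a, backslash, n, b. Only \" and \\ are
touched; no other escape sequence is interpreted.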

Jens

> translation set -> translation character set
> (appears more than once)
>
>
> The wording doesn't work, because it ignores lex.phases p1.7:
>
> "Each preprocessing token is converted into a token (5.6)."
>
> The "token" grammar non-terminal cannot represent
> an "unevaluated-string", because the latter is not a literal.
>
> Also, as shown in the prose part, we don't know until phase 7
> whether the context needs an object-producing string-literal
> or an unevaluated-string. Thus, unevaluated-string is not
> a phase 3-6 concept, but a phase 7 concept, and thus should
> not appear in [lex].
>
> Note that phases 1-6 do not convert string-literals to
> objects and thus don't interpret escape sequences just yet,
> so it seems safe to assume this is postponed to phase 7.
>
> Suggestion:
>
> - Introduce unevaluated-string in [dcl.pre]
>
> "An unevaluated-string is never evaluated and its interpretation
> depends on the context in which it appears."
>
> ->
>
> "An unevaluated-string is never evaluated.
> [ Note: Thus, a string literal object is never created for an
> unevaluated-string ([lex.string]). ]"
>
> And add a note with cross-references to all places where
> unevaluated-string appears.
>
> Jens
>

Received on 2021-11-03 16:56:40