Damn, I finally see the issue.

Terribly sorry it took this long.

This leads me to think that the current order of operations is the better place to be, unless we find a better mechanism.

I think the status quo, in terms of observable behavior for escape sequences, is correct.

I don't feel good about introducing weird wording hacks, such as more abstract characters, to achieve that behavior while swapping the operations.

I think we already decided that in phase 5 each character is encoded individually, and in any case there cannot be partial code unit sequences anywhere in any string.

Therefore, maybe the current order of operations makes sense, as the ordering cannot be observed.

One of the issues we had was stateful encodings. I am of the mind that this can be left to implementation discretion, and there seems to be limited value in the standard specifying behavior there. It seems to be working fine currently.

TL;DR: I feel like I have been wrong.

On Mon, Jan 4, 2021 at 11:42 PM Steve Downey via SG16 <sg16@lists.isocpp.org> wrote:

Also, if the change in behavior happens anywhere, it's likely to be in code that is nearly impossible to fix, because someone is doing something legal but terrible with the preprocessor.

On Mon, Jan 4, 2021 at 5:28 PM Tom Honermann via SG16 <sg16@lists.isocpp.org> wrote:
On 1/4/21 9:54 AM, Peter Brett via SG16 wrote:
> Please could someone remind me of the *downsides* of allowing escape sequences to be synthesized into string literals through pre-processor concatenation?

The only way to end a hexadecimal escape sequence is with the end of the
string literal or with a character other than a hex digit (a-f, A-F, or
0-9). If concatenation were performed before recognition of escape
sequences, then encoding any hex digit immediately after a hexadecimal
escape sequence would require writing that digit as an escape sequence.
Similar concerns exist for octal escape sequences, but they could be
avoided by always using a maximal-length (three-digit) octal escape
sequence.
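
To make the point concrete, here is a small sketch of the behavior under
the current ordering, where escape sequences are recognized in each
literal before adjacent literals are concatenated. The array names are
made up for illustration; the checks are compile-time only, so the
snippet just needs to compile as a translation unit.

```cpp
// Hex escape terminated by the end of the first literal:
// code unit 0x01, then the character '2', then the terminating '\0'.
constexpr char split_literal[] = "\x1" "2";

// Same characters in a single literal: the hex escape consumes both
// digits, giving one code unit 0x12 plus the terminating '\0'.
constexpr char single_literal[] = "\x12";

// Octal escapes take at most three digits, so a maximal-length escape
// is never extended by a following digit: 0x01, '2', '\0'.
constexpr char octal_literal[] = "\0012";

static_assert(sizeof(split_literal) == 3, "two code units plus NUL");
static_assert(split_literal[0] == 0x01 && split_literal[1] == '2', "");
static_assert(sizeof(single_literal) == 2, "one code unit plus NUL");
static_assert(single_literal[0] == 0x12, "");
static_assert(sizeof(octal_literal) == 3, "two code units plus NUL");
static_assert(octal_literal[0] == 0x01 && octal_literal[1] == '2', "");
```

If concatenation came first, split_literal would presumably become
equivalent to single_literal, and the only way to get 0x01 followed by
'2' would be to escape the '2' as well (for example "\x01\x32" in an
ASCII-based encoding).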

Tom.

>
> Many thanks,
>
> Peter
>
>> -----Original Message-----
>> From: SG16 <sg16-bounces@lists.isocpp.org> On Behalf Of Jens Maurer via SG16
>> Sent: 19 December 2020 22:45
>> To: Corentin Jabot <corentinjabot@gmail.com>; SG16 <sg16@lists.isocpp.org>
>> Cc: Jens Maurer <Jens.Maurer@gmx.net>
>> Subject: Re: [SG16] Handling literals throughout the translation phases
>>
>> On 18/12/2020 10.33, Corentin Jabot wrote:
>>> On Thu, Dec 17, 2020, 22:33 Jens Maurer via SG16 <sg16@lists.isocpp.org> wrote:
>>>
>>> I'm working on a paper that switches C++ to a modified "model B"
>>> approach for universal-character-names as described in the
>>> C99 Rationale v5.10, section 5.2.1.
>>>
>>> I thought SG16 agreed a few meetings ago not to replace UCNs until phase 5.
>>> Did I completely misunderstand what SG16 agreed?
>>
>> The difference is that we do not produce UCNs in phase 1.
>> Instead, phase 1 simply produces Unicode scalar values.
>> Any UCNs that appeared in the original source are replaced later.
>>
>>> My current idea is to focus on the creation of the string literal
>>> object; that's when transcoding to execution (literal) encoding
>>> happens. All other uses of string-literals don't produce objects,
>>> so aren't transcoded.
>>>
>>> In order to be able to interpret escape-sequences in phase 5/6,
>>> we need a "tunnel" for numeric-escape-sequences. One idea would
>>> be to add "code unit characters" to the translation character set,
>>> where each such character represents a code unit coming from a
>>> numeric-escape-sequence. The sole purpose is to keep the
>>> code units safe until we produce the initializer for the
>>> string literal object.
>>>
>>> The alternative would be to delay all interpretation of escape-
>>> sequences to when we produce the initializer for the string
>>> literal object, but that also means we need to delay string
>>> literal concatenation until that time (see first item above).
>>>
>>>
>>> Would that cause any issue? This would otherwise be my preferred solution!
>>
>> We currently support operator "" "" "" in [over.literal], for example.
>> We'd need to make string-literal concatenation first-class citizens
>> in phase 7 (e.g. making it a constant expression or so), which is a fairly
>> large hammer.
>>
>> Jens
>>
>>
>> --
>> SG16 mailing list
>> SG16@lists.isocpp.org
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16

