sg16: Re: [SG16] Handling literals throughout the translation phases

From: Corentin Jabot <corentinjabot_at_[hidden]>
Date: Mon, 4 Jan 2021 17:53:35 +0100

On Mon, Jan 4, 2021, 16:35 Steve Downey via SG16 <sg16_at_[hidden]>
wrote:

> Allowing escape sequences to be synthesized would be a surprising change
> in behavior. If we were designing this de novo it's just a choice, but the
> preprocessor is old, shared with C, and baked into lots of tools.
>

I feel like I ask this question a lot, but
how much code would be impacted in practice?

Either way, I don't think it matters much:
- replacing escape sequences
- concatenating
- encoding

That seems to be a reasonable order of operations to me.
It might even be less surprising than doing the concatenation first!
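
For reference, a minimal sketch of what that order means for code today, as far
as I understand the current rules: escape sequences are interpreted in each
literal separately, and only then are adjacent literals joined, so a hex escape
cannot "absorb" a digit from the next literal. The helper name below is
hypothetical and only there for illustration.

```cpp
#include <cassert>

// Escape sequences are replaced per literal, then the literals are joined:
// "\x1" "2" is the code unit 0x01 followed by the character '2'.
static_assert(sizeof("\x1" "2") == 3, "two characters plus the terminator");

// Written as a single literal, the same characters form one escape sequence.
static_assert(sizeof("\x12") == 2, "one character plus the terminator");

// Hypothetical helper (not from the thread): checks the contents at run time.
inline void check_no_synthesized_escape() {
    const char joined[] = "\x1" "2";
    assert(joined[0] == '\x01');  // the escape ended at the literal boundary
    assert(joined[1] == '2');     // the digit stayed an ordinary character
    assert(joined[2] == '\0');
}
```

If concatenation came first, the first literal would presumably be read as the
single escape "\x12" instead, which is the synthesized case discussed in this
thread.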

However, if operator"" "" ends up being an issue, I would again ask how
useful that feature is!
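
To make the case concrete, here is a small sketch of the kind of declaration I
have in mind, assuming the compiler concatenates the adjacent empty string
literals before parsing the literal-operator-id (which is how I read the
current wording). The namespace, the `_len` suffix, and the helper function are
made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

namespace demo {  // hypothetical namespace, for illustration only

// The two adjacent empty string literals are expected to be concatenated
// before the declaration is parsed, so this should declare the same
// literal operator as `operator""_len` would.
constexpr std::size_t operator"" ""_len(const char*, std::size_t n) {
    return n;
}

}  // namespace demo

// Hypothetical helper: uses the literal operator declared above.
inline void check_concatenated_operator_name() {
    using namespace demo;
    assert("abc"_len == 3);
    assert(""_len == 0);
}
```

If concatenation were delayed until the string literal object is created, a
declaration like this would need separate handling, which is the concern raised
further down in the thread.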

>
> On Mon, Jan 4, 2021 at 9:54 AM Peter Brett via SG16 <sg16_at_[hidden]>
> wrote:
>
>> Please could someone remind me of the *downsides* of allowing escape
>> sequences to be synthesized into string literals through pre-processor
>> concatenation?
>>
>> Many thanks,
>>
>> Peter
>>
>> > -----Original Message-----
>> > From: SG16 <sg16-bounces_at_[hidden]> On Behalf Of Jens Maurer via SG16
>> > Sent: 19 December 2020 22:45
>> > To: Corentin Jabot <corentinjabot_at_[hidden]>; SG16 <sg16_at_[hidden]>
>> > Cc: Jens Maurer <Jens.Maurer_at_[hidden]>
>> > Subject: Re: [SG16] Handling literals throughout the translation phases
>> >
>> > On 18/12/2020 10.33, Corentin Jabot wrote:
>> > > On Thu, Dec 17, 2020, 22:33 Jens Maurer via SG16 <sg16_at_[hidden]> wrote:
>> > >
>> > >
>> > > I'm working on a paper that switches C++ to a modified "model B"
>> > > approach for universal-character-names as described in the
>> > > C99 Rationale v5.10, section 5.2.1.
>> > >
>> > >
>> > > I thought SG16 agreed a few meetings ago not to replace UCNs until
>> > > phase 5. Did I completely misunderstand what SG16 agreed?
>> >
>> > The difference is that we do not produce UCNs in phase 1.
>> > Instead, phase 1 simply produces Unicode scalar values.
>> > Any UCNs that appeared in the original source are replaced later.
>> >
>> > > My current idea is to focus on the creation of the string literal
>> > > object; that's when transcoding to execution (literal) encoding
>> > > happens. All other uses of string-literals don't produce objects,
>> > > so aren't transcoded.
>> > >
>> > > In order to be able to interpret escape-sequences in phase 5/6,
>> > > we need a "tunnel" for numeric-escape-sequences. One idea would
>> > > be to add "code unit characters" to the translation character set,
>> > > where each such character represents a code unit coming from a
>> > > numeric-escape-sequence. The sole purpose is to keep the
>> > > code units safe until we produce the initializer for the
>> > > string literal object.
>> > >
>> > > The alternative would be to delay all interpretation of escape-
>> > > sequences to when we produce the initializer for the string
>> > > literal object, but that also means we need to delay string
>> > > literal concatenation until that time (see first item above).
>> > >
>> > >
>> > > Would that cause any issue? This would otherwise be my preferred
>> > > solution!
>> >
>> > We currently support operator "" "" "" in [over.literal], for example.
>> > We'd need to make string-literal concatenation a first-class citizen
>> > in phase 7 (e.g. making it a constant expression or so), which is a
>> > fairly large hammer.
>> >
>> > Jens
>> >
>> >
>> > --
>> > SG16 mailing list
>> > SG16_at_[hidden]
>> > https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>
>

Received on 2021-01-04 10:53:50