Re: [SG16] Handling literals throughout the translation phases

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Sat, 19 Dec 2020 23:45:28 +0100
On 18/12/2020 10.33, Corentin Jabot wrote:
> On Thu, Dec 17, 2020, 22:33 Jens Maurer via SG16 <sg16_at_[hidden]> wrote:
>> I'm working on a paper that switches C++ to a modified "model B" approach for
>> universal-character-names as described in the C99 Rationale v5.10, section 5.2.1.
> I thought SG16 agreed a few meetings ago not to replace UCNs until phase 5. Did I completely misunderstand what SG16 agreed to?

The difference is that we do not produce UCNs in phase 1.
Instead, phase 1 simply produces Unicode scalar values.
Any UCNs that appeared in the original source are replaced later.
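
As a rough sketch of that ordering (illustrative only, not proposed wording): assume phase 1 has already decoded the source file into a sequence of Unicode scalar values, held here in a std::u32string. A later step then replaces any UCN spelled out in the source with the scalar value it names. The names hex_value and replace_ucns are hypothetical, and the sketch ignores raw string literals, escaped backslashes, and range checks such as rejecting surrogates.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: numeric value of a hex digit, or -1 if c is not one.
int hex_value(char32_t c) {
    if (c >= U'0' && c <= U'9') return static_cast<int>(c - U'0');
    if (c >= U'a' && c <= U'f') return static_cast<int>(c - U'a') + 10;
    if (c >= U'A' && c <= U'F') return static_cast<int>(c - U'A') + 10;
    return -1;
}

// Hypothetical later step: the input is what phase 1 produced (scalar values
// only). Each \uXXXX or \UXXXXXXXX spelled out in the source is replaced by
// the single scalar value it names; everything else is copied unchanged.
std::u32string replace_ucns(const std::u32string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] == U'\\' && i + 1 < in.size() &&
            (in[i + 1] == U'u' || in[i + 1] == U'U')) {
            const std::size_t digits = (in[i + 1] == U'u') ? 4 : 8;
            if (i + 2 + digits <= in.size()) {
                char32_t value = 0;
                bool ok = true;
                for (std::size_t k = 0; k < digits; ++k) {
                    const int h = hex_value(in[i + 2 + k]);
                    if (h < 0) { ok = false; break; }
                    value = value * 16 + static_cast<char32_t>(h);
                }
                if (ok) {
                    out.push_back(value);
                    i += 2 + digits;
                    continue;
                }
            }
        }
        out.push_back(in[i]);  // not a complete UCN: keep the character as-is
        ++i;
    }
    return out;
}
```

With this, the eight characters a \ u 0 0 E 9 b become the three scalar values 'a', U+00E9, 'b', while an é typed directly in the source would have arrived from phase 1 as U+00E9 already.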

>> My current idea is to focus on the creation of the string literal
>> object; that's when transcoding to execution (literal) encoding
>> happens. All other uses of string-literals don't produce objects,
>> so aren't transcoded.
>> In order to be able to interpret escape-sequences in phase 5/6,
>> we need a "tunnel" for numeric-escape-sequences. One idea would
>> be to add "code unit characters" to the translation character set,
>> where each such character represents a code unit coming from a
>> numeric-escape-sequence. The sole purpose is to keep the
>> code units safe until we produce the initializer for the
>> string literal object.
>> The alternative would be to delay all interpretation of escape-
>> sequences to when we produce the initializer for the string
>> literal object, but that also means we need to delay string
>> literal concatenation until that time (see first item above).
> Would that cause any issue? This would otherwise be my preferred solution!

We currently support operator "" "" "" in [over.literal], for example.
We'd need to make string-literal concatenation a first-class citizen
in phase 7 (e.g. making it a constant expression or so), which is a fairly
large hammer.
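
For comparison, the "code unit characters" tunnel quoted above could be modeled roughly as follows. This is only a sketch under stated assumptions: TChar, hex_digit, interpret_escapes and make_initializer are hypothetical names, only \x escapes are handled, no range checking is done, and UTF-8 is assumed as the literal encoding.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical element of the translation character set: either a Unicode
// scalar value or a raw code unit that came from a numeric-escape-sequence.
struct TChar {
    std::uint32_t value;
    bool is_code_unit;
};

// Hypothetical helper: numeric value of a hex digit, or -1 if c is not one.
int hex_digit(char32_t c) {
    if (c >= U'0' && c <= U'9') return static_cast<int>(c - U'0');
    if (c >= U'a' && c <= U'f') return static_cast<int>(c - U'a') + 10;
    if (c >= U'A' && c <= U'F') return static_cast<int>(c - U'A') + 10;
    return -1;
}

// Early step (phase 5/6 in the mail): interpret \x escapes, but keep the
// result tagged as a code unit instead of transcoding anything yet.
std::vector<TChar> interpret_escapes(const std::u32string& body) {
    std::vector<TChar> out;
    std::size_t i = 0;
    while (i < body.size()) {
        if (body[i] == U'\\' && i + 1 < body.size() && body[i + 1] == U'x') {
            std::uint32_t unit = 0;
            std::size_t j = i + 2;
            while (j < body.size() && hex_digit(body[j]) >= 0) {
                unit = unit * 16 + static_cast<std::uint32_t>(hex_digit(body[j]));
                ++j;
            }
            if (j > i + 2) {  // at least one hex digit followed \x
                out.push_back({unit, true});
                i = j;
                continue;
            }
        }
        out.push_back({static_cast<std::uint32_t>(body[i]), false});
        ++i;
    }
    return out;
}

// Late step: produce the initializer for the string literal object. Scalar
// values are transcoded (to UTF-8, by assumption); code units pass through.
std::string make_initializer(const std::vector<TChar>& chars) {
    std::string bytes;
    for (const TChar& c : chars) {
        const std::uint32_t v = c.value;
        if (c.is_code_unit) {
            bytes.push_back(static_cast<char>(v & 0xFF));  // untouched code unit
        } else if (v < 0x80) {
            bytes.push_back(static_cast<char>(v));
        } else if (v < 0x800) {
            bytes.push_back(static_cast<char>(0xC0 | (v >> 6)));
            bytes.push_back(static_cast<char>(0x80 | (v & 0x3F)));
        } else if (v < 0x10000) {
            bytes.push_back(static_cast<char>(0xE0 | (v >> 12)));
            bytes.push_back(static_cast<char>(0x80 | ((v >> 6) & 0x3F)));
            bytes.push_back(static_cast<char>(0x80 | (v & 0x3F)));
        } else {
            bytes.push_back(static_cast<char>(0xF0 | (v >> 18)));
            bytes.push_back(static_cast<char>(0x80 | ((v >> 12) & 0x3F)));
            bytes.push_back(static_cast<char>(0x80 | ((v >> 6) & 0x3F)));
            bytes.push_back(static_cast<char>(0x80 | (v & 0x3F)));
        }
    }
    return bytes;
}
```

The tag is what keeps \xFF (one byte, 0xFF) distinct from the character U+00FF (two bytes, 0xC3 0xBF, under the UTF-8 assumption) between escape interpretation and object creation.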


Received on 2020-12-19 16:45:33