On Thu, 9 Jul 2020 at 22:16, Jens Maurer <Jens.Maurer@gmx.net> wrote:

On 09/07/2020 22.12, Corentin Jabot wrote:
>
>
> On Thu, Jul 9, 2020, 21:44 Tom Honermann via SG16 <sg16@lists.isocpp.org> wrote:
>
> On 7/9/20 3:16 PM, Jens Maurer wrote:
> > On 09/07/2020 18.28, Tom Honermann wrote:
> >> On 7/8/20 3:15 PM, Jens Maurer wrote:
> >>> Since all four well-known C++ implementations appear to
> >>> produce an error for the test cases at
> >>> https://compiler-explorer.com/z/4NDo-4
> >>> I'm fine with specifying these as ill-formed.
> >> I'm fine with that as well.
> >>
> >> Jens, would you consider such a change as evolutionary given that we don't know of any implementations (so far) that actually support these concatenations?
> > I'm not the one to make the call here.
> I know, I was just looking for an opinion from a CWG regular. Thank you.
> > Strictly speaking, it changes the standard for some feature from
> > "conditionally-supported" to "ill-formed", which does sound a bit
> > evolutionary, in particular since we depart a little further from
> > C here.
> >
> > However, personally, I'm ok with this going to Core right away.
> >
> > JF should make the call here.
>
> Agreed.
>
> We don't have a paper for this yet. If we have a volunteer to write a
> paper to make concatenations of string literals with mixed L"", u8"",
> u"", and U"" encoding prefixes ill-formed, I'll be happy to discuss with
> JF with encouragement to take it straight to Core.
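
For readers following along, here is a minimal sketch of the kind of concatenation under discussion. The variable names are made up for illustration. The commented-out lines are the mixed-prefix cases: as of C++20, u8"" next to L"" is already ill-formed, and the other mixes are conditionally-supported, which is what the proposed paper would change.

```cpp
#include <cassert>
#include <cstddef>

// Same prefix on both sides: well-formed, the result keeps that prefix.
constexpr char16_t same_prefix[] = u"ab" u"cd";   // same as u"abcd"
constexpr wchar_t  wide_both[]   = L"ab" L"cd";   // same as L"abcd"

// One side unprefixed: well-formed, the unprefixed literal adopts the prefix.
constexpr char32_t one_prefix[]  = U"ab" "cd";    // same as U"abcd"

// Mixed prefixes: the cases the thread wants to make ill-formed.
// constexpr auto mixed1 = L"ab" u"cd";
// constexpr auto mixed2 = u"ab" U"cd";
// constexpr auto mixed3 = u8"ab" L"cd";  // already ill-formed today

// Hypothetical helper: number of elements, including the null terminator.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) { return N; }

static_assert(element_count(same_prefix) == 5, "four code units plus terminator");
static_assert(element_count(one_prefix) == 5, "four code units plus terminator");
```

Uncommenting any of the mixed lines is expected to produce a diagnostic on the implementations mentioned earlier in the thread, though the exact message differs per compiler.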
>
>
> There is one

If we want to maintain the option of going straight to Core,
we can't mix this isolated issue with anything else that
might be more controversial.

True.

There is no urgency though; there is no point even trying until we can word against P2029.

> and as Jens said we can't do the wording for that right now.
> The wording paper should also make sure that the order of operations is correct.
>
> ( Replacement of escape sequences, concatenation, encoding)

Unfortunately, it's not that easy, because numeric-escape-sequences
produce individual code units, not to-be-encoded characters.
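
A small sketch of that distinction, with made-up variable names: a numeric escape sequence contributes exactly one code unit with the given value, whereas a universal-character-name denotes a character that is then encoded, possibly into several code units.

```cpp
#include <cassert>
#include <cstddef>

// Two hex escapes: two char16_t code units, taken verbatim.
// Written this way, they happen to spell a UTF-16 surrogate pair by hand.
constexpr char16_t by_escape[] = u"\xD83D\xDE00";

// One universal-character-name: one character (U+1F600), which the
// implementation encodes as UTF-16, yielding the same two code units.
constexpr char16_t by_ucn[] = u"\U0001F600";

// Narrow literal: each hex escape is one char-sized code unit,
// independent of the ordinary literal encoding.
constexpr char raw[] = "\xC3\xA9";

static_assert(sizeof(by_escape) / sizeof(by_escape[0]) == 3, "2 units + terminator");
static_assert(sizeof(by_ucn) / sizeof(by_ucn[0]) == 3, "2 units + terminator");
static_assert(by_escape[0] == by_ucn[0] && by_escape[1] == by_ucn[1], "same units");
static_assert(sizeof(raw) == 3, "2 units + terminator");
static_assert(static_cast<unsigned char>(raw[0]) == 0xC3, "value taken as written");
```

The two char16_t arrays end up identical, but they get there differently: the first never goes through an encoding step, which is why a simple "replace escapes, concatenate, encode" ordering does not describe it.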

Indeed. Very good point.

Let me try that again:

They should actually be encoded separately, but the encoding is determined beforehand.

Then they are concatenated, and then a null terminator is added.

(That still has the issue of potentially generating unnecessary shift states, but I think that doesn't need to be dealt with normatively.)
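
To illustrate why the ordering matters, here is a sketch of what existing implementations already do (this is current behaviour, not proposed wording; the variable names are made up). Escape sequences are resolved within each literal before concatenation, and a single null terminator is appended at the end.

```cpp
#include <cassert>
#include <cstddef>

// Each literal is processed on its own first, so "\x1" is one code unit
// with value 1, followed by the character '2' from the second literal.
constexpr char split[] = "\x1" "2";    // { 0x01, '2', '\0' }

// Written as a single literal, the escape swallows both hex digits.
constexpr char joined[] = "\x12";      // { 0x12, '\0' }

// Only one terminator is added, after concatenation.
constexpr char pieces[] = "ab" "cd";   // { 'a', 'b', 'c', 'd', '\0' }

static_assert(sizeof(split) == 3, "two code units plus terminator");
static_assert(sizeof(joined) == 2, "one code unit plus terminator");
static_assert(split[0] == 1 && split[1] == '2', "escape did not cross literals");
static_assert(joined[0] == 0x12, "escape consumed both digits");
static_assert(sizeof(pieces) == 5, "single terminator at the end");
```

If concatenation happened before escape processing, split and joined would be the same array, which is not what any of the compilers do.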

Jens