Re: [SG16] Concatenating unicode string literals

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Thu, 9 Jul 2020 22:16:28 +0200
On 09/07/2020 22.12, Corentin Jabot wrote:
>
>
> On Thu, Jul 9, 2020, 21:44 Tom Honermann via SG16 <sg16_at_[hidden]> wrote:
>
> On 7/9/20 3:16 PM, Jens Maurer wrote:
> > On 09/07/2020 18.28, Tom Honermann wrote:
> >> On 7/8/20 3:15 PM, Jens Maurer wrote:
> >>> Since all four well-known C++ implementations appear to
> >>> produce an error for the test cases at
> >>> https://compiler-explorer.com/z/4NDo-4
> >>> I'm fine with specifying these as ill-formed.
> >> I'm fine with that as well.
> >>
> >> Jens, would you consider such a change as evolutionary given that we don't know of any implementations (so far) that actually support these concatenations?
> > I'm not the one to make the call here.
> I know, I was just looking for an opinion from a CWG regular. Thank you.
> > Strictly speaking, it changes the standard for some feature from
> > "conditionally-supported" to "ill-formed", which does sound a bit
> > evolutionary, in particular since we depart a little further from
> > C here.
> >
> > However, personally, I'm ok with this going to Core right away.
> >
> > JF should make the call here.
>
> Agreed.
>
> We don't have a paper for this yet. If we have a volunteer to write a
> paper to make concatenations that mix the L"", u8"", u"", and U""
> prefixes ill-formed, I'll be happy to discuss it with JF and
> encourage him to take it straight to Core.
>
>
> There is one

If we want to maintain the option of going straight to Core,
we can't mix this isolated issue with anything else that
might be more controversial.
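
For reference, a minimal sketch of the concatenations in question. The helper element_type_of is a hypothetical name used only for illustration. The well-formed cases follow the existing rule that an unprefixed literal takes on the prefix of its neighbour; the mixed cases are left commented out because the implementations tested earlier in this thread reject them.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper (not from the thread): yields the element type of a
// string literal. Declaration only; it is used solely inside decltype.
template <typename CharT, std::size_t N>
CharT element_type_of(const CharT (&)[N]);

// Same prefix on both sides: well-formed.
static_assert(std::is_same<decltype(element_type_of(L"abc" L"def")), wchar_t>::value,
              "L + L is a wide string literal");

// One side unprefixed: it is treated as having the other side's prefix.
static_assert(std::is_same<decltype(element_type_of(u"abc" "def")), char16_t>::value,
              "u + unprefixed is a char16_t string literal");
static_assert(std::is_same<decltype(element_type_of("abc" U"def")), char32_t>::value,
              "unprefixed + U is a char32_t string literal");

// Mixed prefixes: the cases under discussion. Uncommenting either line is
// expected to produce an error on the implementations tested in this thread.
// const auto& mixed1 = L"abc" u"def";
// const auto& mixed2 = u"abc" U"def";
```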

> and as Jens said we can't do the wording for that right now.
> The wording paper should also make sure that the order of operations is correct.
>
> (Replacement of escape sequences, concatenation, encoding.)

Unfortunately, it's not that easy, because numeric-escape-sequences
produce individual code units, not to-be-encoded characters.
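
A small sketch of the difference, assuming a compiler that encodes u8"" as UTF-8 and u"" as UTF-16. The helper code_units is a hypothetical name introduced here only to count array elements.

```cpp
#include <cstddef>

// Hypothetical helper (not from the thread): number of code units in a
// string literal, excluding the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }

// A universal-character-name names a character that still has to be encoded.
// U+00E9 becomes two UTF-8 code units (0xC3 0xA9).
static_assert(code_units(u8"\u00E9") == 2, "character is encoded as UTF-8");

// A numeric escape sequence is not encoded: it yields exactly one code unit
// holding the given value, whether or not the result is valid UTF-8.
static_assert(code_units(u8"\xE9") == 1, "numeric escape is one code unit");

// The same character/code-unit split in UTF-16: U+1F600 needs a surrogate pair.
static_assert(code_units(u"\U0001F600") == 2, "surrogate pair in UTF-16");
static_assert(code_units(U"\U0001F600") == 1, "single code unit in UTF-32");

// Ordering matters for concatenation: the escape in the first literal ends
// at the literal's closing quote, so this is 0x01 followed by '2', not 0x12.
static_assert(code_units("\x1" "2") == 2, "escape does not span literals");
static_assert(code_units("\x12") == 1, "single escape, single code unit");
```

So a simple "replace escapes, concatenate, then encode" pipeline would have to carry already-final code units alongside characters that are still to be encoded.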

Jens

Received on 2020-07-09 15:19:48