Re: [SG16] Concatenating unicode string literals

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Wed, 8 Jul 2020 21:15:31 +0200
On 08/07/2020 13.09, Alisdair Meredith via SG16 wrote:
> After taking another look over P2029 resolving a few core issues,
> I am further concerned by [lex.string]p11, which states (among
> other things) that concatenation of Unicode string literals with
> different encoding-prefixes is conditionally supported with
> implementation-defined behavior. That seems a little too free for
> my tastes.
>
> I can buy conditionally supported, although I see no harm in
> requiring it for any combination of Unicode encoding prefixes.
> I am concerned about the implementation-defined behavior:
> the end result should be the result of concatenating the
> transcoded representation of each of the strings into a common
> encoding, corresponding to one of the involved encoding
> prefixes.

That's not how it works. You first pick a common
encoding-prefix for the concatenation (whatever it is),
and then you encode the entire (concatenated) string
using that encoding-prefix.
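
A minimal sketch of that rule, using only the uncontroversial case where one literal has a prefix and the other has none. The names length, utf16 and utf32 are made up for illustration, and the checks assume that u and U literals are encoded as UTF-16 and UTF-32, as they are on the common implementations. The point is that U+00E9 written in the unprefixed literal ends up as a single code unit of the prefixed encoding, because the concatenated string is encoded as a whole:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of elements in a character array,
// including the terminating null.
template <typename T, std::size_t N>
constexpr std::size_t length(const T (&)[N]) { return N; }

// The common encoding-prefix is u, so the entire concatenated string,
// including the part written without a prefix, is a char16_t string.
constexpr char16_t utf16[] = u"x" "\u00e9";

// Same rule with the prefix on the second literal: the entire string
// is a char32_t string.
constexpr char32_t utf32[] = "\u00e9" U"x";

static_assert(length(utf16) == 3, "x, U+00E9, null terminator");
static_assert(utf16[1] == 0x00E9, "U+00E9 is one UTF-16 code unit");
static_assert(length(utf32) == 3, "U+00E9, x, null terminator");
static_assert(utf32[0] == 0x00E9, "U+00E9 is one UTF-32 code unit");
```

If each piece were encoded separately and the results glued together, the unprefixed piece would use the ordinary narrow encoding, and the static_asserts above would not be expected to hold.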

> I am happy to defer to implementations to choose
> between UTF-8/16/32, or we could define a canonical preferred
> ordering among those choices.

Since all four well-known C++ implementations appear to
produce an error for the test cases at
https://compiler-explorer.com/z/4NDo-4
I'm fine with specifying these as ill-formed.

There is no (technical) need to support these cases,
and nobody has written code like that (because
no compiler accepts it), so let's nix it.
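
For readers who cannot follow the link, the cases in question presumably look something like the commented-out lines below. This is an assumption about their shape, not a copy of the linked tests, and the names m1, m2 and same_prefix are invented for illustration. The commented-out lines mix two different Unicode encoding-prefixes; the remaining code shows the form that is accepted, with one prefix throughout:

```cpp
#include <cassert>
#include <cstddef>

// Assumed shape of the rejected cases (kept commented out so that the
// sketch still compiles):
// const auto* m1 = u8"a" u"b";  // UTF-8 prefix mixed with UTF-16 prefix
// const auto* m2 = u"a" U"b";   // UTF-16 prefix mixed with UTF-32 prefix

// The same text with a single encoding-prefix on every piece.
constexpr char16_t same_prefix[] = u"a" u"b";

static_assert(sizeof(same_prefix) / sizeof(same_prefix[0]) == 3,
              "a, b, null terminator");
```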

From a procedural standpoint, P2029 produces enough
churn in the general area that I'd like to see P2029
hit the working draft before future papers in that
area are processed.

Jens

Received on 2020-07-08 14:18:59