SG16

Subject: Concatenating unicode string literals
From: Alisdair Meredith (alisdairm_at_[hidden])
Date: 2020-07-08 06:09:39


After taking another look over P2029, which resolves a few core issues,
I am further concerned by [lex.string]p11, which states (among
other things) that concatenation of unicode string literals with
different encoding-prefixes is conditionally supported with
implementation-defined behavior. That seems a little too free for
my tastes.

I can buy conditionally supported, although I see no harm in
requiring it for any combination of unicode encoding prefixes.
I am concerned about the implementation-defined behavior:
the end result should be the result of concatenating the
transcoded representation of each of the strings in a common
encoding, corresponding to one of the involved encoding
prefixes. I am happy to defer to implementations to choose
between UTF-8/16/32, or we could define a canonical preferred
ordering among those choices.
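
To make the intended result concrete, here is a rough sketch of what I
would expect for a UTF-8 literal followed by a UTF-16 literal, assuming
UTF-16 is the chosen common encoding. The helper names (decode_utf8,
encode_utf16, concat_as_utf16) are hypothetical and only illustrate the
transcode-then-concatenate model at run time; this is not proposed
wording, and a compiler would of course do this during translation. The
UTF-8 input is passed as plain bytes and is assumed to be well-formed,
as a string literal would be.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decode well-formed UTF-8 bytes into code points.
// No validation is performed (assumption: input is well-formed).
std::u32string decode_utf8(const std::string& s) {
    std::u32string out;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        // Number of continuation bytes implied by the lead byte.
        int extra = b < 0x80 ? 0 : b < 0xE0 ? 1 : b < 0xF0 ? 2 : 3;
        // Payload bits of the lead byte: 7, 5, 4, or 3 bits.
        char32_t cp = extra == 0 ? b : b & (0x3F >> extra);
        for (int k = 1; k <= extra; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// Hypothetical helper: encode code points as UTF-16, using surrogate
// pairs for code points outside the Basic Multilingual Plane.
std::u16string encode_utf16(const std::u32string& cps) {
    std::u16string out;
    for (char32_t cp : cps) {
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// The model described above: transcode each piece to the common
// encoding (UTF-16 here, by assumption), then concatenate.
std::u16string concat_as_utf16(const std::string& utf8_part,
                               const std::u16string& utf16_part) {
    return encode_utf16(decode_utf8(utf8_part)) + utf16_part;
}
```

Under this model, the UTF-8 bytes C3 A9 (U+00E9) followed by u"\u00E8"
would yield the same value as u"\u00E9\u00E8", regardless of which
implementation produced it.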

Does this seem worth calling out (yet another SG16 paper), or is it
better left alone, as we already have way too much busy work
on this group's plate, and implementations will most likely do the
right thing anyway?

(I am not overly concerned about specifying concatenation for
narrow/wide string literals with unicode string literals, which
can remain conditionally supported with implementation-defined
values.)

AlisdairM


SG16 list run by sg16-owner@lists.isocpp.org