On 4/6/21 4:49 PM, Corentin Jabot wrote:


On Tue, Apr 6, 2021 at 10:17 PM Jens Maurer via SG16 <sg16@lists.isocpp.org> wrote:
On 06/04/2021 05.33, Tom Honermann via SG16 wrote:
> FYI, WG14 will be considering N2688 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2688.pdf>, a paper proposing to clarify how characters in character and string literals are handled when the execution character set (literal encoding) does not support representation for the source character (/basic-c-char/ or /universal-character-name/ in C++) or escape sequence (presumably just /simple-escape-sequence/ in C++, but the paper doesn't explicitly exclude numeric escape sequences).

In C++, we say implementation-defined for character literals
and, for string literals:

If the string-literal’s encoding-prefix is absent or L, then
the string-literal is conditionally-supported
and an implementation-defined code unit sequence is encoded.
Otherwise, the string-literal is ill-formed.

So, C++ allows an implementation to pick a fresh random character
every time the situation occurs, except that UTF-x string literals
are ill-formed if they contain a non-encodable character
(can those even exist?)
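
As a rough illustration of the parenthetical question above, here is a minimal C++ sketch, assuming a C++11-or-later compiler. The names `code_units` and `check_utf_literals` are hypothetical helpers made up for this example; they come from neither the standard nor N2688. The sketch only shows that the UTF-8, UTF-16 and UTF-32 literal encodings represent the scalar values tried here. The ordinary-literal case is left in a comment because its result is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// excluding the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }

// UTF-x literals: each universal-character-name below names a Unicode
// scalar value, and UTF-8/16/32 can encode any scalar value.
static_assert(code_units(u8"\u00E9") == 2, "U+00E9 is two UTF-8 code units");
static_assert(code_units(u"\U0001F600") == 2, "U+1F600 is a UTF-16 surrogate pair");
static_assert(code_units(U"\U0001F600") == 1, "U+1F600 is one UTF-32 code unit");

// Ordinary and wide literals are the case the quoted wording addresses:
//   const char* s = "\U0001F600";
// Whether this is accepted, and which code units result, depends on the
// implementation's literal encoding, so nothing is asserted about it here.

// Hypothetical runtime check mirroring the compile-time assertions.
inline void check_utf_literals() {
    assert(code_units(u8"\U0001F600") == 4);  // four UTF-8 code units
}
```

The helper is a template so that it accepts `u8` literals both before C++20 (arrays of `char`) and from C++20 on (arrays of `char8_t`).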

Should SG16 offer a C++ perspective to WG14, e.g. via SG22?

CCing Aaron.

I'd rather not spend SG16 telecon time on this, but if WG14 elects to make a change, then I think SG22 can handle propagating the change to WG21 or otherwise resolving any incompatibility, consulting SG16 as Aaron feels necessary.


We might want to postpone that discussion - Peter and I will bring an updated revision of wg21.link/P1854 which makes that ill-formed in C++.
I will note that "non-encodable character" is certainly a better term than "sterile character" (better yet: "non-representable abstract character").

And that would be good information to share with SG22.

Tom.