Subject: Re: WG14 N2688: Sterile characters
From: Aaron Ballman (aaron_at_[hidden])
Date: 2021-04-08 10:24:16
On Tue, Apr 6, 2021 at 5:47 PM Tom Honermann <tom_at_[hidden]> wrote:
> On 4/6/21 4:49 PM, Corentin Jabot wrote:
> On Tue, Apr 6, 2021 at 10:17 PM Jens Maurer via SG16 <sg16_at_[hidden]> wrote:
>> On 06/04/2021 05.33, Tom Honermann via SG16 wrote:
>> > FYI, WG14 will be considering N2688 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2688.pdf>, a paper proposing to clarify how characters in character and string literals are handled when the execution character set (literal encoding) does not support representation for the source character (/basic-c-char/ or /universal-character-name/ in C++) or escape sequence (presumably just /simple-escape-sequence/ in C++, but the paper doesn't explicitly exclude numeric escape sequences).
>> In C++, we say implementation-defined for character literals
>> and, for string literals:
>> If the string-literal's encoding-prefix is absent or L, then
>> the string-literal is conditionally-supported
>> and an implementation-defined code unit sequence is encoded.
>> Otherwise, the string-literal is ill-formed.
>> So, C++ allows an implementation to pick a fresh random character every time
>> the situation occurs, except that UTF-x string literals
>> are ill-formed if they contain a non-encodable character
>> (can those even exist?)
>> Should SG16 offer a C++ perspective to WG14, e.g. via SG22?
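The C++ rules quoted above can be illustrated with a minimal sketch. It assumes U+20AC EURO SIGN as the test character and uses a hypothetical helper, `code_units`; neither comes from N2688 or from this thread. For the u8, u and U prefixes the encoding is fixed, so the code unit counts are known. For the unprefixed and L forms the result depends on the implementation's literal encoding, and an implementation whose encoding cannot represent the character may substitute something else or reject the literal.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// not counting the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }

// UTF-x literals: the encoding is fixed by the prefix, so these counts
// do not depend on the implementation's literal encoding.
static_assert(code_units(u8"\u20AC") == 3, "UTF-8: E2 82 AC");
static_assert(code_units(u"\u20AC") == 1, "UTF-16: 20AC");
static_assert(code_units(U"\u20AC") == 1, "UTF-32: 000020AC");

// Ordinary and wide literals: the code unit sequence is determined by the
// implementation's ordinary/wide literal encoding. With a UTF-8 ordinary
// literal encoding the first count is 3; with an encoding that lacks the
// euro sign, the result is implementation-defined (assumption: the
// compiler accepts the literal at all).
inline std::size_t ordinary_units() { return code_units("\u20AC"); }
inline std::size_t wide_units() { return code_units(L"\u20AC"); }

// Runtime check that only relies on what holds on any implementation that
// accepts the literals: at least one code unit is produced.
inline void check_implementation_defined_cases() {
    assert(ordinary_units() >= 1);
    assert(wide_units() >= 1);
}
```

Only the first three counts are portable. The last two are exactly the cases where C++ says "implementation-defined" and where N2688 proposes to clarify the C behaviour.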
> CCing Aaron.
> I'd rather not spend SG16 telecon time on this, but if WG14 elects to make a change, then I think SG22 can handle propagating the change to WG21 or otherwise resolving any incompatibility, consulting SG16 as Aaron feels necessary.
My hope is that SG22 would see the paper first as this would have
impact on string literals in header files shared between C and C++,
and SG22 can give a recommendation to WG14 and WG21 on whether
that's likely to be an issue or not. However, the order in which
papers arrive in SG22 is still a bit up-in-the-air, so whether WG14
sees it and then SG22 or vice versa isn't a big concern.
>> We might want to postpone that discussion - Peter and I will bring an updated revision of wg21.link/P1854 which makes that ill-formed in C++.
>> I will note that "non-encodable character" is certainly a better term than "sterile character" (better yet: "non-representable abstract character").
> And that would be good information to share with SG22.
Agreed! Hopefully I'll remember to share it with WG14 if they see this
paper before SG22 does.