SG16

Subject: Re: [SG16-Unicode] [isocpp-core] Feedback on P1139: Address wording issues related to ISO 10646
From: Jens Maurer (Jens.Maurer_at_[hidden])
Date: 2019-01-30 16:32:37


On 01/30/2019 11:11 PM, Hubert Tong wrote:
> Where the text says:
> If a /universal-character-name/ does not correspond to any character in ISO/IEC 10646 [ ... ].

That in itself is undesirable: We expressly do not want to have an ever-changing table
of valid characters in the compiler, so the normative statement here is undecidable.

What we want to say is that UCNs having values outside of 0x0-0x10FFFF are ill-formed.
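
As a rough sketch of that range check (the function names below are hypothetical and come from neither P1139 nor any compiler; treating surrogates as a separate check is an assumption about how an implementation might be structured), the test depends only on the numeric value and needs no table of assigned characters:

```cpp
#include <cstdint>

// Hypothetical helper: true if the numeric value of a
// universal-character-name lies in the range 0x0-0x10FFFF, inclusive.
// The check is purely numeric, so it does not change between
// editions of ISO/IEC 10646.
constexpr bool ucn_in_codespace(std::uint32_t value) {
    return value <= 0x10FFFF;
}

// Hypothetical helper: true for surrogate code points (0xD800-0xDFFF),
// which are restricted separately from the range check above.
constexpr bool ucn_is_surrogate(std::uint32_t value) {
    return value >= 0xD800 && value <= 0xDFFF;
}
```

With these, \U0010FFFF passes the range check and \U00110000 does not, whichever characters happen to be assigned.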

Which tokens to pick from the Unicode vocabulary to say that is another matter...

Jens

> The note indicates:
> ISO/IEC 10646 code points are within the range 0x0-0x10FFFF, inclusive.
>
> I am not certain what the text is meant to say.
> Is the intention merely that the numeric value shall be within the UCS codespace (0x0-0x10FFFF, inclusive)?
> Is the intention that the code point shall be one of those that are assigned to characters by ISO/IEC 10646 (in which case, the additional limitation on surrogate code points is redundant)?
> Is the intention that the code point shall not be one whose basic type is the noncharacter type?
>
> Also, the paper should probably include something like the following:
> Modify in [lex.string],
> <del>Within char32_t and char16_t string literals, any /universal-character-names/ shall be within the range 0x0 to 0x10FFFF.</del>


SG16 list run by sg16-owner@lists.isocpp.org