Date: Wed, 30 Jan 2019 23:32:37 +0100
On 01/30/2019 11:11 PM, Hubert Tong wrote:
> Where the text says:
> If a /universal-character-name/ does not correspond to any character in ISO/IEC 10646 [ ... ].
That in itself is undesirable: We expressly do not want to have an ever-changing table
of valid characters in the compiler, so the normative statement here is undecidable.
What we want to say is that UCNs having values outside of 0x0-0x10FFFF are ill-formed.
Which tokens to pick from the Unicode vocabulary to say that is another matter...
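
For illustration only (this is not proposed wording), the checks in question can all be made without an assigned-character table. The sketch below uses hypothetical helper names that appear in neither the paper nor the standard. It assumes the usual Unicode definitions: surrogates are 0xD800-0xDFFF, and noncharacters are 0xFDD0-0xFDEF plus every code point whose low 16 bits are 0xFFFE or 0xFFFF.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.

// True if v lies in the UCS codespace 0x0-0x10FFFF, inclusive.
constexpr bool in_codespace(std::uint32_t v) {
    return v <= 0x10FFFF;
}

// True if v is a surrogate code point (0xD800-0xDFFF).
constexpr bool is_surrogate(std::uint32_t v) {
    return v >= 0xD800 && v <= 0xDFFF;
}

// True if v is a noncharacter: 0xFDD0-0xFDEF, or the last two
// code points of any plane (low 16 bits 0xFFFE or 0xFFFF).
constexpr bool is_noncharacter(std::uint32_t v) {
    return in_codespace(v) &&
           ((v >= 0xFDD0 && v <= 0xFDEF) || (v & 0xFFFE) == 0xFFFE);
}
```

Each of these is a fixed range test that does not change between Unicode versions, whereas "assigned to a character" would need a table that is updated with every revision of ISO/IEC 10646.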
Jens
> The note indicates:
> ISO/IEC 10646 code points are within the range 0x0-0x10FFFF, inclusive.
>
> I am not certain what the text is meant to say.
> Is the intention merely that the numeric value shall be within the UCS codespace (0x0-0x10FFFF, inclusive)?
> Is the intention that the code point shall be one of those that are assigned to characters by ISO/IEC 10646 (in which case, the additional limitation on surrogate code points is redundant)?
> Is the intention that the code point shall not be one whose basic type is the noncharacter type?
>
> Also, the paper should probably include something like the following:
> Modify in [lex.string],
> <del>Within char32_t and char16_t string literals, any /universal-character-names/ shall be within the range 0x0 to 0x10FFFF.</del>
>
>
>
> _______________________________________________
> Core mailing list
> Core_at_[hidden]
> Subscription: http://lists.isocpp.org/mailman/listinfo.cgi/core
> Link to this post: http://lists.isocpp.org/core/2019/01/5500.php
>
Received on 2019-01-30 23:37:52