Subject: Re: [SG16-Unicode] Feedback on P1139: Address wording issues related to ISO 10646
From: martinho.fernandes (martinho.fernandes_at_[hidden])
Date: 2019-02-05 04:45:46

> I am not certain what the text is meant to say.
> Is the intention merely that the numeric value shall be within the UCS
> codespace (0x0-0x10FFFF, inclusive)?
> Is the intention that the code point shall be one of those that are
> assigned to characters by ISO/IEC 10646 (in which case, the additional
> limitation on surrogate code points is redundant)?
> Is the intention that the code point shall not be one whose basic type is
> the noncharacter type?

Jens is correct; the only limitation intended is constraining the range
of values to 0..D7FF + E000..10FFFF (i.e. 0..10FFFF with surrogates
carved out). I should not have used "character" there. The note uses
correct unambiguous wording with "ISO/IEC 10646 code points", so I will
rephrase the normative text as follows.

> If a /universal-character-name/ does not correspond to a code point
> in ISO/IEC 10646 or if a /universal-character-name/ corresponds to a
> surrogate code point [...]

Note that I omitted the notes here for brevity and clarity; I think they
should remain.
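
For illustration only, the intended constraint can be sketched as a range check. This is a minimal sketch and not wording from the paper; the function name `is_valid_ucn_value` is hypothetical. Under the reading above, noncharacters such as U+FFFE and unassigned code points pass the check, because only the codespace bound and the surrogate range matter.

```cpp
#include <cstdint>

// Hypothetical helper (not from P1139): returns true when `cp` lies in
// 0..D7FF or E000..10FFFF, i.e. inside the ISO/IEC 10646 codespace and
// outside the surrogate range D800..DFFF.
constexpr bool is_valid_ucn_value(std::uint32_t cp) noexcept {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// Boundary checks for the two sub-ranges.
static_assert(is_valid_ucn_value(0x0000), "lowest code point");
static_assert(is_valid_ucn_value(0xD7FF), "last code point before the surrogates");
static_assert(!is_valid_ucn_value(0xD800), "first surrogate");
static_assert(!is_valid_ucn_value(0xDFFF), "last surrogate");
static_assert(is_valid_ucn_value(0xE000), "first code point after the surrogates");
static_assert(is_valid_ucn_value(0x10FFFF), "highest code point");
static_assert(!is_valid_ucn_value(0x110000), "outside the codespace");

// Noncharacters are code points too, so they are not rejected here.
static_assert(is_valid_ucn_value(0xFFFE), "noncharacter, still a code point");
```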

Once I have updated the paper I will put it on the wiki (and also
include the changes from the PR directly in the paper so it is

Martinho Fernandes

SG16 list run by herb.sutter at gmail.com