If invoking the native Unicode API requires transcoding,
implementations should substitute invalid code units
with U+FFFD REPLACEMENT CHARACTER per
The Unicode Standard Version 14.0 - Core Specification, Chapter 3.9.
We can update that reference; the chapter number would stay the same.
We need to refer to this document because ISO 10646 does not specify a replacement character mechanism.
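To make the substitution concrete, here is a minimal sketch of a UTF-8 to UTF-16 transcoder that emits U+FFFD for each ill-formed sequence. The function name utf8_to_utf16_lossy is hypothetical and is not part of any standard or native API. The choice to emit one U+FFFD per maximal ill-formed subsequence is an assumption based on the practice recommended in the Unicode Standard, not something the wording above mandates.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not a standard API): transcode UTF-8 to UTF-16,
// emitting U+FFFD for each maximal ill-formed subsequence.
std::u16string utf8_to_utf16_lossy(std::string_view in) {
    constexpr char16_t replacement = 0xFFFD;
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const auto b0 = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        int trail = 0;                        // continuation bytes expected
        unsigned char lo = 0x80, hi = 0xBF;   // allowed range for the next byte
        if (b0 < 0x80) {
            cp = b0;
        } else if (b0 >= 0xC2 && b0 <= 0xDF) {
            cp = b0 & 0x1F; trail = 1;
        } else if (b0 >= 0xE0 && b0 <= 0xEF) {
            cp = b0 & 0x0F; trail = 2;
            if (b0 == 0xE0) lo = 0xA0;        // rejects overlong encodings
            if (b0 == 0xED) hi = 0x9F;        // rejects encoded surrogates
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            cp = b0 & 0x07; trail = 3;
            if (b0 == 0xF0) lo = 0x90;        // rejects overlong encodings
            if (b0 == 0xF4) hi = 0x8F;        // rejects values above U+10FFFF
        } else {
            out.push_back(replacement);       // byte can never start a sequence
            ++i;
            continue;
        }
        ++i;
        bool ok = true;
        for (int k = 0; k < trail; ++k) {
            if (i >= in.size()) { ok = false; break; }   // truncated input
            const auto b = static_cast<unsigned char>(in[i]);
            if (b < lo || b > hi) { ok = false; break; } // b is re-examined as a lead byte
            cp = (cp << 6) | (b & 0x3F);
            lo = 0x80; hi = 0xBF;             // later bytes use the full range
            ++i;
        }
        if (!ok) {
            out.push_back(replacement);       // one U+FFFD for the bytes consumed so far
            continue;
        }
        if (cp >= 0x10000) {                  // encode as a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

With this sketch, a stray 0xFF byte between two ASCII letters yields a single U+FFFD between them, and a three-byte sequence cut off after its second byte also yields a single U+FFFD.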
====
Normative reference:
Floating reference
This describes the list of properties (such as XID_Start, XID_Continue, Grapheme_Extend, General_Category) used in both the core and library wording.
Neither the names, the status, nor the possible values of the properties have changed.
Note that the value of these properties for individual code points is not governed by that annex, which just describes the list of properties as a whole, and their possible values.
====
Bibliography entry:
Changing that reference has no impact on [lex.identifier], which only refers to XID_Start and
XID_Continue. SG16 is currently either modifying [uaxid] or removing it. We should make sure [uaxid] conforms to the latest version of the annex and update the bibliography entry accordingly.
======
Normative reference:
The Unicode Standard,
Derived Core Properties.
This one is interesting. It's a floating reference, pointing to 15.0 and used in the grammar of identifiers.
Which is great.
The issue is that for the names of characters used to spell identifiers, we refer to ISO 10646, which is based on Unicode 13.
This means that
void f() {
    // Same character spelled two ways; it was introduced in Unicode 14.
    auto \u{16A70} = 0;
    \N{TANGSA LETTER OZ} = 1;
}
is, I guess, technically ill-formed. I.e., the set of identifiers that can be spelled using character names is smaller than the set of valid identifiers.
This is why there is a separate NB comment asking for the character names used in identifiers and
the XID_ properties to be taken from the same version of Unicode,
instead of one from Unicode and one from ISO 10646.
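The mismatch can be modeled with a toy sketch. Both functions below are hypothetical, heavily truncated stand-ins for the real data: one plays the role of a name table frozen at Unicode 13, the other a XID_Start query from a later Unicode version. Neither reflects how a compiler actually stores this data.

```cpp
#include <optional>
#include <string_view>

// Hypothetical stand-in for character name lookup based on Unicode 13.
// Only one entry is listed; TANGSA LETTER OZ is absent because the
// character was introduced in Unicode 14.
constexpr std::optional<char32_t> name_to_code_point_v13(std::string_view name) {
    if (name == "LATIN SMALL LETTER A") return U'a';
    return std::nullopt;
}

// Hypothetical stand-in for XID_Start taken from a later Unicode version,
// which includes U+16A70.
constexpr bool is_xid_start_later(char32_t c) {
    return c == U'a' || c == U'\U00016A70';
}

// U+16A70 may start an identifier according to the newer property data...
static_assert(is_xid_start_later(U'\U00016A70'));
// ...but the older name table cannot resolve its name.
static_assert(!name_to_code_point_v13("TANGSA LETTER OZ").has_value());
```

Drawing both tables from the same Unicode version, as the NB comment requests, would make the second assertion fail for every character accepted by the first.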