Date: Wed, 3 Nov 2021 09:21:12 +0100
On 03/11/2021 03.07, Steve Downey via SG16 wrote:
> Updated paper with wording for UCN form of named unicode characters, with changes as suggested by Jens. This reflects the strongest consensus in EWG for exact matches.
Thanks.
Prose text:
"the ISO/IEC standard that specifies a subset of what is specified in the Unicode standard and is kept synchronized with it."
Could we please limit that synchronization claim to the assigned character set?
For example, the definition of UTF-16 is not fully in sync between Unicode
and ISO 10646.
Wording:
Grammar for universal-character-name:
The third option "named-universal-character" was added and needs to be green
(and preferably underlined).
\u hex-quad or \U hex-quad hex-quad
The \u and \U need to be in a fixed-width font.
A named-universal-character designates the character in the translation character set whose associated character name as assigned by ISO 10646 matches the given n-char-sequence.
named-universal-character needs to be in italics (grammar non-terminal).
"whose associated character name ... matches"
This seems to talk about names only, not about aliases.
"whose" further deepens the impression that each character
has a single name.
"matches": We don't have fuzzy match, so just say "is".
Suggestion:
"A named-universal-character designates the character in the translation character set
whose associated character name or character name alias is the given n-char-sequence.
The program is ill-formed if there is no such character."
Turn the drafting note into a proper note.
Meanwhile, ISO 10646:2020 is out, so we should refer to that.
Note that a "character name alias" per ISO 10646 is only used
for typo corrections and similar, and only these aliases are
deemed normative by ISO 10646.
Other aliases, such as "bang" for "EXCLAMATION MARK",
are called "informative aliases" and are non-normative.
Those are currently excluded from C++ per the proposed normative
wording. Is that intentional?
(See ISO 10646:2020 section 34.3.)
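
To make the effect of the suggested wording concrete, here is a minimal sketch of the lookup it describes: an exact comparison of the n-char-sequence against character names and normative character name aliases, with informative aliases such as "bang" left out. The table is a tiny hand-picked excerpt, and the names NameEntry, kNames, and lookup_named_character are hypothetical, chosen for illustration only; this is not how any particular compiler implements it.

```cpp
#include <array>
#include <optional>
#include <string_view>

// Hypothetical table entry: a name (or normative alias) and its code point.
struct NameEntry {
    std::string_view name;
    char32_t code_point;
};

// Illustrative excerpt only. A real implementation would cover every
// character name and normative character name alias.
constexpr std::array<NameEntry, 4> kNames{{
    {"EXCLAMATION MARK", U'\x21'},
    {"LATIN SMALL LETTER A", U'\x61'},
    // Character name as assigned (it contains a well-known typo).
    {"PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET", U'\xFE18'},
    // Normative character name alias correcting that typo.
    {"PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET", U'\xFE18'},
}};

// Returns the designated character if the sequence *is* a name or a
// normative alias (no case folding, no fuzzy matching). An empty result
// corresponds to "the program is ill-formed".
constexpr std::optional<char32_t> lookup_named_character(std::string_view n_char_sequence) {
    for (const NameEntry& entry : kNames) {
        if (entry.name == n_char_sequence) {
            return entry.code_point;
        }
    }
    return std::nullopt;
}

// Name and normative alias both designate U+FE18.
static_assert(lookup_named_character(
    "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET") == U'\xFE18');
static_assert(lookup_named_character(
    "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET") == U'\xFE18');
// The informative alias "bang" is not in the table, so it is rejected.
static_assert(!lookup_named_character("bang").has_value());
// Exact match: a lowercase spelling is rejected as well.
static_assert(!lookup_named_character("exclamation mark").has_value());
```

Under this sketch, admitting informative aliases would only mean adding rows to the table, so the open question is which rows belong there, not how the comparison works.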
Jens
Received on 2021-11-03 03:21:19