Re: [SG16] The Unicode Standard vs 10646 (which is defective)

From: Corentin Jabot <corentinjabot_at_[hidden]>
Date: Sat, 6 Nov 2021 09:46:30 +0100
I would personally prefer to reference the Unicode Standard (in addition to
ISO 10646), or another Unicode document describing names and aliases,
rather than putting the burden of maintaining a list of aliases on the
C++ standard.
Is that something we can consider?

On Sat, Nov 6, 2021 at 5:25 AM Steve Downey via SG16 <sg16_at_[hidden]> wrote:

> From 24.1 "Character Names List" of the Unicode Standard 14.0 (the
> upstream document that seems to be well maintained)
> Normative Aliases
> A normative character name alias is a formal, unique, and stable alternate
> name for a character. In limited circumstances, characters are given
> normative character name aliases where there is a defect in the character
> name. These normative aliases do not replace the character name, but rather
> allow users to refer formally to the character without requiring the use of
> a defective name. For more information, see Section 4.8, Name.
> Normative aliases which provide information about corrections to defective
> character names or which provide alternate names in wide use for a Unicode
> format character are printed in the character names list, preceded by a
> special symbol ※. Normative aliases serving other purposes, if listed, are
> shown by convention in all caps, following an “=”. Normative aliases of
> type “figment” for control codes are not listed. Normative aliases which
> represent commonly used abbreviations for control codes or format
> characters are shown in all caps, enclosed in parentheses. In contrast,
> informative aliases are shown in lowercase. For the definitive list of
> normative aliases, also including their type and suitable for machine
> parsing, see NameAliases.txt in the UCD.
> So, according to this, the parts in parentheses are abbreviations, and the
> ALL CAPS names are normative aliases, which includes the ones listed for
> control codes.
> Some of this is captured in NamesList.txt, and some of it is captured
> in the software that normatively (for the Unicode Standard) processes that
> file.
> I am not going to claim that we can read that out of 10646. I think 10646
> is not actually fit for purpose. The description of the code charts is
> insufficient, and in any case it is not machine readable, which is actually
> required for fidelity here.
> I intend to use the "normative aliases" for control codes as
> described in the Unicode Standard to produce a table to be included in our
> standard. I believe this captures the intent of what we agreed.
> Pedantically, 10646 normatively references
> http://www.unicode.org/versions/Unicode9.0.0/ch04.pdf, which in turn
> refers to section 24.1 of the Unicode Standard via an undated reference to
> describe character names in the Unicode Database. That is not terribly
> sane.
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
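
For illustration, the machine-readable route mentioned in the quoted text (NameAliases.txt in the UCD) could be sketched roughly as below. This is a minimal sketch, assuming the file's semicolon-separated layout of code point, alias, and type, with `#` comment lines; the names `NameAlias`, `parse_alias_line`, and `aliases_of_type` are hypothetical and not taken from any paper or implementation. It requires C++17.

```cpp
#include <cstdint>
#include <exception>
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One record from NameAliases.txt (hypothetical struct for this sketch).
struct NameAlias {
    std::uint32_t code_point;
    std::string alias;
    std::string type;  // e.g. "correction", "control", "alternate", "figment", "abbreviation"
};

// Parse one line of the assumed form "<hex code point>;<alias>;<type>".
// Returns std::nullopt for blank lines, comment lines, and malformed input.
std::optional<NameAlias> parse_alias_line(const std::string& line) {
    if (line.empty() || line[0] == '#') return std::nullopt;

    const auto first = line.find(';');
    if (first == std::string::npos) return std::nullopt;
    const auto second = line.find(';', first + 1);
    if (second == std::string::npos) return std::nullopt;

    NameAlias out{};
    try {
        std::size_t used = 0;
        const unsigned long value = std::stoul(line.substr(0, first), &used, 16);
        if (used != first) return std::nullopt;  // trailing junk in the code point field
        out.code_point = static_cast<std::uint32_t>(value);
    } catch (const std::exception&) {
        return std::nullopt;  // empty or non-hexadecimal code point field
    }

    out.alias = line.substr(first + 1, second - first - 1);
    out.type = line.substr(second + 1);

    // Drop trailing whitespace or a carriage return left over from CRLF input.
    while (!out.type.empty() &&
           (out.type.back() == '\r' || out.type.back() == ' ' || out.type.back() == '\t')) {
        out.type.pop_back();
    }
    return out;
}

// Collect every alias of the requested type, in file order.
std::vector<NameAlias> aliases_of_type(std::istream& in, const std::string& wanted) {
    std::vector<NameAlias> result;
    std::string line;
    while (std::getline(in, line)) {
        if (auto rec = parse_alias_line(line); rec && rec->type == wanted) {
            result.push_back(std::move(*rec));
        }
    }
    return result;
}
```

Calling `aliases_of_type(file, "control")` on an opened copy of the file would yield the control-code aliases that such a table could be generated from. Whether abbreviations or figments should also be included is a separate decision that this sketch does not make.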

Received on 2021-11-06 03:46:42