Date: Wed, 17 Jun 2020 12:04:49 -0400
I would greatly prefer the definition you wrote over the ISO/IEC 10646
definition. My working definition for text encodings is that a code
point is a value in the codespace of the character set your encoding
defines (through the encoding object), and a code unit is one of the
individual objects that make up an encoded code point. Encodings can
traffic things into and out of the UCS, but they can also traffic
things into and out of other character sets.
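
To make the working definition concrete, here is a rough C++ sketch of
two encoding objects. Every name in it (utf16_encoding, legacy_encoding,
legacy_code_point, encode_one) is hypothetical and invented for
illustration; none of it is proposed wording or an existing library
interface. The first object's code point is a value in the UCS
codespace, while the second's is a value in the codespace of a made-up
single-byte character set.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical encoding object whose character set is the UCS.
struct utf16_encoding {
    using code_point = char32_t;  // a value in the UCS codespace
    using code_unit  = char16_t;  // one object of the encoded form

    // Encode one code point as one or two code units.
    // Sketch only: no validation of surrogates or out-of-range values.
    static std::vector<code_unit> encode_one(code_point cp) {
        if (cp < 0x10000) {
            return { static_cast<code_unit>(cp) };
        }
        cp -= 0x10000;
        return { static_cast<code_unit>(0xD800 + (cp >> 10)),
                 static_cast<code_unit>(0xDC00 + (cp & 0x3FF)) };
    }
};

// Code point of a made-up single-byte character set (an assumption for
// this sketch). It is deliberately a distinct type: it is NOT a UCS value.
struct legacy_code_point {
    std::uint8_t value;
};

// Hypothetical encoding object whose character set is not the UCS.
struct legacy_encoding {
    using code_point = legacy_code_point;  // a value in the legacy codespace
    using code_unit  = unsigned char;      // one object of the encoded form

    // Each code point is encoded as exactly one code unit.
    static std::vector<code_unit> encode_one(code_point cp) {
        return { cp.value };
    }
};
```

Both objects fit the generic "value in the codespace of a character
set" wording, but only the first fits the 10646 wording.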
Restricting the term to the UCS -- and taking on the restrictions of
ISO/IEC 10646 -- is prohibitive to growth in this area. Requiring every
library section thereafter to mark up and explain that it means a
special code point that is not exactly the 10646 code point is not
useful, so we should override the definition in [intro.defs].
On Wed, Jun 17, 2020 at 11:07 AM Tom Honermann via Core
<core_at_[hidden]> wrote:
>
> ISO/IEC 10646:2017 3.10 defines the term code point as "value in the UCS codespace".
>
> Note that the definition restricts the term to the Unicode character set (UCS).
>
> SG16 may be interested in adopting this term as a generic term applicable to any character set, perhaps as if it were defined as "a value in the codespace of a character set".
>
> The question is how or whether the C++ standard can overload or override terms from its normative references in this way. [intro.defs] already provides overloaded definitions for terms used in different portions of the standard. Would it be reasonable to provide a definition of code point in this section that differs from that in ISO/IEC 10646? If so, would some form of disambiguation be required?
>
> Tom.
>
> _______________________________________________
> Core mailing list
> Core_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/core
> Link to this post: http://lists.isocpp.org/core/2020/06/9404.php
Received on 2020-06-17 11:08:11