Re: [SG16] [isocpp-lib] format.string.std references to "Unicode encoding" unclear

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 26 Feb 2020 10:22:11 -0500
On 2/26/20 2:39 AM, Corentin Jabot via Lib wrote:
>
>
> On Wed, 26 Feb 2020 at 02:38, Hubert Tong via SG16
> <sg16_at_[hidden]> wrote:
>
> In [format.string.std], the meaning of "Unicode encoding" added by
> P1868R2 is unclear. One interpretation of what was meant by
> "Unicode encoding" is "UCS encoding scheme" (as defined by ISO/IEC
> 10646). Another interpretation is an encoding scheme capable of
> encoding all UCS code points that have been assigned to
> characters. Yet another interpretation is an encoding scheme
> capable of encoding all UCS scalar values.
>
>
> Any encoding scheme capable of encoding any UCS codepoint - where a
> UCS codepoint is tautologically any value in the UCS codespace (U+0 -
> U+10FFFF)

I'd like Victor to weigh in, but I think Hubert's last interpretation
meets the minimum requirements. It might be simplified to "any encoding
scheme that encodes (only) UCS scalar values" (encoding schemes that
support escape sequences or some other form of markup should be excluded).
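
To make the difference between "any UCS code point" and "UCS scalar
values" concrete, here is a minimal sketch. The helper names are
hypothetical (they come from neither the standard nor P1868R2); the
ranges are the usual ones: the codespace is U+0000 to U+10FFFF, and
scalar values are the code points outside the surrogate range U+D800 to
U+DFFF.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of any library.

// A code point is any value in the UCS codespace, U+0000..U+10FFFF.
constexpr bool is_code_point(std::uint32_t v) {
    return v <= 0x10FFFF;
}

// Surrogate code points occupy U+D800..U+DFFF.
constexpr bool is_surrogate(std::uint32_t v) {
    return v >= 0xD800 && v <= 0xDFFF;
}

// A scalar value is any code point that is not a surrogate.
constexpr bool is_scalar_value(std::uint32_t v) {
    return is_code_point(v) && !is_surrogate(v);
}

// U+D800 is a code point but not a scalar value.
static_assert(is_code_point(0xD800) && !is_scalar_value(0xD800));
// U+1F600 is both a code point and a scalar value.
static_assert(is_code_point(0x1F600) && is_scalar_value(0x1F600));
// 0x110000 lies outside the codespace entirely.
static_assert(!is_code_point(0x110000));
```

Under the "any code point" reading, an encoding scheme would also have to
represent lone surrogates such as U+D800, which well-formed UTF-8,
UTF-16, and UTF-32 do not; under the "scalar values" reading it would not.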

I think the definition/description of "Unicode Character Encoding
Schemes" in UTR #17 section 5
(https://www.unicode.org/reports/tr17/#CharacterEncodingScheme) fits
what we're after.

Tom.

Received on 2020-02-26 09:24:55