Subject: Re: [SG16-Unicode] code_unit_sequence and code_point_sequence
From: Tom Honermann (tom_at_[hidden])
Date: 2018-06-19 20:52:05


On 06/19/2018 04:19 PM, Lyberta wrote:
> keld_at_[hidden]:
>> Is your code point advisory the same as codepoints in 10646/Unicode, also
>> called characters in 10646?
> Yes. A code point is unsigned 32 bit integer with the values in the
> range of 0-10FFFF. Modern C and C++ have type char32_t which is most
> suitable for holding code points.
>
>> And why not just treat these as 32-bit wchar-t?
>> I believe this is what we do in C.
> Because wide execution character set is implementation defined. So far
> nobody has expressed opinion of changing that and Windows violates the
> standard by having 16 bit wchar_t.

Technically, Windows doesn't violate the standard by having a 16-bit
wchar_t.  It violates the standard by using a wide execution character
set that defines code points that do not fit in its (16-bit) wchar_t
type.  We have an issue (https://github.com/sg16-unicode/sg16/issues/9)
to track modifying the standard to enable Microsoft's implementation to
be conforming.
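
As a rough illustration of the problem, here is a minimal C++ sketch
showing that a code point above U+FFFF cannot be held in one 16-bit
code unit and needs a UTF-16 surrogate pair instead.  The helper names
(is_scalar_value, fits_in_bits, encode_utf16) are hypothetical and
exist only for this example; they are not part of any SG16 proposal or
of any implementation's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration only.

// Largest valid Unicode code point (U+10FFFF).
constexpr char32_t kMaxCodePoint = 0x10FFFF;

// True if cp is a Unicode scalar value: in range and not a surrogate.
constexpr bool is_scalar_value(char32_t cp) {
    return cp <= kMaxCodePoint && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// True if an unsigned integer that is `bits` wide can represent cp.
constexpr bool fits_in_bits(char32_t cp, unsigned bits) {
    return bits >= 32 ||
           static_cast<std::uint32_t>(cp) < (std::uint32_t{1} << bits);
}

// Encodes cp as UTF-16 into out and returns the number of code units
// written (1 for the BMP, 2 for a surrogate pair).
inline std::size_t encode_utf16(char32_t cp, char16_t (&out)[2]) {
    assert(is_scalar_value(cp));
    if (cp <= 0xFFFF) {
        out[0] = static_cast<char16_t>(cp);
        return 1;
    }
    const std::uint32_t v = static_cast<std::uint32_t>(cp) - 0x10000;
    out[0] = static_cast<char16_t>(0xD800 + (v >> 10));    // high surrogate
    out[1] = static_cast<char16_t>(0xDC00 + (v & 0x3FF));  // low surrogate
    return 2;
}

// U+20AC fits in 16 bits; U+1F600 does not, yet both are valid code points.
static_assert(fits_in_bits(0x20AC, 16), "BMP code point fits in 16 bits");
static_assert(!fits_in_bits(0x1F600, 16), "U+1F600 needs more than 16 bits");
static_assert(is_scalar_value(0x1F600), "U+1F600 is a valid code point");
```

With a 16-bit wchar_t, a value such as U+1F600 can only be stored as two
wchar_t objects (0xD83D, 0xDE00), so a single wchar_t no longer
represents every member of the wide execution character set.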

Tom.

