Re: [SG16-Unicode] code_unit_sequence and code_point_sequence

From: Tom Honermann <tom_at_[hidden]>
Date: Tue, 19 Jun 2018 21:52:05 -0400
On 06/19/2018 04:19 PM, Lyberta wrote:
> keld_at_[hidden]:
>> Is your code point advisory the same as codepoints in 10646/Unicode, also
>> called characters in 10646?
> Yes. A code point is an unsigned 32-bit integer with a value in the
> range 0-10FFFF. Modern C and C++ have the type char32_t, which is the
> most suitable for holding code points.
>
>> And why not just treat these as 32-bit wchar_t?
>> I believe this is what we do in C.
> Because the wide execution character set is implementation-defined. So
> far nobody has expressed interest in changing that, and Windows violates
> the standard by having a 16-bit wchar_t.

Technically, Windows doesn't violate the standard by having a 16-bit
wchar_t. It violates the standard by using a wide execution character
set that defines code points that do not fit in its (16-bit) wchar_t
type. We have an issue (https://github.com/sg16-unicode/sg16/issues/9)
to track modifying the standard to enable Microsoft's implementation to
be conforming.
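
To make the "does not fit" point concrete, here is a minimal sketch in
C++14 or later. The names fits_in_16_bits, SurrogatePair, to_surrogates,
wchar_t_bits and check_examples are hypothetical, made up for this
message, and not taken from any proposal. It only illustrates that a code
point above U+FFFF cannot be held in one 16-bit unit and has to be split
into a UTF-16 surrogate pair.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Width of wchar_t on this implementation (implementation-defined;
// typically 16 on Windows and 32 on most Unix-like systems).
constexpr std::size_t wchar_t_bits = sizeof(wchar_t) * CHAR_BIT;

// Hypothetical helper: true if the code point fits in one 16-bit unit.
constexpr bool fits_in_16_bits(char32_t cp) { return cp <= 0xFFFF; }

// Hypothetical helper type: the two UTF-16 code units of a surrogate pair.
struct SurrogatePair {
    char16_t high;
    char16_t low;
};

// Splits a supplementary code point into a UTF-16 surrogate pair.
// Precondition (an assumption of this sketch): 0x10000 <= cp <= 0x10FFFF.
constexpr SurrogatePair to_surrogates(char32_t cp) {
    return SurrogatePair{
        static_cast<char16_t>(0xD800 + ((cp - 0x10000) >> 10)),
        static_cast<char16_t>(0xDC00 + ((cp - 0x10000) & 0x3FF))};
}

// Small self-check using U+1F600 as the example code point.
inline void check_examples() {
    assert(fits_in_16_bits(U'A'));            // U+0041 fits in one unit
    assert(!fits_in_16_bits(U'\U0001F600'));  // U+1F600 does not
    assert(to_surrogates(U'\U0001F600').high == 0xD83D);
    assert(to_surrogates(U'\U0001F600').low == 0xDC00 + 0x200);
}
```

With a 16-bit wchar_t, a code point such as U+1F600 therefore takes two
wchar_t objects rather than one, which is the case described above.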

Tom.

Received on 2018-06-20 03:52:08