Re: [SG16] Agenda for the 2021-10-06 SG16 telecon

From: Roger Orr <rogero_at_[hidden]>
Date: Wed, 6 Oct 2021 12:00:29 +0000

"11.5 UTF-16

The UTF-16 encoding scheme serializes a UTF-16 code unit sequence by ordering octets in a way that either the less significant octet precedes or follows the more significant octet.

In the UTF-16 encoding scheme, the initial signature read as <FE FF> indicates that the more significant octet precedes the less significant octet, and <FF FE> the reverse. The signature is not part of the textual data.

In the absence of signature, the octet order of the UTF-16 encoding scheme is that the more significant octet precedes the less significant octet."

Roger.
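
For illustration only, the three quoted rules could be sketched in C++ roughly as follows. The function name decode_utf16_scheme is hypothetical and is not taken from ISO 10646 or from any proposal discussed in this thread. The sketch assumes the input is a complete octet sequence, drops a trailing odd octet, and does not validate surrogate pairs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from ISO 10646 or any SG16 paper): turn an octet
// sequence in the UTF-16 encoding scheme into UTF-16 code units.
std::vector<std::uint16_t> decode_utf16_scheme(const std::vector<std::uint8_t>& octets) {
    // Rule 3: without a signature, the more significant octet comes first.
    bool big_endian = true;
    std::size_t pos = 0;

    // Rule 2: an initial FE FF or FF FE signature selects the octet order.
    // The signature is not part of the text, so it is skipped.
    if (octets.size() >= 2) {
        if (octets[0] == 0xFE && octets[1] == 0xFF) {
            big_endian = true;
            pos = 2;
        } else if (octets[0] == 0xFF && octets[1] == 0xFE) {
            big_endian = false;
            pos = 2;
        }
    }

    // Rule 1: each code unit is two octets in the selected order.
    // Assumption of this sketch: a trailing odd octet is silently ignored.
    std::vector<std::uint16_t> units;
    for (; pos + 1 < octets.size(); pos += 2) {
        const std::uint16_t hi = big_endian ? octets[pos] : octets[pos + 1];
        const std::uint16_t lo = big_endian ? octets[pos + 1] : octets[pos];
        units.push_back(static_cast<std::uint16_t>((hi << 8) | lo));
    }
    return units;
}
```

Under this reading, the octet order comes from the data itself (signature or big-endian default), not from the endianness of the platform running the code.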
________________________________
From: SG16 <sg16-bounces_at_[hidden]> on behalf of Peter Brett via SG16 <sg16_at_[hidden]>
Sent: 06 October 2021 12:24
To: sg16_at_[hidden] <sg16_at_[hidden]>
Cc: Peter Brett <pbrett_at_[hidden]>; Tom Honermann <tom_at_[hidden]>
Subject: Re: [SG16] Agenda for the 2021-10-06 SG16 telecon

Well, don't keep us in suspense, Jens.

What *does* ISO 10646 define as the UTF-16 encoding scheme?

Best regards,

              Peter

> -----Original Message-----
> From: SG16 <sg16-bounces_at_[hidden]> On Behalf Of Jens Maurer via SG16
> Sent: 06 October 2021 12:20
> To: sg16_at_[hidden]
>
> On 06/10/2021 11.35, Corentin Jabot via SG16 wrote:
> >
> >
> >
> > On Wed, Oct 6, 2021 at 8:31 AM Hubert Tong via SG16 <sg16_at_[hidden]> wrote:
> >
> > Concern for SG 16 to evaluate:
> > The recommended practice re: UTF-16 and UTF-32 is not consistent with
> getting the correct treatment out of interfaces that attempt to read the
> wide character data as a byte stream (e.g., iconv) when there are invalid
> characters in a position to be confused as reverse-from-native-endian BOMs.
> >
> >
> > UTF-16 is synonymous to either UTF-16BE/UTF-16LE depending on the
> platform.
> > The endianness is implied by the platform, not by text_encoding.
>
> That is not what ISO 10646 defines as the UTF-16 encoding scheme.
--
SG16 mailing list
SG16_at_[hidden]
https://lists.isocpp.org/mailman/listinfo.cgi/sg16

Received on 2021-10-06 07:01:31