"11.5 UTF-16 

The UTF-16 encoding scheme serializes a UTF-16 code unit sequence by ordering octets in a way that either the less significant octet precedes or follows the more significant octet. 

In the UTF-16 encoding scheme, the initial signature read as <FE FF> indicates that the more significant octet precedes the less significant octet, and <FF FE> the reverse. The signature is not part of the textual data.

In the absence of signature, the octet order of the UTF-16 encoding scheme is that the more significant octet precedes the less significant octet."
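
For illustration, here is a minimal C++ sketch of how a reader could apply the rule quoted above. The names (OctetOrder, Utf16Units, deserialize_utf16) are hypothetical and not taken from ISO 10646 or from any library. It assumes the signature is the serialized form of U+FEFF, i.e. the octets FE FF for big-endian and FF FE for little-endian, and it silently ignores a trailing odd octet, which a real decoder would probably report as an error.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names for illustration only.
enum class OctetOrder { BigEndian, LittleEndian };

struct Utf16Units {
    OctetOrder order;                  // octet order used to read the data
    bool had_signature;                // true if an initial FE FF or FF FE was consumed
    std::vector<std::uint16_t> units;  // code units, signature excluded
};

// Deserialize octets in the "UTF-16" encoding scheme into code units.
// An initial FE FF selects big-endian, an initial FF FE selects little-endian,
// and with no signature the order defaults to big-endian.
Utf16Units deserialize_utf16(const std::vector<std::uint8_t>& octets) {
    Utf16Units result{OctetOrder::BigEndian, false, {}};
    std::size_t pos = 0;

    if (octets.size() >= 2) {
        if (octets[0] == 0xFE && octets[1] == 0xFF) {
            result.had_signature = true;
            pos = 2;  // the signature is not part of the textual data
        } else if (octets[0] == 0xFF && octets[1] == 0xFE) {
            result.order = OctetOrder::LittleEndian;
            result.had_signature = true;
            pos = 2;
        }
    }

    // Assumption: a trailing odd octet is ignored rather than reported.
    for (; pos + 1 < octets.size(); pos += 2) {
        const std::uint8_t first = octets[pos];
        const std::uint8_t second = octets[pos + 1];
        const std::uint8_t hi = (result.order == OctetOrder::BigEndian) ? first : second;
        const std::uint8_t lo = (result.order == OctetOrder::BigEndian) ? second : first;
        result.units.push_back(static_cast<std::uint16_t>((hi << 8) | lo));
    }
    return result;
}
```

Note that under this rule the platform's native endianness never enters the picture: the octets 00 41 without a signature decode to U+0041 on any machine.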

Roger.

From: SG16 <sg16-bounces@lists.isocpp.org> on behalf of Peter Brett via SG16 <sg16@lists.isocpp.org>
Sent: 06 October 2021 12:24
To: sg16@lists.isocpp.org <sg16@lists.isocpp.org>
Cc: Peter Brett <pbrett@cadence.com>; Tom Honermann <tom@honermann.net>
Subject: Re: [SG16] Agenda for the 2021-10-06 SG16 telecon
 
Well, don't keep us in suspense, Jens.

What *does* ISO 10646 define as the UTF-16 encoding scheme?

Best regards,

              Peter

> -----Original Message-----
> From: SG16 <sg16-bounces@lists.isocpp.org> On Behalf Of Jens Maurer via SG16
> Sent: 06 October 2021 12:20
> To: sg16@lists.isocpp.org
>
> On 06/10/2021 11.35, Corentin Jabot via SG16 wrote:
> >
> > On Wed, Oct 6, 2021 at 8:31 AM Hubert Tong via SG16 <sg16@lists.isocpp.org> wrote:
> >
> >     Concern for SG 16 to evaluate:
> >     The recommended practice re: UTF-16 and UTF-32 is not consistent with getting the correct treatment out of interfaces that attempt to read the wide character data as a byte stream (e.g., iconv) when there are invalid characters in a position to be confused as reverse-from-native-endian BOMs.
> >
> >
> > UTF-16 is synonymous with either UTF-16BE or UTF-16LE, depending on the platform.
> > The endianness is implied by the platform, not by text_encoding.
>
> That is not what ISO 10646 defines as the UTF-16 encoding scheme.