Re: [SG16] Agenda for the 2021-10-06 SG16 telecon

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Wed, 6 Oct 2021 14:48:47 +0200
On 06/10/2021 13.24, Peter Brett wrote:
> Well, don't keep us in suspense, Jens.
>
> What *does* ISO 10646 define as the UTF-16 encoding scheme?

BOM galore, default is big-endian:


11.5 UTF-16

The UTF-16 encoding scheme serializes a UTF-16 code unit sequence by ordering octets in a way that either the
less significant octet precedes or follows the more significant octet.
In the UTF-16 encoding scheme, the initial signature read as <FE FF> indicates that the more significant octet
precedes the less significant octet, and <FF FE> the reverse. The signature is not part of the textual data.
In the absence of signature, the octet order of the UTF-16 encoding scheme is that the more significant octet
precedes the less significant octet.
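
For illustration, the rule quoted above can be sketched in C++ roughly as follows. This is only a minimal sketch: the function name decode_utf16_scheme is hypothetical (it comes from neither ISO 10646 nor any library), and dropping a trailing unpaired octet is an assumption of this example, since the quoted text does not say how to handle one.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: turns an octet sequence in the UTF-16 encoding
// scheme into UTF-16 code units, following the quoted 11.5 wording.
std::vector<char16_t> decode_utf16_scheme(const std::vector<std::uint8_t>& octets)
{
    // In the absence of a signature, the more significant octet comes first.
    bool big_endian = true;
    std::size_t pos = 0;

    if (octets.size() >= 2) {
        if (octets[0] == 0xFE && octets[1] == 0xFF) {
            pos = 2;             // <FE FF>: more significant octet first
        } else if (octets[0] == 0xFF && octets[1] == 0xFE) {
            big_endian = false;  // <FF FE>: less significant octet first
            pos = 2;
        }
    }
    // The signature is not part of the textual data, so it has been skipped.

    std::vector<char16_t> units;
    units.reserve((octets.size() - pos) / 2);

    // Assumption of this sketch: a trailing unpaired octet is silently dropped.
    for (; pos + 1 < octets.size(); pos += 2) {
        const unsigned first = octets[pos];
        const unsigned second = octets[pos + 1];
        const unsigned unit = big_endian ? (first << 8) | second
                                         : (second << 8) | first;
        units.push_back(static_cast<char16_t>(unit));
    }
    return units;
}
```

With this sketch, the octet sequences <FE FF 00 41>, <FF FE 41 00> and the unsigned <00 41> all decode to the single code unit 0x0041.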


Jens

Received on 2021-10-06 07:49:02