Date: Sat, 18 Dec 2021 09:54:55 +0100
On 18/12/2021 00.18, Tom Honermann via SG16 wrote:
> 8. Specify how invalid code unit sequences are to be handled. This includes specifying, at least for self-synchronizing encodings like UTF-8, UTF-16, and UTF-32, how such sequences are delimited. References to the Unicode Standard (as indicated in the editor notes in the linked meeting summary) and/or the WHATWG Encoding Standard are advised. This also includes specifying how wide strings are handled; presumably each wchar_t value in an ill-formed code unit sequence would be formatted as a single hex escape.
Since C++ is an ISO standard, normative references to other ISO standards
(e.g. ISO 10646) as opposed to third-party standards (e.g. Unicode) are
preferred, per ISO policy.
Jens
Received on 2021-12-18 02:55:08