sg16

From: Steven R. Loomis <srl295_at_[hidden]> · Date: Mon, 24 Apr 2023 08:56:30 -0500

--
Steven R. Loomis
Code Hive Tx, LLC
https://codehivetx.us
> On Feb 8, 2023, at 7:59 PM, Corentin via SG16 <sg16_at_[hidden]> wrote:
> 
> Thank you for your quick reply.
> Does that mean that CESU-8 is not "a Unicode encoding form"? That is, we want to make sure to filter out encodings that are conforming but not specified in Unicode.
> 
> "supports" in the wording you quoted is somewhat ambiguous: it could arguably mean either "admits the existence of" or "these are the encodings in the standard but there may be others", so we weren't quite sure.
> 
> Thanks!
> 
> On Wed, Feb 8, 2023, 17:39 Robin Leroy <egg.robin.leroy_at_[hidden]> wrote:
> Dear Corentin,
> 
> I think you want to refer to the Unicode encoding forms.
> See, for instance:
> The Unicode Standard, Section 3.9, Unicode Encoding Forms:
>   "The Unicode Standard supports three character encoding forms: UTF-32, UTF-16, and UTF-8."
> Unicode Technical Report #17, Unicode Character Encoding Model, Section 5, Character Encoding Scheme (CES):
>   "Some of the Unicode encoding schemes have the same labels as the three Unicode encoding forms."
> Note that "Unicode encodings specified in the Unicode standard" is a little bit ambiguous, because Unicode distinguishes the encoding forms (code points to code units) from the encoding schemes (code units to bytes; the Unicode Standard supports seven encoding schemes, with LE/BE/BOM variants for UTF-16 and UTF-32). Assuming that the context here is [format.string.escaped] in document P2736, it looks like you are indeed dealing with the interpretation of code units (represented by the types char8_t, char16_t, and char32_t, per [lex.string.literal] referenced in [format.string.escaped]), and thus with encoding forms.
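
The distinction between an encoding form and an encoding scheme can be sketched in a few lines. This is only an illustration, not anything taken from P2736 or the Unicode Standard text: the helper names utf16_code_units and serialize_utf16 are hypothetical, Python is used purely for brevity, and the sketch assumes its input contains no lone surrogates.

```python
import struct


def utf16_code_units(text: str) -> list[int]:
    """Encoding form: map each code point to one or two 16-bit code units."""
    units = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            units.append(cp)  # BMP code point: a single code unit
        else:
            cp -= 0x10000  # supplementary code point: a surrogate pair
            units.append(0xD800 + (cp >> 10))
            units.append(0xDC00 + (cp & 0x3FF))
    return units


def serialize_utf16(units: list[int], big_endian: bool) -> bytes:
    """Encoding scheme: map 16-bit code units to bytes in a fixed byte order."""
    order = ">" if big_endian else "<"
    return struct.pack(f"{order}{len(units)}H", *units)


# Hypothetical sample input: "A" followed by U+1F600.
units = utf16_code_units("A\U0001F600")
print([hex(u) for u in units])                      # one code unit sequence
print(serialize_utf16(units, big_endian=True).hex())   # UTF-16BE bytes
print(serialize_utf16(units, big_endian=False).hex())  # UTF-16LE bytes
```

The same sequence of code units yields two different byte sequences, which is why a char16_t string corresponds to the UTF-16 encoding form, while UTF-16BE and UTF-16LE only come into play once those code units are written out as bytes.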
> 
> Best regards,
> 
> Robin Leroy
> 
> On Wed, Feb 8, 2023 at 00:32, Corentin <corentin.jabot_at_[hidden]> wrote:
> Hey Robin,
> How are you?
> 
> Does Unicode have a term to designate "UTF-8, UTF-16 and UTF-32", i.e. Unicode encodings specified in the Unicode standard - excluding things like CESU-8 for example?
> It's something we would find useful in the C++ specification.
> 
> Thanks,
> 
> Corentin
> 
> -- 
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16