On 2/26/20 2:39 AM, Corentin Jabot via Lib wrote:

On Wed, 26 Feb 2020 at 02:38, Hubert Tong via SG16 <sg16@lists.isocpp.org> wrote:

In [format.string.std], the meaning of "Unicode encoding" added by P1868R2 is unclear. One interpretation of what was meant by "Unicode encoding" is "UCS encoding scheme" (as defined by ISO/IEC 10646). Another interpretation is an encoding scheme capable of encoding all UCS code points that have been assigned to characters. Yet another interpretation is an encoding scheme capable of encoding all UCS scalar values.

Any encoding scheme capable of encoding any UCS code point, where a UCS code point is tautologically any value in the UCS codespace (U+0000 to U+10FFFF).

I'd like Victor to weigh in, but I think Hubert's last interpretation meets minimum requirements. It might be simplified to "any encoding scheme that encodes (only) UCS scalar values" (encoding schemes that support escape sequences or some other form of markup should be excluded).

I think the definition/description of "Unicode Character Encoding Schemes" in UTR #17 section 5 (https://www.unicode.org/reports/tr17/#CharacterEncodingScheme) fits what we're after.
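To make the difference between the "code point" and "scalar value" readings above concrete, here is a minimal C++ sketch. The helper names (is_code_point, is_surrogate, is_scalar_value) are hypothetical, chosen only for illustration; they are not taken from P1868R2 or from the standard library. The sketch assumes the usual Unicode definitions: the codespace is U+0000 to U+10FFFF, and a scalar value is any code point outside the surrogate range U+D800 to U+DFFF.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.

// A UCS code point is any value in the codespace U+0000..U+10FFFF.
constexpr bool is_code_point(std::uint32_t v) {
    return v <= 0x10FFFF;
}

// Surrogate code points occupy U+D800..U+DFFF.
constexpr bool is_surrogate(std::uint32_t v) {
    return v >= 0xD800 && v <= 0xDFFF;
}

// A scalar value is any code point that is not a surrogate.
constexpr bool is_scalar_value(std::uint32_t v) {
    return is_code_point(v) && !is_surrogate(v);
}

// U+D800 is a code point but not a scalar value.
static_assert(is_code_point(0xD800) && !is_scalar_value(0xD800),
              "surrogates are code points but not scalar values");
// The last value in the codespace is a scalar value.
static_assert(is_scalar_value(0x10FFFF), "U+10FFFF is a scalar value");
// Values past the codespace are neither.
static_assert(!is_code_point(0x110000), "0x110000 is outside the codespace");
```

The two readings differ only on the surrogate range. Well-formed UTF-8, UTF-16, and UTF-32 cannot represent a lone surrogate, so they qualify under the scalar value reading but not under a reading that requires every code point to be encodable.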

Tom.

_______________________________________________
Lib mailing list
Lib@lists.isocpp.org
Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/lib
Link to this post: http://lists.isocpp.org/lib/2020/02/15467.php