On 2/27/20 11:52 AM, Victor Zverovich wrote:
I agree that Hubert's last option is the one we want here.

> encoding schemes that support escape sequences or some other form of markup should be excluded

Could you give an example of such an encoding scheme?

Only in theory.  I can imagine an ISO-2022 encoding that supports escape sequences to swap the active character set to UTF-8 or something.  I don't know of any real encodings that encode Unicode code points and support escape sequences or markup.

Tom.


- Victor

On Wed, Feb 26, 2020 at 7:22 AM Tom Honermann <tom@honermann.net> wrote:
On 2/26/20 2:39 AM, Corentin Jabot via Lib wrote:


On Wed, 26 Feb 2020 at 02:38, Hubert Tong via SG16 <sg16@lists.isocpp.org> wrote:
In [format.string.std], the meaning of "Unicode encoding" added by P1868R2 is unclear. One interpretation of what was meant by "Unicode encoding" is "UCS encoding scheme" (as defined by ISO/IEC 10646). Another interpretation is an encoding scheme capable of encoding all UCS code points that have been assigned to characters. Yet another interpretation is an encoding scheme capable of encoding all UCS scalar values.

Any encoding scheme capable of encoding any UCS code point, where a UCS code point is, tautologically, any value in the UCS codespace (U+0000 to U+10FFFF).

I'd like Victor to weigh in, but I think Hubert's last interpretation meets minimum requirements.  It might be simplified to "any encoding scheme that encodes (only) UCS scalar values" (encoding schemes that support escape sequences or some other form of markup should be excluded).
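
To make the difference between "any UCS code point" and "UCS scalar values" concrete, here is a minimal sketch. The helper names (is_code_point, is_surrogate, is_scalar_value) are hypothetical and chosen for illustration only; they do not come from P1868R2 or the standard library. The ranges follow the usual Unicode definitions: the codespace is U+0000 to U+10FFFF, and scalar values are the code points outside the surrogate range U+D800 to U+DFFF.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not part of P1868R2 or the
// C++ standard library.

// A UCS code point is any value in the codespace U+0000..U+10FFFF.
constexpr bool is_code_point(std::uint32_t v) {
    return v <= 0x10FFFF;
}

// Surrogate code points occupy U+D800..U+DFFF.
constexpr bool is_surrogate(std::uint32_t v) {
    return v >= 0xD800 && v <= 0xDFFF;
}

// A scalar value is any code point that is not a surrogate.
constexpr bool is_scalar_value(std::uint32_t v) {
    return is_code_point(v) && !is_surrogate(v);
}

// U+D800 is a code point but not a scalar value.
static_assert(is_code_point(0xD800) && !is_scalar_value(0xD800));
// U+1F600 is both a code point and a scalar value.
static_assert(is_code_point(0x1F600) && is_scalar_value(0x1F600));
// 0x110000 lies outside the codespace entirely.
static_assert(!is_code_point(0x110000));
```

Under these definitions, UTF-8, UTF-16 and UTF-32 can represent every scalar value, but none of them has a well-formed encoding for a lone surrogate such as U+D800, which is one reason the "scalar values" wording may be the safer minimum requirement.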

I think the definition/description of "Unicode Character Encoding Schemes" in UTR #17 section 5 (https://www.unicode.org/reports/tr17/#CharacterEncodingScheme) fits what we're after.

Tom.


_______________________________________________
Lib mailing list
Lib@lists.isocpp.org
Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/lib
Link to this post: http://lists.isocpp.org/lib/2020/02/15467.php