The first seems good; the second one seems to have always been wrong? It now reads that you can transcode from UTF-16 to UTF-16, where invalid UTF-16 is invalid?!

I think the second UCS-2 mention should already have read UTF-8. But please do check.
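
For reference, here is how I read the current facet; a minimal sketch using the deprecated C++17 wstring_convert wrapper (the U+00E9 input is just an example I picked). The "multibyte" side of codecvt_utf16 is a byte serialization of UTF-16 and the Elem side is 16-bit code units, which is why the proposed wording ends up calling both sides UTF-16:

    // Minimal sketch (deprecated C++17 API). codecvt_utf16's external
    // side is a byte serialization of UTF-16 (big-endian by default);
    // the internal side is 16-bit code units.
    #include <codecvt>
    #include <locale>
    #include <string>

    int main() {
        // U+00E9 as UTF-16BE bytes; the length is passed explicitly
        // because of the embedded null byte.
        std::string bytes("\x00\xE9", 2);
        std::wstring_convert<std::codecvt_utf16<char16_t>, char16_t> conv;
        std::u16string units = conv.from_bytes(bytes);  // one unit: 0x00E9
    }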

Otherwise, :+1:

On Sat, Dec 3, 2022 at 3:11 PM Corentin via SG16 <sg16@lists.isocpp.org> wrote:
Hey folks.
We only mention UCS-2 in [depr.locale.stdcvt.req] 
http://eel.is/c++draft/depr#locale.stdcvt.req

Referring to UCS-2 forces us to carry a reference to ISO 10646:2003, as this encoding has long been obsolete.

We could remove it entirely with a slight rewording, without changing any effects.

For the facet codecvt_utf8:
  • The facet shall convert between UTF-8 multibyte sequences and UCS-2 or UTF-32 (depending on the size of Elem) within the program.
=> 
  • The facet shall convert between UTF-8 multibyte sequences and UTF-16 or UTF-32 (depending on the size of Elem) within the program.
  • When converting to UTF-16, if any UTF-8 sequence encodes a code point outside of the BMP, the behavior is undefined.
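(For illustration only, a minimal sketch of the Elem-size distinction above, using the deprecated wstring_convert wrapper; the code points are arbitrary examples:)

    // Minimal sketch (deprecated C++17 API). With a 16-bit Elem the
    // facet can only represent BMP code points; with a 32-bit Elem it
    // produces UTF-32.
    #include <codecvt>
    #include <locale>
    #include <string>

    int main() {
        std::wstring_convert<std::codecvt_utf8<char16_t>, char16_t> to16;
        std::u16string bmp = to16.from_bytes("\xC3\xA9");  // U+00E9: in the BMP, fine
        // to16.from_bytes("\xF0\x9F\x98\x80");  // U+1F600: outside the BMP,
        //                                       // undefined under the new wording

        std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> to32;
        std::u32string any = to32.from_bytes("\xF0\x9F\x98\x80");  // UTF-32: fine
    }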
For the facet codecvt_utf16:
  • The facet shall convert between UTF-16 multibyte sequences and UCS-2 or UTF-32 (depending on the size of Elem) within the program.
=>
  • The facet shall convert between UTF-16 multibyte sequences and UTF-16 or UTF-32 (depending on the size of Elem) within the program.
  • When converting to UTF-16, if any UTF-8 sequence encodes a code point outside of the BMP, the behavior is undefined.
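(Again for illustration, a sketch of why the BMP restriction matters: a code point outside the BMP occupies two 16-bit code units, a surrogate pair, so it cannot be represented in a single 16-bit Elem:)

    // Minimal sketch (deprecated C++17 API). U+1F600 serializes in
    // UTF-16BE as the surrogate pair D83D DE00; with a 32-bit Elem it
    // decodes to one UTF-32 unit, but with a 16-bit Elem there is no
    // BMP result, the case the proposed wording makes undefined.
    #include <codecvt>
    #include <locale>
    #include <string>

    int main() {
        std::string bytes("\xD8\x3D\xDE\x00", 4);
        std::wstring_convert<std::codecvt_utf16<char32_t>, char32_t> to32;
        std::u32string cp = to32.from_bytes(bytes);  // one unit: U+1F600
    }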

What do you think?

Corentin
--
SG16 mailing list
SG16@lists.isocpp.org
https://lists.isocpp.org/mailman/listinfo.cgi/sg16