Re: Agenda for the 2023-10-25 SG16 telecon

From: Tom Honermann <tom_at_[hidden]>
Date: Tue, 24 Oct 2023 10:32:53 -0400
On 10/24/23 2:40 AM, Jens Maurer via SG16 wrote:
> Hi Tom,
>
> On 24/10/2023 07.11, Tom Honermann via SG16 wrote:
>> Hang on, this is going to be a bumpy ride.
> Thanks for the write-up.
>
> We should ask the implementers whether their basic_stream support
> for charN_t is intentional or accidental, and investigate a little
> more whether it works.

I agree.

I copied maintainers for libstdcxx, libc++, and the Microsoft
implementation on the original email hoping they would share their
thoughts. Libc++ clearly and intentionally eschews support. Jonathan has
mentioned elsewhere that libstdcxx support is not intentional. I'm
pretty sure none of the implementations test iostream support for
charN_t types, but my reviews for such have not been thorough.

I do intend to do more testing to identify what does and does not appear
to work as would be expected.

>
> Hyrum's law suggests that this will be used in the wild.
Of course :)
>
> My opinion:
>
> Let's make sure the ground is clear for a future extension
> to charN_t for basic_stream, but let's not try to address
> any of the deeper troubles (in particular the 1:N mapping for
> basic_fstream). In particular:
>
> Let's fix "int_type". The ABI of the standard library
> itself will not be broken, we just risk ABI breakage
> of user components, I think?
Assuming I did my homework correctly, that is right.
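
For readers who want to see the "int_type" concern concretely, here is a
minimal sketch. It assumes the concern is that int_type for char16_t and
char32_t is no wider than the character type itself, which leaves no
spare value for eof(). The helper name int_type_is_wider is hypothetical
and used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// int_type as the standard currently specifies it for these specializations.
static_assert(std::is_same<std::char_traits<char16_t>::int_type,
                           std::uint_least16_t>::value,
              "char16_t int_type is uint_least16_t");
static_assert(std::is_same<std::char_traits<char32_t>::int_type,
                           std::uint_least32_t>::value,
              "char32_t int_type is uint_least32_t");

// Hypothetical helper: true when int_type is larger than CharT, so that an
// out-of-band eof() value can exist alongside every character value.
template <class CharT>
constexpr bool int_type_is_wider() {
    return sizeof(typename std::char_traits<CharT>::int_type) > sizeof(CharT);
}

// Runtime check of the same property, for use from a test or main().
inline void check_int_type_widths() {
    assert(!int_type_is_wider<char16_t>());  // same size as char16_t
    assert(!int_type_is_wider<char32_t>());  // same size as char32_t
}
```

Changing int_type alters the signatures of char_traits members such as
to_int_type and eof, which is why user components built against the
current definition would be the ones at risk.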
>
> Let's deprecate
>
> std::codecvt<char16_t, char8_t, std::mbstate_t>
> std::codecvt<char32_t, char8_t, std::mbstate_t>
> std::codecvt_byname<char16_t, char8_t, std::mbstate_t>
> std::codecvt_byname<char32_t, char8_t, std::mbstate_t>
>
> Those might come back when a proper solution arrives.
>
>
> std::codecvt<char16_t, char, std::mbstate_t> # Deprecated.
> std::codecvt<char32_t, char, std::mbstate_t> # Deprecated.
>
> "Since iostreams does not support charN_t in the standard today and since the char16_t and char32_t specializations have already been deprecated for two release cycles, perhaps it is even reasonable to change their behavior so that they convert to and from the locale encoding rather than UTF-8."
>
> That might work for
>
> std::codecvt<char32_t, char, std::mbstate_t>
>
> but
>
> std::codecvt<char16_t, char, std::mbstate_t>
>
> runs afoul of the 1:N mapping issue, unless on a platform where everything
> fits into 16-bit Unicode, right?
It runs into that issue regardless of whether the conversions are
between UTF-8 or another multibyte encoding.
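
A minimal sketch of why char16_t is affected, assuming the 1:N issue
refers to one character needing more than one internal code unit: any
code point above U+FFFF takes two UTF-16 code units, whatever the
external encoding is. The helper to_utf16 is hypothetical and is not
part of any standard library interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one Unicode scalar value as UTF-16.
// Surrogate code points (U+D800..U+DFFF) are not validated here.
inline std::u16string to_utf16(char32_t cp) {
    if (cp < 0x10000) {
        // Basic Multilingual Plane: a single code unit suffices.
        return std::u16string(1, static_cast<char16_t>(cp));
    }
    // Supplementary planes: split the 20-bit offset into a surrogate pair.
    cp -= 0x10000;
    std::u16string out;
    out += static_cast<char16_t>(0xD800 + (cp >> 10));    // high surrogate
    out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));  // low surrogate
    return out;
}

// Runtime check: U+0041 needs one code unit, U+1F600 needs two.
inline void check_utf16_lengths() {
    assert(to_utf16(0x41).size() == 1);
    assert(to_utf16(0x1F600).size() == 2);
}
```

A char32_t internal representation avoids this because every scalar
value fits in one code unit.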
>
> Best to leave those functions alone; I'm also ok with removing them.

Thank you for sharing your thoughts. I'm withholding my opinions for now
so as not to bias discussion prior to tomorrow's meeting. Everyone else
should feel free to share their opinions though!

Tom.

>
> Jens

Received on 2023-10-24 14:32:55