> We could decide to deprecate char_traits<T>::eof, but the whole of iostream is built on top of it. So as appealing as that would be, it's not a realistic option.

 

I think we can also start by not providing char_traits<char16_t>; let’s just not have one. That would mean no iostream support for char16_t, but is that even a bad thing? I don’t use iostream at all.

And I have helped ship several commercial products whose code bases banned iostream outright. That was not because those products lacked features that iostream was meant to address, but because we had better replacements for every single thing iostream is supposed to do. Perhaps the only thing missing in the standard is better file I/O, but that is nothing a paper cannot fix.

 

I don’t think you need decades to remove char_traits. But then again, I think the best strategy here is to acknowledge that char_traits is broken and that we are not going to fix it. So let’s stop worrying about it, and let’s not hold back features that are actually useful just to save the broken parts.

If you really need it, just hardcode char_traits<char16_t>::eof() to return 0xFFFF and call it a day. Sacrifice a code point and move on. It’s a valid UTF-16 code unit? Who cares? Buyer beware. If you want to use iostream, you get what you paid for; if you don’t want that problem, then maybe migrate to something else that doesn’t have it. In any case, Unicode has permanently reserved U+FFFF as a noncharacter, so I’m pretty sure nobody will mind.
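To make that trade-off concrete, here is a minimal sketch of what pinning the sentinel to 0xFFFF implies. The names sketch_traits16 and units_before_eof are hypothetical, invented for this illustration; this is not std::char_traits<char16_t>, and it only assumes a 16-bit-capable int_type whose eof() is fixed at 0xFFFF.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical traits, for illustration only: eof() is pinned to 0xFFFF.
struct sketch_traits16 {
    using char_type = char16_t;
    using int_type  = std::uint_least16_t;

    static constexpr int_type eof() noexcept { return 0xFFFF; }
    static constexpr int_type to_int_type(char_type c) noexcept {
        return static_cast<int_type>(c);
    }
    static constexpr bool eq_int_type(int_type a, int_type b) noexcept {
        return a == b;
    }
};

// Mimics an iostream-style loop: consume code units until one compares
// equal to eof(). Returns how many units were consumed before stopping.
constexpr std::size_t units_before_eof(const char16_t* data, std::size_t n) {
    std::size_t i = 0;
    while (i < n &&
           !sketch_traits16::eq_int_type(sketch_traits16::to_int_type(data[i]),
                                         sketch_traits16::eof())) {
        ++i;
    }
    return i;
}
```

With this sketch, a buffer that contains the code unit 0xFFFF is cut short at that unit, which is the "sacrificed code point" in practice; buffers without it are read in full.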

 

 

From: Corentin Jabot <corentinjabot@gmail.com>
Sent: Wednesday, June 24, 2026 20:44
To: Tiago Freire <tmiguelf@hotmail.com>
Cc: sg16@lists.isocpp.org; Hubert Tong <hubert.reinterpretcast@gmail.com>; dascandy@gmail.com; Steve Downey <sg16@sdowney.dev>
Subject: Re: [isocpp-sg16] SG16 Meeting Tomorrow at 3:30 NYC Time

 

 

 

On Wed, Jun 24, 2026 at 8:33PM Tiago Freire <tmiguelf@hotmail.com> wrote:

The problem is not UTF-16, the problem is the concept of char_traits<>::eof itself.

Algorithms that assume that eof exists are broken, and we have since made better libraries that don’t use eof at all.

 

We don’t need to fix the lack of char_traits<char16_t>::eof for UTF-16; UTF-16 is not broken. We need to remove char_traits entirely, because that is the part that is broken.
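As a rough illustration of what an eof-free interface can look like, the sketch below reports end of input out of band through std::optional instead of reserving a value of the character type. The class unit_reader is a hypothetical name made up for this example, not an existing library type, and the sketch assumes C++17.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical reader, for illustration only: "no more input" is an empty
// optional, so every char16_t value remains usable as data.
class unit_reader {
public:
    unit_reader(const char16_t* data, std::size_t size) noexcept
        : data_(data), size_(size) {}

    // Returns the next code unit, or std::nullopt once input is exhausted.
    std::optional<char16_t> next() noexcept {
        if (pos_ == size_) return std::nullopt;
        return data_[pos_++];
    }

private:
    const char16_t* data_;
    std::size_t size_;
    std::size_t pos_ = 0;
};
```

Because the end-of-input signal lives outside the value space of char16_t, the code unit 0xFFFF passes through such a reader like any other unit.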

 

Sure, but removing char_traits, if it were possible, would take decades.

We could decide to deprecate char_traits<T>::eof, but the whole of iostream is built on top of it. So as appealing as that would be, it's not a realistic option.

We can decide to do _nothing_ on the basis that it has been broken for 30+ years but... if a targeted fix can help someone with limited disruption, we might as well try that.

 

 

 

From: SG16 <sg16-bounces@lists.isocpp.org> On Behalf Of Corentin Jabot via SG16
Sent: Wednesday, June 24, 2026 20:15
To: Hubert Tong <hubert.reinterpretcast@gmail.com>
Cc: Corentin Jabot <corentinjabot@gmail.com>; sg16@lists.isocpp.org; dascandy@gmail.com; Steve Downey <sg16@sdowney.dev>
Subject: Re: [isocpp-sg16] SG16 Meeting Tomorrow at 3:30 NYC Time

 

 

 

On Wed, Jun 24, 2026 at 5:37PM Hubert Tong <hubert.reinterpretcast@gmail.com> wrote:

On Wed, Jun 24, 2026 at 9:50AM Corentin Jabot via SG16 <sg16@lists.isocpp.org> wrote:

 

 

On Wed, Jun 24, 2026 at 3:17AM Steve Downey via SG16 <sg16@lists.isocpp.org> wrote:

 

  wg21: TODO [[https://github.com/cplusplus/papers/issues/1572][LWG2959]] char_traits<char16_t>::eof is a valid UTF-16 code unit :sg16:

 

We should ask Louis/STL if they are happy to change int_type or, alternatively, explain to LEWG that the only solution is to change int_type, and see if they care.
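For reference, the effect of widening int_type can be sketched as follows. The struct wide_traits16 is a hypothetical stand-in, not proposed wording; it only assumes a 32-bit-capable int_type, which leaves room for an eof() value outside the 16-bit code unit range.

```cpp
#include <cstdint>

// Hypothetical traits, for illustration only: int_type is wider than char16_t.
struct wide_traits16 {
    using char_type = char16_t;
    using int_type  = std::uint_least32_t;

    // Out-of-band sentinel: no 16-bit code unit converts to this value.
    static constexpr int_type eof() noexcept { return 0xFFFFFFFF; }
    static constexpr int_type to_int_type(char_type c) noexcept {
        return static_cast<int_type>(c);
    }
    static constexpr bool eq_int_type(int_type a, int_type b) noexcept {
        return a == b;
    }
    // Returns i unchanged unless it is eof(), in which case some non-eof value.
    static constexpr int_type not_eof(int_type i) noexcept {
        return eq_int_type(i, eof()) ? int_type{0} : i;
    }
};

// Even the code unit 0xFFFF no longer collides with eof().
static_assert(!wide_traits16::eq_int_type(
                  wide_traits16::to_int_type(char16_t(0xFFFF)),
                  wide_traits16::eof()),
              "0xFFFF must stay distinguishable from eof");
```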

 

Only for char16_t? Go beyond that, and I think we'll be back in the "committee broke std::string ABI" land.

 

Right, both UTF-8 and UTF-32 have invalid values (for example, 0xFF and 0xFFFFFFFF can be used as sentinels), so they don't suffer from the issue described here. UTF-16 has no such luck: the whole space is used, either for code points or for surrogates.
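The difference between the three encodings can be checked with a small sketch. The function names below are hypothetical helpers written for this illustration; the ranges are the well-formedness ranges from the Unicode Standard.

```cpp
#include <cstdint>

// Bytes 0xC0, 0xC1 and 0xF5..0xFF never occur in well-formed UTF-8.
constexpr bool occurs_in_utf8(std::uint8_t u) noexcept {
    return u != 0xC0 && u != 0xC1 && u < 0xF5;
}

// UTF-32 code units are Unicode scalar values: at most 0x10FFFF and
// never in the surrogate range.
constexpr bool occurs_in_utf32(std::uint32_t u) noexcept {
    return u <= 0x10FFFF && !(u >= 0xD800 && u <= 0xDFFF);
}

// In UTF-16, a 16-bit value is either half of a surrogate pair...
constexpr bool is_utf16_surrogate(std::uint16_t u) noexcept {
    return u >= 0xD800 && u <= 0xDFFF;
}
// ...or a BMP code point encoded as itself.
constexpr bool is_utf16_bmp_unit(std::uint16_t u) noexcept {
    return u <= 0xD7FF || u >= 0xE000;
}

// Counts 16-bit values that fall in neither category, i.e. values that
// would be free to serve as a sentinel.
constexpr unsigned unused_utf16_values() noexcept {
    unsigned unused = 0;
    for (std::uint32_t u = 0; u <= 0xFFFF; ++u) {
        const auto unit = static_cast<std::uint16_t>(u);
        if (!is_utf16_surrogate(unit) && !is_utf16_bmp_unit(unit)) ++unused;
    }
    return unused;
}

static_assert(!occurs_in_utf8(0xFF), "0xFF is free as a UTF-8 sentinel");
static_assert(!occurs_in_utf32(0xFFFFFFFF), "0xFFFFFFFF is free as a UTF-32 sentinel");
```

Calling unused_utf16_values() yields zero: no 16-bit value is left over, which is why a 16-bit int_type cannot hold an out-of-band eof.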

 

 

-- HT