std-proposals

Re: [std-proposals] char8_t aliasing and Unicode

From: Simon Schröder <dr.simon.schroeder_at_[hidden]>
Date: Sun, 31 Aug 2025 14:02:00 +0200
> On Aug 31, 2025, at 12:56 PM, Tiago Freire via Std-Proposals <std-proposals_at_[hidden]> wrote:
>
> 
> Just so I don't leave this point unanswered:
>
> >I'd add window text to that list since that's a large part of what software is developed for.
>
> But the C++ standard doesn't know what a "window" is, it's not likely to ever dictate how to render one. So C++ will not define the encoding of such API's.
>
I agree that it is highly unlikely that C++ will ever agree on a GUI library. Still, there have been several proposals to SG13, so there is certainly some ambition for C++ to specify how to render a window. If I’m not mistaken, this goal was already stated in D&E (The Design and Evolution of C++).

As I mentioned before, libraries might want to propagate error descriptions through exceptions or std::expected. It would work with a mixture of different Unicode encodings, but it would also be much easier to teach and learn C++ if we didn’t have to constantly convert between encodings. I’m not saying to make a single encoding mandatory, but only to make it the preferred one.

I don’t like conversions between encodings. It is hard to get them right in real-world scenarios, and it is relatively easy to run into trouble with temporaries. I’ve had enough bugs because I’ve written ‘some_qstring.toUtf8().data()’. You might get lucky, but it is ultimately wrong (is it UB?).
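
On the "is it UB?" question, here is a minimal sketch of the temporary-lifetime issue. It does not use Qt: `to_utf8` is a hypothetical stand-in for `QString::toUtf8()` that takes a `std::u16string` and returns a `std::string` by value, and the two `utf8_length_*` helpers are made-up names for illustration. The point it tries to show: the pointer is valid only until the end of the full-expression that created the temporary. Using it inside that expression is fine; storing it and reading it in a later statement reads a destroyed buffer, which is undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for QString::toUtf8(): converts UTF-16 to UTF-8 and
// returns the result BY VALUE, so every call produces a new temporary.
// Assumes well-formed UTF-16; an unpaired high surrogate at the end trips the assert.
std::string to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            assert(i + 1 < in.size() && "unpaired high surrogate");
            // Combine the surrogate pair into a single code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[++i] - 0xDC00);
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// OK: the temporary std::string lives until the end of the full-expression,
// so strlen() reads a buffer that still exists.
std::size_t utf8_length_inline(const std::u16string& s) {
    return std::strlen(to_utf8(s).c_str());
}

// OK: a named object keeps the buffer alive for as long as the pointer is used.
std::size_t utf8_length_named(const std::u16string& s) {
    const std::string utf8 = to_utf8(s);
    const char* p = utf8.c_str();
    return std::strlen(p);
}

// NOT OK (left as a comment on purpose, since it must not be executed):
//   const char* p = to_utf8(s).c_str();  // temporary is destroyed at this ';'
//   std::strlen(p);                      // reads a dangling pointer: UB
```

The named-object variant is the one that survives refactoring: it costs one extra line, and nothing breaks if the pointer is later used a few statements further down.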

Fewer conversions make for fewer errors. Everybody wins if everyone prefers (but does not mandate) a single encoding. If all libraries agree on at least one encoding, it is much easier to make them work well together.

I also don’t agree that this discussion is “only” about filenames and the terminal. These two are quite fundamental in teaching C++. Let’s make it easy for beginners by not confusing them with different encodings. This currently works with char, but char (with a local encoding) does not translate well to real software. Unicode is a much more practical solution.

BTW, we might decide (I’m not sure if I’d personally agree with this) not to add Unicode support to cout, but only to std::print. On the other hand, I rely heavily on istreams in some of my projects, so it would be nice if they supported at least one Unicode encoding.

Received on 2025-08-31 12:02:14