Re: [std-proposals] char8_t aliasing and Unicode

From: zxuiji <gb2985_at_[hidden]>
Date: Sun, 31 Aug 2025 13:23:40 +0100
On Sun, 31 Aug 2025 at 13:02, Simon Schröder <dr.simon.schroeder_at_[hidden]>
wrote:

>
>
> On Aug 31, 2025, at 12:56 PM, Tiago Freire via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
>
> 
> Just so I don't leave this point unanswered:
>
> >I'd add window text to that list since that's a large part of what
> software is developed for.
>
> But the C++ standard doesn't know what a "window" is, it's not likely to
> ever dictate how to render one. So C++ will not define the encoding of such
> APIs.
>
> I agree that it is highly unlikely that C++ will ever agree on a GUI
> library. But, still there have been several proposals to SG13. So,
> certainly there is a goal that C++ should dictate how to render a window.
> If I’m not mistaken this goal was already specified in D&E (Design and
> Evolution of C++).
>
> As I mentioned before, libraries might want to propagate error
> descriptions through exceptions or std::expected. It would work with a
> mixture of different Unicode encodings, but it would also be much easier to
> teach and learn C++ if we didn’t have to constantly convert between
> encodings. I’m not saying to make a single encoding mandatory, but only to
> make it the preferred one.
>
> I don’t like conversions between encodings. It is hard to get them right
> in real world scenarios. It is relatively easy to run into trouble with
> temporaries. I’ve had enough bugs because I’ve written
> ‘some_qstring.toUtf8().data()’. You might get lucky, but it is ultimately
> wrong (is it UB?).
>
> Fewer conversions make for fewer errors. Everybody wins if all prefer (but
> not mandate) a single encoding. If all libraries agree on at least one
> encoding it is much easier to have them behave nicely together.
>
> I also don’t agree that this discussion is “only” about filenames and the
> terminal. These two are quite fundamental in teaching C++. Let’s make it
> easy for beginners by not confusing them with different encodings. This
> currently works with char, but char (with a local encoding) does not
> translate well to real software. Unicode is a much more practical solution.
>
> BTW, we might decide (I’m not sure if I’d personally agree with this) to
> not add support for Unicode to cout, but only to std::print. On the other
> hand I heavily rely on istreams in some of my projects. So, it would be
> nice if these support at least one Unicode encoding.
>

Well then you'll like what I'm working on, even if I'm taking ages to get
the core working the way I want. Basically I'm working on a library and
launcher pair, with the library preferring a variant of UTF (undecided
which as yet, but leaning towards UTF-16LE or UTF-8). I'm calling it paw
(Platform ABI Wrapper), and the ultimate goal is that it doesn't try to
define typedefs like wchar_t, favouring void* & charN_t where text is
concerned and void*, size where positions etc. are concerned.
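
On the temporaries point quoted above, here is a rough sketch of why
'some_qstring.toUtf8().data()' bites. It uses only the standard library:
to_utf8() is a hypothetical stand-in for any conversion that returns its
buffer by value (it is not Qt's API and only handles ASCII), and
byte_length() is a hypothetical consumer of a C string. The pointer is fine
while the temporary is alive, i.e. until the end of the full expression;
keeping it past that and reading through it is undefined behaviour.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a by-value conversion such as toUtf8().
// ASCII-only to keep the sketch short; anything else becomes '?'.
std::string to_utf8(const std::u16string& text) {
    std::string out;
    for (char16_t unit : text) {
        out.push_back(unit < 0x80 ? static_cast<char>(unit) : '?');
    }
    return out;
}

// Hypothetical consumer that expects a NUL-terminated byte string.
std::size_t byte_length(const char* utf8) {
    return std::strlen(utf8);
}

// OK: the temporary std::string lives until the end of the full
// expression, so the pointer is still valid during the call.
std::size_t length_in_one_expression(const std::u16string& text) {
    return byte_length(to_utf8(text).c_str());
}

// OK: a named object owns the bytes for as long as the pointer is used.
std::size_t length_with_named_owner(const std::u16string& text) {
    const std::string utf8 = to_utf8(text);
    const char* bytes = utf8.c_str();
    return byte_length(bytes);
}

// Broken pattern, left commented out on purpose:
//   const char* bytes = to_utf8(text).c_str();  // temporary dies at the ';'
//   byte_length(bytes);                         // dangling pointer, UB
```

Every extra conversion is another place to pick between these patterns,
which is one more argument for keeping conversions rare.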

Received on 2025-08-31 12:09:26