Re: [std-proposals] TBAA and extended floating-point types

From: Thiago Macieira <thiago_at_[hidden]>
Date: Sat, 30 Aug 2025 13:58:40 -0700
On Saturday, 30 August 2025 13:46:59 Pacific Daylight Time Oliver Hunt wrote:
> I’ll prod folk again, but I’m not sure I understand why you seem so
> absolutely adamant that everyone does or should use utf16 internally when
> multiple people have said this is not true, and pointed to every API you
> reference correctly as “this API was introduced when ucs2 was thought to be
> sufficient, and then got utf16 bolted on after the fact and at different
> rates”.

I'm not adamant about this any more. Based on what you said, I think Swift
reimplemented the support for the Unicode Database. I just can't find it,
because I don't know how to navigate the source code. I've found where it
iterates over the UTF-8 string and returns UTF-32 code points, but not where
it looks up the collation value such that U+00E9 sorts before U+0069.
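
To make the distinction concrete, here is a minimal sketch of the two steps I
mean: decoding UTF-8 into UTF-32 code points, and then comparing by a collation
weight rather than by code point. This is not Swift's code and not the Unicode
Collation Algorithm; decode_utf8, toy_primary_weight and the two comparison
helpers are hypothetical names, and the one-entry "weight table" is an
assumption made purely for illustration.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Decode UTF-8 into UTF-32 code points.
// Assumption: the input is well-formed UTF-8; malformed input is not diagnosed.
std::vector<char32_t> decode_utf8(std::string_view s) {
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char lead = static_cast<unsigned char>(s[i]);
        // Number of continuation bytes implied by the lead byte.
        int extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        // Payload bits of the lead byte: 7, 5, 4 or 3 bits.
        char32_t cp = extra == 0 ? lead : (lead & (0x3F >> extra));
        ++i;
        for (; extra > 0 && i < s.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        out.push_back(cp);
    }
    return out;
}

// Hypothetical stand-in for a real collation-table lookup: U+00E9 gets the
// same primary weight as 'e', and ASCII upper case folds to lower case.
char32_t toy_primary_weight(char32_t cp) {
    if (cp == 0x00E9) return U'e';
    if (cp >= U'A' && cp <= U'Z') return cp + (U'a' - U'A');
    return cp;
}

// Plain code point order: U+00E9 comes after U+0069.
bool code_point_less(char32_t a, char32_t b) { return a < b; }

// Toy collation order: U+00E9 is weighted like 'e', so it comes before 'i'.
bool toy_collation_less(char32_t a, char32_t b) {
    return toy_primary_weight(a) < toy_primary_weight(b);
}
```

With this, code_point_less(0x00E9, 0x0069) is false while
toy_collation_less(0x00E9, 0x0069) is true. The second kind of lookup, backed
by real Unicode data instead of a toy table, is the part I could not locate.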

The problem, of course, is that this means they've duplicated the access to
the Unicode Database instead of using the OS's. Then again, if Swift is
cross-platform to other OSes, it kind of has to, if it doesn't want to depend
on ICU.

> What you seem to be arguing is old ABI fixed APIs that were extended to
> support utf16, so despite the many problems of utf16 vs utf8, and the
> widespread adoption of utf8 everywhere other than places that are stuck with
> utf16 due to aforementioned ABI constraints, all new systems languages
> being built on utf8 strings, we should make new APIs built around utf16 so
> we can continue to be required to maintain an encoding that (from what the
> domain experts have told me) is bad on every metric.

I'm arguing that because we have such a widespread use of UTF-16 in C and C++,
we need first-class UTF-16 support in the C++ Standard. I don't care about
other languages, because I'm not writing code for them. But the underlying
infrastructure for UTF-16 for C and C++ seems to be there.

So instead of talking about Rust or Swift, let's ask what libc++ would use to
implement collation.

> “AI” is just predictive text generation regurgitating existing content, so
> of course it will produce answers that are most like the above. The
> majority of the posts it regurgitates written about stuff like this are
> from _decades_ of objc + foundation. AI doesn’t magically know anything, it
> literally just regurgitates the work of others, periodically adding errors.
> There is no reason to use it in a technical forum.

Which is why I almost always ignore the AI and go straight for the sources,
because until it is 99% reliable or more, it's useless. But in this case,
since I can't judge the accuracy of the sources either, the AI answer
suffices. It seemed plausible that, if you needed the exact same sorting as
Finder, you'd use the same function that Finder uses, not one that may be
slightly different due to a reimplementation, however correct it may be.

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Principal Engineer - Intel Platform & System Engineering

Received on 2025-08-30 20:58:46