std-proposals

Re: [std-proposals] TBAA and extended floating-point types

From: Thiago Macieira <thiago_at_[hidden]>
Date: Tue, 26 Aug 2025 18:30:51 -0700
On Tuesday, 26 August 2025 18:04:33 Pacific Daylight Time Oliver Hunt wrote:
> I don’t know about rust, but with the exception of the NSString bridge
> (which I assume does something exciting, because bridging), Swift operates
> entirely in utf-8. In addition to that I know many string implementations
> that are ostensibly utf16 or ucs2 use either ascii (not utf8 - this is
> generally because they’re often stuck with some char16* API and
> utf8->utf16->utf8 is expensive) or utf16 internally depending on the
> content.

Would you be able to find out how Swift implements collation? I don't know
where to begin the search. For Rust, it appears to be ICU4X [1] which is a
full reimplementation of ICU4C in Rust.

Either way, if I am writing C++ platform code, Swift or Rust don't matter much. My
options are to use either ICU or an OS-provided framework. The latter case on
an Apple platform is either UCCompareText[2], which operates on UniChar
arrays, or NSString's localizedCompare[3]. Likewise for Windows: you'd either
use ICU or CompareStringEx[4] - a wchar_t-only API, there's no "A" version.
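
To make the transcoding cost concrete: code that keeps its strings in UTF-8 has to produce a UTF-16 buffer before it can call any of the APIs above. The following is only a minimal sketch of that step. The name utf8_to_utf16 is hypothetical, not part of any library, and the code assumes well-formed input (it does not detect overlong encodings, stray continuation bytes, or encoded surrogates).

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper for illustration only: decodes UTF-8 and re-encodes
// it as UTF-16. Assumes well-formed input; malformed bytes are NOT rejected.
std::u16string utf8_to_utf16(std::string_view in)
{
    std::u16string out;
    out.reserve(in.size()); // UTF-16 never needs more code units than UTF-8

    for (std::size_t i = 0; i < in.size(); ) {
        const auto b0 = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;

        // The lead byte gives the sequence length and the top payload bits.
        if (b0 < 0x80)      { cp = b0;        len = 1; }
        else if (b0 < 0xE0) { cp = b0 & 0x1F; len = 2; }
        else if (b0 < 0xF0) { cp = b0 & 0x0F; len = 3; }
        else                { cp = b0 & 0x07; len = 4; }

        if (i + len > in.size())
            break; // truncated trailing sequence: stop converting

        // Each continuation byte contributes six more bits.
        for (std::size_t k = 1; k < len; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        i += len;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Every comparison through a UTF-16-only API pays for a pass like this over both operands, plus the allocation, unless the strings are already stored as UTF-16.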

So if you're dealing with Unicode today, you are using UTF-16 almost without
exception. The C++ Standard's Unicode support needs to take this legacy into
account. Forcing everyone to use char8_t is not the way.

[1] https://github.com/unicode-org/icu4x
[2] https://developer.apple.com/documentation/coreservices/1390642-uccomparetext
[3] https://developer.apple.com/documentation/foundation/nsstring/localizedcompare(_:)?language=objc
[4] https://learn.microsoft.com/en-us/windows/win32/api/stringapiset/nf-stringapiset-comparestringex

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Principal Engineer - Intel Platform & System Engineering

Received on 2025-08-27 01:31:00