std-proposals

Re: [std-proposals] TBAA and extended floating-point types

From: Oliver Hunt <oliver_at_[hidden]>
Date: Mon, 25 Aug 2025 10:01:18 -0700
> On Aug 24, 2025, at 10:54 PM, Thiago Macieira via Std-Proposals <std-proposals_at_[hidden]> wrote:
>
>
> As I pointed out in the other email: the choice here should be UTF-16, not
> UTF-8. That would get immediate and unconverted access to ICU, Win32 and
> Cocoa/CoreFoundation APIs.

UTF-8 is the preferred encoding for new code and languages. Foundation uses UTF-16 for legacy API/ABI reasons, but Swift strings are UTF-8 by default; the only exception is bridged NSStrings, because of Foundation’s UTF-16 ABI requirements.

Web content is overwhelmingly UTF-8, so browser engines would prefer UTF-8 as well. The only reason they don't use it is that JS exposes a UCS-2 interface, so internally their strings are either 8 or 16 bit. Because JS exposes UCS-2 code units, there is no efficient way to use a different multibyte encoding, and because a JS string can contain unpaired surrogates, encoding it to UTF-8 is not necessarily possible (hence the WTF-8 encoding).
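To make the unpaired-surrogate point concrete, here is a minimal sketch, not anything taken from an actual engine. The names `append_utf8_layout` and `transcode` are hypothetical. With `allow_lone_surrogates` set to false it behaves like a strict UTF-8 encoder and gives up on an unpaired surrogate; with it set to true it emits the three-byte form that WTF-8 permits for lone surrogates.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: append cp using the generic UTF-8 bit layout.
// It deliberately does not reject surrogates; the caller decides that.
inline void append_utf8_layout(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

inline bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool is_low_surrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Hypothetical transcoder for JS-style 16-bit code units.
// Returns std::nullopt when the input has an unpaired surrogate and
// allow_lone_surrogates is false (strict UTF-8 cannot represent it).
inline std::optional<std::string> transcode(std::u16string_view in,
                                            bool allow_lone_surrogates) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char16_t u = in[i];
        if (is_high_surrogate(u) && i + 1 < in.size() &&
            is_low_surrogate(in[i + 1])) {
            // Well-formed pair: combine into one supplementary code point.
            const char32_t cp = 0x10000 +
                ((static_cast<char32_t>(u) - 0xD800) << 10) +
                (static_cast<char32_t>(in[i + 1]) - 0xDC00);
            append_utf8_layout(out, cp);
            ++i;  // the low surrogate has been consumed
        } else if (is_high_surrogate(u) || is_low_surrogate(u)) {
            // Unpaired surrogate: no well-formed UTF-8 encoding exists.
            if (!allow_lone_surrogates) return std::nullopt;
            append_utf8_layout(out, u);  // WTF-8 style three-byte form
        } else {
            append_utf8_layout(out, u);
        }
    }
    return out;
}
```

For the single code unit 0xD83D, `transcode(s, false)` returns `std::nullopt`, while `transcode(s, true)` yields the bytes ED A0 BD, which a conforming UTF-8 decoder would reject.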

—Oliver

Received on 2025-08-25 17:01:30