Re: [std-proposals] TBAA and extended floating-point types

From: Tom Honermann <tom_at_[hidden]>
Date: Tue, 26 Aug 2025 14:56:55 -0400
On 8/24/25 10:31 AM, Simon Schröder via Std-Proposals wrote:
>
>
>> On Aug 24, 2025, at 4:21 AM, Thiago Macieira via Std-Proposals
>> <std-proposals_at_[hidden]> wrote:
>>
>> On Saturday, 23 August 2025 19:55:14 Central Daylight Time Tom
>> Honermann
>> wrote:
>>
>>> * char is assumed to be UTF-8. This is very common on Linux and macOS.
>>> Support for this has improved in the Windows ecosystem, but rough
>>> corners remain. Note that compiling with MSVC's /utf-8 option is not
>>> sufficient by itself to enable an assumption of UTF-8 for text held
>>> in char-based storage; the environment encoding, console encoding,
>>> and locale region settings also have to be considered.
>>
>> Indeed and Microsoft's slow uptake on this is annoying, even if
>> completely
>> understandable due to the massive legacy it is dealing with.
>>>
> The real shame is that Microsoft has everything in place already. With
> a simple call to setlocale(LC_ALL, ".utf8"); all Windows functions
> support UTF-8
> (https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170#:~:text=To%20enable%20UTF%2D8%20mode,8%20for%20the%20code%20page.).
> For console output we have to set the appropriate code page in
> addition to that. One annoying thing is that we need to use the
> Windows functions to get the command line arguments as UTF-16 and then
> convert them to UTF-8. I work this way to make my code portable: UTF-8
> everywhere. However, as you have mentioned, right now for this I still
> use char and not char8_t. As all Windows functions already work with
> the .utf8 locale, it would be easy to provide support for char8_t as
> well. (Maybe they need to change functions under the hood to take the
> locale explicitly instead of relying on the global locale.)

setlocale() doesn't affect the encoding used by Win32 "ANSI" functions;
they will still use the Active Code Page. To fix that, you have to build
your application with a manifest that sets the activeCodePage property
to UTF-8
<https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page>
(or ensure that your users have enabled UTF-8 for the system locale in
their region settings, which is a beta option that requires
administrative access).

We do have a proposal to improve access to command line arguments. See
P3474 (std::arguments) <https://wg21.link/p3474>.
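
Until something like that is available, the workaround Simon describes
comes down to a UTF-16 to UTF-8 conversion. Here is a rough, portable
sketch of that step. utf16_to_utf8 is a hypothetical helper, not part of
P3474 or any library, and replacing unpaired surrogates with U+FFFD is
my assumption about the desired error handling. On Windows, the UTF-16
input would typically come from GetCommandLineW() and
CommandLineToArgvW(), and WideCharToMultiByte() with CP_UTF8 is the
usual way to do the conversion itself.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: converts UTF-16 code units to UTF-8 bytes.
// Unpaired surrogates are replaced with U+FFFD (an assumed policy).
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with a following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;  // consume the low surrogate as well
            } else {
                cp = 0xFFFD;  // high surrogate with no partner
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate with no preceding high surrogate
        }

        // Encode the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

For example, utf16_to_utf8(u"Schr\u00F6der") yields the bytes of
"Schröder" in UTF-8, with the "ö" encoded as 0xC3 0xB6.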

Tom.

>
> I even have the code page globally set to UTF-8 on Windows.
>

Received on 2025-08-26 18:56:59