On Aug 24, 2025, at 4:21 AM, Thiago Macieira via Std-Proposals <std-proposals@lists.isocpp.org> wrote:

On Saturday, 23 August 2025 19:55:14 Central Daylight Time Tom Honermann
wrote:

 * char is assumed to be UTF-8. This is very common on Linux and macOS.
   Support for this has improved in the Windows ecosystem, but rough
   corners remain. Note that compiling with MSVC's /utf-8 option is not
   sufficient by itself to enable an assumption of UTF-8 for text held
   in char-based storage; the environment encoding, console encoding,
   and locale region settings also have to be considered.

Indeed, and Microsoft's slow uptake on this is annoying, even if completely
understandable due to the massive legacy it is dealing with.

The real shame is that Microsoft has everything in place already. With a simple call to setlocale(LC_ALL, ".utf8"); the C runtime functions support UTF-8 (https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170#:~:text=To%20enable%20UTF%2D8%20mode,8%20for%20the%20code%20page.). For console output we also have to set the console code page to UTF-8. One annoying thing is that we need to use the Windows functions to get the command line arguments as UTF-16 and then convert them to UTF-8. I work this way to make my code portable: UTF-8 everywhere. However, as you have mentioned, right now I still use char for this and not char8_t. As the C runtime functions already work with the .utf8 locale, it would be easy to provide support for char8_t as well. (Maybe they would need to change functions under the hood to take the locale explicitly instead of relying on the global locale.)
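
To make the steps above concrete, here is a rough sketch of the setup, not a definitive implementation. The helper names enable_utf8, utf16_to_utf8 and utf8_args are hypothetical, made up for this illustration. It assumes a Universal CRT recent enough to accept the ".utf8" locale, and on non-Windows platforms it simply assumes the environment is already UTF-8. The conversion is hand-written so it stays portable; WideCharToMultiByte with CP_UTF8 would be the usual Windows-only alternative.

```cpp
#include <clocale>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <shellapi.h>  // CommandLineToArgvW; link with Shell32.lib
#endif

// Hypothetical helper: put the C runtime (and, best effort, the console)
// into UTF-8 mode. Returns false if the CRT rejects the ".utf8" locale.
bool enable_utf8() {
#ifdef _WIN32
    if (std::setlocale(LC_ALL, ".utf8") == nullptr) return false;
    // These can fail when no console is attached, so the results are ignored.
    SetConsoleOutputCP(CP_UTF8);
    SetConsoleCP(CP_UTF8);
    return true;
#else
    return true;  // assumption: the environment is already UTF-8
#endif
}

// Hypothetical helper: portable UTF-16 to UTF-8 conversion.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a high/low surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

#ifdef _WIN32
// Hypothetical helper: fetch the command line as UTF-16 from Windows
// and return the arguments converted to UTF-8.
std::vector<std::string> utf8_args() {
    std::vector<std::string> args;
    int argc = 0;
    wchar_t** argv = CommandLineToArgvW(GetCommandLineW(), &argc);
    if (argv == nullptr) return args;
    for (int i = 0; i < argc; ++i) {
        // wchar_t is a 16-bit UTF-16 code unit on Windows.
        std::u16string w(argv[i], argv[i] + std::wcslen(argv[i]));
        args.push_back(utf16_to_utf8(w));
    }
    LocalFree(argv);
    return args;
}
#endif
```

In this sketch, main would call enable_utf8() once at startup and, on Windows, use utf8_args() instead of the argv it is handed; everything after that point can treat char strings as UTF-8.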

I even have the code page globally set to UTF-8 on Windows.