Re: [isocpp-sg16] Use cases for user construction of text_encoding by name

From: Thiago Macieira <thiago_at_[hidden]>
Date: Sun, 21 Jul 2024 08:38:36 -0700
On Sunday 21 July 2024 04:53:06 GMT-7 Henri Sivonen via SG16 wrote:
> Apart from user construction, what's std::text_encoding::environment()
> expected to return on Windows that was installed with Korean as the
> language (i.e. the Windows legacy code page is 949)?

I want to point out that there are likely to be only three possible
implementations:

1) ICU
2) Windows Codepage API
3) Unix system's iconv library

The problem with #3 is that it can be quite inconsistent in quality from
system to system. I don't know what the libstdc++ and libc++ plans are, or
whether requiring ICU is in the cards for them. iconv has the advantage of
being part of POSIX, so it is always there on Unix systems. But given its
inconsistencies, I wouldn't factor it into the question of full
interoperability and would instead leave that as a QoI problem for the
implementations to deal with.

In any case, the environment encoding is probably only going to have two
answers:

a) the Windows codepage or an identifier from it
b) UTF-8

The Standard can't rely on this or mandate it, but it's likely going to be the
end result. So the answer to your question should be an identifier that can be
reconstructed properly on Windows with their API and with ICU.

Is there such a 1:1 mapping?
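
To make the question concrete, here is a minimal sketch of what a round-trip
check between a Windows code page number and a charset name could look like.
Everything in it is an assumption for illustration: the table kTable, the
helpers name_for_code_page, code_page_for_name and round_trips are
hypothetical and are not part of std::text_encoding, the Windows API or ICU.
The table is deliberately incomplete, and code page 949 is left out because
the sketch makes no claim about which identifier would fit it.

```cpp
#include <array>
#include <optional>
#include <string_view>

// Hypothetical pairing of a Windows code page number with a charset name.
struct CodePageName {
    unsigned code_page;
    std::string_view name;
};

// Illustrative, deliberately incomplete table. Code page 949 is omitted on
// purpose: whether it has a name that round-trips is the open question.
inline constexpr std::array<CodePageName, 3> kTable{{
    {65001, "UTF-8"},
    {1252, "windows-1252"},
    {932, "Windows-31J"},
}};

// Look up the name for a code page, or nullopt if the table has no entry.
constexpr std::optional<std::string_view> name_for_code_page(unsigned cp) {
    for (const auto& entry : kTable) {
        if (entry.code_page == cp) return entry.name;
    }
    return std::nullopt;
}

// Reverse lookup. The comparison is exact and case-sensitive, which is a
// simplification made for this sketch.
constexpr std::optional<unsigned> code_page_for_name(std::string_view name) {
    for (const auto& entry : kTable) {
        if (entry.name == name) return entry.code_page;
    }
    return std::nullopt;
}

// True when code page -> name -> code page yields the original code page.
constexpr bool round_trips(unsigned cp) {
    const auto name = name_for_code_page(cp);
    return name.has_value() && code_page_for_name(*name) == cp;
}
```

With this table, round_trips(65001) holds, while round_trips(949) does not,
simply because no entry exists. A real implementation would need a complete
table on which both lookups agree for every code page it reports.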

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Principal Engineer - Intel DCAI Platform & System Engineering

Received on 2024-07-21 15:38:41