sg16

Clarify "Clarify handling of encodings in localized formatting of chrono types"

From: Jonathan Wakely <cxx_at_[hidden]>
Date: Wed, 10 Jan 2024 23:24:18 +0000
What's the intended implementation strategy for
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2419r2.html on
POSIX?

My best guess is something like this, where loc is the formatting locale
([time.format] p2):

if (narrow literal encoding is UTF-8)
  if (locale_t cloc = ::newlocale(LC_ALL_MASK, loc.name().c_str(), (locale_t)0)) {
    const char* enc = ::nl_langinfo_l(CODESET, cloc);
    if (/* enc is not UTF-8 */) {
      iconv_t ic = ::iconv_open("UTF-8", enc);
      if (ic != (iconv_t)-1) {
        // use ::iconv to convert from locale's encoding to UTF-8
        ::iconv_close(ic);
      }
    }
    ::freelocale(cloc);
  }

But that seems pretty involved ... and {fmt} doesn't do any of that.
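
For concreteness, here is a minimal, self-contained sketch of that strategy. It is an illustration only, not what P2419 mandates and not what any implementation is known to do. The helper names (locale_codeset, to_utf8, localized_to_utf8) are hypothetical. It assumes a POSIX system with newlocale, nl_langinfo_l and iconv, and it assumes loc.name() is a name that newlocale accepts, which is not guaranteed for every std::locale.

```cpp
#include <cerrno>
#include <cstddef>
#include <iconv.h>
#include <langinfo.h>
#include <locale.h>
#include <locale>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: name of the encoding used by the named POSIX locale,
// or nullopt if the locale cannot be created.
std::optional<std::string> locale_codeset(const char* locale_name) {
    locale_t cloc = ::newlocale(LC_ALL_MASK, locale_name, (locale_t)0);
    if (cloc == (locale_t)0)
        return std::nullopt;
    std::string enc = ::nl_langinfo_l(CODESET, cloc);
    ::freelocale(cloc);
    return enc;
}

// Hypothetical helper: transcode `in` from `from_enc` to UTF-8 using iconv.
// Returns nullopt if the conversion is unsupported or the input is invalid.
// Stateful source encodings would also need a final flush call (not shown).
std::optional<std::string> to_utf8(std::string_view in, const char* from_enc) {
    iconv_t ic = ::iconv_open("UTF-8", from_enc);
    if (ic == (iconv_t)-1)
        return std::nullopt;
    std::string out;
    char buf[256];
    char* src = const_cast<char*>(in.data());  // iconv does not modify the input
    std::size_t src_left = in.size();
    bool ok = true;
    while (src_left > 0) {
        char* dst = buf;
        std::size_t dst_left = sizeof buf;
        std::size_t r = ::iconv(ic, &src, &src_left, &dst, &dst_left);
        out.append(buf, sizeof buf - dst_left);  // keep whatever was converted
        if (r == (std::size_t)-1 && errno != E2BIG) {  // E2BIG: buffer full, retry
            ok = false;
            break;
        }
    }
    ::iconv_close(ic);
    if (!ok)
        return std::nullopt;
    return out;
}

// Hypothetical helper combining the two steps for a formatting locale.
std::optional<std::string> localized_to_utf8(std::string_view s,
                                             const std::locale& loc) {
    auto enc = locale_codeset(loc.name().c_str());
    if (!enc)
        return std::nullopt;
    if (*enc == "UTF-8")
        return std::string(s);  // nothing to convert
    return to_utf8(s, enc->c_str());
}
```

With these helpers, a localized month or weekday name produced by the locale's time_put facet would be passed through localized_to_utf8 before being written to the output. The sketch opens a new locale_t and iconv descriptor on every call. A real implementation would presumably cache them.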

I tried testing the example from the paper on Linux, and fmt::format fails
with an exception. Debugging it shows that it tries to use the formatting
locale's std::codecvt<char32_t, char, mbstate_t> facet to convert the
string. But that's not right, because that codecvt specialization is
defined by the standard to convert between UTF-8 and UTF-32 only. So it can
only work if the input is ASCII, or the locale uses UTF-8, in which case
there's nothing that needs converting anyway.
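
The limitation can be shown with a small sketch. The helper name decode_with_codecvt is hypothetical, and this is not {fmt}'s actual code. It only exercises the same facet directly, using the classic locale for simplicity. The facet treats its input as UTF-8 whichever locale it is taken from, so the bytes a Latin-1 locale would produce for "été" ("\xE9t\xE9") are rejected, while the UTF-8 encoding of "é" ("\xC3\xA9") decodes fine.

```cpp
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical helper: decode `in` with the locale's
// std::codecvt<char32_t, char, std::mbstate_t> facet and return the result
// code; `out` receives whatever was decoded before any failure.
std::codecvt_base::result decode_with_codecvt(const std::locale& loc,
                                              const std::string& in,
                                              std::u32string& out) {
    using facet_t = std::codecvt<char32_t, char, std::mbstate_t>;
    const facet_t& cvt = std::use_facet<facet_t>(loc);
    std::mbstate_t state{};
    // UTF-32 never needs more code units than the UTF-8 input has bytes.
    out.assign(in.size(), U'\0');
    const char* from_next = nullptr;
    char32_t* to_next = nullptr;
    std::codecvt_base::result r =
        cvt.in(state, in.data(), in.data() + in.size(), from_next,
               out.data(), out.data() + out.size(), to_next);
    out.resize(to_next - out.data());
    return r;
}
```

Calling decode_with_codecvt(std::locale::classic(), "\xC3\xA9", out) should report ok, whereas the Latin-1 bytes "\xE9t\xE9" should not, because 0xE9 followed by 't' is not a valid UTF-8 sequence.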

AFAIK the standard doesn't provide a way to convert from an arbitrary
locale's encoding to the execution charset, or even to get the name of an
arbitrary locale's encoding (C++26 provides a way to get the name of the
execution environment's encoding, but not an arbitrary std::locale's
encoding).

Is the pseudocode above the intention? Or am I misinterpreting something in
P2419?

I've read the SG16 minutes from when P2419 was discussed, and I don't see an
answer there.

Received on 2024-01-10 23:25:35