Date: Wed, 23 Jun 2021 15:26:36 -0400
On Thu, Jun 17, 2021 at 4:57 PM Peter Brett via SG16 <sg16_at_[hidden]>
wrote:
> Hi all,
>
> The current proposed resolution for LWG3565 (https://wg21.link/LWG3565)
> involves transcoding from the locale encoding to UTF-8. This makes me a
> little uncomfortable.
>
> Is it possible instead to say that, if the string literal encoding is
> UTF-8, then the effective locale is _as if_ the specified or global
> locale was modified by replacing the associated codeset with UTF-8?
>
> So, the following code:
>
> std::locale l1("Russian.1251");
> auto s = std::format(l1, "День недели: {:L}", std::chrono::Monday);
>
> Would behave as if replaced by:
>
> std::locale l1("Russian.1251");
> std::locale l2(l1, std::locale("Russian.UTF-8"), std::locale::time);
> auto s = std::format(l2, "День недели: {:L}", std::chrono::Monday);
>
> This would permit an implementation that has UTF-8 locale data available
> to use it directly, rather than being required to use the 1251 codeset
> locale data and transcode in order to conform to the standard.
>
> Peter
>
> P.S. How would one go about writing a locale object that customizes
> chrono formatting with std::format? Does anyone have a code sample?
>
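On the P.S.: a minimal sketch of one way to customize time formatting through a locale, assuming the formatting code obtains names from the std::time_put facet. That holds for std::put_time; whether a given implementation of std::format consults this facet for chrono types is something I have not verified, so treat that part as an assumption. The facet name `weekday_put`, the helper `weekday_name`, and the replacement names are hypothetical.

```cpp
#include <algorithm>
#include <ctime>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: replaces the full weekday name (%A) and defers
// every other conversion to the default std::time_put behaviour.
class weekday_put : public std::time_put<char> {
protected:
    iter_type do_put(iter_type out, std::ios_base& io, char_type fill,
                     const std::tm* t, char format, char modifier) const override {
        if (format == 'A') {
            static const char* const names[] = {"SUN", "MON", "TUE", "WED",
                                                "THU", "FRI", "SAT"};
            // Normalize so that out-of-range tm_wday values stay in bounds.
            const std::string name = names[((t->tm_wday % 7) + 7) % 7];
            return std::copy(name.begin(), name.end(), out);
        }
        return std::time_put<char>::do_put(out, io, fill, t, format, modifier);
    }
};

// Formats a weekday (0 = Sunday) with "%A" through a locale carrying the facet.
std::string weekday_name(int wday) {
    // The locale takes ownership of the facet and deletes it when unused.
    std::locale loc(std::locale::classic(), new weekday_put);
    std::ostringstream os;
    os.imbue(loc);
    std::tm t{};
    t.tm_wday = wday;
    os << std::put_time(&t, "%A");
    return os.str();
}
```

The same `loc` object could be passed as the first argument to std::format to test what a particular implementation does with it.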
The nl_langinfo CODESET item belongs to the LC_CTYPE category. I don't
think the C++ layer has much control over the mapping. In POSIX, the
`localedef` utility is used to define custom locales for use with C, and
its "charmap" format allows custom codeset names. Not all implementations
require a mapping to the UCS to be provided.

My understanding is that the implementation of iconv_open need not support
all charmaps used with locales on the system.
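
To illustrate where the codeset name comes from, here is a minimal POSIX-only sketch that asks nl_langinfo for the CODESET of a named locale. The helper name `codeset_of` is hypothetical, and the string returned for any given locale is implementation-defined, so the sketch makes no claim about specific values.

```cpp
#include <clocale>
#include <langinfo.h>  // POSIX; not available on every platform
#include <string>

// Returns the codeset name reported for the named locale, or an empty
// string if the locale cannot be selected on this system.
std::string codeset_of(const char* locale_name) {
    // Remember the current LC_CTYPE setting so it can be restored.
    const char* previous = std::setlocale(LC_CTYPE, nullptr);
    const std::string saved = previous ? previous : "C";

    if (!std::setlocale(LC_CTYPE, locale_name)) {
        return {};
    }
    // CODESET is looked up in the LC_CTYPE category of the current locale.
    const std::string codeset = nl_langinfo(CODESET);

    std::setlocale(LC_CTYPE, saved.c_str());
    return codeset;
}
```

Note that setlocale changes process-wide state, so this is only suitable for a quick single-threaded check. The name it returns is also not guaranteed to be one that iconv_open accepts.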
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>
Received on 2021-06-23 14:27:06