Re: [SG16] Alternative approach for LWG3565 "Handling of encodings in localized chrono formatting"

From: Corentin Jabot <corentinjabot_at_[hidden]>
Date: Fri, 18 Jun 2021 10:33:51 +0200
On Fri, Jun 18, 2021 at 10:22 AM Peter Brett <pbrett_at_[hidden]> wrote:

> Hi Corentin,
>
>
>
> The requirement to perform transcoding makes me uncomfortable because I
> don’t think it’s actually implementable in the general case.
>
>
>
> Users of the standard library can create customized locale objects with
> bespoke time_put facets, and there is literally no way for the chrono
> formatter to know which codeset a user-specified locale facet is using or
> how to transcode its output.
>
>
>
> Totally happy for you to shoot down my alternative proposal, but I’m
> opposed to the current proposed resolution because std::locale just doesn’t
> work like that.
>

The locale objects themselves do have an encoding (with the assumption that
facets will respect that encoding).
The answer here is P1885, which makes that information
publicly accessible. In the absence of that, implementers have the information.
Well, some of them do (glibc, Microsoft), but indeed on some platforms the
information does not exist because nl_langinfo is not part of the POSIX
spec, so P1885 will report the encoding as unknown there.
Is that an issue?
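
As a rough sketch of what "implementers have the information" can look like
where <langinfo.h> and newlocale are available (glibc is assumed here; the
helper name codeset_of is hypothetical and is not the P1885 interface), the
codeset of a named locale can be queried, with an empty optional standing in
for the "unknown" case:

```cpp
#include <langinfo.h>
#include <locale.h>

#include <optional>
#include <string>

// Hypothetical helper: returns the codeset name the platform associates
// with `locale_name`, or std::nullopt if the locale cannot be created or
// the platform reports no codeset for it.
std::optional<std::string> codeset_of(const char* locale_name) {
    // Create a standalone locale object so the global locale is untouched.
    locale_t loc = newlocale(LC_ALL_MASK, locale_name, locale_t{});
    if (!loc) {
        return std::nullopt;  // unknown or uninstalled locale name
    }
    const char* raw = nl_langinfo_l(CODESET, loc);
    std::string codeset = raw ? raw : "";
    freelocale(loc);
    if (codeset.empty()) {
        return std::nullopt;  // platform has no answer: "unknown"
    }
    return codeset;
}
```

For example, codeset_of("ru_RU.UTF-8") would be expected to yield "UTF-8" on
a system where that locale is installed; the exact strings are
platform-defined.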

My understanding is that the set of scenarios in which

   - There exist both an XXX and an XXX.UTF-8 locale and the implementation
   knows how to go from one to the other
   - The implementation doesn't know the encoding of XXX

is empty or very small.

I think you are right that we probably don't say how custom facets behave
with respect to encodings, but we certainly expect them to behave a certain
way!
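
To make the custom-facet case concrete, here is a minimal sketch of a
user-defined time_put facet. The names custom_weekday_put, make_locale and
weekday_name are hypothetical, and the facet is called directly rather than
through std::format, since how the chrono formatter consults facets is
exactly what is under discussion. It shows that the facet hands back raw
bytes and that nothing in its interface says which encoding they are in:

```cpp
#include <algorithm>
#include <cstddef>
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical facet: overrides only %A for Monday and emits whatever
// bytes its author supplied. The encoding of those bytes is not recorded
// anywhere a caller could query.
class custom_weekday_put : public std::time_put<char> {
public:
    explicit custom_weekday_put(std::string monday_bytes, std::size_t refs = 0)
        : std::time_put<char>(refs), monday_(std::move(monday_bytes)) {}

protected:
    iter_type do_put(iter_type out, std::ios_base& io, char_type fill,
                     const std::tm* t, char format,
                     char modifier) const override {
        if (format == 'A' && t->tm_wday == 1) {
            return std::copy(monday_.begin(), monday_.end(), out);
        }
        // Everything else falls back to the default behaviour.
        return std::time_put<char>::do_put(out, io, fill, t, format, modifier);
    }

private:
    std::string monday_;
};

// Builds a locale that is the classic locale plus the custom facet.
// The locale takes ownership of the facet (refs == 0).
std::locale make_locale(std::string monday_bytes) {
    return std::locale(std::locale::classic(),
                       new custom_weekday_put(std::move(monday_bytes)));
}

// Formats the full weekday name of `t` via the time_put facet of `loc`.
std::string weekday_name(const std::locale& loc, const std::tm& t) {
    std::ostringstream os;
    os.imbue(loc);
    const auto& facet = std::use_facet<std::time_put<char>>(loc);
    facet.put(std::ostreambuf_iterator<char>(os), os, ' ', &t, 'A');
    return os.str();
}
```

Two locales built this way, one fed Windows-1251 bytes and one fed UTF-8
bytes for the same word, are indistinguishable to a caller except by
inspecting the output, which is why the expectation that facets respect the
locale's encoding matters.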


>
>
> Best regards,
>
>
>
> Peter
>
>
>
> *From:* Corentin Jabot <corentinjabot_at_[hidden]>
> *Sent:* 18 June 2021 09:14
> *To:* SG16 <sg16_at_[hidden]>
> *Cc:* Peter Brett <pbrett_at_[hidden]>
> *Subject:* Re: [SG16] Alternative approach for LWG3565 "Handling of
> encodings in localized chrono formatting"
>
>
>
> On Thu, Jun 17, 2021 at 10:57 PM Peter Brett via SG16 <
> sg16_at_[hidden]> wrote:
>
> Hi all,
>
> The current proposed resolution for LWG3565 (https://wg21.link/LWG3565)
> involves transcoding from the locale encoding to UTF-8. This makes me a
> little uncomfortable.
>
>
>
> Can you clarify what makes you uncomfortable?
>
>
>
>
> Is it possible instead to say that, if the string literal encoding is
> UTF-8, then the effective locale is _as if_ the specified or global
> locale was modified by replacing the associated codeset with UTF-8?
>
> So, the following code:
>
> std::locale l1("Russian.1251");
> auto s = std::format(l1, "День недели: {:L}", std::chrono::Monday);
>
> Would behave as if replaced by:
>
> std::locale l1("Russian.1251");
> std::locale l2(l1, std::locale("Russian.UTF-8"), std::locale::time);
> auto s = std::format(l2, "День недели: {:L}", std::chrono::Monday);
>
> This would permit an implementation that has UTF-8 locale data available
> to use it directly, rather than being required to use the 1251 codeset
> locale data and transcode in order to conform to the standard.
>
>
>
> "associated codeset with UTF-8" is not really a thing.
>
> The ".UTF-8" locales merely exist by convention on some platforms
>
>
>
> There is no spec that says that
>
>
>
> * Russian.1251 is not UTF-8
>
> * Russian.1251.UTF-8 exists
>
> * Russian.1251 and Russian.1251.UTF-8 only differ by encoding if both exist
>
>
>
> Transcoding is therefore more generally applicable.
>
>
>
> Note that I have my own reservations about this issue, namely how much
> effort are we willing to put into mending a system that only works for a
> narrow subset of cultures, languages and circumstances?
>
> That being said, even if that issue amounts to putting duct tape over a
> giant crack in the wall, it also doesn't hurt.
>
> It is undoubtedly more correct than the status quo and it might make the
> life of our Windows users a bit less painful as a stopgap solution.
>
>
>
>
> Peter
>
> P.S. How would one go about writing a locale object that customizes
> chrono formatting with std::format? Does anyone have a code sample?
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>
>

Received on 2021-06-18 03:34:04