
SG16


Subject: Re: LWG issue: Time formatters should not be locale sensitive by default
From: Corentin (corentin.jabot_at_[hidden])
Date: 2021-04-29 01:37:15


On Thu, Apr 29, 2021 at 4:51 AM Charlie Barto <Charles.Barto_at_[hidden]>
wrote:

> If we were to adopt the “spirit” of the resolution presented today we’d
> probably end up with four (but really three since the alternate modes for
> the C locale are the same as the non-alternate modes) different “locale
> settings” for some specifiers.
>

The locale applies to the whole date.

> I think that’s actually fine. (as long as we don’t break user’s code)
>
>
>
>
>
> While the POSIX strftime is impossible to use in a locale insensitive
> manner (since locales are global and another thread could change it), ours
> is possible to use in such a way, since you can always just pass
> locale::classic() to format.
>

If you have to mix localized and non localized content, as you can do with
numbers, the solution in C++20 would be

std::format("{:%r} {}", date,
            std::format(std::locale::classic(), "{:%r}", date));

The solution in C++23 would be, using Jens' idea:

std::format("{0:%r} {0:N%r}", date ); // second is not localized

Mixing with numbers:

std::format("{} {:N%r}", 0.0, date); // not localized
std::format("{:L} {:%r}", 0.0, date); // localized

Really easy to understand and not surprising at all. (/s)
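
For readers who want to try the "pass locale::classic()" approach without a
std::format implementation, here is a minimal sketch of the same idea using
std::put_time on a stream imbued with an explicit locale. The helper names
(format_time, format_time_classic, make_time) are made up for illustration,
and std::put_time is a stand-in here, not what the proposed resolution uses.

```cpp
#include <cassert>
#include <ctime>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: format `t` with a strftime-style `fmt`, using an
// explicitly supplied locale instead of the global one.
std::string format_time(const std::tm& t, const std::string& fmt,
                        const std::locale& loc) {
    std::ostringstream out;
    out.imbue(loc);  // the stream's locale, not the global locale, is used
    out << std::put_time(&t, fmt.c_str());
    assert(!out.fail());
    return out.str();
}

// Locale-insensitive variant: always uses the classic "C" locale.
std::string format_time_classic(const std::tm& t, const std::string& fmt) {
    return format_time(t, fmt, std::locale::classic());
}

// Hypothetical helper: build a broken-down time with only the time-of-day set.
std::tm make_time(int hour, int min, int sec) {
    std::tm t{};
    t.tm_hour = hour;
    t.tm_min = min;
    t.tm_sec = sec;
    return t;
}
```

Mixing localized and non-localized output then means calling format_time
twice, once with std::locale() and once with std::locale::classic(), and
concatenating the results, which is the same shape as the C++20 workaround
above.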

>
>
> I like Jens’ idea of fixing it with a new syntax.
>
>
>
> Charlie.
>
>
>
> *From:* Corentin <corentin.jabot_at_[hidden]>
> *Sent:* Wednesday, April 28, 2021 4:56 PM
> *To:* Victor Zverovich <victor.zverovich_at_[hidden]>; Charlie Barto <
> Charles.Barto_at_[hidden]>
> *Cc:* SG16 <sg16_at_[hidden]>
> *Subject:* Re: [SG16] LWG issue: Time formatters should not be locale
> sensitive by default
>
>
>
> I wanted to address the "locale-independent" specifications.
>
> There are *none* in the POSIX strftime spec.
>
>
>
> https://pubs.opengroup.org/onlinepubs/009695399/functions/strftime.html
>
> The strftime() function shall place bytes into the array pointed to by s
> as controlled by the string pointed to by format. The format is a character
> string, beginning and ending in its initial shift state, if any. The format
> string consists of zero or more conversion specifications and ordinary
> characters. A conversion specification consists of a '%' character,
> possibly followed by an E or O modifier, and a terminating conversion
> specifier character that determines the conversion specification's
> behavior. All ordinary characters (including the terminating null byte) are
> copied unchanged into the array. If copying takes place between objects
> that overlap, the behavior is undefined. No more than maxsize bytes are
> placed into the array. Each conversion specifier is replaced by appropriate
> characters as described in the following list. The appropriate characters
> are determined using the LC_TIME category of the current locale and by the
> values of zero or more members of the broken-down time structure pointed to
> by timeptr, as specified in brackets in the description. If any of the
> specified values are outside the normal range, the characters stored are
> unspecified.
>
>
>
> The %O modifiers are an opt-in to the locale's alternative numeral system.
> You might want to have dates with Arabic numerals and names in Hindi, for
> example.
>
> so "1 AM" can be either "१ पूर्वाह्न" or "1 पूर्वाह्न", depending on
> whether you want to use the Devanagari numerals or not.
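
To make the digit substitution concrete, here is a small sketch that swaps
ASCII digits for Devanagari digits (U+0966 to U+096F) in a UTF-8 string. It
only illustrates the effect of an alternative numeral system on already
formatted text; it is not how strftime or any locale facility implements %O,
and the function names are invented for this example.

```cpp
#include <cassert>
#include <string>

// UTF-8 encoding of the Devanagari digit matching an ASCII digit character.
// U+0966..U+096F encode as the bytes E0 A5 A6 .. E0 A5 AF.
std::string devanagari_digit(char c) {
    assert(c >= '0' && c <= '9');
    std::string out = "\xE0\xA5";
    out += static_cast<char>(0xA6 + (c - '0'));
    return out;
}

// Hypothetical helper: when `use_alt` is true, replace every ASCII digit in
// `text` with its Devanagari counterpart; all other bytes are copied as-is.
std::string with_alt_digits(const std::string& text, bool use_alt) {
    std::string out;
    for (char c : text) {
        if (use_alt && c >= '0' && c <= '9') {
            out += devanagari_digit(c);
        } else {
            out += c;
        }
    }
    return out;
}
```

The choice of digits is independent of the rest of the text: the same
surrounding words can be paired with either set of numerals.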
>
>
> See also
>
>
> https://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html#tag_07_03_05
>
> *alt_digits*
>
> Define alternative symbols for digits, corresponding to the %O modified
> conversion specification. [...] The %O modifier shall indicate that the
> string corresponding to the value specified via the conversion
> specification shall be used instead of the value.
>
>
>
> The way this is handled by CLDR (and therefore most programming languages)
> is that the desired numbering system is attached to the locale name, or is
> provided as part of supplementary options.
>
>
>
>
>
> Here is an example using JavaScript (tested locally with Node)
>
>
>
> Notice that
>
> - The numeral system and the formatting are separate concerns
> - Most locales default to Latin digits (but not Arabic in this
> example); I am not exactly sure why
> - Few programming languages offer a per specifier choice of numbering
> systems, these things are not usually mixed.
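
As a rough sketch of what "attached to the locale name" means: Unicode locale
identifiers can carry the numbering system in a "-u-nu-<system>" extension,
for example "hi-IN-u-nu-deva". The function below extracts that value from a
hyphen-separated, lowercase-extension tag. It is a simplified illustration
with an invented name, not a full BCP 47 parser, and it is not the JavaScript
example referred to above.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: return the value of the "nu" key inside the "u"
// extension of a locale tag, or an empty string if there is none.
std::string numbering_system(const std::string& tag) {
    // Sketch-level precondition: a tag contains no spaces.
    assert(tag.find(' ') == std::string::npos);

    // Split the tag into its hyphen-separated subtags.
    std::vector<std::string> parts;
    std::istringstream in(tag);
    for (std::string part; std::getline(in, part, '-');) {
        parts.push_back(part);
    }

    bool in_u_extension = false;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (parts[i].size() == 1) {
            // A one-letter subtag starts a new extension; only "u" matters.
            in_u_extension = (parts[i] == "u");
            continue;
        }
        if (in_u_extension && parts[i] == "nu" && i + 1 < parts.size()) {
            return parts[i + 1];
        }
    }
    return "";
}
```

With this, "ar-EG-u-nu-latn" yields "latn" and a plain "en-US" yields an
empty string, meaning the locale's default numbering system applies.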
>
>
>
> Now whether the %O specifier of POSIX makes sense or not is an interesting
> question, but I wanted to point out that the %O specifiers are no more and
> no less locale-dependent than the other specifiers.
>
>
>
> {:L%u} formats a week day number using the locale's primary numeral system
>
> {:L%Ou} formats a week day number using the locale's alternative numeral
> system
>
>
>
> What if you pass the C locale?
>
> Well, the C locale's primary numeral system is Arabic numerals; it does not
> have an alternative numeral system.
>
>
>
> In all cases, it does what it says it does.
>
>
>
> Sorry I didn't catch that concern during the meeting.
>
> *I hope you will reconsider the second poll as we clearly missed some
> pretty critical information! *
>
>
>
>
>
>
>
> PS:
>
> You will notice that this brings more questions than it answers.
>
> What if the global locale uses a non-Arabic numeral system? What is the
> default numeral system? Why is there a primary and an alternative? What if
> you need a third?
>
> Why does time formatting care about that when none of the other locale
> facilities seem to?
>
>
>
> But this is clearly out of scope of this issue!
>
>
>
>
>
> More references
>
>
>
> http://cldr.unicode.org/translation/-core-data/numbering-systems
>
>
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/NumberFormat/NumberFormat
>
> https://lh.2xlibre.net/values/alt_digits/
>
> https://unicode-org.github.io/icu/userguide/locale/
>
>
>





SG16 list run by sg16-owner@lists.isocpp.org