SG16

Subject: Re: LWG issue: Time formatters should not be locale sensitive by default
From: Daniel Krügler (daniel.kruegler_at_[hidden])
Date: 2021-04-29 00:39:38


On Thu, 29 Apr 2021 at 01:56, Corentin via SG16
<sg16_at_[hidden]> wrote:
>
> I wanted to address the "locale-independent" specifications.
> There are *none* in the POSIX strftime spec.

Corentin, if your intention was to update the issue content, it is
completely unclear to me what precisely you wish to change. If you
want to discuss the issue without making changes to it, I would
suggest keeping the lwgchair address out of the discussion, if
possible. Thanks!

- Daniel

> https://pubs.opengroup.org/onlinepubs/009695399/functions/strftime.html
> The strftime() function shall place bytes into the array pointed to by s as controlled by the string pointed to by format. The format is a character string, beginning and ending in its initial shift state, if any. The format string consists of zero or more conversion specifications and ordinary characters. A conversion specification consists of a '%' character, possibly followed by an E or O modifier, and a terminating conversion specifier character that determines the conversion specification's behavior. All ordinary characters (including the terminating null byte) are copied unchanged into the array. If copying takes place between objects that overlap, the behavior is undefined. No more than maxsize bytes are placed into the array. Each conversion specifier is replaced by appropriate characters as described in the following list. The appropriate characters are determined using the LC_TIME category of the current locale and by the values of zero or more members of the broken-down time structure pointed to by timeptr, as specified in brackets in the description. If any of the specified values are outside the normal range, the characters stored are unspecified.
>
> The %O modifiers are an opt-in to the locale's alternative numeral system.
> You might want to have dates with Arabic numerals and names in Hindi, for example.
>
> So "1 AM" can be either "१ पूर्वाह्न" or "1 पूर्वाह्न", depending on whether you want to use Devanagari numerals or not.
>
>
> See also
> https://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html#tag_07_03_05
> alt_digits: Define alternative symbols for digits, corresponding to the %O modified conversion specification. [...] The %O modifier shall indicate that the string corresponding to the value specified via the conversion specification shall be used instead of the value.
>
> The way this is handled by CLDR (and therefore by most programming languages) is that the desired numbering system is attached to the locale name or is provided as part of supplementary options.
>
>
> Here is an example using JavaScript (tested locally with Node)
>
>
> Notice that:
>
> The concerns of numeral system and formatting are separate.
> Most locales default to Latin digits (but not Arabic in this example); I am not exactly sure why.
> Few programming languages offer a per-specifier choice of numbering system; these things are not usually mixed.
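
To illustrate the first point in the list above (numeral system as a concern separate from formatting), here is a minimal C++ sketch. It is not how CLDR, ICU, or any standard library implements numbering systems; `to_devanagari_digits` is a hypothetical helper introduced only for this illustration. It assumes UTF-8 encoded strings and simply swaps ASCII digits for Devanagari digits after a string has already been formatted.

```cpp
#include <string>

// Hypothetical helper (illustration only, not a library API):
// replace every ASCII digit in a UTF-8 string with the matching
// Devanagari digit. All other bytes are copied unchanged, so text
// that is already formatted (names, separators) is left alone.
std::string to_devanagari_digits(const std::string& text) {
    static const char* const digits[10] = {
        "०", "१", "२", "३", "४", "५", "६", "७", "८", "९"
    };
    std::string out;
    for (char c : text) {
        if (c >= '0' && c <= '9') {
            out += digits[c - '0'];  // multi-byte UTF-8 replacement
        } else {
            out += c;                // includes bytes of non-ASCII characters
        }
    }
    return out;
}
```

Applied to the two spellings of "1 AM" quoted earlier, the helper turns "1 पूर्वाह्न" into "१ पूर्वाह्न" without touching the day-period name, which is the sense in which the two choices are independent.
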
>
>
> Now whether the %O modifier of POSIX makes sense or not is an interesting question, but I wanted to point out that the %O specifiers are no more and no less locale-dependent than the other specifiers.
>
> {:L%u} formats a weekday number using the locale's primary numeral system
> {:L%Ou} formats a weekday number using the locale's alternative numeral system
>
> What if you pass the C locale?
> Well, the C locale's primary numeral system is Arabic numerals, and it does not have an alternative numeral system.
>
> In all cases, it does what it says it does.
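
The C-locale case described above can be checked with plain `strftime`, which is the function the quoted POSIX text specifies. The sketch below is an illustration under stated assumptions, not part of the original message: `format_tm` and `tm_for_iso_weekday` are hypothetical helpers, and the example covers only `strftime`, not the `{:L%u}` and `{:L%Ou}` chrono format specifiers themselves.

```cpp
#include <clocale>
#include <ctime>
#include <string>

// Hypothetical helper: format a broken-down time with strftime and
// return the result as a std::string (empty if it does not fit).
std::string format_tm(const char* fmt, const std::tm& t) {
    char buf[64];
    std::size_t n = std::strftime(buf, sizeof buf, fmt, &t);
    return std::string(buf, n);
}

// Hypothetical helper: build a std::tm whose weekday matches the given
// ISO 8601 weekday number (1 = Monday ... 7 = Sunday). Only tm_wday is
// set, which is the only field %u and %Ou read.
std::tm tm_for_iso_weekday(int iso_weekday) {
    std::tm t{};
    t.tm_wday = iso_weekday % 7;  // tm_wday counts from Sunday = 0
    return t;
}
```

After `std::setlocale(LC_TIME, "C")`, both `format_tm("%u", tm_for_iso_weekday(3))` and `format_tm("%Ou", tm_for_iso_weekday(3))` yield "3": the C locale defines no alternative digits, so the O modifier has nothing to substitute.
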
>
> Sorry I didn't catch that concern during the meeting.
> I hope you will reconsider the second poll as we clearly missed some pretty critical information!
>
>
>
> PS:
> You will notice that this brings more questions than it answers.
> What if the global locale uses a non-Arabic numeral system? What is the default numeral system? Why is there a primary and an alternative? What if you need a third?
> Why does time formatting care about that when none of the other locale facilities seem to?
>
> But this is clearly out of scope of this issue!
>
>
> More references
>
> http://cldr.unicode.org/translation/-core-data/numbering-systems
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/NumberFormat/NumberFormat
> https://lh.2xlibre.net/values/alt_digits/
> https://unicode-org.github.io/icu/userguide/locale/
>
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16

