Re: [SG16-Unicode] Hidden locale dependency in [time.duration.io]?

From: Jean-Marc Bourguet <jm_at_[hidden]>
Date: Mon, 04 Nov 2019 09:57:32 +0100
On 04.11.2019 09:45, Tom Honermann wrote:
> On 11/4/19 7:18 AM, Howard Hinnant wrote:
>> On Nov 4, 2019, at 12:27 AM, Tom Honermann <tom_at_[hidden]> wrote:
>>> I suggest the following wording: (using terminology from P1859R0)
>>>
>>> If Period::type is micro, but the character U+00B5 <del>cannot be
>>> represented in the encoding used</del><ins>lacks representation in
>>> the execution character set</ins> for charT, the unit suffix "us" is
>>> used instead of "μs". <ins>If
>>> "μs" is used but the dynamic encoding lacks representation for U+00B5
>>> and the stream is associated with a terminal or console, or if the
>>> stream is imbued with a std::codecvt facet that lacks conversion
>>> support for the character, then the result is unspecified.</ins>
>>>
>> I’ve no objection to an issue, but your proposed wording explicitly
>> involves two things I’m strongly against:
>>
>> 1. Now the code has to check the locale, for this precision only.
>>
>> 2. Now the code has different behavior between cout and
>> ostringstream. And the result of ostringstream is very commonly
>> subsequently sent to cout (ostringstream is a common formatting aid).
>>
>> Imo, the proposed wording is much, much worse than the status-quo and
>> I would vote strongly against it.
>
> No, the wording I proposed doesn't check for locale. The execution
> character set is the character set used for string literals and is
> known at compile time; it is not the locale-dependent run-time
> character set.

[lex.charset]/3 states:

     The values of the members of the execution character sets and the
     sets of additional members are locale-specific.

apparently making the execution character sets run-time dependent.

But [lex.ccon]/2 states:

     An ordinary character literal that contains a single c-char
     representable in the execution character set has type char, with
     value equal to the numerical value of the encoding of the c-char in
     the execution character set.

apparently making it fixed.
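
To illustrate the second reading, here is a minimal sketch, assuming a hosted implementation. The names letter and literal_survives_locale_change are hypothetical and chosen only for this example. It shows that the value of an ordinary character literal is usable in a constant expression, and that a later call to std::setlocale does not alter it. It does not settle which of the two paragraphs above is the intended one.

```cpp
#include <clocale>

// The value of an ordinary character literal is determined during
// translation, so it can appear in a constant expression.
constexpr char letter = 'a';
static_assert(letter == 'a', "literal value is fixed at compile time");

// Hypothetical helper: compares the literal before and after an attempt
// to switch to the user's preferred locale.
bool literal_survives_locale_change() {
    const char before = 'a';
    // std::setlocale returns a null pointer if the request cannot be
    // honored; either way the literal below keeps its translation-time value.
    std::setlocale(LC_ALL, "");
    const char after = 'a';
    return before == after && after == letter;
}
```

Whatever varies with the locale at run time (for example how multibyte sequences are interpreted by the library) is therefore something other than the values the compiler assigned to literals.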

I have not looked into this in more depth to see which of the two
interpretations is the more pervasive in the standard.

Yours,

-- Jean-Marc Bourguet

Received on 2019-11-04 10:33:33