Re: [SG16-Unicode] Hidden locale dependency in [time.duration.io]?

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 4 Nov 2019 10:08:27 +0000
On 11/4/19 10:05 AM, Howard Hinnant wrote:
> On Nov 4, 2019, at 9:59 AM, Tom Honermann <tom_at_[hidden]> wrote:
>> On 11/4/19 9:03 AM, Howard Hinnant wrote:
>>> On Nov 4, 2019, at 8:45 AM, Tom Honermann <tom_at_[hidden]> wrote:
>>>> On 11/4/19 7:18 AM, Howard Hinnant wrote:
>>>>> On Nov 4, 2019, at 12:27 AM, Tom Honermann <tom_at_[hidden]> wrote:
>>>>>> I suggest the following wording: (using terminology from P1859R0)
>>>>>>
>>>>>> If Period::type is micro, but the character U+00B5 <del>cannot be represented in the encoding used</del><ins>lacks representation in the execution character set</ins> for charT, the unit suffix "us" is used instead of "μs". <ins>If
>>>>>> "μs" is used but the dynamic encoding lacks representation for U+00B5 and the stream is associated with a terminal or console, or if the stream is imbued with a std::codecvt facet that lacks conversion support for the character, then the result is unspecified.</ins>
>>>>>>
>>>>> I’ve no objection to an issue, but your proposed wording explicitly involves two things I’m strongly against:
>>>>>
>>>>> 1. Now the code has to check the locale, for this precision only.
>>>>>
>>>>> 2. Now the code has different behavior between cout and ostringstream. And the result of ostringstream is very commonly subsequently sent to cout (ostringstream is a common formatting aid).
>>>>>
>>>>> Imo, the proposed wording is much, much worse than the status-quo and I would vote strongly against it.
>>>> No, the wording I proposed doesn't check for locale. The execution character set is the character set used for string literals and is known at compile time; it is not the locale dependent run-time character set.
>>> Here is the processed form of what you wrote (the deletes deleted, the inserts inserted):
>>>
>>> If Period::type is micro, but the character U+00B5 lacks representation in the execution character set for charT, the unit suffix "us" is used instead of "μs". If "μs" is used but the dynamic encoding lacks representation for U+00B5 and the stream is associated with a terminal or console, or if the stream is imbued with a std::codecvt facet that lacks conversion support for the character, then the result is unspecified.
>>>
>>> The phrase “or if the stream is imbued with a std::codecvt facet that…” implies that the implementation gets the locale of the stream, extracts the codecvt facet from it, and does something with it.
>> That isn't what I intended. My intent was to state that the behavior is unspecified in those cases to make it clear that the implementation does *not* have to consult the locale.
>>> I do not believe the streaming of durations of any precision should involve the stream’s locale.
>> I strongly agree.
>>> For microseconds precision the suffix should be “μs”, but at the vendor’s discretion may be “us” instead.
>>>
>>> I’m open to better ways of saying the sentence above. The above sentence isn’t (and shouldn’t be) stream-dependent or locale-dependent. It should not involve properties of the codecvt facet.
>> I think we are in agreement and just struggling with wording.
> Are you in Belfast? It might be useful to bring this issue up in LWG, pending LWG scheduling restrictions of course.

I am. I'll file an LWG issue with proposed wording that attempts to
specify the intent above and see if Marshall wants to schedule it for
this week.

Tom.

>
> Howard
>
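
The position both participants converge on (the suffix depends only on the Period type and a choice the vendor makes at build time, never on the stream or its locale) might be sketched roughly as follows. This is an illustration only, not the wording or any library's implementation: `literal_encoding_has_micro_sign`, `unit_suffix`, `format_duration`, and `write_duration` are hypothetical names, the UTF-8 bytes for U+00B5 are an assumed encoding, and the count is formatted with `std::to_string` purely to keep the example short.

```cpp
#include <cassert>
#include <chrono>
#include <ostream>
#include <ratio>
#include <sstream>
#include <string>
#include <type_traits>

// Assumption (hypothetical name): a build-time switch the vendor sets when the
// encoding used for char string literals can represent U+00B5 MICRO SIGN.
constexpr bool literal_encoding_has_micro_sign = false;

// Hypothetical helper: picks the unit suffix from the Period type alone.
// No stream, locale, or codecvt facet is consulted.
template <class Period>
std::string unit_suffix() {
    using P = typename Period::type;
    if constexpr (std::is_same_v<P, std::micro>) {
        // "\xC2\xB5" is U+00B5 encoded as UTF-8 (assumed encoding for this sketch).
        return literal_encoding_has_micro_sign ? "\xC2\xB5s" : "us";
    } else if constexpr (std::is_same_v<P, std::milli>) {
        return "ms";
    } else if constexpr (std::is_same_v<P, std::nano>) {
        return "ns";
    } else if constexpr (std::is_same_v<P, std::ratio<1>>) {
        return "s";
    } else if constexpr (P::den == 1) {
        return "[" + std::to_string(P::num) + "]s";
    } else {
        return "[" + std::to_string(P::num) + "/" + std::to_string(P::den) + "]s";
    }
}

// Formats the count followed by the suffix, independent of any stream.
template <class Rep, class Period>
std::string format_duration(std::chrono::duration<Rep, Period> d) {
    return std::to_string(d.count()) + unit_suffix<Period>();
}

// Writes the already-formatted text; os.getloc() is never called here.
template <class Rep, class Period>
void write_duration(std::ostream& os, std::chrono::duration<Rep, Period> d) {
    os << format_duration(d);
}

// Usage check: text built in an ostringstream is identical to the text that
// would be written directly to any other ostream.
inline void check_same_text() {
    std::ostringstream via_stream;
    write_duration(via_stream, std::chrono::microseconds(42));
    assert(via_stream.str() == format_duration(std::chrono::microseconds(42)));
}
```

Because the choice between the two suffixes is made once at compile time, text produced through an ostringstream and later forwarded to cout is the same text that writing to cout directly would produce, which is the property Howard's second objection is about.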

Received on 2019-11-04 11:08:32