sg16: Re: [SG16-Unicode] Hidden locale dependency in [time.duration.io]?

From: Steve Downey <sdowney_at_[hidden]>
Date: Mon, 4 Nov 2019 09:40:01 +0000

I believe the wording around locale is merely warning that if μs isn't
supported by the locale associated with a stream, then the results are
unspecified, which is true, but unhelpful, and probably does not need to be
in the normative wording for this.

I'm unaware of any implementation that supports checking whether string
literals are actually encodable. All implementations are required to at least
track \u00b5 until literals are encoded. This sounds like an implementation
that supports targeting non-Unicode encodings of literals, such as MSVC, will
have to use "us".
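
To make the compile-time nature of that choice concrete, here is a minimal
sketch. It assumes a vendor-controlled switch; the macro
HYPOTHETICAL_NARROW_HAS_MICRO and the function micro_suffix are names made up
for illustration, not anything a real compiler or library provides. The sketch
also assumes the wide literal encoding can represent U+00B5.

```cpp
#include <string>

// Hypothetical vendor switch (an assumption, not a real macro): set to 1 when
// the narrow literal encoding is known to represent U+00B5.
#ifndef HYPOTHETICAL_NARROW_HAS_MICRO
#define HYPOTHETICAL_NARROW_HAS_MICRO 0
#endif

// Hypothetical helper: returns the microseconds unit suffix for CharT.
// The choice is fixed when the library is built; no stream or locale is used.
template <class CharT>
const CharT* micro_suffix();

template <>
inline const char* micro_suffix<char>() {
#if HYPOTHETICAL_NARROW_HAS_MICRO
    return "\u00b5s";  // only reached when the vendor says U+00B5 is encodable
#else
    return "us";       // ASCII fallback
#endif
}

template <>
inline const wchar_t* micro_suffix<wchar_t>() {
    // Assumes the wide literal encoding (UTF-16 or UTF-32) has U+00B5.
    return L"\u00b5s";
}
```

With the switch left at 0, micro_suffix<char>() yields "us" regardless of
which stream the result is later written to.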

On Mon, Nov 4, 2019 at 9:03 AM Howard Hinnant <howard.hinnant_at_[hidden]>
wrote:

>
> On Nov 4, 2019, at 8:45 AM, Tom Honermann <tom_at_[hidden]> wrote:
> >
> > On 11/4/19 7:18 AM, Howard Hinnant wrote:
> >> On Nov 4, 2019, at 12:27 AM, Tom Honermann <tom_at_[hidden]> wrote:
> >>> I suggest the following wording: (using terminology from P1859R0)
> >>>
> >>> If Period::type is micro, but the character U+00B5 <del>cannot be
> represented in the encoding used</del><ins>lacks representation in the
> execution character set</ins> for charT, the unit suffix "us" is used
> instead of "μs". <ins>If
> >>> "μs" is used but the dynamic encoding lacks representation for U+00B5
> and the stream is associated with a terminal or console, or if the stream
> is imbued with a std::codecvt facet that lacks conversion support for the
> character, then the result is unspecified.</ins>
> >>>
> >> I’ve no objection to an issue, but your proposed wording explicitly
> involves two things I’m strongly against:
> >>
> >> 1. Now the code has to check the locale, for this precision only.
> >>
> >> 2. Now the code has different behavior between cout and
> ostringstream. And the result of ostringstream is very commonly
> subsequently sent to cout (ostringstream is a common formatting aid).
> >>
> >> Imo, the proposed wording is much, much worse than the status-quo and I
> would vote strongly against it.
> >
> > No, the wording I proposed doesn't check for locale. The execution
> character set is the character set used for string literals and is known at
> compile time; it is not the locale dependent run-time character set.
>
>
> Here is the processed form of what you wrote (the deletes deleted, the
> inserts inserted):
>
> If Period::type is micro, but the character U+00B5 lacks representation
> in the execution character set for charT, the unit suffix "us" is used
> instead of "μs". If "μs" is used but the dynamic encoding lacks
> representation for U+00B5 and the stream is associated with a terminal or
> console, or if the stream is imbued with a std::codecvt facet that lacks
> conversion support for the character, then the result is unspecified.
>
> The phrase "or if the stream is imbued with a std::codecvt facet that…"
> implies that the implementation gets the locale of the stream, extracts the
> codecvt facet from it, and does something with it.
>
> I do not believe the streaming of durations of any precision should
> involve the stream’s locale.
>
> For microseconds precision the suffix should be “μs”, but at the vendor’s
> discretion may be “us” instead.
>
> I’m open to better ways of saying the sentence above. The above sentence
> isn’t (and shouldn’t be) stream-dependent or locale-dependent. It should
> not involve properties of the codecvt facet.
>
> Howard
>
>
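
The ostringstream-then-cout concern can be sketched as follows. This is only
an illustration under stated assumptions: format_micros is a hypothetical
helper, not the standard operator<< for durations, and the "us" suffix is
hard-coded to stand in for whatever the vendor chose at build time. The point
it tries to show is that the suffix does not depend on the destination stream.

```cpp
#include <chrono>
#include <sstream>
#include <string>

// Hypothetical helper: formats a microseconds count followed by a fixed
// suffix. The suffix is chosen without calling getloc() or inspecting any
// codecvt facet, so it is the same for every destination stream.
inline std::string format_micros(std::chrono::microseconds d) {
    std::ostringstream os;       // common formatting aid
    os << d.count() << "us";     // suffix fixed at build time (assumed "us")
    return os.str();
}
```

Because the suffix is decided before any particular stream is involved, the
string produced here carries the same suffix whether it is later sent to
cout, a file, or another ostringstream.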

Received on 2019-11-04 10:40:15