Re: [SG16-Unicode] Hidden locale dependency in [time.duration.io]?

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 4 Nov 2019 10:06:05 +0000
On 11/4/19 9:40 AM, Steve Downey wrote:
> I believe the wording around locale is merely warning that if μs isn't
> supported by the locale associated with a stream, then the results are
> unspecified, which is true, but unhelpful, and probably does not need
> to be in the normative wording for this.
I think it is helpful to make it clear that the implementation does not
(should not) make such cases "work".
>
> I'm unaware of any implementation that supports checking if string
> literals are actually encodable. All implementations are required to
> at least track \u00b5 until literals are encoded. This sounds like an
> implementation that supports targeting non-Unicode encodings of
> literals, such as MSVC, will have to use "us".

I believe gcc at least will warn in cases where the source encoding and
execution encoding are not the same.

I would argue that MSVC can use "μs" when compiling with the
/execution-charset:utf-8 or /utf-8 options (implicitly or explicitly)
enabled.
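
As a rough sketch of that idea (not taken from any implementation; the
helper names micro_sign_is_utf8, micro_suffix, and format_micros are
hypothetical), the suffix can be chosen from how the compiler encoded the
narrow literal, without consulting the stream or its locale. It
simplifies "representable in the execution character set" to "encoded as
UTF-8", and assumes the compiler accepts "\u00b5" in a narrow literal at
all:

```cpp
#include <chrono>
#include <sstream>
#include <string>

// Hypothetical helper: true when the compiler encoded the narrow literal
// "\u00b5" as the two UTF-8 code units 0xC2 0xB5 (plus the terminating NUL).
// Assumption: this stands in for "U+00B5 is representable in the
// execution character set"; other encodings (e.g. Latin-1) are ignored.
constexpr bool micro_sign_is_utf8() {
    return sizeof("\u00b5") == 3
        && static_cast<unsigned char>("\u00b5"[0]) == 0xC2
        && static_cast<unsigned char>("\u00b5"[1]) == 0xB5;
}

// Hypothetical helper: the unit suffix is fixed at compile time and never
// looks at a stream, a locale, or a codecvt facet.
inline const char* micro_suffix() {
    return micro_sign_is_utf8() ? "\u00b5s" : "us";
}

// Hypothetical helper: formats a microseconds count followed by the suffix.
// The same bytes are produced whichever stream they are later written to.
inline std::string format_micros(std::chrono::microseconds d) {
    std::ostringstream os;
    os << d.count() << micro_suffix();
    return os.str();
}
```

With this shape, the text built in an ostringstream and the text sent to
cout carry the same suffix; whether a terminal can display it is a
separate question.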

Tom.

>
> On Mon, Nov 4, 2019 at 9:03 AM Howard Hinnant
> <howard.hinnant_at_[hidden] <mailto:howard.hinnant_at_[hidden]>> wrote:
>
>
> On Nov 4, 2019, at 8:45 AM, Tom Honermann <tom_at_[hidden]
> <mailto:tom_at_[hidden]>> wrote:
> >
> > On 11/4/19 7:18 AM, Howard Hinnant wrote:
> >> On Nov 4, 2019, at 12:27 AM, Tom Honermann <tom_at_[hidden]
> <mailto:tom_at_[hidden]>> wrote:
> >>> I suggest the following wording: (using terminology from P1859R0)
> >>>
> >>> If Period::type is micro, but the character U+00B5
> <del>cannot be represented in the encoding used</del><ins>lacks
> representation in the execution character set</ins> for charT, the
> unit suffix "us" is used instead of "μs". <ins>If
> >>> "μs" is used but the dynamic encoding lacks representation for
> U+00B5 and the stream is associated with a terminal or console, or
> if the stream is imbued with a std::codecvt facet that lacks
> conversion support for the character, then the result is
> unspecified.</ins>
> >>>
> >> I’ve no objection to an issue, but your proposed wording
> explicitly involves two things I’m strongly against:
> >>
> >> 1. Now the code has to check the locale, for this precision only.
> >>
> >> 2. Now the code has different behavior between cout and
> ostringstream. And the result of ostringstream is very commonly
> subsequently sent to cout (ostringstream is a common formatting aid).
> >>
> >> Imo, the proposed wording is much, much worse than the
> status-quo and I would vote strongly against it.
> >
> > No, the wording I proposed doesn't check for locale. The
> execution character set is the character set used for string
> literals and is known at compile time; it is not the locale
> dependent run-time character set.
>
>
> Here is the processed form of what you wrote (the deletes deleted,
> the inserts inserted):
>
> If Period::type is micro, but the character U+00B5 lacks
> representation in the execution character set for charT, the unit
> suffix "us" is used instead of "μs". If "μs" is used but the
> dynamic encoding lacks representation for U+00B5 and the stream is
> associated with a terminal or console, or if the stream is imbued
> with a std::codecvt facet that lacks conversion support for the
> character, then the result is unspecified.
>
> The phrase "or if the stream is imbued with a std::codecvt facet
> that…" implies that the implementation gets the locale of the
> stream, extracts the codecvt facet from it, and does something
> with it.
>
> I do not believe the streaming of durations of any precision
> should involve the stream’s locale.
>
> For microseconds precision the suffix should be “μs”, but at the
> vendor’s discretion may be “us” instead.
>
> I’m open to better ways of saying the sentence above. The above
> sentence isn’t (and shouldn’t be) stream-dependent or
> locale-dependent. It should not involve properties of the codecvt facet.
>
> Howard
>


Received on 2019-11-04 11:06:09