Re: [SG16] Skipped polls from today's SG16 meeting

From: Hubert Tong <hubert.reinterpretcast_at_[hidden]>
Date: Thu, 27 May 2021 00:49:51 -0400
On Wed, May 26, 2021 at 5:01 PM Peter Brett via SG16 <sg16_at_[hidden]>
wrote:

> Hi all,
>
> I prepared these 2 polls for tonight's meeting, but we did not have the
> opportunity to vote on them. I hope we will be able to vote on them next
> time. Please suggest alternative or additional polls that you think would
> be informative!
>
> Peter
>
>
> If, and only if, the literal encoding is UTF-8, <print> facilities should
> assume that their formatted results are UTF-8 text.
>

Specific nit about the poll: "<print> facilities" may be too broad. Perhaps
the scope should only cover the "default user interface" to the "<print>
facilities". That is, the scope probably should not cover the "non-Unicode"
functions.

General comments about presentation:
The poll would support a better-informed consensus if it were presented
alongside what it does mean (the alternatives rejected by adopting this
direction) and what it does not mean (to emphasize and justify the
special-casing). In particular, this direction rejects the use of the
locale in determining what a "plain" string's encoding is taken to be.
Also, this wording does not go so far as to say that the literal encoding
is, in general, a good indicator of the encoding of the formatted result.

Additionally, under the status quo, agreeing to this direction amounts to
saying that non-UTF-8 locales should not be used with the "<print>
facilities" when the literal encoding is UTF-8 and locale-specific forms
are requested. I believe we should poll that question separately first.

Specific comment about the direction: I believe the point made during the
call about compile-time checks is a strong point for using the literal
encoding (as opposed to the locale). As for the special-casing, my current
thoughts are that the literal encoding is not a great indicator either, but
using it in the case of UTF-8 is less objectionable than using it more
generally.
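
To illustrate the compile-time angle, here is a minimal sketch, assuming
C++14 or later, of how a library could test whether the ordinary literal
encoding is UTF-8 by inspecting the bytes of a literal. The helper names
are hypothetical and are not taken from any <print> proposal; the probe
character U+00B5 is an arbitrary choice whose UTF-8 encoding is the two
bytes 0xC2 0xB5.

```cpp
#include <cstddef>

// Hypothetical helper: true if `bytes` (including the terminating NUL)
// holds exactly the UTF-8 encoding of U+00B5 MICRO SIGN, i.e. 0xC2 0xB5.
constexpr bool is_utf8_micro_sign(const unsigned char* bytes,
                                  std::size_t size_with_nul) {
    return size_with_nul == 3 && bytes[0] == 0xC2 && bytes[1] == 0xB5;
}

// Hypothetical helper: probes the ordinary literal encoding chosen by the
// compiler. Assumes U+00B5 is representable in that encoding.
constexpr bool literal_encoding_is_utf8() {
    constexpr unsigned char probe[] = "\u00B5";
    return is_utf8_micro_sign(probe, sizeof(probe));
}

// The answer is a constant expression, so it can select behavior at
// compile time (for example, which code path a print function takes).
constexpr bool assume_utf8_output = literal_encoding_is_utf8();
```

Because the result is a constant expression, it can feed a static_assert
or an `if constexpr` branch. The locale is a run-time property and cannot
be used that way.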

As for locale-specific forms from non-Unicode locales, I do think it would
be nice to have the transcoding occur in the UTF-8 literal encoding case. I
think we need to understand whether the necessary converter may be
"missing", and what happens if it is.


>
> SF F N A SA
>
> Attendance:
>
> Consensus:
>
> Author's position:
>
>
> If a <print> facility assumes that the result of formatting is UTF-8 text,
> but it is not, then the program is ill-formed, no diagnostic required.
>

I'm inclined to support something like this, but I do note that this could
actually have the effect that users or communities would avoid using UTF-8
as their literal encoding just to escape this UB.


>
> SF F N A SA
>
> Attendance:
>
> Consensus:
>
> Author's position:
>

Received on 2021-05-26 23:50:24