sg16: Re: [SG16] P1885 polling

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Thu, 23 Sep 2021 12:17:40 +0200

On 23/09/2021 11.55, Corentin wrote:
>
>
> On Thu, Sep 23, 2021 at 8:21 AM Jens Maurer <Jens.Maurer_at_[hidden]> wrote:
>
> On 23/09/2021 07.21, Corentin via SG16 wrote:
> > I would like to know if you have sustained objections such that you do not want to see this paper polled, because that's currently not clear to me.
>
> At this time (thanks Hubert for the digging), I think the normative wording
> is sufficiently unclear in its intent that I'm strongly opposed to forwarding
> this paper to LWG.
>
> Maybe some notes or "Recommended practice" sections would help convey what
> we want implementations to do, if we can't describe that normatively with
> sufficient precision.
>
>
> Recommended practice: Implementations should return a value that represents an
> encoding whose code unit size matches the size of a single wchar_t.

But, apparently there are very few of those.

If we want to stick to the IANA table, would it be a better direction to
say that "EBCDIC-US" can be used as both a narrow and wide encoding, with
the understanding that the wide encoding is (trivially) established from
the (specified) narrow encoding by taking the (unsigned) numerical values
of the narrow encoding and using those as the values of the wide encoding?
(We could even say so in the normative wording.)
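
As a minimal sketch of that derivation, assuming a plain std::string_view
input (the helper name widen_trivially is made up for illustration and is
not part of the paper): each narrow code unit is read as an unsigned value
and that value is used unchanged as the wide code unit.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, for illustration only: derive a wide string from a
// narrow one by reusing the unsigned numerical value of each narrow code unit.
std::wstring widen_trivially(std::string_view narrow) {
    std::wstring wide;
    wide.reserve(narrow.size());
    for (char c : narrow) {
        // Go through unsigned char first so that code units above 0x7F
        // (common in EBCDIC) are not sign-extended.
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(c)));
    }
    return wide;
}
```

With this, the narrow code unit 0xC1 becomes the wide code unit 0xC1, whatever
character the narrow encoding assigns to it.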

All the query functions are named so that narrow/wide is differentiated;
the mib id numbers would not represent that differentiation. That seems
an ok trade-off to me.
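
To illustrate the trade-off, here is a rough sketch with hypothetical names
(mib_id and encoding_queries are stand-ins, not the paper's interface). The
value 2078 is, as far as I recall, the IANA MIBenum for EBCDIC-US. The
narrow/wide distinction lives only in the function name; the returned id is
the same.

```cpp
#include <cstdint>

// Hypothetical stand-in for a registry identifier; not the paper's actual type.
enum class mib_id : std::int32_t {
    ebcdic_us = 2078  // assumed IANA MIBenum for EBCDIC-US
};

// Hypothetical query interface: the function name says narrow or wide,
// the id it returns does not.
struct encoding_queries {
    static constexpr mib_id narrow_literal() { return mib_id::ebcdic_us; }
    static constexpr mib_id wide_literal() { return mib_id::ebcdic_us; }
};
```

A caller who needs to know whether the wide form is meant has to remember
which query produced the id.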

> > If so, I would like to know what direction you would like this paper to take.
> >
> > * We already made the wording as wide as possible, because it was always the intent of this paper to be on a best effort basis (I do not think a perfect solution can be found). I do believe the wording matches the intent sufficiently, please let me know if you think that's still not the case.
>
> See my separate e-mail. I can't divine the intent from the wording right now.
>
> > * We can remove wide methods. I'd argue that, at the very least, it's still useful for users to distinguish the few known and well-paved scenarios from everything else, so that, for example, a user who expects UTF-32 on POSIX can check for that. Returning something like "x-ISO8859-1" is also useful for introspection, even if by definition this is very much non-portable.
>
> I'd guess that Hubert has situations where some wide-EBCDIC encoding is used.
> Also, it feels asymmetric to talk about just narrow encodings, but not wide
> ones.
>
>
> Agreed.
> But I'd rather... find a way to move forward?
>
>
> > * We can stop pursuing this paper.
>
> * We can divorce ourselves from the obviously broken IANA registry
> (possibly just rely on their "character set definitions", but not claim
> those are actual encoding designations)
>
>
> "Obviously broken" is a rather big claim in the absence of suitable alternatives.
> I believe the use of the IANA registry is motivated by the paper and previous polls.

I think the polls are not sufficiently precise to argue for the case
that what the IANA table describes (by implication) as a narrow encoding
cannot be re-used to designate a wide encoding trivially derived from the
narrow encoding.

> I did, however, modify the paper to use more correct terminology and added a note to explain that our terminology differs, which will hopefully avoid confusion.

Good.

Jens

Received on 2021-09-23 05:17:48