sg16: Re: [SG16] [isocpp-lib-ext] Sending P1885R8 Naming Text Encodings to Demystify Them directly to electronic polling for C++23

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Fri, 15 Oct 2021 12:54:38 +0200

On 15/10/2021 12.41, Bryce Adelstein Lelbach aka wash wrote:
> Jens, this does not sound like a library design matter.
>
> Can we please stop holding this paper up in LEWG unless there are library design questions?
>
> If there are questions about the specifics of wording or text/Unicode details, there are groups that can deal with that (LWG and SG16).
>
> Just because LEWG says we approve this paper does not mean it automatically goes into the standard; it just means we are happy with the library design.

I am raising concerns I have about the current state of the paper.

If the chair of LEWG deems those concerns not to be relevant at the
level of LEWG, I'm fine with that, and I'll raise them again in LWG
and/or plenary, as need be.

Jens

> --
> Bryce Adelstein Lelbach aka wash (he/him/his)
> US Programming Language Standards (PL22) Chair
> ISO C++ Library Evolution Chair
> CppCon and C++Now Program Chair
> HPC Programming Models Architect @ NVIDIA
> --
>
> On Thu, Oct 14, 2021, 17:55 Jens Maurer <Jens.Maurer_at_[hidden]> wrote:
>
> On 14/10/2021 23.32, Bryce Adelstein Lelbach aka wash via Lib-Ext wrote:
> > Okay, LEWG, let's try this one again.
> >
> > P1885R8 Naming Text Encodings to Demystify Them has been once again
> > sent to us by SG16.
> >
> > It has been reviewed multiple times by LEWG. The last time we saw it,
> > we said we would send it to electronic polls, pending a revision that
> > included answers to some questions raised during the reviews.
> >
> > I would like to see if we have support for sending P1885 directly to
> > electronic polling for C++23. If you support this motion, please reply
> > with a +1. If you do not support this motion and wish to spend more
> > time discussing this paper in LEWG, please reply with -1.
>
> The paper is missing a normative definition of "encoding scheme"
> with particular attention to the fact that an octet is not a
> C++ byte. From such a definition, I would hope to gain clarity on
> how UTF-16 should be handled on a platform with CHAR_BIT == 16.
> (The definition in ISO 10646 is carefully crafted to apply only
> to Unicode encodings.)
>
> It feels that SUBSTITUTE_UTF_ENCODING(E) (or the surrounding
> text) should say something about the handling of UCS2 and UCS4,
> which IANA appears to define as big-endian. It seems we want
> platform endianness (similar to UTF-16 and UTF-32) here, too.
>
> Whether LWG will be happy to fix that on their level,
> I don't know.
>
> Jens
>
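As an illustration of the octet-versus-byte concern quoted above, here is a minimal C++ sketch, not taken from P1885. It assumes that an "encoding scheme" means a serialisation of code units into 8-bit octets. The names `byte_order`, `to_octets` and `bytes_per_code_unit` are hypothetical and exist only for this example.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type, not part of P1885 or the standard library.
enum class byte_order { big, little };

// Serialise UTF-16 code units into octets (values 0..255) in the
// requested order. Each octet is stored in its own element, so the
// result is the same whether a C++ byte has 8 bits or 16 bits.
std::vector<std::uint_least8_t> to_octets(const std::vector<char16_t>& units,
                                          byte_order order) {
    std::vector<std::uint_least8_t> out;
    out.reserve(units.size() * 2);
    for (char16_t u : units) {
        const auto hi = static_cast<std::uint_least8_t>((u >> 8) & 0xFF);
        const auto lo = static_cast<std::uint_least8_t>(u & 0xFF);
        if (order == byte_order::big) {
            out.push_back(hi);
            out.push_back(lo);
        } else {
            out.push_back(lo);
            out.push_back(hi);
        }
    }
    return out;
}

// Number of C++ bytes occupied by one UTF-16 code unit on this platform.
// This is 2 where CHAR_BIT == 8; where CHAR_BIT == 16 it can be 1, in
// which case the in-memory form has no byte order at all, although the
// octet serialisation above still does.
constexpr std::size_t bytes_per_code_unit = sizeof(char16_t);
static_assert(bytes_per_code_unit * CHAR_BIT >= 16,
              "char16_t holds at least 16 bits");
```

With this reading, UTF-16BE and UTF-16LE differ only in the octet sequence produced, while the in-memory `char16_t` representation is a separate question. That gap is what a normative definition of "encoding scheme" would need to close.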

Received on 2021-10-15 05:54:43