Re: [SG16] [isocpp-lib-ext] Sending P1885R8 Naming Text Encodings to Demystify Them directly to electronic polling for C++23

From: Ville Voutilainen <ville.voutilainen_at_[hidden]>
Date: Fri, 15 Oct 2021 14:14:26 +0300
On Fri, 15 Oct 2021 at 13:55, Jens Maurer via Lib-Ext
<lib-ext_at_[hidden]> wrote:
>
> On 15/10/2021 12.41, Bryce Adelstein Lelbach aka wash wrote:
> > Jens, this does not sound like a library design matter.
> >
> > Can we please stop holding this paper up in LEWG unless there are library design questions?
> > If there are questions about the specifics of wording or text/Unicode details, there are groups that can deal with that (LWG and SG16).
> > Just because LEWG says we approve this paper does not mean it automatically goes into the standard, it just means we are happy with the library design.
>
> I am raising concerns I have about the current state of the paper.
>
> If the chair of LEWG deems those concerns not to be relevant at the
> level of LEWG, I'm fine with that, and I'll raise them again in LWG
> and/or plenary, as need be.

The following bit has a design question in it:

> > The paper is missing a normative definition of "encoding scheme"
> > with particular attention to the fact that an octet is not a
> > C++ byte. From such a definition, I would hope to gain clarity
> > how UTF-16 should be handled on a platform with CHAR_BIT == 16.

The last sentence is a design question. SG16 may be better equipped to
process it than LEWG, of course, but while such questions stand, the
paper should not be forwarded to a stage that verifies that the
specification matches the intent, when we don't know the intent.
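
To make the octet-versus-byte question concrete, here is a minimal sketch of what an "encoding scheme" does: it serializes code units into octets in a stated order. The names to_octets, octet_order and bytes_per_code_unit are hypothetical and do not come from P1885; each octet is held in its own unsigned char with a value in 0..255, so the code itself does not depend on CHAR_BIT.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration, not part of P1885: the order in which the two
// octets of a UTF-16 code unit are emitted.
enum class octet_order { big, little };

// Serialize UTF-16 code units into octets. Every element of the result holds
// exactly one octet (value 0..255), even on a platform where CHAR_BIT > 8.
std::vector<unsigned char> to_octets(const std::u16string& units,
                                     octet_order order) {
    std::vector<unsigned char> out;
    out.reserve(units.size() * 2);
    for (char16_t u : units) {
        const auto hi = static_cast<unsigned char>((u >> 8) & 0xFF);
        const auto lo = static_cast<unsigned char>(u & 0xFF);
        if (order == octet_order::big) {
            out.push_back(hi);
            out.push_back(lo);
        } else {
            out.push_back(lo);
            out.push_back(hi);
        }
    }
    return out;
}

// Number of C++ bytes per UTF-16 code unit: 2 where CHAR_BIT == 8. Where
// CHAR_BIT == 16 this would presumably be 1, so the in-memory representation
// has no byte order to speak of, while the octet stream above still does.
constexpr std::size_t bytes_per_code_unit = sizeof(char16_t);
```

With CHAR_BIT == 8, the in-memory bytes of a char16_t array coincide with one of the two octet orders above. With CHAR_BIT == 16 they coincide with neither, which is why a normative definition would have to say which of these the encoding names refer to.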

> > It feels that SUBSTITUTE_UTF_ENCODING(E) (or the surrounding
> > text) should say something about the handling of UCS2 and UCS4,
> > which IANA appears to define as big-endian. It seems we want
> > platform endianness (similar to UTF-16 and UTF-32) here, too.

Same here.
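
As a rough illustration of the UCS-4 point, the sketch below compares the big-endian octet sequence of a code point with the octets a char32_t actually has in memory. The helper names are hypothetical, and the sketch assumes CHAR_BIT == 8 and a 4-byte char32_t.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

using octets4 = std::array<unsigned char, 4>;

// Octets of a code point in big-endian order, which is how the quoted text
// says IANA appears to define UCS-4.
octets4 big_endian_octets(char32_t c) {
    const auto v = static_cast<std::uint32_t>(c);
    return {static_cast<unsigned char>((v >> 24) & 0xFF),
            static_cast<unsigned char>((v >> 16) & 0xFF),
            static_cast<unsigned char>((v >> 8) & 0xFF),
            static_cast<unsigned char>(v & 0xFF)};
}

// Octets of the same code point as the platform stores a char32_t.
octets4 native_octets(char32_t c) {
    static_assert(sizeof(char32_t) == 4, "sketch assumes CHAR_BIT == 8");
    octets4 out{};
    std::memcpy(out.data(), &c, sizeof c);
    return out;
}

// True only if this platform's in-memory layout of c matches big-endian.
bool native_is_big_endian_for(char32_t c) {
    return native_octets(c) == big_endian_octets(c);
}
```

On a little-endian platform, native_is_big_endian_for(U'\U0001F600') returns false, so a U"..." literal there would not match a strictly big-endian reading of UCS-4. That mismatch is the reason for asking whether platform endianness is intended.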

Received on 2021-10-15 06:14:39