sg16

Re: libu8ident 0.1 released

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Tue, 25 Jan 2022 14:24:35 +0100

On 25/01/2022 08.20, Ville Voutilainen via SG16 wrote:
> At any rate, going into C++23 with P1949 and then looking at this
> proposal later would mean that we have a breaking change
> in our hands. If we go with this proposal, we can relax the
> identifiers if we decide (again, tho, I might add) that we're not
> concerned (*) about
> bidi/homoglyph attacks.
>
> (*) Or, rather, that we leave that to Quality of Implementation and
> external tools, I guess.
>
> But still, the high-order bit, to me, seems to be that if we go with
> P1949 for C++23, the horses are out of the barn and the cats are out
> of the bag. Taking a step back and going with what's proposed here
> after C++23 has been published seems.. ..awkward. I would
> find the hypothetical reverse order much less awkward, or even just
> sticking with what's being proposed here.
>
> I must wonder, tho, whether the cats are already out of the bag,

They are. The situation pre-P1949 was already that all kinds of
Unicode identifiers were allowed, but just based on code point
ranges, and not on any principled categorization.

See section 4 of P1949: What changes is mostly the treatment of
emoji, which is probably less of a breaking change than disallowing
entire scripts.

In short, C++20 plus P1949 is a mild breakage (if any, in practice),
but C++20 plus Reini's approach is a large breakage, and so is C++20
plus P1949 plus Reini's approach.

In other words, it doesn't matter whether we do P1949 first and then
Reini's approach, or only do Reini's approach on top of C++20: the
breakage is the same. I don't feel a particular rush either way.

Or we decide that we don't want Reini's approach at all, with the
decision taken at a convenient non-rush time.

Jens

Received on 2022-01-25 13:24:41