ISOCPP sg16 List: Re: libu8ident 0.1 released

From: Corentin Jabot <corentinjabot_at_[hidden]>
Date: Tue, 25 Jan 2022 09:13:40 +0100

On Tue, Jan 25, 2022, 08:20 Ville Voutilainen via SG16 <
sg16_at_[hidden]> wrote:

> On Tue, 25 Jan 2022 at 08:38, Reini Urban via SG16
> <sg16_at_[hidden]> wrote:
> >> Thank you, Reini. I will get these scheduled for review in SG16. Please
> note that we are now beyond the deadline for new papers for C23 and C++23,
> so review will be directed towards later standards. Our immediate priority
> is to finalize features that have been accepted for C++23. As a result, it
> may be a few months before these papers get scheduled in SG16. Though an
> argument could be made that your proposal constitutes modification of a
> feature accepted for C++23 (P1949) and therefore in scope for that
> standard, I see your proposal as more of a competing one rather than a
> modification. P1949 effectively brought the standard up to date with more
> recent Unicode versions without changing the design intent; the changes you
> propose are a change in direction and more disruptive.
> >
> >
> > My point of view is that C11 made identifiers insecure by making them
> non-identifiable, and adopting TR39 will fix that spec bug. So a bugfix,
> not a feature.
>
> At any rate, going into C++23 with P1949 and then looking at this
> proposal later would mean that we have a breaking change
> in our hands. If we go with this proposal, we can relax the
> identifiers if we decide (again, tho, I might add) that we're not
> concerned (*) about
> bidi/homoglyph attacks.
>
> (*) Or, rather, that we leave that to Quality of Implementation and
> external tools, I guess.
>
> But still, the high-order bit, to me, seems to be that if we go with
> P1949 for C++23, the horses are out of the barn and the cats are out
> of the bag. Taking a step back and going with what's proposed here
> after C++23 has been published seems.. ..awkward. I would
> find the hypothetical reverse order much less awkward, or even just
> sticking with what's being proposed here.
>
> I must wonder, tho, whether the cats are already out of the bag,
> considering that gcc and clang already allow all sorts of identifiers
> and gcc has that bidi warning. Seems like we're damned if we do,
> damned if we don't.
>

There is no way that we can accommodate both stability and safety
simultaneously for anything Unicode-related, nor can we react fast enough
to security concerns. In fact, there have been a lot of developments in the
last few weeks alone.

Now, these security concerns are important, but they need to be handled
with care, as it is easy to preclude legitimate use cases.

We need as an industry to:
* Understand where the diagnostic needs to be produced (IDEs, compilers,
code review tools...)
* Find the right way to handle security without excluding legitimate use
cases, and a lot of that will come from the Unicode consortium
* Have some flexibility to adapt to new best practices

I don't think the C++ standard is the place to try to solve these matters,
but we could (should?) offer some recommendations.

But long cycles and a desire to provide more stability than Unicode
guarantees, as well as limited bandwidth, make me think this is ultimately
a QoI concern.

Anyway, P1949 is a breaking change, and this is another breaking change. P1949
was designed to guarantee stability over time, but security considerations
layered on top won't. So maybe which cycle this gets in doesn't matter terribly,
except that we should either consider security matters swiftly or
recognize that we are not the right body to deal with that.

I mentioned elsewhere that the Rust compiler does confusable detection,
yet the spec only mentions UAX.
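
To make concrete what "confusable detection" could look like on the
implementation side, here is a rough C++ sketch. It is not what rustc or any
other compiler actually does. The mapping table is a tiny, hand-picked,
hypothetical subset, and the names skeleton_char, skeleton and confusable are
made up for illustration. A real tool would presumably load the full
confusables data and follow the skeleton algorithm from UTS #39, which also
involves normalization that this sketch skips.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical, hand-picked subset of Cyrillic -> Latin lookalikes.
// Assumption: a real tool would use the complete UTS #39 confusables data.
inline char32_t skeleton_char(char32_t c) {
    static const std::unordered_map<char32_t, char32_t> table = {
        {U'\u0430', U'a'},  // CYRILLIC SMALL LETTER A
        {U'\u0435', U'e'},  // CYRILLIC SMALL LETTER IE
        {U'\u043E', U'o'},  // CYRILLIC SMALL LETTER O
        {U'\u0440', U'p'},  // CYRILLIC SMALL LETTER ER
        {U'\u0441', U'c'},  // CYRILLIC SMALL LETTER ES
        {U'\u0445', U'x'},  // CYRILLIC SMALL LETTER HA
    };
    auto it = table.find(c);
    return it == table.end() ? c : it->second;
}

// Map every code point of an identifier to its lookalike representative.
// Simplification: no normalization, one code point maps to one code point.
inline std::u32string skeleton(const std::u32string& id) {
    std::u32string out;
    out.reserve(id.size());
    for (char32_t c : id) {
        out.push_back(skeleton_char(c));
    }
    return out;
}

// Two distinct identifiers are treated as confusable when their skeletons
// are equal. An implementation could warn when both appear in one scope.
inline bool confusable(const std::u32string& a, const std::u32string& b) {
    return a != b && skeleton(a) == skeleton(b);
}
```

With this table, confusable(U"scope", U"s\u0441ope") is true, because the
second identifier spells its "c" with CYRILLIC SMALL LETTER ES. The point is
that such a check can live in a compiler warning, an IDE or a review tool, and
its data can be updated independently of the language specification.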

For me the question is: does the sudden interest in Unicode-related
security matters invalidate P1949's initial analysis that confusable
detection is better left in the hands of implementations?


Received on 2022-01-25 08:13:52