On Sat, 6 Jan 2024 at 15:05, Jens Maurer via SG16 <sg16@lists.isocpp.org> wrote:

... Tom Honermann, ..., Robin Leroy, ..., Richard Smith."

These names sound familiar to me, and at least two of them are
probably on this e-mail reflector.

Could one of them say something about what happened?

It sounds like it is time for the liaison officer to do some liaising :-)

On Sat, 6 Jan 2024 at 11:56, Corentin <corentin.jabot@gmail.com> wrote:

It is very surprising to me that ZWJ is opt-out rather than opt-in, given the security implications and the fact that supporting it requires an implementation of TR39 3.1.1.
I suppose this was done to better support Sanskrit?

This is because ZWJ and ZWNJ are needed orthographically in modern languages (Persian being one of them, see the example in Section 5.1.3 of UTS #55), and because this does not actually change the picture as far as security implications are concerned: the 260 variation selectors have always been allowed (and must always remain allowed), and those even more rarely have a visible effect. Besides, the security issue of visually indistinguishable identifiers is not usefully mitigated by prohibiting invisible characters: you still have good old confusables, e.g., НТТР vs. HTTP, and bidi confusables, aא1 vs. a1א.
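
To make the two claims above concrete, here is a minimal Python sketch. The variation selector ranges are my reading of the Variation_Selector property in the Unicode Character Database, and the helper names (`is_invisible`, `code_points`) are made up for this illustration. It counts the variation selectors and shows that the Cyrillic look-alike of HTTP contains no invisible character at all, so a ban on invisible characters would not catch it.

```python
import unicodedata

# Inclusive ranges of code points with the Variation_Selector property
# (assumption: transcribed by hand, not read from the UCD data files).
VARIATION_SELECTOR_RANGES = [
    (0x180B, 0x180D),    # Mongolian free variation selectors one to three
    (0x180F, 0x180F),    # Mongolian free variation selector four
    (0xFE00, 0xFE0F),    # VS1..VS16
    (0xE0100, 0xE01EF),  # VS17..VS256
]
JOIN_CONTROLS = {"\u200c", "\u200d"}  # ZWNJ, ZWJ


def is_invisible(ch: str) -> bool:
    """True for ZWJ, ZWNJ, and variation selectors (hypothetical helper)."""
    cp = ord(ch)
    return ch in JOIN_CONTROLS or any(
        lo <= cp <= hi for lo, hi in VARIATION_SELECTOR_RANGES
    )


def code_points(s: str) -> list:
    """List each character as 'U+XXXX NAME' so look-alikes can be told apart."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in s]


latin = "HTTP"
cyrillic = "\u041d\u0422\u0422\u0420"  # НТТР, written with Cyrillic letters

variation_selector_count = sum(
    hi - lo + 1 for lo, hi in VARIATION_SELECTOR_RANGES
)

if __name__ == "__main__":
    print(variation_selector_count)                   # number of variation selectors
    print(code_points(latin))                         # Latin capital letters
    print(code_points(cyrillic))                      # Cyrillic capital letters
    print(any(is_invisible(ch) for ch in cyrillic))   # no invisible character involved
```

The two strings render alike in many fonts yet share no code point, which is why the confusability mechanisms, rather than a prohibition of invisible characters, are the relevant tool here.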

Anyway, it might be useful to either:
- Mandate the Default-Ignorable Exclusion Profile
- Mandate some form of TR39 conformance (which is going to be quite the burden for implementers given our bandwidth)

I would not recommend mandating either of these things.

It might be useful to review the new UTS #55, Unicode Source Code Handling.

The gist of that document is that legality according to a language specification, which requires some level of stability, is the wrong tool for mitigating the security issues that arise from invisible characters in identifiers (and, more broadly, from visually indistinguishable identifiers).

Instead we encourage implementations to deal with that using linters, compiler warnings, and various editor squiggles (which are important things, but not within the scope of what the standard mandates).

See https://www.unicode.org/reports/tr55/#Display.
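
As a rough illustration of the "diagnose, do not prohibit" approach, here is a minimal Python sketch of a lint pass that reports ZWJ and ZWNJ as warnings while leaving the program's validity untouched. The names `Diagnostic` and `lint_join_controls` are hypothetical, and this does not reproduce the diagnostics specified in UTS #55; it only shows where such a check lives, namely in tooling rather than in the grammar.

```python
from dataclasses import dataclass
from typing import List

# The two join controls discussed in this thread.
JOIN_CONTROLS = {"\u200c": "ZWNJ", "\u200d": "ZWJ"}


@dataclass
class Diagnostic:
    line: int     # 1-based line number
    column: int   # 1-based column, counted in code points
    message: str


def lint_join_controls(source: str) -> List[Diagnostic]:
    """Report every ZWJ/ZWNJ in the source as a warning (hypothetical lint)."""
    diagnostics = []
    for line_no, text in enumerate(source.splitlines(), start=1):
        for col_no, ch in enumerate(text, start=1):
            if ch in JOIN_CONTROLS:
                name = JOIN_CONTROLS[ch]
                diagnostics.append(Diagnostic(
                    line_no,
                    col_no,
                    f"warning: invisible character {name} (U+{ord(ch):04X})",
                ))
    return diagnostics


if __name__ == "__main__":
    sample = "int a\u200db = 0;\nint c = 1;\n"
    for d in lint_join_controls(sample):
        print(f"{d.line}:{d.column}: {d.message}")
```

A real implementation would be more selective than this, for instance by staying quiet where the join control is orthographically expected; the point is only that the source remains well-formed and the user gets a warning.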

For confusability (which deals with the issue of invisible characters among many others), see https://www.unicode.org/reports/tr55/#Confusability.

Note that UTS #55 recommends other diagnostics that affect default ignorable code points in general, and ZWJ and ZWNJ in particular, but many of them address usability rather than security issues.

Best regards,

Robin Leroy