UAX Profiles

From: Corentin <corentin.jabot_at_[hidden]>
Date: Sat, 6 Jan 2024 11:56:37 +0100

Hey folks,

ZWJ and variation selectors seem to have been added to XID_Continue as
of Unicode 15.1,
despite UAX31 recognizing that "The use of default-ignorable characters in
identifiers is problematic".

One solution is to specify that we implement the Default-Ignorable
Exclusion Profile
<https://unicode.org/reports/tr31/#Default_Ignorable_Exclusion_Profile>,
which I believe restores the Unicode 15 / C++23 / SG16 consensus on
identifier grammar behavior.
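
To make the effect of the profile concrete, here is a minimal sketch of
the extra check it implies. It assumes the profile amounts to removing
Default_Ignorable_Code_Point characters from the set of allowed
continuation characters. The function names are hypothetical, and the
ranges cover only the characters discussed here (ZWNJ, ZWJ, and the
variation selectors), not the full property, which a real implementation
would take from the Unicode Character Database.

```cpp
#include <string_view>

// Illustrative, INCOMPLETE subset of Default_Ignorable_Code_Point.
// Hypothetical helper: only the characters discussed in this thread.
constexpr bool is_listed_default_ignorable(char32_t cp) {
    return cp == U'\u200C'                     // ZERO WIDTH NON-JOINER
        || cp == U'\u200D'                     // ZERO WIDTH JOINER
        || (cp >= 0xFE00 && cp <= 0xFE0F)      // VS1..VS16
        || (cp >= 0xE0100 && cp <= 0xE01EF);   // VS17..VS256
}

// Hypothetical helper: an additional filter applied on top of the usual
// XID_Start / XID_Continue checks (not shown). It rejects any candidate
// identifier containing one of the code points listed above.
constexpr bool passes_exclusion_profile(std::u32string_view candidate) {
    for (char32_t cp : candidate) {
        if (is_listed_default_ignorable(cp)) {
            return false;
        }
    }
    return true;
}

static_assert(passes_exclusion_profile(U"naive"));
static_assert(!passes_exclusion_profile(U"a\u200Db"));  // contains ZWJ
```

The point of the sketch is that the profile is a plain set subtraction:
no contextual rules of the kind TR39 describes for joining controls are
needed.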

It is very surprising to me that ZWJ is opt-out rather than opt-in, given
the security implications and the fact that supporting it requires
implementing TR39 3.1.1
<https://www.unicode.org/reports/tr39/#Joining_Controls>.
I suppose this was done to better support Sanskrit?

Anyway, it might be useful to do one of the following:
 - Mandate the Default-Ignorable Exclusion Profile
 - Mandate some form of TR39 conformance (which is going to be quite the
burden for implementers given our bandwidth)
 - Or allow/recommend either of the above options.

Thanks,

Corentin

Received on 2024-01-06 10:56:56