sg16

Re: UAX Profiles

From: Robin Leroy <eggrobin_at_[hidden]>
Date: Sat, 6 Jan 2024 19:11:28 +0100
On Sat, 6 Jan 2024 at 18:51, Corentin <corentin.jabot_at_[hidden]> wrote:

> To clarify because I want to be sure I understand correctly.
>
> It is your opinion/that of Unicode that allowing arbitrary ZWJ in
> identifiers in an implementation that supports neither UTS #39 nor UTS #55
> does not constitute an increased security/usability issue?
>
Indeed. In fact it resolves a usability issue for Persian programmers.

> Supporting some form of UTS #55 in C++ compilers would certainly be nice in
> the long term, however I am not aware of anyone having the bandwidth to do
> that work (at least in clang) and in the meantime I want to make sure we
> can update Unicode without introducing vulnerabilities.
>
Note that lots of UTS #55 things are happening at the editor level, without
clang needing to do anything.

>
> Thanks.
>
> On Sat, Jan 6, 2024 at 4:27 PM Robin Leroy <eggrobin_at_[hidden]> wrote:
>
>> On a point of standards interpretation:
>>
>> On Sat, 6 Jan 2024 at 11:56, Corentin <corentin.jabot_at_[hidden]> wrote:
>>
>>> It is very surprising to me that ZWJ is opt-out rather than opt-in given
>>> […] the fact that supporting them requires implementation of TR39 3.1.1
>>> <https://www.unicode.org/reports/tr39/#Joining_Controls>.
>>>
>> Supporting these characters requires no such thing.
>>
>> It is indeed the case that most of this section used to be in UAX #31
>> with requirement UAX31-R1a.
>> That requirement was moved to UTS #39 for the reasons mentioned above
>> (it falsely suggested that default identifiers do not otherwise contain
>> invisible characters, and its complexity incentivized implementers to not
>> allow these linguistically important characters in identifiers, even though
>> they allow the other less useful default ignorables).
>> Note that the application of the contextual checks in Section 3.1.1 of
>> UTS #39 is described in the example in Section 5.1.3 of UTS #55
>> <https://www.unicode.org/reports/tr55/#General-Security-Profile> as a
>> mitigation to a *usability* issue; the security issue is dealt with as
>> part of the more general confusable detection, which ignores default
>> ignorables <https://www.unicode.org/reports/tr39/#Confusable_Detection>.
>>
>

Received on 2024-01-06 18:11:44