On Wed, 4 Jun 2025 at 08:26, Jens Maurer <jens.maurer@gmx.net> wrote:

> The previous version of the paper appeared to do the minimum necessary to reflect
> Unicode 16 and kept all the detailed "we don't conform" statements in. Now, the ask
> was to rebase on 15.1. What is the rationale for changing the approach towards the
> "we don't conform" utterances for such a rather trivial rebase?

That was my suggestion.

The reason is twofold:

1. We keep adding requirements there, so this is a maintenance burden as the version of Unicode keeps getting bumped up.
   In this instance, we added R3c, which is not part of R3. (I would say that the same problem applies to a lesser extent to the [uaxid.nonobservance] subclause, which is, in fact, missing R3c.)
2. The νεῶν κατάλογος (Catalogue of Ships) of requirements not met is neither required of a conformance statement nor useful to an implementer.
   This is an informative annex (in the standardizing sense), so we can try to make it informative (in the general sense).
   If I am trying to write a tool that interoperates with various implementations that conform to UAX #31 (in Unicode terminology; here the C++ standard itself is one such « implementation »), say because I am writing something that generates C++ code and code in other programming languages, it is useful to know that I can produce default identifiers, just like I can in Java or Python. It is a lot less useful to read a paragraph that summarizes what R3 is about, only to find that I cannot do anything with it anyway. The less said about hashtags the better.

From that standpoint, as I had noted to Steve earlier, claiming conformance to R4 may be technically correct—the best kind of correct—on the space of well-formed programs, but I find it misleading at best. I would expect that if I am interoperating with implementations that conform to R4, I need to check that I don’t rely on equivalent identifiers (compatibility or canonical, depending on the normalization form used in R4), but that I don’t need to normalize the identifiers I produce, because the normalization-equivalent identifiers are treated as equivalent.
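
To illustrate the behaviour I would expect, here is a minimal sketch using Python, which converts identifiers to NFKC while parsing. It is only an illustration of "equivalent identifiers are treated as equivalent", not a statement about what Python formally claims with respect to UAX #31; the identifier `café` and the variable names are made up for the example.

```python
# Python normalizes identifiers to NFKC when parsing source text, so two
# canonically equivalent spellings end up naming the same variable.
namespace = {}
exec("caf\u00e9 = 1", namespace)    # precomposed: U+00E9
exec("cafe\u0301 = 2", namespace)   # decomposed: 'e' followed by U+0301

# Collect the names the two statements defined (exec also adds __builtins__).
names = sorted(k for k in namespace if k != "__builtins__")
value = namespace["caf\u00e9"]
print(names, value)
```

A generator targeting such an implementation does not need to normalize what it emits; it only has to avoid relying on two equivalent spellings being distinct.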

Thinking about it some more, since R1 does not mention the restriction from [lex.name] §1, I now think the conformance claim to R4 is actually just incorrect: Non-normalized identifiers are identifiers (both by definition in lex.name and for the purposes of the conformance claim to R1 as worded), but they are not « treated equivalently by the implementation », since they cause the program to be ill-formed, whereas their normalized counterparts do not.
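
A small sketch of the distinction, assuming the normalization form in question is NFC (as in the [lex.name] restriction). The helper names `is_nfc` and `equivalent_under_nfc` are hypothetical, and the code only models the situation with Python's `unicodedata` module; it is not what any compiler does.

```python
import unicodedata

def is_nfc(identifier: str) -> bool:
    """True if the identifier is already in Normalization Form C."""
    return unicodedata.is_normalized("NFC", identifier)

def equivalent_under_nfc(a: str, b: str) -> bool:
    """True if the two spellings have the same NFC form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

precomposed = "caf\u00e9"    # U+00E9, already NFC
decomposed = "cafe\u0301"    # 'e' + U+0301, not NFC

# The two spellings are equivalent under normalization...
same_identifier = equivalent_under_nfc(precomposed, decomposed)
# ...but only one of them passes an "must be in NFC" restriction.
accepted = [s for s in (precomposed, decomposed) if is_nfc(s)]
print(same_identifier, len(accepted))
```

The two spellings are equivalent, yet under a "must already be normalized" rule only the first is accepted, which is the sense in which they are not treated equivalently.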

Best regards,

Robin Leroy