Date: Sat, 7 Jun 2025 15:30:47 +0200
On Wed, 4 Jun 2025 at 08:26, Jens Maurer <jens.maurer_at_[hidden]> wrote:
> The previous version of the paper appeared to do the minimum necessary to
> reflect Unicode 16 and kept all the detailed "we don't conform" statements
> in. Now, the ask was to rebase on 15.1. What is the rationale for changing
> the approach towards the "we don't conform" utterances for such a rather
> trivial rebase?
>
That was my suggestion.
The reason is twofold:
1. We keep adding requirements there, so this is a maintenance burden as
the version of Unicode keeps getting bumped up.
In this instance, we added R3c, which is not part of R3. (I would say
that the same problem applies to a lesser extent to the
[uaxid.nonobservance] subclause, which is, in fact, missing R3c.)
2. The νεῶν κατάλογος of requirements not met is neither required of a
conformance statement, nor useful to an implementer.
This is an informative annex (in the standardizing sense), so we can try
to make it informative (in the general sense).
If I am trying to write a tool that interoperates with various
implementations that conform to UAX #31 (in Unicode terminology; here the
C++ standard itself is one such « implementation »), say because I am
writing something that generates C++ code and code in other programming
languages, it is useful to know that I can produce default identifiers,
just like I can in Java or Python. It is a lot less useful to read a
paragraph that summarizes what R3 is about, only to find that I cannot do
anything with it anyway. The less said about hashtags the better.
From that standpoint, as I had noted to Steve earlier, claiming
conformance to R4 may be technically correct—the best kind of correct—*on
the space of well-formed programs*, but I find it misleading at best. I
would expect that if I am interoperating with implementations that conform
to R4, I need to check that I don’t rely on equivalent identifiers
(compatibility or canonical, depending on the normalization form used in
R4), but that I don’t need to normalize the identifiers I produce, because
the normalization-equivalent identifiers are treated as equivalent.
Thinking about it some more, since R1 does not mention the restriction
from [lex.name] §1 <https://eel.is/c++draft/lex.name#1>, I now think the
conformance claim to R4 is actually just incorrect: Non-normalized
identifiers are identifiers (both by definition in lex.name
<https://eel.is/c++draft/lex.name#nt:identifier> and for the purposes of
the conformance claim to R1 as worded), but they are not « treated
equivalently by the implementation », since they cause the program to be
ill-formed, whereas their normalized counterparts do not.
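To make the practical consequence concrete, here is a minimal sketch of
what a code generator targeting C++ would have to do, assuming Python and
its standard unicodedata module. The function names is_nfc and
to_cpp_identifier are hypothetical, chosen only for illustration; the
sketch covers normalization only and does not check the rest of the
identifier syntax.

```python
import unicodedata


def is_nfc(identifier: str) -> bool:
    """Return True if the identifier is already in Normalization Form C."""
    return unicodedata.is_normalized("NFC", identifier)


def to_cpp_identifier(identifier: str) -> str:
    """Normalize to NFC before emitting C++ (hypothetical helper).

    Assumption: the target rejects non-NFC identifiers as ill-formed,
    as described above, instead of treating them as equivalent.
    """
    return unicodedata.normalize("NFC", identifier)


# "été" spelled with combining acute accents (U+0301): not in NFC.
decomposed = "e\u0301te\u0301"
composed = to_cpp_identifier(decomposed)
```

With an implementation that actually treated normalization-equivalent
identifiers as equivalent, the to_cpp_identifier step would be unnecessary;
for C++ as currently worded, the generator has to perform it itself.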
Best regards,
Robin Leroy
Received on 2025-06-07 13:31:23