sg16

Re: Undated reference to Unicode Standard and UAX #29

From: Jens Maurer <jens.maurer_at_[hidden]>
Date: Sat, 6 Jan 2024 22:33:29 +0100

On 06/01/2024 19.35, Ville Voutilainen wrote:
> On Sat, 6 Jan 2024 at 19:37, Jens Maurer via SG16 <sg16_at_[hidden]> wrote:
>>> I think I'd prefer if we just somehow say that implementations can define which Unicode standard they conform to. That way if a conforming C++23 implementation uses Unicode 15.1.0 (the latest version today) then it doesn't become non-conforming overnight when a new Unicode standard is published. We can recommend that implementations pin themselves to a recent Unicode standard, and even recommend that implementations should (if possible) update to use newer Unicode standards as they become available.
>>
>> Hm... That's not how normative references are supposed to work in an ISO world,
>> I think ("pick the version you want" -- no), but we could certainly try that.
>
> Well, that's how our normative reference to C works, for a standard
> ISO considers obsoleted when we publish a new standard.

Not really. For C, we refer to a specific dated version, and even if
that version is obsoleted by ISO, its contents stay the same.

The suggestion here was to permit the implementation to choose
(and document) a version of the Unicode standard of their liking,
and claim conformance to the C++ standard either way.
That part is what feels novel: it is a squishy normative reference.

> But how normative references work doesn't actually matter; what
> Jonathan is asking for is that we add weasel-wording that
> keeps old compilers conforming to the old standard, while allowing
> them to update to a newer one - which is what actually
> happens. An EOL compiler might not get updated, but it shouldn't just
> become non-conforming because things move underneath.
> The point isn't conformance in the strict ISO sense, the point is
> being able to say that not updating an EOL compiler isn't
> a conformance bug as such. It conformed to C++23, and remains so, and
> the publication of C++26 or a newer Unicode
> standard doesn't change that it conforms to C++23, which is
> practically meaningful even if ISO likes to pretend otherwise.

The current evidence that Unicode algorithms actually change
(not just the character repertoire), possibly in a way that requires
ABI-incompatible updates, gives me pause. I think well-defined
stability of our work product is more important than mid-term
updates to features that people have lived without for three decades,
for better or worse.

I've made

https://cplusplus.github.io/CWG/issues/2843.html

to get the normative reference fixed to Unicode 15.0,
which was the current one at the time of finalizing C++23.
This will be applied as a DR, so implementers are aware
this is what "C++23" means as far as character handling
is concerned.

A short paper proposing an update to Unicode 15.1 (or 16)
relatively shortly before C++26 is finalized is very welcome,
even more so if it contains a summary of the changes
relevant for C++. Implementers know how to offer post-C++23
features to their audience without disturbing their C++23 mode.

The above is my (personal) suggestion for how to handle the issue;
I've forwarded the core issue to both SG16 and LWG to gather
further feedback:

https://github.com/cplusplus/papers/issues/1736

Jens

Received on 2024-01-06 21:33:33