Re: Undated reference to Unicode Standard and UAX #29

From: Mark de Wever <koraq_at_[hidden]>
Date: Fri, 5 Jan 2024 17:47:34 +0100
On Fri, Jan 05, 2024 at 04:26:49PM +0000, Jonathan Wakely via SG16 wrote:
> Since the adoption of P2736, C++23 and the current C++ working draft just
> refer to "the Unicode Standard", with a URL referring to the latest
> version. We removed the bibliography entry for TR29 revision 35. P2736
> gives the justification for this that the revision of #29 included in
> Unicode 15 (revision 41) is just a bug fix, so there's no problem referring
> to that instead.
>
> That might have been true last year, but the current Unicode Standard
> (15.1.0) includes revision 43 of UAX #29, which makes significant changes
> to the extended grapheme cluster breaking rules. A new state machine is
> needed (and new lookup tables of properties) to implement rule GB9c. That's
> not just a bug fix, is it?
>
> Are C++ implementations expected to implement rule GB9c, despite it not
> being part of the standard when C++23 was published?

AFAIK this was indeed intended. The Unicode Standard moves at a faster
pace than the C++ Standard, so an undated reference lets implementations
always use the latest Unicode version and backport it to older language
versions.

Recently I wanted to update libc++ to Unicode 15.1.0 and noticed the
same changes you did. I put this on hold since I need to investigate the
required ABI tags. Otherwise I would have implemented these changes for
libc++ 18.
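
For reference, the extra state that GB9c needs can be sketched roughly as
below. This is only an illustration, not libc++ code: the InCB enum and
gb9c_no_break_before are hypothetical names, the property values are passed
in directly instead of coming from generated lookup tables, and none of the
other grapheme cluster rules are applied. The rule as published in UAX #29
revision 43 is

  Consonant [Extend Linker]* Linker [Extend Linker]* x Consonant

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Indic_Conjunct_Break (InCB) property.
// A real implementation would look this up per code point in tables
// generated from the Unicode Character Database.
enum class InCB { None, Consonant, Extend, Linker };

// For each position i, reports whether GB9c forbids a break before
// props[i]. Only GB9c is modelled here; the other GB rules are ignored.
std::vector<bool> gb9c_no_break_before(const std::vector<InCB>& props) {
    enum class State { Idle, AfterConsonant, AfterLinker };
    State state = State::Idle;
    std::vector<bool> no_break(props.size(), false);

    for (std::size_t i = 0; i < props.size(); ++i) {
        switch (props[i]) {
        case InCB::Consonant:
            // A consonant joins the previous one only if at least one
            // Linker was seen since that previous consonant.
            no_break[i] = (state == State::AfterLinker);
            state = State::AfterConsonant;
            break;
        case InCB::Linker:
            // A Linker only counts when it follows a consonant.
            if (state != State::Idle) state = State::AfterLinker;
            break;
        case InCB::Extend:
            // Extend keeps the current state unchanged.
            break;
        case InCB::None:
            // Anything else ends the sequence.
            state = State::Idle;
            break;
        }
    }
    return no_break;
}
```

With this sketch, a Consonant, Linker, Consonant sequence (for example
Devanagari KA, VIRAMA, SSA) reports no break before the second consonant,
while Consonant, Extend, Consonant does not trigger the rule.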

Cheers,
Mark

Received on 2024-01-05 16:47:39