Re: Undated reference to Unicode Standard and UAX #29

From: Steve Downey <sdowney_at_[hidden]>
Date: Fri, 5 Jan 2024 11:54:25 -0500

On Fri, Jan 5, 2024 at 11:47 AM Mark de Wever via SG16 <sg16_at_[hidden]> wrote:

> On Fri, Jan 05, 2024 at 04:26:49PM +0000, Jonathan Wakely via SG16 wrote:
> > Since the adoption of P2736 C++23 and the current C++ working draft just
> > refer to "the Unicode Standard", with a URL referring to the latest
> > version. We removed the bibliography entry for TR29 revision 35. P2736
> > gives the justification for this that the revision of #29 included in
> > Unicode 15 (revision 41) is just a bug fix, so there's no problem referring
> > to that instead.
> >
> > That might have been true last year, but the current Unicode Standard
> > (15.1.0) includes revision 43 of UAX #29, which makes significant changes
> > to the extended grapheme cluster breaking rules. A new state machine is
> > needed (and new lookup tables of properties) to implement rule GB9c. That's
> > not just a bug fix, is it?
> >
> > Are C++ implementations expected to implement rule GB9c, despite it not
> > being part of the standard when C++23 was published?
>
> AFAIK this was indeed intended. The Unicode Standard moves at a faster
> pace than the C++ Standard. This allows C++ to always use the latest
> Unicode features and backport them to older language versions.
>
> Recently I wanted to update libc++ to Unicode 15.1.0 and noticed the
> same changes you did. I put this on hold since I need to investigate the
> required ABI tags. Otherwise I would have implemented these changes for
> libc++18.
>
> Cheers,
> Mark
>

Did we actually specify something for C++23 that depends on or provides
the breaking algorithms?

My recollection is that we kicked the can down the road. The intent was to
have implementations declare which version of the Unicode Standard they
support at a particular time and version, with the option of a conforming
implementation providing a later one, because otherwise you can't process
current text.
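
To make that idea concrete, here is a minimal sketch of what such a
declaration could look like. Nothing here is an existing standard library
facility: unicode_version, at_least, implemented_unicode and applies_gb9c are
hypothetical names, and the 15.0.0 value is only an assumed example. The one
fact taken from this thread is that rule GB9c arrives with UAX #29 revision
43 in Unicode 15.1.0.

```cpp
// Hypothetical sketch; none of these names exist in any standard library.
struct unicode_version {
    int major_part;
    int minor_part;
    int update_part;
};

// Lexicographic "a is at least b" comparison on (major, minor, update).
constexpr bool at_least(unicode_version a, unicode_version b) {
    return a.major_part != b.major_part ? a.major_part > b.major_part
         : a.minor_part != b.minor_part ? a.minor_part > b.minor_part
         : a.update_part >= b.update_part;
}

// Assumed example value: the Unicode version this imaginary
// implementation declares that it supports.
constexpr unicode_version implemented_unicode{15, 0, 0};

// From the thread: GB9c is introduced by UAX #29 revision 43,
// which is part of Unicode 15.1.0.
constexpr unicode_version first_with_gb9c{15, 1, 0};

// Whether grapheme cluster breaking for version v has to apply rule GB9c.
constexpr bool applies_gb9c(unicode_version v) {
    return at_least(v, first_with_gb9c);
}

static_assert(!applies_gb9c(implemented_unicode), "15.0.0 predates GB9c");
static_assert(applies_gb9c(unicode_version{15, 1, 0}), "15.1.0 includes GB9c");
```

With something along these lines, an implementation that moves to a later
Unicode version changes one declared value, and users can tell which
breaking rules they are getting.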

Received on 2024-01-05 16:54:45