Re: Undated reference to Unicode Standard and UAX #29

From: Corentin <corentin.jabot_at_[hidden]>
Date: Fri, 5 Jan 2024 17:58:43 +0100
std::format width estimation requires clustering
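
As a rough illustration of why clustering matters for width estimation, here is
a minimal sketch. It is not how any standard library implements std::format.
The helper names (decode_utf8, is_extend, width_per_code_point,
width_per_cluster) are hypothetical, and is_extend only covers the Combining
Diacritical Marks block (U+0300..U+036F) as a stand-in for the full
Grapheme_Cluster_Break=Extend table from UAX #29. It also assumes well-formed
UTF-8 and that every cluster is one column wide.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Decode UTF-8 into code points. Assumes well-formed input; no validation.
std::vector<char32_t> decode_utf8(std::string_view s) {
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t len = b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4;
        // Keep only the payload bits of the lead byte.
        char32_t cp = len == 1 ? b : b & (0xFF >> (len + 1));
        for (std::size_t k = 1; k < len && i + k < s.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Hypothetical stand-in for the Extend property: combining diacritics only.
bool is_extend(char32_t cp) { return cp >= 0x0300 && cp <= 0x036F; }

// Naive estimate: one column per code point.
std::size_t width_per_code_point(std::string_view s) {
    return decode_utf8(s).size();
}

// Clustered estimate: a code point that extends the previous one (in the
// spirit of rule GB9) joins the current cluster and adds no width.
std::size_t width_per_cluster(std::string_view s) {
    std::size_t width = 0;
    bool first = true;
    for (char32_t cp : decode_utf8(s)) {
        if (first || !is_extend(cp)) ++width;
        first = false;
    }
    return width;
}
```

For "e" followed by U+0301 (bytes "e\xCC\x81"), the per-code-point estimate is
2 while the clustered estimate is 1, so the padding a formatter adds depends on
which cluster-breaking rules it implements.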

On Fri, Jan 5, 2024 at 5:54 PM Steve Downey <sdowney_at_[hidden]> wrote:

>
> On Fri, Jan 5, 2024 at 11:47 AM Mark de Wever via SG16 <
> sg16_at_[hidden]> wrote:
>
>> On Fri, Jan 05, 2024 at 04:26:49PM +0000, Jonathan Wakely via SG16 wrote:
>> > Since the adoption of P2736, C++23 and the current C++ working draft just
>> > refer to "the Unicode Standard", with a URL referring to the latest
>> > version. We removed the bibliography entry for TR29 revision 35. P2736
>> > gives the justification for this that the revision of #29 included in
>> > Unicode 15 (revision 41) is just a bug fix, so there's no problem
>> > referring to that instead.
>> >
>> > That might have been true last year, but the current Unicode Standard
>> > (15.1.0) includes revision 43 of UAX #29, which makes significant
>> > changes to the extended grapheme cluster breaking rules. A new state
>> > machine is needed (and new lookup tables of properties) to implement
>> > rule GB9c. That's not just a bug fix, is it?
>> >
>> > Are C++ implementations expected to implement rule GB9c, despite it not
>> > being part of the standard when C++23 was published?
>>
>> AFAIK this was indeed intended. The Unicode Standard moves at a faster
>> pace than the C++ Standard. This allows C++ to always use the latest
>> Unicode features and backport them to older language versions.
>>
>> Recently I wanted to update libc++ to Unicode 15.1.0 and noticed the
>> same changes you did. I put this on hold since I need to investigate the
>> required ABI tags. Otherwise I would have implemented these changes for
>> libc++18.
>>
>> Cheers,
>> Mark
>>
> Did we actually specify something for C++23 that depends on or provides
> the breaking algorithms?
>
> My recollection is that we kicked the can down the road, with the intent
> of having implementations declare which version of the standard they
> support at a particular time and version, with the option of a conforming
> implementation providing a later one, because otherwise you can't process
> current text.
>
>

Received on 2024-01-05 16:59:01