On 05/01/2024 18.35, Jonathan Wakely via SG16 wrote:
>
>
> On Fri, 5 Jan 2024, 16:47 Mark de Wever, <koraq@xs4all.nl> wrote:
>
> On Fri, Jan 05, 2024 at 04:26:49PM +0000, Jonathan Wakely via SG16 wrote:
> > Since the adoption of P2736, C++23 and the current C++ working draft just
> > refer to "the Unicode Standard", with a URL referring to the latest
> > version. We removed the bibliography entry for TR29 revision 35. P2736
> > gives the justification for this that the revision of #29 included in
> > Unicode 15 (revision 41) is just a bug fix, so there's no problem referring
> > to that instead.
> >
> > That might have been true last year, but the current Unicode Standard
> > (15.1.0) includes revision 43 of UAX #29, which makes significant changes
> > to the extended grapheme cluster breaking rules. A new state machine is
> > needed (and new lookup tables of properties) to implement rule GB9c. That's
> > not just a bug fix, is it?
> >
> > Are C++ implementations expected to implement rule GB9c, despite it not
> > being part of the standard when C++23 was published?
>
> AFAIK this was indeed intended. The Unicode Standard moves at a faster
> pace than the C++ Standard. This allows C++ to always use the latest
> Unicode features and backport them to older language versions.
>
>
> Maybe the intent was to allow that, but the way I read it we *require* that. Is there wording that says that an implementation can choose which version to conform to?
>
> If not, what stops all existing implementations from becoming non-conforming when a new version of Unicode gets published?
Nothing, if the new version of Unicode changes behavior that C++
refers to (as seems to be the case here).
My understanding is that this was intentional; ISO wants us to refer
to undated standards if possible, too.
If we feel we should "freeze" the Unicode version for each C++ standard
release, we could do that. Implementer feedback is certainly welcome
for that decision.
I think I'd prefer if we just somehow say that implementations can define which version of the Unicode Standard they conform to. That way, if a conforming C++23 implementation uses Unicode 15.1.0 (the latest version today), it doesn't become non-conforming overnight when a new Unicode Standard is published. We can recommend that implementations pin themselves to a recent Unicode Standard, and even recommend that implementations should (if possible) update to newer Unicode Standards as they become available. But there's no way that a discontinued/EOL compiler version can get updated to a newer Unicode Standard, which is what we seem to be requiring as a condition of being a conforming implementation.
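
As a rough illustration of what "implementations define which Unicode version they conform to" could look like, here is a minimal sketch. Every name in it (unicode_version, pinned_unicode_version, at_least, uses_rule_gb9c) is hypothetical; nothing like this exists in the working draft or in any standard library. It only shows a library release pinning one version and deriving from it whether rule GB9c (new in UAX #29 revision 43, Unicode 15.1.0) applies.

```cpp
#include <cassert>

// Hypothetical type: not part of the C++ standard or any real library.
struct unicode_version {
    int major_part;
    int minor_part;
    int update_part;
};

// True if version a is the same as or newer than version b.
constexpr bool at_least(unicode_version a, unicode_version b) {
    return a.major_part != b.major_part ? a.major_part > b.major_part
         : a.minor_part != b.minor_part ? a.minor_part > b.minor_part
         : a.update_part >= b.update_part;
}

// Hypothetical implementation-defined choice, fixed when this
// (imaginary) library release ships. A later Unicode release does
// not change it, so the shipped release stays conforming.
constexpr unicode_version pinned_unicode_version{15, 1, 0};

// Rule GB9c first appears in UAX #29 revision 43 (Unicode 15.1.0),
// so an implementation pinned to 15.0.0 would not apply it.
constexpr bool uses_rule_gb9c(unicode_version v) {
    return at_least(v, unicode_version{15, 1, 0});
}

static_assert(uses_rule_gb9c(pinned_unicode_version),
              "a release pinned to 15.1.0 applies GB9c");
static_assert(!uses_rule_gb9c(unicode_version{15, 0, 0}),
              "a release pinned to 15.0.0 does not");
```

Under that model, an EOL compiler pinned to 15.0.0 stays conforming without GB9c, and a newer release can move the pin forward.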