sg16

Re: Referencing the Unicode Standard

From: Corentin <corentin.jabot_at_[hidden]>
Date: Thu, 15 Dec 2022 02:48:27 +0100
On Thu, Dec 15, 2022, 02:21 Tom Honermann <tom_at_[hidden]> wrote:

> On 12/4/22 9:16 AM, Corentin via SG16 wrote:
>
> Hey folks.
> First draft updating some references to the Unicode standard (and more
> importantly replacing ISO-10646).
> I'm hoping to get early feedback :)
>
> https://isocpp.org/files/papers/D2736R0.pdf
>
> Thanks for the paper, Corentin. I'm sorry I failed to notice the link here
> and to get this paper scheduled.
>
> I spent some time reading it tonight. It looks good so far.
> I think the "Control code aliases" table
> <http://eel.is/c++draft/tab:lex.charset.ucn> in [lex.charset]p5
> <http://eel.is/c++draft/lex.charset#5> can be (and should be) removed
> with these changes and p(5.2) <http://eel.is/c++draft/lex.charset#5.2>
> updated accordingly. Actually, I think p(5.2)
> <http://eel.is/c++draft/lex.charset#5.2> can be removed and p(5.1)
> <http://eel.is/c++draft/lex.charset#5.1> merged with p5
> <http://eel.is/c++draft/lex.charset#5>. The listed control aliases are
> present in NameAliases.txt
> <https://www.unicode.org/Public/15.0.0/ucd/NameAliases.txt> as control
> names.
>

I considered that, but we would need wording that says names in
NameAliases.txt with the control label should be supported. We'd get rid of
the table, but we'd have additional wording.
And we can't just say "support NameAliases.txt", because figments,
abbreviations and alternates are currently not supported, and that would be a
design change.
I would support this change, but I'm not sure it's in scope for this paper.
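
To make the distinction concrete, here is a rough sketch of what "only the
control label" means when reading NameAliases.txt. It assumes the file's
documented layout of `code point;alias;type` with `#` comment lines. The
names `NameAlias`, `parse_alias_line`, `aliases_of_type` and `sample_excerpt`
are made up for illustration, and the excerpt is a handful of entries written
out in that format, not the full file. This is not proposed wording or any
implementation's actual lookup.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// One record in the "<code point>;<alias>;<type>" layout (hypothetical type).
struct NameAlias {
    char32_t code_point;
    std::string name;
    std::string type;  // e.g. "control", "correction", "alternate", "figment", "abbreviation"
};

// Parses one line; returns nullopt for comments, blank lines and malformed input.
inline std::optional<NameAlias> parse_alias_line(std::string_view line) {
    if (line.empty() || line.front() == '#') return std::nullopt;
    const auto first = line.find(';');
    if (first == std::string_view::npos || first == 0) return std::nullopt;
    const auto second = line.find(';', first + 1);
    if (second == std::string_view::npos) return std::nullopt;

    // The first field is the code point in uppercase hexadecimal.
    char32_t cp = 0;
    for (const char c : line.substr(0, first)) {
        int digit = 0;
        if (c >= '0' && c <= '9') digit = c - '0';
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return std::nullopt;
        cp = static_cast<char32_t>(cp * 16 + static_cast<char32_t>(digit));
    }
    return NameAlias{cp,
                     std::string(line.substr(first + 1, second - first - 1)),
                     std::string(line.substr(second + 1))};
}

// Collects every alias in `text` whose type field equals `type`.
inline std::vector<NameAlias> aliases_of_type(std::string_view text,
                                              std::string_view type) {
    std::vector<NameAlias> result;
    while (!text.empty()) {
        const auto eol = text.find('\n');
        if (auto entry = parse_alias_line(text.substr(0, eol));
            entry && entry->type == type) {
            result.push_back(std::move(*entry));
        }
        if (eol == std::string_view::npos) break;
        text.remove_prefix(eol + 1);
    }
    return result;
}

// A few entries in the NameAliases.txt format, for illustration only.
inline constexpr std::string_view sample_excerpt =
    "# illustrative excerpt\n"
    "0000;NULL;control\n"
    "0000;NUL;abbreviation\n"
    "000A;LINE FEED;control\n"
    "0080;PADDING CHARACTER;figment\n"
    "FEFF;BYTE ORDER MARK;alternate\n";
```

Calling `aliases_of_type(sample_excerpt, "control")` keeps only the NULL and
LINE FEED entries; the abbreviation, figment and alternate lines are skipped,
which is the filtering that "just support NameAliases.txt" would not express
on its own.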




> Tom.
>
>
> A careful examination of the 3 standards does not reveal anything I think we
> should be concerned about besides what I've highlighted in the paper but
> please let me know if you have specific questions we need to address.
>
> I would like to point out the mess that is __STDC_ISO_10646__, whose
> value currently depends on an ISO-10646 version.
> In the paper I propose to make that value implementation-defined as it
> cannot be relied upon except to check if some piece of code has been
> updated in the past 20+ years.
>
> I've also reworded the deprecated codecvt facilities to not mention UCS-2,
> getting rid of one more reference.
>
> I've massaged a few places to improve how we reference Unicode properties.
> The other thing that is not 100% clear to me is whether we should
> reference UAX 44, the Derived Core Properties and UAX 29 (which we do
> currently),
> or if referencing the Unicode standard implies all of that (I think it
> does).
>
> I've noticed that the Unicode standard incorrectly references version 14.0
> of itself when it means 15.0 but hopefully we understand what is meant.
>
> Thanks,
> Corentin
>
>

Received on 2022-12-15 01:48:41