Re: Referencing the Unicode Standard

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 14 Dec 2022 20:21:36 -0500
On 12/4/22 9:16 AM, Corentin via SG16 wrote:
> Hey folks.
> First draft updating some references to the Unicode standard (and more
> importantly replacing ISO-10646).
> I'm hoping to get early feedback :)
> https://isocpp.org/files/papers/D2736R0.pdf

Thanks for the paper, Corentin. I'm sorry I failed to notice the link
here and to get this paper scheduled.

I spent some time reading it tonight. It looks good so far.

I think the "Control code aliases" table
<http://eel.is/c++draft/tab:lex.charset.ucn> in [lex.charset]p5
<http://eel.is/c++draft/lex.charset#5> can be (and should be) removed
with these changes and p(5.2) <http://eel.is/c++draft/lex.charset#5.2>
updated accordingly. Actually, I think p(5.2)
<http://eel.is/c++draft/lex.charset#5.2> can be removed and p(5.1)
<http://eel.is/c++draft/lex.charset#5.1> merged with p5
<http://eel.is/c++draft/lex.charset#5>. The listed control aliases are
present in NameAliases.txt
<https://www.unicode.org/Public/15.0.0/ucd/NameAliases.txt> as control
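
For illustration, a minimal sketch of how the control aliases could be picked out of NameAliases.txt, assuming each data line has the form `code;alias;type` (for example `000A;LINE FEED;control`) with no trailing whitespace. The names `NameAlias`, `parse_alias_line` and `is_control_alias` are hypothetical and appear in neither the paper nor the standard.

```cpp
#include <optional>
#include <string>
#include <string_view>

// One record of NameAliases.txt: "<code point in hex>;<alias>;<type>".
struct NameAlias {
    char32_t code_point;
    std::string alias;
    std::string type;
};

// Hypothetical helper: parses a single line of the file.
// Returns nullopt for blank lines, '#' comments and malformed lines.
std::optional<NameAlias> parse_alias_line(std::string_view line) {
    if (line.empty() || line.front() == '#') return std::nullopt;

    const auto first = line.find(';');
    if (first == std::string_view::npos || first == 0) return std::nullopt;
    const auto second = line.find(';', first + 1);
    if (second == std::string_view::npos) return std::nullopt;

    // Convert the hexadecimal code point field by hand.
    char32_t cp = 0;
    for (char c : line.substr(0, first)) {
        unsigned digit = 0;
        if (c >= '0' && c <= '9') digit = static_cast<unsigned>(c - '0');
        else if (c >= 'A' && c <= 'F') digit = static_cast<unsigned>(c - 'A') + 10;
        else if (c >= 'a' && c <= 'f') digit = static_cast<unsigned>(c - 'a') + 10;
        else return std::nullopt;
        cp = cp * 16 + digit;
    }

    return NameAlias{cp,
                     std::string(line.substr(first + 1, second - first - 1)),
                     std::string(line.substr(second + 1))};
}

// True for aliases whose type field is exactly "control".
bool is_control_alias(const NameAlias& a) { return a.type == "control"; }
```

Feeding `000A;LINE FEED;control` to `parse_alias_line` yields a record for which `is_control_alias` is true, while `000A;LF;abbreviation` yields one for which it is false.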


> A careful examination of the 3 standards does not reveal anything I
> think we should be concerned about besides what I've highlighted in
> the paper, but please let me know if you have specific questions we
> need to address.
> I would like to point out the mess that is __STDC_ISO_10646__,
> whose value currently depends on an ISO-10646 version.
> In the paper I propose to make that value implementation-defined as it
> cannot be relied upon except to check if some piece of code has been
> updated in the past 20+ years.
> I've also reworded the deprecated codecvt facilities to no longer mention
> UCS-2, getting rid of one more reference.
> I've massaged a few places to improve how we reference Unicode properties.
> The other thing that is not 100% clear to me is whether we should
> reference UAX 44, the Derived Core properties and UAX 29 (which we do
> currently),
> or if referencing the Unicode standard implies all of that (I think it
> does).
> I've noticed that the Unicode standard incorrectly references version
> 14.0 of itself when it means 15.0 but hopefully we understand what is
> meant.
> Thanks,
> Corentin
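
To make the __STDC_ISO_10646__ point concrete, here is a minimal sketch of code that inspects the macro. It assumes only that the macro, when defined, expands to an integer literal of the form yyyymmL, as it has so far. The function name `describe_stdc_iso_10646` is hypothetical.

```cpp
#include <string>

// Hypothetical helper: reports how this implementation defines the macro.
std::string describe_stdc_iso_10646() {
#ifdef __STDC_ISO_10646__
    // When defined, the value has so far been an integer literal of the
    // form yyyymmL tied to an ISO/IEC 10646 edition.
    return "defined as " + std::to_string(static_cast<long>(__STDC_ISO_10646__));
#else
    return "not defined";
#endif
}
```

Printing the result on a few implementations shows how little the value says. With the proposed change, a program could portably test only whether the macro is defined, without giving its value any particular meaning.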

Received on 2022-12-15 01:21:38