On 12/15/22 5:44 AM, Corentin wrote:
Thanks folks.
I did update the paper to remove the control alias table https://isocpp.org/files/papers/P2736R0.pdf

Thanks, Corentin.

A wording nit; I think "control code aliases" should be struck in this note:

[ Note: None of the associated Unicode character names, associated Unicode character name aliases, or control code aliases have leading or trailing spaces. — end note ]

Suggested edit:

[ Note: None of the associated Unicode character names or associated Unicode character name aliases have leading or trailing spaces. — end note ]

I think the reference to chapter 4.8 ("Name") is fine despite the fact that the chapter number might change over time.

With regard to Jens's comments about the names being spread about the code charts, that does seem to be true. However, chapter 24 ("About the code charts") references both NamesList.txt and NameAliases.txt (chapter 4.8 also references the latter, but not the former). Perhaps a note stating that the names are provided in these other files is warranted.

I left __STDC_ISO_10646__ untouched; I'm not sure we reached a conclusion about it yesterday.

I agree. We can revisit this in January.


On Thu, Dec 15, 2022 at 9:17 AM Jens Maurer <jens.maurer@gmx.net> wrote:

On 15/12/2022 02.48, Corentin via SG16 wrote:
> On Thu, Dec 15, 2022, 02:21 Tom Honermann <tom@honermann.net> wrote:
>     On 12/4/22 9:16 AM, Corentin via SG16 wrote:
>>     Hey folks.
>>     First draft updating some references to the Unicode standard (and more importantly replacing ISO-10646).
>>     I'm hoping to get early feedback :)
>>     https://isocpp.org/files/papers/D2736R0.pdf
>     Thanks for the paper, Corentin. I'm sorry I failed to notice the link here and to get this paper scheduled.
>     I spent some time reading it tonight. It looks good so far.
>     I think the "Control code aliases" table <http://eel.is/c++draft/tab:lex.charset.ucn> in [lex.charset]p5 <http://eel.is/c++draft/lex.charset#5> can be (and should be) removed with these changes and p(5.2) <http://eel.is/c++draft/lex.charset#5.2> updated accordingly. Actually, I think p(5.2) <http://eel.is/c++draft/lex.charset#5.2> can be removed and p(5.1) <http://eel.is/c++draft/lex.charset#5.1> merged with p5 <http://eel.is/c++draft/lex.charset#5>. The listed control aliases are present in NameAliases.txt <https://www.unicode.org/Public/15.0.0/ucd/NameAliases.txt> as control names.

Agreed, including NOT showing BELL (because it conflicts),
which is good.

> I considered that, but we would need some wording that says that names in NameAliases.txt with the control label should be supported. We'd get rid of the table but we'd have additional wording.
> And we can't say "just support NameAliases.txt" because figments, abbreviations and alternates are currently not supported and that would be a design change.
> I would support this change but I'm not sure it's in scope for this paper.
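
To make the "control label" point concrete, here is a minimal sketch of selecting aliases by type from data in the NameAliases.txt format (code point;alias;type). The SAMPLE text is a small hand-picked excerpt written out for illustration, and the helper names parse_aliases and aliases_of_type are hypothetical; this is not how any implementation or the paper does it.

```python
# Illustrative excerpt in the NameAliases.txt format: code point;alias;type.
# Hand-picked sample lines only; the real file is much longer.
SAMPLE = """\
# comment lines and blank lines are skipped
0007;ALERT;control
0007;BEL;abbreviation
000A;LINE FEED;control
000A;NEW LINE;control
000A;LF;abbreviation
0080;PADDING CHARACTER;figment
01A2;LATIN CAPITAL LETTER GHA;correction
FEFF;BYTE ORDER MARK;alternate
"""


def parse_aliases(text):
    """Yield (code_point, alias, alias_type) for each data line."""
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        code, alias, alias_type = line.split(";")
        yield int(code, 16), alias, alias_type


def aliases_of_type(text, wanted):
    """Map alias -> code point, keeping only the alias types in `wanted`."""
    return {alias: cp for cp, alias, kind in parse_aliases(text) if kind in wanted}


# Only the entries labeled "control" survive the filter.
control = aliases_of_type(SAMPLE, {"control"})
print(sorted(control))
```

With this sample, the filter keeps ALERT, LINE FEED and NEW LINE, and BELL does not appear for U+0007. Saying "just support NameAliases.txt" would correspond to dropping the filter, which also lets in the figment, abbreviation and alternate entries.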

The current draft paper says

> The Unicode Standard subclause "Character Names List"

This is not the correct reference; this subclause (I think it is rather a "chapter" or a "section")
shows the general rules around character names presentation, but doesn't
contain the exhaustive list.  The exhaustive list is scattered around
the "code charts", which are distributed as separate PDFs here:


It would be good to have a single unified PDF (as ISO 10646 does), but no luck,
it seems.


>     Tom.
>>     A careful examination of the 3 standards does not reveal anything I think we should be concerned about besides what I've highlighted in the paper, but please let me know if you have specific questions we need to address.
>>     I would like to point out the mess that is __STDC_ISO_10646__, whose value currently depends on an ISO-10646 version.
>>     In the paper I propose to make that value implementation-defined as it cannot be relied upon except to check if some piece of code has been updated in the past 20+ years.
>>     I've also reworded the deprecated codecvt facilities to not mention UCS-2, getting rid of one more reference.
>>     I've massaged a few places to improve how we reference Unicode properties.
>>     The other thing that is not 100% clear to me is whether we should reference UAX 44, the Derived Core properties and UAX 29 (which we do currently),
>>     or if referencing the Unicode standard implies all of that (I think it does).
>>     I've noticed that the Unicode standard incorrectly references version 14.0 of itself when it means 15.0 but hopefully we understand what is meant.
>>     Thanks,
>>     Corentin