sg16

Re: Rewording wording for named-universal-characters

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Sun, 27 Feb 2022 09:23:58 +0100
Steve, please make sure to upload your fixed paper to the 2022-03-11 core
telecon wiki, under the "D" name.

On 26/02/2022 09.42, Corentin Jabot wrote:
>
>
> On Fri, Feb 25, 2022 at 11:33 PM Jens Maurer <Jens.Maurer_at_[hidden]> wrote:
>
> On 25/02/2022 23.20, Corentin Jabot wrote:
> > Can we flip it around?
> >
> >
> > Then the named-universal-character designates the element of the translation character set whose UCS scalar value is equal to the code point of that character.
> > Otherwise, the program is ill-formed.
> > [Note: The lists of names and aliases are guaranteed to be disjoint. An n-char sequence will be found in at most one list. --end note]
>
> We want to avoid "matches" because it might mean "some fuzzy match" instead of
> equality.
>
> We want to start the paragraph with the same introducer as the preceding one.
>
>
> This is challenging.
> There are a lot of moving pieces.
> Can we rewrite the previous paragraph too?
>
>
> If the n-char-sequence of a named-universal-character is exactly equal to either
> - The name alias of a character as specified in ISO/IEC 10646 clause 34 "Character names list"
> - The associated name of a character as specified in ISO/IEC 10646 clause 34 "Character names list"
> - A control code alias of a character as specified in table X
> Then the named-universal-character designates the code point of that character.
>
> A universal-character-name designates the character in the translation character set whose UCS scalar value is:
> - For a universal-character-name of the form \u hex-quad or \U hex-quad hex-quad, the hexadecimal number represented by the sequence of hexadecimal-digits in the universal-character-name.
> - For a named-universal-character, the code point it designates.
>
> If a universal-character-name does not designate a UCS scalar value, the program is ill-formed.

I think part of the confusion stemmed from the fact that people were looking at an old
version of the paper, because I hadn't updated the link at the top of the wiki page.

Suggestion:

A universal-character-name that is a named-universal-character designates the
character named by its n-char-sequence. A character is so named if the
n-char-sequence is equal to
  - the associated character name or associated character name alias specified in
ISO/IEC 10646 subclause "Code charts and lists of character names" or
  - the control code alias given in Table X.
The program is ill-formed if there is no such character.
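
For illustration only, here is a minimal sketch of the lookup that the suggested wording describes, not actual compiler code. The two tables are tiny hand-picked excerpts, and the names NamedCharacter, kNamesAndAliases, kControlCodeAliases and lookup_named_character are hypothetical. A real implementation would generate the tables from the Unicode data and from Table X. The comparison is exact equality, with no case folding or other fuzzy matching, and an empty result corresponds to "the program is ill-formed".

```cpp
#include <array>
#include <optional>
#include <string_view>

// Hypothetical table entry: a name and the code point it designates.
struct NamedCharacter {
    std::string_view name;
    char32_t code_point;
};

// Illustrative excerpt of associated character names and name aliases.
// Assumption: a real table would be generated from the ISO/IEC 10646 lists.
inline constexpr std::array<NamedCharacter, 3> kNamesAndAliases{{
    {"LATIN SMALL LETTER A", 0x0061},
    {"ZERO WIDTH NO-BREAK SPACE", 0xFEFF},  // associated character name
    {"BYTE ORDER MARK", 0xFEFF},            // name alias of the same character
}};

// Illustrative excerpt standing in for the control code aliases of "Table X".
inline constexpr std::array<NamedCharacter, 2> kControlCodeAliases{{
    {"LINE FEED", 0x000A},
    {"ESCAPE", 0x001B},
}};

// Returns the code point named by the n-char-sequence, or std::nullopt if no
// character is so named (the case in which the program would be ill-formed).
constexpr std::optional<char32_t>
lookup_named_character(std::string_view n_char_sequence) {
    for (const auto& entry : kNamesAndAliases) {
        if (entry.name == n_char_sequence) {  // exact equality only
            return entry.code_point;
        }
    }
    for (const auto& entry : kControlCodeAliases) {
        if (entry.name == n_char_sequence) {
            return entry.code_point;
        }
    }
    return std::nullopt;
}

// A name and its alias designate the same character.
static_assert(lookup_named_character("BYTE ORDER MARK") == char32_t{0xFEFF});
static_assert(lookup_named_character("ZERO WIDTH NO-BREAK SPACE") == char32_t{0xFEFF});
// Different case is not "equal to" any name, so nothing is found.
static_assert(!lookup_named_character("latin small letter a"));
```

Because the lists are disjoint, the order in which the tables are searched does not affect the result.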

Jens

Received on 2022-02-27 08:24:07