Re: [SG16-Unicode] Feedback on P1097R1: U+NNNNNN syntax

From: Martinho Fernandes <rmf_at_[hidden]>
Date: Mon, 9 Jul 2018 12:34:24 +0200

On 06.07.18 23:37, Hubert Tong wrote:
> On Fri, Jul 6, 2018 at 5:31 PM, Tom Honermann <tom_at_[hidden]> wrote:
>
> > On 07/06/2018 05:16 PM, Hubert Tong wrote:
> >
> > > I am wondering if accepting U+(4-6 hex digits) in \N{...} as
> > > Perl does can be considered.
> >
> > It certainly can be, but what is the motivation given that we
> > already have \u and \U? Why is supporting both \u1234 and
> > \N{U+1234} helpful?
>
> Do stylistic choices count? I happen to like naming Unicode characters
> as U+NNNN.

Personally, I really dislike that `\U` requires 8 digits; two of them
are always zeros (maybe we should refer to this notation as `\U00` :D).
Being able to write only six digits is definitely a plus.
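
To make the digit counts concrete, here is a minimal sketch using only the escapes C++ has today. The names `lira_u4`, `lira_u8`, `grinning_face` and the helper `to_u_plus` are hypothetical and only for illustration; none of them come from P1097.

```cpp
#include <cassert>
#include <string>

// U+20BA spelled with the 4-digit and the 8-digit escape: same code point.
constexpr char32_t lira_u4 = U'\u20BA';
constexpr char32_t lira_u8 = U'\U000020BA';
static_assert(lira_u4 == lira_u8, "both escapes name the same code point");

// A supplementary-plane character needs the 8-digit form, even though
// Unicode itself writes it with five digits (U+1F600).
constexpr char32_t grinning_face = U'\U0001F600';
static_assert(grinning_face == 0x1F600, "8-digit escape, two leading zeros");

// Hypothetical helper: format a code point in U+NNNN notation, i.e. at
// least four hex digits and never more than six for valid code points.
std::string to_u_plus(char32_t cp) {
    assert(cp <= 0x10FFFF);  // precondition: inside the Unicode code space
    static const char digits[] = "0123456789ABCDEF";
    std::string hex;
    do {
        hex.insert(hex.begin(), digits[cp & 0xF]);
        cp >>= 4;
    } while (cp != 0);
    if (hex.size() < 4) hex.insert(0, 4 - hex.size(), '0');
    return "U+" + hex;
}
```

With this, `to_u_plus(U'\U0001F600')` yields `"U+1F600"`: the conventional notation needs five digits where the escape needs eight.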

> There is also a possible semantic difference to explore between \u/\U
> and \N{U+...}:
> The \N form should certainly require that a character is assigned in
> Unicode; however, I think assigning a more "raw" meaning to \u/\U could
> make sense.

I'm not sure I like the requirement that a character be assigned for use
with \N{U+...}. Consider: in March 2012, the government of Turkey
selected a sign for their currency, the Turkish lira (U+20BA ₺). Unicode
6.2 was released in September of that year with this sign added (and as a
curiosity, that was the only new character in that release). Now,
suppose I am programming in May 2012, and I am targeting the Turkish
market. I already have fonts that support this character. I want to be
able to write \N{U+20BA} in my code without a compiler update. Sure, I
won't have any of the semantic properties of U+20BA available so I may
need some care with processing, but I can at least interchange it
(because that doesn't require any special handling) and display it
(because I have fonts that support it).
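
As a rough illustration of why interchange needs no special handling, here is a sketch of a UTF-8 encoder. It only does arithmetic on the code point value and never consults any table of assigned characters, so it works for U+20BA whether or not the toolchain knows about Unicode 6.2. The function name `encode_utf8` is hypothetical, not an existing or proposed API.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one Unicode scalar value as UTF-8.
// Only the numeric value matters; no character database is involved.
std::string encode_utf8(char32_t cp) {
    // Precondition: a scalar value (in range and not a surrogate).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, `encode_utf8(0x20BA)` produces the three bytes E2 82 BA, which any UTF-8 consumer with a suitable font can display.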

-- 
Martinho

Received on 2018-07-09 12:41:36