Re: [SG16-Unicode] D1628R0 (Unicode character properties)

From: Corentin <corentin.jabot_at_[hidden]>
Date: Wed, 27 Mar 2019 23:04:57 +0100
On Wed, 27 Mar 2019 at 17:40 Markus Scherer <markus.icu_at_[hidden]> wrote:

> Hi Tom & SG16,
>
> First, sorry for having dropped off -- I have been swamped with other work
> and won't make it to today's meeting either.
>
> Second, I would like to ask you to consider if it's necessary to add
> Unicode properties APIs in the language runtime.
> There are widely used libraries like ICU which provide this and more.
>
> Many users will want to be able to use the latest version of Unicode,
> which will tend to be newer than what their compiler provides.
> There are also enough changes in Unicode properties that data structures
> or parsers etc. sometimes need to be adjusted, so you have a maintenance
> burden.
> (I have been doing this for some 19 years.)
>

Nothing precludes an implementation from deferring to ICU both at compile
time and at run time.
However, I have found relying on the ICU shipped with the platform to be
problematic: ICU may keep up with Unicode, but operating systems definitely
don't keep up with ICU.
It's also difficult to deploy ICU on memory-constrained devices or on devices
that can't allocate, throw exceptions, etc.


>
> And finally, I personally think that the ROI for the name property is low.
> As noted in the document, the data is large, but also a long \N{dozens of
> letters} string is not very readable. I find it's just as easy to use
> \uhhhh escapes with a simple code comment for which character that is, and
> if it's obvious (like a regular printable letter) you use the character
> itself anyway.
>

\N is a separate paper, namely https://wg21.link/p1097r2
I think there are some valid use cases for the name property. For example, if
you are writing an editor, an IDE, or any kind of input checking, it might be
more user-friendly to say "unexpected space at line 1" rather than
"unexpected \u0020".

Whether that is a valid enough use case to warrant being in the standard is
up for debate. If implementers use ICU, the cost of implementation is low.
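
To make the diagnostic use case above concrete, here is a minimal sketch. It
assumes a hypothetical, hand-written kNames table standing in for a real name
property lookup (which an implementation could back with ICU or its own data);
describe and diagnostic are illustrative names, not part of any proposal.

```cpp
#include <array>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical excerpt of the Unicode name data; a real implementation
// would query the full name property instead of this tiny table.
struct NameEntry {
    char32_t cp;
    std::string_view name;
};

constexpr std::array<NameEntry, 4> kNames{{
    {0x0020, "SPACE"},
    {0x00A0, "NO-BREAK SPACE"},
    {0x200B, "ZERO WIDTH SPACE"},
    {0x2755, "WHITE EXCLAMATION MARK ORNAMENT"},
}};

// Returns the character name if the table knows it,
// otherwise falls back to the U+XXXX notation.
std::string describe(char32_t cp) {
    for (const auto& entry : kNames) {
        if (entry.cp == cp) {
            return std::string(entry.name);
        }
    }
    char buf[16];
    std::snprintf(buf, sizeof buf, "U+%04X", static_cast<unsigned>(cp));
    return buf;
}

// Builds a user-facing message such as an editor or input checker might show.
std::string diagnostic(char32_t cp, int line) {
    return "unexpected " + describe(cp) + " at line " + std::to_string(line);
}
```

With this table, diagnostic(0x0020, 1) yields "unexpected SPACE at line 1",
while a code point missing from the table, such as 0x0041, is reported as
"U+0041".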





>
> Best regards,
> markus
>
> On Wed, Mar 27, 2019 at 8:42 AM Corentin <corentin.jabot_at_[hidden]> wrote:
>
>> As requested by Tom, please find attached D1628R0, which will be discussed
>> during today's meeting \N{WHITE EXCLAMATION MARK ORNAMENT}
>>
>> Feedback welcome :)
>>
>> Regards,
>> Corentin
>>
> _______________________________________________
>> SG16 Unicode mailing list
>> Unicode_at_[hidden]
>> http://www.open-std.org/mailman/listinfo/unicode
>>
>

Received on 2019-03-27 23:05:10