Re: [SG16] D1949R4 - Unicode Identifiers

From: Zach Laine <whatwasthataddress_at_[hidden]>
Date: Wed, 27 May 2020 12:03:43 -0500
On Wed, May 27, 2020 at 12:01 PM Jens Maurer via SG16
<sg16_at_[hidden]> wrote:
>
> On 26/05/2020 22.51, Steve Downey via SG16 wrote:
> > Find attached a draft of the UAX31 paper for discussion.
> > Viewable at http://htmlpreview.github.io/?https://github.com/steve-downey/papers/blob/master/generated/p1949.html
> > Source at https://github.com/steve-downey/papers/blob/master/p1949.md
>
> I had asked earlier for some prose-text statement on the difficulty
> of checking NFC.
>
> I can only find
>
> "Detection of un-normalized text is fairly straight-forward, and GCC 10 already produces a warning. Normalizing to NFC is not much more difficult."
>
> which is lacking a bit of depth.
>
> What exactly do I have to do to check for NFC? Check some bits in the code points?
> Consult some Unicode tables? Something else?

You have to look up each adjacent pair of code points in a table, and
verify that they form a valid NFC sequence.
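
As a rough illustration of that pairwise check, here is a minimal Python
sketch. It is not how a compiler would do it: instead of consulting the
raw Unicode tables, it leans on the standard library's
unicodedata.normalize, and the function names is_nfc and
first_suspect_pair are made up for this example. The pairwise scan is a
simplification that can miss cases spanning more than two code points,
so is_nfc stays the reference answer. A table-driven checker would
typically follow the quick-check algorithm in UAX #15, which uses the
NFC_Quick_Check property and canonical combining classes.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Reference check: text is NFC if normalizing it changes nothing."""
    return unicodedata.normalize("NFC", text) == text


def first_suspect_pair(text: str):
    """Return the index of the first adjacent pair that is not NFC by itself.

    Hypothetical helper for illustration only. It flags a pair when the
    two code points would compose, decompose, or be reordered under NFC.
    It can miss problems that involve non-adjacent code points, and it
    never inspects a one-code-point string, so treat None as "nothing
    found", not as proof that the text is normalized.
    """
    for i in range(len(text) - 1):
        pair = text[i:i + 2]
        if unicodedata.normalize("NFC", pair) != pair:
            return i
    return None


if __name__ == "__main__":
    samples = {
        "precomposed": "caf\u00e9",      # U+00E9, already NFC
        "decomposed": "cafe\u0301",      # 'e' + U+0301, composes under NFC
        "misordered": "q\u0301\u0323",   # marks not in canonical order
    }
    for name, text in samples.items():
        print(name, is_nfc(text), first_suspect_pair(text))
```

The point of the second function is only to show where in an identifier
a diagnostic could point; a real implementation would replace the
normalize call with direct table lookups.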

Zach

Received on 2020-05-27 12:06:55