sg16

Width estimation

From: Corentin <corentin.jabot_at_[hidden]>
Date: Wed, 14 Sep 2022 11:28:32 +0200
Hey folks.

How was the width table in [format] derived?
http://eel.is/c++draft/format#string.std-12.sentence-3

We have two issues here: the lack of explanation in the standard makes it
hard to evolve that table, and the table requires maintenance as the
Unicode standard evolves.

Reading the intent of
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1868r2.html,
we want:

   - To treat zero-width code points as width 1
   - To treat emoji as width 2
   - To treat full-width East Asian characters as width 2

https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

I think a better specification, given that we have a floating reference to
UAX #44, would be to say that code points that have the Unicode property
Emoji_Presentation or East_Asian_Width="W" have a width of 2.

This would ensure that implementations remain consistent as Unicode evolves.
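
To make the proposed rule concrete, here is a rough sketch, not proposed
wording. The names (range, wide_or_emoji, code_point_width, string_width)
are hypothetical, and the table is a small hand-picked excerpt; a real
implementation would presumably generate the ranges from the Unicode
Character Database (EastAsianWidth.txt and emoji-data.txt). It also sums
per code point, which is a simplification, since [format] estimates width
per extended grapheme cluster.

```cpp
#include <cstddef>
#include <string_view>

// Inclusive range of code points.
struct range { char32_t first, last; };

// Hypothetical, hand-picked excerpt of code points that have
// East_Asian_Width="W" or Emoji_Presentation. A real table would be
// generated from the Unicode Character Database, not written by hand.
constexpr range wide_or_emoji[] = {
    {0x1100, 0x115F},   // Hangul Jamo initial consonants (W)
    {0x3041, 0x3096},   // Hiragana (W)
    {0x4E00, 0x9FFF},   // CJK Unified Ideographs (W)
    {0xAC00, 0xD7A3},   // Hangul Syllables (W)
    {0x1F600, 0x1F64F}, // Emoticons (Emoji_Presentation, also W)
};

// Width 2 if the code point is in the table, otherwise 1.
// Zero-width code points are deliberately counted as 1.
constexpr int code_point_width(char32_t cp) {
    for (const range& r : wide_or_emoji)
        if (cp >= r.first && cp <= r.last) return 2;
    return 1;
}

// Simplification: sums the width of every code point rather than
// taking one width per extended grapheme cluster.
constexpr std::size_t string_width(std::u32string_view s) {
    std::size_t total = 0;
    for (char32_t cp : s) total += code_point_width(cp);
    return total;
}
```

With such a table generated from the UCD, picking up a new Unicode version
would mean regenerating the ranges rather than editing the standard.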

Thanks,
Corentin

Received on 2022-09-14 09:28:45