Date: Wed, 14 Sep 2022 11:28:32 +0200
Hey folks.
How was the table of widths in [format] derived?
http://eel.is/c++draft/format#string.std-12.sentence-3
There are two issues here: the standard does not explain how the table was
derived, which makes it hard to evolve,
and the table does require maintenance as the Unicode standard evolves.
Reading the intent of
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1868r2.html,
we do want:
- To treat zero-width code points as width 1
- To treat emojis as width 2
- To treat full-width East Asian code points as width 2.
https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
Given that we have a floating reference to UAX44, I think a better
specification would be to say that code points that have the Unicode
property "Emoji_Presentation" or East_Asian_Width="W" have a width of 2.
This ensures implementations remain coherent as Unicode evolves.
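
To make the suggestion concrete, here is a rough sketch of what a
property-based rule could look like in an implementation. Everything in it
is an assumption for illustration: the names (CodePointRange, kWideSample,
is_wide_or_emoji, estimated_width, estimated_string_width) are hypothetical,
and the three ranges are a tiny hand-picked sample standing in for data that
would really be generated from the Unicode Character Database. It also sums
over every code point, ignoring the grapheme clustering that [format] applies.

```cpp
#include <array>
#include <string_view>

// Hypothetical stand-in for a table generated from the Unicode
// Character Database; not part of any real library.
struct CodePointRange {
    char32_t first;
    char32_t last;
};

// Illustrative sample only, NOT the complete property data.
constexpr std::array<CodePointRange, 3> kWideSample{{
    {0x1100, 0x115F},    // Hangul Jamo initial consonants: East_Asian_Width=W
    {0x4E00, 0x9FFF},    // CJK Unified Ideographs: East_Asian_Width=W
    {0x1F600, 0x1F64F},  // Emoticons: Emoji_Presentation=Yes
}};

// True if cp falls in one of the sample ranges above.
constexpr bool is_wide_or_emoji(char32_t cp) {
    for (const auto& r : kWideSample) {
        if (cp >= r.first && cp <= r.last) return true;
    }
    return false;
}

// Width 2 for Emoji_Presentation or East_Asian_Width=W, 1 for everything
// else (including zero-width code points, as listed above).
constexpr int estimated_width(char32_t cp) {
    return is_wide_or_emoji(cp) ? 2 : 1;
}

// Simplification: sums every code point instead of only the first code
// point of each extended grapheme cluster.
constexpr int estimated_string_width(std::u32string_view text) {
    int total = 0;
    for (char32_t cp : text) total += estimated_width(cp);
    return total;
}
```

With a rule of this shape, only the generated range data changes when a new
Unicode version is released; the wording of the rule itself stays the same.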
Thanks,
Corentin
Received on 2022-09-14 09:28:45