Re: Width estimation

From: Corentin <corentin.jabot_at_[hidden]>
Date: Fri, 16 Sep 2022 00:07:08 +0200
Thanks a lot for your reply.

To clarify, I meant the C++ standard,
i.e. this list: http://eel.is/c++draft/format#string.std-12.sentence-3

My understanding is that it was derived from the East_Asian_Width values of
an older Unicode version (5.0) and then modified.
I'm pushing for C++ to treat all East_Asian_Width W and F codepoints as
having width 2 for the purpose of terminal width estimation, which is
standard practice.
Of course, any help from Unicode on how to treat other codepoints would be
much appreciated, especially a list of zero-width codepoints!
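
To make the W/F rule concrete, here is a minimal sketch, not proposed
wording. The names cp_range, wide_ranges, estimated_width and
estimated_string_width are hypothetical, and the table is a small
illustrative subset of the East_Asian_Width W and F ranges rather than the
full list. Everything not in the table is assumed to have width 1.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string_view>

// Hypothetical helper type: an inclusive range of codepoints.
struct cp_range {
    char32_t first;
    char32_t last;
};

// Illustrative, INCOMPLETE subset of East_Asian_Width W/F ranges.
// Must stay sorted and non-overlapping for the binary search below.
inline constexpr cp_range wide_ranges[] = {
    {0x1100, 0x115F},   // Hangul Jamo leading consonants (W)
    {0x4E00, 0x9FFF},   // CJK Unified Ideographs (W)
    {0xAC00, 0xD7A3},   // Hangul syllables (W)
    {0xFF01, 0xFF60},   // Fullwidth ASCII variants (F)
    {0xFFE0, 0xFFE6},   // Fullwidth signs (F)
    {0x1F600, 0x1F64F}, // Emoticons (W)
};

// Returns 2 if cp falls in one of the ranges above, otherwise 1.
inline int estimated_width(char32_t cp) {
    // Find the first range whose upper bound is not below cp.
    auto it = std::lower_bound(
        std::begin(wide_ranges), std::end(wide_ranges), cp,
        [](const cp_range& r, char32_t c) { return r.last < c; });
    return (it != std::end(wide_ranges) && it->first <= cp) ? 2 : 1;
}

// Sums the per-codepoint estimates over a UTF-32 string.
inline std::size_t estimated_string_width(std::u32string_view text) {
    std::size_t total = 0;
    for (char32_t cp : text) {
        total += static_cast<std::size_t>(estimated_width(cp));
    }
    return total;
}
```

In a real implementation the table would presumably be generated from
EastAsianWidth.txt for a given Unicode version, which is what would keep it
in sync as Unicode evolves.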



On Thu, Sep 15, 2022, 23:58 Steven R. Loomis <srl295_at_[hidden]> wrote:

> Hi. Briefly, we’ve discussed this issue a little bit at UTC, and I’ve
> tried to engage terminal emulator vendors, who probably need to be
> part of the discussion.
>
> I’m not sure about “Lack of explanation in the standard”; I think wording
> was added to make these updates out of scope.
>
> I can try to dig up previous discussion if needed.
>
> -s
>
> --
> Steven R. Loomis
> Code Hive Tx, LLC
> https://codehivetx.us
>
>
>
> On Sep 14, 2022, at 4:28 AM, Corentin via SG16 <sg16_at_[hidden]>
> wrote:
>
> Hey folks.
>
> How was the table of width in [format] derived?
> http://eel.is/c++draft/format#string.std-12.sentence-3
>
> We have 2 issues here: Lack of explanation in the standard makes it hard
> to evolve that table,
> and it does require maintenance as the Unicode standard evolves.
>
> Reading the intent of
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1868r2.html,
>
> We do want:
>
> - To treat 0-width codepoints as 1
> - To treat emojis as 2
> - To treat fullwidth East Asian codepoints as 2
>
> https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
>
> I think a better specification, given that we have a floating reference to
> UAX #44, would be to say that codepoints that have the Unicode property
> "Emoji_Presentation" or East_Asian_Width="W" have a width of 2.
>
> This ensures implementations remain coherent as Unicode evolves.
>
> Thanks,
> Corentin
>
>
>
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>
>
>

Received on 2022-09-15 22:07:20