sg16: Re: [SG16-Unicode] [isocpp-lib] New issue: Are std::format field widths code units, code points, or something else?

From: Steve Downey <sdowney_at_[hidden]>
Date: Tue, 10 Sep 2019 11:18:50 -0400

Even with monospace fonts, you can't rely on the number of characters once
you are beyond ASCII; Han characters are often double-width.
This kind of formatting can't be made to work in the general case, so we're
left with whatever is least surprising, where almost any choice is going
to be surprising to someone. Given that, I would prefer that it be stable,
and therefore independent of locale.
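
To make the ambiguity concrete, here is a minimal sketch of how the same
string pads differently depending on what the width counts. The helper names
(code_units, code_points, pad_left) are made up for illustration and are not
part of std::format; the code assumes valid UTF-8 input and does not attempt
to estimate display columns, which would need East Asian Width data.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: width measured in UTF-8 code units (bytes).
std::size_t code_units(std::string_view s) {
    return s.size();
}

// Hypothetical helper: width measured in code points.
// Counts every byte that is not a continuation byte (10xxxxxx).
// Assumes the input is valid UTF-8.
std::size_t code_points(std::string_view s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// Hypothetical helper: right-align s in a field of `width`, where the
// caller decides how the content was measured (`measured`).
std::string pad_left(std::string_view s, std::size_t width, std::size_t measured) {
    std::string out;
    if (measured < width) {
        out.assign(width - measured, ' ');
    }
    out += s;
    return out;
}
```

With U+10417 written as the bytes "\xF0\x90\x90\x97", a field width of 8
yields four spaces of padding when measured in code units and seven when
measured in code points. Neither measure says how many columns a terminal
will actually use, but both are independent of locale.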

On Tue, Sep 10, 2019 at 10:36 AM Niall Douglas <s_sourceforge_at_[hidden]>
wrote:

>
> > Perhaps it would be helpful to enumerate what we expect to be portable
> > uses of field widths. My personal take is that they are useful to
> > specify widths for fields where the content is restricted to members of
> > the basic source character set where we already have a guarantee that
> > each character can be represented with one code unit.
>
> Most programmers would use field widths for padding items so they appear
> in a grid. They would expect that 𐐗 padded to eight characters yields
> seven spaces and 𐐗, not four spaces and 𐐗 (because 𐐗 consumes four
> bytes of UTF-8).
>
> That said, as we have no idea how Unicode would get rendered (0, 1, or 4
> characters for 𐐗 being the most likely), I cannot improve on your
> proposal. The situation sucks, quite frankly.
>
> Niall

Received on 2019-09-10 17:19:04