Re: [SG16-Unicode] [isocpp-lib] New issue: Are std::format field widths code units, code points, or something else?

From: Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
Date: Wed, 11 Sep 2019 14:58:40 -0400
[cc: Victor Zverovich, as the person most likely to know about use-cases
for `fmt`]

What is the *use-case* for the field-width format specifier?

If it has no use-case, then WG21 should consider not standardizing it.
WG21 could leave the question of what-it-should-do up to individual
implementors, via conforming extensions to a standard <format>
implementation. If no vendor feels that their customers would benefit from
a field-width specifier, then there simply won't be any implementation of
field-width, and therefore there will be nothing that needs standardizing.

–Arthur




On Tue, Sep 10, 2019 at 11:19 AM Steve Downey via Lib <lib_at_[hidden]>
wrote:

> Even with monospace fonts you can't just rely on number of characters once
> you are beyond ASCII. Han characters are often double wide.
> This kind of formatting can't be made to work in the general case, so
> we're left with what might be least surprising, where almost any choice is
> going to be surprising to someone. Given that, I would prefer that it be
> stable, and therefore independent of locale.
>
> On Tue, Sep 10, 2019 at 10:36 AM Niall Douglas <s_sourceforge_at_[hidden]>
> wrote:
>
>>
>> > Perhaps it would be helpful to enumerate what we expect to be portable
>> > uses of field widths. My personal take is that they are useful to
>> > specify widths for fields where the content is restricted to members of
>> > the basic source character set where we already have a guarantee that
>> > each character can be represented with one code unit.
>>
>> Most programmers would use field widths for padding items so they appear
>> in a grid. They would expect that 𐐗 padded to eight characters yields
>> seven spaces and 𐐗, not four spaces and 𐐗 (because 𐐗 consumes four
>> bytes of UTF-8).
>>
>> That said, as we have no idea how Unicode would get rendered (0, 1, or 4
>> characters for 𐐗 being the most likely), I cannot improve on your
>> proposal. The situation sucks, quite frankly.
>>
>> Niall
>> _______________________________________________
>> SG16 Unicode mailing list
>> Unicode_at_[hidden]
>> http://www.open-std.org/mailman/listinfo/unicode
>>
> _______________________________________________
> Lib mailing list
> Lib_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/lib
> Link to this post: http://lists.isocpp.org/lib/2019/09/13533.php
>
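
To make the numbers in the quoted example concrete, here is a minimal sketch of how the padding for 𐐗 (U+10417) differs when a width of eight is measured in UTF-8 code units versus code points. The helper names (code_units, code_points, pad_left) are hypothetical and are not part of <format> or fmt. The sketch assumes the input is valid UTF-8, and it makes no attempt to estimate display width.

```cpp
#include <cstddef>
#include <string>

// Number of UTF-8 code units (bytes) in s.
std::size_t code_units(const std::string& s) { return s.size(); }

// Number of code points in s, assuming s is valid UTF-8:
// count every byte that is not a continuation byte (10xxxxxx).
std::size_t code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}

// Right-align s in a field of `width`, where the length of s is
// whatever `measure` reports. No truncation if s is already too long.
template <class Measure>
std::string pad_left(const std::string& s, std::size_t width, Measure measure) {
    const std::size_t len = measure(s);
    const std::size_t fill = len < width ? width - len : 0;
    return std::string(fill, ' ') + s;
}
```

Calling pad_left("\xF0\x90\x90\x97", 8, code_units) prepends four spaces, while pad_left("\xF0\x90\x90\x97", 8, code_points) prepends seven. Neither count says how many columns a terminal or font will actually use for the character, which is the part that cannot be settled by the library alone.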

Received on 2019-09-11 20:58:54