Re: [SG16-Unicode] [isocpp-lib] New issue: Are std::format field widths code units, code points, or something else?

From: Victor Zverovich <victor.zverovich_at_[hidden]>
Date: Wed, 11 Sep 2019 12:34:22 -0700
To the best of my knowledge the main use case for width is to align the
text when viewed in a terminal or an editor with a monospace font. The
second use case is padding the output with '\0', space, or some other
character to satisfy some width requirements. It is somewhat limited in
addressing the first use case because, as pointed out by Tom, it's hard to
solve this problem in general (but it's possible to have decent
approximations), but it works if you restrict your inputs which is what
often happens in practice.
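
To make the candidate interpretations from the subject line concrete, here is a minimal sketch. It is not how fmt or any standard library implements width; the names `code_units`, `code_points`, `estimated_columns`, and `pad_left` are hypothetical, the input is assumed to be valid UTF-8, and the column estimate is a deliberately crude approximation that only treats CJK Unified Ideographs (U+4E00..U+9FFF) as double-width.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Interpretation 1: width counts UTF-8 code units (bytes).
std::size_t code_units(std::string_view s) { return s.size(); }

// Interpretation 2: width counts code points. Assuming valid UTF-8,
// every byte that is not a continuation byte (10xxxxxx) starts one.
std::size_t code_points(std::string_view s) {
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}

// Interpretation 3: width counts estimated terminal columns.
// Crude assumption: only U+4E00..U+9FFF is two columns wide,
// everything else is one column. Real code would need Unicode data.
std::size_t estimated_columns(std::string_view s) {
    std::size_t cols = 0;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        // Sequence length from the lead byte (valid UTF-8 assumed).
        std::size_t len = b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4;
        char32_t cp = len == 1 ? b : b & (0xFF >> (len + 1));
        for (std::size_t k = 1; k < len && i + k < s.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
        cols += (cp >= 0x4E00 && cp <= 0x9FFF) ? 2 : 1;
        i += len;
    }
    return cols;
}

// Left-pad `s` with spaces up to `width`, where `measured` is the
// length of `s` under whichever interpretation the caller picked.
std::string pad_left(std::string_view s, std::size_t measured,
                     std::size_t width) {
    std::string out;
    if (measured < width) out.assign(width - measured, ' ');
    out += s;
    return out;
}
```

For a character such as U+10417, which takes four UTF-8 code units, `pad_left(s, code_units(s), 8)` adds four spaces while `pad_left(s, code_points(s), 8)` adds seven. For restricted inputs such as ASCII, all three measures agree, which is the case where width works well in practice.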

> If no vendor feels that their customers would benefit from a field-width
> specifier, then there simply won't be any implementation of field-width,
> and therefore there will be nothing that needs standardizing.

That is the worst possible option in my opinion because it will create a
portability nightmare similar to the one that currently exists with printf
extensions.

Cheers,
Victor


On Wed, Sep 11, 2019 at 11:58 AM Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
wrote:

> [cc: Victor Zverovich, as the person most likely to know about use-cases
> for `fmt`]
>
> What is the *use-case* for the field-width format specifier?
>
> If it has no use-case, then WG21 should consider not standardizing it.
> WG21 could leave the question of what-it-should-do up to individual
> implementors, via conforming extensions to a standard <format>
> implementation. If no vendor feels that their customers would benefit from
> a field-width specifier, then there simply won't be any implementation of
> field-width, and therefore there will be nothing that needs standardizing.
>
> –Arthur
>
>
>
>
> On Tue, Sep 10, 2019 at 11:19 AM Steve Downey via Lib <
> lib_at_[hidden]> wrote:
>
>> Even with monospace fonts you can't just rely on number of characters
>> once you are beyond ASCII. Han characters are often double wide.
>> This kind of formatting can't be made to work in the general case, so
>> we're left with what might be least surprising, where almost any choice is
>> going to be surprising to someone. Given that, I would prefer that it be
>> stable, and therefore independent of locale.
>>
>> On Tue, Sep 10, 2019 at 10:36 AM Niall Douglas <s_sourceforge_at_[hidden]>
>> wrote:
>>
>>>
>>> > Perhaps it would be helpful to enumerate what we expect to be portable
>>> > uses of field widths. My personal take is that they are useful to
>>> > specify widths for fields where the content is restricted to members of
>>> > the basic source character set where we already have a guarantee that
>>> > each character can be represented with one code unit.
>>>
>>> Most programmers would use field widths for padding items so they appear
>>> in a grid. They would expect that 𐐗 padded to eight characters yields
>>> seven spaces and 𐐗, not four spaces and 𐐗 (because 𐐗 consumes four
>>> bytes of UTF-8).
>>>
>>> That said, as we have no idea how Unicode would get rendered (0, 1, or 4
>>> characters for 𐐗 being the most likely), I cannot improve on your
>>> proposal. The situation sucks, quite frankly.
>>>
>>> Niall
>>> _______________________________________________
>>> SG16 Unicode mailing list
>>> Unicode_at_[hidden]
>>> http://www.open-std.org/mailman/listinfo/unicode
>>>
>> _______________________________________________
>> Lib mailing list
>> Lib_at_[hidden]
>> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/lib
>> Link to this post: http://lists.isocpp.org/lib/2019/09/13533.php
>>

Received on 2019-09-11 21:34:36