On Tue, Sep 10, 2019 at 10:36 AM Niall Douglas <s_sourceforge@nedprod.com> wrote:

> Perhaps it would be helpful to enumerate what we expect to be portable
> uses of field widths.  My personal take is that they are useful to
> specify widths for fields where the content is restricted to members of
> the basic source character set where we already have a guarantee that
> each character can be represented with one code unit.

Most programmers would use field widths for padding items so they appear
in a grid. They would expect that 𐐗 padded to eight characters yields
seven spaces and 𐐗, not four spaces and 𐐗 (because 𐐗 consumes four
bytes of UTF-8).
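
To make the arithmetic concrete, here is a minimal sketch using printf-style widths, which count bytes for char strings. The helper name pad_by_code_units is hypothetical, and the bytes in the usage note encode U+10417, a four-byte Deseret letter standing in for the character above. It is only an analogy for the code-unit interpretation of field widths, not a statement about what any proposal specifies.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: right-aligns `s` in a field of `width` via "%*s".
// For char strings the width is measured in bytes (code units),
// so a four-byte UTF-8 character already "uses up" four columns.
std::string pad_by_code_units(const std::string& s, int width) {
    // First call computes the required length; second call writes the result.
    int n = std::snprintf(nullptr, 0, "%*s", width, s.c_str());
    if (n < 0) return {};
    std::string out(static_cast<std::size_t>(n) + 1, '\0');
    std::snprintf(&out[0], out.size(), "%*s", width, s.c_str());
    out.resize(static_cast<std::size_t>(n));  // drop the terminating NUL
    return out;
}
```

Calling pad_by_code_units("\xF0\x90\x90\x97", 8) gives a string of eight bytes: four spaces followed by the four bytes of the character, rather than the seven spaces a reader lining up a grid would expect.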

That said, as we have no idea how Unicode would get rendered (0, 1, or 4
characters for 𐐗 being the most likely), I cannot improve on your
proposal. The situation sucks, quite frankly.

     One of the benefits of using code units for char and wchar_t here is that, even if it's visually wrong, it's dependably wrong. I can pass char-based UTF-8 and know exactly how to mitigate the problem if I care, and on all platforms I will have exactly the same problem, regardless of whether the program is deployed on a Turkish, German, or Japanese machine. This, combined with the ability to not do anything with std::locale for char and wchar_t, is extremely valuable (if frustrating for those who care).
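
One possible mitigation, sketched under assumptions: the caller knows the char data is valid UTF-8 and does the padding itself by counting code points instead of relying on the field width. The names count_code_points and pad_left_code_points are hypothetical. Code points are still not display columns (combining marks and wide characters break that), so this only addresses the byte-count problem.

```cpp
#include <cstddef>
#include <string>

// Counts code points in (assumed valid) UTF-8 by skipping
// continuation bytes, which all have the bit pattern 10xxxxxx.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Left-pads with spaces so the result holds at least `width` code points.
// Input that is already wide enough is returned unchanged.
std::string pad_left_code_points(const std::string& utf8, std::size_t width) {
    const std::size_t have = count_code_points(utf8);
    const std::size_t fill = width > have ? width - have : 0;
    return std::string(fill, ' ') + utf8;
}
```

With this, pad_left_code_points("\xF0\x90\x90\x97", 8) yields seven spaces followed by the four-byte character, and it does so identically on every platform because no locale is consulted.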

     char and wchar_t are portability dead ends; let's leave them to the mess that they are and focus on having a really good story for char8_t, char16_t, and char32_t.