Re: [SG16-Unicode] [isocpp-lib] New issue: Are std::format field widths code units, code points, or something else?

From: Tom Honermann <tom_at_[hidden]>
Date: Fri, 13 Sep 2019 12:15:25 -0400
On 9/13/19 11:42 AM, Corentin Jabot wrote:
> On Fri, 13 Sep 2019 at 15:57, Niall Douglas <s_sourceforge_at_[hidden]> wrote:
> On 13/09/2019 14:36, Victor Zverovich wrote:
> >> Instead of inventing something in the abstract, a good next step would
> >> be to figure out how (in UTF-8 mode) Apple Terminal, Gnome Terminal,
> >> Konsole, and the new Windows Terminal determine how many terminal
> >> display columns a string takes. (I'm not volunteering.)
> >
> > I'm volunteering to do this since improving handling of width is already
> > on my TODO list for the fmt library.
> I'll be interested in what you come up with on this, as I don't think
> this is solvable.
> For example, imagine formatting into a file, and then that file is
> rendered onto a console.
> Another example: imagine formatting into a clipboard, which on Windows
> and POSIX might involve three or four renditions into differing formats
> and encodings. Then the consumer of the clipboard chooses an unknown one
> of those renditions, and reinterprets it in some unknown way into a
> paste into some document.
> Personally speaking, I think the best course is to declare codepoint or
> byte based formatting widths, and draw a line under it.
> Code-points is even less useful than bytes.

Strongly agreed.

Though we should be clear that we're talking about code units here, not
bytes. The difference matters for the wide character set, whose code
units may be more than one byte.
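
To make the code unit vs. byte distinction concrete, here is a minimal
sketch. The helper names `code_units` and `bytes` are made up for this
illustration and are not part of std::format or the fmt library; it assumes
C++17 for std::basic_string_view.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: the number of code units in a string, which is
// exactly what size() reports for any character type.
template <class CharT>
constexpr std::size_t code_units(std::basic_string_view<CharT> s) {
    return s.size();
}

// Hypothetical helper: the number of bytes those code units occupy.
template <class CharT>
constexpr std::size_t bytes(std::basic_string_view<CharT> s) {
    return s.size() * sizeof(CharT);
}

// For char the two counts coincide, because sizeof(char) is 1 by definition.
static_assert(code_units(std::string_view("abc")) == bytes(std::string_view("abc")));

// For a wider code unit type they differ by a factor of sizeof(CharT).
static_assert(bytes(std::u16string_view(u"abc")) ==
              code_units(std::u16string_view(u"abc")) * sizeof(char16_t));
```

The same holds for wchar_t, except that sizeof(wchar_t) varies by platform
(commonly 2 on Windows and 4 on Linux), so a byte-based width would not even
be portable for the same wide string.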

> Bytes or perceived characters, aka EGCs.

Perceived characters would be great if there were a specification that we
could draw from, but alas.

Remember that not all EGCs correspond to visibly perceived characters.
Some are intentionally invisible.
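
As a rough illustration of why none of these counts lines up with what is
displayed, the sketch below counts code points in UTF-8 text. The function
`code_points` and the sample strings are assumptions made up for this
example; the function assumes well-formed UTF-8 and does not attempt EGC
segmentation, which needs the Unicode property tables from UAX #29.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts code points in a UTF-8 string by counting
// every code unit that is not a continuation byte (bit pattern 10xxxxxx).
// Assumes the input is well-formed UTF-8; no validation is performed.
constexpr std::size_t code_points(std::string_view utf8) {
    std::size_t n = 0;
    for (char c : utf8) {
        if ((static_cast<unsigned char>(c) & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// "e" followed by U+0301 COMBINING ACUTE ACCENT, written as UTF-8 bytes:
// 3 code units, 2 code points, 1 perceived character.
constexpr std::string_view combining_sample = "e\xCC\x81";

// "a", U+200B ZERO WIDTH SPACE, "b": 5 code units, 3 code points, but only
// 2 visible characters. The literal is split so that the hex escape does
// not swallow the following 'b'.
constexpr std::string_view zwsp_sample = "a\xE2\x80\x8B" "b";
```

As I read UAX #29, U+200B forms an EGC of its own, so an EGC count would
give 3 for the second sample while a typical terminal advances by only 2
columns.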


> I agree that nothing else has value in the context of the standard
> Niall
> _______________________________________________
> SG16 Unicode mailing list
> Unicode_at_[hidden]
> http://www.open-std.org/mailman/listinfo/unicode

Received on 2019-09-13 18:15:28