Date: Sat, 7 Sep 2019 20:11:47 -0500
On Sat, Sep 7, 2019 at 7:31 PM Tom Honermann via Lib <lib_at_[hidden]>
wrote:
> On 9/7/19 8:27 PM, Tony V E wrote:
>
> I think we would want it to be measured in glyphs.
>
> I agree that would be ideal, but...
>
Stop right there. If that's ideal, let's do that. Or at least, let's
leave room for it to be done at some point. Specifying CUs now prevents
the ideal from ever being realized.

> Are you suggesting code points because glyphs are too hard?
>
> I don't know how to achieve that. Field width doesn't really work for
> alignment unless one assumes a monospace font. We could measure in terms
> of extended grapheme clusters, but EGCS width has changed over time (e.g.,
> family emoji). That makes alignment dependent on both display properties
> and Unicode version. And, of course, this would drag in locale dependence
> as well.
>
If you just count N=EGCs, you get the "right" answer. If your terminal
shows more or fewer than N characters, get a new terminal. What I mean by
this is that there should be no consideration of fonts.

As for the need for a locale, I don't get that. Grapheme breaking is
simple, and requires no locale info. Do you mean Unicode data? Picking a
version and sticking with it should be sufficient. No system that I know
of has multiple Unicode versions to pick from programmatically.
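To make the three candidate units concrete, here is a minimal sketch that counts the same text in UTF-8 code units, in code points, and in an approximation of EGCs. The function names are hypothetical, and the cluster rule is a deliberately tiny stand-in for the real UAX #29 rules (it only handles combining marks, variation selectors, emoji modifiers, and ZWJ sequences), so treat it as an illustration rather than a conforming implementation. Note that nothing in it consults a locale; it depends only on the Unicode character data bundled with the interpreter.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def code_units_utf8(s: str) -> int:
    """Number of UTF-8 code units (bytes) in s."""
    return len(s.encode("utf-8"))


def code_points(s: str) -> int:
    """Number of Unicode code points in s."""
    return len(s)


def _extends(ch: str) -> bool:
    # Assumption: a small approximation of the "Extend" idea in UAX #29.
    # Covers nonspacing/enclosing marks (which include variation selectors),
    # ZWJ itself, and the emoji skin-tone modifiers U+1F3FB..U+1F3FF.
    return (
        unicodedata.category(ch) in ("Mn", "Me")
        or ch == ZWJ
        or 0x1F3FB <= ord(ch) <= 0x1F3FF
    )


def approx_grapheme_clusters(s: str) -> int:
    """Approximate count of extended grapheme clusters in s.

    Simplification: any code point following a ZWJ is glued to the
    current cluster, which is looser than the real rule.
    """
    count = 0
    prev_was_zwj = False
    for ch in s:
        if count == 0 or not (_extends(ch) or prev_was_zwj):
            count += 1  # ch starts a new cluster
        prev_was_zwj = ch == ZWJ
    return count


# Family emoji: MAN, ZWJ, WOMAN, ZWJ, GIRL
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"
for label, text in (("family", family), ("e-acute", "e\u0301"), ("abc", "abc")):
    print(label, code_units_utf8(text), code_points(text),
          approx_grapheme_clusters(text))
```

The family emoji is 18 code units and 5 code points but a single cluster under this approximation, which is the gap between specifying width in CUs and specifying it in EGCs. The Unicode version in play is whatever the runtime ships (Python reports it as `unicodedata.unidata_version`), which is the "pick a version and stick with it" model.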
Zach
Received on 2019-09-08 03:11:59