Date: Sun, 8 Sep 2019 12:29:12 -0400
On 9/7/19 11:25 PM, Tom Honermann via Lib wrote:
> On 9/7/19 9:11 PM, Zach Laine wrote:
>> On Sat, Sep 7, 2019 at 7:31 PM Tom Honermann via Lib
>> <lib_at_[hidden]> wrote:
>>
>> On 9/7/19 8:27 PM, Tony V E wrote:
>>> I think we would want it to be measured in glyphs.
>> I agree that would be ideal, but...
>>
>>
>> Stop right there. If that's ideal, let's do that. Or at least,
>> let's leave room for it to be done at some point. Specifying CUs now
>> prevents the ideal from ever being realized.
> There are other options. For example, a future extension could allow
> specifying what units are to be used for field width.
>>
>>> Are you suggesting code points because glyphs are too hard?
>> I don't know how to achieve that. Field width doesn't really
>> work for alignment unless one assumes a monospace font. We could
>> measure in terms of extended grapheme clusters, but EGC width
>> has changed over time (e.g., family emoji). That makes alignment
>> dependent on both display properties and Unicode version. And,
>> of course, this would drag in locale dependence as well.
>>
>>
>> If you just count N=EGCs, you get the "right" answer. If your
>> terminal shows more or fewer than N characters, get a new terminal.
>> What I mean by this is that there should be no consideration of fonts.
> I see field width as either indicating storage (number of code units)
> or alignment. The number of user perceived characters is not useful
> for aligning text unless a monospace font is assumed. Therefore,
> storage seems like the more useful measurement. This also aligns with
> format_to_n and formatted_size which, unless I'm mistaken, work in
> code units. (It would be nice to clarify the wording for these as
> well; what is meant by "number of characters in the character
> representation"?)
Henri Sivonen just today posted a fantastic analysis of the various ways
in which we think about the length/width of a string. Particularly
relevant to this discussion is the "Display Space" section, but I
encourage everyone to read the entire article. It's fascinating!
- https://hsivonen.fi/string-length
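
To make the difference between the candidate units concrete, here is a minimal sketch that counts one string both ways. It is illustrative only and is not taken from any proposal wording: the names count_code_units, count_code_points and family_emoji are hypothetical, the input is assumed to be well-formed UTF-8, and counting extended grapheme clusters is left out because it needs Unicode segmentation data (for example from ICU).

```cpp
#include <cstddef>
#include <string_view>

// Family emoji: U+1F468 ZWJ U+1F469 ZWJ U+1F467, written as explicit
// UTF-8 bytes so the example does not depend on the source encoding.
inline constexpr std::string_view family_emoji =
    "\xF0\x9F\x91\xA8" "\xE2\x80\x8D"
    "\xF0\x9F\x91\xA9" "\xE2\x80\x8D"
    "\xF0\x9F\x91\xA7";

// Storage size: for a UTF-8 string this is simply the number of bytes.
constexpr std::size_t count_code_units(std::string_view s) {
    return s.size();
}

// Code points: count every byte that is not a UTF-8 continuation byte
// (continuation bytes have the bit pattern 10xxxxxx).
// Assumes well-formed UTF-8; no validation is performed.
constexpr std::size_t count_code_points(std::string_view s) {
    std::size_t n = 0;
    for (unsigned char b : s) {
        if ((b & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

For family_emoji this gives 18 code units and 5 code points, while a segmentation algorithm following current Unicode rules would report a single extended grapheme cluster. A field width of 5 therefore means something quite different depending on which unit is chosen.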
Tom.
Received on 2019-09-08 18:29:19