Date: Tue, 12 Nov 2019 17:41:48 +0100
On Tue, 12 Nov 2019 at 16:58, Billy O'Neal (VC LIBS) via Lib-Ext <
lib-ext_at_[hidden]> wrote:
> During review of some Unicode stuff in LWG we had a mini discussion for
> some folks about grapheme clusters and I mentioned everyone who touches
> this stuff might understand the complexities better if they read this:
>
> https://manishearth.github.io/blog/2017/01/14/stop-ascribing-meaning-to-unicode-code-points/
>
+1
FYI, SG-16 is aware of that blog post, and I think there is pretty strong
agreement with it.
Codepoints have some uses (notably, the Unicode Character Database is really
the Unicode Codepoint Database, and most Unicode algorithms work on
codepoints), but any kind of user-facing UX should deal with extended
grapheme clusters (EGCs).
That is not always what applications choose to do, for a variety of reasons.
Notably, Twitter's character count deals in codepoints, web browsers'
search functions use codepoints so as to ignore diacritics, and comparisons
can be done on (normalized) codepoint sequences.
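
To make the codepoint-level cases above concrete, here is a rough Python
sketch using only the standard unicodedata module. The helper names
(code_point_count, canonically_equal, strip_diacritics) are made up for
illustration, and choosing NFC for comparison and NFD for stripping marks is
an assumption about one reasonable approach, not a description of what
Twitter or any browser actually does. Note that unicodedata does not provide
grapheme cluster segmentation, so this only shows the codepoint side.

```python
import unicodedata


def code_point_count(text: str) -> int:
    # A Python str is a sequence of code points, so len() counts code points,
    # not user-perceived characters (extended grapheme clusters).
    return len(text)


def canonically_equal(a: str, b: str) -> bool:
    # Compare the two strings as normalized code point sequences.
    # NFC is an assumed choice here; NFD would work equally well.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def strip_diacritics(text: str) -> str:
    # Decompose, then drop combining marks (nonzero canonical combining class).
    # This is a crude approximation of diacritic-insensitive matching.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


composed = "\u00e9"      # "é" as a single precomposed code point
decomposed = "e\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

# Same user-perceived character, different code point counts.
assert code_point_count(composed) == 1
assert code_point_count(decomposed) == 2

# A raw comparison fails; a comparison after normalization succeeds.
assert composed != decomposed
assert canonically_equal(composed, decomposed)

# Searching for "cafe" finds "café" once diacritics are stripped.
assert "cafe" in strip_diacritics("un caf\u00e9 noir")
```

Both spellings of "é" would be shown to a user as one character, which is
why a codepoint count is a poor fit for anything user-facing.
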
There is also not always a 1-1 mapping between what people understand as a
"character", grapheme clusters, and glyphs.
>
>
> Billy3
> _______________________________________________
> Lib-Ext mailing list
> Lib-Ext_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/lib-ext
> Link to this post: http://lists.isocpp.org/lib-ext/2019/11/13606.php
>
Received on 2019-11-12 17:42:01