Date: Sat, 27 Apr 2019 17:41:38 +0200
On Sat, Apr 27, 2019 at 01:53:00PM +0000, Lyberta wrote:
> > posix advocates using utf-8 as the external charset and 32-bit 10646 as the internal widechar encoding
>
> But doesn't require it. That's the point.
yes, that is the point. posix does not require unicode; it is character set independent.
and so is c++, and that is the way it should be and remain.
>
> >
> >> As for C, C only has char16_t, char32_t, minimalistic literal support...
> >> I'm not sure about standard library. That's about 2% of Unicode support.
> >
> > also look at iso 14651, 14652
>
> I don't see that in C standard library.
correct.
much of it is implemented in glibc.
> >
> > what is missing? i know: quite a lot, but please exemplify
> >
>
> A full set of std::vector operations on sequences of scalar values.
will normal widechar functionality suffice?
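
for illustration, a minimal sketch of container-style operations on a sequence of 32-bit code units. it uses std::u32string rather than wchar_t strings, since wchar_t is not 32 bits wide on every platform. the helper names remove_code_point and demo_sequence_ops are made up for this example. note that nothing here checks that the elements are valid unicode scalar values; surrogates or out-of-range values can still be stored.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: drop every occurrence of one code point.
std::u32string remove_code_point(std::u32string s, char32_t cp) {
    s.erase(std::remove(s.begin(), s.end(), cp), s.end());
    return s;
}

// Hypothetical demo: vector-like operations on UTF-32 code units.
std::u32string demo_sequence_ops() {
    std::u32string s = U"na\u00EFve";   // five elements: n, a, U+00EF, v, e
    s.push_back(U'!');                  // append at the end
    s.insert(s.begin(), U'\u00BF');     // insert U+00BF at the front
    assert(s.size() == 7);
    return remove_code_point(s, U'!');  // erase-remove, as with std::vector
}
```

this covers the per-element operations; anything that has to treat several code points as one unit (such as grapheme clusters) is outside what the container offers.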
> A full set of std::vector operations on sequences of grapheme clusters.
yes, that is probably new unicode-specific functionality
> Querying properties of scalar values.
some of it is there: isalpha etc.
> Normalization.
not needed. but could be added.
> Case conversion.
toupper, tolower etc.
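
a small sketch of the existing wide-character interfaces for classification and case conversion, using iswalpha and towupper from <cwctype>. the helper names count_alpha and to_upper_wide are made up for this example. results for characters outside ascii depend on the current locale, and towupper maps one character to one character, so conversions that change the length of the string are not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical helper: count characters the current locale calls alphabetic.
std::size_t count_alpha(const std::wstring& s) {
    std::size_t n = 0;
    for (wchar_t c : s) {
        if (std::iswalpha(static_cast<std::wint_t>(c))) ++n;
    }
    return n;
}

// Hypothetical helper: per-character upper-casing with towupper.
std::wstring to_upper_wide(std::wstring s) {
    for (wchar_t& c : s) {
        c = static_cast<wchar_t>(std::towupper(static_cast<std::wint_t>(c)));
    }
    return s;
}

// Hypothetical demo, ASCII only so it holds in the default "C" locale.
void demo_case_ops() {
    assert(count_alpha(L"ab1 c") == 3);
    assert(to_upper_wide(L"keld 42") == L"KELD 42");
}
```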
> Regular expressions.
strcmp and friends
14651 has the fundamentals for all of 10646.
keld
Received on 2019-04-27 17:41:39