Subject: Re: [SG16-Unicode] It's Time to Stop Adding New Features for Non-Unicode Execution Encodings in C++
From: keld_at_[hidden]
Date: 2019-04-27 10:41:38
On Sat, Apr 27, 2019 at 01:53:00PM +0000, Lyberta wrote:
> > posix advocates using utf-8 as the external charset and 32-bit 10646 as internal widechar encoding
>
> But doesn't require it. That's the point.
yes, that is the point. posix does not require unicode, it is character set independent.
and so is c++, and that is the way it should be and remain.
>
> >
> >> As for C, C only has char16_t, char32_t, minimalistic literal support...
> >> I'm not sure about standard library. That's about 2% of Unicode support.
> >
> > also look at iso 14651, 14652
>
> I don't see that in C standard library.
correct.
much of it is implemented in glibc.
> >
> > what is missing? i know: quite a lot, but please give examples
> >
>
> A full set of std::vector operations on sequences of scalar values.
will normal widechar functionality suffice?
> A full set of std::vector operations on sequences of grapheme clusters.
yes, that is probably new, unicode-specific functionality
> Querying properties of scalar values.
some of it is there: isalpha etc.
> Normalization.
not needed. but could be added.
> Case conversion.
toupper, tolower etc.
> Regular expressions.
strcmp and friends
14651 has the fundamentals for all of 10646.
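a rough sketch of the widechar counterparts of the functions mentioned above (iswalpha, towupper from <cwctype>), assuming the default "C" locale, where only ascii letters are guaranteed to be classified and converted; results for other characters depend on the locale in effect. the helper names to_upper_wide, count_alpha and demo are made up for illustration and are not part of any standard:

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical helper: uppercase each wide character with std::towupper.
// The mapping is locale-dependent; characters without a mapping are unchanged.
std::wstring to_upper_wide(std::wstring s) {
    for (wchar_t& c : s) {
        c = static_cast<wchar_t>(std::towupper(static_cast<std::wint_t>(c)));
    }
    return s;
}

// Hypothetical helper: count the characters that std::iswalpha classifies
// as alphabetic in the current locale.
std::size_t count_alpha(const std::wstring& s) {
    std::size_t n = 0;
    for (wchar_t c : s) {
        if (std::iswalpha(static_cast<std::wint_t>(c))) {
            ++n;
        }
    }
    return n;
}

// Usage with ascii-only input, which behaves the same in every locale.
void demo() {
    assert(to_upper_wide(L"abc1") == L"ABC1");
    assert(count_alpha(L"ab1 c") == 3);
}
```

these functions work one wchar_t at a time, so they cover per-character queries and simple case mapping, not operations on grapheme clusters.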
keld
SG16 list run by sg16-owner@lists.isocpp.org