Subject: Re: [SG16-Unicode] D1628R0 (Unicode character properties)
From: Lyberta (lyberta_at_[hidden])
Date: 2019-03-28 02:42:00


Corentin:
> As requested by Tom, please find attach D1628R0 which will be discussed
> during today's meeting \N{WHITE EXCLAMATION MARK ORNAMENT}
>
> Feedback welcome :)

Do we really want std::uni? std::unicode seems much better.

Unicode always uses the term "code point", not "codepoint":
https://www.unicode.org/glossary/#code_point

So the name should be std::uni[code]::code_point.

In my experience, I never need code points because surrogates are not
allowed in valid UTF. I only ever need Unicode scalar values:
https://www.unicode.org/glossary/#unicode_scalar_value

Hence I think using code point interfaces should be discouraged.

I think constructing code points or scalar values from char8_t or
char16_t makes no sense. They are at different levels: char8_t and
char16_t hold code units, and a single code unit is not necessarily a
whole code point.
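
To illustrate the difference in levels, here is a minimal sketch of how two
UTF-16 code units combine into one scalar value. The namespace and function
names are hypothetical, made up for this example, and are not part of D1628R0
or any other proposal.

```cpp
namespace sketch {

// UTF-16 code units in these ranges are halves of a surrogate pair,
// so on their own they do not denote a character.
constexpr bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
constexpr bool is_low_surrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Combines a high/low pair into the scalar value it encodes.
// Precondition (not checked here):
//   is_high_surrogate(high) && is_low_surrogate(low)
constexpr char32_t combine_surrogates(char16_t high, char16_t low) {
    return 0x10000
         + ((static_cast<char32_t>(high) - 0xD800) << 10)
         + (static_cast<char32_t>(low) - 0xDC00);
}

// U+1F600 is encoded in UTF-16 as the pair D83D DE00.
static_assert(combine_surrogates(0xD83D, 0xDE00) == 0x1F600, "");

} // namespace sketch
```

Neither 0xD83D nor 0xDE00 means anything as a code point by itself, which is
why a constructor taking a single char16_t would be misleading.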

I'm writing a competing proposal where I want to propose
std::unicode_code_point and std::unicode_scalar_value that have explicit
constructors from char32_t and explicit member function .value() to get
char32_t back. I think this is the only way forward. char8_t, char16_t
and char32_t are dumb types that have horrible names, we should only
use them as a transition mechanism.
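
A rough sketch of what such types could look like follows. Only the two type
names, the explicit constructor from char32_t and the .value() member come
from the description above. The namespace, the range checks, the choice to
throw on invalid input and the is_surrogate() helper are assumptions made for
illustration; the actual proposal may well differ.

```cpp
#include <stdexcept>
#include <type_traits>

namespace sketch {

// Any value in the Unicode code space 0..0x10FFFF, surrogates included.
class unicode_code_point {
public:
    constexpr explicit unicode_code_point(char32_t v) : value_(v) {
        // Assumption: out-of-range input is reported by throwing.
        if (v > 0x10FFFF) throw std::out_of_range("beyond U+10FFFF");
    }
    constexpr char32_t value() const noexcept { return value_; }
    constexpr bool is_surrogate() const noexcept {
        return value_ >= 0xD800 && value_ <= 0xDFFF;
    }
private:
    char32_t value_;
};

// A code point that is not a surrogate: 0..0xD7FF or 0xE000..0x10FFFF.
class unicode_scalar_value {
public:
    constexpr explicit unicode_scalar_value(char32_t v) : value_(v) {
        if (v > 0x10FFFF || (v >= 0xD800 && v <= 0xDFFF))
            throw std::out_of_range("not a Unicode scalar value");
    }
    constexpr char32_t value() const noexcept { return value_; }
private:
    char32_t value_;
};

// The constructors are explicit, so a bare char32_t never converts silently.
static_assert(!std::is_convertible<char32_t, unicode_scalar_value>::value, "");
static_assert(!std::is_convertible<char32_t, unicode_code_point>::value, "");

} // namespace sketch
```

With this shape, sketch::unicode_scalar_value(U'A').value() gives back U'A',
while constructing a scalar value from 0xD800 is rejected even though the
same number is accepted as a code point.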

I'm gonna try to finish the early draft of my proposal, and after the
release of GCC 9 I'm gonna port my entire code base to its design so I
will have usage experience with it.
