Date: Thu, 28 Mar 2019 07:42:00 +0000
Corentin:
> As requested by Tom, please find attached D1628R0 which will be discussed
> during today's meeting \N{WHITE EXCLAMATION MARK ORNAMENT}
>
> Feedback welcome :)
Do we really want std::uni? std::unicode seems much better.
Unicode always uses the term "code point", not "codepoint":
https://www.unicode.org/glossary/#code_point
So the name should be std::uni[code]::code_point.
In my experience, I never need the code point because surrogates are not
allowed in valid UTF. I only ever need Unicode scalar values:
https://www.unicode.org/glossary/#unicode_scalar_value
Hence I think using code point interfaces should be discouraged.
I think constructing code points or scalar values from char8_t or
char16_t makes no sense. They are at different levels: char8_t and
char16_t hold code units, not code points.
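To illustrate the level mismatch, here is a minimal sketch using UTF-16. The helper name combine_surrogates is hypothetical and not taken from D1628R0 or any proposal. It shows that a single char16_t code unit of U+1F600 is a surrogate, and that the scalar value only appears after two code units are decoded together.

```cpp
// Hypothetical helper (an assumption for illustration): combines a UTF-16
// surrogate pair into the scalar value it encodes.
// Precondition: hi is in [0xD800, 0xDBFF] and lo is in [0xDC00, 0xDFFF].
constexpr char32_t combine_surrogates(char16_t hi, char16_t lo) noexcept {
    return 0x10000
         + ((static_cast<char32_t>(hi) - 0xD800) << 10)
         + (static_cast<char32_t>(lo) - 0xDC00);
}

// U+1F600 needs two UTF-16 code units (plus the terminating null).
constexpr char16_t grinning[] = u"\U0001F600";
static_assert(sizeof(grinning) / sizeof(grinning[0]) == 3, "two units + null");

// Neither code unit is the scalar value; each one is a surrogate.
static_assert(grinning[0] == 0xD83D, "high surrogate");
static_assert(grinning[1] == 0xDE00, "low surrogate");

// Only the decoded pair yields the scalar value.
static_assert(combine_surrogates(grinning[0], grinning[1]) == U'\U0001F600',
              "decoded scalar value");
```

A constructor taking one char16_t could therefore only ever produce a correct result for the Basic Multilingual Plane, which is one way to read the objection above.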
I'm writing a competing proposal where I want to propose
std::unicode_code_point and std::unicode_scalar_value that have explicit
constructors from char32_t and explicit member function .value() to get
char32_t back. I think this is the only way forward. char8_t, char16_t
and char32_t are dumb types that have horrible names; we should only
use them as a transition mechanism.
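A minimal sketch of what the two types described above might look like. Only the type names, the explicit constructor from char32_t, and .value() come from this message. The namespace, the helper predicates, and the choice to throw std::out_of_range on invalid input are assumptions made for illustration, not the design of any actual proposal.

```cpp
#include <stdexcept>

namespace sketch {  // hypothetical namespace, not std

// Code points span U+0000..U+10FFFF; surrogates are U+D800..U+DFFF.
constexpr bool is_code_point(char32_t v) noexcept { return v <= 0x10FFFF; }
constexpr bool is_surrogate(char32_t v) noexcept {
    return v >= 0xD800 && v <= 0xDFFF;
}
// A Unicode scalar value is any code point that is not a surrogate.
constexpr bool is_scalar_value(char32_t v) noexcept {
    return is_code_point(v) && !is_surrogate(v);
}

class unicode_code_point {
public:
    // Assumption: invalid input throws; a real design might use a
    // precondition or a checked factory function instead.
    explicit constexpr unicode_code_point(char32_t v)
        : value_(is_code_point(v) ? v
                                  : throw std::out_of_range("not a code point")) {}
    constexpr char32_t value() const noexcept { return value_; }
private:
    char32_t value_;
};

class unicode_scalar_value {
public:
    explicit constexpr unicode_scalar_value(char32_t v)
        : value_(is_scalar_value(v) ? v
                                    : throw std::out_of_range("not a scalar value")) {}
    constexpr char32_t value() const noexcept { return value_; }
private:
    char32_t value_;
};

// Round trip through the explicit constructor and .value().
static_assert(unicode_scalar_value(U'A').value() == U'A', "round trip");
// A surrogate is a code point but not a scalar value.
static_assert(unicode_code_point(char32_t{0xD800}).value() == 0xD800, "surrogate");
static_assert(!is_scalar_value(0xD800), "surrogates are excluded");

}  // namespace sketch
```

Because both constructors are explicit, a bare char32_t never converts silently, and valid UTF decoding can traffic in unicode_scalar_value alone, leaving unicode_code_point for the rare cases that must represent surrogates.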
I'm gonna try to finish the early draft of my proposal, and after the
release of GCC 9 I'm gonna port my entire code base to its design so I
will have usage experience with it.
Received on 2019-03-28 08:49:49