Since D1628R0 (Unicode character properties) conflicts with my proposal,
I decided to finish a draft and publish it:
https://github.com/Lyberta/cpp-unicode-fundamental
It proposes five strong types that are intended as the basis for the rest
of the Unicode library:
std::unicode::utf8_code_unit
std::unicode::utf16_code_unit
std::unicode::utf32_code_unit
std::unicode::code_point
std::unicode::scalar_value
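
To make the distinction between these types concrete, here is a minimal sketch. It is not taken from the proposal: the `sketch` namespace, the struct layouts, and the helper predicates are all hypothetical. The only facts it relies on are the Unicode definitions themselves: a code point is any value from 0 to 0x10FFFF, and a scalar value is a code point outside the surrogate range 0xD800 to 0xDFFF.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not the proposal's API

// A code point is any integer in the Unicode codespace, 0 through 0x10FFFF.
constexpr bool is_code_point(std::uint32_t v) noexcept {
    return v <= 0x10FFFF;
}

// A scalar value is a code point that is not a surrogate (0xD800..0xDFFF).
constexpr bool is_scalar_value(std::uint32_t v) noexcept {
    return is_code_point(v) && !(v >= 0xD800 && v <= 0xDFFF);
}

// Hypothetical strong types: distinct structs, so a UTF-8 code unit cannot be
// passed where a UTF-16 code unit is expected without an explicit conversion.
struct utf8_code_unit  { std::uint8_t  value; };
struct utf16_code_unit { std::uint16_t value; };

static_assert(!std::is_convertible_v<utf8_code_unit, utf16_code_unit>,
              "strong types do not convert into each other implicitly");

}  // namespace sketch
```

Under these assumptions, 0xD800 is a code point but not a scalar value, which is the whole reason to keep the two types apart.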
The charX_t types are required to represent code units in C++20.
We also really do not want a code unit API, because you cannot do anything useful with it.
In particular, iterating over code units or querying the properties of code units is probably never useful (and has a propensity to be misused).
Scalar value and grapheme views are indeed useful, IMO. Text is useful too, but it is basically something that can spawn a scalar value or grapheme view, with some storage, high-level invariants, and state.
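
A rough illustration of why code units are the wrong level to iterate at: the sketch below counts code units and scalar values in the same UTF-8 string. The function names and the `sketch` namespace are made up for this example, and the scalar value count assumes the input is well-formed UTF-8 (it only skips continuation bytes, it does not validate).

```cpp
#include <cstddef>
#include <string_view>

namespace sketch {  // hypothetical helpers, for illustration only

// The number of UTF-8 code units is just the byte length.
constexpr std::size_t count_code_units(std::string_view utf8) noexcept {
    return utf8.size();
}

// The number of scalar values, ASSUMING well-formed UTF-8: each scalar value
// contributes exactly one byte that is not a continuation byte (10xxxxxx).
constexpr std::size_t count_scalar_values(std::string_view utf8) noexcept {
    std::size_t n = 0;
    for (char c : utf8) {
        if ((static_cast<unsigned char>(c) & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

}  // namespace sketch
```

For U+00E9 encoded as the bytes C3 A9, this gives two code units but one scalar value; neither byte means anything on its own, which is why properties only make sense at the scalar value level or above.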
Lastly, I am very concerned about a design that would throw by default, especially something like domain_error. It basically means I wouldn't use any standard Unicode facilities, and neither would people in a lot of industries (games, embedded, etc.).
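
As a sketch of what a non-throwing default could look like, here are two checked constructors that never throw. This is not the proposal's interface; the class, the `try_from` and `from_or_replacement` names, and the `sketch` namespace are assumptions made for illustration. One variant reports failure through `std::optional`, the other substitutes U+FFFD REPLACEMENT CHARACTER.

```cpp
#include <cstdint>
#include <optional>

namespace sketch {  // hypothetical, not the proposal's API

constexpr bool is_scalar_value(std::uint32_t v) noexcept {
    return v <= 0x10FFFF && !(v >= 0xD800 && v <= 0xDFFF);
}

// Hypothetical stand-in for a scalar value strong type.
class scalar_value {
public:
    // Checked construction: failure is reported in the return type.
    static constexpr std::optional<scalar_value>
    try_from(std::uint32_t v) noexcept {
        if (!is_scalar_value(v)) {
            return std::nullopt;
        }
        return scalar_value{v};
    }

    // Lossy construction: invalid input becomes U+FFFD.
    static constexpr scalar_value from_or_replacement(std::uint32_t v) noexcept {
        return scalar_value{is_scalar_value(v) ? v : 0xFFFDu};
    }

    constexpr std::uint32_t value() const noexcept { return value_; }

private:
    constexpr explicit scalar_value(std::uint32_t v) noexcept : value_(v) {}
    std::uint32_t value_;
};

}  // namespace sketch
```

Both functions are `noexcept`, so they stay usable in code bases built with exceptions disabled; a throwing overload could still be layered on top for those who want it.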