Date: Wed, 05 Dec 2018 04:17:00 +0000
This is something that hit me recently. Why are we using fundamental
types for code units? CppCon 2018 is full of people saying that we
should migrate to strong types, that std::size_t should have been a
struct, etc.
I propose we add strong types for code units:
* utf8_code_unit
* utf16_code_unit
* utf32_code_unit
These will hold a char8_t, char16_t and char32_t respectively, but will
not allow invalid values such as 0xC0, 0xC1 and anything above 0xF4 for
UTF-8, or surrogates and values above 0x10FFFF for UTF-32, etc.
This will guarantee that all code units are valid and will allow us to
write much faster code because we will never need to check for invalid
values.
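To make the idea concrete, here is a rough sketch of what two of these
types could look like. Everything beyond the type names is my own
assumption: the is_valid() and value() members, the utf8_storage alias,
and the choice to throw from the constructor are placeholders, not a
worked-out design. It needs C++14 or later.

```cpp
#include <stdexcept>

// char8_t only exists from C++20 on; fall back to unsigned char before that.
// (utf8_storage is a hypothetical helper alias for this sketch.)
#if defined(__cpp_char8_t)
using utf8_storage = char8_t;
#else
using utf8_storage = unsigned char;
#endif

class utf8_code_unit {
public:
    // 0xC0, 0xC1 and 0xF5..0xFF never appear in well-formed UTF-8.
    static constexpr bool is_valid(utf8_storage v) noexcept {
        return v != 0xC0 && v != 0xC1 && v < 0xF5;
    }

    // Assumed error policy: reject invalid values by throwing.
    constexpr explicit utf8_code_unit(utf8_storage v) : value_(v) {
        if (!is_valid(v)) throw std::invalid_argument("invalid UTF-8 code unit");
    }

    constexpr utf8_storage value() const noexcept { return value_; }

private:
    utf8_storage value_;
};

class utf32_code_unit {
public:
    // Surrogates (0xD800..0xDFFF) and values above 0x10FFFF are not allowed.
    static constexpr bool is_valid(char32_t v) noexcept {
        return v <= 0x10FFFF && !(v >= 0xD800 && v <= 0xDFFF);
    }

    // Same assumed error policy as above.
    constexpr explicit utf32_code_unit(char32_t v) : value_(v) {
        if (!is_valid(v)) throw std::invalid_argument("invalid UTF-32 code unit");
    }

    constexpr char32_t value() const noexcept { return value_; }

private:
    char32_t value_;
};
```

The check happens once, at construction. Code that receives a
utf8_code_unit or utf32_code_unit can then call value() without
re-checking the range.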
Received on 2018-12-05 05:26:00