SG16

Subject: Re: [SG16-Unicode] [isocpp-direction] DG answer to the Unicode Direction paper (P1238R0)
From: Lyberta (lyberta_at_[hidden])
Date: 2019-01-10 12:51:00


Tony V E:
> If we think that in 5 or 10 or 15 years the world (ie platforms we care
> about) will finally realize UTF-8 is the right answer, maybe we should just
> support that, and just leave enough space that makes other encodings
> possible, but not required.
Considering the power of C++ templates, providing other encodings should
be fairly easy. I think we should ship UTF-8, UTF-16 and UTF-32 in the
standard library because the differences between those encodings are
small compared to the rest of Unicode: grapheme cluster iteration,
normalization and so on.

> • §4.1. We like the idea of std::text and std::text_view with more
> suitable interfaces than the (bloated) std::string one. We wonder how
> encodings will be presented to/in the type system.

I'm writing a proof-of-concept library that doesn't use std::string
and uses strong types for code units. I hope to eventually write a paper
based on my implementation. I'm 100% for deprecation and eventual
removal of std::string.
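
To illustrate what "strong types for code units" could mean, here is a
minimal sketch. It is not the library mentioned above; every name in it
(utf8_code_unit, make_utf8_unit, count_code_points, ...) is hypothetical
and chosen only for this example. It assumes C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical strong types: scoped enums do not convert implicitly
// to or from char, integers, or each other.
enum class utf8_code_unit : std::uint8_t {};
enum class utf16_code_unit : std::uint16_t {};

// Construction from a raw byte has to be spelled out explicitly.
constexpr utf8_code_unit make_utf8_unit(std::uint8_t value) noexcept {
    return static_cast<utf8_code_unit>(value);
}

constexpr std::uint8_t to_integer(utf8_code_unit unit) noexcept {
    return static_cast<std::uint8_t>(unit);
}

// A UTF-8 continuation byte has the bit pattern 10xxxxxx.
constexpr bool is_continuation(utf8_code_unit unit) noexcept {
    return (to_integer(unit) & 0xC0u) == 0x80u;
}

// The compiler rejects accidental mixing of code unit types.
static_assert(!std::is_same_v<utf8_code_unit, utf16_code_unit>);
static_assert(!std::is_convertible_v<char, utf8_code_unit>);
static_assert(!std::is_convertible_v<utf8_code_unit, utf16_code_unit>);

// Counts code points in well-formed UTF-8 by skipping continuation
// bytes. This sketch does not validate its input.
inline std::size_t count_code_points(
    const std::vector<utf8_code_unit>& units) {
    std::size_t count = 0;
    for (utf8_code_unit unit : units) {
        if (!is_continuation(unit)) {
            ++count;
        }
    }
    return count;
}

// Usage: "A" followed by U+00E9, which UTF-8 encodes as C3 A9.
inline void example() {
    const std::vector<utf8_code_unit> units = {
        make_utf8_unit(0x41), make_utf8_unit(0xC3), make_utf8_unit(0xA9)};
    assert(count_code_points(units) == 2);
}
```

The point of the scoped enums is that passing a plain char buffer or a
UTF-16 code unit where UTF-8 is expected becomes a compile error rather
than a runtime surprise.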

I see at least 2 template parameters for std::text: the encoding and
the normalization form. I think that normalization should be a
compile-time option so that it would be much easier to write high-level
Unicode algorithms. Of course, we can always add a "no normalization"
tag that will force the most inefficient algorithms.
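
A rough sketch of how the two template parameters might look follows.
This is not a proposed interface: basic_text, the encoding tags, the
normalization tags and canonically_equal are all hypothetical names. The
sketch uses plain char and char16_t as code units for brevity, and it
assumes the caller supplies data that is already in the stated
normalization form (a real implementation would normalize or verify on
construction). It assumes C++17.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical encoding tags, each naming its code unit type.
struct utf8_encoding { using code_unit = char; };
struct utf16_encoding { using code_unit = char16_t; };

// Hypothetical normalization tags.
struct nfc {};
struct nfd {};
struct no_normalization {};

template <typename N>
inline constexpr bool is_normalized_v =
    !std::is_same_v<N, no_normalization>;

template <typename Encoding, typename Normalization = no_normalization>
class basic_text {
public:
    using encoding = Encoding;
    using normalization = Normalization;
    using storage = std::basic_string<typename Encoding::code_unit>;

    // Assumption: `units` is already in the `Normalization` form.
    explicit basic_text(storage units) : units_(std::move(units)) {}

    const storage& code_units() const noexcept { return units_; }

private:
    storage units_;
};

// Two texts in the same known normalization form are canonically
// equivalent exactly when their code units are identical, so the
// comparison is a plain memory compare. The "no normalization" case
// would have to normalize on the fly; this sketch leaves it out.
template <typename E, typename N>
bool canonically_equal(const basic_text<E, N>& a,
                       const basic_text<E, N>& b) {
    static_assert(is_normalized_v<N>,
                  "slow path for unnormalized text is not sketched here");
    return a.code_units() == b.code_units();
}

static_assert(std::is_same_v<basic_text<utf16_encoding, nfd>::storage,
                             std::u16string>);

// Usage: two NFC texts compare by code units alone.
inline void text_example() {
    const basic_text<utf8_encoding, nfc> a{std::string("caf\xC3\xA9")};
    const basic_text<utf8_encoding, nfc> b{std::string("caf\xC3\xA9")};
    assert(canonically_equal(a, b));
}
```

Because the normalization form is part of the type, algorithms such as
comparison can pick the cheap path at compile time, and mixing texts
with different forms is rejected by the compiler.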



