Subject: Re: [SG16-Unicode] Fundamental Unicode types
From: Lyberta (lyberta_at_[hidden])
Date: 2019-03-28 04:32:00

> charX_t have a requirement to be code units in C++20.

But they are not required to hold valid values, hence they are still
dangerous.
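
To illustrate the point, here is a minimal sketch. The helper name
is_scalar_value is hypothetical and not part of any proposal; the ranges
are the Unicode definition of a scalar value. A char32_t accepts any
32-bit value, so nothing stops it from holding a surrogate or a value
above U+10FFFF:

```cpp
// Hypothetical helper (name chosen for illustration only): true if the
// value is a Unicode scalar value, i.e. a code point that is not a
// surrogate (U+D800..U+DFFF) and does not exceed U+10FFFF.
constexpr bool is_scalar_value(char32_t c) noexcept
{
    return c < 0xD800 || (c >= 0xE000 && c <= 0x10FFFF);
}

// Both of these compile without any diagnostic, yet neither is valid.
constexpr char32_t lone_surrogate = 0xD800;
constexpr char32_t out_of_range = 0x110000;

static_assert(is_scalar_value(U'A'));
static_assert(!is_scalar_value(lone_surrogate));
static_assert(!is_scalar_value(out_of_range));
```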

> We also really do not want to have a code unit API, because you cannot do
> anything useful with it.
> Especially iterating over code units or querying the properties of code
> units is something that is probably never useful (and has a propensity to
> be misused).

This is a transition mechanism from std::basic_string and many other
legacy string classes. Besides, all those functions are required when
implementing encoding forms so I decided to expose them. I don't think
they are harmful.
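
As a rough sketch of the kind of code unit queries an encoding form
needs, consider UTF-16 decoding. The function names below are
hypothetical and chosen only for illustration; the arithmetic is the
standard UTF-16 surrogate pair formula:

```cpp
// Hypothetical code unit queries (names are for illustration only).
constexpr bool is_high_surrogate(char16_t u) noexcept
{
    return u >= 0xD800 && u <= 0xDBFF;
}

constexpr bool is_low_surrogate(char16_t u) noexcept
{
    return u >= 0xDC00 && u <= 0xDFFF;
}

// Combines a surrogate pair into a scalar value. The caller is assumed
// to have checked both code units with the queries above.
constexpr char32_t combine_surrogates(char16_t high, char16_t low) noexcept
{
    return 0x10000
        + ((static_cast<char32_t>(high) - 0xD800) << 10)
        + (static_cast<char32_t>(low) - 0xDC00);
}

// U+1F600 is encoded in UTF-16 as the pair D83D DE00.
static_assert(is_high_surrogate(0xD83D));
static_assert(is_low_surrogate(0xDE00));
static_assert(combine_surrogates(0xD83D, 0xDE00) == 0x1F600);
```

A UTF-16 decoder cannot be written without queries like these, which is
why hiding them would only force every implementer to rewrite them.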

> Scalar value and grapheme views are useful indeed Imo. Text is useful but
> it's basically something that can spawn a scalar or grapheme view with some
> storage, high level invariants and state.

The current consensus is that calling std::begin on std::text will
return a grapheme cluster iterator. I'd personally use .to_graphemes()
for that, so for now I plan to implement it as a distinct type. I haven't
yet implemented the grapheme cluster level, so I don't have insights yet.

> Lastly, I am very concerned about a design that would throw by default,
> especially something like domain_error. It basically means I wouldn't use
> any standard Unicode facilities, and neither would people in a lot of
> industries (games, embedded, etc.).

I do plan to rebase my proposal on top of P0709. That way those people
won't have any excuse anymore :P

SG16 list run by sg16-owner@lists.isocpp.org