Subject: Re: [SG16-Unicode] Draft revision of P1238 (SG16: Unicode Direction) with a new section on file names
From: Tom Honermann (tom_at_[hidden])
Date: 2019-06-13 23:24:15

On 6/12/19 6:19 AM, Lyberta wrote:
>> Any feedback is appreciated.  This revision is targeting the Cologne
>> pre-meeting submission deadline of next Monday, so please provide any
>> feedback in time for changes to be incorporated by then.
> We should discuss if we want support for code point containers and
> ill-formed Unicode. Well-formed Unicode only contains scalar values so
> std::text having .as_code_points() member function implies that it may
> store ill-formed Unicode. I don't like that.

We have discussed this to some extent, but I agree that additional
discussion is warranted.

> I have recently dropped support for code point sequences in my library
> and only allow scalar values. This means no WTF-8, ill-formed UTF-16 or
> UTF-32.
> I think we must require std::text to be well-formed by default and we
> should have an explicit policy about when we say "scalar value" or "code
> point".

There are pros and cons to enforcing well-formed text.  I suspect we'll
really get to discussing this once we get a std::text proposal in front
of the group.

I agree we (or at least I) need to get better at consistently using
"scalar value" vs "code point"; I often say "code point" when I really
mean "scalar value".  Thanks for mentioning this.
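
For reference, the distinction can be sketched in C++ as below.  This is
only an illustration of the Unicode definitions: a code point is any value
in the range U+0000..U+10FFFF, and a scalar value is any code point that is
not a surrogate (U+D800..U+DFFF).  The function names are hypothetical and
are not taken from P1238 or from any std::text proposal.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.

// Any value in the Unicode codespace, U+0000..U+10FFFF.
constexpr bool is_code_point(std::uint32_t v) {
    return v <= 0x10FFFF;
}

// Surrogate code points, U+D800..U+DFFF.
constexpr bool is_surrogate(std::uint32_t v) {
    return v >= 0xD800 && v <= 0xDFFF;
}

// A scalar value is a code point that is not a surrogate.
constexpr bool is_scalar_value(std::uint32_t v) {
    return is_code_point(v) && !is_surrogate(v);
}

// U+D800 is a code point but not a scalar value.
static_assert(is_code_point(0xD800) && !is_scalar_value(0xD800));
// U+1F600 is both a code point and a scalar value.
static_assert(is_code_point(0x1F600) && is_scalar_value(0x1F600));
// 0x110000 lies outside the codespace, so it is neither.
static_assert(!is_code_point(0x110000) && !is_scalar_value(0x110000));
```

Under these definitions, a container restricted to scalar values cannot
hold a lone surrogate, whereas a container of code points can.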


SG16 list run by sg16-owner@lists.isocpp.org