sg16: Re: [SG16-Unicode] code_unit_sequence and code_point

From: Lyberta <lyberta_at_[hidden]>
Date: Tue, 19 Jun 2018 19:00:00 +0000

Tom Honermann:
> This is overspecification in my opinion. And like Martinho, I don't see
> the point of code_unit_sequence (or code_point_sequence); that is a
> concept, not a container.

This is an "experts only" feature. Some people want to work with code
points and code units directly.

> Why do you think it is important to specify an underlying storage
> container type for std::text?

I don't want to copy data between QString and std::text. I want
std::text to consume QString by moving it inside.

QString s{"Hello"};
auto text = MakeStdText(std::move(s));

The type of text is now std::text<std::code_point_sequence<QString,
std::utf16>>.
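
A minimal sketch of the "move the container inside" idea, assuming
std::u16string as a stand-in for QString so that it builds without Qt.
basic_text, make_text and adopt_example are hypothetical names for
illustration only, not part of any proposal:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical wrapper: owns whatever container it is handed.
template <typename Container>
class basic_text {
public:
    // By-value parameter plus std::move: an rvalue argument is moved in,
    // so the caller's buffer is adopted rather than deep-copied.
    explicit basic_text(Container storage) : storage_(std::move(storage)) {}

    const Container& storage() const noexcept { return storage_; }

private:
    Container storage_;
};

// Hypothetical factory in the spirit of MakeStdText.
template <typename Container>
basic_text<Container> make_text(Container storage) {
    return basic_text<Container>(std::move(storage));
}

// Usage: std::u16string stands in for QString here.
inline void adopt_example() {
    std::u16string s{u"Hello"};
    auto text = make_text(std::move(s));  // s is moved from, not copied
    assert(text.storage() == u"Hello");
}
```

The storage type stays visible in the resulting type (here
basic_text<std::u16string>), which is the property I am after.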

> Because that enables wrapping network and file based I/O without
> requiring additional storage or conversions. These are real use cases.
> Perhaps you just haven't had a need for them?

In my design, network and file I/O are handled by
std::code_unit_sequence[_view]: that work happens at the byte level, so
byte-level classes should handle it.
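
A rough sketch of what such a byte-level view could look like, assuming
C++17 for std::byte and UTF-16 as the encoding. byte_order,
utf16_code_unit_view and view_example are hypothetical names, and this
is not the actual code_unit_sequence_view interface:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; not taken from any proposal text.
enum class byte_order { little, big };

// Non-owning view that reads UTF-16 code units straight out of a byte
// buffer (for example one filled by a socket or file read). No copy is made.
class utf16_code_unit_view {
public:
    utf16_code_unit_view(const std::byte* bytes, std::size_t byte_count,
                         byte_order order) noexcept
        : bytes_(bytes), units_(byte_count / 2), order_(order) {}

    // Number of complete 16-bit code units in the buffer.
    std::size_t size() const noexcept { return units_; }

    // Assemble code unit i from its two bytes in the stated byte order.
    char16_t operator[](std::size_t i) const noexcept {
        const auto b0 = std::to_integer<std::uint16_t>(bytes_[2 * i]);
        const auto b1 = std::to_integer<std::uint16_t>(bytes_[2 * i + 1]);
        return static_cast<char16_t>(order_ == byte_order::big
                                         ? (b0 << 8) | b1
                                         : (b1 << 8) | b0);
    }

private:
    const std::byte* bytes_;
    std::size_t units_;
    byte_order order_;
};

// Usage: the bytes 00 48 00 69 read as UTF-16BE are 'H' and 'i'.
inline void view_example() {
    const std::byte wire[] = {std::byte{0x00}, std::byte{0x48},
                              std::byte{0x00}, std::byte{0x69}};
    utf16_code_unit_view view(wire, sizeof wire, byte_order::big);
    assert(view.size() == 2);
    assert(view[0] == u'H');
}
```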

> I wonder if there is some disconnect between what text_view provides and
> what you think it provides. It would be helpful if you were to provide
> some example code that we could use to clarify discussion; something
> that would allow side-by-side comparisons of various interfaces.

I have started to implement code_unit_sequence and will report my findings.

>> Since we are aiming for a standard library, it is assumed that
>> implementers know the value of std::endian::native.
> That doesn't isolate programmers that use the standard library from
> being impacted.

That's why code_unit_sequence::data() returns std::byte* (should it be
std::span<std::byte>?) so programmers can pass those blobs of bytes
anywhere they want. The endianness conversions should be handled by the
standard library.
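
To illustrate, a minimal sketch assuming C++17 and a sequence that
always stores UTF-16BE bytes. utf16be_code_unit_sequence, size_bytes and
blob_example are hypothetical names, not the interface I am
implementing:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical owning sequence that always stores UTF-16BE bytes,
// whatever the byte order of the host is.
class utf16be_code_unit_sequence {
public:
    void push_back(char16_t unit) {
        // Shifts and masks work on values, so the stored byte order does
        // not depend on the endianness of the machine running this code.
        bytes_.push_back(static_cast<std::byte>(unit >> 8));
        bytes_.push_back(static_cast<std::byte>(unit & 0xFF));
    }

    // The blob of bytes that can be handed to any I/O routine.
    const std::byte* data() const noexcept { return bytes_.data(); }
    std::size_t size_bytes() const noexcept { return bytes_.size(); }

private:
    std::vector<std::byte> bytes_;
};

// Usage: U+20AC is stored as the bytes 20 AC.
inline void blob_example() {
    utf16be_code_unit_sequence seq;
    seq.push_back(static_cast<char16_t>(0x20AC));
    assert(seq.size_bytes() == 2);
    assert(std::to_integer<int>(seq.data()[0]) == 0x20);
    assert(std::to_integer<int>(seq.data()[1]) == 0xAC);
}
```

The caller never touches the byte order; it only receives the bytes.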

Received on 2018-06-19 21:00:16