Subject: Re: [SG16-Unicode] code_unit_sequence and code_point_sequence
From: Steve Downey (sdowney_at_[hidden])
Date: 2018-06-19 06:40:08
I would think that deserialization would be an operation on a Range of
std::byte or int8_t, where you would read out code points depending on the
encoding, possibly with either replacement or failure on ill-formed input.
But until you have code points, it's not text, it's raw octets. [Are we
still supporting the hypothetical 9-bit-byte computer in the standard?]
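
A rough sketch of what I mean, not a proposal: it assumes UTF-16BE/LE as the
encoding scheme and 8-bit bytes, and every name in it (decode_utf16,
byte_order, on_error) is made up for illustration; none of them come from
text_view or code_point_sequence. The caller picks replacement or failure.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical names throughout; this is not the proposed text_view API.
enum class byte_order { big, little };
enum class on_error { replace, fail };

// Decode a UTF-16BE or UTF-16LE octet sequence into code points.
// Returns std::nullopt only when policy == on_error::fail and the
// input is ill-formed (odd trailing octet or unpaired surrogate).
inline std::optional<std::vector<char32_t>>
decode_utf16(const std::vector<std::byte>& octets, byte_order order, on_error policy)
{
    constexpr char32_t replacement = 0xFFFD; // U+FFFD REPLACEMENT CHARACTER
    std::vector<char32_t> out;
    const std::size_t n = octets.size();

    // Assemble one 16-bit code unit from the two octets at offset i.
    auto unit_at = [&](std::size_t i) -> std::uint16_t {
        const auto a = std::to_integer<std::uint16_t>(octets[i]);
        const auto b = std::to_integer<std::uint16_t>(octets[i + 1]);
        return order == byte_order::big
                   ? static_cast<std::uint16_t>((a << 8) | b)
                   : static_cast<std::uint16_t>((b << 8) | a);
    };

    std::size_t i = 0;
    while (i < n) {
        if (n - i < 2) { // a lone trailing octet cannot form a code unit
            if (policy == on_error::fail) return std::nullopt;
            out.push_back(replacement);
            break;
        }
        const std::uint16_t u = unit_at(i);
        i += 2;
        if (u < 0xD800 || u > 0xDFFF) { // not a surrogate: one unit, one code point
            out.push_back(u);
            continue;
        }
        if (u <= 0xDBFF && n - i >= 2) { // high surrogate: look for its low half
            const std::uint16_t lo = unit_at(i);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                i += 2;
                out.push_back(0x10000 + ((char32_t(u) - 0xD800) << 10) + (lo - 0xDC00));
                continue;
            }
        }
        // Unpaired surrogate: replace it or give up, as the caller asked.
        if (policy == on_error::fail) return std::nullopt;
        out.push_back(replacement);
    }
    return out;
}
```

Note that the byte order only matters inside unit_at; once you have code
units, and then code points, nothing downstream needs to know about it.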
On Tue, Jun 19, 2018, 07:34 Martinho Fernandes <rmf_at_[hidden]> wrote:
> Apologies for the double message. I forgot to "reply to list".
> On 19.06.18 11:53, Lyberta wrote:
> > The proposed text_view takes TextEncoding and there are
> > std::utf16[be,le]_encodings that satisfy TextEncoding. This is breaking
> > abstraction and making user code more complicated.
>
> Can you explain how the user code becomes more complicated? Perhaps with
>
> > text_view and
> > code_point_sequence shouldn't take encoding schemes as template
> > parameters, only encoding forms. Essentially, TextEncoding is as
> > horrible as std::basic_string in its design.
>
> Can you explain why it shouldn't take encoding schemes? There is no
> explanation here, and it isn't clear to me why not.