Re: [SG16-Unicode] code_unit_sequence and code_point_sequence

From: R. Martinho Fernandes <rmf_at_[hidden]>
Date: Wed, 20 Jun 2018 11:10:19 +0200
On June 20, 2018 7:52:00 AM GMT+02:00, Lyberta <lyberta_at_[hidden]> wrote:
>The idea is that programmers won't need to.
>
>std::text t = u8"Hello";
>
>The type of text will be
>std::text<std::code_point_sequence<std::code_unit_sequence<std::utf8,
> std::endian::native, std::no_bom>>>;
>
>Here the standard library has chosen native endianness and no reading or
>writing of a BOM - a sane default. Then we provide helpers such as:
>
>auto t = std::make_text<std::endian::big, std::bom>(u8"Hello");
>
>The type of text will be
>std::text<std::code_point_sequence<std::code_unit_sequence<std::utf8,
> std::endian::big, std::bom>>>;
>
>Here the programmer has explicitly requested BE with reading and
>writing
>of a BOM. std::bom and std::no_bom are just placeholders; this should be
>an enum class.

I'm sorry, these examples are bonkers again. They are not convincing because you used UTF-8. What does big endian UTF-8 even mean? Can you write the same with e.g. the UTF-16 variants instead? That would make much better examples. I've been trying but I don't understand what e.g. this should mean:

auto t = std::make_text<std::endian::big, std::bom>(u"Hello");
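For context on why the encoding matters here: a UTF-8 code unit is a single byte, so there is no byte order to choose, whereas each UTF-16 code unit is two bytes that can be written in either order. The sketch below only illustrates that difference at the serialization level. It is not the proposed std::text or std::make_text interface; byte_order, serialize_utf16 and serialize_utf8 are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical names for illustration only; not part of any proposal.
enum class byte_order { big, little };

// Serialize UTF-16 code units to bytes, optionally prefixed with a BOM (U+FEFF).
std::vector<std::uint8_t> serialize_utf16(const std::u16string& s,
                                          byte_order order, bool write_bom) {
    std::vector<std::uint8_t> out;
    auto put = [&](char16_t cu) {
        const auto hi = static_cast<std::uint8_t>(cu >> 8);
        const auto lo = static_cast<std::uint8_t>(cu & 0xFF);
        if (order == byte_order::big) {
            out.push_back(hi);
            out.push_back(lo);
        } else {
            out.push_back(lo);
            out.push_back(hi);
        }
    };
    if (write_bom) put(char16_t{0xFEFF});
    for (char16_t cu : s) put(cu);
    return out;
}

// UTF-8 code units are already bytes, so there is no byte-order parameter.
std::vector<std::uint8_t> serialize_utf8(const std::string& utf8) {
    std::vector<std::uint8_t> out;
    for (char c : utf8) out.push_back(static_cast<std::uint8_t>(c));
    return out;
}

// Small self-check showing the byte sequences for the text "Hi".
void check_examples() {
    const std::vector<std::uint8_t> be{0xFE, 0xFF, 0x00, 0x48, 0x00, 0x69};
    const std::vector<std::uint8_t> le{0xFF, 0xFE, 0x48, 0x00, 0x69, 0x00};
    const std::vector<std::uint8_t> u8{0x48, 0x69};
    assert(serialize_utf16(u"Hi", byte_order::big, true) == be);
    assert(serialize_utf16(u"Hi", byte_order::little, true) == le);
    assert(serialize_utf8("Hi") == u8);
}
```

In this sketch the byte order and BOM only come into play when code units are turned into bytes. Whether they belong in the type of an in-memory text object is exactly the open question in this thread.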

Received on 2018-06-20 11:10:27