SG16

Subject: Re: [SG16-Unicode] Namespaces
From: Steve Downey (sdowney_at_[hidden])
Date: 2019-04-12 17:12:29


We're probably going to want to decode UTF-16 and UTF-32 (convert to scalar
values / code points) for a long while, but encoding to them should be
unusual. At least that's the IETF and WHATWG recommendation for new
protocols and formats.
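
To make "decode to scalar values" concrete, here is a minimal sketch of a
UTF-16 decoder. The name `decode_utf16` is hypothetical and is not one of the
names proposed in this thread. Replacing unpaired surrogates with U+FFFD is
one common error policy that I am assuming here; it is not something the
thread settles.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not a proposed std:: name): decode UTF-16 code units
// into Unicode scalar values. Unpaired surrogates become U+FFFD, which is an
// assumed error policy rather than one taken from the discussion.
std::u32string decode_utf16(std::u16string_view in) {
    std::u32string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t unit = in[i];
        // A high surrogate followed by a low surrogate forms one scalar value.
        if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < in.size()) {
            char32_t low = in[i + 1];
            if (low >= 0xDC00 && low <= 0xDFFF) {
                out.push_back(static_cast<char32_t>(
                    0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)));
                ++i;  // consume the low surrogate as well
                continue;
            }
        }
        // Any surrogate that reaches this point is unpaired.
        if (unit >= 0xD800 && unit <= 0xDFFF) {
            unit = 0xFFFD;
        }
        out.push_back(unit);
    }
    return out;
}

// Small self-check showing the intended behaviour of the sketch.
inline void decode_utf16_examples() {
    assert(decode_utf16(u"A") == U"A");
    assert(decode_utf16(u"\U0001F600") == U"\U0001F600");
}
```

The sketch only goes in the decode direction, which is the operation I expect
to stay useful; an encoder to UTF-16 is deliberately left out.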

On Fri, Apr 12, 2019, 17:53 Tony V E <tvaneerd_at_[hidden]> wrote:

>
>
> On Fri, Apr 12, 2019 at 10:02 AM JeanHeyd Meneide <phdofthehouse_at_[hidden]>
> wrote:
>
>> On Fri, Apr 12, 2019 at 6:45 AM Lyberta <lyberta_at_[hidden]> wrote:
>>
>>> I guess at least teachability and clean structure. The guidance would
>>> be: "stuff in std is old and unusable for text, stuff in std::text is
>>> new and usable".
>>>
>>
>> Having `std::text::text` is a bit of a weird class type (unless we give
>> it a new name), and it's impossible to have `std::text` as a type and
>> `std::text` as a namespace at the same time.
>>
>> The sub-namespace isn't really necessary here because we are not in
>> competition for certain names or algorithms, save for three names that I
>> specifically want to call `std::text_decode`, `std::text_encode`, and
>> `std::text_transcode` (and similar). The `text_` prefix implies more
>> specific semantics, and I do not want to steal the names `decode` and
>> `encode` when those are much broader terms.
>>
>> Other names such as `std::utf8`, `std::utf32`, `std::utf16`,
>> `std::wide_execution`, and `std::narrow_execution` are fairly specific to
>> the text domain and I don't see them clashing.
>>
>> Names such as `std::rope`, `std::text`, and `std::text_view` will speak
>> for themselves. There are a few traits types that might be introduced to
>> the standard, but as far as I can tell none of these will clash either.
>>
>> `std::uni` is an OK namespace for the unicode properties.
>>
>> Regarding earlier points on what the standard does provide: the standard
>> needs to provide encodings for all the encoding types that are (currently)
>> pushed out by the standard, and nothing more. This includes: std::utf8,
>> std::utf16, std::utf32, std::wide_execution, and std::narrow_execution. The
>> standard should not vend any other encodings, but the Encoding and Decoding
>> interfaces should be standard -- much like Allocator -- that allows a user
>> to swap in their own class type and object that replaces the use of an
>> encoding in any interface / function standard templates provide. (Similar
>> to char_traits, except not as useless.) This means users can employ
>> whatever encoding or power they have under the hood and enjoy fast and
>> correct text processing so long as they follow the required semantics.
>>
>> Note that we cannot ship only utf8 as an encoding, because the standard
>> already ships and acknowledges more than utf8 as one of the encodings for
>> string literals. It would be highly dysfunctional to have utf16 string
>> literals that the standard library itself cannot process in a reasonable
>> manner.
>>
>
> It would be fine to do just that. We would be signalling that utf16 is
> on the chopping block, to be deprecated some day, or at least ignored going
> forward. Same way we ignore, or give lower priority to, other things in
> the language and library that we feel were a mistake and are not worth our
> time.
>
> For example, no one ever stops a proposal by saying "but that doesn't
> support valarray", because no one cares about valarray. Even though it is
> right there in std::.
>
> Now, you could say some people care about utf16, and that would be a
> reason to continue to support it, but I think eventually (5-10 years) we
> will find that no one cares about utf16, and it is just cruft.
>
> Of course, supporting it now might be better for consensus, but "because
> it exists" isn't reason enough.
>
>
> --
> Be seeing you,
> Tony
>



SG16 list run by sg16-owner@lists.isocpp.org