Re: [SG16-Unicode] Questions about some corner cases of proposed std::basic_text encoding implementation

From: Steve Downey <sdowney_at_[hidden]>
Date: Sat, 9 Nov 2019 14:04:17 +0000
We have more space to work with for a strongly normalizing type (the
strawman named std::text). I would prefer that views have no functions
attached, just algorithms that operate on correctly shaped ranges. We got
the strictly necessary constructor for string_view, so we can get back to
std::string and null termination if someone has to.
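
To illustrate one possible reading of the paragraph above, here is a minimal sketch. It is not code from this thread, and the function names `count_utf8_lead_bytes` and `c_length` are hypothetical. The view carries no text-specific member functions; the work is done by a free algorithm, and a copy into std::string is made only where null termination is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical free algorithm: counts bytes that are not UTF-8
// continuation bytes (10xxxxxx). For well-formed UTF-8 this equals
// the number of encoded code points. It takes a plain view and needs
// no member functions beyond iteration.
inline std::size_t count_utf8_lead_bytes(std::string_view bytes) {
    std::size_t n = 0;
    for (unsigned char b : bytes) {
        if ((b & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// Hypothetical helper: a string_view is not guaranteed to be
// null-terminated, so copy into std::string before calling a C API.
inline std::size_t c_length(std::string_view sv) {
    std::string owned(sv);              // explicit constructor from a view
    return std::strlen(owned.c_str());  // c_str() is null-terminated
}

// Small usage example.
inline void view_usage_example() {
    std::string_view whole = "hello world";
    assert(c_length(whole.substr(0, 5)) == 5);  // the view stops before the space
}
```

The copy in `c_length` is the price of getting back to null termination, and it is paid only at the boundary where a C API requires it.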

I'd also like to see the Unicode algorithms as _algorithms_ that can
operate over something that can be treated as a scalar value.
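
As a rough sketch of what "algorithms over something that can be treated as a scalar value" might look like (an assumption for illustration, not a proposed interface; `is_scalar_value` and `count_non_ascii_scalars` are hypothetical names), the template below accepts any range whose elements convert to a 32-bit integer, with no dedicated scalar value type involved:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical helper: a Unicode scalar value is any code point in
// U+0000..U+10FFFF except the surrogate range U+D800..U+DFFF.
constexpr bool is_scalar_value(std::uint32_t v) {
    return v <= 0x10FFFF && !(v >= 0xD800 && v <= 0xDFFF);
}

// Hypothetical algorithm: counts elements that are valid scalar values
// above the ASCII range. Range may be a u32string_view, a vector of
// uint32_t, or anything else whose elements convert to uint32_t.
template <class Range>
std::size_t count_non_ascii_scalars(const Range& r) {
    std::size_t n = 0;
    for (auto&& e : r) {
        const auto v = static_cast<std::uint32_t>(e);
        if (is_scalar_value(v) && v > 0x7F) {
            ++n;
        }
    }
    return n;
}

// Small usage example: the same algorithm over two differently typed ranges.
inline void scalar_usage_example() {
    std::u32string_view text = U"a\u00E9\u4E2D";              // 'a', U+00E9, U+4E2D
    std::vector<std::uint32_t> raw{0x41, 0xD800, 0x1F600};    // 0xD800 is a surrogate
    assert(count_non_ascii_scalars(text) == 2);
    assert(count_non_ascii_scalars(raw) == 1);
}
```

The check for validity happens inside the algorithm, so callers holding char32_t or uint32_t data do not need to cast into an intermediate type first.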

If we had Contracts I might feel differently about a strong scalar value
type, but without them, introducing an intermediate type is more likely to
just cause needless casting without improving correctness.

I also have concerns about relying on the compiler to fuse the conversion
from UTF-8 into the Unicode algorithms. Writing SIMD-friendly versions,
which can be insanely fast, is important. I know this contradicts my
general position on algorithms.

On Sat, Nov 9, 2019, 11:05 JeanHeyd Meneide <phdofthehouse_at_[hidden]> wrote:

> On Sat, Nov 9, 2019 at 3:44 AM Lyberta <lyberta_at_[hidden]> wrote:
>> Lyberta:
>> > JeanHeyd Meneide:
>> >> Absolutely, the default should be basic_string.
>> >
>> > I disagree but I didn't write a proposal yet so we'll talk later.
>> Although here's one of the big arguments why std::basic_string is really
>> badly designed:
>> http://www.gotw.ca/gotw/084.htm
>> And then if we note that Unicode code units have almost zero semantics,
>> we can remove 80% of the API.
> As a counterpoint, there's no reason you can't just ignore the vestigial
> API on std::string. While a "bloated" interface, long gone are the days
> where a function you write has to appear in your final binary, or even in
> your object files.
> I understand it's not ideal, but remember that introducing new vocabulary
> types means you need to compete with every single API that currently takes
> std::string. That's a lot of APIs you're risking no compatibility with, and
> explicit conversions mean the user has a lot of typing to do. Try to think
> about how expensive that can be for the user if you introduce new basic
> types, and the compatibility you lose with code being written even today.
> You can still propose such types, but please do be careful: it might not
> be the best use of your time for something Library Evolution Working Group
> (Incubator) will likely shoot down immediately.
> Sincerely,
> JeanHeyd

Received on 2019-11-09 15:05:15