std-proposals

Re: [std-proposals] String Slicing Draft Proposal

From: Giuseppe D'Angelo <giuseppe.dangelo_at_[hidden]>
Date: Mon, 18 Mar 2024 17:37:09 +0100
Hello,

On 18/03/2024 16:04, Rhidian De Wit via Std-Proposals wrote:
> In the attachment you may find the updated proposal.

Some comments on the proposal:

* There's a bit of overlap in scope with the very recent P3044R0 by
Michael Florian Hava ("sub-string_view from string").


* It's unclear to me what string slicing has to do with the addition of
contains(), starts_with(), ends_with(). Those are "query" functions, and
return bool. They don't return offsets or indices. In what way is
slicing related to them?


* With this proposal, we'll have two/three slightly different ways to do
subslicing:

1a) auto x = str.substr(start, count);
1b) auto y = std::string(str, start, count);
2 ) auto z = str.slice(start, end);

Is this subtle API difference going to confuse users? What are we going
to teach them?

** Also what about the relationship of these functions with other pieces
of the stdlib? For instance, `std::match_results` returns start and
length, not start and end.
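
To make the count-versus-end difference concrete, here is a minimal
sketch. The free function `slice` below is a hypothetical stand-in for
the proposed member (it is not a standard function), and it assumes
start <= end <= size. The other calls are existing library API.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <string_view>

// Hypothetical helper, NOT a standard function: half-open [start, end) slice.
// Assumes start <= end <= s.size().
inline std::string_view slice(std::string_view s, std::size_t start, std::size_t end) {
    return s.substr(start, end - start);
}

// Returns true if all four spellings select the same characters.
inline bool same_range_four_ways() {
    const std::string str = "hello, world";

    const std::string x = str.substr(7, 5);        // 1a) start + count
    const std::string y = std::string(str, 7, 5);  // 1b) start + count
    const std::string_view z = slice(str, 7, 12);  // 2 ) start + end

    // std::match_results reports position and length, not an end index.
    std::smatch m;
    const bool found = std::regex_search(str, m, std::regex("world"));
    const std::string w =
        found ? str.substr(m.position(0), m.length(0)) : std::string();

    return x == "world" && y == x && z == x && w == x;
}
```

Note that the same range is spelled (7, 5) in three places and (7, 12)
in one, which is the kind of mismatch users would have to keep in mind.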

** I dislike the imperative name "slice", as the function doesn't
actually slice `*this`. It should be called "sliced", or "sliced_view"
(since it returns something slightly different). P3044 proposes
"subview" (from "substr").


* I think these functions on basic_string should return strings, not
string views. I'm puzzled by the arguments brought forward.

** The paper states: "It is best for these functions to return
std::basic_string_view since [...] These functions will most often be
used to find something in a string, often not requiring a new dynamic
allocation to be made."

I don't quite understand this. What is meant by "find something in a
string", and why does that imply that slice has to return string views?


** "std::basic_string::contains(), std::basic_string::starts_with() and
std::basic_string::ends_with() all take a std::basic_string_view as a
parameter. Therefore, the return value of the proposed functions
matching up with these is a benefit"

Why is it a benefit? These functions would work just fine if the return
value were std::string (as it converts implicitly to string_view).

But why are those functions "special"? In what way would the return
value of slice() be used as an input to those functions? Is it an
important use case? Why is it an important use case?
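
As a small illustration of the implicit conversion, here is a sketch
that compiles as C++17. `sliced` and `begins_with` are hypothetical
helpers: the first stands in for a slice that returns an owning string,
the second for any query function that takes a string_view parameter,
as the C++20 member starts_with() does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a slice returning an owning std::string.
// Assumes start <= end <= s.size().
inline std::string sliced(const std::string& s, std::size_t start, std::size_t end) {
    return s.substr(start, end - start);
}

// Hypothetical query function with a string_view parameter, in the same
// style as the standard starts_with() member.
inline bool begins_with(std::string_view haystack, std::string_view prefix) {
    return haystack.substr(0, prefix.size()) == prefix;
}

inline bool query_with_string_result() {
    const std::string text = "hello, world";
    const std::string piece = sliced(text, 0, 5);  // owning copy: "hello"
    // Both arguments are std::string; each converts implicitly to string_view.
    return begins_with(text, piece);
}
```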


** "If the user wants a std::basic_string instead of a
std::basic_string_view, they can always construct a std::basic_string."

They cannot, in the general case: you lose information about the
allocator. It can be worked around, but the resulting code looks ugly.
The risk is that people using allocators might just ban `slice()`...
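
A minimal sketch of the problem, using std::pmr::string as an example
of a string with a non-default allocator. The helper `to_owned` is
hypothetical and only shows what the workaround tends to look like.

```cpp
#include <cassert>
#include <memory_resource>
#include <string>
#include <string_view>

// Hypothetical helper: rebuilds an owning string from a view. The view
// carries no allocator, so the original string has to be passed along
// just to recover it.
inline std::pmr::string to_owned(std::string_view piece, const std::pmr::string& source) {
    return std::pmr::string(piece, source.get_allocator());
}

inline bool allocator_survives_only_if_forwarded() {
    std::pmr::monotonic_buffer_resource pool;
    const std::pmr::string text("hello, world", &pool);
    const std::string_view piece = std::string_view(text).substr(0, 5);

    const std::pmr::string naive(piece);                     // default resource
    const std::pmr::string careful = to_owned(piece, text);  // uses &pool

    return naive == "hello" && careful == "hello"
        && naive.get_allocator().resource() == std::pmr::get_default_resource()
        && careful.get_allocator().resource() == &pool;
}
```

The naive construction silently falls back to the default memory
resource, so every call site that wants an owning result has to
remember to forward the allocator by hand.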


** I'd even claim that, since P2591 isn't merged yet, a string API
returning string_views is user-hostile:

std::string str1 = "hello, world";
std::string str2 = "planet";

str1.slice(0, 7) + str2; // doesn't compile
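
For comparison, this is roughly what users would have to write instead.
It is only a sketch: the free function `slice` is a hypothetical
stand-in for the proposed member and assumes start <= end <= size.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for the proposed view-returning slice.
inline std::string_view slice(std::string_view s, std::size_t start, std::size_t end) {
    return s.substr(start, end - start);
}

// Workaround 1: convert the view to a string explicitly, then concatenate.
inline std::string concat_via_copy(const std::string& str1, const std::string& str2) {
    return std::string(slice(str1, 0, 7)) + str2;
}

// Workaround 2: build the result step by step; operator+= accepts both
// a string_view and a string.
inline std::string concat_via_append(const std::string& str1, const std::string& str2) {
    std::string result;
    result += slice(str1, 0, 7);
    result += str2;
    return result;
}
```

Both work, but neither is as direct as the one-line expression above.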


** In general, why aren't the very same APIs being proposed on top of
basic_string_view? Any const string API should also exist in
string_view, with identical semantics (modulo things like return types).

In other words, why not offer the user both `string.sliced()` (returns
another string) and `stringview.sliced()` (returns a string view)? The
users can then choose whatever they need, and of course they can always
convert their strings to string views and slice the latter.
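
As a rough sketch of that pairing, written as free function templates
so that it compiles today: the name `sliced` and the (start, end)
signature are assumptions for illustration, not wording from either
paper, and both overloads assume start <= end <= size.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical: slicing a string returns a string, built with the
// source string's allocator.
template <class CharT, class Traits, class Alloc>
std::basic_string<CharT, Traits, Alloc>
sliced(const std::basic_string<CharT, Traits, Alloc>& s,
       std::size_t start, std::size_t end) {
    return std::basic_string<CharT, Traits, Alloc>(
        s, start, end - start, s.get_allocator());
}

// Hypothetical: slicing a string view returns a string view.
template <class CharT, class Traits>
std::basic_string_view<CharT, Traits>
sliced(std::basic_string_view<CharT, Traits> s,
       std::size_t start, std::size_t end) {
    return s.substr(start, end - start);
}
```

A user who wants a view of a string can still write
`sliced(std::string_view(str), start, end)`, so nothing is lost by
having the string overload return a string.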


* first(N) and last(N) are, IMVHO, way less contentious (at least on
basic_string_view), and you should propose them separately. (Being a fan
of rich APIs, they also make total sense on strings to me.)
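
For reference, a minimal sketch of what these could look like on top of
string_view, written as hypothetical free functions. The names follow
the first()/last() members of std::span, and the precondition
n <= s.size() is an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical: the first n characters. Assumes n <= s.size().
inline std::string_view first(std::string_view s, std::size_t n) {
    return s.substr(0, n);
}

// Hypothetical: the last n characters. Assumes n <= s.size().
inline std::string_view last(std::string_view s, std::size_t n) {
    return s.substr(s.size() - n);
}
```

There is only one argument and no count-versus-end ambiguity, which is
part of why they are easier to agree on.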


Thank you,

-- 
Giuseppe D'Angelo

Received on 2024-03-18 16:37:13