std-proposals

Re: [std-proposals] String Slicing Draft Proposal

From: Rhidian De Wit <rhidiandewit_at_[hidden]>
Date: Tue, 19 Mar 2024 11:47:35 +0100
Hi,

First off, thank you for the extensive email, you raised many good points
which I'll reply to now:

*> * There's a bit of overlap in scope with the very recent P3044R0 by
Michael Florian Hava ("sub-string_view from string").*

Indeed there is, and it is entirely suggesting what I am suggesting: adding
a sort of *sub_string_view* from both *std::string* and *std::string_view*.
I was not aware of this paper and I wholly support it, although I don't
agree with the name: it's a bit long.


*> * It's unclear to me what string slicing has to do with the addition of
contains(), starts_with(), ends_with(). Those are "query" functions, and
return bool. They don't return offsets or indices. In what way is slicing
related to them?*

They have no direct relation to them, however, they do fit in nicely with
them, since a sliced string can be passed to these. It was merely a vision
of how these new functions could be used with the existing API.





*> * With this proposal, we'll have two/three slightly different ways to do
subslicing:*
*> 1a) auto x = str.substr(start, count);*
*> 1b) auto y = std::string(str, start, count);*
*> 2 ) auto z = str.slice(start, end);*
*> Is this subtle API difference going to confuse users? What are we going to
teach them?*

I feel that 1a and 2 serve similar purposes and simply get there in
slightly different ways. Both have their uses and both are equally valid.
As for what we should teach newcomers to C++: both. Neither is overly
complicated, and having both only offers more choice to those who prefer
one style over the other.
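
To make the difference between 1a and 2 concrete, here is a minimal sketch.
The free function *slice_range* is a hypothetical stand-in for the proposed
member function (it is not a standard API), and clamping out-of-range bounds
is an assumption of this sketch, not something the proposal text confirms.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical helper, not a standard API: returns the half-open range
// [start, end) of sv. Both bounds are clamped to sv.size(), and an inverted
// range yields an empty view (assumptions made for this sketch).
constexpr std::string_view slice_range(std::string_view sv,
                                       std::size_t start,
                                       std::size_t end) {
    end = std::min(end, sv.size());
    start = std::min(start, end);
    // substr() takes (position, count), so the count is end - start.
    return sv.substr(start, end - start);
}
```

With this helper, slice_range(str, 7, 12) and str.substr(7, 5) select the
same characters: one is written as (start, end), the other as (start, count).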


*> ** Also what about the relationship of these functions with other pieces
of the stdlib? For instance, `std::match_results` returns start and length,
not start and end.*

My proposal is not meant to replace *substr()*, but only to supplement the
existing functionality, such as *contains()*, *ends_with()*, etc. (which is
why I mentioned these functions before).

*> ** I dislike the imperative tense "slice" as it doesn't actually slice
`*this`. It should be called "sliced", or "sliced_view" (since it returns
something slightly different).*
*> P3044 proposes "subview" (from "substr").*

That is fair enough, and it might indeed be misleading. *subview* sounds
like a reasonable alternative.



*> * I think these functions on basic_string should return strings, not
string views. I'm puzzled by the arguments brought forward.*
*> ** The paper states: "It is best for these functions to return
std::basic_string_view since [...] These functions will most often be used
to find something in a string, often not requiring a new dynamic allocation
to be made."*
*> I don't quite understand this, what is meant by "find something in a
string", and why that implies that slice has to return string views.*

I wasn't sure whether it was a good idea to propose additions to both
*std::string* and *std::string_view* at the same time, so if we *only*
implemented this in *std::string*, I thought it would be better to return a
*std::string_view*. That is a very subjective choice, though, and returning
*std::string* would be equally right.
Since P3044R0 proposes adding this to both *std::string* *and*
*std::string_view*, my original argument no longer holds: I agree with that
paper that both classes should get the function and that each should return
its own type.


*> ** "std::basic_string::contains(), std::basic_string::starts_with() and
std::basic_string::ends_with() all take a std::basic_string_view as a
parameter. Therefore, the return value of the proposed functions matching
up with these is a benefit"*
*> Why is it a benefit? These functions work just fine if the return value
were std::string (as it converts implicitly to string_view).*

You're right, your argument makes sense. As mentioned in the point above,
my reasoning was about avoiding the unnecessary memory allocations that come
from returning a *std::string* instead of a *std::string_view*, but since
we'd have *both* anyway, the point I made is moot :)
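
A small sketch of both sides of this point. *has_prefix* is a hypothetical
stand-in with the same parameter type as the *std::string_view* overload of
*starts_with()*, written by hand only so that the example also builds as
C++17; the other function names are likewise made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in that takes a std::string_view parameter, like
// starts_with() does.
bool has_prefix(std::string_view text, std::string_view prefix) {
    return text.compare(0, prefix.size(), prefix) == 0;
}

// Owning piece: std::string::substr() returns a std::string, which converts
// implicitly to std::string_view at the call site. It works, but the
// temporary string may allocate.
bool check_owning(const std::string& text, const std::string& source,
                  std::size_t pos, std::size_t count) {
    return has_prefix(text, source.substr(pos, count));
}

// Non-owning piece: std::string_view::substr() returns another view, so no
// temporary std::string is created.
bool check_view(const std::string& text, std::string_view source,
                std::size_t pos, std::size_t count) {
    return has_prefix(text, source.substr(pos, count));
}
```

Both calls give the same answer; the only difference is whether a temporary
*std::string* is built along the way.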


*> But why are those functions "special"? In what way would the return
value of slice() be used as an input to those functions? Is it an important
use case? Why is it an important use case?*

It is as important a use case as *.substr()*, whose result can also be used
as an input to the above-mentioned functions. The only difference is a
slightly different way of getting said sub-string.

*> ** "If the user wants a std::basic_string instead of a
std::basic_string_view, they can always construct a std::basic_string."*

*> They cannot, in the general case: you lose information on the
allocator. It can be worked around, but the resulting code looks ugly.*
*> The risk is that people using allocators might just ban `slice()`...*

I was completely unaware of this, as I rarely use custom allocators in
conjunction with std::string, but in that case we indeed cannot simply
convert a string_view back to a string.
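
For my own understanding, here is a sketch of the problem using
*std::pmr::string*. The helper *owning_slice* is hypothetical and only shows
one possible workaround: a view carries no allocator, so the allocator has to
be passed along by hand when the owning string is rebuilt.

```cpp
#include <cstddef>
#include <memory_resource>
#include <string>
#include <string_view>

// Hypothetical helper: returns an owning copy of [pos, pos + count) that
// uses the same allocator as the source string. Building a std::pmr::string
// from the view alone would fall back to the default memory resource.
std::pmr::string owning_slice(const std::pmr::string& source,
                              std::size_t pos, std::size_t count) {
    std::string_view piece = std::string_view(source).substr(pos, count);
    return std::pmr::string(piece, source.get_allocator());
}
```

Having to thread *get_allocator()* through every such conversion is
presumably the "ugly" workaround that was meant.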


*> ** In general, why aren't the very same APIs being proposed on top of
basic_string_view? Any const string API should also exist in string_view,
with identical semantics (modulo things like return types).*
*> In other words, why not offer the user both `string.sliced()` (returns
another string) and `stringview.sliced()` (returns a string view)? The
users can then choose whatever they need, and of course they can always
convert their strings to string views and slice the latter.*

As mentioned above, I completely agree.
My only question now is: do I drop this proposal in favour of P3044R0, or
does it make sense to keep this one alive in modified form?
I'll also take your suggestion and move *first(n)* and *last(n)* to a
different proposal.
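
For reference, this is roughly what I have in mind for those two, sketched
as free functions on *std::string_view*. The names *first_n* and *last_n* are
placeholders, and clamping *n* to the size of the view is an assumption of
this sketch; the final semantics would be up to the separate proposal.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical helper: the first n characters of sv, or all of sv if it is
// shorter than n.
constexpr std::string_view first_n(std::string_view sv, std::size_t n) {
    return sv.substr(0, std::min(n, sv.size()));
}

// Hypothetical helper: the last n characters of sv, or all of sv if it is
// shorter than n.
constexpr std::string_view last_n(std::string_view sv, std::size_t n) {
    n = std::min(n, sv.size());
    return sv.substr(sv.size() - n);
}
```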

Again, thank you very much for your rich insight.

Sincerely,

On Mon, 18 Mar 2024 at 17:37, Giuseppe D'Angelo via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> Hello,
>
> On 18/03/2024 16:04, Rhidian De Wit via Std-Proposals wrote:
> > In the attachment you may find the updated proposal.
>
> Some comment on the proposal:
>
> * There's a bit of overlap in scope with the very recent P3044R0 by
> Michael Florian Hava ("sub-string_view from string").
>
>
> * It's unclear to me what string slicing has to do with the addition of
> contains(), starts_with(), ends_with(). Those are "query" functions, and
> return bool. They don't return offsets or indices. In what way is
> slicing related to them?
>
>
> * With this proposal, we'll have two/three slightly different ways to do
> subslicing:
>
> 1a) auto x = str.substr(start, count);
> 1b) auto y = std::string(str, start, count);
> 2 ) auto z = str.slice(start, end);
>
> Is this subtle API difference going to confuse users? What are we going to
> teach them?
>
> ** Also what about the relationship of these functions with other pieces
> of the stdlib? For instance, `std::match_results` returns start and
> length, not start and end.
>
> ** I dislike the imperative tense "slice" as it doesn't actually slice
> `*this`. It should be called "sliced", or "sliced_view" (since it
> returns something slightly different). P3044 proposes "subview" (from
> "substr").
>
>
> * I think these functions on basic_string should return strings, not
> string views. I'm puzzled by the arguments brought forward.
>
> ** The paper states: "It is best for these functions to return
> std::basic_string_view since [...] These functions will most often be
> used to find something in a string, often not requiring a new dynamic
> allocation to be made."
>
> I don't quite understand this, what is meant by "find something in a
> string", and why that implies that slice has to return string views.
>
>
> ** "std::basic_string::contains(), std::basic_string::starts_with() and
> std::basic_string::ends_with() all take a std::basic_string_view as a
> parameter. Therefore, the return value of the proposed functions
> matching up with these is a benefit"
>
> Why is it a benefit? These functions work just fine if the return value
> were std::string (as it converts implicitly to string_view).
>
> But why are those functions "special"? In what way would the return
> value of slice() be used as an input to those functions? Is it an
> important use case? Why is it an important use case?
>
>
> ** "If the user wants a std::basic_string instead of a
> std::basic_string_view, they can always construct a std::basic_string."
>
> They cannot, in the general case: you lose information on the
> allocator. It can be worked around, but the resulting code looks ugly.
> The risk is that people using allocators might just ban `slice()`...
>
>
> ** I'd even claim that, since P2591 isn't merged yet, a string API
> returning string_views is user hostile:
>
> std::string str1 = "hello, world";
> std::string str2 = "planet";
>
> str1.slice(0, 7) + str2; // doesn't compile
>
>
> ** In general, why aren't the very same APIs being proposed on top of
> basic_string_view? Any const string API should also exist in
> string_view, with identical semantics (modulo things like return types).
>
> In other words, why not offer the user both `string.sliced()` (returns
> another string) and `stringview.sliced()` (returns a string view)? The
> users can then choose whatever they need, and of course they can always
> convert their strings to string views and slice the latter.
>
>
> * first(N) and last(N) are, IMVHO, way less contentious (at least on
> basic_string_view), and you should propose them separately. (Being a fan
> of rich APIs, they also make total sense on strings to me.)
>
>
> Thank you,
>
> --
> Giuseppe D'Angelo
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>


-- 
Rhidian De Wit
Software Engineer - Barco

Received on 2024-03-19 10:47:48