1. Table of Contents
2. Changelog
2.1. R5
- Added multi-indexed operator[] as the unsafe variant of std::string::slice(size_t start, size_t end)
- Adjusted proposed wording
2.2. R4
- Removed std::basic_string::slice(size_t start, size_t end, size_t step), since with the addition of std::stride_view it will no longer be necessary.
- Added proposed wording
- Made functions constexpr, const and noexcept (where applicable)
2.3. R3
- Added unit tests to a GitHub project.
2.4. R2
- std::basic_string::slice() now throws std::out_of_range when start >= size().
2.5. R1
- Removed default parameter from std::basic_string::slice(size_t start, size_t end, size_t step).
3. Motivation and Scope
Parsing and string manipulation in C++ used to be very cumbersome, with seemingly basic and trivial methods missing from std::basic_string. The introduction of C++20 and C++23 resolved some of these issues by adding utility functions such as starts_with(), ends_with() and contains().
I believe we can make string manipulation in C++ even better by adding more of these utility functions to std::basic_string, and one option I always miss, that is present in other programming languages (such as Python), is string-slicing.
Python's string-slicing is very graceful and easy-to-use, but C++ does not support that syntax. Instead, I propose to add several functions to std::basic_string to emulate string-slicing. The functions I propose to add to std::basic_string are the following:
namespace std {
  /* 1. */ constexpr basic_string_view basic_string::operator[](size_t start, size_t end) const;
  /* 2. */ constexpr basic_string_view basic_string::slice(size_t start, size_t end) const;
  /* 3. */ constexpr basic_string_view basic_string::first(size_t count) const noexcept;
  /* 4. */ constexpr basic_string_view basic_string::last(size_t count) const noexcept;
}
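For comparison, here is a minimal sketch of what a half-open slice such as Python's s[6:11] takes in C++17 today, going through std::string_view::substr(). The helper name slice_today is hypothetical and is not part of the proposal; it only illustrates the index-to-length conversion that the proposed functions would hide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the proposal): the C++17 spelling of
// Python's s[start:end]. substr() takes a position and a length, so the
// end index has to be converted into a count by hand.
inline std::string_view slice_today(const std::string& s,
                                    std::size_t start, std::size_t end) {
    // substr() throws std::out_of_range if start > s.size() and clamps the
    // count to the remaining length. The caller must ensure end >= start,
    // otherwise the subtraction wraps around.
    return std::string_view(s).substr(start, end - start);
}

inline void demo_slice_today() {
    const std::string s = "hello world";
    assert(slice_today(s, 6, 11) == "world"); // Python: s[6:11]
    assert(slice_today(s, 0, 5) == "hello");  // Python: s[0:5]
}
```

The returned view refers to the characters of s, so it must not outlive the string it was taken from.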
4. Impact on the Standard
Since these are only trivial functions requiring no major changes to the language or to the existing API, the impact of this proposal on the standard is minimal. These functions can already be implemented in the current version of C++23 without any extra changes.
Implementation will be left up to the vendor of course, but since these are trivial functions, we can provide a "template" implementation.
5. Design Decisions
There is a choice in whether a std::basic_string or a std::basic_string_view is returned by these new utility functions. It is best for these functions to return std::basic_string_view since:
- These functions will most often be used to find something in a string, often not requiring a new dynamic allocation to be made.
- std::basic_string::contains(), std::basic_string::starts_with() and std::basic_string::ends_with() all take a std::basic_string_view as a parameter. Therefore, the return value of the proposed functions matching up with these is a benefit.
- If the user wants a std::basic_string instead of a std::basic_string_view, they can always construct a std::basic_string.
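The trade-off in the list above can be illustrated with a small sketch. The function first_view is a hypothetical stand-in for the proposed first(), written as a free function so that it compiles today; the input string is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for the proposed first(): it returns a view, so no
// characters are copied and no allocation takes place.
inline std::string_view first_view(const std::string& s, std::size_t count) {
    return std::string_view(s).substr(0, count); // substr() clamps count
}

inline void demo_view_vs_string() {
    const std::string line = "GET /index.html";

    // Comparing and searching work directly on the view.
    const std::string_view method = first_view(line, 3);
    assert(method == "GET");

    // An owning copy is one explicit construction away when it is needed.
    const std::string owned(method);
    assert(owned == "GET");
}
```

Returning a view keeps the common lookup case allocation-free and leaves the decision to copy with the caller.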
C++23 introduced operator[] with any number of subscripts (see cppreference).
Using this technique, we can very closely mimic what Python does with its string slicing. This does raise the question of whether std::basic_string::slice() is still useful, but in my opinion it can serve as the safe variant of the unsafe two-argument operator[], just as std::basic_string::at() is the safe variant of the unsafe single-index std::basic_string::operator[].
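A minimal sketch of the multi-subscript technique follows. The wrapper type sliceable_string and its range() member are hypothetical and exist only so the example compiles outside the standard library; the proposal adds the operator to std::basic_string itself. The operator is guarded by the __cpp_multidimensional_subscript feature-test macro, so the block still compiles with pre-C++23 compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical wrapper type used only for illustration.
struct sliceable_string {
    std::string value;

    // Unchecked half-open range [start, end). The caller guarantees
    // start <= end <= value.size().
    std::string_view range(std::size_t start, std::size_t end) const noexcept {
        return std::string_view(value.data() + start, end - start);
    }

#if defined(__cpp_multidimensional_subscript)
    // C++23 only: operator[] may take more than one subscript.
    std::string_view operator[](std::size_t start, std::size_t end) const noexcept {
        return range(start, end);
    }
#endif
};

inline void demo_multi_subscript() {
    const sliceable_string s{"hello world"};
    assert(s.range(6, 11) == "world");
#if defined(__cpp_multidimensional_subscript)
    // Extra parentheses keep the comma away from the assert macro.
    assert((s[6, 11] == "world")); // mirrors Python's s[6:11]
#endif
}
```

Like the single-index operator[], this form performs no bounds checking, which is what leaves room for a checked slice() alongside it.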
6. Technical Specifications
- std::basic_string::operator[] takes 2 parameters: size_t start and size_t end and returns a std::basic_string_view.
  - start is the starting index (inclusive) of where to start the slice.
    - There is no guarantee of safety when start >= size().
  - end is the ending index (exclusive) of where to end the slice.
    - There is no guarantee of safety when end > size().
    - There is no guarantee of safety when end < start.
- std::basic_string::slice() takes 2 parameters: size_t start and size_t end and returns a std::basic_string_view.
  - start is the starting index (inclusive) of where to start the slice.
    - std::out_of_range is thrown when start >= size().
  - end is the ending index (exclusive) of where to end the slice.
    - If end > size() then end will be set to size().
    - If end < start then end will be set to start.
- std::basic_string::first() takes 1 parameter: size_t count and returns a std::basic_string_view.
  - count is the number of characters to be included (counting from index 0) in the slice.
    - If count >= size() then count will be set to size().
- std::basic_string::last() takes 1 parameter: size_t count and returns a std::basic_string_view.
  - count is the number of characters to be included (counting from the last index) in the slice.
    - If count >= size() then count will be set to size().
These functions are easy to implement, although the details depend on each vendor's implementation of std::basic_string. I have provided unit tests and sample implementations in the GitHub project mentioned in the changelog.
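The rules listed in this section can be sketched as free functions over std::string. This is only an illustration under stated assumptions, not the author's sample implementation: the names unchecked_slice, checked_slice, first_chars and last_chars are hypothetical, and the real proposal specifies members of std::basic_string.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Stand-in for operator[](start, end): no checks at all. The result is only
// meaningful when start < s.size(), end <= s.size() and start <= end.
inline std::string_view unchecked_slice(const std::string& s,
                                        std::size_t start, std::size_t end) noexcept {
    return std::string_view(s.data() + start, end - start);
}

// Stand-in for slice(start, end): throws on a bad start, clamps end.
inline std::string_view checked_slice(const std::string& s,
                                      std::size_t start, std::size_t end) {
    if (start >= s.size())
        throw std::out_of_range("checked_slice: start >= size()");
    end = std::min(end, s.size()); // end > size()  ->  end = size()
    end = std::max(end, start);    // end < start   ->  end = start
    return std::string_view(s.data() + start, end - start);
}

// Stand-in for first(count): count is clamped to size().
inline std::string_view first_chars(const std::string& s, std::size_t count) noexcept {
    return std::string_view(s.data(), std::min(count, s.size()));
}

// Stand-in for last(count): count is clamped to size().
inline std::string_view last_chars(const std::string& s, std::size_t count) noexcept {
    const std::size_t n = std::min(count, s.size());
    return std::string_view(s.data() + (s.size() - n), n);
}

inline void demo_slicing() {
    const std::string s = "hello world";
    assert(unchecked_slice(s, 0, 5) == "hello");
    assert(checked_slice(s, 6, 100) == "world"); // end clamped to size()
    assert(checked_slice(s, 4, 2).empty());      // end raised to start
    assert(first_chars(s, 5) == "hello");
    assert(last_chars(s, 5) == "world");
    assert(last_chars(s, 100) == s);             // count clamped to size()
}
```

Only the checked slice can throw; the other three stay noexcept because they either clamp their argument or place the burden on the caller.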
7. Proposed Wording
7.1. Addition to <string>
Add the following to 23.4.3.1 basic.string.general:
// [...]
namespace std {
  // [...]
  // [string.ops], string operations
  // [...]
  constexpr bool contains(const charT* x) const;
  constexpr basic_string_view<charT, traits> operator[](size_t start, size_t end) const noexcept;
  constexpr basic_string_view<charT, traits> slice(size_t start, size_t end) const;
  constexpr basic_string_view<charT, traits> first(size_t count) const noexcept;
  constexpr basic_string_view<charT, traits> last(size_t count) const noexcept;
}
7.2. std::basic_string::operator[]
Add the following subclause to 23.4.3.8 string.ops:
- 23.4.3.?: basic_string::operator[] [string.slice]
  - constexpr basic_string_view<charT, traits> operator[](size_t start, size_t end) const noexcept;
    - Effects: Equivalent to: return basic_string_view<charT, traits>(data() + start, end - start);
7.3. std::basic_string::slice
Add the following subclause to 23.4.3.8 string.ops:
- 23.4.3.?: basic_string::slice [string.slice]
  - constexpr basic_string_view<charT, traits> slice(size_t start, size_t end) const;
    - Effects: Determines the effective end index xlen of the string to be returned as std::max(std::min(end, size()), start). Returns the characters in the range [data() + start, data() + xlen).
    - Returns: basic_string_view<charT, traits>(data() + start, xlen - start)
    - Throws: out_of_range if start >= size()
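As a worked example of the clamping expression std::max(std::min(end, size()), start), the sketch below evaluates it at compile time for a string of 11 characters. The function name effective_end is hypothetical and is used only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: the clamped end index used by slice().
constexpr std::size_t effective_end(std::size_t start, std::size_t end,
                                    std::size_t size) {
    return std::max(std::min(end, size), start);
}

// In range: the end index is kept as given.
static_assert(effective_end(2, 5, 11) == 5);
// end > size(): the end index is lowered to size().
static_assert(effective_end(6, 100, 11) == 11);
// end < start: the end index is raised to start, giving an empty range.
static_assert(effective_end(4, 2, 11) == 4);
```

In every case the result is at least start, so the number of characters in the returned range, xlen - start, cannot wrap around.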
7.4. std::basic_string::first
Add the following subclause to 23.4.3.8 string.ops:
- 23.4.3.?: basic_string::first [string.first]
  - constexpr basic_string_view<charT, traits> first(size_t count) const noexcept;
    - Effects: Equivalent to: return basic_string_view<charT, traits>(data(), std::min(count, size()));
7.5. std::basic_string::last
Add the following subclause to 23.4.3.8 string.ops:
- 23.4.3.?: basic_string::last [string.last]
  - constexpr basic_string_view<charT, traits> last(size_t count) const noexcept;
    - Effects: Let rlen be std::min(count, size()). Equivalent to: return basic_string_view<charT, traits>(data() + (size() - rlen), rlen);