Date: Fri, 1 Sep 2023 12:18:48 +0000
Abstract
This paper proposes adding iterator-based versions of the find family of member functions to basic_string and basic_string_view, and deprecating the index-based versions. The goal is to align these classes with the iterator-based interface used by the rest of the C++ standard library. The current use of indices and npos is inconsistent with that interface and causes confusion and inefficiency.
Motivation
The C++ standard library extensively uses iterators as a generic way to access and manipulate elements in a range. Iterators were introduced to C++ by Alexander Stepanov, who designed the STL, which was later incorporated into the standard library. However, the index-based interface of std::string predates the adoption of the STL, and thus its find family of member functions uses indices and the special value npos, rather than iterators, to indicate the position of elements or substrings. This is unfortunate, as it creates a discrepancy between the interface of std::string and that of the other standard containers and algorithms.
The use of indices and npos has several drawbacks:
* It is inconsistent with the iterator-based interface of the standard library, which makes std::string less compatible with generic algorithms and utilities.
* It confuses users: npos is a static data member used as a sentinel, and beginners or programmers unfamiliar with C++ can easily overlook it or forget to check for it. This raises the learning cost, and conceptually npos is a magic value.
* It is inefficient, as it requires extra arithmetic operations to convert between indices and iterators, or to check for npos. For example, one might need to write something like this:
if (auto pos = s.find(c); pos != std::string::npos) {
    auto it = s.begin() + pos;
    // do something with it
}
This involves an addition operation, which could be avoided if s.find(c) returned an iterator directly.
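To make the conversion cost concrete, here is a minimal sketch of the wrapper that users currently have to write by hand. The name find_iter is hypothetical and is not part of the standard library or of this proposal; it only illustrates the index-to-iterator boilerplate described above.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper, for illustration only: wraps the existing
// index-based find and converts its result into an iterator.
// Works for std::string and std::string_view alike.
template <class String>
typename String::const_iterator
find_iter(const String& s, typename String::value_type c)
{
    const auto pos = s.find(c);
    if (pos == String::npos) {
        return s.end();              // map the npos sentinel to end()
    }
    assert(pos < s.size());          // a found index is always in range
    // The extra addition that an iterator-returning find would avoid.
    return s.begin() + static_cast<typename String::difference_type>(pos);
}
```

Every call goes through one comparison against npos and one iterator addition, which is exactly the overhead an iterator-returning member function would not need.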
Proposal
I propose to add a new set of find functions for basic_string and basic_string_view, which take and return iterators instead of indices. These functions will have the same names as the existing ones, but with an "i" prefix. For example:
template<class CharT, class Traits, class Allocator>
constexpr basic_string<CharT, Traits, Allocator>::const_iterator
basic_string<CharT, Traits, Allocator>::ifind(CharT ch, basic_string<CharT, Traits, Allocator>::const_iterator first = {}) const noexcept;
template<class CharT, class Traits>
constexpr basic_string_view<CharT, Traits>::const_iterator
basic_string_view<CharT, Traits>::ifind(CharT ch, basic_string_view<CharT, Traits>::const_iterator first = {}) const noexcept;
These functions will behave like the existing ones, with two differences: they will return an iterator to the first element of the found substring, or the end iterator if nothing is found, and they will take an iterator, rather than an index, to specify where the search begins. For example:
std::string s {"Hello world"};
auto it = s.ifind("world"); // returns an iterator to 'w'
auto it2 = s.ifind("foo"); // returns s.end()
auto it3 = s.ifind("ll", s.begin() + 2); // returns an iterator to 'l'
auto it4 = s.ifind("ll", s.begin() + 3); // returns s.end()
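The behavior shown in these examples can be approximated today with a free function layered on the existing index-based find. The following is only a sketch under stated assumptions: the free function ifind is a hypothetical stand-in for the proposed member function, it covers only the substring overload for std::string, and it uses a separate overload instead of the `first = {}` default argument, because a value-initialized iterator does not refer to any particular string.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical free-function stand-in for the proposed member ifind.
// Searches [first, s.end()) for needle and returns an iterator to the
// first character of the match, or s.end() if there is no match.
inline std::string::const_iterator
ifind(const std::string& s, std::string_view needle,
      std::string::const_iterator first)
{
    // Precondition: first must point into s (or be s.end()).
    assert(s.begin() <= first && first <= s.end());
    const auto start = static_cast<std::string::size_type>(first - s.begin());
    const auto pos = s.find(needle, start);
    return pos == std::string::npos
        ? s.end()
        : s.begin() + static_cast<std::string::difference_type>(pos);
}

// Overload that searches the whole string.
inline std::string::const_iterator
ifind(const std::string& s, std::string_view needle)
{
    return ifind(s, needle, s.begin());
}
```

With `const std::string s{"Hello world"};`, the call `ifind(s, "ll", s.begin() + 2)` yields an iterator to the first 'l', and `ifind(s, "ll", s.begin() + 3)` yields `s.end()`, matching the examples above. The string passed as s must outlive the returned iterator.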
I also propose to deprecate the existing index-based find functions, and encourage users to migrate to the new iterator-based ones. This will make the interface of basic_string and basic_string_view more consistent with the rest of the standard library, and avoid the confusion and inefficiency caused by indices and npos.
References
* [1] Alexander Stepanov, "STL and Its Design Principles", Talk presented at Adobe Systems Inc., 2002.
* [2] Bjarne Stroustrup, "The Design and Evolution of C++", Addison-Wesley, 1994.
Received on 2023-09-01 12:18:54