std-proposals

Re: [std-proposals] Iterators for basic_string and basic_string_view find functions

From: Yexuan Xiao <bizwen_at_[hidden]>
Date: Fri, 1 Sep 2023 14:58:22 +0000
We now have tools like Clang-tidy that can convert subscripts to iterators, although it will slightly reduce efficiency:

// input: s, "ii", pos
// output: p
auto p = s.find("ii", pos);
// equivalent to
auto i = s.ifind("ii", std::begin(s) + pos);
decltype(s)::size_type p = (i == std::end(s)) ? decltype(s)::npos : std::distance(std::begin(s), i);
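The index/iterator conversion above can be written today without the proposed ifind. The following is only a sketch: index_to_iterator and iterator_to_index are hypothetical helper names invented here, not part of the proposal or of the standard library.

```cpp
#include <cassert>
#include <iterator>
#include <string>

// Hypothetical helper: turn an index result (possibly npos) into an iterator.
// npos maps to s.end(), mirroring the "not found" convention of the proposal.
template <class String>
typename String::const_iterator
index_to_iterator(const String& s, typename String::size_type pos) {
    using diff = typename String::difference_type;
    return pos == String::npos ? s.end() : s.begin() + static_cast<diff>(pos);
}

// Hypothetical helper: turn an iterator result (possibly end()) into an index.
// end() maps to npos, as in the conversion shown above.
template <class String>
typename String::size_type
iterator_to_index(const String& s, typename String::const_iterator it) {
    using size = typename String::size_type;
    return it == s.end() ? String::npos
                         : static_cast<size>(std::distance(s.begin(), it));
}
```

One caveat of this mapping: s.find("", s.size()) returns s.size() rather than npos, so an index equal to size() and npos both end up at end() and cannot be told apart after the round trip.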

Deprecation does not have to start in C++26; that is a possibility, not a necessity, and I think it could be decided by a separate vote. I just want to convey the idea of recommending the new functions.
Maybe compilers can provide a warning option to encourage people to use the new functions, but this requires the support of the compiler authors, not the standard.

________________________________
From: Std-Proposals <std-proposals-bounces_at_lists.isocpp.org> on behalf of Sebastian Wittmeier via Std-Proposals <std-proposals_at_[hidden]>
Sent: Friday, September 1, 2023 22:36
To: std-proposals_at_[hidden]g <std-proposals_at_[hidden]>
Cc: Sebastian Wittmeier <wittmeier_at_[hidden]>
Subject: Re: [std-proposals] Iterators for basic_string and basic_string_view find functions


Some companies and other institutions may have regulations requiring all code to be fixed so that it does not rely on deprecated features.



std::string use is very wide-spread, so lots of high-level code is affected.



By introducing an alternate feature and deprecating the existing one within the *same* revision, you give users a very short amount of time to adapt.

E.g., if this were introduced in C++26, code following such a deprecation regulation could not be compatible with C++23 and C++26 at the same time, unless one implements both versions behind #if __cplusplus / _MSC_VER checks, or avoids the find() functions altogether and implements the functionality manually.
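A rough sketch of the dual-implementation workaround mentioned above follows. HYPOTHETICAL_STRING_IFIND is a made-up macro and find_iter a made-up wrapper name, both used purely for illustration; no such feature-test macro exists, and ifind is only a proposed member.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// find_iter is a hypothetical wrapper. Callers use it instead of calling
// find() or the proposed ifind() directly, so only this one function has
// to know which interface the library offers.
inline std::string::const_iterator
find_iter(const std::string& s, const std::string& needle) {
#if defined(HYPOTHETICAL_STRING_IFIND)
    // Assumed future path: the proposed member, not in any standard today.
    return s.ifind(needle);
#else
    // Current path: index-based find(), converted to an iterator by hand.
    const std::size_t pos = s.find(needle);
    return pos == std::string::npos
               ? s.end()
               : s.begin() + static_cast<std::ptrdiff_t>(pos);
#endif
}
```

Every code base under such a regulation would need a wrapper like this, or the equivalent #if at each call site, for as long as it has to build under both revisions.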


-----Original Message-----
From: Yexuan Xiao via Std-Proposals <std-proposals_at_lists.isocpp.org>
Sent: Fri 01.09.2023 16:26
Subject: Re: [std-proposals] Iterators for basic_string and basic_string_view find functions
To: std-proposals_at_[hidden]p.org;
CC: Yexuan Xiao <bizwen_at_[hidden]>;
The xxx.h headers were deprecated in C++98 and undeprecated in C++23. The standard itself has gone against that point, and it does not specify when deprecated content must be removed, so the functions can stay deprecated and time can decide.

________________________________
From: Jonathan Wakely <cxx_at_kayari.org>
Sent: Friday, September 1, 2023 21:59
To: Xiao Yexuan <bizwen_at_[hidden]>
Cc: C++ Proposals <std-proposals_at_[hidden]>
Subject: Re: [std-proposals] Iterators for basic_string and basic_string_view find functions



On Fri, 1 Sept 2023, 14:27 Xiao Yexuan, <bizwen_at_nykz.org> wrote:

Marking a function as deprecated does not mean deleting it; it may never be deleted. It just means that its use is discouraged.

That's not what the standard says it means:


"Normative for the current revision of C++, but having been identified as a candidate for removal from future revisions."




We now have tools like Clang-tidy that can convert subscripts to iterators, although it will slightly reduce efficiency.

// input: s, "ii", pos
// output: p
auto p = s.find("ii", pos);
// equivalent to
auto i = s.ifind("ii", std::begin(s) + pos);
decltype(s)::size_type p = (i == std::end(s)) ? decltype(s)::npos : std::distance(std::begin(s), i);





From: Jonathan Wakely <cxx_at_[hidden]>
Sent: September 1, 2023 20:40
To: std-proposals_at_[hidden]
Cc: 萧叶轩 <bizwen_at_[hidden]>
Subject: Re: [std-proposals] Iterators for basic_string and basic_string_view find functions







On Fri, 1 Sept 2023 at 13:19, 萧 叶轩 via Std-Proposals <std-proposals_at_[hidden]> wrote:

Abstract

This paper proposes to add iterator-based versions of the find family of functions for basic_string and basic_string_view, and deprecate the index-based versions.



Deprecating the existing functions will needlessly break many, many programs. Please don't. They work fine, exactly as they've worked for decades.





This is to align with the iterator-based interface of the C++ standard library. The current use of indices and npos is inconsistent with the C++ style and causes confusion and inefficiency.

Motivation

The C++ standard library extensively uses iterators as a generic way to access and manipulate elements in a range. Iterators were invented by Alexander Stepanov, who also designed the STL, which was later incorporated into the standard library. However, the design of std::string predates the adoption of the STL into the standard library, and thus its find family of member functions does not use iterators, but rather indices and a special value npos to indicate the position of elements or substrings. This is unfortunate, as it creates a discrepancy between the interface of std::string and other standard containers and algorithms.

The use of indices and npos has several drawbacks:

  * It is inconsistent with the iterator-based interface of the standard library, which makes std::string less compatible with generic algorithms and utilities.
  * It confuses users: npos is a static data member that beginners, or people unfamiliar with C++, can easily overlook. This raises the learning cost, and conceptually npos is a magic value.
  * It is inefficient, as it requires extra arithmetic operations to convert between indices and iterators, or to check for npos. For example, one might need to write something like this:

if (auto pos = s.find(c); pos != std::string::npos) {
 auto it = s.begin() + pos;
 // do something with it
}

This involves an addition operation, which could be avoided if s.find(c) returned an iterator directly.
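For the single-character case, the generic algorithm std::find already returns an iterator with no index arithmetic. The sketch below wraps it in a hypothetical helper named find_char. Whether it performs as well as the member find() is implementation-dependent and is not claimed here.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// find_char is a hypothetical helper name used only for this sketch.
// Returns an iterator to the first occurrence of c, or s.end() if c is absent.
inline std::string::const_iterator find_char(const std::string& s, char c) {
    return std::find(s.begin(), s.end(), c);
}
```

A caller can then write `if (auto it = find_char(s, c); it != s.end()) { ... }` and use the iterator directly, with no comparison against npos.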

Proposal

I propose to add a new set of find functions for basic_string and basic_string_view, which take and return iterators instead of indices. These functions will have the same name as the existing ones, but with an “i” prefix. For example:

template<class CharT, class Traits, class Allocator>
 constexpr basic_string<CharT, Traits, Allocator>::const_iterator
    basic_string<CharT, Traits, Allocator>::ifind(CharT ch, basic_string<CharT, Traits, Allocator>::const_iterator first = {}) const noexcept;

template<class CharT, class Traits>
 constexpr basic_string_view<CharT, Traits>::const_iterator
    basic_string_view<CharT, Traits>::ifind(CharT ch, basic_string_view<CharT, Traits>::const_iterator first = {}) const noexcept;

These functions will behave similarly to the existing ones, except that they will return an iterator to the first element of the found substring, or the end iterator if it is not found, and they will take iterators instead of indices as parameters to specify the range to search in. For example:

std::string s {"Hello world"};
auto it = s.ifind("world"); // returns an iterator to 'w'
auto it2 = s.ifind("foo"); // returns s.end()
auto it3 = s.ifind("ll", s.begin() + 2); // returns an iterator to 'l'
auto it4 = s.ifind("ll", s.begin() + 3); // returns s.end()
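The behaviour in these four calls can be approximated today with std::search. The sketch below uses a hypothetical free function, ifind_sketch, in place of the proposed member. It assumes plain operator== comparison rather than Traits::eq, and it uses an overload for the starting position instead of the `= {}` default in the declarations, since a value-initialized iterator does not refer to s.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// ifind_sketch is a hypothetical stand-in for the proposed ifind member.
// Searches [first, s.end()) for needle and returns an iterator to the start
// of the first match, or s.end() if there is none.
inline std::string::const_iterator
ifind_sketch(const std::string& s, const std::string& needle,
             std::string::const_iterator first) {
    return std::search(first, s.end(), needle.begin(), needle.end());
}

// Overload that searches the whole string.
inline std::string::const_iterator
ifind_sketch(const std::string& s, const std::string& needle) {
    return ifind_sketch(s, needle, s.begin());
}
```

With `const std::string s{"Hello world"};`, the calls `ifind_sketch(s, "world")`, `ifind_sketch(s, "foo")`, `ifind_sketch(s, "ll", s.begin() + 2)` and `ifind_sketch(s, "ll", s.begin() + 3)` correspond to it, it2, it3 and it4 above.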

I also propose to deprecate the existing index-based find functions, and encourage users to migrate to the new iterator-based ones. This will make the interface of basic_string and basic_string_view more consistent with the rest of the standard library, and avoid the confusion and inefficiency caused by indices and npos.

References

  * [1] Alexander Stepanov, “STL and Its Design Principles”, Talk presented at Adobe Systems Inc., 2002.
  * [2] Bjarne Stroustrup, “The Design and Evolution of C++”, Addison-Wesley, 1994.



--
Std-Proposals mailing list
Std-Proposals_at_[hidden]<mailto:Std-Proposals_at_[hidden]>
https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals




Received on 2023-09-01 14:58:29