std-proposals: Re: Finding the index of an element (std::find_first

From: Simon Kraemer <sikraemer_at_[hidden]>
Date: Sat, 3 Oct 2020 11:42:03 +0200

> auto i = /* insert std_find_if + distance or the proposed template to
search in bigDynamicList. */
> someLUT[i] = 42;

Why not just use iterators as they are meant to be used?

auto iter = std::find(...);
*iter = 42;

Same number of lines, no duplicate lookup.
No index required.

Kosher Yosher via Std-Proposals <std-proposals_at_[hidden]> schrieb am
Fr., 2. Okt. 2020, 19:41:

> 1 function call being more concise than 2 is not an opinion, it's a fact.
> I honestly can't believe I have to explain this over multiple emails. I
> hope we can agree that calling water "water" is better than
> "hydrogen + oxygen". Apply the exact same logic to programming languages
> please.
>
> And if you insist on an example that is "grounded":
>
> std::list<int> bigDynamicList; /* maybe part of a library you can't modify
> */
> std::vector<int> someLUT; /* happens to be what you are stuck with and you
> need to do something with it now */
> ...
> auto i = /* insert std_find_if + distance or the proposed template to
> search in bigDynamicList. */
> someLUT[i] = 42;
>
> is not as far fetched as you make it out to be.
>
> Using this indirection based index lookup concept is a very common
> operation, using an iterator for this purpose as you suggest is not what I
> would do here in C++, that'd be even uglier: the index semantics are fine
> and the intent is clear with the template function. If someLUT is not
> random access like you suggested, the proposed template is still going to
> be the one doing the minimum amount of work possible. You haven't pointed
> out a single gotcha yet, because there is none. It does what you would do
> by hand which is the correct solution for this operation.
>
> If the user needs both iterator and index use the ranged version. It's ok
> to have both in your toolkit and use the one that's concise for the
> use-case at hand. Most GetX functions for this type of A/B indirection do
> not need the iterator, they operate by definition on indices, so muddying
> up code everywhere with unused iterators / tuple elements does not sound
> good to me as a "standard" solution. Please don't say that's my opinion
> again or I will sprinkle `int unusedDummy123;` variables throughout your
> code. :)
>
> And about the last point of this function causing confusion ... let's keep
> it real: it's a function that says it returns the index of an element given
> a predicate, and like Arthur pointed out, fits in with the already existing
> count_X functions sort of, it's not an alien family of functions.
>
> auto num_items0 = std::count_if(v.begin(), v.end(), [](const auto& e){return e.abc == 0;});
>
> auto index_of_first0 = std::count_until(v.begin(), v.end(), [](const auto& e){return e.abc == 0;});
>
>
> Looks like a nice buddy sitting next to count_if.
>
> On Fri, 2 Oct 2020 at 15:09, Jason McKesson via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
>
>> On Thu, Oct 1, 2020 at 8:20 PM Gokhan via Std-Proposals
>> <std-proposals_at_[hidden]> wrote:
>> >
>> > But in those situations, the initial range is undoubtedly an array
>> (performance game programming doesn't do linked-lists), so `distance` is
>> 100% OK.
>> >
>> >
>> >
>> > Yes, but that is not the point. Theoretical examples like that are
>> cherries the standard picks on purpose in an attempt to make the library
>> code as generic as possible without sacrificing performance. The example,
>> regardless of how silly it might be in practice, is valid in that regard, a
>> program can be written to prove it, and hence is a point in favour of the
>> template.
>>
>> Changes to the standard need to be justified on the basis of making
>> someone's life as a C++ programmer *materially better*. If all of the
>> examples of cases where your suggestion are better than the
>> alternatives are purely theoretical, if the actual real-world cases
>> where someone needs this functionality can be covered by a trivial
>> addition to their code, then your idea has very weak motivation.
>>
>> Thus far, your motivation comes down to two points:
>>
>> 1: Aesthetics. 1 function call is cleaner than 2.
>>
>> 2: Ranges of size beyond that of a difference-type.
>>
>> #1 is purely an opinion, and #2 is something that happens so rarely in
>> the real world that it's basically a rounding error.
>>
>> However, with C++20 canonizing 2's complement signed integers, we do
>> have the means to solve #2 now. Previously, if a positive difference
>> from `std::distance` would exceed the different_type, then you get
>> undefined behavior. We could now say that `std::distance` in these
>> cases (and all iterator differencing functionality it is based on)
>> behaves exactly as if you had gotten the difference as a `size_type`,
>> then cast it to a `difference_type`. This would allow you to recover
>> values exceeding the signed range by casting it back to `size_type`.
>> Which C++20 makes valid.
>>
>> This is more generally useful than a targeted solution specifically
>> for `std::find` and similar algorithms.
>>
>> > The same theoretical reasoning applies as an argument against Matthews
>> point about the std::distance range covering the allocation limit in
>> practice. The standard is written against an imaginary machine that *can*
>> allocate the full max(size_t) range.
>> >
>> >
>> >
>> > std::distance also returns a signed integer type – containers are
>> indexed by size_type (unsigned). Another point in favour of the suggested
>> template.
>> >
>> >
>> >
>> > But, while all of that holds, it’s besides the more important practical
>> argument that it is simply more concise to have 1 vs 2 function calls. If
>> one of the goals of the standard committee is to convince people to stop
>> writing C in C++, and express their intent clearly, then this is what you
>> have to do: provide the absolute shortest form in the standard that Just
>> Works(TM). The suggested template function is that.
>>
>> The structure of that argument could be applied to virtually anything.
>> At some point, you have to look at a thing and ask whether the "1
>> function call" version is worth the effort of adding to the standard
>> compared to the "2 function call" version. Building higher level
>> functionality out of lower level functionality is perfectly normal.
>> Nobody expects the standard to provide a "do_my_work_for_me" function.
>>
>> So if you're going this route, then you need to provide a convincing
>> argument that this code is sufficiently common to be worth adding a
>> specific function for it. Especially since you're not just talking
>> about one function but an entire family of them.
>>
>> Also, I don't understand this: "convince people to stop writing C in
>> C++". Encouraging people to use indices when they could use iterators
>> is a big part of the "C in C++" style of coding. So how are you
>> encouraging people to stop doing that kind of coding when you're
>> providing tools to help them do that kind of coding?
>>
>> > The ranges solution Dvir suggests in [P2164]
>> >
>> > auto [index, it] = ranges::find_if(range | views::enumerate, pred);
>> >
>> > is pretty good too, and I was expecting something like that to come out
>> of the ranges proposals. It’s still a little off from the more concise,
>> proposed version though, especially given that we have all the other
>> count_X and find_X functions doing very similar work. I don’t see how
>> adding this template would make anything worse, once the return type and
>> naming is fixed.
>>
>> What if the user needs both the iterator and the index? I mean, if
>> they're going to access the value at that index, you may as well do it
>> through the iterator and save yourself the explicit offset, right? Is
>> that not a better "expression of the programmer's intent" than having
>> to offset the beginning of that range again? And what if the range
>> isn't random-access; that increment is an O(1) operation?
>>
>> Basically, we don't need 3 ways to do a find. We need 1 way that has
>> the flexibility to do what's needed. If the user needs enumeration,
>> then they can be taught the way to do that: use `views::enumerate`.
>> That mechanism is *universal*, working on every algorithm and even
>> range-based `for` equally well and in a consistent way.
>>
>> Why confuse this by adding a second way to enumerate a range that only
>> works in very specific cases?
>> --
>> Std-Proposals mailing list
>> Std-Proposals_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>

Received on 2020-10-03 04:42:17