std-proposals: Re: Finding the index of an element (std::find_first

From: Kosher Yosher <goekhan.oezdogan_at_[hidden]>
Date: Fri, 2 Oct 2020 18:39:29 +0100

1 function call being more concise than 2 is not an opinion, it's a fact. I
honestly can't believe I have to explain this over multiple emails. I hope
we can agree that calling water "water" is better than "hydrogen + oxygen".
Apply the exact same logic to programming languages please.

And if you insist on an example that is "grounded":

std::list<int> bigDynamicList; /* maybe part of a library you can't modify
*/
std::vector<int> someLUT; /* happens to be what you are stuck with and you
need to do something with it now */
...
auto i = /* insert std_find_if + distance or the proposed template to
search in bigDynamicList. */
someLUT[i] = 42;

is not as far fetched as you make it out to be.

Using this indirection based index lookup concept is a very common
operation, using an iterator for this purpose as you suggest is not what I
would do here in C++, that'd be even uglier: the index semantics are fine
and the intent is clear with the template function. If someLUT is not
random access like you suggested, the proposed template is still going to
be the one doing the minimum amount of work possible. You haven't pointed
out a single gotcha yet, because there is none. It does what you would do
by hand which is the correct solution for this operation.

If the user needs both iterator and index use the ranged version. It's ok
to have both in your toolkit and use the one that's concise for the
use-case at hand. Most GetX functions for this type of A/B indirection do
not need the iterator, they operate by definition on indices, so muddying
up code everywhere with unused iterators / tuple elements does not sound
good to me as a "standard" solution. Please don't say that's my opinion
again or I will sprinkle `int unusedDummy123;` variables throughout your
code. :)

And about the last point of this function causing confusion ... let's keep
it real: it's a function that says it returns the index of an element given
a predicate, and like Arthur pointed out, fits in with the already existing
count_X functions sort of, it's not an alien family of functions.

auto num_items0 = std::count_if(v.begin(), v.end(), [](const auto&
e){return e.abc == 0;});

auto index_of_first0 = std::count_until(v.begin(), v.end(), [](const
auto& e){return e.abc == 0;});

Looks like a nice buddy sitting next to count_if.

On Fri, 2 Oct 2020 at 15:09, Jason McKesson via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> On Thu, Oct 1, 2020 at 8:20 PM Gokhan via Std-Proposals
> <std-proposals_at_[hidden]> wrote:
> >
> > But in those situations, the initial range is undoubtedly an array
> (performance game programming doesn't do linked-lists), so `distance` is
> 100% OK.
> >
> >
> >
> > Yes, but that is not the point. Theoretical examples like that are
> cherries the standard picks on purpose in an attempt to make the library
> code as generic as possible without sacrificing performance. The example,
> regardless of how silly it might be in practice, is valid in that regard, a
> program can be written to prove it, and hence is a point in favour of the
> template.
>
> Changes to the standard need to be justified on the basis of making
> someone's life as a C++ programmer *materially better*. If all of the
> examples of cases where your suggestion are better than the
> alternatives are purely theoretical, if the actual real-world cases
> where someone needs this functionality can be covered by a trivial
> addition to their code, then your idea has very weak motivation.
>
> Thus far, your motivation comes down to two points:
>
> 1: Aesthetics. 1 function call is cleaner than 2.
>
> 2: Ranges of size beyond that of a difference-type.
>
> #1 is purely an opinion, and #2 is something that happens so rarely in
> the real world that it's basically a rounding error.
>
> However, with C++20 canonizing 2's complement signed integers, we do
> have the means to solve #2 now. Previously, if a positive difference
> from `std::distance` would exceed the different_type, then you get
> undefined behavior. We could now say that `std::distance` in these
> cases (and all iterator differencing functionality it is based on)
> behaves exactly as if you had gotten the difference as a `size_type`,
> then cast it to a `difference_type`. This would allow you to recover
> values exceeding the signed range by casting it back to `size_type`.
> Which C++20 makes valid.
>
> This is more generally useful than a targeted solution specifically
> for `std::find` and similar algorithms.
>
> > The same theoretical reasoning applies as an argument against Matthews
> point about the std::distance range covering the allocation limit in
> practice. The standard is written against an imaginary machine that *can*
> allocate the full max(size_t) range.
> >
> >
> >
> > std::distance also returns a signed integer type – containers are
> indexed by size_type (unsigned). Another point in favour of the suggested
> template.
> >
> >
> >
> > But, while all of that holds, it’s besides the more important practical
> argument that it is simply more concise to have 1 vs 2 function calls. If
> one of the goals of the standard committee is to convince people to stop
> writing C in C++, and express their intent clearly, then this is what you
> have to do: provide the absolute shortest form in the standard that Just
> Works(TM). The suggested template function is that.
>
> The structure of that argument could be applied to virtually anything.
> At some point, you have to look at a thing and ask whether the "1
> function call" version is worth the effort of adding to the standard
> compared to the "2 function call" version. Building higher level
> functionality out of lower level functionality is perfectly normal.
> Nobody expects the standard to provide a "do_my_work_for_me" function.
>
> So if you're going this route, then you need to provide a convincing
> argument that this code is sufficiently common to be worth adding a
> specific function for it. Especially since you're not just talking
> about one function but an entire family of them.
>
> Also, I don't understand this: "convince people to stop writing C in
> C++". Encouraging people to use indices when they could use iterators
> is a big part of the "C in C++" style of coding. So how are you
> encouraging people to stop doing that kind of coding when you're
> providing tools to help them do that kind of coding?
>
> > The ranges solution Dvir suggests in [P2164]
> >
> > auto [index, it] = ranges::find_if(range | views::enumerate, pred);
> >
> > is pretty good too, and I was expecting something like that to come out
> of the ranges proposals. It’s still a little off from the more concise,
> proposed version though, especially given that we have all the other
> count_X and find_X functions doing very similar work. I don’t see how
> adding this template would make anything worse, once the return type and
> naming is fixed.
>
> What if the user needs both the iterator and the index? I mean, if
> they're going to access the value at that index, you may as well do it
> through the iterator and save yourself the explicit offset, right? Is
> that not a better "expression of the programmer's intent" than having
> to offset the beginning of that range again? And what if the range
> isn't random-access; that increment is an O(1) operation?
>
> Basically, we don't need 3 ways to do a find. We need 1 way that has
> the flexibility to do what's needed. If the user needs enumeration,
> then they can be taught the way to do that: use `views::enumerate`.
> That mechanism is *universal*, working on every algorithm and even
> range-based `for` equally well and in a consistent way.
>
> Why confuse this by adding a second way to enumerate a range that only
> works in very specific cases?
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>

Received on 2020-10-02 12:39:43