C++ Logo

std-proposals

Advanced search

Re: [std-proposals] Floating an idea: unstable_remove_if and friends

From: Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
Date: Fri, 7 Feb 2025 11:48:15 -0500
You can find a few more benchmarks for `unstable_remove_if` by searching
GitHub (oddly, not Google — I guess Google's AI is feuding with Microsoft's
AI now) for "unstable_remove_if benchmark."

Eric Niebler's range-v3 library has `unstable_remove_if` (#1010
<https://github.com/ericniebler/range-v3/pull/1010>), including some
potentially important discussion about the iterator+sentinel overload. At
first glance I think that discussion applies only to the
ranges::action::unstable_remove_if "action," which wouldn't be included in
the Standard Library version, but I could be wrong.

My own benchmark for SG14's `unstable_remove_if` (here
<https://github.com/Quuxplusone/SG14/blob/b509fc470a30e1d19820487f899f987c12653a68/benchmarks/unstable_remove_bench.cpp#L20-L38>)
compares the algorithm on a vector(30'000) of array<int,16>. Its reported
timings on my MacBook just now are 620µs for unstable_remove_if, 650µs for
stable remove_if.

It occurred to me yesterday that (at least when the predicate is
usually-false) the stable `remove_if` spends most of its time shifting down
large chunks of elements by small fixed amounts. libc++
<https://github.com/llvm/llvm-project/blob/main/libcxx/include/__algorithm/remove_if.h>,
libstdc++
<https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/stl_algobase.h#L2122-L2140>,
and
MS
<https://github.com/microsoft/STL/blob/fc15609a0f2ae2a134c34e7c9a13977994f37367/stl/inc/xmemory#L2341-L2360>
all
implement that as repeated single assignments: test, assign, increment,
test, assign, increment... But in the commonest case (contiguous iterator,
trivially move-assignable value_type) it might make sense to implement it
as test, increment, test, increment, test, increment, std::copy; test,
increment, test, increment, test, increment, test, increment, std::copy;
and so on, because the large std::copy chunks can be further optimized into
memmove. I wonder if anyone's ever benchmarked *that*.

–Arthur


On Thu, Feb 6, 2025 at 7:34 PM Ivan Matek via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> If you could share some code that would be nice.
>
> It is possible that compiler just decided to not unroll one loop and
> unroll another or something silly like that.
>
> Although last time I checked this(10+y ago) I also noticed minimal
> improvements except for types that do not have efficient move constructor.
>
> On Thu, Feb 6, 2025 at 10:10 PM Ted Lyngmo via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
>
>> Hi!
>>
>> I created a few simple test cases to be able to show what would be
>> gained by using the set of unstable_* functions since I didn’t have my
>> old tests saved.
>>
>> I’m not sure what’s changed since I wrote the functions, but their
>> performance is really bad. Only with very contrived data sets will they
>> outperform the “stable” versions. I assume cache locality explains most
>> of the difference. I must have been testing them with some bias back
>> then or used an ancient computer :)
>>
>> Anyway, I no longer think they are such a good idea.
>
>
>
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>

Received on 2025-02-07 16:48:29