Date: Mon, 24 Oct 2022 17:14:38 -0400
On Mon, Oct 24, 2022 at 1:22 PM Madhur Chauhan <madhur4127_at_[hidden]> wrote:
> Thanks for the comments Arthur, much appreciated!
>
> I think the core problem is absense of an efficient iterator for bitset.
> Normal random access iterator will still access each bit and that's
> inefficient so we'll need something different that can process words and
> then trailing bits.
> If that fancy iterator exists then we don't need proposed function and
> <algorithm> can be used instead.
>
That's a great point. I had forgotten (but now I vaguely recall) that
libc++ has several algorithms, such as std::find, specifically optimized
for bit-iterators. They also have a few algorithms specifically optimized
for deque::iterator, because deques are "piecewise contiguous" in a way
that's known at compile time. (Sadly, P0447 hive::iterator can't benefit
from the same optimizations because a hive's degree of piecewise-contiguity
is known only at runtime.)
Double-sadly, `std::bitset` doesn't have iterators! Bit-iterators only
exist for `vector<bool>`. I agree that "bitset is not iterable" might be
the root of the problem here. But, again, if we think of a bitset as a
mapping from small integers to bools (or a *set* of small integers), then
it kind of makes sense that you can't "iterate" it — you'd just be
iterating over "false, false, true, false, true, false..." which isn't
relevant to the problem domain. (Contrast to vector<bool>, which *is* a
sequence of bools you can iterate.)
I could see it being hard to convince LEWG that there was a real problem
here, unfortunately.
Did you try benchmarking the <algorithm> version of your code, and the
hand-rolled version, using `vector<bool>`? How does it compare?
I tried this version of the code:
https://godbolt.org/z/9q88qzG4e
GCC+*libstdc++* gives these numbers on Godbolt:
-----------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------
bs_manual 4184 ns 2251 ns 252405
bv_manual 4061 ns 2349 ns 288193
bv_stl 4164 ns 2210 ns 255113
bs_library 130 ns 62.6 ns 14677544
Clang+*libc++* gives these numbers on my local laptop:
-----------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------
bs_manual 2186 ns 2180 ns 315701
bv_manual 2115 ns 2112 ns 319833
bv_stl 39.7 ns 39.6 ns 17456620
Hmm, I had thought C++20 or C++23 was going to give us std::bit_iterator
and std::bit_reference... what ever happened to those?
Not that they would be specifically useful in this scenario, since you
can't get at the underlying storage of a std::bitset<N> in order to iterate
over it, even if you *had* a std::bit_iterator adaptor.
–Arthur
>
> Thanks for the comments Arthur, much appreciated!
>
> I think the core problem is absense of an efficient iterator for bitset.
> Normal random access iterator will still access each bit and that's
> inefficient so we'll need something different that can process words and
> then trailing bits.
> If that fancy iterator exists then we don't need proposed function and
> <algorithm> can be used instead.
>
That's a great point. I had forgotten (but now I vaguely recall) that
libc++ has several algorithms, such as std::find, specifically optimized
for bit-iterators. They also have a few algorithms specifically optimized
for deque::iterator, because deques are "piecewise contiguous" in a way
that's known at compile time. (Sadly, P0447 hive::iterator can't benefit
from the same optimizations because a hive's degree of piecewise-contiguity
is known only at runtime.)
Double-sadly, `std::bitset` doesn't have iterators! Bit-iterators only
exist for `vector<bool>`. I agree that "bitset is not iterable" might be
the root of the problem here. But, again, if we think of a bitset as a
mapping from small integers to bools (or a *set* of small integers), then
it kind of makes sense that you can't "iterate" it — you'd just be
iterating over "false, false, true, false, true, false..." which isn't
relevant to the problem domain. (Contrast to vector<bool>, which *is* a
sequence of bools you can iterate.)
I could see it being hard to convince LEWG that there was a real problem
here, unfortunately.
Did you try benchmarking the <algorithm> version of your code, and the
hand-rolled version, using `vector<bool>`? How does it compare?
I tried this version of the code:
https://godbolt.org/z/9q88qzG4e
GCC+*libstdc++* gives these numbers on Godbolt:
-----------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------
bs_manual 4184 ns 2251 ns 252405
bv_manual 4061 ns 2349 ns 288193
bv_stl 4164 ns 2210 ns 255113
bs_library 130 ns 62.6 ns 14677544
Clang+*libc++* gives these numbers on my local laptop:
-----------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------
bs_manual 2186 ns 2180 ns 315701
bv_manual 2115 ns 2112 ns 319833
bv_stl 39.7 ns 39.6 ns 17456620
Hmm, I had thought C++20 or C++23 was going to give us std::bit_iterator
and std::bit_reference... what ever happened to those?
Not that they would be specifically useful in this scenario, since you
can't get at the underlying storage of a std::bitset<N> in order to iterate
over it, even if you *had* a std::bit_iterator adaptor.
–Arthur
>
Received on 2022-10-24 21:14:50