Hello everybody,
Problem: Find the first set bit in a bitset with index >= a given pos
[...]
Proposal: Add a new member into std::bitset
Thought-provoking!
If you're touching std::bitset, then at first glance I'd think a good approach would be to propose standardizing the entire <bit> header for std::bitset:
bool has_single_bit(const std::bitset<N>&) noexcept;
std::bitset<N> bit_ceil(const std::bitset<N>&) noexcept;
std::bitset<N> bit_floor(const std::bitset<N>&) noexcept;
int bit_width(const std::bitset<N>&) noexcept;
std::bitset<N> rotl(const std::bitset<N>&, int) noexcept;
std::bitset<N> rotr(const std::bitset<N>&, int) noexcept;
int countl_zero(const std::bitset<N>&) noexcept;
int countr_zero(const std::bitset<N>&) noexcept;
int countl_one(const std::bitset<N>&) noexcept;
int countr_one(const std::bitset<N>&) noexcept;
int popcount(const std::bitset<N>&) noexcept;
Then you could write
int find_first_set(std::bitset<N> bs, int pos) {
bs >>= pos;
return std::countr_zero(bs) + pos;
}
Notice that if std::bitset were allocating, like `vector<bool>`, then you would want to take-by-value in the functions that return bitsets (so that you could manipulate the by-value parameter directly and then move it into the return slot, instead of making your own copy); but since std::bitset is non-allocating, like `std::array`, you actually want to take it by reference and make that (NRVO-eligible) copy yourself.
I know my suggested `find_first_set` implementation isn't going to be as fast as yours, because mine wastes all that time memcpy'ing (or worse) inside the `>>=`. But is that a complete dealbreaker?
The "fully general, design-by-committee" approach might be to have something like `std::slice` for bitsets, so that you could do `return countr_zero(slice(bs, pos, -1))`. But that feels like a bad idea to me, and therefore I bet probably also to the committee. ;)
Another idea might be to expand `std::bitset` with member versions of the <bit> header, each taking optional start and end indices á là std::string::compare:
template<size_t N>
class bitset {
static constexpr size_t npos = size_t(-1);
constexpr size_t countl_zero(size_t start = 0, size_t end = npos) const;
constexpr size_t countr_zero(size_t start = 0, size_t end = npos) const;
constexpr size_t countl_one(size_t start = 0, size_t end = npos) const;
constexpr size_t countr_one(size_t start = 0, size_t end = npos) const;
constexpr size_t bit_width(size_t start = 0, size_t end = npos) const;
constexpr size_t popcount(size_t start = 0, size_t end = npos) const;
constexpr bitset<N> bit_floor() const;
constexpr bitset<N> bit_ceil() const;
constexpr bitset<N> rotl(int n, size_t start = 0, size_t end = npos) const;
constexpr bitset<N> rotr(int n, size_t start = 0, size_t end = npos) const;
[...]
};
But this feels awkward, because these members return `size_t` instead of `int`-as-in-<bit>; and also because the mutating ones `bit_floor`, `rotl`,... don't really have good interfaces. It seems that you might want something like `std::string::substr` for bitsets, too.
But the deeper I get into this rabbit hole, the more it seems that this whole feature-set is incompatible with the design of std::bitset. The purpose of `bitset` seems to be right in the name: it's a set! It's not obviously meant to be an arbitrary-bignum-math class or even an arbitrary-bitstring class. On the one hand, it does have >>= and <<=; but on the other hand, if you're using it as a set, as a mapping from small consecutive ints to bools, then it's highly non-obvious why you'd ever want an operation like
std::bitset<10> b;
b[2] = true; // 2 is in the set
b <<= 3; // ha ha, now 2 is not in the set and 5 is in the set!
That operation just doesn't seem to correspond to anything in the problem domain. Likewise,
int n = b.countr_zero(5); // count the number of consecutive elements >=5 that are not in the set, until the next one that is
doesn't seem like a very set-like operation.
So maybe the real best answer would be to use `_ExtInt(N)`, and encourage vendors to provide the <bit> operations for `_ExtInt(N)`...
I end up with no clear recommendation, except that if you propose such a thing, the paper ought to discuss all these different possible designs and explain the pros and cons of each (as you see them).
my $.02,
–Arthur