ISOCPP std-proposals List: Re: [std-proposals] [bitset] find first set bit after a position pos

From: Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
Date: Mon, 24 Oct 2022 12:08:28 -0400

On Mon, Oct 24, 2022 at 11:20 AM Madhur Chauhan via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> Hello everybody,
>
> *Problem*: Find the first set bit in a bitset with index >= a given pos
> *[...]*
> *Proposal*: Add a new member into std::bitset
>

Thought-provoking!

If you're touching std::bitset, then at first glance I'd think a good
approach would be to propose standardizing the entire <bit> header for
std::bitset:
bool has_single_bit(const std::bitset<N>&) noexcept;
std::bitset<N> bit_ceil(const std::bitset<N>&) noexcept;
std::bitset<N> bit_floor(const std::bitset<N>&) noexcept;
int bit_width(const std::bitset<N>&) noexcept;
std::bitset<N> rotl(const std::bitset<N>&, int) noexcept;
std::bitset<N> rotr(const std::bitset<N>&, int) noexcept;
int countl_zero(const std::bitset<N>&) noexcept;
int countr_zero(const std::bitset<N>&) noexcept;
int countl_one(const std::bitset<N>&) noexcept;
int countr_one(const std::bitset<N>&) noexcept;
int popcount(const std::bitset<N>&) noexcept;

Then you could write
  int find_first_set(std::bitset<N> bs, int pos) {
    bs >>= pos;
    return std::countr_zero(bs) + pos;
  }

Notice that if std::bitset were allocating, like `vector<bool>`, then you
would want to take-by-value in the functions that return bitsets (so that
you could manipulate the by-value parameter directly and then move it into
the return slot, instead of making your own copy); but since std::bitset is
non-allocating, like `std::array`, you actually want to take it by
reference and make that (NRVO-eligible) copy yourself.

I know my suggested `find_first_set` implementation isn't going to be as
fast as yours, because mine wastes all that time memcpy'ing (or worse)
inside the `>>=`. But is that a complete dealbreaker?
The "fully general, design-by-committee" approach might be to have
something like `std::slice` for bitsets, so that you could do `return
countr_zero(slice(bs, pos, -1))`. But that feels like a bad idea to me, and
therefore I bet probably also to the committee. ;)

Another idea might be to expand `std::bitset` with member versions of the
<bit> header, each taking optional start and end indices á là
std::string::compare:

    template<size_t N>
    class bitset {
        static constexpr size_t npos = size_t(-1);
        constexpr size_t countl_zero(size_t start = 0, size_t end = npos)
const;
        constexpr size_t countr_zero(size_t start = 0, size_t end = npos)
const;
        constexpr size_t countl_one(size_t start = 0, size_t end = npos)
const;
        constexpr size_t countr_one(size_t start = 0, size_t end = npos)
const;
        constexpr size_t bit_width(size_t start = 0, size_t end = npos)
const;
        constexpr size_t popcount(size_t start = 0, size_t end = npos)
const;
        constexpr bitset<N> bit_floor() const;
        constexpr bitset<N> bit_ceil() const;
        constexpr bitset<N> rotl(int n, size_t start = 0, size_t end =
npos) const;
        constexpr bitset<N> rotr(int n, size_t start = 0, size_t end =
npos) const;
        [...]
    };

But this feels awkward, because these members return `size_t` instead of
`int`-as-in-<bit>; and also because the mutating ones `bit_floor`,
`rotl`,... don't really have good interfaces. It seems that you might want
something like `std::string::substr` for bitsets, too.

But the deeper I get into this rabbit hole, the more it seems that this
whole feature-set is incompatible with the design of std::bitset. The
purpose of `bitset` seems to be right in the name: it's a *set!* It's not
obviously meant to be an arbitrary-bignum-math class or even an
arbitrary-bitstring class. On the one hand, it does have >>= and <<=; but
on the other hand, if you're using it as a *set*, as a mapping from small
consecutive ints to bools, then it's highly non-obvious why you'd ever want
an operation like
    std::bitset<10> b;
    b[2] = true; // 2 is in the set
    b <<= 3; // ha ha, now 2 is *not* in the set and 5 *is* in the set!
That operation just doesn't seem to correspond to anything in the problem
domain. Likewise,
    int n = b.countr_zero(5); // count the number of consecutive elements
>=5 that are *not* in the set, until the next one that is
doesn't seem like a very set-like operation.

So maybe the real best answer would be to use `_ExtInt(N)`, and encourage
vendors to provide the <bit> operations for `_ExtInt(N)`...

I end up with no clear recommendation, except that if you propose such a
thing, the paper ought to discuss all these different possible designs and
explain the pros and cons of each (as you see them).

my $.02,
–Arthur

Received on 2022-10-24 16:08:40