Date: Mon, 24 Oct 2022 12:08:28 -0400
On Mon, Oct 24, 2022 at 11:20 AM Madhur Chauhan via Std-Proposals <
std-proposals_at_[hidden]> wrote:
> Hello everybody,
>
> *Problem*: Find the first set bit in a bitset with index >= a given pos
> *[...]*
> *Proposal*: Add a new member into std::bitset
>
Thought-provoking!
If you're touching std::bitset, then at first glance I'd think a good
approach would be to propose standardizing the entire <bit> header for
std::bitset:
bool has_single_bit(const std::bitset<N>&) noexcept;
std::bitset<N> bit_ceil(const std::bitset<N>&) noexcept;
std::bitset<N> bit_floor(const std::bitset<N>&) noexcept;
int bit_width(const std::bitset<N>&) noexcept;
std::bitset<N> rotl(const std::bitset<N>&, int) noexcept;
std::bitset<N> rotr(const std::bitset<N>&, int) noexcept;
int countl_zero(const std::bitset<N>&) noexcept;
int countr_zero(const std::bitset<N>&) noexcept;
int countl_one(const std::bitset<N>&) noexcept;
int countr_one(const std::bitset<N>&) noexcept;
int popcount(const std::bitset<N>&) noexcept;
Then you could write
int find_first_set(std::bitset<N> bs, int pos) {
bs >>= pos;
return std::countr_zero(bs) + pos;
}
Notice that if std::bitset were allocating, like `vector<bool>`, then you
would want to take-by-value in the functions that return bitsets (so that
you could manipulate the by-value parameter directly and then move it into
the return slot, instead of making your own copy); but since std::bitset is
non-allocating, like `std::array`, you actually want to take it by
reference and make that (NRVO-eligible) copy yourself.
I know my suggested `find_first_set` implementation isn't going to be as
fast as yours, because mine wastes all that time memcpy'ing (or worse)
inside the `>>=`. But is that a complete dealbreaker?
The "fully general, design-by-committee" approach might be to have
something like `std::slice` for bitsets, so that you could do `return
countr_zero(slice(bs, pos, -1))`. But that feels like a bad idea to me, and
therefore I bet probably also to the committee. ;)
Another idea might be to expand `std::bitset` with member versions of the
<bit> header, each taking optional start and end indices á là
std::string::compare:
template<size_t N>
class bitset {
static constexpr size_t npos = size_t(-1);
constexpr size_t countl_zero(size_t start = 0, size_t end = npos)
const;
constexpr size_t countr_zero(size_t start = 0, size_t end = npos)
const;
constexpr size_t countl_one(size_t start = 0, size_t end = npos)
const;
constexpr size_t countr_one(size_t start = 0, size_t end = npos)
const;
constexpr size_t bit_width(size_t start = 0, size_t end = npos)
const;
constexpr size_t popcount(size_t start = 0, size_t end = npos)
const;
constexpr bitset<N> bit_floor() const;
constexpr bitset<N> bit_ceil() const;
constexpr bitset<N> rotl(int n, size_t start = 0, size_t end =
npos) const;
constexpr bitset<N> rotr(int n, size_t start = 0, size_t end =
npos) const;
[...]
};
But this feels awkward, because these members return `size_t` instead of
`int`-as-in-<bit>; and also because the mutating ones `bit_floor`,
`rotl`,... don't really have good interfaces. It seems that you might want
something like `std::string::substr` for bitsets, too.
But the deeper I get into this rabbit hole, the more it seems that this
whole feature-set is incompatible with the design of std::bitset. The
purpose of `bitset` seems to be right in the name: it's a *set!* It's not
obviously meant to be an arbitrary-bignum-math class or even an
arbitrary-bitstring class. On the one hand, it does have >>= and <<=; but
on the other hand, if you're using it as a *set*, as a mapping from small
consecutive ints to bools, then it's highly non-obvious why you'd ever want
an operation like
std::bitset<10> b;
b[2] = true; // 2 is in the set
b <<= 3; // ha ha, now 2 is *not* in the set and 5 *is* in the set!
That operation just doesn't seem to correspond to anything in the problem
domain. Likewise,
int n = b.countr_zero(5); // count the number of consecutive elements
>=5 that are *not* in the set, until the next one that is
doesn't seem like a very set-like operation.
So maybe the real best answer would be to use `_ExtInt(N)`, and encourage
vendors to provide the <bit> operations for `_ExtInt(N)`...
I end up with no clear recommendation, except that if you propose such a
thing, the paper ought to discuss all these different possible designs and
explain the pros and cons of each (as you see them).
my $.02,
–Arthur
std-proposals_at_[hidden]> wrote:
> Hello everybody,
>
> *Problem*: Find the first set bit in a bitset with index >= a given pos
> *[...]*
> *Proposal*: Add a new member into std::bitset
>
Thought-provoking!
If you're touching std::bitset, then at first glance I'd think a good
approach would be to propose standardizing the entire <bit> header for
std::bitset:
bool has_single_bit(const std::bitset<N>&) noexcept;
std::bitset<N> bit_ceil(const std::bitset<N>&) noexcept;
std::bitset<N> bit_floor(const std::bitset<N>&) noexcept;
int bit_width(const std::bitset<N>&) noexcept;
std::bitset<N> rotl(const std::bitset<N>&, int) noexcept;
std::bitset<N> rotr(const std::bitset<N>&, int) noexcept;
int countl_zero(const std::bitset<N>&) noexcept;
int countr_zero(const std::bitset<N>&) noexcept;
int countl_one(const std::bitset<N>&) noexcept;
int countr_one(const std::bitset<N>&) noexcept;
int popcount(const std::bitset<N>&) noexcept;
Then you could write
int find_first_set(std::bitset<N> bs, int pos) {
bs >>= pos;
return std::countr_zero(bs) + pos;
}
Notice that if std::bitset were allocating, like `vector<bool>`, then you
would want to take-by-value in the functions that return bitsets (so that
you could manipulate the by-value parameter directly and then move it into
the return slot, instead of making your own copy); but since std::bitset is
non-allocating, like `std::array`, you actually want to take it by
reference and make that (NRVO-eligible) copy yourself.
I know my suggested `find_first_set` implementation isn't going to be as
fast as yours, because mine wastes all that time memcpy'ing (or worse)
inside the `>>=`. But is that a complete dealbreaker?
The "fully general, design-by-committee" approach might be to have
something like `std::slice` for bitsets, so that you could do `return
countr_zero(slice(bs, pos, -1))`. But that feels like a bad idea to me, and
therefore I bet probably also to the committee. ;)
Another idea might be to expand `std::bitset` with member versions of the
<bit> header, each taking optional start and end indices á là
std::string::compare:
template<size_t N>
class bitset {
static constexpr size_t npos = size_t(-1);
constexpr size_t countl_zero(size_t start = 0, size_t end = npos)
const;
constexpr size_t countr_zero(size_t start = 0, size_t end = npos)
const;
constexpr size_t countl_one(size_t start = 0, size_t end = npos)
const;
constexpr size_t countr_one(size_t start = 0, size_t end = npos)
const;
constexpr size_t bit_width(size_t start = 0, size_t end = npos)
const;
constexpr size_t popcount(size_t start = 0, size_t end = npos)
const;
constexpr bitset<N> bit_floor() const;
constexpr bitset<N> bit_ceil() const;
constexpr bitset<N> rotl(int n, size_t start = 0, size_t end =
npos) const;
constexpr bitset<N> rotr(int n, size_t start = 0, size_t end =
npos) const;
[...]
};
But this feels awkward, because these members return `size_t` instead of
`int`-as-in-<bit>; and also because the mutating ones `bit_floor`,
`rotl`,... don't really have good interfaces. It seems that you might want
something like `std::string::substr` for bitsets, too.
But the deeper I get into this rabbit hole, the more it seems that this
whole feature-set is incompatible with the design of std::bitset. The
purpose of `bitset` seems to be right in the name: it's a *set!* It's not
obviously meant to be an arbitrary-bignum-math class or even an
arbitrary-bitstring class. On the one hand, it does have >>= and <<=; but
on the other hand, if you're using it as a *set*, as a mapping from small
consecutive ints to bools, then it's highly non-obvious why you'd ever want
an operation like
std::bitset<10> b;
b[2] = true; // 2 is in the set
b <<= 3; // ha ha, now 2 is *not* in the set and 5 *is* in the set!
That operation just doesn't seem to correspond to anything in the problem
domain. Likewise,
int n = b.countr_zero(5); // count the number of consecutive elements
>=5 that are *not* in the set, until the next one that is
doesn't seem like a very set-like operation.
So maybe the real best answer would be to use `_ExtInt(N)`, and encourage
vendors to provide the <bit> operations for `_ExtInt(N)`...
I end up with no clear recommendation, except that if you propose such a
thing, the paper ought to discuss all these different possible designs and
explain the pros and cons of each (as you see them).
my $.02,
–Arthur
Received on 2022-10-24 16:08:40