Date: Thu, 7 Nov 2019 18:11:14 +0100
Dear std C++ community
The new ranges and views will change the way we write algorithms. I
recently found std::sample and thought it would be nice to have to have an
analogous sample view which samples from a population without repetition -
with repetition is so trivial that we can ignore it for now.
What do you think about a sample view that accepts a range and a random
generator and returns a range which can be iterated over to access the
elements of the input range in a random order evaluating a bijective
(perfect and minimal) hash function to generate random indices without
repetition? This range would could still be relatively lightweight, because
it would only store a state encoded in an integer and the randomized
bijective hash function, similar to a random number generator. Sadly this
could only work well for random access iterators, not for forward
iterators, there reservoir sampling is the non-lightweight solution.
To preempt one objection, yes a randomized bijective hash for n consecutive
integers can be generated in small constant time. Multiplication with a
coprime integer and addition modulo n is bijective and randomizes the
order, for example. There are better hashes however.
I'd be interested what you think and volunteer to write a sample
implementation if there is interest - I am using similar ideas in my own
code, but you know how it is, it's not pretty.
Best Regards,
Wolfgang Brehm
The new ranges and views will change the way we write algorithms. I
recently found std::sample and thought it would be nice to have to have an
analogous sample view which samples from a population without repetition -
with repetition is so trivial that we can ignore it for now.
What do you think about a sample view that accepts a range and a random
generator and returns a range which can be iterated over to access the
elements of the input range in a random order evaluating a bijective
(perfect and minimal) hash function to generate random indices without
repetition? This range would could still be relatively lightweight, because
it would only store a state encoded in an integer and the randomized
bijective hash function, similar to a random number generator. Sadly this
could only work well for random access iterators, not for forward
iterators, there reservoir sampling is the non-lightweight solution.
To preempt one objection, yes a randomized bijective hash for n consecutive
integers can be generated in small constant time. Multiplication with a
coprime integer and addition modulo n is bijective and randomizes the
order, for example. There are better hashes however.
I'd be interested what you think and volunteer to write a sample
implementation if there is interest - I am using similar ideas in my own
code, but you know how it is, it's not pretty.
Best Regards,
Wolfgang Brehm
Received on 2019-11-07 11:13:45