Date: Sun, 6 Sep 2020 09:13:24 -0400

On Sun, Sep 6, 2020 at 2:55 AM Andreas Ringlstetter via Std-Proposals <

std-proposals_at_[hidden]> wrote:

> Follow up on

> https://stackoverflow.com/questions/62306580/balanced-partition-of-iterator-range-without-legacyrandomaccessiterator

>

Interesting problem and relevant motivation!

On SO, darune writes:

> All associate containers are already inherently sorted and already

provides specialized accessors and api's.

But this isn't really true for your use-case, is it? I mean, suppose I

have a std::set. It is a balanced binary tree, but (unless I'm missing

something) there is no specialized accessor that would give me an iterator

to the "midpoint/root" element in anything less than O(n) time. And a very

slow and pointer-chasey O(n) at that!

So (unless I'm missing something) darune's dismissal is a bit premature.

However, I also don't see any clean way to implement what you're asking

for. Consider what the compiler sees for

std::set<int> s = {1,2,3,4,5,6,7,8,9,10,11};

auto first = s.begin(), last = s.end();

auto it1 = std2::divide_in_half(first, last);

assert(*it1 == 6);

std::advance(first, 2); std::advance(last, -4);

auto it2 = std2::divide_in_half(first, last);

assert(*it2 == 5);

The iterators `first` and `last` don't know anything about the container

that they're part of. They don't have any way to get at the "root" node of

that tree, except to traverse over to it. (It is true AFAIK that an

iterator would be able to tell *whether* it were pointing directly at the

root node, but that doesn't help us here AFAIK.)

And the type-system doesn't see any difference between the first call to

`std2::divide_in_half` there and the second call. The first one we'd like

to quickly return the root node; the second one we do *not *want to return

the root node, and must do the slow O(n) traversal.

NOTE: For the purposes of this thread, I'd like to completely handwave away

the practical difference between "the root node of the red-black tree" and

"the n/2th node in sorted order." I'd like to just assume that we can solve

any potential off-by-1 or off-by-2 errors that might arise, and not go down

that nerdsnipe rabbit hole at all.

[...]

> If they provide only the `LegacyForwardIterator` trait, only thing we

> can do in constant time is:

>

> std::set<int> foo;

> ... // fill it

> auto begin = foo.begin();

> auto end = foo.end();

> auto middle = (begin != end) ? ++(begin) : begin;

>

> No chance of processing the iterator range in an even remotely

> balanced way without a linear scan.

>

FWIW, `std::for_each` is allowed to take O(n) time to do its job. If it

spends 1/Kth of that time doing an O(n) traversal of the list to find its

actual midpoint, then that's fine, AFAIK. There's nothing in the formal

standard AFAIK that says it *must* use a dumb O(1) algorithm to partition

the work. If you're seeing that dumb algorithm being used in

implementations today, I think that's likely for software-engineering

reasons, not standard-conformance reasons.

[...]

> For reference, the following is about on-par with what the existing

> STL implementations do:

>

> // Generic implementation

> template<typename Iter, ~~~>

> std::pair<std::pair<Iter, Iter>, std::pair<Iter, Iter>>

> divide(const Iter &begin, const Iter &end)

> {

> if constexpr (std::is_convertible_v<typename

> std::iterator_traits<Iter>::iterator_category,

> std::random_access_iterator_tag>)

> {

> // Ideal bisection

> const Iter middle = end - std::distance(begin, end) / 2;

> return { {begin, middle}, {begin, middle} };

>

Here and below, that should be {{begin, middle}, {middle, end}}, right?

Finally, I think you have at least two different problems here:

(1) Bisect (trisect,...) a range into contiguous subranges of the same

iterator-type and *exactly* the right sizes. I think range-v3 would be

capable of this task; something with std::ranges::size and then repeatedly

views::slice. Someone better at Ranges than me might try it.

(2) Given a range, split it "quick and dirty" into subranges of possibly

different iterator-type, and possibly not exactly the right sizes, but

close enough to get speedup on parallel tasks.

You mention that unordered_set would probably be very good at #2 — but

horrible at #1.

Another kind of data structure that might be good at #2 would be

skiplists, whose iterators would know how to take "chunks" easily.

[...]

> A heuristic alternative to "std::distance". The exact form we already

> have does not have an efficient implementation for anything except

> dense, compact allocations where it can be reduced to pointer

> arithmetic.

>

Pedantry: No, std::distance uses iterator arithmetic, so it works

efficiently on any random-access iterator type. std::distance of std::deque

iterators is constant-time, for example, even though deque is not

contiguous.

C++20 adds "contiguous iterator" as a refinement of "random-access

iterator."

–Arthur

std-proposals_at_[hidden]> wrote:

> Follow up on

> https://stackoverflow.com/questions/62306580/balanced-partition-of-iterator-range-without-legacyrandomaccessiterator

>

Interesting problem and relevant motivation!

On SO, darune writes:

> All associate containers are already inherently sorted and already

provides specialized accessors and api's.

But this isn't really true for your use-case, is it? I mean, suppose I

have a std::set. It is a balanced binary tree, but (unless I'm missing

something) there is no specialized accessor that would give me an iterator

to the "midpoint/root" element in anything less than O(n) time. And a very

slow and pointer-chasey O(n) at that!

So (unless I'm missing something) darune's dismissal is a bit premature.

However, I also don't see any clean way to implement what you're asking

for. Consider what the compiler sees for

std::set<int> s = {1,2,3,4,5,6,7,8,9,10,11};

auto first = s.begin(), last = s.end();

auto it1 = std2::divide_in_half(first, last);

assert(*it1 == 6);

std::advance(first, 2); std::advance(last, -4);

auto it2 = std2::divide_in_half(first, last);

assert(*it2 == 5);

The iterators `first` and `last` don't know anything about the container

that they're part of. They don't have any way to get at the "root" node of

that tree, except to traverse over to it. (It is true AFAIK that an

iterator would be able to tell *whether* it were pointing directly at the

root node, but that doesn't help us here AFAIK.)

And the type-system doesn't see any difference between the first call to

`std2::divide_in_half` there and the second call. The first one we'd like

to quickly return the root node; the second one we do *not *want to return

the root node, and must do the slow O(n) traversal.

NOTE: For the purposes of this thread, I'd like to completely handwave away

the practical difference between "the root node of the red-black tree" and

"the n/2th node in sorted order." I'd like to just assume that we can solve

any potential off-by-1 or off-by-2 errors that might arise, and not go down

that nerdsnipe rabbit hole at all.

[...]

> If they provide only the `LegacyForwardIterator` trait, only thing we

> can do in constant time is:

>

> std::set<int> foo;

> ... // fill it

> auto begin = foo.begin();

> auto end = foo.end();

> auto middle = (begin != end) ? ++(begin) : begin;

>

> No chance of processing the iterator range in an even remotely

> balanced way without a linear scan.

>

FWIW, `std::for_each` is allowed to take O(n) time to do its job. If it

spends 1/Kth of that time doing an O(n) traversal of the list to find its

actual midpoint, then that's fine, AFAIK. There's nothing in the formal

standard AFAIK that says it *must* use a dumb O(1) algorithm to partition

the work. If you're seeing that dumb algorithm being used in

implementations today, I think that's likely for software-engineering

reasons, not standard-conformance reasons.

[...]

> For reference, the following is about on-par with what the existing

> STL implementations do:

>

> // Generic implementation

> template<typename Iter, ~~~>

> std::pair<std::pair<Iter, Iter>, std::pair<Iter, Iter>>

> divide(const Iter &begin, const Iter &end)

> {

> if constexpr (std::is_convertible_v<typename

> std::iterator_traits<Iter>::iterator_category,

> std::random_access_iterator_tag>)

> {

> // Ideal bisection

> const Iter middle = end - std::distance(begin, end) / 2;

> return { {begin, middle}, {begin, middle} };

>

Here and below, that should be {{begin, middle}, {middle, end}}, right?

Finally, I think you have at least two different problems here:

(1) Bisect (trisect,...) a range into contiguous subranges of the same

iterator-type and *exactly* the right sizes. I think range-v3 would be

capable of this task; something with std::ranges::size and then repeatedly

views::slice. Someone better at Ranges than me might try it.

(2) Given a range, split it "quick and dirty" into subranges of possibly

different iterator-type, and possibly not exactly the right sizes, but

close enough to get speedup on parallel tasks.

You mention that unordered_set would probably be very good at #2 — but

horrible at #1.

Another kind of data structure that might be good at #2 would be

skiplists, whose iterators would know how to take "chunks" easily.

[...]

> A heuristic alternative to "std::distance". The exact form we already

> have does not have an efficient implementation for anything except

> dense, compact allocations where it can be reduced to pointer

> arithmetic.

>

Pedantry: No, std::distance uses iterator arithmetic, so it works

efficiently on any random-access iterator type. std::distance of std::deque

iterators is constant-time, for example, even though deque is not

contiguous.

C++20 adds "contiguous iterator" as a refinement of "random-access

iterator."

–Arthur

Received on 2020-09-06 08:17:06