On Sun, Sep 6, 2020 at 2:55 AM Andreas Ringlstetter via Std-Proposals <std-proposals@lists.isocpp.org> wrote:

Interesting problem and relevant motivation!

On SO, darune writes:
> All associate containers are already inherently sorted and already provides specialized accessors and api's.

But this isn't really true for your use-case, is it?  I mean, suppose I have a std::set. It is a balanced binary tree, but (unless I'm missing something) there is no specialized accessor that would give me an iterator to the "midpoint/root" element in anything less than O(n) time. And a very slow and pointer-chasey O(n) at that!
So (unless I'm missing something) darune's dismissal is a bit premature.

However, I also don't see any clean way to implement what you're asking for. Consider what the compiler sees for

    std::set<int> s = {1,2,3,4,5,6,7,8,9,10,11};
    auto first = s.begin(), last = s.end();
    auto it1 = std2::divide_in_half(first, last);
    assert(*it1 == 6);
    std::advance(first, 2); std::advance(last, -4);
    auto it2 = std2::divide_in_half(first, last);
    assert(*it2 == 5);

The iterators `first` and `last` don't know anything about the container that they're part of. They don't have any way to get at the "root" node of that tree, except to traverse over to it.  (It is true AFAIK that an iterator would be able to tell whether it were pointing directly at the root node, but that doesn't help us here AFAIK.)
And the type-system doesn't see any difference between the first call to `std2::divide_in_half` there and the second call. The first one we'd like to quickly return the root node; the second one we do not want to return the root node, and must do the slow O(n) traversal.

NOTE: For the purposes of this thread, I'd like to completely handwave away the practical difference between "the root node of the red-black tree" and "the n/2th node in sorted order." I'd like to just assume that we can solve any potential off-by-1 or off-by-2 errors that might arise, and not go down that nerdsnipe rabbit hole at all.

[...]
If they provide only the `LegacyForwardIterator` trait, only thing we
can do in constant time is:

    std::set<int> foo;
    ... // fill it
    auto begin = foo.begin();
    auto end = foo.end();
    auto middle = (begin != end) ? ++(begin) : begin;

No chance of processing the iterator range in an even remotely
balanced way without a linear scan.

FWIW, `std::for_each` is allowed to take O(n) time to do its job. If it spends 1/Kth of that time doing an O(n) traversal of the list to find its actual midpoint, then that's fine, AFAIK.  There's nothing in the formal standard AFAIK that says it must use a dumb O(1) algorithm to partition the work. If you're seeing that dumb algorithm being used in implementations today, I think that's likely for software-engineering reasons, not standard-conformance reasons.

[...]
For reference, the following is about on-par with what the existing
STL implementations do:

    // Generic implementation
    template<typename Iter, ~~~>
    std::pair<std::pair<Iter, Iter>, std::pair<Iter, Iter>>
    divide(const Iter &begin, const Iter &end)
    {
        if constexpr (std::is_convertible_v<typename
std::iterator_traits<Iter>::iterator_category,
std::random_access_iterator_tag>)
        {
            // Ideal bisection
            const Iter middle = end - std::distance(begin, end) / 2;
            return { {begin, middle}, {begin, middle} };

Here and below, that should be {{begin, middle}, {middle, end}}, right?

Finally, I think you have at least two different problems here:
(1) Bisect (trisect,...) a range into contiguous subranges of the same iterator-type and exactly the right sizes. I think range-v3 would be capable of this task; something with std::ranges::size and then repeatedly views::slice. Someone better at Ranges than me might try it.
(2) Given a range, split it "quick and dirty" into subranges of possibly different iterator-type, and possibly not exactly the right sizes, but close enough to get speedup on parallel tasks.

You mention that unordered_set would probably be very good at #2 — but horrible at #1.
Another kind of data structure that might be good at #2 would be skiplists, whose iterators would know how to take "chunks" easily.

[...]
A heuristic alternative to "std::distance". The exact form we already
have does not have an efficient implementation for anything except
dense, compact allocations where it can be reduced to pointer
arithmetic.

Pedantry: No, std::distance uses iterator arithmetic, so it works efficiently on any random-access iterator type. std::distance of std::deque iterators is constant-time, for example, even though deque is not contiguous.
C++20 adds "contiguous iterator" as a refinement of "random-access iterator."

–Arthur