On Mon, Jan 22, 2024 at 11:03 AM Phil Endecott via Std-Proposals <std-proposals@lists.isocpp.org> wrote:
Iúri Chaer wrote:

>> But in the case of std::list since C++11 we have constant-time
>> size(), which std::lower_bound() doesn't exploit.
>
> Well, std::lower_bound() takes a pair of iterators, so we can't really rely
> on size().

Here's an idea, how about a std::lower_bound overload that takes
an iterator and a size?

That is,
  std::lower_bound_n(first, n, value)
  std::lower_bound_n(first, n, value, comp)
That's a very interesting idea!
But that's solving the reverse of Iuri's problem: that overload would be applicable when you do know the length of the range, whereas one-sided binary search is useful when you don't know the length of the range.

In fact, one-sided binary search is usable when you have an infinite range — e.g. you could write
    std::ranges::one_sided_partition_point(std::views::iota(0), some_predicate);
Is that enough of a killer app that it ought to be its own named algorithm?  Certainly it ought to be a data point discussed in the paper.


Ugh, I feel a bit queasy with the idea of doing a binary search through a
std::set without using its (private) left/right/parent pointers to do so.
Slightly more queasy than finding the size of a std::list without using its
size() member. Here's an idea, add a std::set::lower_bound overload that
takes an iterator pair (or a single begin iterator); maybe call it a hint,
like std::set::insert?

Off the top of my head, I think that might not be efficiently implementable. IIRC, the library's red-black trees don't have parent pointers; they have next/prev pointers (for iterator traversal). So a tree operation like "least common ancestor" isn't efficiently implementable, which I would expect means that `set.lower_bound(first, last, value)` isn't efficiently implementable either. But it would be interesting to be proved wrong!

Adding a new algorithm(s) should definitely be discussed in the paper. But I think it drifts from Iúri's original point, which is that `std::lower_bound` is unaccountably slow over linked lists when it really doesn't have to be. Library implementors could give it a 2x runtime speedup, with no API break, just by switching to one-sided binary search. We've done similar things in the past; e.g. switching `std::pop_heap` from the naïve algorithm to Floyd's algorithm: 
https://reviews.llvm.org/D118003
We were able to do that only because the Complexity clause got out of our way and let us do our job. If it had been specified as "O(log N) calls to `swap`", we wouldn't have been able to get that speedup.

Iúri has submitted a patch to libc++ which allegedly speeds up `std::set_intersection` by using one-sided binary search:
https://github.com/llvm/llvm-project/pull/75230
(Although I personally am skeptical of this patch. It seems to me that the old code did `n` increments and `n` comparisons, and the new code does between-`n`-and-`2n` increments and between-`log n`-and-`n` comparisons. That's not a good tradeoff unless comparison is vastly more expensive than increment; but by assumption, increment is pointer-chasing and thus expensive. I think it might be useful only for Splunk's very specific use-case where one set is gigantic and the other is tiny (100% x 1% = 1%), and it's certainly harmless when the sets are almost equal (90% x 90% = 81%), but it seems like it might be harmful in the "C++ 101" case where the sets are random (50% x 50% = 25%).)

Iúri claims — rightly AFAICT — that library implementors could similarly speed up `std::lower_bound` (which already does binary search, so I'm much less skeptical of that patch) if only it weren't for that pesky Complexity clause.

We shouldn't let the tail wag the dog: we should value actual runtime performance over adherence to formal Complexity clauses.

Anyway, I agree that Iúri should write a paper, and have offered my help if desired. :)

my $.02,
–Arthur