Date: Wed, 12 Jun 2024 22:38:28 +0100
Hi,
On 11/06/2024 20:01, Aeren via Std-Proposals wrote:
> Hi, this is my first proposal to C++ STL and I'm looking forward to
> getting some feedback :)
>
> std::lower_bound / std::upper_bound lets you binary search on a range
> defined by a pair of legacy random access iterators, and we have a BBST
> counterpart for this: std::(set / map / multiset /
> multimap)::(lower_bound / upper_bound)
>
> I would like to propose to add the BBST counterpart of
> std::partition_point, which generalizes both std::lower_bound and
> std::upper_bound.
>
> This addition would let us binary search on dynamically changing points
> with an arbitrary binary predicate on which the points are "partitioned".
>
> Currently, there's a workaround for this using the signature (3) and (4)
> on cppreference lower_bound doc
> <https://en.cppreference.com/w/cpp/container/set/lower_bound>. However,
> it is very ugly to do this because it forces K and x to be meaningless
> and forces me to define a weird comparator with K.
One motivating example for this is a map that stores ip routing
information, so the keys are subnets like 192.168.0.0/24, 10.0.0.0/8,
etc... .
Two subnet1 compares less than subnet2 if:
* The subnets are disjoint and the ip of subnet1 compares less than the
ip of subnet2 (for example 10.0.0.0/8 < 192.168.0.0/24), otherwise
* subnet1 is a proper subset of subnet2 (for example 10.1.0.0/16 < 10.0.0.8)
Example:
10.1.0.0/16 < 10.0.0.0/8 < 192.168.0.0/24
Note that subnets can't overlap partially. In this sorted range of
subnets the last ip address in each subnet is also a sorted range. For
the example above:
10.1.255.255 < 10.255.255.255 < 192.168.255.255
We want to look up the most specific subnet in the table for a given IP
address. We can do that by finding the partition point in range for the
predicate [ip](const auto& subnet){ return subnet.last_ip() < ip; }.
Then if the subnet at the partition point (if there is any) contains the
ip, then that's the most specific subnet corresponding to the ip address.
Having said that, this can be expressed with a transparent key
comparison, only a little bit more awkward. The only precondition I
found for lower_bound with a transparent key comparison is that for the
specified value it needs to partition the range, which is also the exact
precondition of partition_point. Just don't use std::ranges::less for
Compare, which imposes additional constraints and semantic requirements.
On the other hand, a member partition_point's precondition would be
exactly the same as its algorithm counterpart, so I don't think that it
would cause too much confusion there. The same subnets example can be
easily expressed with a sorted range (or flat_map) and partition_point
(on the underlying container for flat_map), yet it needs some gymnastics
to express the same with map, which can be especially awkward when
refactoring to change the container type.
This is possibly not enough motivation, the real cost of adding it is
spending committee time and burden on implementers. I don't see other
downsides though.
Maybe the best course of action would be to propose this to a 3rd party
library first, like Abseil.
Cheers,
Lénárd
On 11/06/2024 20:01, Aeren via Std-Proposals wrote:
> Hi, this is my first proposal to C++ STL and I'm looking forward to
> getting some feedback :)
>
> std::lower_bound / std::upper_bound lets you binary search on a range
> defined by a pair of legacy random access iterators, and we have a BBST
> counterpart for this: std::(set / map / multiset /
> multimap)::(lower_bound / upper_bound)
>
> I would like to propose to add the BBST counterpart of
> std::partition_point, which generalizes both std::lower_bound and
> std::upper_bound.
>
> This addition would let us binary search on dynamically changing points
> with an arbitrary binary predicate on which the points are "partitioned".
>
> Currently, there's a workaround for this using the signature (3) and (4)
> on cppreference lower_bound doc
> <https://en.cppreference.com/w/cpp/container/set/lower_bound>. However,
> it is very ugly to do this because it forces K and x to be meaningless
> and forces me to define a weird comparator with K.
One motivating example for this is a map that stores ip routing
information, so the keys are subnets like 192.168.0.0/24, 10.0.0.0/8,
etc... .
Two subnet1 compares less than subnet2 if:
* The subnets are disjoint and the ip of subnet1 compares less than the
ip of subnet2 (for example 10.0.0.0/8 < 192.168.0.0/24), otherwise
* subnet1 is a proper subset of subnet2 (for example 10.1.0.0/16 < 10.0.0.8)
Example:
10.1.0.0/16 < 10.0.0.0/8 < 192.168.0.0/24
Note that subnets can't overlap partially. In this sorted range of
subnets the last ip address in each subnet is also a sorted range. For
the example above:
10.1.255.255 < 10.255.255.255 < 192.168.255.255
We want to look up the most specific subnet in the table for a given IP
address. We can do that by finding the partition point in range for the
predicate [ip](const auto& subnet){ return subnet.last_ip() < ip; }.
Then if the subnet at the partition point (if there is any) contains the
ip, then that's the most specific subnet corresponding to the ip address.
Having said that, this can be expressed with a transparent key
comparison, only a little bit more awkward. The only precondition I
found for lower_bound with a transparent key comparison is that for the
specified value it needs to partition the range, which is also the exact
precondition of partition_point. Just don't use std::ranges::less for
Compare, which imposes additional constraints and semantic requirements.
On the other hand, a member partition_point's precondition would be
exactly the same as its algorithm counterpart, so I don't think that it
would cause too much confusion there. The same subnets example can be
easily expressed with a sorted range (or flat_map) and partition_point
(on the underlying container for flat_map), yet it needs some gymnastics
to express the same with map, which can be especially awkward when
refactoring to change the container type.
This is possibly not enough motivation, the real cost of adding it is
spending committee time and burden on implementers. I don't see other
downsides though.
Maybe the best course of action would be to propose this to a 3rd party
library first, like Abseil.
Cheers,
Lénárd
Received on 2024-06-12 21:38:32