Date: Thu, 13 Jun 2024 18:12:58 +0300

I was thinking about exactly the same thing long ago (a-la “they did great

job when designing the std library, how could it be that this thing is

missing?”). Certainly I needed that more than once for production code (of

course, I don't remember the details now). Happily, since c++14 (and in

boost), we can workaround that via transparent comparators. Still, it would

be certainly an improvement (closing the gap) if someone writes the paper

and it gets accepted. My intuition don't even believes this needs an

explicit motivation (binary tree index for a composite key [kinda RDBMS

terminology here, but the idea is true everywhere] should give a

possibility to search not only by the full key but also by the subkeys a-la

first-few-attributes-of-the-full-key – by itself, i.e. without building

additional indices [creating extra maps]), though of course my mind

understands that the motivation is always needed.

On Thu, Jun 13, 2024, 00:38 Lénárd Szolnoki via Std-Proposals <

std-proposals_at_[hidden]> wrote:

> Hi,

>

> On 11/06/2024 20:01, Aeren via Std-Proposals wrote:

> > Hi, this is my first proposal to C++ STL and I'm looking forward to

> > getting some feedback :)

> >

> > std::lower_bound / std::upper_bound lets you binary search on a range

> > defined by a pair of legacy random access iterators, and we have a BBST

> > counterpart for this: std::(set / map / multiset /

> > multimap)::(lower_bound / upper_bound)

> >

> > I would like to propose to add the BBST counterpart of

> > std::partition_point, which generalizes both std::lower_bound and

> > std::upper_bound.

> >

> > This addition would let us binary search on dynamically changing points

> > with an arbitrary binary predicate on which the points are "partitioned".

> >

> > Currently, there's a workaround for this using the signature (3) and (4)

> > on cppreference lower_bound doc

> > <https://en.cppreference.com/w/cpp/container/set/lower_bound>. However,

> > it is very ugly to do this because it forces K and x to be meaningless

> > and forces me to define a weird comparator with K.

>

> One motivating example for this is a map that stores ip routing

> information, so the keys are subnets like 192.168.0.0/24, 10.0.0.0/8,

> etc... .

>

> Two subnet1 compares less than subnet2 if:

> * The subnets are disjoint and the ip of subnet1 compares less than the

> ip of subnet2 (for example 10.0.0.0/8 < 192.168.0.0/24), otherwise

> * subnet1 is a proper subset of subnet2 (for example 10.1.0.0/16 <

> 10.0.0.8)

>

> Example:

>

> 10.1.0.0/16 < 10.0.0.0/8 < 192.168.0.0/24

>

> Note that subnets can't overlap partially. In this sorted range of

> subnets the last ip address in each subnet is also a sorted range. For

> the example above:

>

> 10.1.255.255 < 10.255.255.255 < 192.168.255.255

>

> We want to look up the most specific subnet in the table for a given IP

> address. We can do that by finding the partition point in range for the

> predicate [ip](const auto& subnet){ return subnet.last_ip() < ip; }.

>

> Then if the subnet at the partition point (if there is any) contains the

> ip, then that's the most specific subnet corresponding to the ip address.

>

> Having said that, this can be expressed with a transparent key

> comparison, only a little bit more awkward. The only precondition I

> found for lower_bound with a transparent key comparison is that for the

> specified value it needs to partition the range, which is also the exact

> precondition of partition_point. Just don't use std::ranges::less for

> Compare, which imposes additional constraints and semantic requirements.

>

> On the other hand, a member partition_point's precondition would be

> exactly the same as its algorithm counterpart, so I don't think that it

> would cause too much confusion there. The same subnets example can be

> easily expressed with a sorted range (or flat_map) and partition_point

> (on the underlying container for flat_map), yet it needs some gymnastics

> to express the same with map, which can be especially awkward when

> refactoring to change the container type.

>

> This is possibly not enough motivation, the real cost of adding it is

> spending committee time and burden on implementers. I don't see other

> downsides though.

>

> Maybe the best course of action would be to propose this to a 3rd party

> library first, like Abseil.

>

> Cheers,

> Lénárd

> --

> Std-Proposals mailing list

> Std-Proposals_at_[hidden]

> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

>

job when designing the std library, how could it be that this thing is

missing?”). Certainly I needed that more than once for production code (of

course, I don't remember the details now). Happily, since c++14 (and in

boost), we can workaround that via transparent comparators. Still, it would

be certainly an improvement (closing the gap) if someone writes the paper

and it gets accepted. My intuition don't even believes this needs an

explicit motivation (binary tree index for a composite key [kinda RDBMS

terminology here, but the idea is true everywhere] should give a

possibility to search not only by the full key but also by the subkeys a-la

first-few-attributes-of-the-full-key – by itself, i.e. without building

additional indices [creating extra maps]), though of course my mind

understands that the motivation is always needed.

On Thu, Jun 13, 2024, 00:38 Lénárd Szolnoki via Std-Proposals <

std-proposals_at_[hidden]> wrote:

> Hi,

>

> On 11/06/2024 20:01, Aeren via Std-Proposals wrote:

> > Hi, this is my first proposal to C++ STL and I'm looking forward to

> > getting some feedback :)

> >

> > std::lower_bound / std::upper_bound lets you binary search on a range

> > defined by a pair of legacy random access iterators, and we have a BBST

> > counterpart for this: std::(set / map / multiset /

> > multimap)::(lower_bound / upper_bound)

> >

> > I would like to propose to add the BBST counterpart of

> > std::partition_point, which generalizes both std::lower_bound and

> > std::upper_bound.

> >

> > This addition would let us binary search on dynamically changing points

> > with an arbitrary binary predicate on which the points are "partitioned".

> >

> > Currently, there's a workaround for this using the signature (3) and (4)

> > on cppreference lower_bound doc

> > <https://en.cppreference.com/w/cpp/container/set/lower_bound>. However,

> > it is very ugly to do this because it forces K and x to be meaningless

> > and forces me to define a weird comparator with K.

>

> One motivating example for this is a map that stores ip routing

> information, so the keys are subnets like 192.168.0.0/24, 10.0.0.0/8,

> etc... .

>

> Two subnet1 compares less than subnet2 if:

> * The subnets are disjoint and the ip of subnet1 compares less than the

> ip of subnet2 (for example 10.0.0.0/8 < 192.168.0.0/24), otherwise

> * subnet1 is a proper subset of subnet2 (for example 10.1.0.0/16 <

> 10.0.0.8)

>

> Example:

>

> 10.1.0.0/16 < 10.0.0.0/8 < 192.168.0.0/24

>

> Note that subnets can't overlap partially. In this sorted range of

> subnets the last ip address in each subnet is also a sorted range. For

> the example above:

>

> 10.1.255.255 < 10.255.255.255 < 192.168.255.255

>

> We want to look up the most specific subnet in the table for a given IP

> address. We can do that by finding the partition point in range for the

> predicate [ip](const auto& subnet){ return subnet.last_ip() < ip; }.

>

> Then if the subnet at the partition point (if there is any) contains the

> ip, then that's the most specific subnet corresponding to the ip address.

>

> Having said that, this can be expressed with a transparent key

> comparison, only a little bit more awkward. The only precondition I

> found for lower_bound with a transparent key comparison is that for the

> specified value it needs to partition the range, which is also the exact

> precondition of partition_point. Just don't use std::ranges::less for

> Compare, which imposes additional constraints and semantic requirements.

>

> On the other hand, a member partition_point's precondition would be

> exactly the same as its algorithm counterpart, so I don't think that it

> would cause too much confusion there. The same subnets example can be

> easily expressed with a sorted range (or flat_map) and partition_point

> (on the underlying container for flat_map), yet it needs some gymnastics

> to express the same with map, which can be especially awkward when

> refactoring to change the container type.

>

> This is possibly not enough motivation, the real cost of adding it is

> spending committee time and burden on implementers. I don't see other

> downsides though.

>

> Maybe the best course of action would be to propose this to a 3rd party

> library first, like Abseil.

>

> Cheers,

> Lénárd

> --

> Std-Proposals mailing list

> Std-Proposals_at_[hidden]

> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

>

Received on 2024-06-13 15:13:13