C++ Logo

std-proposals

Advanced search

Re: [std-proposals] Relaxing precondition on std::set_intersection algorithm

From: Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
Date: Sun, 14 Sep 2025 13:38:01 -0400
On Sun, Sep 14, 2025 at 1:08 PM Peter Neiss via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> Ok. I have not thought about overlapping input ranges. Classical aliasing
> problem. Modifying the first range at a later position, through the second
> maybe makes the range unsorted. But when I try to construct an example, the
> sorting seems to make this impossible and it still works out correctly.
> I am not familiar with weak comparators and when they are allowed in the
> stl. Is sorting by a weak comparator enough for a range to be sorted ?
> Please give an example that breaks with overlapping input ranges. If this
> is a problem inplace_set_intersection is a good idea, otherwise i would
> prefer changing the precondition on std::set_intersection.
>

Note that it is traditional in the STL to have separate names for the "A +
B => C" and "A + B => A" versions of algorithms.
For example, `std::copy_if(first, last, out, Odd)` when `first == out`
becomes `std::remove_if(first, last, Even)`.
For example, `std::unique_copy(first, last, out)` when `first == out`
becomes `std::unique(first, last)`.

In your case, `std::set_intersection(first, last, first2, last2, out)` when
`first == out` might become something like
`std::remove_if_not_included(first, last, first2, last2)`. (Compare
`std::includes`, `std::find_if_not`.)
Of course it would be nice if we could have had
    std::set_intersect(first, last, first2, last2);
    std::set_intersect_copy(first, last, first2, last2, out); // exactly
today's std::set_intersection
like we do for many other STL algorithms (rotate/rotate_copy,
partition/partition_copy, partial_sort/partial_sort_copy, etc.), but sadly
that ship has sailed.

If the input ranges overlap, the output can generally only coincide with
>> the first range (the source for the copy). Otherwise you can get wrong
>> results with a weak comparator.
>
>
I don't think this is true. The output range could be the same as the first
input range, or it could be the same as the second input range, or it could
be unrelated, or it could start before either input range.
The output range obviously can't point into the *middle* of either input
range, but nobody was claiming that.

I wonder if the term "overlap with" is ever actually defined in the spec,
or if there's an LWG issue about that. I think as the iterator type becomes
more complicated than just a simple pointer, and the element types become
more resource-managey and/or view-ish, "overlap" gets harder to define.
What we mean, of course(tm), is that while the algorithm is running the
values of all elements in the two input ranges must be stable and not
change for any reason. And how we want to loosen that restriction is to say
something like, the values of all *not-yet-read* elements in the two input
ranges must be stable and not change (along with some explanation of when
*does* the algorithm read each element).

my $.02,
–Arthur

Received on 2025-09-14 17:38:19