Date: Wed, 12 Feb 2025 14:50:32 +0000
On Wed, 12 Feb 2025 at 14:36, Александр Поваляев <apovalyaev_at_[hidden]>
wrote:
> [[[[[My comments below]]]]]
>
> ср, 12 февр. 2025 г. в 15:46, Jonathan Wakely <cxx_at_[hidden]>:
>
>>
>>
>> On Wed, 12 Feb 2025 at 12:12, Александр Поваляев <apovalyaev_at_[hidden]>
>> wrote:
>>
>>> Well, to make things more clear.
>>> The process of inserting a new element with the same key value might
>>> lead to a different result (different position of a newly inserted element).
>>> And the result depends upon whether a hint is being used within 'insert'
>>> method of std::multiset or not.
>>> But once the new element has been inserted, the order is preserved
>>> starting from that moment.
>>>
>>> Let's call an 'oldest' element that one which is closest to the
>>> beginning among all the elements within the same key value.
>>> To the moment when the 'reduce' method is called.
>>>
>>> 'reduce' method is not supposed to use 'equal_range' as we have already
>>> discovered. Since it would double the number of compare operations required.
>>>
>>
>> I don't agree.
>>
>> For example, std::multiset count is 2^20, 'equal_range' will require 2*20
>>> compare operations.
>>>
>>
>> For 2^20 equivalent elements, I get 57 comparisons for equal_range. If
>> you want to remove 1% of those 2^20 equivalent elements, your preferred
>> approach would require 10k comparisons. But using equal_range would require
>> 57 comparisons.
>>
>> https://godbolt.org/z/E1E5T6daM
>>
>> 57 is not double 10k.
>>
> Here we are talking about possible implementations.
> Let me explain to you, how do I count comparison operations needed:
> * Given a std::multiset of 2^20 elements in total (some of them do have
> equal keys and some of them don't), 'count' will return 2^20
> * Let it be a pair of two elements with the same key K (A and B), so both
> A and B has the same key K and no any other element amongst 2^20 has key K
> * The order in which these elements are placed inside std::multiset is A
> then B, so A element is "oldest" and is placed closed to the beginning
> * So, we're going to delete only one element of these two
> * We call 'reduce' method of std::multiset container with the following
> parameters reduce(K, 1)
> * Here we comes to the place where we can compare two possible
> implementations you proposed earlier:
> Implementation 1: 'equal_range' is used and so about 20 * 2
> comparisons are needed. I get 57 comparisons because a red-black tree is
> used, not a pure binary.
> Implementation 2: 'find' method is being used and so, it is required
> about 20 comparisons (or so, may be 28)
> * Implementation 2 requires twice as much comparisons as Implementation 1.
> It is bad, because if a class of elements A and B defines custom
> comparator, it will work significantly slower.
> We're talking about STL library implementation (it is a gold standard
> for industry), so we can't just afford just unnecessary doubling of any
> operations;
>
> * Don't take about O(lnN) or O(N), cause our N is not going to infinity
> :D, but we know for sure that Implementation 1 works at least twice slower
> than Implementation 2
> * I think we will need a third implementation, cause there is a lot of
> cases which needed to be considered
>
So make it adaptive:
https://godbolt.org/z/1jnE4xEEa
I don't consider that necessary. The logarithmic one using equal_range wins
in general, because optimizing for small sizes isn't usually needed (lookup
in them is fast enough anyway).
This still doesn't need to be a new member function of std::multiset though.
wrote:
> [[[[[My comments below]]]]]
>
> ср, 12 февр. 2025 г. в 15:46, Jonathan Wakely <cxx_at_[hidden]>:
>
>>
>>
>> On Wed, 12 Feb 2025 at 12:12, Александр Поваляев <apovalyaev_at_[hidden]>
>> wrote:
>>
>>> Well, to make things more clear.
>>> The process of inserting a new element with the same key value might
>>> lead to a different result (different position of a newly inserted element).
>>> And the result depends upon whether a hint is being used within 'insert'
>>> method of std::multiset or not.
>>> But once the new element has been inserted, the order is preserved
>>> starting from that moment.
>>>
>>> Let's call an 'oldest' element that one which is closest to the
>>> beginning among all the elements within the same key value.
>>> To the moment when the 'reduce' method is called.
>>>
>>> 'reduce' method is not supposed to use 'equal_range' as we have already
>>> discovered. Since it would double the number of compare operations required.
>>>
>>
>> I don't agree.
>>
>> For example, std::multiset count is 2^20, 'equal_range' will require 2*20
>>> compare operations.
>>>
>>
>> For 2^20 equivalent elements, I get 57 comparisons for equal_range. If
>> you want to remove 1% of those 2^20 equivalent elements, your preferred
>> approach would require 10k comparisons. But using equal_range would require
>> 57 comparisons.
>>
>> https://godbolt.org/z/E1E5T6daM
>>
>> 57 is not double 10k.
>>
> Here we are talking about possible implementations.
> Let me explain to you, how do I count comparison operations needed:
> * Given a std::multiset of 2^20 elements in total (some of them do have
> equal keys and some of them don't), 'count' will return 2^20
> * Let it be a pair of two elements with the same key K (A and B), so both
> A and B has the same key K and no any other element amongst 2^20 has key K
> * The order in which these elements are placed inside std::multiset is A
> then B, so A element is "oldest" and is placed closed to the beginning
> * So, we're going to delete only one element of these two
> * We call 'reduce' method of std::multiset container with the following
> parameters reduce(K, 1)
> * Here we comes to the place where we can compare two possible
> implementations you proposed earlier:
> Implementation 1: 'equal_range' is used and so about 20 * 2
> comparisons are needed. I get 57 comparisons because a red-black tree is
> used, not a pure binary.
> Implementation 2: 'find' method is being used and so, it is required
> about 20 comparisons (or so, may be 28)
> * Implementation 2 requires twice as much comparisons as Implementation 1.
> It is bad, because if a class of elements A and B defines custom
> comparator, it will work significantly slower.
> We're talking about STL library implementation (it is a gold standard
> for industry), so we can't just afford just unnecessary doubling of any
> operations;
>
> * Don't take about O(lnN) or O(N), cause our N is not going to infinity
> :D, but we know for sure that Implementation 1 works at least twice slower
> than Implementation 2
> * I think we will need a third implementation, cause there is a lot of
> cases which needed to be considered
>
So make it adaptive:
https://godbolt.org/z/1jnE4xEEa
I don't consider that necessary. The logarithmic one using equal_range wins
in general, because optimizing for small sizes isn't usually needed (lookup
in them is fast enough anyway).
This still doesn't need to be a new member function of std::multiset though.
Received on 2025-02-12 14:50:49