C++ Logo

std-proposals

Advanced search

Re: [std-proposals] A Proposal to Add a New Multiset container Method to the Standard Library Technical Report

From: Александр Поваляев <apovalyaev_at_[hidden]>
Date: Wed, 12 Feb 2025 17:36:28 +0300
[[[[[My comments below]]]]]

ср, 12 февр. 2025 г. в 15:46, Jonathan Wakely <cxx_at_[hidden]>:

>
>
> On Wed, 12 Feb 2025 at 12:12, Александр Поваляев <apovalyaev_at_[hidden]>
> wrote:
>
>> Well, to make things more clear.
>> The process of inserting a new element with the same key value might lead
>> to a different result (different position of a newly inserted element).
>> And the result depends upon whether a hint is being used within 'insert'
>> method of std::multiset or not.
>> But once the new element has been inserted, the order is preserved
>> starting from that moment.
>>
>> Let's call an 'oldest' element that one which is closest to the beginning
>> among all the elements within the same key value.
>> To the moment when the 'reduce' method is called.
>>
>> 'reduce' method is not supposed to use 'equal_range' as we have already
>> discovered. Since it would double the number of compare operations required.
>>
>
> I don't agree.
>
> For example, std::multiset count is 2^20, 'equal_range' will require 2*20
>> compare operations.
>>
>
> For 2^20 equivalent elements, I get 57 comparisons for equal_range. If you
> want to remove 1% of those 2^20 equivalent elements, your preferred
> approach would require 10k comparisons. But using equal_range would require
> 57 comparisons.
>
> https://godbolt.org/z/E1E5T6daM
>
> 57 is not double 10k.
>
Here we are talking about possible implementations.
Let me explain to you, how do I count comparison operations needed:
* Given a std::multiset of 2^20 elements in total (some of them do have
equal keys and some of them don't), 'count' will return 2^20
* Let it be a pair of two elements with the same key K (A and B), so both A
and B has the same key K and no any other element amongst 2^20 has key K
* The order in which these elements are placed inside std::multiset is A
then B, so A element is "oldest" and is placed closed to the beginning
* So, we're going to delete only one element of these two
* We call 'reduce' method of std::multiset container with the following
parameters reduce(K, 1)
* Here we comes to the place where we can compare two possible
implementations you proposed earlier:
    Implementation 1: 'equal_range' is used and so about 20 * 2 comparisons
are needed. I get 57 comparisons because a red-black tree is used, not a
pure binary.
    Implementation 2: 'find' method is being used and so, it is required
about 20 comparisons (or so, may be 28)
* Implementation 2 requires twice as much comparisons as Implementation 1.
It is bad, because if a class of elements A and B defines custom
comparator, it will work significantly slower.
    We're talking about STL library implementation (it is a gold standard
for industry), so we can't just afford just unnecessary doubling of any
operations;

* Don't take about O(lnN) or O(N), cause our N is not going to infinity :D,
but we know for sure that Implementation 1 works at least twice slower than
Implementation 2
* I think we will need a third implementation, cause there is a lot of
cases which needed to be considered



>
>
>> So, we will use the implementation of 'reduce' method which is closer to
>> the second variant you proposed (when we find the first element over the
>> sequence of the element with the same key value) and then use iterator to
>> delete every next element one by one until we reach the end of
>> 'std::multiset' or the end of the sequence of the elements with the same
>> key value.
>> Then the 'reduce' method returns the number of elements which have been
>> deleted.
>>
>>
>>
>>
>>
>>
>> ср, 12 февр. 2025 г. в 12:53, Jonathan Wakely <cxx_at_[hidden]>:
>>
>>>
>>>
>>> On Wed, 12 Feb 2025 at 09:04, Александр Поваляев <apovalyaev_at_[hidden]>
>>> wrote:
>>>
>>>> Do you mean that "The order of the elements that compare equivalent is
>>>> the order of insertion and does not change." (
>>>> https://en.cppreference.com/w/cpp/container/multiset") statement is
>>>> wrong?
>>>>
>>>
>>> It's incomplete or ambiguous at best. It could be read to mean that the
>>> order is determined at the time of insertion and doesn't change after that,
>>> which would be correct. You are reading it to mean that the spatial order
>>> in the container is the same as the temporal order that they are inserted,
>>> which is wrong.
>>>
>>> cppreference is not the standard, and not every statement on every page
>>> is authoritative. You find a more correct statement at the link you gave
>>> earlier:
>>>
>>> https://en.cppreference.com/w/cpp/container/multiset/equal_range#Return_value
>>> "Since `emplace` and unhinted `insert` always insert at the upper bound,
>>> the order of equivalent elements in the equal range is the order of
>>> insertion unless hinted `insert` or `emplace_hint` was used to insert an
>>> element at a different position."
>>>
>>>
>>>
>>>
>>>> And a hint while being used with 'insert' method of 'std::multiset' is
>>>> considered as "A MUST", not as "A RECOMMENDATION"?
>>>>
>>>
>>> Yes. The standard says for a.insert(p, t) that "t is inserted as close
>>> as possible to the position just prior to p."
>>>
>>>
>>>
>>>

Received on 2025-02-12 14:36:44