ISOCPP std-proposals List: Re: [std-proposals] A Proposal to Add a New Multiset container Method to the Standard Library Technical Report

From: Andrey Semashev <andrey.semashev_at_[hidden]>
Date: Wed, 12 Feb 2025 21:38:35 +0300

On 2/12/25 20:52, Александр Поваляев via Std-Proposals wrote:
> [[[[[[ My comments below ]]]]]]
>
> ср, 12 февр. 2025 г. в 17:50, Jonathan Wakely <cxx_at_[hidden]
> <mailto:cxx_at_[hidden]>>:
>
>
>
> On Wed, 12 Feb 2025 at 14:36, Александр Поваляев
> <apovalyaev_at_[hidden] <mailto:apovalyaev_at_[hidden]>> wrote:
>
> [[[[[My comments below]]]]]
>
> ср, 12 февр. 2025 г. в 15:46, Jonathan Wakely <cxx_at_[hidden]
> <mailto:cxx_at_[hidden]>>:
>
>
>
> On Wed, 12 Feb 2025 at 12:12, Александр Поваляев
> <apovalyaev_at_[hidden] <mailto:apovalyaev_at_[hidden]>> wrote:
>
> Well, to make things more clear.
> The process of inserting a new element with the same key
> value might lead to a different result (different
> position of a newly inserted element).
> And the result depends upon whether a hint is being used
> within 'insert' method of std::multiset or not.
> But once the new element has been inserted, the order is
> preserved starting from that moment.
>
> Let's call an 'oldest' element that one which is closest
> to the beginning among all the elements within the same
> key value.
> To the moment when the 'reduce' method is called.
>
> 'reduce' method is not supposed to use 'equal_range' as
> we have already discovered. Since it would double the
> number of compare operations required.
>
>
> I don't agree.
>
> For example, std::multiset count is 2^20, 'equal_range'
> will require 2*20 compare operations.
>
>
> For 2^20 equivalent elements, I get 57 comparisons for
> equal_range. If you want to remove 1% of those 2^20
> equivalent elements, your preferred approach would require
> 10k comparisons. But using equal_range would require 57
> comparisons.
>
> https://godbolt.org/z/E1E5T6daM <https://godbolt.org/z/
> E1E5T6daM>
>
> 57 is not double 10k.
>
> Here we are talking about possible implementations.
> Let me explain to you, how do I count comparison operations needed:
> * Given a std::multiset of 2^20 elements in total (some of them
> do have equal keys and some of them don't), 'count' will return 2^20
> * Let it be a pair of two elements with the same key K (A and
> B), so both A and B has the same key K and no any other element
> amongst 2^20 has key K
> * The order in which these elements are placed inside
> std::multiset is A then B, so A element is "oldest" and is
> placed closed to the beginning
> * So, we're going to delete only one element of these two
> * We call 'reduce' method of std::multiset container with the
> following parameters reduce(K, 1)
> * Here we comes to the place where we can compare two possible
> implementations you proposed earlier:
> Implementation 1: 'equal_range' is used and so about 20 * 2
> comparisons are needed. I get 57 comparisons because a red-black
> tree is used, not a pure binary.
> Implementation 2: 'find' method is being used and so, it is
> required about 20 comparisons (or so, may be 28)
> * Implementation 2 requires twice as much comparisons as
> Implementation 1. It is bad, because if a class of elements A
> and B defines custom comparator, it will work significantly slower.
> We're talking about STL library implementation (it is a gold
> standard for industry), so we can't just afford just
> unnecessary doubling of any operations;
>
> * Don't take about O(lnN) or O(N), cause our N is not going to
> infinity :D, but we know for sure that Implementation 1 works at
> least twice slower than Implementation 2
> * I think we will need a third implementation, cause there is a
> lot of cases which needed to be considered
>
>
> So make it adaptive:
> https://godbolt.org/z/1jnE4xEEa <https://godbolt.org/z/1jnE4xEEa>
>
> I don't consider that necessary. The logarithmic one using
> equal_range wins in general, because optimizing for small sizes
> isn't usually needed (lookup in them is fast enough anyway).
>
> [[[[[[Aleksandr]]]]]] I think that anything matters when it is on a
> performance path. Selection of the right algorithm makes a significant
> difference. Whereas a code optimization can also bring a positive
> effect. That is why we're trying to make it more efficient
> Implementation1->Implementation2->Implementation3, until everybody
> agrees it is optimal for all the use-cases.
>
> So, there was a question: for what particular use-case 'reduce' method
> can be used.
> I think one of the possible use-cases is 'priority queue which preserves
> order of elements'.

Why not just use std::priority_queue then, with keys augmented with a
sequence number?

Tree-based containers are not well fit for queues anyway because of the
need for rebalancing. Better use a different data structure.

Received on 2025-02-12 18:38:41