C++ Logo

std-discussion

Advanced search

Re: Adding a new modification of algorithm std::unique

From: Vladimir Grigoriev <vlad.moscow_at_[hidden]>
Date: Fri, 12 Jan 2024 21:03:43 +0300
As I wrote in the initial message the name of the algorithm can be made more suitable. At present it is unimportant. What is important is that the standard algorithm std::unique changes the content of a container (or a sequence).
 It is evident that it is a drawback of the algorithm. Sometimes it is required to achive a similar result without changing the content in whole of a container. Neither modification of partition algorithms is suitable here. And I know myself that the algorithm std::unique may be applied to non-sorted sequences. I used sorted sequences only because the results of applying of the suggested algorithm in my provided examples are more visible for such sequences.
 
I am sure that it is a drawback of the library of the standard algorithms. What alternative can you suggest?
 
With best regards,
(Vlad from Moscow)
You can meet me at http://cpp.forum24.ru/ or www.stackoverflow.com or http://ru.stackoverflow.com
 
 
>Четверг, 11 января 2024, 14:49 +03:00 от Giuseppe D'Angelo via Std-Discussion <std-discussion_at_[hidden]>:
>
>Hi,
>
>On 10/01/2024 20:48, Vladimir Grigoriev via Std-Discussion wrote:
>> The algorithm std::unique has a side effect that is not always
>> desirable. That is it can change the content of a container.
>
>> I named it stable_unique though any more suitable name may be proposed.
>
>I don't think "stable" is a good prefix to use here. Stability refers to
>a property of the output of certain algorithms that generate a
>permutation of their input (sort, partition, ...), and specifically to
>the fact that elements that compare equivalent (as far as the algorithm
>is concerned) retain their relative ordering after the permutation --
>the ordering as it was in the sequence before it was processed by the
>algorithm.
>
>In other words, if you have something like:
>
>> struct Employee { std::string name; Job job; };
>>
>> Employee employees[] = {
>> { "Alice", Job::Engineer },
>> { "Bob", Job::Engineer },
>> { "Carol", Job::Manager },
>> { "Dave", Job::QA },
>> { "Eve", Job::Manager },
>> };
>
>then a
>
>> stable_partition(employees, [](auto &&x) { return x.job == Job::Manager; });
>
>*must* yield the { Carol, Eve, Alice, Bob, Dave } permutation, while a
>mere `partition` may return any permutation of { Carol, Eve } followed
>by any permutation of { Alice, Bob, Dave }. The reason why both
>alternatives are offered for certain algorithms is because the
>non-stable version can be faster, so that's the trade-off.
>
>The concept of stability doesn't really have sense for `unique`, as it's
>not a permuting algorithm (just like `remove_if` isn't just a
>`partition`): `unique` "eliminates all but the first element from every
>consecutive group of equivalent elements"
>( https://eel.is/c++draft/alg.unique#3 ).
>
>I am not sure if "eliminates" is formally defined somewhere, but what it
>means concretely is that the tail of elements (going from the returned
>iterator to the `last`) consists of elements that have been (possibly)
>moved-from. That's a "damaging" operation, as you observe; in general
>you can't even _reason_ about their contents (elements are partially
>formed). But what this means is that the output is not a permutation,
>and therefore there's no concept of stability to apply.
>
>(On the other hand, `unique` today is already ""stable"", in the sense
>that the non-eliminated elements are left in the same relative order of
>the original sequence; however, I don't seem to find a provision for that.)
>
>
>Now, for the sake of the argument let's call your algorithm ...
>CTFEFECGOEE ("Collect The First Element From Every Consecutive Group Of
>Equivalent Elements") -- sorry, I don't know if it's got an established
>name in literature. Your output would be a permutation, and as such, you
>could have a stable and a non-stable version (assuming a faster
>implementation is available if you relinquish stability).
>
>For
>> Employee employees[] = {
>> { "Alice", Job::Engineer },
>> { "Bob", Job::Engineer },
>> { "Carol", Job::Manager },
>> { "Dave", Job::QA },
>> { "Eve", Job::Manager },
>> { "Frank", Job::Manager },
>> { "Grace", Job::Manager },
>> };
>
>then stable_CTFEFECGOEE must yield
>
>{ Alice, Carol, Dave, Eve, Bob, Frank, Grace }
>
>
>while non-stable CTFEFECGOEE could e.g. yield
>
>{ Alice, Carol, Dave, Eve, Bob, Grace, Frank }
>
>
>or similar variations. Pay attention to the fact that `unique` does not
>require its input to be sorted, and neither should CTFEFECGOEE; you need
>to "carefully" define CTFEFECGOEE's behavior if you want CTFEFECGOEE's
>output to still be sorted when its input is.
>
>If you're willing to continue with this idea, please survey if this
>algorithm has indeed an established name (or names) in classic algorithm
>literature / other programming language libraries / ..., and then come
>up with a full proposal for it?
>
>My c,
>--
>Giuseppe D'Angelo
>--
>Std-Discussion mailing list
>Std-Discussion_at_[hidden]
>https://lists.isocpp.org/mailman/listinfo.cgi/std-discussion
 

Received on 2024-01-12 18:03:54