C++ Logo

std-proposals

Advanced search

Re: Reasons for adding insert_return_type in associative containers

From: Boyarinov, Konstantin <konstantin.boyarinov_at_[hidden]>
Date: Tue, 23 Jul 2019 10:57:09 +0000
Thank you for the explanation.

Also, I thought about usage of overload 2 in std::insert_iterator.
It can be useful to combine the node handling mechanism and STL algorithms.

For example, if we want to “transfer” all of the elements satisfying some predicate from one container to another using the node handling mechanism, currently we need to write something like this:

set<int> s1({1, 2, 3, 4, 5});
set<int> s2;

for(auto it = s1.begin(); it != s1.end(); ++it) {
                if(pred(*it)) {
                                s2.insert(s1.extract(it));
                }
}

I had an idea to add ability to make it using STL algorithm, e.g. std::copy_if

The interface would be:
set<int> s1({1, 2, 3, 4, 5});

std::copy_if(std::extractor(s1.begin()), std::extractor(s1.end()),
        std::inserter(s2.begin()), pred);

Where std::extractor is a make function for the special iterator class extract_iterator.
extract_iterator is an iterator contains the pointer to the container and the instance of iterator for the container.
Operator* for such iterator invoking extract method on the container and returns the rvalue reference on the node which was extracted.

Also, it requires adding an extra overload for std::insert_iterator::operator=:

constexpr insert_iterator& operator=(typename Container::node_type&& nh);

This overload only participates in overload resolution if Container::node_type is a valid member type.

These mechanism have some side effects:

1. Consider the following example

std::set<int> s1({1, 2, 3, 4, 5});
std::set<int> s2;

std::copy_if(std::extractor(s1.begin()), std::extractor(s1.end()), std::inserter(s2.begin()), is_even{});

The result of this code would be:
Container s1 is empty
Container s2: {2, 4}

The elements 1, 3, 5 are lost

2. If the insertion in the insert_iterator fails:

std::set<int> s1({1, 2, 3, 4, 5});
std::set<int> s2({1,2, 3, 5});

std::copy(std::extractor(s1.begin()), std::extractor(s1.end()), std::inserter(s2.begin()));

The result would be:
Container s1 is empty
Container s2: {1, 2, 3, 4, 5}

Elements 1, 2, 3, 5 from the first container are lost.

What do you think about this idea?
Thank you one more time for your help.


Best regards,
Konstantin Boyarinov




From: Arthur O'Dwyer [mailto:arthur.j.odwyer_at_[hidden]]
Sent: Tuesday, July 23, 2019 9:22 AM
To: std-proposals_at_[hidden]
Cc: Boyarinov, Konstantin <konstantin.boyarinov_at_[hidden]>
Subject: Re: [std-proposals] Reasons for adding insert_return_type in associative containers

On Tue, Jul 23, 2019 at 1:30 AM Boyarinov, Konstantin via Std-Proposals <std-proposals_at_[hidden]<mailto:std-proposals_at_[hidden]>> wrote:
Hello,

I have a small question regarding insert_return_type member type in STL ordered and unordered associative containers.
Node handling mechanism introduced two new overloads for insert:

insert_return_type insert(node_type&& nh); (1)
iterator insert(const_iterator hint, node_type&& nh); (2)

[where the `node` member of the returned `insert_return_type` gets the old value of `nh` iff the insert does not happen]

Overload (2) inserts a node owned by nh into the container and if the insertion fails nh retains ownership on the node.

So the question is why such inconsistence is needed?

Well, overload (1) is clearly a better interface than overload (2). "Conditionally moving-from" a reference parameter is terrible API design. Why? Because the static analyzer would like to look at a local piece of code such as

    Widget w;
    foo(std::move(w));
    bar(w); // OOPS

and diagnose the line marked "OOPS" as a use-after-move. If `foo(std::move(w))` is not a bug, and yet does not leave `w` in a moved-from state either, then the static analyzer is going to produce a lot of false positives. This is bad. For the sake of static analysis, we need "move" to mean "move." When our library forces the programmer to write `std::move` on a function argument, it had better be because that function is going to unconditionally move-out-of that argument!

So overload (1) has a good reason to unconditionally move-out-of its rvalue parameter `nh`. But sometimes it doesn't want to take ownership of the node referred to by `nh`. In that case, it wants to return ownership of the node back to the caller. So it includes a node handle as part of the return type of the function.

So your question boils down to: Why does overload (2) — the "hinted" version of `insert` — use a known terrible API design which is deliberately inconsistent with the good API design of overload (1) — especially given that both overloads were added in the same version of C++?


I don't know for sure, but I suspect that the answer is that nobody cares about the "hinted" versions of `insert`. Vendors just ignore the `hint` parameter; programmers don't call these methods in the first place. So they were probably not looked at very thoroughly during wording review.

LWG discussion on the Jacksonville (2016) wiki indicates that the good API for (1) was not part of the original proposal, but was designed-by-committee in LEWG. Presumably LEWG didn't notice or care about the "hinted" version because nobody uses it. A few people in LWG at Jacksonville did notice the asymmetry between (1) and (2), but presumably didn't care to do "design" work that LEWG was already supposed to have done, so they shipped it with the asymmetry intact.

I am obliged to point out the date on P0083R3: June 24, 2016, mere days before the Committee Draft of C++17 was released<https://herbsutter.com/2016/06/30/trip-report-summer-iso-c-standards-meeting-oulu/>. We are now (July 2019) in that same period — the Committee Draft of C++20 was set in stone at Cologne just a few days ago. And this time we have even more features still undergoing design work, and barely squeaking into the draft ahead of the deadline with insufficient review time. There will be a lot more mistakes in C++20 than there were in C++17 (which had more than C++14, and so on). Being able to spot the mistakes two years later doesn't help, when C++ is on a three-year release cycle with major features going in in the last few days of each cycle!

–Arthur

--------------------------------------------------------------------
Joint Stock Company Intel A/O
Registered legal address: Krylatsky Hills Business Park,
17 Krylatskaya Str., Bldg 4, Moscow 121614,
Russian Federation

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

Received on 2019-07-23 05:59:09