ISOCPP std-proposals List: [std-proposals] std::shared

From: Oliver Schädlich <oliver.schaedlich_at_[hidden]>
Date: Wed, 28 Aug 2024 15:12:02 +0200

I recently presented my idea for improving shared_ptr<> in combination with
atomic<shared_ptr<>> here, but I was misunderstood. My idea was simply that shared_ptr<>
should gain a new assignment operator that takes an atomic<shared_ptr<>> as its parameter. The
trick is that this assignment operator could check whether both pointers point to
the same data and, if so, do nothing.
If you have an RCU-like pattern, the participating threads could simply keep a thread_local
shared_ptr<> which is periodically updated with such an assignment. As is usual with RCU-like
patterns, the central atomic<shared_ptr<>> would be updated rarely, and the comparison
would be very quick because the cachelines involved stay mostly shared.
With current C++, load()ing from an atomic<shared_ptr<>> is extremely slow, even when you
have a lock-free solution, since every load() has to update the reference count and the
resulting cacheline-flipping between the cores is rather expensive.
I've written my own shared_obj<> (like shared_ptr<>) and tshared_obj<> (like
atomic<shared_ptr<>>) classes, and with this optimization I get a speedup of a factor of
5,000 when 32 threads are contending on a single tshared_obj<>, i.e. constantly copying its
pointer to a shared_obj<> object while the tshared_obj<> is rarely updated.

This is the code:

template<typename T>
shared_obj<T> &shared_obj<T>::operator =( tshared_obj<T> const &ts ) noexcept
{
    using namespace std;
    // important optimization for RCU-like patterns:
    // do we already manage the same object as the thread-safe object pointer?
    if( (size_t)m_ctrl == ts.m_dca.first().load( memory_order_relaxed ) ) [[likely]]
        // yes: nothing to do
        return *this;
    // no: release the reference we currently hold
    this->~shared_obj();
    // copy() increments the reference count on our behalf
    m_ctrl = ts.copy();
    return *this;
}
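
For readers who don't have shared_obj<> and tshared_obj<> at hand, the same idea can be
roughly approximated with standard types. This is only a sketch under a few assumptions, not
the implementation measured above: central_ptr, store() and refresh() are hypothetical names,
the raw-pointer "hint" stands in for the m_dca.first() comparison, the free functions
std::atomic_load / std::atomic_store on shared_ptr<> (deprecated since C++20) stand in for
atomic<shared_ptr<>>, and there is assumed to be only a single writer.

```cpp
#include <atomic>
#include <memory>
#include <utility>

// Hypothetical helper: a central shared_ptr plus a raw-pointer hint that
// readers can compare against without touching the reference count.
// Assumption: only one thread ever calls store().
template<typename T>
class central_ptr
{
public:
    explicit central_ptr( std::shared_ptr<T> init ) :
        m_ptr( std::move( init ) ),
        m_hint( m_ptr.get() )
    {
    }
    // rare writer path: publish a new object, then update the hint
    void store( std::shared_ptr<T> next )
    {
        T *raw = next.get();
        std::atomic_store( &m_ptr, std::move( next ) );
        m_hint.store( raw, std::memory_order_release );
    }
    // frequent reader path: replaces 'local' only if the hint differs from
    // what 'local' already holds; returns true if 'local' was replaced
    bool refresh( std::shared_ptr<T> &local ) const
    {
        // cheap check: a read-only load, so the cacheline can stay shared;
        // the hint is only compared, never dereferenced
        if( local.get() == m_hint.load( std::memory_order_relaxed ) )
            return false;
        // slow path: a real atomic load that bumps the reference count
        local = std::atomic_load( &m_ptr );
        return true;
    }
private:
    std::shared_ptr<T> m_ptr;   // accessed only via std::atomic_load / atomic_store
    std::atomic<T *> m_hint;    // raw copy of m_ptr.get() for the fast comparison
};
```

Each reader thread would keep a thread_local shared_ptr<T> and call refresh() on it
periodically. Comparing raw pointers is safe here because the local shared_ptr<> keeps its
object alive, so that address cannot be reused while the comparison still matches. Between
the two stores in store() a reader may briefly keep or reload the old object, which is
usually acceptable for RCU-like patterns.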

Received on 2024-08-28 13:12:04