Date: Fri, 22 Nov 2024 12:49:58 -0500
I think C++ may benefit from a native managed memory object (à la C++/CLI).
I found std::shared_ptr has a few serious issues that are causing me
to shy away from it:
* the need to derive from std::enable_shared_from_this
* shared_from_this() cannot be used in constructors
* messy downcasting of shared_from_this() when deriving from base classes
that themselves derive from std::enable_shared_from_this (sketched below)
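A minimal sketch of that last point (all names here are illustrative, not taken
from any existing codebase):

    #include <memory>

    struct Base : std::enable_shared_from_this<Base>
    {
        virtual ~Base() = default;
        // note: shared_from_this() may not be called in the constructor,
        // because no shared_ptr owns the object yet
    };

    struct Derived : Base
    {
        std::shared_ptr<Derived> self()
        {
            // shared_from_this() yields shared_ptr<Base>, so every derived
            // class has to downcast back to its own type:
            return std::static_pointer_cast<Derived>( shared_from_this() );
        }
    };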
On Wed, Aug 28, 2024 at 12:20 PM Jonathan Wakely via Std-Proposals <
std-proposals_at_[hidden]> wrote:
>
>
> On Wed, 28 Aug 2024 at 16:53, Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
> wrote:
>
>> On Wed, Aug 28, 2024 at 9:12 AM Oliver Schädlich via Std-Proposals <
>> std-proposals_at_[hidden]> wrote:
>>
>>> I recently presented my idea for an improvement of shared_ptr<> in
>>> combination with atomic<shared_ptr<>> here, but I was misunderstood.
>>> My idea was simply that shared_ptr<> should add a new assignment
>>> operator with an atomic<shared_ptr<>> as its parameter.
>>>
>> I believe I understood that `operator=` was the form-factor you hit on
>> first, yes. But that's not the right form-factor.
>> I very strongly oppose adding any new overloaded operator to `atomic`
>> which does not have any named-member-function version, for two reasons:
>> - Good multithreaded code IMNSHO makes all atomic operations explicit
>> (e.g. `int b = a.load(); a.store(b)`).
>> - You need somewhere to hang the `memory_order` argument, anyway.
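For illustration of that second point, the named-member-function style already
carries an explicit memory_order for atomic<shared_ptr<>> in C++20, and a
refresh-style operation could follow the same pattern (load_into below is not
existing API, only a placeholder for the proposed functionality):

    #include <atomic>
    #include <memory>

    std::atomic<std::shared_ptr<int>> at;
    std::shared_ptr<int> p;

    void example()
    {
        p = at.load( std::memory_order_acquire );        // explicit, carries the order
        at.store( p, std::memory_order_release );
        // at.load_into( p, std::memory_order_acquire ); // hypothetical proposed form
    }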
>>
>>> With current C++, load()ing from an atomic<shared_ptr<>> is extremely slow,
>>> even when you have a lock-free solution, since the cache-line bouncing
>>> between the cores is rather expensive.
>>> I've written my own shared_obj<> class (like shared_ptr<>) and tshared_obj<>
>>> class (like atomic<shared_ptr<>>), and with this optimization I get a
>>> speedup factor of 5,000 when 32 threads are contending on a single
>>> tshared_obj<>, i.e. constantly copying its pointer to a shared_obj<> object
>>> while the tshared_obj<> is rarely updated.
>>>
>>> This is the code:
>>>
>>> template<typename T>
>>> shared_obj<T> &shared_obj<T>::operator =( tshared_obj<T> const &ts ) noexcept
>>> {
>>>     using namespace std;
>>>     // important optimization for RCU-like patterns:
>>>     // are we managing the same object as the thread-safe object pointer?
>>>     if( (size_t)m_ctrl == ts.m_dca.first().load( memory_order_relaxed ) ) [[likely]]
>>>         // yes: nothing to do
>>>         return *this;
>>>
>> FYI, this would not be correct for `std::shared_ptr`. With
>> `std::shared_ptr` it is totally possible to have two shared_ptrs owning the
>> same control block yet with different values:
>> std::shared_ptr<int[]> a = std::make_shared<int[]>(2);
>> std::shared_ptr<int> b = std::shared_ptr<int>(a, &a[0]);
>> std::shared_ptr<int> c = std::shared_ptr<int>(a, &a[1]);
>> std::atomic<std::shared_ptr<int>> at;
>> at.store(b);
>> c = at.load(); // Replace this line with the new API, i.e.
>> at.load_into(c); or c = at;
>> assert(c == b); // This had better be true after the load!
>>
>> So you need to look at both halves of the shared_ptr. I believe this
>> requires a double-wide atomic load to avoid tearing-esque problems.
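A small sketch of what such a double-wide atomic of the two raw halves could
look like (the struct and its members are illustrative; whether the 16-byte
atomic is actually lock-free depends on the target, e.g. cmpxchg16b on x86-64,
otherwise the library falls back to a lock):

    #include <atomic>
    #include <cstdio>

    struct raw_halves            // both halves of a shared_ptr, as raw pointers
    {
        void* ctrl;
        void* obj;
    };

    int main()
    {
        std::atomic<raw_halves> a{};
        std::printf( "lock-free: %d\n", (int)a.is_lock_free() );
        raw_halves snap = a.load( std::memory_order_acquire );  // no tearing either way
        (void)snap;
    }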
>>
>
> Or you load an actual shared_ptr<T> object, which gives you both pointers
> "atomically" (even if there might be a lock involved), but that involves
> the ref count increment that is apparently the source of the overhead.
>
> Maybe you could atomically get a pair<control_block*, T*> and then if both
> values (or just the first?) equal the values in *this, then you do nothing
> (and we know this is thread-safe because *this must own the pointer, so its
> ref count is greater than 0 and won't drop to zero even if the
> atomic<shared_ptr<T>> is being concurrently modified). If the values are
> not equal, then you need to reload from the atomic, because it could have
> changed since you loaded the pair<control_block*, T*> and invalidated the
> previous values.
>
> But that's not what the posted code does.
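A rough sketch of the fast path Jonathan describes, under heavy assumptions:
shared_obj is taken to hold both a control-block pointer (m_ctrl) and an object
pointer (m_obj), and tshared_obj is assumed to offer raw_pair() (a hypothetical
double-wide atomic load of both raw pointers) and load() (a full
reference-counted load). None of these names come from the posted code.

    // Fragment only (assumes <utility>, <atomic> and the hypothetical
    // classes described above are in scope).
    template<typename T>
    shared_obj<T> &shared_obj<T>::operator =( tshared_obj<T> const &ts ) noexcept
    {
        // snapshot both halves without touching the reference count
        std::pair<control_block*, T*> snap = ts.raw_pair( std::memory_order_relaxed );
        if( snap.first == m_ctrl && snap.second == m_obj ) [[likely]]
            return *this;   // same ownership and same value: nothing to do, and
                            // *this keeps the count above zero meanwhile
        // otherwise the atomic may have changed again since the snapshot, so do a
        // full load (which increments the count) and adopt its result
        *this = ts.load( std::memory_order_acquire );
        return *this;
    }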
>
>
>>>     // erase our object pointer
>>>     this->~shared_obj();
>>>
>> As Jonathan said, technically, what you want here is
>> *this = nullptr;
>> Also, the following line repoints `m_ctrl`, but where do you actually
>> repoint the object pointer to point to the same object as `*ts`?
>>
>
>
> It looks like this implementation just has a single pointer (the pointer
> to the control block, maybe?) and not a second pointer to a T*. That's not
> how std::shared_ptr works, and I don't immediately see how to adapt the
> code that was shown to work for std::shared_ptr.
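For reference, a typical std::shared_ptr is conceptually two pointers, which is
why comparing only a control-block pointer is not enough (member names below
are illustrative, not from any particular implementation):

    template<typename T>
    struct shared_ptr_layout   // simplified mental model only
    {
        T*             ptr;    // what get() / operator* return
        control_block* ctrl;   // shared count, weak count, deleter, allocator
    };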
>
>
>>>     // copy() increments the reference count on our behalf
>>>     m_ctrl = ts.copy();
>>>     return *this;
>>> }
>>>
>>
>>
>> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>
Received on 2024-11-22 17:50:15