ISOCPP Std Discussion List: Re: Lack of ordering guarantees in shared_ptr wording

From: Nate Eldredge <nate_at_[hidden]>
Date: Sat, 15 Feb 2025 20:34:45 -0700

> On Feb 15, 2025, at 19:22, jenni_at_[hidden] wrote:
>
>> And it seems clear that shared_ptr does have to avoid data races on freeing the control block, which requires there to be some amount of ordering.
>
> But this ordering can be provided by the single total order on RMW operations of mo_relaxed, no?

No. The reference counter is all that provides synchronization between reads of the control block by "non-last" destructors (i.e. those whose decrement did not result in zero), and the deletion of the control block by the "last" destructor. And the only way to get that synchronization is with both a release and an acquire. (Or a consume, but let's discount those for now; they say they're being removed from C++26 anyhow.)
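
To make that concrete, here is a minimal sketch of the decrement I have in mind. The names `ControlBlock`, `release_ref` and `run_owners` are made up for illustration and are not taken from any real standard library; an actual control block also holds the pointer, deleter and weak count.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical control block; the name and layout are illustrative only.
struct ControlBlock {
    std::atomic<long> refs;
    explicit ControlBlock(long n) : refs(n) {}
};

// Drops one reference. Returns true if this call freed the block.
bool release_ref(ControlBlock* cb) {
    // release: everything this thread did with *cb is ordered before the decrement.
    // acquire: the thread that takes the count to zero synchronizes with all
    //          earlier decrements before it touches the block again.
    if (cb->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        // With memory_order_relaxed above, this delete would not be ordered
        // after the other threads' accesses to *cb.
        delete cb;
        return true;
    }
    return false;
}

// Starts `owners` threads that each drop one reference and
// returns how many of them ended up freeing the block.
int run_owners(int owners) {
    auto* cb = new ControlBlock(owners);
    std::atomic<int> frees{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < owners; ++i) {
        threads.emplace_back([cb, &frees] {
            if (release_ref(cb)) frees.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& t : threads) t.join();
    return frees.load();
}
```

Whichever thread happens to decrement last, exactly one call frees the block, and the acq_rel ordering is what makes that free happen after every other owner's use of it.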

> If you read a refcount of 0 it implies that all other existent shared_ptrs have also completed their destructors at least to the point of decrementing the refcount. By definition there cannot be anyone else who could potentially use the control block while the last owner is inside the == 0 block. Any reads that happen as part of the delete could potentially leak outside of the if statement but any writes have to remain inside, lest a write be invented.

That's how real machines work, of course, but the C++ model doesn't promise it. A control dependency without an acquire barrier doesn't provide any happens-before ordering across threads, and doesn't save you from a data race and consequent UB. There's no distinction between reads and writes in this respect, and so either one would be allowed to "leak out".

So as I understand it, ISO C++ would allow for a hypothetical machine that could make a write globally visible before the read on which it depended. For instance, the machine might peek into the future and see that the read would inevitably return the correct value. Or it might make the write globally visible while still speculative, and have some way to roll back the entire system (including other cores) if it turned out to be wrong.

> The only thing that the acquire barrier could be synchronising in that case is the *content* of the control block, which shouldn't need to be synchronised since it's all trivially destructible and not read as part of the delete.
>
> The mo_release on the decrement also seems unnecessary if you're just synchronising the state of the control block. The only case you'd need that would be if you were modifying part of the control block and needed that to be visible on another thread when it does the acquire, which we've established isn't necessary, so the paired release isn't either, putting aside the fact that the refcount is the only thing being modified.

The release is needed for the "non-last" case, since you'll have accesses to the control block sequenced before the decrement, and they need to not race with the delete in the thread which does do the final decrement. Again, the memory model insists on both sides of the release-acquire pair. And while the acquire side can be made conditional by putting an acquire fence in the branch that saw zero, there's no corresponding trick to make the release barrier conditional on the value read by an RMW.
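
For reference, the acquire-fence variant looks roughly like the sketch below. Again `Block` and `release_ref_fenced` are hypothetical names chosen for the example. The decrement is always a release, because a thread cannot know in advance whether it will be the last one; only the acquire is deferred to the branch that actually needs it.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical control block; illustrative only.
struct Block {
    std::atomic<long> refs;
    explicit Block(long n) : refs(n) {}
};

// Drops one reference. Returns true if this call freed the block.
bool release_ref_fenced(Block* b) {
    // The release cannot be skipped: it must already be in effect when the
    // RMW executes, before we learn what value the RMW read.
    if (b->refs.fetch_sub(1, std::memory_order_release) == 1) {
        // Only the last owner pays for the acquire. The fence synchronizes
        // with every earlier release decrement, via the value just read.
        std::atomic_thread_fence(std::memory_order_acquire);
        delete b;
        return true;
    }
    return false;
}
```

A quick single-threaded check of the control flow: with a count of 2, the first call returns false and the second returns true and frees the block.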

> As I understand it if shared_ptr doesn't provide any ordering guarantees on accesses to the object then all it requires is a STO on the increments/decrements, which mo_relaxed provides.
>
>> I'd expect it to say something like this: if D is the invocation of the destructor that invokes the deleter, and D' is any other invocation of the destructor of a `shared_ptr` that shares ownership with it, then D' strongly happens before the invocation of the deleter.
>
> Second this as being a good wording to provide ordering between the destructor and the deleter, but it would still need additional wording beyond this to ensure that the accesses to the shared object happen-before the destructor as well (although it would perhaps need to cover more than just accesses to the shared object? If a thread modifies an object via a pointer stored in the shared object, should that modification be visible in the deleter? Current library behaviour with rel-acq ordering is yes).

I think it's sufficient. It puts it on the programmer to ensure that every access to the shared object (other than its deleter) happens-before the destructor (or reset, move, etc) of at least one of the shared_ptr objects that owns it. And I think that's what people are already doing, based on the general understanding of how shared_ptr "should" work, as you already articulated. (I presume that in the vast majority of cases, this happens-before is achieved simply by sequenced-before; within a single thread, with respect to program order, you access the object through a shared_ptr, then destroy that shared_ptr, and don't access the object again after that.)

Assuming they do that, then they have the guarantee you want, because happens-before is transitive (again I disregard consume for simplicity). So their modifications of the object happen-before the destructor, which happens-before the deleter. Therefore their modifications happen-before the deleter, which means the deleter must observe them.
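
Here is a small sketch of that usage pattern, under the assumption that the implementation gives the destructor-to-deleter ordering discussed above (which current implementations do in practice). `Slots` and `sum_seen_by_deleter` are names invented for the example. Each worker writes through its own `shared_ptr` copy and then drops it, so the write is sequenced before that copy's release; the custom deleter then reads everything.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical shared object: one slot per worker thread.
struct Slots {
    std::vector<int> v;
};

// Each worker writes its slot, then drops its shared_ptr copy.
// Returns the sum of all slots as observed inside the deleter.
int sum_seen_by_deleter(int workers) {
    int seen = -1;
    std::shared_ptr<Slots> root(
        new Slots{std::vector<int>(workers)},
        [&seen](Slots* s) {
            int sum = 0;
            for (int x : s->v) sum += x;  // reads every worker's write
            seen = sum;
            delete s;
        });

    std::vector<std::thread> threads;
    for (int i = 0; i < workers; ++i) {
        threads.emplace_back([p = root, i]() mutable {
            p->v[i] = i + 1;  // access through the shared_ptr ...
            p.reset();        // ... sequenced before giving up ownership
        });
    }
    root.reset();  // main gives up its reference too; any thread may be last
    for (auto& t : threads) t.join();
    return seen;   // the joins order the deleter's write before this read
}
```

With four workers the deleter should see 1 + 2 + 3 + 4 = 10, regardless of which thread ends up running it.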

Received on 2025-02-16 03:34:59
