Date: Thu, 24 Aug 2023 08:46:10 -0700
On Thursday, 24 August 2023 00:25:46 PDT Nate Eldredge wrote:
> > On Aug 24, 2023, at 00:20, Thiago Macieira via Std-Discussion
> > <std-discussion_at_[hidden]> wrote:
> >
> > Don't forget that the moves between register files (the MOVD instructions)
> > are also non-trivial and cost 3 cycles each. So there's an argument that
> > locking a mutex and performing this addition a single time can be less
> > expensive than iterating multiple times, because you're paying the same
> > cost for CMPXCHG, but you're definitely doing a single addition and no
> > register file movements.
> By saying “locking a mutex” you’re proposing that std::atomic<float> etc
> should no longer be lock free on this platform. That means that every
> single std::atomic<float> in every program now needs an associated mutex,
> which now has to be locked and unlocked around every single operation,
> including plain loads and stores which currently are a single cheap
> instruction.
I'm not actually proposing this change. That would, as you pointed out, be a
binary-incompatible situation.
I'm just pointing out that the operation might end up being faster with a
mutex, depending on the contention rate.
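
To make the comparison concrete, here is a minimal sketch of the two approaches being weighed. It is an illustration only, not how any particular standard library implements std::atomic<float>; the names fetch_add_cas and LockedFloat are hypothetical. The first function retries a compare-exchange until it succeeds, so under contention the addition (and the moves between register files) may be repeated. The second takes a lock and performs the addition exactly once.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical helper: lock-free add built on a compare-exchange loop.
// Returns the value held before the addition.
float fetch_add_cas(std::atomic<float>& target, float delta) {
    float expected = target.load(std::memory_order_relaxed);
    // When the exchange fails, `expected` is refreshed with the current
    // value, so every retry recomputes `expected + delta` from scratch.
    while (!target.compare_exchange_weak(expected, expected + delta)) {
        // Retry: another thread changed the value (or a spurious failure).
    }
    return expected;
}

// Hypothetical mutex-protected alternative: one addition per call, but
// every access, including plain loads and stores, must take the lock.
struct LockedFloat {
    std::mutex m;
    float value = 0.0f;

    float fetch_add(float delta) {
        std::lock_guard<std::mutex> lock(m);
        float old = value;
        value = old + delta;
        return old;
    }
};
```

Which of the two wins would depend on how often the compare-exchange has to retry, which is to say on the contention rate.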
-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel DCAI Cloud Engineering
Received on 2023-08-24 15:46:12