ISOCPP Std Discussion List: Re: Synchronization of atomic notify and wait

From: Thiago Macieira <thiago_at_[hidden]>
Date: Tue, 26 Jul 2022 08:23:45 -0700

On Tuesday, 26 July 2022 01:26:55 PDT Andrey Semashev via Std-Discussion
wrote:
> No, I meant this:
>
> 1. Load an aligned 32-bit word that contains the smaller (e.g. 8-bit)
> atomic.
> 2. Shift and mask it to get rid of the excess bits.
> 3. Perform the operation.
> 4. Shift and merge the modified value with the original excess bits so
> that the excess bits are unchanged.
> 5. Perform a 32-bit CAS.
> 6. Goto #2 if failed.

I'm assuming a platform that natively has 8- and 16-bit atomic instructions.
If you have to do the above, then you're on a platform that does not support
that, which means C++ is emulating those two somehow (I don't care how, that's
not a platform I personally care about), which means your technique is
probably acceptable.
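
For reference, the quoted steps can be sketched roughly as follows. This is
only an illustration, not what any particular Standard Library does: the
helper name fetch_add_byte and its byte-index parameter are made up for this
example, and the "operation" in step 3 is assumed to be an addition.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper (not from any real library): atomically add `delta`
// to byte number `index` (0..3, counted from the least significant byte) of
// a 32-bit word, leaving the other three bytes untouched.
// Returns the previous value of that byte.
inline std::uint8_t fetch_add_byte(std::atomic<std::uint32_t>& word,
                                   unsigned index, std::uint8_t delta)
{
    const unsigned shift = index * 8;
    const std::uint32_t mask = std::uint32_t{0xFF} << shift;

    // Step 1: load the aligned 32-bit word containing the small atomic.
    std::uint32_t old_word = word.load(std::memory_order_relaxed);
    for (;;) {
        // Step 2: shift and mask to get rid of the excess bits.
        const auto old_byte =
            static_cast<std::uint8_t>((old_word & mask) >> shift);
        // Step 3: perform the operation (an addition in this sketch).
        const auto new_byte = static_cast<std::uint8_t>(old_byte + delta);
        // Step 4: merge the result back, keeping the excess bits unchanged.
        const std::uint32_t new_word =
            (old_word & ~mask) | (std::uint32_t{new_byte} << shift);
        // Step 5: 32-bit CAS.
        if (word.compare_exchange_weak(old_word, new_word,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed))
            return old_byte;
        // Step 6: on failure the CAS has stored the current value in
        // old_word, so the loop resumes at step 2 without a separate reload.
    }
}
```

Because the CAS covers the whole word, a concurrent change to any of the
neighbouring bytes also makes it fail and retry, which is the price of the
emulation.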

> But now that I think of it, it may not work if the target supports
> smaller than 32-bit atomics but not futex.

If it doesn't have an OS offload like futex, then it must be emulating
wait/notify with either a mutex and condition variable or a pure spinlock. But
since you
don't have an OS offload, there's no size issue in the first place: everything is
under the control of the Standard Library.
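
A minimal sketch of what such a mutex-based emulation could look like, to
show why the size of the atomic stops mattering. The emulated_waiter type is
hypothetical; a real implementation would more likely keep a table of these
indexed by a hash of the atomic's address.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical emulation of atomic wait/notify without an OS primitive
// such as futex. Nothing here depends on the width of the atomic object.
struct emulated_waiter {
    std::mutex m;
    std::condition_variable cv;

    // Block until the value of `a` is observed to differ from `old`.
    void wait(const std::atomic<std::uint8_t>& a, std::uint8_t old)
    {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] {
            return a.load(std::memory_order_acquire) != old;
        });
    }

    // To be called after the atomic has been modified. Taking and
    // releasing the mutex first ensures that a waiter which has already
    // checked the value but not yet gone to sleep cannot miss the wakeup.
    void notify_all()
    {
        { std::lock_guard<std::mutex> lock(m); }
        cv.notify_all();
    }
};
```

The modifying thread stores the new value into the atomic and then calls
notify_all(); the waiting thread calls wait() with the value it last saw.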

> The architecture may not
> synchronize overlapping atomic operations on different addresses and/or
> of different sizes. Even x86 formally has this restriction, even though
> I think it works in practice and is being used at least in glibc
> (though I'm not totally sure). So, in general, it may not work
> efficiently (i.e. you would have to implement smaller atomics with
> 32-bit ones even though smaller atomics are directly supported by the
> hardware).

Right, at this point it becomes a QoI question and therefore an architectural
one. It also depends on whether the implementers are willing to rely on
undocumented microarchitectural behaviour (for example, all Intel big cores
since Haswell support atomic 32-byte loads and stores so long as they don't
cross a cacheline, but earlier ones didn't and Atom still doesn't).

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel DPG Cloud Engineering

Received on 2022-07-26 15:23:47
