Date: Tue, 26 Jul 2022 18:52:07 +0300
On 7/26/22 18:23, Thiago Macieira via Std-Discussion wrote:
> On Tuesday, 26 July 2022 01:26:55 PDT Andrey Semashev via Std-Discussion
> wrote:
>> No, I meant this:
>>
>> 1. Load an aligned 32-bit word that contains the smaller (e.g. 8-bit)
>> atomic.
>> 2. Shift and mask it to get rid of the excess bits.
>> 3. Perform the operation.
>> 4. Shift and merge the modified value with the original excess bits so
>> that the excess bits are unchanged.
>> 5. Perform a 32-bit CAS.
>> 6. Goto #2 if failed.
>
> I'm assuming a platform that has 8- and 16-bit atomic instructions on its own.
> If you have to do the above, then you're on a platform that does not support
> that, which means C++ is emulating those two somehow (I don't care how, that's
> not a platform I personally care about), which means your technique is
> probably acceptable.
>
>> But now that I think of it, it may not work if the target supports
>> smaller than 32-bit atomics but not futex.
>
> If it doesn't have an OS offload like futex, then it must be emulating wait/
> notify with either a mutex/wait condition or a pure spinlock. But since you
> don't have an OS offload, there's no size issue in the first place: everything is
> under the control of the Standard Library.

The point is to use futex even with atomics smaller than 32 bits, because
this is arguably more efficient than a lock/futex pool. But there is a
tradeoff: all atomic operations would then have to be performed using
masking+CAS, while wait/notify is likely a minority of atomics usage. So
it probably isn't worth it in the general case, such as std::atomic, but
it may be worth doing in other, more specialized cases.
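
For illustration, the masking+CAS steps quoted above could look roughly like
the sketch below for a fetch_add on an 8-bit value. This is only a sketch:
the function name fetch_add_u8_in_word and the "lane" parameter are made up
for this example, the lane is counted from the least significant byte of the
32-bit value (mapping a byte address to a lane depends on endianness and is
left out), and the memory orders are just one reasonable choice.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper (not a standard or library API): atomically adds
// `delta` to the 8-bit lane `lane` (0..3, counted from the least significant
// byte) of a 32-bit word and returns the previous value of that lane.
inline std::uint8_t fetch_add_u8_in_word(std::atomic<std::uint32_t>& word,
                                         unsigned lane, std::uint8_t delta)
{
    const unsigned shift = lane * 8u;
    const std::uint32_t mask = std::uint32_t{0xFF} << shift;

    // 1. Load the aligned 32-bit word that contains the 8-bit value.
    std::uint32_t old_word = word.load(std::memory_order_relaxed);
    for (;;) {
        // 2. Shift and mask to get rid of the excess bits.
        const std::uint8_t old_val =
            static_cast<std::uint8_t>((old_word & mask) >> shift);
        // 3. Perform the operation (wraps modulo 256).
        const std::uint8_t new_val =
            static_cast<std::uint8_t>(old_val + delta);
        // 4. Merge the result back, leaving the excess bits unchanged.
        const std::uint32_t new_word =
            (old_word & ~mask) | (std::uint32_t{new_val} << shift);
        // 5. 32-bit CAS. On failure old_word is updated with the value
        //    currently stored, so the loop resumes at step 2.
        if (word.compare_exchange_weak(old_word, new_word,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed))
            return old_val;
    }
}
```

Since the storage is a full 32-bit word, that same word is what a futex-based
wait/notify would operate on. The cost is visible here as well: even a plain
increment becomes a CAS loop on the containing word, and it contends with
updates to the neighbouring bytes.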
Received on 2022-07-26 15:52:09
