Date: Mon, 10 Feb 2020 19:53:28 +0300
On 2020-02-10 19:32, Thiago Macieira via Std-Discussion wrote:
> On Monday, 10 February 2020 08:24:24 PST Thiago Macieira wrote:
>> On Monday, 10 February 2020 00:44:42 PST Andrey Semashev via Std-Discussion wrote:
>>> Yes, x86 is one such architecture, as is every CAS-based architecture
>>> I'm aware of. The example is mentioned in the standard:
>>> atomic_ref<complex<double>>. Assuming double is 64-bit and
>>> complex<double> is a 128-bit structure, it must be aligned to 16 bytes
>>> for atomic CAS(*), but alignof(complex<double>) is 8.
>>
>> [snip]
>>
>>> But given the definition of required_alignment in the standard, this
>>> code does not guarantee that ref.is_lock_free() will be true. An
>>> implementation where required_alignment is 8 and ref is implemented with
>>> a lock pool is allowed by the standard. Even if such atomic_ref
>>> implementation supports lock-free operations starting at alignment of 16.
>>
>> No, it doesn't guarantee lock-free, it just guarantees atomicity. The need
>> to use a lock is usually related to size, not alignment. x86-64 will do a
>> 16-byte CAS lock-free, i386 will not, regardless of alignment.
>>
>>> What I want is a constant that would have the value of 16 in this
>>> case. I.e. the alignment that would make atomic_ref lock-free, if it can
>>> be at all for an object of such size. If it can't, let it be equal to
>>> alignof(T).
>
> Sorry, hit Ctrl+Enter by accident
>
> Is there any wording we can add to ask implementations to increase alignment
> so that it is lock-free? Or is it a QoI?

I did not find one, therefore my confusion.

> I agree we don't want two constants. We want one and we want that the
> requested alignment be suitable for lock-free operations, if such operations
> exist on that system.
>
>>> (*) If it is not aligned, atomicity is not guaranteed on x86. I think, a
>>> hardware exception is generated, which can be handled by the kernel. The
>>> exception handler may emulate the atomic CAS (at terrible performance
>>> cost) or it may kill the process with SIGBUS. For our purposes of
>>> atomic_ref, let's assume this is not the desired behavior and unaligned
>>> atomics are banned.
>
> No, atomicity is guaranteed on x86 by the hardware, though with extremely poor
> performance, hence the suggestion to shoot with extreme prejudice. The very
> newest processors announced by Intel have a "Split Lock Detection" feature,
> which is not enabled by default and will cause a HW exception like you
> describe. What you're describing is a future feature, not current and past
> behaviour.
>
> The Linux kernel does not implement a CAS emulation and from what I understand
> from my colleagues, it doesn't want to.

My understanding is that split lock detection allows the kernel to
handle the misaligned atomic operation. However, AFAIK, the alignment
requirement has always been there, as documented in the SDM. Older
processors simply crash with a #GP exception on an alignment violation.
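
To make the complex<double> case concrete, here is a minimal sketch of how
user code would over-align an object for atomic_ref today. The names
atomic_ref_alignment, Sample and is_aligned_for_atomic_ref are hypothetical,
introduced only for illustration. The sketch assumes a C++20 library that
provides std::atomic_ref; the alignof(T) fallback exists only so that it
still compiles elsewhere. Note that nothing in it guarantees lock-freedom:
required_alignment only has to make the operations atomic, which is exactly
the gap discussed above.

```cpp
#include <atomic>
#include <complex>
#include <cstddef>
#include <cstdint>

using C = std::complex<double>;

// Hypothetical helper: the alignment an object of type T needs before an
// atomic_ref<T> may be bound to it.
template <class T>
constexpr std::size_t atomic_ref_alignment() noexcept {
#ifdef __cpp_lib_atomic_ref
    return std::atomic_ref<T>::required_alignment;
#else
    return alignof(T);  // assumption: fallback so the sketch compiles pre-C++20
#endif
}

// Over-align the member so that atomic_ref<C> can legally refer to it.
struct Sample {
    alignas(atomic_ref_alignment<C>()) C value{};
};

// required_alignment is never smaller than alignof(T).
static_assert(atomic_ref_alignment<C>() >= alignof(C),
              "required_alignment must be at least alignof(T)");

// Runtime check that a particular object meets the requirement.
inline bool is_aligned_for_atomic_ref(const C& obj) noexcept {
    return reinterpret_cast<std::uintptr_t>(&obj) % atomic_ref_alignment<C>() == 0;
}

#ifdef __cpp_lib_atomic_ref
// Meeting required_alignment does not imply this is true; an implementation
// may report required_alignment == 8 and still use a lock pool.
constexpr bool complex_always_lock_free =
    std::atomic_ref<C>::is_always_lock_free;
#endif
```

With the constant I am asking for, the alignas above would pick 16 whenever
the target can do a lock-free 16-byte CAS, and alignof(T) otherwise.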
Received on 2020-02-10 10:56:10