Date: Mon, 10 Feb 2020 08:32:08 -0800
On Monday, 10 February 2020 08:24:24 PST Thiago Macieira wrote:
> On Monday, 10 February 2020 00:44:42 PST Andrey Semashev via Std-Discussion wrote:
> > Yes, x86 is one such architecture, as is every CAS-based architecture
> > I'm aware of. The example is mentioned in the standard:
> > atomic_ref<complex<double>>. Assuming double is 64-bit and
> > complex<double> is a 128-bit structure, it must be aligned to 16 bytes
> > for atomic CAS(*), but alignof(complex<double>) is 8.
>
> [snip]
>
> > But given the definition of required_alignment in the standard, this
> > code does not guarantee that ref.is_lock_free() will be true. An
> > implementation where required_alignment is 8 and ref is implemented with
> > a lock pool is allowed by the standard. Even if such atomic_ref
> > implementation supports lock-free operations starting at alignment of 16.
>
> No, it doesn't guarantee lock-free, it just guarantees atomicity. The need
> to use a lock is usually related to size, not alignment. x86-64 will do a
> 16-byte CAS lock-free, i386 will not, regardless of alignment.
>
> > What I want is a constant that would have the value of 16 in this
> > case. I.e. the alignment that would make atomic_ref lock-free, if it can
> > be at all for an object of such size. If it can't, let it be equal to
> > alignof(T).
Sorry, hit Ctrl+Enter by accident
Is there any wording we can add to ask implementations to increase the
alignment so that the type is lock-free? Or is it a QoI issue?
I agree we don't want two constants. We want one, and we want the
requested alignment to be suitable for lock-free operations, if such
operations exist on that system.
> > (*) If it is not aligned, atomicity is not guaranteed on x86. I think, a
> > hardware exception is generated, which can be handled by the kernel. The
> > exception handler may emulate the atomic CAS (at terrible performance
> > cost) or it may kill the process with SIGBUS. For our purposes of
> > atomic_ref, let's assume this is not the desired behavior and unaligned
> > atomics are banned.
No, atomicity is guaranteed on x86 by the hardware, though with extremely poor
performance, hence the suggestion to shoot with extreme prejudice. The very
newest processors announced by Intel have a "Split Lock Detection" feature,
which is not enabled by default and will cause a HW exception like you
describe. What you're describing is a future feature, not current and past
behaviour.
The Linux kernel does not implement a CAS emulation and, from what I
understand from my colleagues, it doesn't want to.
-- Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org Software Architect - Intel System Software Products
Received on 2020-02-10 10:34:49