Date: Tue, 11 Feb 2020 08:18:13 -0800
On Tuesday, 11 February 2020 00:32:59 PST Andrey Semashev via Std-Discussion
wrote:
> Well, something along the lines of how I phrased my post. For example:
>
> static constexpr size_t required_alignment;
>
> 1 Let A be the alignment required for an object to be referenced by
> an atomic reference so that is_always_lock_free is true. If there is no
> such alignment or A is less than alignof(T), required_alignment equals
> alignof(T). Otherwise, required_alignment equals A.
>
> 2 [Note: Hardware could require an object referenced by an atomic_ref
> to have stricter alignment (6.7.6) than other objects of type T. For
> example, lock-free operations on std::complex<double> could be supported
> only if aligned to 2*alignof(double). — end note]
>
> In the note I removed the sentence saying that whether "operations on an
> atomic_ref are lock-free could depend on the alignment of the referenced object"
> because apparently they couldn't. is_lock_free() description says it
> returns true or false for *all* objects, not just the ones aligned to
> required_alignment.
Looks good to me.
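
To make the proposed wording concrete, here is a minimal sketch of how user code would consume required_alignment. It is an illustration only: Pair, AlignedPair, pair_alignment and is_suitably_aligned are hypothetical names that do not come from the thread, Pair merely stands in for std::complex<double>, and the pre-C++20 fallback branch is an assumption added so the sketch still compiles without atomic_ref.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical 16-byte payload standing in for std::complex<double>.
struct Pair {
    double re;
    double im;
};

#if defined(__cpp_lib_atomic_ref)
// Alignment an object must have before std::atomic_ref<Pair> may refer to it.
constexpr std::size_t pair_alignment = std::atomic_ref<Pair>::required_alignment;
#else
// Assumption for this sketch only: without C++20 atomic_ref, use alignof.
constexpr std::size_t pair_alignment = alignof(Pair);
#endif

// Under both the current and the proposed wording, required_alignment is
// never weaker than the type's own alignment.
static_assert(pair_alignment >= alignof(Pair),
              "required_alignment must be at least alignof(T)");

// Over-align the storage so that an atomic_ref may legally refer to it.
struct alignas(pair_alignment) AlignedPair {
    Pair value;
};

// Checks whether an address satisfies the alignment atomic_ref asks for.
inline bool is_suitably_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % pair_alignment == 0;
}
```

Whether required_alignment comes out stricter than alignof(Pair) depends on the implementation; the sketch only relies on it being at least alignof(Pair).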
> However, what should we do with this wording? Should I file a defect
> report or is it worth a formal proposal? I could write a proposal, but I
> won't be able to present it to the committee.
I don't know. My feeling is that this is just a little too big for a DR.
> > But this does not apply to other instructions. For all others, the HW does
> > and has always done (since at least i386) unaligned atomic operations,
> > silently.
> Unfortunately, this is not true. Loads and stores are normally
> implemented with a MOV, and SDM only guarantees atomicity (i.e. a single
> memory load or store) for naturally aligned memory locations (see SDM
> Section 4.1.1 Alignment of Words, Doublewords, Quadwords, and Double
> Quadwords). In other words, the operation silently becomes non-atomic
> for unaligned locations. This concerns all operand sizes from 2 to 8
> bytes, on both x86-32 and x86-64.
That is true of loads and stores done with MOV, but you can perform a load with LOCK XADD 0 and you
can store with XCHG, up to register-sized locations. The point is that the
LOCKed instructions (other than LOCK CMPXCHG16B, but apparently including LOCK
CMPXCHG8B) do not require alignment.
I'm not saying that compilers should generate these instructions or that they
should even support unaligned atomics. I'm saying the hardware does, with
suitably horrible performance, for legacy reasons.
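
For readers who want the C++-level analogue of "load with LOCK XADD 0, store with XCHG", here is a rough sketch. The helper names load_via_rmw and store_via_xchg are hypothetical, and the sketch uses an ordinary, properly aligned std::atomic<int>; it does not demonstrate unaligned access. On x86, compilers commonly lower fetch_add to LOCK XADD and exchange to XCHG, but that is an assumption about code generation, and a compiler is free to pick other instructions.

```cpp
#include <atomic>

// Load expressed as a read-modify-write: add zero, return the prior value.
inline int load_via_rmw(std::atomic<int>& a) {
    return a.fetch_add(0, std::memory_order_seq_cst);
}

// Store expressed as an exchange whose returned old value is discarded.
inline void store_via_xchg(std::atomic<int>& a, int v) {
    (void)a.exchange(v, std::memory_order_seq_cst);
}
```

Both helpers behave like a plain load() and store() as far as the abstract machine is concerned; only the instruction selection may differ.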
-- Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org Software Architect - Intel System Software Products
Received on 2020-02-11 10:20:54