std-discussion: Re: atomic_ref::required

From: Andrey Semashev <andrey.semashev_at_[hidden]>
Date: Tue, 11 Feb 2020 11:32:59 +0300

On 2020-02-11 10:17, Thiago Macieira wrote:
> On Monday, 10 February 2020 08:53:28 PST Andrey Semashev via Std-Discussion
> wrote:
>>> Is there any wording we can add to ask implementations to increase
>>> alignment so that it is lock-free? Or is it a QoI?
>>
>> I did not find one, therefore my confusion.
>
> Sorry, rephrasing: can you come up with the wording so that the
> required_alignment is suitable for lock_free operations?

Well, something along the lines of how I phrased my post. For example:

   static constexpr size_t required_alignment;

  1 Let A be the alignment required for an object to be referenced by
an atomic reference so that is_always_lock_free is true. If there is no
such alignment or A is less than alignof(T), required_alignment equals
alignof(T). Otherwise, required_alignment equals A.

  2 [Note: Hardware could require an object referenced by an atomic_ref
to have stricter alignment (6.7.6) than other objects of type T. For
example, lock-free operations on std::complex<double> could be supported
only if aligned to 2*alignof(double). — end note]
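
The selection rule in paragraph 1 can be sketched as a small constexpr
function. This is only an illustration: proposed_required_alignment and
its parameters are hypothetical names, and passing 0 for
lock_free_alignment is an assumed convention meaning "there is no such
alignment".

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of any standard library.
//   natural_alignment:   alignof(T)
//   lock_free_alignment: the alignment A at which operations on T would be
//                        lock-free, or 0 if no such alignment exists
//                        (assumed convention for this sketch).
constexpr std::size_t proposed_required_alignment(std::size_t natural_alignment,
                                                  std::size_t lock_free_alignment)
{
    // "If there is no such alignment or A is less than alignof(T),
    //  required_alignment equals alignof(T). Otherwise, it equals A."
    return (lock_free_alignment == 0 || lock_free_alignment < natural_alignment)
               ? natural_alignment
               : lock_free_alignment;
}

// alignof(T) == 8, lock-free needs 16: the stricter alignment wins.
static_assert(proposed_required_alignment(8, 16) == 16, "A is used");
// No lock-free alignment exists: fall back to alignof(T).
static_assert(proposed_required_alignment(8, 0) == 8, "alignof(T) is used");
// A is weaker than alignof(T): never go below alignof(T).
static_assert(proposed_required_alignment(8, 4) == 8, "alignof(T) is used");
```

The first static_assert corresponds to the std::complex<double> example
in the note, assuming alignof(double) is 8.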

In the note I removed the sentence saying that whether "operations on an
atomic_ref are lock-free could depend on the alignment of the referenced
object", because apparently it cannot. The is_lock_free() description
says it returns true or false for *all* objects, not just the ones
aligned to required_alignment.

However, what should we do with this wording? Should I file a defect
report or is it worth a formal proposal? I could write a proposal, but I
won't be able to present it to the committee.

>>> No, atomicity is guaranteed on x86 by the hardware, though with extremely
>>> poor performance, hence the suggestion to shoot with extreme prejudice.
>>> The very newest processors announced by Intel have a "Split Lock
>>> Detection" feature, which is not enabled by default and will cause a HW
>>> exception like you describe. What you're describing is a future feature,
>>> not current and past behaviour.
>>>
>>> The Linux kernel does not implement a CAS emulation and from what I
>>> understand from my colleagues, it doesn't want to.
>>
>> My understanding is that split lock detection allows the kernel to
>> handle the misaligned atomic operation. However, AFAIK, the requirement
>> for alignment has always been there, as documented in SDM. Older
>> processors simply crash with GP exception on alignment violation.
>
> It depends on what instruction we're talking about. As you've discovered,
> CMPXCHG16B does enforce a 16-byte alignment. And since it's the only 16-byte
> atomic instruction, any 16-byte atomic operation on x86-64 requires the
> alignment. Any other access will simply #GP(0).
>
> But this does not apply to other instructions. For all others, the HW does and
> has always done (since at least i386) unaligned atomic operations, silently.

Unfortunately, this is not true. Loads and stores are normally
implemented with a MOV, and the SDM only guarantees atomicity (i.e. a
single memory load or store) for naturally aligned memory locations (see
SDM Section 4.1.1 Alignment of Words, Doublewords, Quadwords, and Double
Quadwords). In other words, the operation silently becomes non-atomic
for unaligned locations. This concerns all operand sizes from 2 to 8
bytes, on both x86-32 and x86-64.
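
A minimal sketch of the natural-alignment condition, assuming that
converting a pointer to std::uintptr_t yields the numeric address (the
mapping is implementation-defined in general, though this holds on
mainstream x86 compilers). is_naturally_aligned is a hypothetical
helper, not an existing API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true if the address p is a multiple of
// size. For the power-of-two operand sizes discussed here (2, 4, 8
// bytes) this is the "natural alignment" condition.
inline bool is_naturally_aligned(const void* p, std::size_t size)
{
    return size != 0 && reinterpret_cast<std::uintptr_t>(p) % size == 0;
}
```

For example, with an 8-byte-aligned buffer buf, buf + 4 passes the check
for a 4-byte operand but fails it for an 8-byte one.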

> The new thing is that Ice Lake Server and Tiger Lake can detect the situation
> and trap with an #AC (not #GP(0)).
>
> That's the same trap as would be produced by enabling the AC bit in the EFLAGS
> register. But since that flag applies to all accesses, not just atomic ones, no
> one ever sets it. The CLAC and STAC instructions are basically unused, so they
> got repurposed for SMAP a few years ago. The new setting applies only to
> atomic accesses and cannot be disabled by user space.
>
> In any case, CMPXCHG16B is a good example. We want that
> atomic_ref<16bytes>::required_alignment be 16 on x86-64.

Agreed.
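
As a rough sketch of what this means for user code: a 16-byte type can
be over-aligned so that its objects would satisfy a required_alignment
of 16. The type name pair16 is hypothetical. The atomic_ref check is
guarded by the feature-test macro, since it needs a C++20 library, and
it only asserts what the standard already guarantees (that
required_alignment is at least alignof(T)), not any particular
implementation's value.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 16-byte payload, explicitly aligned to 16 bytes so that
// it is suitable for an instruction such as CMPXCHG16B.
struct alignas(16) pair16
{
    std::uint64_t lo;
    std::uint64_t hi;
};

static_assert(sizeof(pair16) == 16, "two 64-bit halves, no padding");
static_assert(alignof(pair16) == 16, "over-aligned by alignas(16)");

#if defined(__cpp_lib_atomic_ref)
// Guaranteed by the standard: required_alignment is at least alignof(T).
static_assert(std::atomic_ref<pair16>::required_alignment >= alignof(pair16),
              "required_alignment is never weaker than alignof(T)");
#endif
```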

Received on 2020-02-11 02:35:41