Re: [std-proposals] std::atomic_pointer_pair

From: Andrey Semashev <andrey.semashev_at_[hidden]>
Date: Tue, 30 Dec 2025 01:28:27 +0300
On 30 Dec 2025 01:11, Jonathan Wakely wrote:
>
>
> On Mon, 29 Dec 2025, 14:53 Andrey Semashev via Std-Proposals,
> <std-proposals_at_[hidden]> wrote:
>
> > On 29 Dec 2025 17:17, Frederick Virchanza Gotham via Std-Proposals
> > wrote:
> > >
> > > I think it's clear what I need to do here:
> > >
> > > I need to edit the GNU g++ compiler to add a new command line option
> > > "-mcx16-force", but actually I will name it "-mlockfree2ptrs". When
> > > this command line option is given, the following boolean is true at
> > > compile time:
> > >
> > > atomic< __uint128_t >::is_always_lock_free
> > >
> > > And when you work with this type, the assembler is placed inline (i.e.
> > > it doesn't call into libatomic).
> >
> > I think, this should already be the case if you specify a recent enough
> > -march (or a sufficient combination of -m flags). It partly is with
> > clang, but not with gcc. I think, you should report bugs to compiler
> > devs.
>
> It's not a bug.
>
> -march says which instructions can be used for a given translation unit
> (or even for a given function within a translation unit) but the choice
> of how to implement atomic operations on a memory location is not local
> to a single function or a single TU.
>
> If one function uses cmpxchg16b to perform a read-modify-write operation
> on a variable and another function uses a lock, you have a problem.
> There is no requirement for all functions or all TUs to be compiled with
> the same -march option.
>
> So GCC will not use cmpxchg16b for __atomic_compare_exchange, it will
> make a call to libatomic for all TUs. Inside libatomic it will check the
> CPU flags at runtime and use cmpxchg16b if supported, but that means
> it's consistent for all TUs doing 16-byte __atomic_compare_exchange.
> Either all such calls use cmpxchg16b for a given process, or none do.
> The performance is fine, but the compile-time is_always_lock_free
> constant is false because it depends on runtime properties.

Thanks for the explanation. I still think there should be a way to tell
the compiler to just use cmpxchg16b inline, because that instruction is
part of my minimum CPU requirement. Even more so when the OS itself
requires the instruction before the program is ever run.

Ironically, one could try using is_always_lock_free to enforce that
hardware requirement at compile time, but with the current
implementation that wouldn't work.
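
To illustrate, here is a minimal sketch of the kind of check I mean,
assuming C++17. The names PointerPair and pair_is_always_lock_free are
hypothetical and used only for illustration; the result depends on the
compiler, target and flags.

```cpp
#include <atomic>

// Hypothetical two-pointer payload; alignas(16) gives it 16-byte size and
// alignment, which a double-width compare-and-swap would need.
struct alignas(16) PointerPair {
    void* first;
    void* second;
};

// Compile-time view only: true when the implementation promises lock-free
// operations for every object of this type, regardless of the CPU the
// program later runs on.
constexpr bool pair_is_always_lock_free()
{
    return std::atomic<PointerPair>::is_always_lock_free;
}

// The check one would like to write to state "this program requires a CPU
// with a 16-byte CAS". It is left commented out because, per the discussion
// above, GCC on x86-64 is expected to report false and fail the build.
// static_assert(pair_is_always_lock_free(),
//               "16-byte lock-free atomics are required");
```

Only the compile-time constant is queried here, so the sketch does not by
itself need libatomic. Any actual load, store or compare-exchange on
std::atomic<PointerPair> may still end up as a libatomic call with gcc.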

Received on 2025-12-29 22:28:31