Date: Mon, 29 Dec 2025 22:01:24 +0000
On Mon, 29 Dec 2025, 14:18 Frederick Virchanza Gotham via Std-Proposals, <
std-proposals_at_[hidden]> wrote:
> On Mon, 29 Dec 2025, 1:35 am Jonathan Wakely wrote:
>
>>
>> You need -mcx16 to let GCC use that instruction, and even with that
>> option atomic ops will call into libatomic which decides at runtime whether
>> to use cmpxchg16b or not.
>>
>
>
> There are a few operating systems nowadays that refuse to boot if
> 'cmpxchg16b' is missing. MS-Windows has refused to boot without it since
> version 8.1, and so does openSUSE.
>
> 'cmpxchg16b' became commonplace 19 years ago. It doesn't make sense that a
> programmer in 2025 should be burdened by old hardware from 2006.
>
>
>
>
>> Why should this type have those conversions to/from integers when
>> std::atomic<void*> doesn't?
>>
>> Why would both signed and unsigned forms be needed?
>>
>
>
> Instead of dealing with a "pointer + pointer", the lockfree container
> might deal with "pointer + tag" or "pointer + counter". Hence the need for
> methods that work with integers. The counter might start at -1, hence
> intptr_t.
>
> I think it's clear what I need to do here:
>
> I need to edit the GNU g++ compiler to add a new command line option
> "-mcx16-force", but actually I will name it "-mlockfree2ptrs". When this
> command line option is given, the following boolean is true at compile time:
>
> atomic< __uint128_t >::is_always_lock_free
>
> And when you work with this type, the assembly is emitted inline (i.e. it
> doesn't call into libatomic).
> I will also add a second command line option,
> "-mlockfree2ptrs-main=main2". If you use this command line option, then the
> '_start' routine gets extra instructions as follows:
>
> mov $1, %eax
> cpuid
> bt $13, %ecx # CF = ECX[13]
> jnc main2
>
> So basically if 'cmpxchg16b' is not supported, it jumps into 'main2',
> where you can do something like:
>
> void main2(void)
> {
>     puts("Contact Stephen on stephen_at_[hidden] to get the build "
>          "you need for your system -- you need the x86_64 build without "
>          "atomic pointer pairs");
> }
>
> And then the last thing I would do is argue to the GNU decision makers
> that "-mlockfree2ptrs" should be the default, and that you should have to
> disable it with "-mno-lockfree2ptrs".
>
> To still be calling functions in libatomic for 128-bit numbers on x86_64
> going into 2026 is not good enough -- performance-critical algorithms are
> being slowed down on modern-day CPUs in order to accommodate old CPUs
> from 19 years ago. It's not good enough.
>
> In my work, it looks like I'll soon be tasked with a piece of software to
> 'speed up', and it runs on x86_64. Before I even look at the code, I think
> the first thing I'll do is re-build it with my "-mlockfree2ptrs" compiler
> and see if that makes it any faster.
>
> In fact, after building my own compiler, I'll have to rebuild my own
> compiler with my own compiler to make sure that libc and libstdc++ and so
> on also get forced 128-bit atomics (although I think maybe the GNU build
> system does this itself automatically -- I think it builds 3 times).
>
> Oh and just as an aside, every compiler uses something like this already
> in order to implement a lockfree std::atomic< std::shared_ptr<T> >.
>
No, they don't.

> But first things first, I will write "-mlockfree2ptrs" into the GNU g++
> compiler.
>
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>
Received on 2025-12-29 22:01:43
