Date: Mon, 29 Dec 2025 00:59:12 +0000
While trying to implement std::atomic< std::chimeric_ptr< Ts... > >, I
wanted to save myself as much work as possible. At first I thought the
compiler might be smart enough to make the following lockfree:
std::atomic< std::pair<void*,void*> >
but it turns out that it's not lockfree. So then I tried something even
simpler:
std::atomic< __uint128_t >
Amazingly, this wasn't lockfree either. I'm using the latest 'trunk'
GNU g++ compiler on x86_64 Linux, a platform whose CPUs have an instruction
(cmpxchg16b, compare and swap 16 bytes) that makes this possible. The
aarch64 instruction set can do this too. I think even the old 32-bit x86
CPUs could do an atomic compare and swap on 8 bytes.
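For anyone who wants to repeat the check, here is a minimal sketch of how the lockfree property can be queried at compile time. It assumes C++17 (for is_always_lock_free). The names two_pointers and always_lock_free are invented for this illustration, and a plain struct stands in for std::pair so that the trivially-copyable requirement of std::atomic is certainly met. What it reports depends on the compiler, the target flags and the standard library.

```cpp
#include <atomic>

// Hypothetical stand-in for "two pointers accessed together": a plain
// aggregate, so it is trivially copyable as std::atomic<T> requires.
struct alignas(2 * sizeof(void*)) two_pointers {
    void* first;
    void* second;
};

// True only if the implementation guarantees at compile time that every
// std::atomic<T> object is lock free (C++17).
template <typename T>
constexpr bool always_lock_free() noexcept {
    return std::atomic<T>::is_always_lock_free;
}

// The pair occupies exactly two pointers and is aligned to its own size.
static_assert(sizeof(two_pointers) == 2 * sizeof(void*), "no padding expected");
static_assert(alignof(two_pointers) == 2 * sizeof(void*), "aligned to its size");
```

On a g++ setup like the one described above, always_lock_free<two_pointers>() would be expected to yield false, while always_lock_free<void*>() yields true.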
Most kinds of lockfree container will need to access two pointers together
atomically, and I don't think every implementor should have to re-invent
the wheel here. I'm thinking that the C++ standard really needs a lockfree
type called "std::atomic_pointer_pair", which internally would be like this:
struct atomic_pointer_pair {
    alignas( 2u * sizeof(void*) ) void *p1;
    void *p2;
};
It would have a 'set' member function like:
std::pair<void*,void*> set(void *a, void *b);
Return value: the previous pair of values
Also, to make it easy to use integers instead of pointers, the 'set' and
'get' functions would come in nine forms:
pair<void*,void*> get(void);
pair<void*,uintptr_t> get_pu(void);
pair<void*, intptr_t> get_pi(void);
pair<uintptr_t,void*> get_up(void);
pair< intptr_t,void*> get_ip(void);
pair< intptr_t, intptr_t> get_ii(void);
pair< intptr_t,uintptr_t> get_iu(void);
pair<uintptr_t, intptr_t> get_ui(void);
pair<uintptr_t,uintptr_t> get_uu(void);
Similarly, the 'set' method would have nine forms.
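To illustrate how the nine getters relate to each other, here is a rough sketch in which every form is derived from the raw (void*, void*) pair by a cast. This is only one possible layering, not part of the proposal. The names word_as, view_as and view_demo are hypothetical, and the sketch assumes C++17 and a platform that provides std::uintptr_t and std::intptr_t.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical helper: view one stored word as a pointer or as an integer.
template <typename To>
To word_as(void* word) noexcept {
    static_assert(std::is_same_v<To, void*> ||
                  std::is_same_v<To, std::uintptr_t> ||
                  std::is_same_v<To, std::intptr_t>,
                  "only void*, uintptr_t and intptr_t are supported");
    if constexpr (std::is_same_v<To, void*>) {
        return word;
    } else {
        return reinterpret_cast<To>(word);
    }
}

// Convert the raw pair into any of the nine (A, B) combinations.
template <typename A, typename B>
std::pair<A, B> view_as(std::pair<void*, void*> raw) noexcept {
    return {word_as<A>(raw.first), word_as<B>(raw.second)};
}

// Usage example. Storing an integer in a pointer-sized word and reading it
// back is implementation-defined, but works on mainstream platforms.
inline void view_demo() {
    int x = 0;
    std::pair<void*, void*> raw(&x, reinterpret_cast<void*>(std::uintptr_t{42}));
    auto pu = view_as<void*, std::uintptr_t>(raw);          // like get_pu()
    assert(pu.first == &x && pu.second == 42u);
    auto ui = view_as<std::uintptr_t, std::intptr_t>(raw);  // like get_ui()
    assert(ui.first == reinterpret_cast<std::uintptr_t>(&x) && ui.second == 42);
}
```

The 'set' forms would be the same conversion in the opposite direction, applied to the arguments before the atomic exchange.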
On platforms that have a CPU instruction to do atomic compare-and-swap on
two pointers (for example x86_64: cmpxchg16b), sizeof(atomic_pointer_pair)
will be 2 * sizeof(void*).
If a platform genuinely can't do atomic operations on two pointers
together, then either:
a) Force a compiler error
or:
b) Put an atomic_flag inside the struct -- but then
sizeof(atomic_pointer_pair) would no longer be equal to 2*sizeof(void*)
Maybe a compiler flag would be suitable here, something like
"-femulate-lockfree". Without the flag, compilation fails if it cannot be
done lockfree. With the flag, sizeof(atomic_pointer_pair) is at least
(2u*sizeof(void*)+sizeof(atomic_flag)), plus any padding that the alignment
requires.
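As a rough sketch of what option (b) might look like, the class below guards the two pointers with a spin lock built from std::atomic_flag. The names emulated_pointer_pair and emulated_demo are hypothetical, and this emulation is by construction not lockfree. It is only meant to show the 'set returns the previous value' behaviour and the effect on sizeof.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical emulated variant (option b): a spin lock made from
// std::atomic_flag protects the two pointers. Not lock free.
class emulated_pointer_pair {
public:
    emulated_pointer_pair() = default;
    emulated_pointer_pair(void* a, void* b) noexcept : p1_(a), p2_(b) {}

    // Store (a, b) as one unit and return the previous pair.
    std::pair<void*, void*> set(void* a, void* b) noexcept {
        lock();
        std::pair<void*, void*> previous(p1_, p2_);
        p1_ = a;
        p2_ = b;
        unlock();
        return previous;
    }

    // Read both pointers as one consistent snapshot.
    std::pair<void*, void*> get() noexcept {
        lock();
        std::pair<void*, void*> current(p1_, p2_);
        unlock();
        return current;
    }

private:
    void lock() noexcept {
        while (guard_.test_and_set(std::memory_order_acquire)) {
            // Busy-wait until the current holder clears the flag.
        }
    }
    void unlock() noexcept { guard_.clear(std::memory_order_release); }

    alignas(2 * sizeof(void*)) void* p1_ = nullptr;
    void* p2_ = nullptr;
    std::atomic_flag guard_ = ATOMIC_FLAG_INIT;
};

// The flag, plus padding up to the alignment, makes the object larger.
static_assert(sizeof(emulated_pointer_pair) > 2 * sizeof(void*), "grows");

// Usage example: set() hands back whatever was stored before.
inline void emulated_demo() {
    int x = 1, y = 2;
    emulated_pointer_pair pp(&x, nullptr);
    auto old = pp.set(&y, &x);
    assert(old.first == &x && old.second == nullptr);
    assert(pp.get().first == &y && pp.get().second == &x);
}
```

Because the first member is over-aligned, the compiler rounds the object size up to a multiple of that alignment, so the emulated object ends up larger than the bare sum of its members.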
Last thing:
If we're dealing with an architecture that has data pointers which are
different to code pointers (e.g. different size, different alignment), then
'std::atomic_pointer_pair' uses the larger size and the stricter alignment
of the two.
Previously I said that the getters and setters would have nine forms, but
actually there might be double that number to accommodate function pointers.
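The 'bigger or more-strictly-aligned' rule could be computed along these lines. This is only a sketch: any_function_pointer, slot_size, slot_align and pointer_slot are names made up for the illustration, and it assumes C++17.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical representative of "a code pointer".
using any_function_pointer = void (*)();

// Each slot must be able to hold either kind of pointer.
inline constexpr std::size_t slot_size =
    std::max(sizeof(void*), sizeof(any_function_pointer));
inline constexpr std::size_t slot_align =
    std::max(alignof(void*), alignof(any_function_pointer));

// Raw storage for one of the two slots in the pair.
struct alignas(slot_align) pointer_slot {
    unsigned char bytes[slot_size];
};

static_assert(sizeof(pointer_slot) >= sizeof(void*), "fits a data pointer");
static_assert(sizeof(pointer_slot) >= sizeof(any_function_pointer),
              "fits a function pointer");
```

On common platforms where both kinds of pointer have the same size and alignment, pointer_slot is simply as large as one void*.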
Received on 2025-12-29 00:59:14
