Date: Fri, 9 Jan 2026 12:31:56 +0000
On Fri, Jan 9, 2026 at 11:58 AM Thiago Macieira wrote:
>
> > In choosing the name for the type "atomic_pointer_pair", the 'pointer'
> > part really just means "the register width". That's why the member
> > functions of atomic_pointer_pair will allow you to access either of
> > the pointers as intptr_t, so that when you're writing a lockfree
> > container you can have "pointer + counter".
>
> And why wouldn't a plain struct of the same size suffice?
Looking at GNU g++, LLVM clang, Intel ICX, Microsoft MSVC:
https://godbolt.org/z/xxMra1EPh
Only Intel have got this right. The other three use some sort of lock
(which presumably is located in a global container somewhere --
hopefully without dynamic allocation).
So today in 2026, if you use a simple two-pointer struct, you'll end
up with a slow program (except if you use the Intel compiler).
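A minimal sketch of how to ask your own toolchain the same question at compile time, assuming C++17. The names `pair_always_lock_free`, `pointer_always_lock_free`, `describe` and `report` are made up for illustration; only `std::atomic<T>::is_always_lock_free` comes from the standard library. The answer depends on the compiler and flags, so no particular output is claimed here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>

// Hypothetical plain two-pointer struct (name borrowed from this thread).
struct TwoPointers {
    void *p, *q;
};

// Compile-time answers (C++17): true only if the atomic is lock-free on
// every CPU this binary is allowed to run on.
constexpr bool pair_always_lock_free =
    std::atomic<TwoPointers>::is_always_lock_free;
constexpr bool pointer_always_lock_free =
    std::atomic<void*>::is_always_lock_free;

// Hypothetical helper that turns the answer into readable text.
constexpr const char *describe(bool lock_free) {
    return lock_free ? "lock-free" : "may use a lock";
}

// Prints one line per type; the result depends on compiler and flags.
inline void report() {
    std::printf("atomic<void*>:       %s\n", describe(pointer_always_lock_free));
    std::printf("atomic<TwoPointers>: %s\n", describe(pair_always_lock_free));
}
```

Only the compile-time constant is queried, so the sketch never calls into a runtime atomics support library.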
So I suppose there are two things to talk about:
1) There's a quality-of-implementation issue with the other three compilers.
2) Maybe we should have a type called 'lockfree_pointer_pair' that
is always lockfree, meaning that it uses CPU instructions to achieve
lockfreeness. If you run such a binary on a computer which is missing
the lockfree instruction, you get SIGILL for Illegal Instruction.
I could wait around for the other 3 compilers to pull their socks up,
and I can patch the GNU g++ compiler myself (I'm already tweaking the
implementation of libat_is_lock_free), but in the meantime I want a
class something like 'lockfree_pointer_pair'.
Another idea: if you try to use lockfree_pointer_pair on a target
missing the required CPU instructions, then you get a compile-time
error. However, if you define __cplusplus_fake_lockfree, then here's
what you get:
    struct TwoPointers {
        void *p, *q;
        std::atomic_flag f;
    };
And so then the 'set' method would be something like:
    std::pair<void*,void*> set( void *const parg, void *const qarg )
    {
        while ( f.test_and_set() );  // spin until we own the flag
        std::pair<void*,void*> const previous = { p, q };
        p = parg;
        q = qarg;
        f.clear();                   // release the flag
        return previous;
    }
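A self-contained sketch of that spin-lock fallback with both 'set' and 'get', assuming C++17. The class name `SpinLockedPair`, the private `lock`/`unlock` helpers and `demo` are hypothetical, and the explicit acquire/release orderings are my choice here rather than something settled in this thread.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical fallback: two pointers guarded by a per-object spin lock.
class SpinLockedPair {
    void *p = nullptr;
    void *q = nullptr;
    // ATOMIC_FLAG_INIT gives a clear flag before C++20 as well.
    std::atomic_flag f = ATOMIC_FLAG_INIT;

    // Spin until the flag was previously clear, i.e. we now own it.
    void lock() { while (f.test_and_set(std::memory_order_acquire)) {} }
    // Publish our writes and let the next thread in.
    void unlock() { f.clear(std::memory_order_release); }

public:
    // Stores both pointers as one unit and returns the previous values.
    std::pair<void*, void*> set(void *const parg, void *const qarg) {
        lock();
        std::pair<void*, void*> const previous{p, q};
        p = parg;
        q = qarg;
        unlock();
        return previous;
    }

    // Reads both pointers as one unit.
    std::pair<void*, void*> get() {
        lock();
        std::pair<void*, void*> const current{p, q};
        unlock();
        return current;
    }
};

// Usage sketch: the first 'set' returns the initial null pointers.
inline void demo() {
    int a = 1, b = 2;
    SpinLockedPair pp;
    auto const old = pp.set(&a, &b);
    assert(old.first == nullptr && old.second == nullptr);
    auto const cur = pp.get();
    assert(cur.first == &a && cur.second == &b);
}
```

The flag lives inside each object, so two unrelated pairs never contend with each other.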
I reckon that "spin loop" is a lot better than a global array of
mutexes indexed by a memory address. But if you really want a mutex
then it could be:
    struct TwoPointers {
        void *p, *q;
        std::shared_mutex m;

        std::pair<void*,void*> set( void *const parg, void *const qarg )
        {
            std::lock_guard mylock(m);   // exclusive lock
            std::pair<void*,void*> const previous = { p, q };
            p = parg;
            q = qarg;
            return previous;
        }

        std::pair<void*,void*> get()
        {
            std::shared_lock mylock(m);  // lock only for reading
            return { p, q };
        }
    };
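For comparison, here is a rough sketch of what I mean by "a global array of mutexes indexed by a memory address". This is an illustration only, not the code of any real runtime library: `PlainPair`, `kLockCount`, `lock_index`, `lock_for`, `exchange_pair` and `demo` are all hypothetical names, and the table size of 64 is an arbitrary assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical plain struct with no lock of its own.
struct PlainPair {
    void *p, *q;
};

constexpr std::size_t kLockCount = 64;  // arbitrary table size (assumption)

// Map an object's address to a slot in the shared lock table.
inline std::size_t lock_index(const void *addr) {
    auto const bits = reinterpret_cast<std::uintptr_t>(addr);
    return (bits >> 4) % kLockCount;  // skip the low bits, then wrap
}

// One process-wide table; unrelated objects may share a mutex.
inline std::mutex &lock_for(const void *addr) {
    static std::array<std::mutex, kLockCount> table;
    return table[lock_index(addr)];
}

// Swap in a new pair under the mutex chosen by the target's address.
inline PlainPair exchange_pair(PlainPair &target, PlainPair desired) {
    std::lock_guard<std::mutex> guard(lock_for(&target));
    PlainPair const previous = target;
    target = desired;
    return previous;
}

// Usage sketch.
inline void demo() {
    int x = 0;
    PlainPair t{nullptr, nullptr};
    PlainPair const old = exchange_pair(t, PlainPair{&x, nullptr});
    assert(old.p == nullptr);
    assert(t.p == &x);
}
```

Two objects whose addresses hash to the same slot contend on one mutex even though they are unrelated, which is the cost the per-object flag avoids.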
Yeah, it's looking like I'm going to try to write fully portable,
standard-compliant code for 'lockfree_pointer_pair' that will work
everywhere.
Received on 2026-01-09 12:32:10
