Date: Sun, 21 Jul 2024 18:23:58 -0400
All,
Long time reader, first time writer here. I am working on a sizable project
where we have traditionally provided custom-built abstractions
(pre-C/C++11) for atomic operations. In particular, we are trying to use
LL/SC instructions (load-linked/store-conditional) for some data structures
where available, since they provide stronger guarantees than CAS (avoiding
the ABA problem).
Neither C11 nor C++11 provide a primitive that lets developers utilize
these primitives directly so we are using inline assembly and
configure-time detection. I'd like to get to a point where we don't have to
write inline assembly anymore to utilize this feature.
I wonder whether there have been discussions on adding an abstraction to
the standard that exposes LL/SC capabilities where available? I looked
through papers in the last ten years but nothing struck me as relevant. So
I wonder whether there is interest in adding a facility to utilize these
instructions? Apologies if this has been discussed before, I'd appreciate a
pointer to the discussion/proposal if it has.
Here is my approach: Since LL/SC is a load and a (conditional) store that
work in tandem with potentially arbitrary code in between (vs a single CAS
instruction or well-defined operations if std::atomic is implemented using
LL/SC) they should be encapsulated into one abstraction, like:
```
struct lifo_t {
std::atomic<lifo_element_t*> head;
};
lifo_element_t* new_elem = ...;
/* load current head and exchange with new */
while (!std::atomic_load_store(lifo->head,
[&](list_element_t* elem) {
new_elem->next = elem; // link
return new_elem;
}))
{ /* keep trying if the store fails */ }
```
Both the load and the store will happen inside std::atomic_load_store and
the provided callable takes the loaded value and returns the value to be
stored in its place. If the conditional-store succeeds,
std::atomic_load_store would return true, false otherwise. Similar to
std::atomic_compare_exchange_weak, the store may fail spuriously. If
std::atomic_load_store succeeds, we are sure that no write has happened to
the atomic object in between the load and the store. This avoids the ABA
problem we have with CAS and so we do not need to work on 128bit values (on
64bit machines). Beyond the case I'm describing here, this API could enable
custom floating point atomic operations for example.
I'm not sure that this API could be implemented everywhere but it would
expose an important capability on architectures where it can be implemented
(LL/SC is available, and maybe transactional memory?) so it would have to
be guarded by a preprocessor macro like (__cpp_atomic_load_store) or a
trait (std::have_atomic_load_store<T>) and applications would have to
provide a fallback implementation.
I'm happy to write a full proposal if there is interest in the community
but would probably need some help to get started.
I appreciate any feedback on this.
Thanks
Joseph
Long time reader, first time writer here. I am working on a sizable project
where we have traditionally provided custom-built abstractions
(pre-C/C++11) for atomic operations. In particular, we are trying to use
LL/SC instructions (load-linked/store-conditional) for some data structures
where available, since they provide stronger guarantees than CAS (avoiding
the ABA problem).
Neither C11 nor C++11 provide a primitive that lets developers utilize
these primitives directly so we are using inline assembly and
configure-time detection. I'd like to get to a point where we don't have to
write inline assembly anymore to utilize this feature.
I wonder whether there have been discussions on adding an abstraction to
the standard that exposes LL/SC capabilities where available? I looked
through papers in the last ten years but nothing struck me as relevant. So
I wonder whether there is interest in adding a facility to utilize these
instructions? Apologies if this has been discussed before, I'd appreciate a
pointer to the discussion/proposal if it has.
Here is my approach: Since LL/SC is a load and a (conditional) store that
work in tandem with potentially arbitrary code in between (vs a single CAS
instruction or well-defined operations if std::atomic is implemented using
LL/SC) they should be encapsulated into one abstraction, like:
```
struct lifo_t {
std::atomic<lifo_element_t*> head;
};
lifo_element_t* new_elem = ...;
/* load current head and exchange with new */
while (!std::atomic_load_store(lifo->head,
[&](list_element_t* elem) {
new_elem->next = elem; // link
return new_elem;
}))
{ /* keep trying if the store fails */ }
```
Both the load and the store will happen inside std::atomic_load_store and
the provided callable takes the loaded value and returns the value to be
stored in its place. If the conditional-store succeeds,
std::atomic_load_store would return true, false otherwise. Similar to
std::atomic_compare_exchange_weak, the store may fail spuriously. If
std::atomic_load_store succeeds, we are sure that no write has happened to
the atomic object in between the load and the store. This avoids the ABA
problem we have with CAS and so we do not need to work on 128bit values (on
64bit machines). Beyond the case I'm describing here, this API could enable
custom floating point atomic operations for example.
I'm not sure that this API could be implemented everywhere but it would
expose an important capability on architectures where it can be implemented
(LL/SC is available, and maybe transactional memory?) so it would have to
be guarded by a preprocessor macro like (__cpp_atomic_load_store) or a
trait (std::have_atomic_load_store<T>) and applications would have to
provide a fallback implementation.
I'm happy to write a full proposal if there is interest in the community
but would probably need some help to get started.
I appreciate any feedback on this.
Thanks
Joseph
Received on 2024-07-21 22:24:12