std-proposals: Re: Slim mutexes and locks based on C++20 std::atomic::wait

From: Marko Mäkelä <marko.makela_at_[hidden]>
Date: Sat, 28 Aug 2021 17:50:12 +0300

Tue, Aug 24, 2021 at 09:32:17AM -0700, Thiago Macieira via Std-Proposals wrote:
>On Tuesday, 24 August 2021 00:01:53 PDT Marko Mäkelä via Std-Proposals wrote:
>> Good points. I thought that having this in the standard library would
>> create more pressure to operating system developers to provide some
>> futex-like functionality.
>
>Linux, Windows and Darwin have it, though for Darwin it doesn't appear to be
>documented. Your code has shown OpenBSD does too, something I didn't know.
>FreeBSD definitely has it for its Linux-compatibility layer, but I don't know
>if that has been exposed to the FreeBSD native ABI (just as eventfd hasn't).

Yes, FreeBSD should have it. Solaris might have it too; at one point it
implemented a Linux emulation mode.

>I don't think the standard library can force that much. We'd end up
>with the opposite: the futex functionality is emulated by way of locks.

That is already the status quo with C++20 std::atomic<T>::wait() and
std::atomic<T>::notify_one().

>An example is Linux itself: Linus is quite against extending the futex
>functionality to 64-bit values, so std::atomic<int64_t>::wait will
>likely remain emulated on Linux for a long time.

My proposal only implements alternatives to std::mutex and
std::shared_mutex that have a known memory footprint, plus an additional
atomic_shared_mutex::lock_update() mode that is like lock() but allows
concurrent lock_shared().

I just updated https://github.com/dr-m/atomic_sync to align the naming
with std::shared_mutex. I think that my atomic_recursive_shared_mutex is
too specific and complex to be included in the standard. It can serve as
an example of implementing recursion using atomic_shared_mutex.

>If this was your reason, please re-evaluate.

The C++ standard library has been extended to include things that make
programs more portable. That means less reinventing of the wheel and
less need to write operating-system-specific code.

I think that the ability to "embed" a lock inside a concurrent data
structure is a very compelling reason to have std::atomic-based
alternatives to std::mutex and std::shared_mutex. For example, if the
cache line size is 64 bytes, one 4-byte atomic_mutex can protect 15
other 4-byte elements located in the same cache line. As a byproduct of
acquiring the mutex, you will already have loaded the data into the L1
cache. (Not necessarily all of the data, but at least the "head" of the
protected data structure.)

Analogy: ISO 262 defines a standard set of bolts. ISO 5211 extends that
by defining bolt circle diameters (for mounting valves). Interoperable
components with standardized sizes make life easier.

Best regards,

Marko Mäkelä

Received on 2021-08-28 09:50:20