std-proposals

Re: Slim mutexes and locks based on C++20 std::atomic::wait

From: Marko Mäkelä <marko.makela_at_[hidden]>
Date: Fri, 8 Oct 2021 22:48:35 +0300

On Tue, Sep 28, 2021 at 01:03:06PM -0700, Thiago Macieira wrote:
>glibc NPTL's pthread_mutex also has that. Which means std::mutex with
>either libstdc++ or libc++ can access it. It used to be a separate
>mutex type, which would require you to re-init the mutex behind
>std::mutex's back (a "void warranty" problem), but it looks like
>nowadays you can enable it by setting an environment variable. See
>https://www.gnu.org/software/libc/manual/html_node/Elision-Tunables.html

This week, I made some experiments with Intel RTM in the database kernel
that I work on. I also implemented the IBM HTM interface that glibc
appears to use and GCC exports on POWER, S/390 and zSeries, but I have
not tested that one yet.

The ARM TME does not appear to be really available yet. Because there
does not seem to be any equivalent to the IA-32 user-mode CPUID
instruction, feature detection is done by the operating system. The
Linux kernel does not define a TME feature flag yet. I do not even know
which hardware would support it.

It turns out that memory transactions have to be kept small, so that the
maximum number of cache lines will not be exceeded and that conflicts
will be unlikely enough. Trivially, an attempt to invoke a system call
would abort the memory transaction and lead to re-execution.

So, it is really useful to have fine-grained control over when to use a
memory transaction. Memory transactions help when the critical section
is small and we are not going to execute any system calls, not even
something as innocent as std::atomic::notify_one().

As far as I understand, lock elision based on transactional memory
depends on a predicate like atomic_mutex::is_locked(). That is one
benefit that my proposal can offer. Generally, such predicates are
intentionally withheld from mutex libraries, because checking whether a
mutex is locked is considered to be bad programming style, at most
something that could be allowed in assertions.
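
To illustrate why the predicate matters, here is a minimal sketch of the
elision pattern. It is not the code from my repository: sketch_mutex,
with_elision() and the tx_begin()/tx_end()/tx_abort() hooks are
hypothetical names made up for this example. The hooks are portable
stubs that never start a transaction, so the sketch compiles anywhere
and always takes the fallback path; on x86 with RTM they would
presumably map to _xbegin(), _xend() and _xabort() from <immintrin.h>.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical transaction hooks (assumptions, not a real API). On x86
// with RTM they would map to _xbegin() == _XBEGIN_STARTED, _xend() and
// _xabort(). These stubs never start a transaction.
inline bool tx_begin() noexcept { return false; }
inline void tx_end() noexcept {}
inline void tx_abort() noexcept {}  // real hardware rolls back to tx_begin()

// Hypothetical stand-in for a mutex that exposes is_locked().
class sketch_mutex {
  std::atomic<bool> locked_{false};
public:
  bool is_locked() const noexcept {
    return locked_.load(std::memory_order_acquire);
  }
  bool try_lock() noexcept {
    return !locked_.exchange(true, std::memory_order_acquire);
  }
  void lock() noexcept {
    while (locked_.exchange(true, std::memory_order_acquire)) {
#ifdef __cpp_lib_atomic_wait
      locked_.wait(true, std::memory_order_relaxed);  // C++20: block until it changes
#else
      std::this_thread::yield();                      // pre-C++20 fallback
#endif
    }
  }
  void unlock() noexcept {
    assert(is_locked());  // the predicate is also handy in assertions
    locked_.store(false, std::memory_order_release);
#ifdef __cpp_lib_atomic_wait
    locked_.notify_one();
#endif
  }
};

// Runs f() inside a memory transaction if one can be started, otherwise
// under the mutex. Returns true if the lock was elided.
template <class F>
bool with_elision(sketch_mutex& m, F&& f) {
  if (tx_begin()) {
    // Reading the lock word puts it into the transaction's read set, so
    // a concurrent lock() by another thread aborts this transaction.
    if (m.is_locked()) tx_abort();
    f();
    tx_end();
    return true;
  }
  std::lock_guard<sketch_mutex> guard(m);  // fallback: really acquire
  f();
  return false;
}
```

The elided path never calls lock() or unlock(), so it never reaches
notify_one(). Without is_locked(), the transaction would have no way to
subscribe to the lock word without writing to it.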

Motivated by a real use case (lock elision around changing some state
and broadcasting a condition variable), I implemented
atomic_condition_variable, which is a simple wrapper around
std::atomic<uint32_t> that counts the pending wait() calls. You can find
it at the usual location:

https://github.com/dr-m/atomic_sync/

I have not tested it in the real scenario yet. The idea would be to
replace a plain mutex and condition variable with atomic_mutex and
atomic_condition_variable.

By the way, it looks like it could also be beneficial to specify
spinning for individual acquisitions. That is, an optional Boolean
parameter to atomic_mutex::lock() could make sense.
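
A rough sketch of what such a parameter could look like follows. The
class name spin_option_mutex and the spin_rounds constant are
hypothetical, chosen only for this example, and this is not the
interface that is currently in the repository.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical mutex whose lock() lets each caller opt in to spinning.
class spin_option_mutex {
  std::atomic<bool> locked_{false};
  static constexpr unsigned spin_rounds = 50;  // arbitrary; would need tuning
public:
  bool try_lock() noexcept {
    // Test before test-and-set, to avoid needless cache line writes.
    return !locked_.load(std::memory_order_relaxed) &&
           !locked_.exchange(true, std::memory_order_acquire);
  }
  void lock(bool spin = false) noexcept {
    if (spin) {
      // Bounded busy-wait; a real implementation would likely issue a
      // CPU pause hint between attempts.
      for (unsigned i = 0; i < spin_rounds; ++i)
        if (try_lock()) return;
    }
    while (locked_.exchange(true, std::memory_order_acquire)) {
#ifdef __cpp_lib_atomic_wait
      locked_.wait(true, std::memory_order_relaxed);  // C++20: block until it changes
#else
      std::this_thread::yield();                      // pre-C++20 fallback
#endif
    }
  }
  void unlock() noexcept {
    locked_.store(false, std::memory_order_release);
#ifdef __cpp_lib_atomic_wait
    locked_.notify_one();
#endif
  }
};
```

Because the parameter has a default value, std::lock_guard and similar
wrappers would keep working, while a caller that knows the critical
section is short could write m.lock(true).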

Best regards,

        Marko

Received on 2021-10-08 14:48:44