ISOCPP std-proposals List: [std-proposals] Return Value Optimisation whenever you need it (guaranteed elision)

From: Frederick Virchanza Gotham <cauldwell.thomas_at_[hidden]>
Date: Sun, 16 Jul 2023 00:58:03 +0100

We can return a mutex by value from a function as follows:

    std::mutex Func(void)
    {
        return std::mutex();
    }

We can do this even though std::mutex can be neither moved nor copied.
This is commonly called 'Return Value Optimisation', and since C++17
it's mandatory for the compiler to elide the move/copy operation when
a prvalue is returned like this.
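
As a quick sanity check of the above, here is a minimal sketch,
assuming a C++17 compiler. The names 'MakeMutex' and
'UseReturnedMutex' are made up for illustration. The static
assertions confirm that std::mutex has no usable copy or move
constructor, yet returning the prvalue still compiles:

```cpp
#include <cassert>
#include <mutex>
#include <type_traits>

// std::mutex can be neither copied nor moved.
static_assert(!std::is_copy_constructible_v<std::mutex>);
static_assert(!std::is_move_constructible_v<std::mutex>);

// Returning a prvalue: since C++17 the caller's object is initialised
// directly, so no copy or move constructor is needed.
std::mutex MakeMutex()
{
    return std::mutex();
}

// Hypothetical helper: uses the returned mutex to guard one increment.
int UseReturnedMutex()
{
    std::mutex m = MakeMutex();   // constructed in place, no temporary
    int counter = 0;
    {
        std::lock_guard<std::mutex> guard(m);   // lock, unlock on scope exit
        ++counter;
    }
    assert(counter == 1);
    return counter;
}
```

Calling UseReturnedMutex() from main is enough to exercise it.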

But let's change it a little:

    std::mutex Func(void)
    {
        std::mutex mtx;
        return mtx;
    }

Now it no longer compiles. What we have here is a candidate for 'Named
Return Value Optimisation'. In this circumstance, the compiler may
elide the move/copy operation if it wants to, but a usable move/copy
constructor is still required -- therefore it won't work with a
std::mutex.

So this raises the question: is it at all possible to return a locked
mutex by value from a function? Is it possible to somehow do the
following?

    std::mutex Func(void)
    {
        std::mutex mtx;
        mtx.lock();
        return mtx;
    }

In the System V x86_64 calling convention (which is used by Linux,
macOS and most other x86_64 operating systems apart from Microsoft
Windows), if a function returns by value a class that can't be
returned in registers (std::mutex is one such class), the caller
passes the address of the return slot in the RDI register and the
object is constructed there. So in theory, if a function wanted to
return a locked mutex by value, all it would need to do is:
(Step 1) Construct a mutex at the address pointed to by RDI
(Step 2) Lock the mutex at the address pointed to by RDI
(Step 3) Return from the function
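
The three steps can be mimicked in portable C++ by treating the return
slot as an explicit pointer parameter. This is only a sketch of the
idea, not what a compiler emits: 'FuncInto' and 'ReturnSlotDemo' are
names made up for illustration, and the local buffer stands in for the
storage that the caller would normally provide through RDI.

```cpp
#include <cassert>
#include <mutex>
#include <new>

// Hypothetical stand-in for a function that returns std::mutex by value:
// 'ret' plays the role of the return-slot address passed in RDI.
std::mutex* FuncInto(void* const ret)
{
    assert(ret != nullptr);
    std::mutex* const p = ::new (ret) std::mutex();  // Step 1: construct in the slot
    p->lock();                                       // Step 2: lock it there
    return p;                                        // Step 3: return
}

// Hypothetical caller: provides the slot, then checks that the object
// lives exactly where the slot is.
bool ReturnSlotDemo()
{
    alignas(std::mutex) unsigned char slot[sizeof(std::mutex)];
    std::mutex* const p = FuncInto(slot);
    const bool in_place =
        (static_cast<void*>(p) == static_cast<void*>(slot));
    p->unlock();   // valid because this thread owns the lock
    p->~mutex();   // the caller is responsible for destroying the object
    return in_place;
}
```

The mutex is never copied or moved; it is simply built and locked at
the address the caller chose.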

I'm going to make a little change to 'Func' which I wrote above. I'm
going to change it to the following:

    std::mutex Func( void(*const f)(std::mutex*) )
    {
        std::mutex mtx;
        f(&mtx);
        return mtx;
    }

And so now you can provide your own function to manipulate the object
however you wish. Let's keep it simple and write a lambda to lock the
mutex, and so then we would invoke 'Func' as follows:

    int main(void)
    {
        Func( [](std::mutex *const p) { p->lock(); } );
    }

Before I write the x86_64 assembler for 'Func', I'm going to write a
little helper function that will construct the mutex for me:

    void construct_mutex(void *const arg)
    {
        ::new(arg) std::mutex();
    }

(NB: The implementation of the constructor for std::mutex isn't to be
found in the libstdc++ dynamic shared library as it always gets
expanded inline, and so that's why I need the above helper function
instead of just directly invoking the constructor from assembler).

And so now I'll write the x86_64 assembler for 'Func'. If you look at
the function signature for 'Func', we know what will be stored where:
    RDI : Address of return value
    RSI : Address of the manipulator function (the lambda, converted
          to a function pointer)

Note that RDI and RSI are caller-saved registers, and so I'll push
them before every function call and then pop them again afterward.
Here's the GNU inline assembler for 'Func':

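__asm("Func: \n"
      ".intel_syntax noprefix \n"
      " push rdi \n"              // save address of return value
      " push rsi \n"              // save address of manipulator function
      " sub rsp, 8 \n"            // keep stack 16-byte aligned for the call
      " call construct_mutex \n"  // construct mutex at return value
      " add rsp, 8 \n"            // drop the alignment padding
      " pop rsi \n"               // restore rsi after call
      " pop rdi \n"               // restore rdi after call
      " push rdi \n"              // keep the address (also realigns the stack)
      " call rsi \n"              // call manipulator function
      " pop rax \n"               // ABI: return the address of the return value in RAX
      " ret \n"
      ".att_syntax");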

It's really that simple. We can now make use of 'Return Value
Optimisation' wherever we want it, and we can indeed return a locked
mutex by value. Check it out:

    https://godbolt.org/z/WWzWs6zEY

This is really easy to pull off on Linux and Mac. I haven't looked at
Microsoft, nor have I looked at aarch64 nor arm32. I wonder if it's as
easy on those architectures too.

But anyway, maybe we could make the elision of the copy/move operation
mandatory for NRVO by making a language change to allow the following
syntax:

    std::mutex Func(void) -> NRVO(mtx)
    {
        mtx.lock();
        return mtx;
    }

The compiler would treat the above function as though it had been written:

    std::mutex Func(void)
    {
        std::mutex mtx;
        mtx.lock();
        return mtx; // but with the guarantee of elision
    }
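
For comparison, something close to this can already be had in C++17
if a wrapper type is acceptable in place of a bare std::mutex: do the
locking in the wrapper's constructor and return a prvalue, whose
elision is guaranteed. The sketch below is only an illustration;
'LockedMutex', 'MakeLocked' and 'WrapperBuiltInPlace' are made-up
names, and it does not cover the general case of arbitrary code
running on a named local, which is what the proposed syntax is for.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical wrapper type, not part of the proposal. It is neither
// copyable nor movable because its std::mutex member is not.
struct LockedMutex {
    std::mutex mtx;
    const LockedMutex* constructed_at;   // records where the constructor ran

    LockedMutex() : constructed_at(this) { mtx.lock(); }
};

LockedMutex MakeLocked()
{
    return LockedMutex();   // prvalue: elision is guaranteed since C++17
}

// Hypothetical check: the object the caller names is the very object
// the constructor ran on, so nothing was copied or moved.
bool WrapperBuiltInPlace()
{
    LockedMutex lm = MakeLocked();
    const bool in_place = (lm.constructed_at == &lm);
    assert(in_place);
    lm.mtx.unlock();        // this thread owns the lock, so this is valid
    return in_place;
}
```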

I don't think it's reasonable that C++23 still can't return a locked
mutex by value. We should do something about this for C++26.

Received on 2023-07-15 23:58:13