ISOCPP std-proposals List: [std-proposals] PRvalue Parameters (and possibly relocation?)

From: Frederick Virchanza Gotham <cauldwell.thomas_at_[hidden]>
Date: Wed, 5 Jun 2024 13:44:08 +0100

On Wed, Jun 5, 2024 at 12:02 PM Lénárd Szolnoki wrote:
>
> No, that's not what I am saying. In your paper optional::emplace is
> implemented the following way:
>
> template<typename T>
> T &optional<T>::emplace(T &&&arg)
> {
> this->reset();
> T &retval = *::new(this->buffer) T( arg() );
> this->has_value_bool = true;
> return retval;
> }
>
> This is what happens when called with the "deferred" or "lazy" argument
> `std::move(*opt_str) + " World!`.

The expression:

    std::move(*opt_str) + " World!"

is a function call as follows:

    operator+( std::move(*opt_str), " World!" );

and it returns a PRvalue of type 'std::string'.

As I wrote in my paper under the header "x86_64 Linux implementation",
the compiler must emit a helper function that moves the arguments from
memory to registers, as follows:

    Helper_Function: ; address of return slot is in RDI
         mov rax, rsi ; address of arguments in memory
         mov rsi, [rax+0] ; move address of "*opt_str" from
memory to register
         mov rcx, [rax+8] ; move address of " World!" from
memory to register
         jmp operator+(string&&,char*)

More on this later. . .

> 1. this->reset()
> After this the optional doesn't hold a value anymore.
>
> 2. T& retval = *::new(this->buffer) T( arg() );
> Here `arg()` evaluates `std::move(*opt_str) + " World!`. But `opt_str`
> has no value anymore, therefore `*opt_str` is UB.

The arguments to the-function-which-returns-the-PRvalue get evaluated
before the 'emplace' method is invoked. That is to say, the arguments
to the "operator+" function get evaluated before the 'emplace' method
is invoked. Looking at your original 'foo' function again:

    void foo(std::optional<std::string>& opt_str)
    {
        if ( opt_str.has_value() )
       {
            opt_str.emplace( std::move(*opt_str) + " World!" );
       }
    }

After PRvalue Parameters have been added to the language, the x86_64
assembler on Linux for 'foo' will be something like as follows. (Note
on Linux that the first 3 arguments go in RDI, RSI, RCX, and the
return value goes in RAX).

  Foo: ; address of opt_str is in RDI
        push rdi ; save address of opt_str for later
        call optional<string>::has_value ; returned boolean goes in RAX
        test rax, rax ; check if has_value returned false
        jnz has_value_is_true ; if nonzero jump to has_value_is_true
        pop rdi ; just to restore stack pointer
        ret ; return from function
    has_value_is_true:
        mov rdi, [rsp] ; set 1st param = address of opt_str
        call optional<string>::operator* ; return value goes in RAX
        pop rdi ; address of opt_str in 1st argument
position again
        push "World!" ; 2nd argument wrapped to helper function is "World!"
        push rax ; 1st argument wrapped to helper function
is address of *opt_str
        mov rsi, Helper_Function ; set 2nd argument to address of
Helper_Function
        mov rcx, rsp ; set 3rd argument to current stack pointer
        call optional<string>::emplace(string&&&)
        sub rsp, 16 ; just to restore the stack
        ret

Okay . . . having written this out . . . . I can see now where the problem is:
    The address of '*opt_str' is passed to the Helper_Function, and it
is the 'emplace' method that will invoke the Helper_Function, but the
'emplace' method will invoke "this->reset()" before invoking
Helper_Function, meaning that Helper_Function will be given the
address of a string that has already been destroyed. I wonder if
there's a remedy to this. Specifically the problem is as follows:
    (1) If we destroy the string before calling the helper, then we
get a segfault
    (2) If we don't destroy the string before calling the helper, then
we get a memory leak

I don't think that No. 1 can be remedied at all -- we can't query an
object after its memory has been deallocated. But I think that No. 2
can be remedied if we relocate the object in order to destroy it later
at a different location. So the chain of events inside the 'emplace'
method would be:
    (1) Relocate "*opt_str" to another memory address
    (2) Invoke the helper function (thus overwriting the string in-place)
    (3) Destroy the relocated string

So then the implementation of 'emplace(string&&&)' would become something like:

    template<typename T>
    T &optional<T>::emplace(T &&&arg)
    {
        if ( false == this->has_value() )
        {
            T &retval = *::new(this->buffer) T( arg() );
            this->has_value_bool = true;
            return retval;
        }

        this->has_value_bool = false;

        alignas(string) char unsigned reloc_buf[ sizeof(string) ];
        std::relocate( (string*)reloc_buf, &this->value() );
        Auto( ((string*)reloc_buf)->~string() );

        T &retval = *::new(this->buffer) T( arg() );
        this->has_value_bool = true;

        return retval;
    }

Could something like that work?

Received on 2024-06-05 12:44:22