ISOCPP std-proposals List: Re: [std-proposals] Copy-construct, move-construct, and PR-construct

From: Jason McKesson <jmckesson_at_[hidden]>
Date: Mon, 21 Aug 2023 22:21:56 -0400

On Mon, Aug 21, 2023 at 5:59 PM Frederick Virchanza Gotham via
Std-Proposals <std-proposals_at_[hidden]> wrote:
>
> I reply to a Breno, Jason and Jonathan in series below.
>
>
> Breno Guimarães wrote:
> >
> > int main()
> > {
> > std::optional<std::mutex> om;
> > reemplace(om, FuncThatReturnsMutex);
> > // om = FuncThatReturnsMutex();
> > }
>
>
> That's very similar to what I did on Page 3 of my paper on NRVO:
>
> http://www.virjacode.com/downloads/nrvo/paper_nrvo_latest.pdf
>
>
> Jason McKesson wrote:
> >
> > Any solution to a C++ problem that starts with "let's add a new
> > reference type to the language" is not a solution worth having.
> > There's really no point in considering it further.
>
>
> I don't want to introduce a new kind of reference. I don't want to be
> able to do this:
>
> int ^^a = b;
>
> The use of the unary '^^' operator would only be seen in the parameter
> list of a function that takes one and only one argument -- usually a
> constructor or an overloaded assignment operator.
>
> Jason McKesson continued writing:
> >
> > However, if this were to be entertained further, then some explanation
> > of what's actually happening here is in order. You say that your new
> > reference type means that the "argument must be a prvalue?". OK, but
> > then you say "we need control over when the prvalue gets generated ".
> >
> > This makes no sense. The fact that a variable must be initialized by a
> > prvalue doesn't change how variables work. "when the prvalue gets
> > generated" happened *before the function call*. If you want it to
> > happen differently from that, then you need to explain what those
> > differences are and how that works. Are you saying that the parameter
> > captures the entire expression, leaving it unevaluated until some time
> > *within* the function (which is not what it means for a variable to
> > "be a prvalue")? OK, so how does that work? Must the function be
> > inlined? If not, how can a compiler pass *arbitrary code* to a
> > concrete function that has no idea how any particular parameter was
> > initialized?
>
>
> So let's consider the following:
>
> extern mutex FuncThatReturnsMutex(int,int,int,int,int);
>
> int main(void)
> {
> optional<mutex> om;
> om = FuncThatReturnsMutex(arg1,arg2,arg3,arg4,arg5);
> }
>
> My idea is still half-baked and subject to floatation but here's a
> suggestion of how this could be implemented with the System V x86_64
> calling convention:

Please answer the question that was asked. I asked what the code
meant. I did not ask "how this could be implemented". Implementations
do not matter at this point. You need to start with an explanation of
*what is happening*, not how a compiler will go about making that
happen.

Indeed, you seem to have this idea that implementations are the most
important thing that you could show. But when you're still in the
infancy of a proposal (and this one is barely that), the most
important questions are "what is it" and "what problems would it
solve." "How to implement it" is a distant third.

> (Step 1) The 5 arguments are evaluated and then pushed onto the stack
> (none of them are put in registers)
> (Step 2) The member function 'operator=' is invoked with return
> address pushed onto the stack, with the RDI register set to the
> address of 'om', and the RSI register set to the address of a tiny
> thunk-like function. The RDX register is also set to the address of
> another tiny thunk-like function (all will be explained later).
> (Step 3) The body of 'operator=' does whatever it wants until it
> encounters the '__emplace' keyword, at which point it invokes the
> first tiny thunk-like function whose job it is to copy the 5 arguments
> from their original location on the stack into the appropriate
> registers (RSI, RDX, RCX, R8, R9) to be passed as arguments to
> 'FuncThatReturnsMutex'. If the function takes more parameters than can
> go into registers, then the remaining arguments are copied from their
> original location on the stack to the top of the stack (yes there will
> be two copies of the data on the stack -- but this will rarely happen
> as there are plenty of registers).
> (Step 4) After the first tiny thunk-like function returns, the
> function 'FuncThatReturnsMutex' is invoked with the RDI register set
> to the address of the member variable 'buf' which resides inside the
> 'optional' object, and so the mutex gets emplaced in 'buf'.
> (Step 5) The function 'FuncThatReturnsMutex' returns (or throws an
> exception... but for now for simplicity let's just consider what
> happens when it returns normally)
> (Step 6) The remainder of the body of 'operator=' is executed.
> (Step 7) Right before 'operator=' returns, the second tiny thunk-like
> function is invoked and its job is to call the destructors of all the
> arguments that were pushed onto the stack in Step 1, and it also
> increments the stack pointer to its original value
> (Step 8) The function 'operator=' returns.

Attempting to reverse engineer some kind of meaning from this, the
core essence of your proposal appears to be as follows:

You want to take an expression and defer its evaluation to some point
within a completely different scope from where it appears. Typically,
this scope would be within some other function call.
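
For comparison, here is a rough sketch of how that deferral can
already be spelled in C++17 by passing a callable instead of an
expression. `Slot`, `emplace_from`, and `demo` are hypothetical names
invented for this illustration (this is not `std::optional`, and not
what the proposal specifies), and `FuncThatReturnsMutex` is given a
trivial zero-argument body as an assumption:

```cpp
#include <mutex>
#include <new>
#include <utility>

// Hypothetical stand-in for the function discussed in the thread.
std::mutex FuncThatReturnsMutex() { return std::mutex{}; }

// Hypothetical minimal holder with optional-like storage.
template <class T>
class Slot {
    alignas(T) unsigned char buf_[sizeof(T)];
    T* ptr_ = nullptr;  // non-null while a T lives in buf_

public:
    Slot() = default;
    Slot(const Slot&) = delete;
    Slot& operator=(const Slot&) = delete;
    ~Slot() { reset(); }

    bool has_value() const { return ptr_ != nullptr; }

    void reset() {
        if (ptr_) {
            ptr_->~T();
            ptr_ = nullptr;
        }
    }

    // The callable is invoked *inside* this function. Its prvalue result
    // initializes the object directly in buf_ (C++17 guaranteed copy
    // elision), so T needs neither a copy nor a move constructor.
    template <class F>
    T& emplace_from(F&& make) {
        reset();
        ptr_ = ::new (static_cast<void*>(buf_)) T(std::forward<F>(make)());
        return *ptr_;
    }
};

// Usage: returns true if the mutex was built in place and is usable.
bool demo() {
    Slot<std::mutex> om;
    std::mutex& m = om.emplace_from(FuncThatReturnsMutex);
    bool locked = m.try_lock();
    if (locked) m.unlock();
    return om.has_value() && locked;
}
```

The point of the sketch is only that the caller decides what is
deferred by choosing what goes inside the callable; nothing here needs
a new parameter syntax.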

As others have pointed out, this basic idea has been considered
before, with P0927. That paper was far more fully formed, and even it
didn't seem to get much traction:
https://github.com/cplusplus/papers/issues/288

By contrast, your `__emplace` gymnastics make basically no sense. The
keyword doesn't even mention the "variable" in question, so there's no
reason that it would affect its status.

Just one of the unanswered questions raised by this... idea is this:
why is *part* of the expression evaluated normally and part evaluated
within the function? What determines where each part gets evaluated?
If the expression is `a + b * c`, do `a`, `b`, and `c` get evaluated?
Which operators would get evaluated in the normal place and which get
evaluated inside the call?
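
To make the question concrete, here is a small sketch using today's
lambdas, where the split is explicit in the source. The helpers `a`,
`b`, `c`, `consume`, and the `trace_log` string are hypothetical names
made up for the illustration; they record when each operand runs
relative to the body of the called function:

```cpp
#include <string>

// Hypothetical trace of evaluation order.
std::string trace_log;
int a() { trace_log += 'a'; return 1; }
int b() { trace_log += 'b'; return 2; }
int c() { trace_log += 'c'; return 3; }

// '[' and ']' mark the region executed inside the callee.
template <class F>
int consume(F&& deferred) {
    trace_log += '[';
    int r = deferred();
    trace_log += ']';
    return r;
}

// Everything deferred: a, b and c all run between '[' and ']'.
// (Their relative order is unspecified, as for any `a() + b() * c()`.)
int all_deferred() {
    return consume([] { return a() + b() * c(); });
}

// Operands evaluated at the call site, only `+` and `*` deferred.
int operands_eager() {
    int x = a();
    int y = b();
    int z = c();
    return consume([=] { return x + y * z; });
}
```

Both functions compute the same value, but the side effects land in
different places. With a lambda the programmer chooses the split; the
proposal would have to state a rule that makes that choice instead.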

Once again you seem to be thinking in terms of a bulldozer. You want
to return `mutex`es by value (for some reason). So you start
bulldozing holes in the standard until that is possible, without
caring about what things might be around them. You think exclusively
about the expression being "a function"; the fact that "a function"
can take many forms doesn't matter to you. You think exclusively about
a function taking a single parameter, giving no thought to what
happens if a user wants to do this with two separate arguments. And so
on.

This is not the proper mindset for developing a good language feature.

> So there's one possible implementation. There would be a tiny little
> bit of gymnastics with the stack pointer when invoking the first tiny
> thunk-like function (because it needs to jump back to the return
> address even though it has pushed more stuff onto the stack), but it's
> doable.
>
>
> Jonathan Wakely wrote:
> >
> > Just stop replying to him. He clearly has no intention of doing anything except
> > throwing silly ideas out again and again and again. Engaging with him clearly
> > doesn't discourage him. Maybe ignoring him will.
>
>
> Reverse psychology is at times uncanny, particularly in the realms of
> international technical discussion, but thank you Jonathan you have
> succeeded in motivating me to take this further.

Well, good luck with that. Don't forget: the actual committee cares a
*lot more* about the motivation for a feature than you seem to.
They're highly unlikely to pay any real attention to a paper that
spends more time talking about register allocation than explaining
what the behavior is or what the point of the change actually is.

Received on 2023-08-23 14:37:28