On 19/05/2024 21:28, Frederick Virchanza Gotham via Std-Proposals wrote:
> There's already a lengthy paper to provide NRVO:
>
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2025r2.html
>
> This paper has been gathering dust for 3 years (actually a few more
> than that if you consider the previous revisions), and it's not
> because it's a bad paper. I think it hasn't had any thrust behind it
> because it's just too much grief to write it into the Standard and for
> compiler vendors to implement it. So if P2025 is going to gather dust
> past C++26, C++29, C++32 and so on, then how about we just try
> something less ambitious.
I think it's a great paper, and I think it's gathering dust for much
more banal reasons, like the original author not having the time or
motivation to update it, and nobody else having the motivation to pick it up.
I would have been interested in picking it up, but I looked at the minutes and the reflector discussion and it seemed like there were significant implementability challenges that are difficult for me to even understand.
Do we even know whether the proposed direction was actually viable?
It's been a long time since I looked closely at it (and I never looked all that close), but my impression was that it was viable enough that it should have been implemented into Clang and/or GCC as a first step. Someone should go through the list of examples in the paper, audit them, and see whether Clang and/or GCC actually do the optimal thing for each of them. Then, patch Clang and/or GCC to do the optimal thing, according to the algorithm in the paper.
The examples aren't always perfectly worded IMHO; e.g. Example 3 says "Move (possibly elided) on line 4," but in fact there's no way to elide that move; the returned object `w` cannot possibly be in the return slot at that point, because if it were, then we couldn't put `widget()` into the return slot on line 6 (and C++17 says we must do that). But Example 4 makes it clear that the author understood that point, and Clang+GCC do compile Example 4 optimally. But e.g. GCC suboptimally compiles Example 7.
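To make the Example 3 point concrete, here is a minimal sketch of the shape I have in mind. It is not copied from the paper: `widget` is a hypothetical stand-in that counts move constructions, and `make` and `moves_for` are names I made up for illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the paper's `widget`: counts move constructions.
struct widget {
    static inline int moves = 0;          // C++17 inline variable
    widget() = default;
    widget(const widget&) = delete;
    widget(widget&&) noexcept { ++moves; }
};

// Same shape as the situation described above: one path returns the
// named local, another path returns a prvalue while `w` is still alive.
widget make(bool b) {
    widget w;
    if (b) {
        return w;      // move: `w` cannot be living in the return slot here...
    }
    return widget();   // ...because C++17 requires this prvalue to be
                       // constructed directly in the return slot while `w`
                       // still exists (locals are destroyed afterwards).
}

// Assumed helper: resets the counter and reports the moves for one call.
int moves_for(bool b) {
    widget::moves = 0;
    widget r = make(b);   // prvalue initialization, no move at the call site
    (void)r;
    return widget::moves;
}
```

Calling `moves_for(false)` has to report zero moves under C++17's guaranteed elision. For `moves_for(true)` I'd expect exactly one move on current GCC and Clang, for the reason given above, though I haven't audited every compiler.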
Anyway, if there were a compiler out there that optimally compiled all of P2025's examples, then we could point to it and say "Hey, let's just mandate that everybody do what this guy does." That's essentially the route that gave us "Down with typename!" in C++20: someone (EDG) had a typo-fixing algorithm and said "Hey, everyone should just do this algorithm automatically."
ISTR that there was a push to get Clang to implement the P2025 algorithm a few years ago; maybe Matheus Izvekov was involved? I don't know to what extent it succeeded, but it actually looks really good in Clang trunk:
The only failing example from P2025 is the one where he's actually proposing a semantic change — that `catch (widget w)` should be allowed to introduce an RVO variable, just like `widget w = caught();` would be allowed to introduce an RVO variable. I'm pretty sure that's not conforming today, which is why Clang can't do it.
Lénárd Szolnoki wrote:
> Add a [[clang::nrvo]] attribute on the variable you want to force nrvo on
I have a partial implementation of [[clang::nrvo]] in my Clang fork. It expresses the programmer's intent and verifies (in the code generator) that RVO is actually performed. But it doesn't have any "forcing" effect on optimization — it doesn't actually change what the compiler would have done without the attribute. And it also doesn't permit you to get away without a move constructor; i.e.
    std::mutex locked() {
        std::mutex m;
        m.lock();
        [[clang::nrvo]] return m; // don't move, just NRVO
    }
is still a hard error in my Clang fork (for now). It's not that I don't want that behavior; it's just that I don't know how to implement it.
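For contrast, here is a small sketch of what standard C++17 already accepts today without any attribute. The function names `unlocked` and `use_fresh_mutex` are hypothetical; the only library facts relied on are that `std::mutex` is neither copyable nor movable and that returning a prvalue needs no constructor call.

```cpp
#include <cassert>
#include <mutex>
#include <type_traits>

// std::mutex has no copy or move constructor, which is why returning a
// *named* mutex is ill-formed today even if the compiler would do NRVO.
static_assert(!std::is_copy_constructible_v<std::mutex>);
static_assert(!std::is_move_constructible_v<std::mutex>);

// OK since C++17: the prvalue is constructed directly in the caller's
// storage, so no copy/move constructor is required.
std::mutex unlocked() {
    return std::mutex();
}

// Hypothetical usage: initialize a local from the prvalue, then lock it.
void use_fresh_mutex() {
    std::mutex m = unlocked();
    std::lock_guard<std::mutex> guard(m);
}
```

The gap is exactly the `locked()` case: as soon as you need to name the object in order to call `m.lock()` before returning it, the prvalue route is gone and you are back to needing a move constructor.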
my $.02,
Arthur