Date: Mon, 20 May 2024 10:15:33 -0400
On Mon, May 20, 2024 at 9:25 AM Brian Bi via Std-Proposals <
std-proposals_at_[hidden]> wrote:
> On Mon, May 20, 2024, 8:34 AM Lénárd Szolnoki via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
>
>> On 19/05/2024 21:28, Frederick Virchanza Gotham via Std-Proposals wrote:
>> > There's already a lengthy paper to provide NRVO:
>> >
>> >
>> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2025r2.html
>> >
>> > This paper has been gathering dust for 3 years (actually a few more
>> > than that if you consider the previous revisions), and it's not
>> > because it's a bad paper. I think it hasn't had any thrust behind it
>> > because it's just too much grief to write it into the Standard and for
>> > compiler vendors to implement it. So if P2025 is going to gather dust
>> > past C++26, C++29, C++32 and so on, then how about we just try
>> > something less ambitious.
>>
>> I think it's a great paper, and I think it's gathering dust for much
>> more banal reasons, like the original author not having the time or
>> motivation to update it, and nobody else having the motivation to pick
>> it up.
>>
>
> I would have been interested in picking it up, but I looked at the minutes
> and the reflector discussion and it seemed like there were significant
> implementability challenges that are difficult for me to even understand.
>
> Do we even know whether the proposed direction was actually viable?
>
It's been a long time since I looked closely at it (and I never looked all
*that* closely), but my impression was that it was viable enough that it
should have been implemented in Clang and/or GCC as a first step. Someone
should go through the list of examples in the paper, audit them, and see
whether Clang and/or GCC actually do the optimal thing for each of them.
Then, patch Clang and/or GCC to *do* the optimal thing, according to the
algorithm in the paper.

The examples aren't always perfectly worded IMHO; e.g. Example 3
<https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2025r2.html#ex-3>
says "Move (possibly elided) on line 4," but in fact there's no way to
elide that move; the returned object `w` cannot possibly be in the return
slot at that point, because if it were, then we couldn't put `widget()`
into the return slot on line 6 (and C++17 says we *must* do that). But
Example 4 makes it clear that the author understood that point, and
Clang+GCC do compile Example 4 optimally.
But e.g. GCC suboptimally compiles Example 7
<https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2025r2.html#ex-7>.
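
To make the return-slot argument about Example 3 concrete, here is a minimal sketch in the same shape as that example (it is not the paper's exact code). The `widget` type, its counters, and the names `pick` and `moves_when` are hypothetical, invented for illustration, and the sketch assumes C++17 or later. The prvalue path is guaranteed by C++17 to run no move constructor; on the `return w;` path I'd expect one move from current Clang and GCC, but that is an expectation, not something the code hard-codes.

```cpp
// Hypothetical instrumented type; not taken from P2025.
struct widget {
    static inline int moves = 0;   // counts move constructions
    static inline int copies = 0;  // counts copy constructions
    widget() = default;
    widget(const widget&) { ++copies; }
    widget(widget&&) noexcept { ++moves; }
};

// Same shape as the example under discussion: one branch returns a named
// local, the other returns a prvalue.
widget pick(bool use_local) {
    widget w;
    if (use_local) {
        return w;        // implicit move: w cannot live in the return slot,
                         // because the other path needs that slot while w
                         // is still alive
    }
    return widget();     // C++17: constructed directly in the return slot
}

// Resets the move counter, calls pick(), and reports how many moves ran.
int moves_when(bool use_local) {
    widget::moves = 0;
    widget result = pick(use_local);  // no extra move here (C++17 elision)
    (void)result;
    return widget::moves;
}
```

Calling `moves_when(false)` exercises the `return widget();` path and `moves_when(true)` exercises the `return w;` path, so the two counts can be compared across compilers.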

Anyway, if there were a compiler out there that optimally compiled all of
P2025's examples, then we could point to it and say "Hey, let's just
mandate that *everybody* do what this guy does." That's essentially the
route that gave us "Down with typename!" in C++20: someone (EDG) had a
typo-fixing algorithm and said "Hey, everyone should just do this algorithm
automatically."

ISTR that there was a push to get Clang to implement the P2025 algorithm a
few years ago; maybe Matheus Izvekov was involved? I don't know to what
extent it succeeded, but it actually looks really good in Clang trunk:
https://godbolt.org/z/n8eh5xdTT

The only failing example from P2025 is the one where the author is actually
proposing a semantic change: that `catch (widget w)` should be allowed to
introduce an RVO variable, just like `widget w = caught();` would be
allowed to introduce an RVO variable. I'm pretty sure that's not conforming
today, which is why Clang can't do it.

Lénárd Szolnoki wrote:
> Add a [[clang::nrvo]] attribute on the variable you want to force nrvo on

I have a partial implementation of [[clang::nrvo]] in my Clang fork. It
expresses the programmer's intent and verifies (in the code generator) that
RVO is actually performed. But it doesn't have any "forcing" effect on
optimization: it doesn't actually change what the compiler would have done
without the attribute. And it also doesn't permit you to get away without a
move constructor; i.e.

    std::mutex locked() {
        std::mutex m;
        m.lock();
        [[clang::nrvo]] return m; // don't move, just NRVO
    }

is still a hard error in my Clang fork (for now). It's not that I don't
*want* that behavior; it's just that I don't know how to implement it.
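
For context on the move-constructor requirement, here is a minimal sketch of what standard C++17 already allows for a non-movable type and what it rejects. The `pinned` type and both function names are hypothetical stand-ins (a simplified substitute for something like `std::mutex`), and the rejected case is left commented out so that the snippet compiles.

```cpp
#include <type_traits>

// Hypothetical non-movable type standing in for something like std::mutex.
struct pinned {
    int value;
    explicit pinned(int v) : value(v) {}
    pinned(const pinned&) = delete;
    pinned(pinned&&) = delete;
};

static_assert(!std::is_move_constructible_v<pinned>);

// OK since C++17: the prvalue initializes the caller's object directly,
// so no copy or move constructor is needed.
pinned make_pinned(int v) {
    return pinned(v);
}

// Ill-formed today, even where a compiler would in fact apply NRVO:
// returning a named local still requires a usable move or copy constructor.
//
// pinned make_named(int v) {
//     pinned p(v);
//     return p;   // error: uses a deleted constructor
// }
```

The gap between these two functions is what an attribute with a real "forcing" effect would have to close.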
my $.02,
Arthur
Received on 2024-05-20 14:15:48