Date: Thu, 23 May 2024 09:46:02 +0100
On 20/05/2024 15:15, Arthur O'Dwyer via Std-Proposals wrote:
> On Mon, May 20, 2024 at 9:25 AM Brian Bi via Std-Proposals
> <std-proposals_at_[hidden]> wrote:
>
> On Mon, May 20, 2024, 8:34 AM Lénárd Szolnoki via Std-Proposals
> <std-proposals_at_[hidden]> wrote:
>
> On 19/05/2024 21:28, Frederick Virchanza Gotham via Std-Proposals wrote:
> > There's already a lengthy paper to provide NRVO:
> >
> > https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2025r2.html
> >
> > This paper has been gathering dust for 3 years (actually a few more
> > than that if you consider the previous revisions), and it's not
> > because it's a bad paper. I think it hasn't had any thrust behind it
> > because it's just too much grief to write it into the Standard and for
> > compiler vendors to implement it. So if P2025 is going to gather dust
> > past C++26, C++29, C++32 and so on, then how about we just try
> > something less ambitious.
>
> I think it's a great paper, and I think it's gathering dust for much
> more banal reasons, like the original author not having the time or
> motivation to update it, and nobody else having the motivation to
> pick it up.
>
>
> I would have been interested in picking it up, but I looked at the
> minutes and the reflector discussion and it seemed like there were
> significant implementability challenges that are difficult for me to
> even understand.
>
> Do we even know whether the proposed direction was actually viable?
>
>
> It's been a long time since I looked closely at it (and I never looked
> all /that/ close), but my impression was that it was viable enough that
> it should have been implemented into Clang and/or GCC as a first step.
> Someone should go through the list of examples in the paper, audit them,
> and see whether Clang and/or GCC actually do the optimal thing for each
> of them. Then, patch Clang and/or GCC to /do/ the optimal thing,
> according to the algorithm in the paper.
> The examples aren't always perfectly worded IMHO; e.g. Example 3
> <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2025r2.html#ex-3> says "Move (possibly elided) on line 4," but in fact there's no way to elide that move; the returned object `w` cannot possibly be in the return slot at that point, because if it were, then we couldn't put `widget()` into the return slot on line 6 (and C++17 says we /must/ do that). But Example 4 makes it clear that the author understood that point — and Clang+GCC do compile Example 4 optimally.
>
> But e.g. GCC suboptimally compiles Example 7
> <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2025r2.html#ex-7>.
>
> Anyway, if there were a compiler out there that optimally compiled all
> of P2025's examples, then we could point to it and say "Hey, let's just
> mandate that /everybody/ do what this guy does." That's essentially the
> route that gave us "Down with typename!" in C++20: someone (EDG) had a
> typo-fixing algorithm and said "Hey, everyone should just do this
> algorithm automatically."
>
> ISTR that there was a push to get Clang to implement the P2025 algorithm
> a few years ago; maybe Matheus Izvekov was involved? I don't know to
> what extent it succeeded, but it actually looks really good in Clang trunk:
> https://godbolt.org/z/n8eh5xdTT
> The only failing example from P2025 is the one where he's actually
> proposing a semantic change — that `catch (widget w)` should be allowed
> to introduce an RVO variable, just like `widget w = caught();` would be
> allowed to introduce an RVO variable. I'm pretty sure that's not
> conforming today, which is why Clang can't do it.
>
> Lénárd Szolnoki wrote:
> > Add a [[clang::nrvo]] attribute on the variable you want to force nrvo on
>
> I have a partial implementation of [[clang::nrvo]] in my Clang fork. It
> expresses the programmer's intent and verifies (in the code generator)
> that RVO is actually performed. But it doesn't have any "forcing" effect
> on optimization — it doesn't actually change what the compiler would
> have done without the attribute. And it also doesn't permit you to get
> away without a move constructor; i.e.
>     std::mutex locked() {
>         std::mutex m;
>         m.lock();
>         [[clang::nrvo]] return m; // don't move, just NRVO
>     }
> is still a hard error in my Clang fork (for now). It's not that I don't
> /want/ that behavior; it's just that I don't know how to implement it.

Interesting. I would much prefer the annotation on the variable
declaration rather than on the return statement. My main concern is
diagnostics: annotating the return statement adds indirection to the
non-local reasoning that the feature requires.

Immovable f(bool cond) {
    Immovable a("foo");
    Immovable b("bar");
    if (cond) {
        [[clang::nrvo]] return a;
    }
    [[clang::nrvo]] return b; // ERROR:
    /* another local variable is selected for NRVO by a previous return
       statement. (other return statement is on line x, other object
       declared on line y) */
}

vs.

Immovable f(bool cond) {
    [[nrvo]] Immovable a("foo");
    [[nrvo]] Immovable b("bar"); // ERROR:
    /* can't have two [[nrvo]] objects in the same scope.
       (other object declared on line x) */
    if (cond) {
        return a;
    }
    return b;
}

Or:

Immovable g(bool b) {
    Immovable a("foo");
    if (b) {
        [[clang::nrvo]] return a; // ERROR:
        /* can't apply NRVO for a, another return statement within a's
           scope does not return a.
           (a declared on line x, other return statement on line y) */
    }
    return {"bar"};
}

vs.

Immovable g(bool b) {
    [[nrvo]] Immovable a("foo");
    if (b) {
        return a;
    }
    return {"bar"}; // ERROR:
    /* can only return NRVO object a while it's in scope.
       (a declared on line x) */
}

Disclaimer: I didn't actually check your fork, so I don't know whether
these error messages are representative of what it produces. Please
correct me if its diagnostics are better.
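
For readers without context on why a type like Immovable needs any of this, here is a minimal sketch of what is and isn't well-formed today. The class definition and the function name pick are my own assumptions for illustration; the thread never defines Immovable. The constructor is deliberately non-explicit so that `return {"bar"};` compiles.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the Immovable type used in the examples above.
class Immovable {
public:
    Immovable(const char* name) : name_(name) {}  // non-explicit on purpose
    Immovable(const Immovable&) = delete;         // also suppresses the implicit move constructor
    Immovable& operator=(const Immovable&) = delete;
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Well-formed today: each return list-initializes the return object
// directly, so no copy or move constructor is needed.
Immovable pick(bool cond) {
    if (cond) {
        return {"foo"};
    }
    return {"bar"};
}

// Ill-formed today, whether or not the compiler would perform NRVO,
// because `return a;` requires an accessible copy or move constructor:
//
//     Immovable named() { Immovable a("foo"); return a; }
```

The commented-out function is the case that a forcing [[nrvo]] would have to make well-formed, which is why it matters that the diagnostics for the failure cases are easy to follow.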

In my mental model, the non-local reasoning for this feature is like a
graph in which the variable declaration is a hub and the return
statements are leaves. If you annotate the leaves, you need more
indirection to get to the other leaves, as you have to go through the
hub first. However, if you annotate the hub, then all the leaves follow
from it with a single indirection.
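
To make the hub-and-leaves picture concrete, here is a toy sketch of the declaration-side check. It is not how Clang or any fork represents this; VarDecl, ReturnStmt and check_nrvo are hypothetical names I made up, and the model ignores everything except "which returns are in the variable's scope and what do they name".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a leaf: one return statement in the variable's scope.
struct ReturnStmt {
    int line;
    std::string returned_var;  // empty when the operand is not a plain local variable
};

// Hypothetical model of the hub: a declaration plus the returns in its scope.
struct VarDecl {
    std::string name;
    int line;
    bool nrvo_annotated;
    std::vector<ReturnStmt> returns_in_scope;
};

// Declaration-side rule: every return in the scope of an annotated
// variable must return that variable. The result lists the lines of the
// offending return statements (empty means no diagnostic).
std::vector<int> check_nrvo(const VarDecl& decl) {
    std::vector<int> offending_lines;
    if (!decl.nrvo_annotated) {
        return offending_lines;
    }
    for (const ReturnStmt& ret : decl.returns_in_scope) {
        if (ret.returned_var != decl.name) {
            offending_lines.push_back(ret.line);
        }
    }
    return offending_lines;
}
```

Starting from the hub, each diagnostic needs only the declaration and one offending leaf. With return-side annotation, the same check starts at a leaf, has to find the hub, and only then can enumerate the sibling leaves.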
Cheers,
Lénárd
Received on 2024-05-23 08:46:06