Date: Fri, 10 Jun 2022 13:41:52 -0400
On Fri, Jun 10, 2022 at 1:39 PM Marcin Jaczewski via Std-Proposals
<std-proposals_at_[hidden]> wrote:
>
> pt., 10 cze 2022 o 18:28 Arthur O'Dwyer via Std-Proposals
> <std-proposals_at_[hidden]> napisał(a):
> >
> > On Fri, Jun 10, 2022 at 12:00 PM Lénárd Szolnoki <cpp_at_[hidden]> wrote:
> >>
> >> So what do I do if I want to microbenchmark a function with LTO on? Maybe because that's the configuration relevant for my application.
> >
> >
> > I don't think I understand the notion of "microbenchmarking" "with LTO on." Isn't the whole point of LTO to mash all your code together so that it's not "micro" anymore, and its performance will end up depending very heavily on how it's actually used in practice? At that point, you need a "macrobenchmark" so that you're testing the performance of the actual code, because its "micro" performance won't necessarily bear any relationship to its "macro" (real-world) performance. Maybe that means linking your final executable and then running it on some real-world input via a script; or maybe (although this seems very "clever") it means linking your "micro" benchmark function into a .dll and then wrapping a single call to that .dll into your top-level Google Benchmark program.
> >
> > #include <benchmark/benchmark.h>
> > extern void runMyOptimizedMicrobenchmark(int*); // implemented in a .so/DLL somewhere else
> > static void BM_MyThing(benchmark::State& state) {
> >   int i = 0;
> >   for (auto _ : state) {
> >     runMyOptimizedMicrobenchmark(&i);
> >   }
> >   benchmark::DoNotOptimize(i);
> > }
> > BENCHMARK(BM_MyThing);
> >
> > Either way, I've also lost the thread of what we're trying to accomplish here. Are we still trying to support someone stepping through in the debugger? because people definitely don't do that with microbenchmark code. All I'm saying is, if your goal is simply to mystify the optimizing compiler as to whether a particular variable is dead or whether a particular write to it can be hoisted, literally all you have to do is escape that variable's address into a different translation unit (which is exactly what benchmark::DoNotOptimize does). The `volatile` keyword is both insufficient and unnecessary to achieve that goal.
> >
> btw there is a subtle and important difference between this
> `DoNotOptimize` and `[[no_optimize]]`; consider the initial example:
>
> ```
> bool b = false;
> if (b)
> {
>     // ...
> }
> benchmark::DoNotOptimize(b);
> ```
> The compiler will not remove `b` but can remove `if (b)`, as it is obvious
> that the branch will not be taken. Of course you can reorder to confuse the
> compiler enough that it will stop eliminating the dead code.
> But this means that you fight with the compiler to get what you want
> instead of saying this explicitly.
I think the take-home point here for any language feature should be
that it should focus on the code that we want to keep, not the
mechanism by which we keep it. That is, you annotate `if(b)`, the code
which is being removed, not some code related to that statement.
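
For illustration, here is a minimal sketch of the "reorder" workaround described above, written without Google Benchmark so that it stands alone. `opaque_touch` is a hypothetical helper of mine, not a library function; it assumes GCC/Clang inline asm and only approximates what `benchmark::DoNotOptimize` does. The names `branch_kept`, `branch_foldable` and `self_check` are likewise made up for this example.

```cpp
#include <cassert>

// Hypothetical helper, not from any library: tells the optimizer that
// `value` may have been read and modified. Assumes GCC/Clang inline asm;
// the volatile round-trip is only a rough fallback for other compilers.
template <class T>
inline void opaque_touch(T& value) {
#if defined(__GNUC__) || defined(__clang__)
    asm volatile("" : "+m"(value) : : "memory");
#else
    volatile T tmp = value;
    value = tmp;
#endif
}

// Barrier placed BEFORE the test: the compiler can no longer prove that
// `b` is still false, so it has to keep the branch.
inline int branch_kept() {
    bool b = false;
    opaque_touch(b);
    int taken = 0;
    if (b) {
        taken = 1;
    }
    return taken;
}

// Barrier placed AFTER the test (the order in the quoted example): `b` is
// kept alive, but the compiler is free to fold the branch away, because
// `b` is provably false at the point of the `if`.
inline int branch_foldable() {
    bool b = false;
    int taken = 0;
    if (b) {
        taken = 1;
    }
    opaque_touch(b);
    return taken;
}

// Both variants behave identically at run time; only the generated code
// may differ.
inline void self_check() {
    assert(branch_kept() == 0);
    assert(branch_foldable() == 0);
}
```

Note that in both versions the annotation sits on `b`, not on the `if`. That is exactly the "mechanism, not the code we want to keep" problem: the reader has to infer from the placement of the barrier which statement it was meant to protect.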
Received on 2022-06-10 17:43:20