On Fri, Jun 10, 2022 at 12:00 PM Lénárd Szolnoki <cpp@lenardszolnoki.com> wrote:

> So what do I do if I want to microbenchmark a function with LTO on? Maybe because that's the configuration relevant for my application.

I don't think I understand the notion of "microbenchmarking" "with LTO on." Isn't the whole point of LTO to mash all your code together so that it's not "micro" anymore, and so that its performance ends up depending very heavily on how it's actually used in practice? At that point, you need a "macrobenchmark" that tests the performance of the actual code, because its "micro" performance won't necessarily bear any relationship to its "macro" (real-world) performance. Maybe that means linking your final executable and then running it on some real-world input via a script; or maybe (although this seems very "clever") it means linking your "micro" benchmark function into a .so/DLL and then wrapping a single call into that library in your top-level Google Benchmark program:

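#include <benchmark/benchmark.h>

// Implemented in a .so/DLL somewhere else, which is built with LTO on its own.
extern void runMyOptimizedMicrobenchmark(int*);

static void BM_MyThing(benchmark::State& state) {
    int i = 0;
    for (auto _ : state) {
        // One call across the library boundary per timed iteration.
        runMyOptimizedMicrobenchmark(&i);
    }
    // Keep the final value of i observable so the result isn't treated as dead.
    benchmark::DoNotOptimize(i);
}
BENCHMARK(BM_MyThing);

BENCHMARK_MAIN();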
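
For completeness, here is a rough sketch of what the other side of that boundary might look like. Only the name and signature of runMyOptimizedMicrobenchmark come from the snippet above; the file name, the build command in the comment, the hotFunction workload, and the loop count are all made up for illustration.

```cpp
// mything.cpp (hypothetical file name): built separately, with LTO, into a
// shared library. One possible GCC/Clang command line (an assumption; adjust
// for your toolchain, and add __declspec(dllexport) or similar on Windows):
//   g++ -O2 -flto -fPIC -shared mything.cpp -o libmything.so

// Hypothetical function under test. Unsigned arithmetic avoids signed overflow
// when the benchmark calls the entry point many times in a row.
static unsigned hotFunction(unsigned x) {
    return x * 2u + 1u;
}

// Matches the declaration in the benchmark program. The inner loop lives on
// this side of the boundary, so the cross-library call happens once per timed
// iteration rather than once per call of hotFunction.
void runMyOptimizedMicrobenchmark(int* out) {
    unsigned acc = static_cast<unsigned>(*out);
    for (unsigned k = 0; k < 1000u; ++k) {
        acc += hotFunction(k);
    }
    // Mask to keep the value in the non-negative int range.
    *out = static_cast<int>(acc & 0x7fffffffu);
}
```

Inside the library, LTO is free to inline hotFunction into the loop however it likes; the benchmark program only ever sees the one exported entry point.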

Either way, I've also lost the thread of what we're trying to accomplish here. Are we still trying to support someone stepping through in the debugger? Because people definitely don't do that with microbenchmark code. All I'm saying is: if your goal is simply to mystify the optimizing compiler as to whether a particular variable is dead, or whether a particular write to it can be hoisted, literally all you have to do is escape that variable's address into a different translation unit (which is, in effect, what benchmark::DoNotOptimize does). The `volatile` keyword is both insufficient and unnecessary to achieve that goal.
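
A minimal sketch of the "escape the address" idea, with no `volatile` variable anywhere. A single file can't show a second translation unit, so this swaps in a different opaque consumer: an empty GCC/Clang extended-asm statement with a "memory" clobber. The names escape, sumTo, and runOnce are hypothetical, and the asm syntax is not portable to MSVC.

```cpp
// Hypothetical helper: hands the pointer to an empty asm statement that the
// compiler must assume may read or write any memory. The "volatile" here is
// the asm qualifier, not a volatile-qualified variable.
static inline void escape(void* p) {
    asm volatile("" : : "g"(p) : "memory");
}

// Hypothetical workload standing in for the code being measured.
static int sumTo(int n) {
    int total = 0;
    for (int k = 1; k <= n; ++k) {
        total += k;
    }
    return total;
}

int runOnce(int n) {
    int result = 0;
    escape(&result);    // the address has escaped; result is no longer provably dead
    result = sumTo(n);  // this store must be performed before the next escape
    escape(&result);    // the compiler must assume result is read here
    return result;
}
```

Note that `result` is an ordinary int: the compiler can still keep it in a register between the two escape points, which is the behaviour you usually want in a benchmark.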

–Arthur