On Wed, 30 Mar 2022 at 06:41, Zhihao Yuan <zy@miator.net> wrote:
On Tuesday, March 29th, 2022 at 6:39 PM, Edward Catmur <ecatmur@googlemail.com> wrote:



Are you sure about that? See https://godbolt.org/z/PMT3GG56G

Oh, nice. Still, relying on the optimizer devirtualizing and inlining isn't always the best idea, and some optimizations (like constant folding, IIRC) don't make it across the function call boundary.

There is no "function call boundary" involved here. This code is inlined for the same reason that a function template specialization is inlined. function_ref is built in such a way that, if the compiler can see both the callee being erased and the caller, there is no extra abstraction cost. If you observe inferior codegen, the case may be worth reporting to the vendor.
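
For readers following along, here is a minimal sketch of the idea, not the proposed std::function_ref. The class below, along with apply_erased, apply_generic, via_erased, via_generic and self_check, are hypothetical names made up for illustration. It only handles function objects (not plain functions) and assumes the callable's result type matches R. When the lambda and the caller sit in the same translation unit, an optimizer can typically see through the stored pointer and inline the call, though nothing in the language guarantees it.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical, simplified function_ref for illustration only.
template <class Sig>
class function_ref;

template <class R, class... Args>
class function_ref<R(Args...)> {
    void* obj_;                   // type-erased pointer to the callable
    R (*thunk_)(void*, Args...);  // restores the static type and invokes it

public:
    template <class F,
              class = std::enable_if_t<
                  !std::is_same_v<std::decay_t<F>, function_ref>>>
    function_ref(F&& f) noexcept
        : obj_(const_cast<void*>(
              static_cast<const void*>(std::addressof(f)))),
          thunk_([](void* p, Args... args) -> R {
              // Cast back to the original callable type and call it.
              return (*static_cast<std::remove_reference_t<F>*>(p))(
                  std::forward<Args>(args)...);
          }) {}

    R operator()(Args... args) const {
        return thunk_(obj_, std::forward<Args>(args)...);
    }
};

// Type-erased caller: one non-template function for every callable.
inline int apply_erased(function_ref<int(int)> f, int x) { return f(x) + 1; }

// Generic caller: one specialization per callable type.
template <class F>
int apply_generic(F&& f, int x) { return f(x) + 1; }

// Both the callee (the lambda) and the caller are visible here.
inline int via_erased(int x) {
    return apply_erased([](int v) { return v * 2; }, x);
}
inline int via_generic(int x) {
    return apply_generic([](int v) { return v * 2; }, x);
}

// The two paths compute the same result.
inline void self_check() { assert(via_erased(20) == via_generic(20)); }
```

Comparing the optimized output of via_erased and via_generic in a compiler explorer is one way to check whether a given compiler treats the two the same.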

Devirtualization isn't always a given. I'm sure you're aware of inferior codegen from std::visit implemented as a constexpr array of function pointers, relative to the unrolled jump table used in some implementations.
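
To make the two dispatch strategies concrete, here is a hand-written sketch. It is not taken from any standard library implementation, and V, widen_at, visit_table and visit_switch are hypothetical names. The table version calls through a pointer selected at run time, which the optimizer may or may not see through. The switch version gives it direct calls in every branch.

```cpp
#include <cassert>
#include <cstddef>
#include <variant>

using V = std::variant<int, long, char>;

// Handler for the alternative at index I: widen the held value to long.
template <std::size_t I>
long widen_at(const V& v) {
    return static_cast<long>(*std::get_if<I>(&v));
}

// Strategy 1: constexpr array of function pointers indexed by v.index().
// Assumes v is not valueless_by_exception (index() would be out of range).
inline long visit_table(const V& v) {
    using Fn = long (*)(const V&);
    constexpr Fn table[] = {&widen_at<0>, &widen_at<1>, &widen_at<2>};
    return table[v.index()](v);
}

// Strategy 2: switch over the index, with a direct call in each case.
inline long visit_switch(const V& v) {
    switch (v.index()) {
        case 0: return widen_at<0>(v);
        case 1: return widen_at<1>(v);
        case 2: return widen_at<2>(v);
    }
    return 0;  // valueless variant; not reachable for these alternatives
}
```

Both functions return the same value for any non-valueless V; only the shape of the dispatch differs.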

However, that may be a bit of a singular case. It does seem that here we get just as good codegen from function_ref as from an auto invocable function template parameter. Thanks.