Date: Wed, 30 Mar 2022 10:45:40 +0100
On Wed, 30 Mar 2022 at 06:41, Zhihao Yuan <zy_at_[hidden]> wrote:
> On Tuesday, March 29th, 2022 at 6:39 PM, Edward Catmur <
> ecatmur_at_[hidden]> wrote:
>
>
>>
>> Are you sure about that: https://godbolt.org/z/PMT3GG56G
>>
>
> Oh, nice. Still, relying on the optimizer devirtualizing and inlining isn't
> always the best idea, and some optimizations (like constant folding, IIRC)
> don't make it across the function call boundary.
>
>
> There is no "function call boundary"
> involved here. The reason that this
> code is inlined is the same as the
> reason a function template
> specialization is inlined.
> function_ref is built in such a way
> that, if the compiler can see both
> the callee to be erased and the
> caller, there is no extra
> abstraction cost. If you observe
> inferior codegen, the case may be
> worth the vendor looking into.
>
Devirtualization isn't always a given. I'm sure you're aware of inferior
codegen from std::visit implemented as a constexpr array of function
pointers, relative to the unrolled jump table used in some implementations.
However, that may be a bit of a singular case. It does seem that here we
get just as good codegen from function_ref as from an auto invocable
function template parameter. Thanks.
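
For anyone who wants to reproduce the comparison locally, here is a minimal sketch. It assumes a simplified, hypothetical function_ref (object callables only, no noexcept or const qualifiers in the signature); it is not the proposed wording and not the code behind the godbolt link. The names sum_erased, sum_generic, twice_erased, twice_generic and self_check are made up for illustration. The generic version uses a plain template parameter rather than a constrained auto parameter so that it builds as C++17.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical, simplified stand-in for the proposed function_ref.
// Assumption: only object callables (lambdas, functors) are supported.
template <class Sig>
class function_ref;

template <class R, class... Args>
class function_ref<R(Args...)> {
    void* obj_;                   // address of the referenced callable
    R (*thunk_)(void*, Args...);  // restores the erased type and invokes it

public:
    template <class F,
              class = std::enable_if_t<
                  !std::is_same_v<std::remove_cv_t<std::remove_reference_t<F>>,
                                  function_ref> &&
                  std::is_invocable_r_v<R, F&, Args...>>>
    function_ref(F&& f) noexcept
        : obj_(const_cast<void*>(
              static_cast<const void*>(std::addressof(f)))),
          thunk_([](void* p, Args... args) -> R {
              // The thunk is generated where F is known, so the target
              // call is visible to the optimizer at this point.
              return (*static_cast<std::remove_reference_t<F>*>(p))(
                  std::forward<Args>(args)...);
          }) {}

    R operator()(Args... args) const {
        return thunk_(obj_, std::forward<Args>(args)...);
    }
};

// Type-erased version: one non-template function for every callable.
inline int sum_erased(function_ref<int(int)> f, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) total += f(i);
    return total;
}

// Generic version: one specialization per callable type.
template <class F>
int sum_generic(F&& f, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) total += f(i);
    return total;
}

// Callers where the compiler can see both the callee and the caller.
inline int twice_erased(int n) {
    return sum_erased([](int x) { return 2 * x; }, n);
}

inline int twice_generic(int n) {
    return sum_generic([](int x) { return 2 * x; }, n);
}

// Usage: both forms compute the same result.
inline void self_check() {
    assert(twice_erased(4) == twice_generic(4));
}
```

Comparing the assembly of twice_erased and twice_generic at -O2 is one way to check whether a given compiler sees through the thunk. Whether it does remains an optimizer decision, not a guarantee.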
Received on 2022-03-30 09:45:53