On Monday, January 8, 2024, Thiago Macieira wrote:
But again pointer-equality problems because
the entry point of the function you de-virtualised may be that of the thunk
instead of the actual function.


So maybe we could use 'std::dethunk' to de-thunk the function pointer:

    q = std::dethunk( p );
    if ( q == p )  cout << "It wasn't a thunk\n";
    else cout << "Actual function address: 0x" << hex << (void*)q << endl;

So then the question is: how do we identify thunks at runtime? Well, we could binary-search for the address in a global constexpr array of thunk addresses. Alternatively, we could precede every thunk with 8 identifying bytes, something like:

    't' 'h' 'u' 'n' 'k' '\0' '\0' '\0'
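
To make the first option (the binary search) a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the names 'ThunkEntry', 'thunk_table' and 'dethunk_lookup' are hypothetical, the addresses are made up, and a real table would have to be emitted by the compiler or linker, sorted by thunk address. I've also assumed the table stores the target next to each thunk, which is more than the plain array of thunk addresses described above.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical table entry: a thunk's entry address and the function it jumps to.
struct ThunkEntry {
    std::uintptr_t thunk;
    std::uintptr_t target;
};

// Made-up addresses, sorted by 'thunk'. A real table would come from the toolchain.
constexpr std::array<ThunkEntry, 3> thunk_table{{
    {0x1000, 0x4000},
    {0x1010, 0x4200},
    {0x1020, 0x4800},
}};

// Returns the target if 'addr' is a known thunk, otherwise 'addr' unchanged.
std::uintptr_t dethunk_lookup(std::uintptr_t addr)
{
    // Binary search for the first entry whose thunk address is >= addr.
    auto it = std::lower_bound(
        thunk_table.begin(), thunk_table.end(), addr,
        [](const ThunkEntry& e, std::uintptr_t a) { return e.thunk < a; });

    if (it != thunk_table.end() && it->thunk == addr)
        return it->target;   // it was a thunk
    return addr;             // not in the table, so not a thunk
}
```

With this table, 'dethunk_lookup(0x1010)' yields 0x4200, and an address that isn't listed comes back unchanged, which matches the "q == p" test above.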

Or maybe just identify the machine code of a thunk generated from the following assembler:

    add rdi, 8
    jmp ActualFunction

If it adds a constant to RDI and immediately jumps to a constant address, then assume it to be a thunk (although this could be problematic if a non-thunk function happens to begin with the same instruction sequence, for example one that begins with a loop, and is therefore misidentified as a thunk).
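
As a rough sketch of that pattern match, assuming x86-64 and only the single encoding "add rdi, imm8" (bytes 48 83 C7 ib) followed by "jmp rel32" (byte E9 plus a 32-bit displacement): the function name 'decode_thunk' and its parameters are hypothetical, and it works on a byte buffer plus a stated load address rather than on live code. A compiler is free to emit other encodings (a 32-bit immediate, 'sub' instead of 'add', a short jump), which this would not recognise.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Checks whether 'code' starts with "add rdi, imm8; jmp rel32".
// 'addr' is the address at which code[0] is assumed to be loaded.
// On success, stores the jump destination in 'target' and returns true.
bool decode_thunk(const unsigned char* code, std::size_t len,
                  std::uint64_t addr, std::uint64_t& target)
{
    constexpr std::size_t kLen = 4 + 5;   // 4-byte add, then 5-byte jmp
    if (len < kLen)
        return false;

    // 48 83 C7 ib : REX.W prefix, opcode 83 /0 (add), ModRM C7 selects rdi
    if (code[0] != 0x48 || code[1] != 0x83 || code[2] != 0xC7)
        return false;

    // E9 cd : near jump with a 32-bit displacement
    if (code[4] != 0xE9)
        return false;

    std::int32_t rel;
    std::memcpy(&rel, code + 5, sizeof rel);   // assumes a little-endian host

    // The displacement is relative to the end of the jmp instruction.
    target = addr + kLen
           + static_cast<std::uint64_t>(static_cast<std::int64_t>(rel));
    return true;
}
```

For the bytes 48 83 C7 08 E9 10 00 00 00 placed at address 0x1000, this reports a target of 0x1019. It still has the weakness mentioned above: any ordinary function that happens to start with these two instructions would be reported as a thunk.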

I would really like to see a function to retrieve the function pointer from the vtable:

    class MyClass {  . . . };
    MyClass myobj;
    void (*p)(void) = std::devirtualise( &MyClass::SomeMethod, &myobj );

So then you could chain them:

    void (*p)(void) = std::dethunk( std::devirtualise( &MyClass::SomeMethod, &myobj ) );

On computers that don't use thunks, "std::dethunk" shall be optimised away to a no-op.