ISOCPP std-proposals List: Re: [std-proposals] Interceptor Function (preserve stack and all registers)

From: Frederick Virchanza Gotham <cauldwell.thomas_at_[hidden]>
Date: Sun, 28 Jul 2024 17:46:57 +0100

On Fri, Jul 26, 2024 at 5:37 PM Lorand Szollosi wrote:
>
> Hi,
>
> This is, as I read it, essentially the same thing as function jump in c--,
> which allows some optimizations and features in Haskell. It’s very much
> needed in continuation passing / continuation pool style, so slashing it
> with “how many times do you need it” is an invalid argument: I’d use it
> all the time. It’d also help in some cases for which we have multiple
> proposal ideas rotating here already, which is basically multi-return
> (including return twice, return any number of types - not listed in advance,
> e.g., types created by third-party template function, etc.). I actually
> encountered use-cases where we had to wrap a lambda to a
> std::function<…>, thus have a virtual call, instead of passing a lambda to
> the function we’d jump to. This slows down inner loops in message
> processing (HFT) codes.
>
> So yes, even if many people here don’t want full call/cc in C++, the
> jump-to-function could be a useful step in the direction of handling
> continuations

I just realised today that the "interceptor function" doesn't
necessarily need to end with a jump to the original function.

Here's an example scenario:
    We have two files, "processor.exe" and "transport.dll". The
program loads the DLL and calls its functions to manipulate an internet
socket. The DLL's manual markets it as thread-safe; however, it has a
subtle bug that can cause a crash if one thread calls "OpenSocket" at
the same time as another thread calls "CloseSocket" (even if the socket
handles are different). Both the EXE file and the DLL file are
proprietary, we don't have their source code, and all of their symbols
are stripped.

So we create a clone of "transport.dll" that exports all of the same
functions that the original DLL file exports. Our new DLL will act as
a "man in the middle" (hereafter called a "MITM").

In our MITM DLL file, when either "OpenSocket" or "CloseSocket" is
called, we want to lock a mutex, then call the original function, then
unlock the mutex, then return. This scenario is a little more
complicated than what I described in my original post because the
"interceptor function" won't end with a jump to the original function
(instead it will end with a jump back to the caller).
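
For comparison, when the signature of the intercepted function is
known, the lock/call/unlock sequence needs no special language support.
Here is a minimal sketch in today's C++. The signature
int(const char*), the stand-in original_open_socket, and the names
g_socket_mutex, g_original_open_socket and OpenSocketWrapper are all
made up for illustration; in the scenario above the real signature is
unknown, which is exactly why an ordinary wrapper is not enough.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical signature: the real OpenSocket's parameters are unknown.
using OpenSocketFn = int (*)(const char* host);

// Stand-in for the function exported by the original DLL.
int original_open_socket(const char* host)
{
    return host ? 42 : -1;  // pretend 42 is a socket handle
}

// One mutex shared by the OpenSocket and CloseSocket wrappers, so the
// two functions can never run at the same time.
std::mutex g_socket_mutex;

// In the real MITM DLL this pointer would come from GetProcAddress.
OpenSocketFn g_original_open_socket = &original_open_socket;

// Lock the mutex, call the original, unlock, return to the caller.
int OpenSocketWrapper(const char* host)
{
    assert(g_original_open_socket != nullptr);
    std::lock_guard<std::mutex> guard(g_socket_mutex);  // unlocks on return
    return g_original_open_socket(host);
}
```

A CloseSocket wrapper would take the same g_socket_mutex before calling
its own original function.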

When the EXE file loads our MITM DLL and calls the function
"OpenSocket", it will execute a CPU instruction that performs a
function call, and on x86_64, this is the "call" instruction. Here's
an example using the "call" instruction:

    call *%rax # Call the function whose address is in RAX

Conceptually, it can be rewritten as follows (pseudo-assembly, since
x86_64 has no instruction that pushes RIP directly):

    push %rip # Push return address onto the stack
    jmp *%rax # Jump to the function

So whenever there is a function call, the very last thing that happens
before the jump is that the return address gets pushed onto the stack.
So when our "interceptor function" is entered, we know that the return
address is at the top of the stack, and this means we can play all
sorts of games. Consider the following x86_64 assembler for the
"interceptor function" for 'OpenSocket':

    OpenSocket:
        push_all_registers              # save all register values (except RAX and R10)
        call lock_mutex
        lea mystring(%rip), %rcx        # set 1st parameter to string
        call Record_Function_Call_And_Get_Function_Address  # address returned in RAX
        pop_all_registers               # restore all register values
        mov (%rsp), %r10                # save return address in R10 (assumes the
                                        # original function won't clobber R10)
        lea come_back_here(%rip), %r11  # LEA can't write to memory, so go via R11
        mov %r11, (%rsp)                # change return address on stack
        jmp *%rax                       # jump to the original function
    come_back_here:
        push_all_registers              # save all register values, this time
                                        # including RAX (return value) and R10
        call unlock_mutex
        pop_all_registers               # restore all register values
        jmp *%r10                       # jump back to the caller; the original's
                                        # RET already popped the return-address slot
    mystring:
        .asciz "OpenSocket"

In C++26, if we were to have "interceptor functions", then the above
assembler would become:

    std::mutex m;

    auto OpenSocket(auto) interceptor
    {
        WriteLog( "Function called 'OpenSocket'");
        m.lock();
        auto const h = LoadLibraryA("transport.dll.original");
        auto const pf = (void(*)(void))GetProcAddress(h, "OpenSocket");
        goto pf;
        m.unlock();
        goto return;
    }

I'm 99% certain that this technique will work on every CPU with every
calling convention. Even when supernumerary arguments are pushed onto
the stack, the last thing pushed onto the stack is still the return
address. Some calling conventions put the return address in a
register instead, but this register can be easily manipulated (offhand,
I think 64-bit ARM, i.e. AArch64, does this with its link register).
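
The claim that a function can always find its own return address can be
checked from C++ on GCC or Clang. Here is a small sketch, assuming one
of those compilers: the builtins are extensions, not standard C++, and
the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdio>

// GCC/Clang extension, not standard C++. noinline forces a real call,
// so there is an actual return address to look at.
__attribute__((noinline)) void* return_address_of_caller()
{
    // Level 0 means "the return address of the current function",
    // whether the ABI passes it on the stack or in a register.
    void* const ra = __builtin_return_address(0);
    return __builtin_extract_return_addr(ra);
}

// Hypothetical helper that prints where the call above will return to.
void print_return_address()
{
    void* const ra = return_address_of_caller();
    assert(ra != nullptr);
    std::printf("return address: %p\n", ra);
}
```

This only reads the return address. As far as I know, these builtins
give no way to replace it, which is the part the interceptor needs.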

So an "interceptor function" can contain two kinds of "goto" statement:

(1) goto p; - where p is the address of the entry point of
another function (or member function)
(2) goto return; - jump back to the caller function

And of course, an "interceptor function" can also use "goto" in its
original meaning, i.e. to jump to different labels within the same
function.

Received on 2024-07-28 16:47:07