Re: [std-proposals] interceptor functions (tested and working on x86_64)

From: Frederick Virchanza Gotham <cauldwell.thomas_at_[hidden]>
Date: Fri, 1 May 2026 08:39:01 +0100

In response to David, Jonathan and Thiago chronologically:



Thiago:
> What could possibly be the use-case for intercepting printf? How do you even
> know what the mangling of "printf" is (hint: it's not `printf` everywhere)?



Not sure why you're making this point, because the address of 'printf' gets
resolved in C++, not in assembler, and so the compiler handles the
mangling. We don't need to know the mangled name of printf.



Jonathan:
> auto const slow_print = []<typename... T> [[gnu::always_inline]] (T&&... t)
> {
> using namespace std::chrono;
> std::this_thread::sleep_for(1s);
> return std::printf(std::forward<T>(t)...);
> };



But you can't give me the address of the entry point of that template
lambda. You can give me the address of one instantiation of it -- which is
limited to a certain set of types -- but you can't give me the entry point
of a function that works for all types.
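
To make that concrete, here is a minimal sketch in standard C++ (the names 'twice', 'twice_int', 'twice_double' and 'check_instantiations' are made up for illustration). A captureless generic lambda converts to a function pointer only once a concrete signature has been chosen, so each pointer below refers to one instantiation and not to the template as a whole:

```cpp
#include <cassert>

// A captureless generic lambda: its call operator is a template.
auto const twice = [](auto x) { return x + x; };

// Each conversion selects ONE instantiation of the call operator.
// There is no pointer type that could refer to "all of them".
int    (*twice_int)(int)       = twice;   // instantiation for int
double (*twice_double)(double) = twice;   // a different function entirely

// Hypothetical helper that exercises both pointers.
inline void check_instantiations()
{
    assert(twice_int(21) == 42);
    assert(twice_double(1.5) == 3.0);
}
```

The two pointers have different types and refer to different functions, which is why neither can act as a single entry point for every signature.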



Jonathan:
> How does 'goto -> std::printf' work with different
> target function signatures?



Because you can do something like:

  switch ( some_global_variable )
  {
  case 0: goto -> std::printf ;
  case 1: goto -> boost::printf; // perhaps has a different signature
  case 2: goto -> my_own_printf;
  }



David:
> Why do you think it is so important for the interceptor function to be
> separately compiled and invisible to the calling code, yet general
> enough to handle any function signatures? Where would that be an
> essential feature?



I'll explain exactly what I had in mind when I first conceived interceptor
functions.

Let's say you have an executable file, "prog.exe", and it links at runtime
with "graphics.dll". Dependency Walker shows you that the latter exports 3
functions:

    setCreate
    setPush
    setPop

These 3 names are unmangled so you've no idea what the return type or
parameter types are.

For the sake of either logging or debugging, or for altering behaviour at
runtime, you want to intercept calls to these three functions, so you write
a replacement library something like:

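 // GetProcAddress returns FARPROC, so a cast is needed in C++
 auto p_setCreate = reinterpret_cast<void (*)(void)>( GetProcAddress(...) );
 auto p_setPush   = reinterpret_cast<void (*)(void)>( GetProcAddress(...) );
 auto p_setPop    = reinterpret_cast<void (*)(void)>( GetProcAddress(...) );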

 [[interceptor]] void setCreate(void) noexcept
 {
  DoLoggingOrWhatever();
  goto -> p_setCreate;
 }

 [[interceptor]] void setPush(void) noexcept
 {
  DoLoggingOrWhatever();
  goto -> p_setPush;
 }

 [[interceptor]] void setPop(void) noexcept
 {
  DoLoggingOrWhatever();
  goto -> p_setPop;
 }

I needed this about a year ago when I spent about a week or two looking for
the cause of a segfault . . . I ended up writing the solution in x86_64
assembler but it would have been so much handier and quicker to have had
interceptor functions. So that's one use case.
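
For comparison, here is a rough sketch of what a replacement export looks like in standard C++ today. It only works if you already know (or guess) the signature, which is exactly what the scenario above does not give you. The int(int) signature, 'original_setPush' and 'g_log_count' are assumptions made up for this sketch, and in a real shim the pointer would be filled in from GetProcAddress:

```cpp
#include <cassert>

// Hypothetical stand-in for the real export inside graphics.dll. Its true
// signature is unknown in the scenario above; int(int) is assumed here.
int original_setPush(int value) { return value * 2; }

// In a real shim this pointer would be obtained at load time.
int (*p_setPush)(int) = &original_setPush;

int g_log_count = 0;                            // made-up logging state
void DoLoggingOrWhatever() { ++g_log_count; }   // stand-in for real logging

// The replacement export has to spell out the assumed signature in full,
// so that arguments and return value are forwarded correctly.
int setPush(int value)
{
    assert(p_setPush != nullptr);
    DoLoggingOrWhatever();
    return p_setPush(value);
}
```

If the guessed signature is wrong, the shim corrupts the call, which is the gap a signature-agnostic interceptor is meant to close.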

But even in cases where we have the C++ source code and a working build
system for 'prog.exe' and 'graphics.dll', we sometimes want to work with
the original faulty binaries to find out what's going wrong. Sometimes when
we rebuild a program in Debug Mode, it no longer malfunctions. And even if
we build those two files in Release Mode but with just a few little tweaks,
the buggy behaviour might go away and we don't know why -- for example if
extra code has bloated the DLL file into using another page of memory and
so now when we run off the end of an array, it's harmless instead of
crashing. If we want to work with the original faulty builds of 'prog.exe'
and 'graphics.dll', then we put an interceptor between the two of them.

As for a second use case, here's what Lorand Szollosi said when I first
floated this idea here on this mailing list about two years ago:

> This is, as I read it, essentially the same thing as function jump in c--,
> which allows some optimizations and features in Haskell. It's very much needed
> in continuation passing / continuation pool style, so slashing it with 'how many
> times do you need it' is an invalid argument: I'd use it all the time. I'd also
> help in some cases for which we have multiple proposal ideas rotating here already,
> which is basically multi-return (including return twice, return any number of types
> - not listed in advance, e.g., types created by third-party template function, etc.).
> I actually encountered use-cases where we had to wrap a lambda to a std::function<>,
> thus have a virtual call, instead of passing a lambda to the function we'd jump to.
> This slows down inner loops in message processing (HFT) codes.
>
> So yes, even if many people here don't want full call/cc in C++, the jump-to-function
> could be a useful step in the direction of handling continuations.


So these are the two use cases I'm aware of.

A third use case would be "last resort runtime tweaking" in legitimate
projects (e.g. wanting to slow traffic down a little by intercepting all
calls and putting in a 5 millisecond sleep), or perhaps incrementing an
atomic counter or acquiring a resource.
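
As a point of reference, the throttling and counting idea can be approximated in standard C++17 when the target is visible at compile time. This is only a sketch under that assumption: 'throttled', 'g_intercepted_calls', 'add' and 'demo_throttled' are made-up names, and the wrapper is instantiated per target and per argument list, so unlike an interceptor it is not one entry point for every signature:

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <utility>

// Made-up global counter of intercepted calls.
std::atomic<unsigned long> g_intercepted_calls{0};

// Counts the call, sleeps for 5 milliseconds, then forwards to Target.
template <auto Target, typename... Args>
decltype(auto) throttled(Args&&... args)
{
    g_intercepted_calls.fetch_add(1, std::memory_order_relaxed);
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    return Target(std::forward<Args>(args)...);
}

// Hypothetical target function used only for the demonstration.
int add(int a, int b) { return a + b; }

// Usage: the target is named explicitly at the call site.
inline int demo_throttled()
{
    int const sum = throttled<&add>(2, 3);
    assert(sum == 5);
    return sum;
}
```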

A fourth use case is recreational hacking but I don't know if that counts
standards-wise.

Today I've added a new feature to [[interceptor]] functions in the GNU
compiler.

Even though the interceptor is designed to be agnostic to the target
function's signature (i.e. it doesn't care about the return type, parameter
types or calling convention), I have given the interceptor function access
to the target function's first parameter (assuming it has one). So on
x86_64, this will be a 64-Bit pointer (i.e. the contents of RDI on Linux or
RCX on MS-Windows). On x86_32, it will be the last 32-Bit pointer pushed
onto the stack (before the return address).

What this means is that we can intercept a call to 'fprintf' and swap
'stdout' with 'stderr', like this:

  [[interceptor]] void MyInterceptor(void) noexcept
  {
    /**/ if ( stdout == __arg ) __arg = stderr;
    else if ( stderr == __arg ) __arg = stdout;

    goto -> std::fprintf;
  }
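
For contrast, here is roughly how the same stream swap has to be written in standard C++ today, where the wrapper must repeat fprintf's signature and forward the variadic arguments through vfprintf. The names 'swap_std_streams' and 'swapped_fprintf' are made up for this sketch:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Swap stdout and stderr; any other stream is returned unchanged.
std::FILE* swap_std_streams(std::FILE* stream)
{
    if (stream == stdout) return stderr;
    if (stream == stderr) return stdout;
    return stream;
}

// Hypothetical wrapper with the same signature as fprintf. The variadic
// arguments can only be forwarded because the "v" variant exists.
int swapped_fprintf(std::FILE* stream, const char* format, ...)
{
    assert(format != nullptr);
    std::va_list args;
    va_start(args, format);
    int const result = std::vfprintf(swap_std_streams(stream), format, args);
    va_end(args);
    return result;
}
```

This works for fprintf only because a va_list counterpart exists; a variadic function without one cannot be wrapped this way.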

I have implemented this in the GNU compiler by making a change to how the
"thunk + core" work. Specifically I changed the signature of the 'core'
from:

    auto __coreFunc(void) -> void(*)(void)

to:

    auto __coreFunc(void *&) -> void(*)(void)

So the core gets passed the address of the first argument (which has been
backed up on the stack). And as the pointer is passed by reference, the
value can be edited in place. Later the backed up value is restored (which
is into a register on x86_64).
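
The division of labour between thunk and core can be modelled in plain C++ for one fixed signature. This is only a simulation to show the data flow: the real thunk is generated by the compiler and is signature-agnostic, whereas the 'thunk' below is hard-wired to a single made-up target, 'length_of'. All names here are assumptions for the sketch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using generic_fn = void (*)(void);

// Hypothetical target taking one pointer parameter.
std::size_t length_of(const char* text) { return std::strlen(text); }

static char empty_text[] = "";

// Model of the core: it receives the backed-up first argument by
// reference, may edit it in place, and returns the address to jump to.
generic_fn core(void*& first_arg)
{
    if (first_arg == nullptr) first_arg = empty_text;   // edit in place
    return reinterpret_cast<generic_fn>(&length_of);
}

// Model of the thunk for this one signature only.
std::size_t thunk(const char* text)
{
    void* saved = const_cast<char*>(text);      // back up the first argument
    generic_fn const target = core(saved);      // the core may rewrite it
    assert(target != nullptr);
    // Cast back to the target's real type before calling through it.
    auto const typed = reinterpret_cast<std::size_t (*)(const char*)>(target);
    return typed(static_cast<const char*>(saved));  // "restore", then jump
}
```

Here the core turns a null pointer into an empty string before the target sees it, so thunk(nullptr) yields 0 instead of crashing inside strlen.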

But what happens if the target function doesn't take any parameters? Well,
in that case, "__arg" will just have an indeterminate value. No undefined
behaviour, no implementation-defined behaviour -- just indeterminate. On
x86_64 you'll just get whatever had previously been in RDI (or RCX on
MS-Windows), and on x86_32 you'll just get whatever was pushed onto the
stack right before the return address.

Tested and working on both x86_32 and x86_64:

    https://godbolt.org/z/T1v4GP449

Received on 2026-05-01 07:39:06