ISOCPP std-proposals List: Re: [std-proposals] Interceptor Function (preserve stack and all registers)

From: Tiago Freire <tmiguelf_at_[hidden]>
Date: Mon, 29 Jul 2024 09:06:43 +0000

You forgot about the type.

By the way, the technique being described is called DLL hijacking. It is not novel; what is novel is the completely bonkers setup. Why doesn't he just attach a debugger and walk the stack? Or use a disassembler to figure out the parameters? Or use other tools and techniques? What is the use of trying to debug code that you have no source for, that you know is broken, and that you have no means of fixing? No idea; the only explanation I have is hackery.

The earlier code explains the setup while the later describes what he wants to do. And in the earlier code "FlushPipe" is described as:

int (*)(void *, int);

i.e. takes in 2 parameters and returns an int.

Then, in his setup, he explains that he doesn't know the type of the function he is trying to hijack.
And in the later code "FlushPipe" is cast to

void (*)(void)

i.e. no parameter and nothing returned.
But we already know "FlushPipe" takes in 2 parameters and returns an int.
That is not good: it has the wrong type. But to the OP this doesn't matter; all they want is an address to jump to, and the cast is only there so that it looks like a function. The same thing happens with the interceptor function, which is declared as:

auto Func(auto) interceptor

You would expect this to be an automatically deduced set of arguments with an automatically deduced return type, but NO! The type of "FlushPipe" has been misadvertised as
void (*)(void)

but it expects 2 parameters and returns an int.
The same is expected of "Func", but all of that information is gone.
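
To make the type problem concrete, here is a minimal sketch. It assumes a stand-in function (FlushPipe_standin) in place of the real export, and the helper names erase, restore and call_restored are made up for illustration. Converting a function pointer to void (*)() is allowed, but the only safe thing to do with the result is to convert it back to the real type, and the real type has to come from somewhere other than the pointer.

```cpp
#include <cassert>

// Hypothetical stand-in for the real FlushPipe exported by monkey.dll.
int FlushPipe_standin(void* pipe, int flags) {
    return pipe ? flags + 1 : -1;
}

using RealType   = int (*)(void*, int);  // the signature given in the earlier code
using ErasedType = void (*)();           // the type used in the later code

// Converting to another function pointer type is allowed, but calling through
// the converted pointer would be undefined behaviour.
ErasedType erase(RealType f) { return reinterpret_cast<ErasedType>(f); }

// Converting back yields the original pointer. Note that the caller has to
// supply RealType here; the erased pointer no longer carries it.
RealType restore(ErasedType f) { return reinterpret_cast<RealType>(f); }

int call_restored(ErasedType f, void* pipe, int flags) {
    RealType real = restore(f);
    assert(real != nullptr);
    return real(pipe, flags);
}
```

In other words, the cast to void (*)() keeps the address and throws away exactly the information a caller would need.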

Now put yourself in the role of the compiler. It sees
auto Func(auto) interceptor.

Now let's imagine another function calling Func. How is it supposed to know that it takes 2 arguments (void *, int) and it returns an int?

It can't!

You can never call this function.
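
The compiler's view can be checked with standard type traits. This is only a sketch: the aliases ErasedType and RealType and the helper can_call are names invented here, and it says nothing about how an "interceptor" declaration would actually be treated, since that syntax does not exist.

```cpp
#include <type_traits>

using ErasedType = void (*)();           // what the later code casts FlushPipe to
using RealType   = int (*)(void*, int);  // what the earlier code says FlushPipe is

// With the erased type, the only call the compiler can type-check is one with
// no arguments.
static_assert(std::is_invocable_v<ErasedType>);
static_assert(!std::is_invocable_v<ErasedType, void*, int>);

// With the real type, the two-argument call returning int checks out.
static_assert(std::is_invocable_r_v<int, RealType, void*, int>);
static_assert(!std::is_invocable_v<RealType>);

// Small helper so the same question can be asked for other signatures.
template <typename F, typename... Args>
constexpr bool can_call() {
    return std::is_invocable_v<F, Args...>;
}
```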

The reason it needs to abort if the jump fails is that a return object is expected and the passed-in parameters might need to be destroyed, and only the jumped-to function can resolve that. If it doesn't get called, your stack can be corrupted.

This is a completely different story from knowing the inputs and outputs of Func and FlushPipe.

In that case you would just declare Func as
int Func(void *, int);

No "interceptor" annotation is required; the user of Func just sees it as a regular function to call. Func can then simply call FlushPipe as a regular function and let the compiler optimize that into a jump to the function address. The only novel thing would be that all locals would be destroyed before jumping. While interesting as a concept, that is not what is being proposed.
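
A minimal sketch of that known-signature case follows. To keep it buildable without Windows headers or monkey.dll, WriteLog, RealFlushPipe and ResolveFlushPipe are hypothetical stand-ins; ResolveFlushPipe takes the place of the LoadLibraryA + GetProcAddress pair in OP's code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins so the sketch builds without monkey.dll.
std::vector<std::string>& log_lines() {
    static std::vector<std::string> lines;
    return lines;
}
void WriteLog(const char* msg) { log_lines().emplace_back(msg); }

int RealFlushPipe(void* pipe, int flags) { return pipe ? flags * 2 : -1; }

using FuncPtr_FlushPipe = int (*)(void* pipe, int flags);

// Stands in for LoadLibraryA + GetProcAddress in the original code.
FuncPtr_FlushPipe ResolveFlushPipe() { return &RealFlushPipe; }

// Ordinary function with the known signature: callers see nothing special.
int Func(void* pipe, int flags) {
    WriteLog("Function called 'FlushPipe'");
    FuncPtr_FlushPipe const pf = ResolveFlushPipe();
    assert(pf != nullptr);
    // A call in tail position; an optimizer is free to emit it as a jump.
    return pf(pipe, flags);
}
```

Whether the final call really becomes a jump is up to the optimizer and the calling convention; the language gives no guarantee either way.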

What is being proposed requires:
* There must be a binary that exports a function
* There must be a second binary that links to the first and a function that calls the function in the first with full knowledge of its type
* Your code needs to be put in a third binary and perform the dll hijacking.
* You must not have the source code of either binary A or B, because if you did you would know the type and not need this.
* You still have to somehow resolve name mangling.
* The "intercept" function itself can only ever be exported; it can never be called from C++ code, and can only be used by tricking the linker at runtime.

C++ doesn't know what a DLL is, you cannot call this function from C++ source code, and the function being jumped to is, by its use case, unknown.
The setup is quite insane; this was never done in the past because no one would solve this problem in this way.

Everything about it is ill-formed. Had this actually been used in production code, it would have been a hacker's dream.

The user posting this is a known spammer. Everything they post is absolutely ill-formed, hacky, and absurd. It's paradoxically highly skilled and utterly incompetent.

Please stop feeding this spammer!

________________________________

From: Lorand Szollosi <szollosi.lorand_at_[hidden]>
Sent: Monday, July 29, 2024 12:56:44 AM
To: Tiago Freire <tmiguelf_at_[hidden]>
Subject: Re: [std-proposals] Interceptor Function (preserve stack and all registers)

Hi,

On Sun, Jul 28, 2024 at 1:02 AM Tiago Freire <tmiguelf_at_[hidden]> wrote:
> You made an interpretation of the proposal that has nothing to do with the OP’s proposal.
> Please re-read it.
I might have skipped a few steps in the reasoning, I grant you that; however, I suggest we go through OP's mail together (see below) so you can see how these are connected.

OP would like two things:
1. To be able to export-import, possibly in DLLs, a function under the same name as another function in an existing DLL, and to be able to write that function without having to care about signature.
2. To be able to jump to a function instead of calling it, where jump means as-if we replaced the call to the interceptor (say, f_i(...)) in the caller with (f_i(...), f(...)), where f is the function we jump to, except that the arguments to f(...) are kept alive by (and set by) the f_i(...) call.

Now, 1. has little to do with the C++ Standard, as it doesn't define DLLs (or even linking, btw.); it's an ABI thing. In the code OP wrote, it's also Windows-specific, but that's not a necessary requirement to implement this. How a linker (if the implementation uses a linker) names functions in a DLL (if the target platform uses DLLs) is ABI-specific, so my understanding is that OP either needs Itanium ABI changes, or - more likely - needs a simple tool that allows for mangling symbols manually in their DLL. That's not our focus here, so I skipped that part in the comment.

Let's check what's on our table from point 1, by going through the code samples posted:

> On 26 Jul 2024, at 13:22, Frederick Virchanza Gotham via Std-Proposals <std-proposals_at_[hidden]> wrote:
> typedef int (*FuncPtr_FlushPipe)(void *pipe, int flags);
>
> template<typename... Params>
> decltype(auto) FlushPipe(Params&&... args)
> {
> WriteLog( "Function called 'FlushPipe'");
> auto const h = LoadLibraryA("monkey.dll");
> auto const pf = (FuncPtr_FlushPipe)GetProcAddress(h, "FlushPipe");
> return pf( static_cast<Params&&>(args)... );
> }
It's entirely possible to make this work via the new goto (see later how) if the linker exposes this function under the same name as the original DLL does (i.e., FlushPipe). The problem is that, in the real world, templates don't necessarily reach the linker; in fact, you need to ODR-use or explicitly instantiate the template to make sure it's compiled at all (e.g. into the interceptor DLL). Therefore, the code suggests that OP will request a 'base case' for template functions which is always instantiated. The hypothesis here is that it's the case where all we do to the arguments, if anything at all, is perfectly forward them to a tail call / jump. (Technically, we could allow perfect forwarding and extending on the back.)
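
For reference, this is roughly what forcing one specialization into the object file looks like in today's C++. It is a sketch only: RealFlushPipe is a hypothetical stand-in for the function resolved from monkey.dll, and FlushPipeForward is a made-up name for a forwarding template in the style of OP's first sample.

```cpp
#include <cassert>  // only needed for checks like the ones in the note below
#include <utility>

// Hypothetical stand-in for the function resolved from monkey.dll.
int RealFlushPipe(void* pipe, int flags) { return pipe ? flags : -1; }

// On its own this template produces no code: nothing is emitted until some
// specialization is used or explicitly instantiated.
template <typename... Params>
decltype(auto) FlushPipeForward(Params&&... args) {
    return RealFlushPipe(std::forward<Params>(args)...);
}

// Explicit instantiation definition: forces exactly this one specialization
// to be emitted in this translation unit, even if nothing here calls it.
template decltype(auto) FlushPipeForward<void*, int>(void*&&, int&&);
```

Note that this pins down one specialization only. A caller passing lvalues would deduce Params as void*& and int&, which is a different specialization with a different mangled name, so the signature still has to be known up front.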

Let's check that hypothesis on code samples 2 and 3:
> typedef int (*FuncPtr_FlushPipe)(void *pipe, int flags);
>
> extern "C" int FlushPipe(void *const pipe, int const flags)
> {
> WriteLog( "Function called 'FlushPipe'");
> auto const h = LoadLibraryA("monkey.dll");
> auto const pf = (FuncPtr_FlushPipe)GetProcAddress(h, "FlushPipe");
> return pf( pipe, flags );
> }
This, as I read it, is OP's way of saying that manual instantiation or ODR-use would have been required, so OP resorted to expanding the code here. Nothing's wrong with that; essentially it's still the same code.

> auto Func(auto) interceptor
> {
> WriteLog( "Function called 'FlushPipe'");
> auto const h = LoadLibraryA("monkey.dll");
> auto const pf = (void(*)(void))GetProcAddress(h, "FlushPipe");
> goto pf;
> }
This, as I read it, is a proposed syntax for a template that's auto-instantiated for forwarding. It also shows an example of the proposed goto syntax (point 2); we can return to that later.
Keep in mind that, from OP's perspective, it's still the same thing as example 1, i.e., OP is using it in a situation where it's called with exactly the same arguments as in example 1. It's unspoken here, but we're talking about a monad defined by function arguments. Now, OP is probably not insisting on this particular syntax, but on some solution that achieves this. In particular, we could keep the perfect forwarding syntax, or any other syntax, as long as we allow perfect forwarding in tail calls and gotos (or, if we were to go wild, in any calls). This is nothing new: on a meta-level, you can refer to the particular instantiation of a template class by name inside the class definition (i.e., in template<typename T> class A;, you don't need to write A<T>, you can write A). For goto, this is unambiguous; for tail call, you could use the classic forwarding syntax.
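
The template-class analogy is the injected-class-name rule, which can be checked directly. A small sketch (the member names self and Self are made up for illustration):

```cpp
#include <type_traits>

template <typename T>
class A {
public:
    // Inside the class, plain `A` names the current specialization A<T>.
    A* self() { return this; }
    using Self = A;
};

// The unqualified name resolved to the enclosing specialization in both uses.
static_assert(std::is_same_v<A<int>::Self, A<int>>);
static_assert(std::is_same_v<decltype(A<long>{}.self()), A<long>*>);
```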
IMHO it would be best to keep the classic syntax:

auto Func(auto&&... args) /* interceptor - no notation needed */
{
    WriteLog( "Function called 'FlushPipe'");
    auto const h = LoadLibraryA("monkey.dll");
    auto const pf = (void(*)(void))GetProcAddress(h, "FlushPipe");
    ////// use either of these:
    // return pf(std::forward<decltype(args)>(args)...); // this calls pf first, then destructs any locals (tail call)
    // goto pf(std::forward<decltype(args)>(args)...); // this destructs any locals, except function arguments, then calls pf
    goto pf; // this is the same as the previous, shorthand syntax
}

As long as we mandate that the compilation unit export this as a symbol regardless of arguments, I think it works for OP for point 1. It can also be made to work on stack-based implementations, because the stack effect of a call to Func is the same as that of a call to pf; therefore, goto will work. The tail call version might not be as simple to make work, as it needs locals to be destructed, and therefore a new stack frame, etc., but we have none of that with the jump version.
For point 2, you can already see that goto pf; is a shorthand for goto pf(std::forward<decltype(args)>(args)...); and that it's unambiguous. It simply means to:
- extend the lifetime of arguments (to pf call - in this case, it's the same as the original call) to where pf returns
- destruct any locals, restore exception handlers as-in caller
- jump to pf
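
The difference in destruction order between the `return pf(...)` form and the proposed goto can be observed in today's C++. The following is a sketch under stated assumptions: pf is a hypothetical stand-in, Local is a made-up type with a visible destructor, and ViaEarlyCleanup only approximates the proposed ordering with an inner scope; it is still an ordinary call, not a jump.

```cpp
#include <cassert>  // only needed for checks like the ones in the note below
#include <string>
#include <vector>

// Records the order in which things happen.
std::vector<std::string>& events() {
    static std::vector<std::string> e;
    return e;
}

// A local with an observable destructor.
struct Local {
    ~Local() { events().push_back("~Local"); }
};

// Hypothetical stand-in for the function resolved from monkey.dll.
int pf(void* pipe, int flags) {
    events().push_back("pf");
    return pipe ? flags : -1;
}

// `return pf(...)`: the local is still alive while pf runs and is destroyed
// only after pf has returned.
int ViaReturn(void* pipe, int flags) {
    Local guard;
    return pf(pipe, flags);
}

// Approximation of the `goto pf(...)` ordering: the local is confined to an
// inner scope, so it is destroyed before pf is entered.
int ViaEarlyCleanup(void* pipe, int flags) {
    {
        Local guard;
    }
    return pf(pipe, flags);
}
```

ViaReturn records "pf" followed by "~Local", while ViaEarlyCleanup records "~Local" followed by "pf". Because a live local outlasts the call in the first form, that form cannot be turned into a plain jump unless the destructor is trivial.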

The important thing here is that we don't need to know the arguments, just as we don't need exact info about the entire monad inside a mapping, don't need to list all the member variables in a member function, and don't need to list template arguments to refer to the particular instantiation in the template class definition. We simply take it as 'context', and it works. There are no additional costs; in fact, there is further optimisation, as these jumps unroll unnecessary parts of the stack.

As for the rest of the mail:
> (1) An exception must not propagate outside of an interceptor before
> the jump takes place. If this occurs then either (1.a) The terminate
> handler is called, or (1.b) undefined behaviour.
I don't think it's necessary; we might simply say that the exception handlers of the caller of 'interceptor' (the caller of Func) apply from the point where the jump takes place. That is expected to be the simplest to implement.

> (2) If control reaches the end of an interceptor function without a
> "goto" statement, then it's as though the interceptor ended with "goto
> std::abort".
I'd skip this, simply making it the same as-if control reaches the end of a non-void function without returning a value (which is UB in general, but many implementations simply allow it as an uninitialized value if the return type is POD).

So, all in all, I think it's doable (with some adjustments) and it brings continuations and monads closer to C++.

Thanks for reading,
-lorro

Received on 2024-07-29 09:06:49