Re: [std-proposals] A drift for c++ decorators;

From: Frederick Virchanza Gotham <cauldwell.thomas_at_[hidden]>
Date: Sat, 2 Nov 2024 12:43:30 +0000
On Thu, Oct 31, 2024 at 4:57 PM Thiago Macieira wrote:
>
> > So it's possible to intercept all the method calls on a polymorphic
> > object without needing to allocate a page of writeable executable
> > memory -- in fact it's possible without any dynamic allocation or
> > permission setting at all.
>
> You do realise you have an arbitrary constant of MAX_COUNT_METHODS in your
> code, right? It's pretty easy to have classes with hundreds of virtuals.


Yeah I was debugging it and setting it to lower and lower values to
see if it would crash.

If you want to accommodate 16 thousand virtual methods then do:

    #define MAX_COUNT_METHODS 16384

Of course the VTable will then be about 128 kilobytes (16384 entries
of 8 bytes each), but that won't be a big deal on a modern PC with
gigs of RAM.
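
As a rough check of that figure, here is a minimal sketch that assumes the table holds one pointer-sized slot per virtual method. The name kVtableBytes is made up for illustration, and the real layout may carry a few extra header entries that are not counted here.

```cpp
#include <cstddef>

// Same constant as above; 16384 is just the example value from this message.
#define MAX_COUNT_METHODS 16384

// Assumption: one pointer-sized slot per virtual method, nothing else.
constexpr std::size_t kVtableBytes = MAX_COUNT_METHODS * sizeof(void*);

// On a 64-bit target: 16384 slots * 8 bytes = 131072 bytes = 128 KiB.
// The check is skipped on targets where pointers are not 8 bytes wide.
static_assert(sizeof(void*) != 8 || kVtableBytes == 128 * 1024,
              "expected a 128 KiB table with 8-byte pointers");
```

Keeping the check as a static_assert means a change to MAX_COUNT_METHODS shows up at compile time rather than at run time.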


> You've still not listened to me when I said you're either saving too many
> registers or too few. If you can intercept only at function call boundaries,
> you only need to save the callee-preserve registers. None of the rest can be
> relied upon to have any meaningful value upon entry in the function call nor
> do they have to have any particular value on exit.


I'll have two modes. If you do "#define ENTIRE_CPU_STATE" then it will
push every register. If you don't define it then it will push all the
callee-saved registers along with the argument registers (the argument
registers are caller-saved on System V x86_64, but of course we still
need to keep their values if we're doing an interception, as the
arguments must be passed intact to the original function).
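
To make the reduced mode concrete, here is a sketch of the register set it would cover under the System V x86_64 calling convention. The identifiers (kCalleeSaved, kIntArgs and so on) are hypothetical and are not taken from the interception code itself; only the ENTIRE_CPU_STATE macro name comes from the text above.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// #define ENTIRE_CPU_STATE   // uncomment to select the "push everything" mode

// Callee-saved general-purpose registers in the System V x86_64 ABI.
// rsp is also preserved across calls, but by keeping the stack balanced
// rather than by pushing and popping it.
constexpr std::array<std::string_view, 6> kCalleeSaved = {
    "rbx", "rbp", "r12", "r13", "r14", "r15"};

// Integer/pointer argument registers, in argument order. For a method
// call, rdi carries the 'this' pointer.
constexpr std::array<std::string_view, 6> kIntArgs = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"};

// Floating-point arguments travel in xmm0 through xmm7.
constexpr std::size_t kVectorArgCount = 8;

#ifdef ENTIRE_CPU_STATE
constexpr bool kSaveEverything = true;
#else
constexpr bool kSaveEverything = false;
#endif

// Number of general-purpose registers the reduced mode has to preserve.
constexpr std::size_t kReducedGprCount = kCalleeSaved.size() + kIntArgs.size();
static_assert(kReducedGprCount == 12, "6 callee-saved + 6 argument registers");
```

One caveat for the reduced mode: for variadic functions the ABI also passes an upper bound on the number of vector registers used in al, so rax would need to be kept intact as well if variadic methods are in scope.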


> If you can intercept anywhere, then you must save ALL registers, not just a
> fixed handful. You've missed half of all GPRs on x86-64 (r16 to r31) and seven
> eighths of the vector register state. That is, you managed to miss 83.33% of
> the state of a typical AVX10.2-512 application. Then we have extra states like
> AMX (currently 8192 bytes) or MPX (128 bytes, but deprecated).


I'll look into making an exhaustive list of all registers -- which
might include intermediary instructions to query the CPU on what
registers are available.
By the way, can you give me a link (or links) to where I can get all
this info? It's actually tricky to find it all by doing web searches
(I've tried).
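
For the "query the CPU" part, one possible approach is the CPUID instruction: leaf 0x0D, sub-leaf 0 reports how many bytes an XSAVE area needs. The sketch below assumes GCC or Clang on x86 and uses their <cpuid.h> helpers (not standard C++). The XsaveSizes struct and query_xsave_sizes function are hypothetical names, and this is not the code from the interception library.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>   // GCC/Clang helper header, assumed available
#define HAVE_CPUID_H 1
#endif

// Hypothetical result type: sizes in bytes of the XSAVE save area.
struct XsaveSizes {
    std::uint32_t enabled_bytes;  // for the features currently enabled in XCR0
    std::uint32_t max_bytes;      // for every feature the CPU supports
};

// Returns std::nullopt if CPUID cannot be used or XSAVE is not reported.
inline std::optional<XsaveSizes> query_xsave_sizes() {
#ifdef HAVE_CPUID_H
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;

    // CPUID leaf 1: ECX bit 26 says whether the CPU supports XSAVE.
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 26)))
        return std::nullopt;

    // CPUID leaf 0x0D, sub-leaf 0: EBX and ECX hold the area sizes.
    if (!__get_cpuid_count(0x0D, 0, &eax, &ebx, &ecx, &edx))
        return std::nullopt;

    return XsaveSizes{ebx, ecx};
#else
    return std::nullopt;  // not x86, or no <cpuid.h> on this compiler
#endif
}
```

Sizing a save buffer from max_bytes and letting XSAVE/XRSTOR write it would avoid hard-coding a register list, at the cost of a much larger save area per interception.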


> And that's not
> counting the fact that your POP_ALL macro is pushing the XMM registers, not
> popping them.


Thanks for spotting that. I had it all written out properly in Intel
syntax and then I fed it into ChatGPT to turn it into AT&T. It did the
pushes properly but messed up the pops.


> Your code doesn't look thread-safe, even if the interception can only be done
> at program start before threads. I also think it will fail at runtime if
> Controlflow Enforcement is enabled, around lines 393-396.


I have a feeling that ControlFlow Enforcement might not notice what
I'm doing. I have no idea how ControlFlow Enforcement is implemented,
but if I were to guess, I would say it monitors the use of the "call"
instruction, and then makes sure that the next "ret" instruction jumps
back to the same address. Well, the assembler I wrote uses 'jmp'
instead of 'call', so ControlFlow Enforcement might just think
it's a loop instead of a function call. And then the final "ret"
actually does jump back to the original address that the "call" came
from, so my trickery might fly under the radar.

Received on 2024-11-02 12:43:41