Date: Sat, 7 Jun 2025 17:39:21 +0100
On Mon, Jun 2, 2025 at 3:43 PM Frederick Virchanza Gotham wrote:
>
>
> You'll be able to break it though by testing out different classes
> because my code to analyse the machine code isn't perfect. I tried to
> use it with a 'stringstream' and its sub-objects 'istream' and
> 'ostream' but it crashed -- presumably because I got the wrong offset
> for the vtable pointer. But it works in principle, and Microsoft can
> make this really easy by providing __get_rtti_complete_object_locator(T).
In my previous posts in this thread, when trying to implement
'std::polyhandle' for the Microsoft compiler, I tried to find the
'RTTICompleteObjectLocator' for a polymorphic class by analysing
machine code at runtime. I got this working perfectly in Release mode
(because the optimised machine code is very minimal and easy to
parse), but it got a bit hairy in Debug mode with the more complex
machine code (which only got worse if you tweaked options like '-Oy'
and so on).
I searched and searched and searched for a way to get the
'RTTICompleteObjectLocator' for a type, but Microsoft hasn't
documented it publicly. My spidey senses are telling me though that
they probably have their own secret operator something like
'__get_rtti_complete_object_locator(T)'. They must have.
Anyway. I came up with another idea. The Microsoft C++ runtime library
has a function called '__RTCastToVoid', and it's what's used when you
perform a 'dynamic_cast' to a ' void * '. So here's what happens when
you do a 'dynamic_cast' to a ' void * ' on the Microsoft compiler:
Step 1: Adjust the object pointer by an offset so that it points at the
vtable pointer inside the object
Step 2: Pass this adjusted pointer to '__RTCastToVoid' (in the RCX
register on x64)
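As a portable illustration of the standard behaviour this relies on, here is a small sketch of what a 'dynamic_cast' to ' void * ' returns. The class names and the helper 'OffsetWithinCompleteObject' are made up for the example. Note that it measures where a base sub-object sits inside the complete object, which is not the same number as the offset to the vtable pointer discussed here; it only shows that the cast works from a sub-object pointer at a non-zero offset.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

struct Left  { virtual ~Left()  = default; int l = 1; };
struct Right { virtual ~Right() = default; int r = 2; };
struct Both : Left, Right { int b = 3; };

// Byte distance from the start of the complete object to the sub-object 'sub'.
// dynamic_cast to 'void const *' yields the address of the most-derived object.
template<class Base>
std::ptrdiff_t OffsetWithinCompleteObject(Base const &sub)
{
    void const *const complete = dynamic_cast<void const *>( std::addressof(sub) );
    return reinterpret_cast<char const *>( std::addressof(sub) )
         - static_cast<char const *>(complete);
}

void Example()
{
    Both obj;
    Right *const pr = &obj;  // points at the 'Right' sub-object inside 'obj'
    // Starting from the sub-object, the cast recovers the whole object.
    assert( dynamic_cast<void *>(pr) == static_cast<void *>(&obj) );
    // 'Left' and 'Right' occupy different places inside 'Both'.
    assert( OffsetWithinCompleteObject<Left>(obj) != OffsetWithinCompleteObject<Right>(obj) );
}
```

On the Microsoft compiler it is this cast that ends up calling '__RTCastToVoid'.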
So I was thinking . . . if I could somehow intercept an invocation of
'__RTCastToVoid', then I could compare my object pointer to the
pointer that gets passed to __RTCastToVoid. Subtract the former from
the latter and that's the number we need. So first I defined a
thread_local variable:
    thread_local void const *argument_to_RTCTV = nullptr;
And next I wrote my own interceptor for __RTCastToVoid as follows:
    extern "C" void *__RTCastToVoid(void *const arg) noexcept(false)
    {
        argument_to_RTCTV = arg;  // save this to do a subtraction later!
        // Find the real implementation in the runtime DLL and forward to it.
        // 'auto' keeps the module handle in whatever type the API returns.
        auto hRuntime = ::GetModuleHandleA("vcruntime140.dll");
        if ( nullptr == hRuntime ) hRuntime = ::LoadLibraryA("vcruntime140.dll");
        if ( nullptr == hRuntime ) return nullptr;
        auto const fp = (void*(*)(void *)) ::GetProcAddress(hRuntime, "__RTCastToVoid");
        if ( nullptr == fp ) return nullptr;
        return fp(arg);
    }
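The interceptor above looks up the real function on every call. One possible refinement, sketched here in portable C++ so it can be tried anywhere, is to do the lookup once and cache the result in a function-local static. Everything in this sketch is a stand-in: 'RealCastStandIn' plays the role of the runtime's own function, 'LookUpRealFunction' plays the role of the 'GetModuleHandleA' + 'GetProcAddress' sequence, and 'lookup_count' exists only to show that the lookup runs once.

```cpp
#include <cassert>

using CastFn = void *(*)(void *);

thread_local void const *last_argument = nullptr;  // plays the role of argument_to_RTCTV
int lookup_count = 0;                              // only here to show the lookup runs once

// Hypothetical stand-in for the real runtime function.
void *RealCastStandIn(void *const p) { return p; }

// Hypothetical stand-in for GetModuleHandleA + GetProcAddress.
CastFn LookUpRealFunction()
{
    ++lookup_count;
    return &RealCastStandIn;
}

// Record the argument, then forward to the real function.
void *RecordingCast(void *const arg)
{
    last_argument = arg;
    static CastFn const real = LookUpRealFunction();  // initialised on the first call only
    if ( nullptr == real ) return nullptr;
    return real(arg);
}

void Example()
{
    int x = 0;
    assert( RecordingCast(&x) == &x );  // forwarded result comes back unchanged
    assert( last_argument == &x );      // and the argument was recorded
    RecordingCast(&x);
    assert( lookup_count == 1 );        // the second call reused the cached pointer
}
```

Initialisation of a function-local static is thread-safe since C++11, so the cached pointer can be shared between threads while the recorded argument stays per-thread.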
And then I could write a function that gives you the
offset-to-the-vtable for any polymorphic class:
    template<class Tref>
    requires std::is_polymorphic_v< std::remove_cvref_t<Tref> >
    std::uint32_t GetOffsetToVftable(Tref &&obj)
    {
        typedef std::remove_cvref_t<Tref> T;
        T *const p = const_cast<T*>( std::addressof(obj) );
        (void)dynamic_cast<void*>(p);  // makes the runtime call our __RTCastToVoid
        return static_cast<std::uint32_t>( (char const*)argument_to_RTCTV - (char const*)p );
    }
And guess what. . . it works! Not only does it work... it works on
both x86 and x64, in both Debug mode and Release mode.... you can't
break it! Give the compiler whatever optimisation or debug options you
want, it's bullet-proof! Furthermore, sizeof(std::polyhandle) ==
sizeof(void*) is something I really want to achieve. Here's the
GodBolt:
https://godbolt.org/z/Px6xobT4E
So I can definitely get 'std::polyhandle' working properly for the
Microsoft compiler (even if it's a little inefficient). It will be
blistering fast when they provide us with
'__get_rtti_complete_object_locator(T)'.
I'm very keen to also get this working for new Apple Silicon computers
that do pointer authentication (i.e. -arch arm64e) but I don't have
access to a new Apple computer. The GitHub Actions runner for
'macos-15' doesn't support pointer authentication. I tried using QEMU
on macOS x86_64 inside a virtual machine but I didn't get anywhere.
If someone can provide me with SSH access to a new Apple Silicon
computer somewhere, I can remote in just to run binaries and test
them.
Received on 2025-06-07 16:39:33