std-proposals

Re: [std-proposals] Standard support for different ABI's for class vtables

From: Simon Schröder <dr.simon.schroeder_at_[hidden]>
Date: Mon, 2 Jun 2025 19:04:22 +0200
Here is another idea to solve the problem: make modules portable. I believe Microsoft already has a draft of a module interchange format that could be used to exchange modules between different compilers (and, as an added bonus, these portable modules would hopefully also work with the same compiler using different compiler flags). Portable modules could specify calling conventions and how vtables are implemented (most likely just by specifying which compiler compiled the module, or by specifying the target ABI).

When a module is imported, the compiler knows the ABI to be used for a specific type or function. However, it would be impossible for one compiler to generate a mixed module with different ABIs for different types. Also, there would be no guarantee that MSVC would be able to consume GCC modules (unless those were compiled to target the MSVC ABI). So, sometimes this might be a one-way street.

> On Jun 2, 2025, at 4:43 PM, Frederick Virchanza Gotham via Std-Proposals <std-proposals_at_[hidden]> wrote:
>
> Okay there's been talk in this thread about how the Standard doesn't
> mandate the use of vtables nor even acknowledge the existence of
> vtables. That's fair enough, we can keep things abstract here. An
> object of a polymorphic class type will have a link to some sort of
> "polymorphic facilitator" -- which today in 2025 for all C++ compilers
> is a vtable. But we can stay abstract and call it a polymorphic
> facilitator.
>
> The polymorphic facilitator can be inside the object -- e.g. in the
> form of a pointer to the vtable, or alternatively there could be a
> global container something like:
>
> std::map< void*, void* > g_pointers_to_polymorphic_facilitators;
> // make sure to protect with mutex
>
> Any system of mapping an object to its polymorphic facilitator is fine
> so long as the compiler, when it's given a pointer to an object of
> type T, knows how to find the polymorphic facilitator. Although the
> global 'std::map' method might not work with trivial destructors
> because the destructor would have to update the map (thereby making
> itself nontrivial) -- and so maybe the polymorphic facilitator (or the
> link to the polymorphic facilitator) has to be inside the polymorphic
> object.
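
A minimal sketch of the side-table idea quoted above, assuming the "polymorphic facilitator" is just a std::type_info pointer. The names FacilitatorTable, table() and Widget are hypothetical and only illustrate why the destructor stops being trivial once it has to update the map:

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <type_traits>
#include <typeinfo>

// Hypothetical side table: object address -> "polymorphic facilitator".
// Assumption: the facilitator is modelled as a std::type_info pointer.
class FacilitatorTable {
public:
    void add(void const* obj, std::type_info const* facilitator) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[obj] = facilitator;
    }
    void remove(void const* obj) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_.erase(obj);
    }
    std::type_info const* find(void const* obj) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto const it = map_.find(obj);
        return it == map_.end() ? nullptr : it->second;
    }
private:
    mutable std::mutex mutex_;
    std::map<void const*, std::type_info const*> map_;
};

inline FacilitatorTable& table() {
    static FacilitatorTable instance;
    return instance;
}

// A type with no link stored inside the object; the link lives in the table.
struct Widget {
    int value = 0;
    Widget() { table().add(this, &typeid(Widget)); }
    Widget(Widget const& other) : value(other.value) { table().add(this, &typeid(Widget)); }
    ~Widget() { table().remove(this); }  // updating the map makes the destructor nontrivial
};

static_assert(!std::is_trivially_destructible<Widget>::value,
              "the destructor has to unregister the object");
```

With this in place, table().find(&w) returns &typeid(Widget) for a live Widget w and nullptr for an address that was never registered.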
>
> 99% of C++ compilers make this system very simple -- any polymorphic
> object will always have a pointer to its polymorphic facilitator
> located at address [base + 0x00] inside the object. What this means,
> is that when we have a "void*", we _can_ actually do the following two
> things even though the Standard doesn't let us:
>
> (1) Get the most-derived object (i.e. dynamic_cast to void*)
> (2) Get the type_info
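
For comparison, both operations are already standard C++ when the pointer has a polymorphic static type; the gap described above only concerns a bare "void*". A small sketch, with Shape, Named and Circle as made-up example types:

```cpp
#include <cassert>
#include <typeinfo>

// Made-up example hierarchy with two polymorphic bases.
struct Shape { virtual ~Shape() = default; int id = 0; };
struct Named { virtual ~Named() = default; int tag = 0; };
struct Circle : Shape, Named { double radius = 1.0; };

// (1) Most-derived object: dynamic_cast to (cv) void* is allowed for
//     pointers to polymorphic class types.
inline void const* most_derived(Named const* p) {
    return dynamic_cast<void const*>(p);
}

// (2) type_info of the dynamic type: typeid applied to a glvalue of
//     polymorphic class type is evaluated at run time.
inline std::type_info const& dynamic_type(Named const* p) {
    return typeid(*p);
}
```

Given a Circle c and a Named* pointing at it, most_derived returns the address of c and dynamic_type returns typeid(Circle). Neither call compiles if the parameter type is changed to void const*.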
>
> Here are these two features coded up on GodBolt:
>
> https://godbolt.org/z/ajcnb8qda
>
> The above GodBolt works on every C++ compiler ever made. Except for
> one. Microsoft.
>
> On Microsoft it will work properly the vast majority of the time, but
> sometimes it will crash because of the following fact:
> "Whereas 99% of compilers have a uniform way of mapping
> an object to its polymorphic facilitator, the Microsoft
> compiler does not have a uniform way -- it can differ by type."
>
> To bring that a bit more down to Earth:
>
> "The Microsoft compiler doesn't always place the
> vtable pointer at the very beginning of the object."
>
> and here's an example in the following GodBolt:
>
> https://godbolt.org/z/44j7or1rG
>
> The above GodBolt shows that the Microsoft compiler places the vtable
> pointer _after_ the non-polymorphic base, specifically at [base +
> 0x08].
>
> So it's because of the Microsoft compiler -- and _only_ because of the
> Microsoft compiler -- that we can't do the two things I talk about
> above. But last night I figured out a possible solution to this
> conundrum.
>
> On the Microsoft compiler, every type has an RTTICompleteObjectLocator
> (sort of like Microsoft's very own personal form of 'std::type_info'):
>
> struct RTTICompleteObjectLocator {
>     uint32_t signature;  // Always 0 for MSVC
>     uint32_t offset;     // Offset of the vtable within the complete object
>     uint32_t cdOffset;   // Constructor displacement offset
>     struct TypeDescriptor* pTypeDescriptor;  // Pointer to type_info structure
>     struct _RTTIClassHierarchyDescriptor* pClassDescriptor;  // Inheritance hierarchy
> };
>
> Do you see that second member? It's the number that we need in order
> to find the vtable inside a polymorphic object. Here comes the fun
> part, the reverse-engineering -- which is necessary because of the
> burden Microsoft has placed on the C++ standardisation process.
>
> This would be really easy if Microsoft provided an operator or
> function that gave us the RTTICompleteObjectLocator for any given
> polymorphic type. But they haven't made it that easy (or maybe it _is_
> that easy and they just haven't publicly documented the feature).
> There isn't even any link going from the 'std::type_info' to the
> RTTICompleteObjectLocator either. What I have been able to ascertain
> though, is how to determine the linker symbol for the
> RTTICompleteObjectLocator object, as follows:
>
> Name of class: MyClass
> Mangled name of its RTTICompleteObjectLocator: ??_R4MyClass@@6B@
> ??_R4 — Prefix indicating an RTTI Complete Object Locator.
> MyClass@@ — The class name, with MSVC-style name mangling.
> 6B@ — Suffix indicating the type of RTTI structure.
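
The naming pattern listed above can be written down as a one-line helper. This is only a sketch of the pattern as stated in the quoted message: it assumes a plain, non-template class at global scope, and rtti_locator_symbol is a hypothetical name. It does not cover namespaces, nested classes or templates, which is the difficulty raised further down.

```cpp
#include <cassert>
#include <string>

// Assumption: class_name is a plain class at global scope (no namespace,
// no nesting, no template arguments), following the pattern
//   ??_R4 + <class name> + @@ + 6B@
inline std::string rtti_locator_symbol(std::string const& class_name) {
    return "??_R4" + class_name + "@@6B@";
}
```

For "MyClass" this yields the symbol shown above, ??_R4MyClass@@6B@.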
>
> Then inside a VC++ source file we can access the
> RTTICompleteObjectLocator for 'MyClass' as follows:
>
> extern "C" void const *const _rtti_locator_MyClass;
> #pragma comment(linker, "/include:??_R4MyClass@@6B@")
>
> Now this is all well and good until we come to templates. How do we
> get the RTTICompleteObjectLocator of the following type:
>
> std::vector< int, MyOwnPersonalAllocatorType >
>
> That one would be tricky.
>
> So . . . instead of trying to work with the linker symbol for the
> RTTICompleteObjectLocator, I had an idea come to me last night.
> Consider the following function:
>
> template<class T>
> std::type_info const &GetTypeInfo(T &&obj)
> {
> return typeid(obj);
> }
>
> The above template function is able to get the 'std::type_info' for
> any polymorphic type. Therefore, the machine code produced for the
> above template function must contain within it the offset to the
> vtable. Let's write a GodBolt to see what assembler we get:
>
> https://godbolt.org/z/Eq7EbbaPK
>
> Here's the assembler we get for GetTypeInfo<Derived1&>:
>
> mov QWORD PTR [rsp+8], rcx    ; Store rcx ('this' pointer) on stack
> sub rsp, 40                   ; Allocate stack space
> mov rcx, QWORD PTR obj$[rsp]  ; Load object pointer into rcx (prepare for typeid call)
> call __RTtypeid               ; Call MSVC's runtime type identification function
> add rsp, 40                   ; Restore stack space after function call
> ret
>
> Nothing too crazy going on in the above assembler because the vtable
> pointer is at [base + 0x00].
>
> But now let's look at the assembler for GetTypeInfo<Derived2&>. The
> assembler you'll see on GodBolt checks if the address is null, but
> since we're dealing with a reference instead of a pointer, I've
> removed the null check, leaving us with the following reduced
> assembler:
>
> mov QWORD PTR [rsp+8], rcx      ; Store rcx ('this' pointer) on stack
> sub rsp, 56                     ; Allocate stack space
> mov rax, QWORD PTR obj$[rsp]    ; Load object pointer into rax
> mov rax, QWORD PTR [rax+8]      ; Get vtable pointer from the object
> movsxd rax, DWORD PTR [rax+4]   ; Sign-extend an offset value from vtable (possibly RTTI)
> mov rcx, QWORD PTR obj$[rsp]    ; Reload object pointer into rcx
> lea rax, QWORD PTR [rcx+rax+8]  ; Compute final RTTI pointer using offset
> mov QWORD PTR tv78[rsp], rax    ; Store computed RTTI pointer in tv78
> mov rcx, QWORD PTR tv78[rsp]    ; Load the computed RTTI pointer into rcx
> call __RTtypeid                 ; Call runtime type identification function
> add rsp, 56                     ; Restore stack space after function call
> ret
>
> Do you see that fourth instruction? This one:
>
> mov rax, QWORD PTR [rax+8] ; Get vtable pointer from the object
>
> This is what we need. That number 8 is the offset. So at runtime we
> can analyse the machine code of GetTypeInfo<Derived2&>, and look for
> the first instruction where it adds a numeric constant to the register
> RAX. But first let's go to the website,
> "https://defuse.ca/online-x86-assembler.htm", and type that
> instruction in to get the machine code, it gives us back:
>
> constexpr char unsigned instruction[] = { 0x48, 0x8B, 0x40, 0x08 };
> // the last byte is the offset 8
>
> So now let's write a function to pluck out the offset from the
> function's machine code:
>
> unsigned PluckOutOffset(void const *const pv)
> {
>     constexpr char unsigned instruction[] = { 0x48, 0x8B, 0x40, /*0x08*/ };
>     char unsigned const *const p = static_cast<char unsigned const*>(pv);
>
>     for ( unsigned n = 0u; ; ++n )
>     {
>         if ( p[n + 0] == instruction[0] &&
>              p[n + 1] == instruction[1] &&
>              p[n + 2] == instruction[2] )
>         {
>             return p[n + 3];
>         }
>     }
> }
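
The loop above has no upper bound, so it keeps reading past the end of the function if the pattern never appears. A bounded variant could look like the sketch below. It works on a caller-supplied byte range (how to obtain the real size of a compiled function is left open here), and find_disp8_after_mov is a hypothetical name:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Searches [code, code + size) for the bytes 48 8B 40, i.e. the encoding of
// "mov rax, QWORD PTR [rax+disp8]", and returns the displacement byte that
// follows. Returns std::nullopt if the pattern is absent or cut off.
inline std::optional<unsigned> find_disp8_after_mov(unsigned char const* code,
                                                    std::size_t size) {
    constexpr std::array<unsigned char, 3> prefix{0x48, 0x8B, 0x40};
    unsigned char const* const end = code + size;
    unsigned char const* const hit =
        std::search(code, end, prefix.begin(), prefix.end());
    // One more byte is needed after the three-byte prefix.
    if (hit == end || end - hit < 4) return std::nullopt;
    return static_cast<unsigned>(hit[3]);
}
```

Like the original, this only recognises the one-byte displacement form of the instruction; larger offsets are encoded differently and would not match.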
>
> And now we can write a function, which given any object of polymorphic
> type, can give us back the offset of the vtable inside the object, as
> follows:
>
> template<class T>
> requires std::is_polymorphic_v< std::remove_cvref_t<T> >
> unsigned GetVTableOffset(T &&obj)
> {
>     char unsigned const *const p =
>         (char unsigned const*)&GetTypeInfo< std::remove_cvref_t<T>& >;
>
>     // Check if it's a very short function with
>     // a return instruction at position 26
>     if ( (0xc3==p[26]) && (0xcc==p[27]) ) return 0u;
>
>     // Okay so we have a long function, let's pluck out the offset:
>     return PluckOutOffset(p);
> }
>
> We are 50% of the way there. We are able to get the vtable pointer
> from the object pointer. But now we need to go the other way: we need
> to get the object pointer from the vtable pointer, as follows:
>
> void *VTable_Pointer_to_Object_Pointer(void const *const pvtable)
> {
>     // The RTTICompleteObjectLocator is located sizeof(void*) bytes before the vtable:
>     unsigned const *const locator = *(*(unsigned***)pvtable - 1);
>     // The member 'offset' is the second int inside the RTTICompleteObjectLocator:
>     unsigned const *const pn = locator + 1;
>     return (char*)pvtable - *pn;
> }
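
The pointer arithmetic in the function above can be tried out without MSVC by using a mock layout. Everything in this sketch is a stand-in: MockLocator, MockObject and handle_to_object are hypothetical, and the mock table holds only locator pointers, whereas a real vtable holds function pointers after the locator slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the first three members of RTTICompleteObjectLocator.
struct MockLocator {
    std::uint32_t signature;
    std::uint32_t offset;    // offset of the vtable pointer inside the object
    std::uint32_t cdOffset;
};

using MockVTableSlot = MockLocator const*;

// Stand-in for an object whose vtable pointer is not at offset 0.
struct MockObject {
    std::uint64_t leading_data;   // data placed before the vtable pointer
    MockVTableSlot const* vptr;   // points at slot 0; slot -1 holds the locator
};

// 'handle' is the address of the vtable pointer inside the object.
inline void const* handle_to_object(void const* handle) {
    MockVTableSlot const* const vptr =
        *static_cast<MockVTableSlot const* const*>(handle);
    MockLocator const* const locator = *(vptr - 1);  // one slot before the table
    return static_cast<char const*>(handle) - locator->offset;
}
```

With a locator whose offset equals offsetof(MockObject, vptr) and a table whose first element points at that locator, handle_to_object(&obj.vptr) gives back the address of obj.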
>
> So now on the Microsoft compiler, we have a way of:
> (1) Getting the vtable pointer from any pointer-to-object
> (2) Getting the object pointer from any pointer-to-vtable
>
> So now let's write a new class, "std::polymorph_handle", which under
> the hood is just a "void*", and which stores the address of the
> pointer-to-vtable inside an object:
>
> struct polymorph_handle {
>     void *p;
>
>     template<class T>
>     requires std::is_polymorphic_v< std::remove_cvref_t<T> >
>     polymorph_handle(T &&arg)
>     {
>         // 'arg' is a reference, so take its address before adding the byte offset
>         p = (char*)&arg + GetVTableOffset(arg);
>     }
>
>     void *get_pointer_to_object(void) const noexcept
>     {
>         return VTable_Pointer_to_Object_Pointer(p);
>     }
> };
>
> So now let's create a global container of polymorph_handle's:
>
> std::vector<polymorph_handle> mypolymorphs;
>
> And let's populate it with all different kinds of polymorphic objects,
> some with their vtable at [base + 0x00], and some with their vtable at
> another location such as [base + 0x08]. Here we go:
>
> https://godbolt.org/z/jWez7dzjc
>
> The above GodBolt works but it's by no means perfect. I wouldn't have
> to analyse the machine code if Microsoft provided a built-in operator
> such as:
>
> __get_rtti_complete_object_locator(T)
>
> (Maybe they actually have such an operator but it's not publicly documented)
>
> Now that we have a working implementation of std::polymorph_handle, we
> can now do the following two things:
>
> (1) Get the most derived object from an std::polymorph_handle
> (2) Get the type_info from an std::polymorph_handle
>
> We could implement these features as member functions as follows:
>
> struct polymorph_handle {
>
>     void *p;
>
>     . . .
>     . . .
>     . . .
>
>     std::type_info const &GetTypeInfo(void) const noexcept
>     {
>         struct Dummy { virtual ~Dummy(void) noexcept = default; };
>
>         Dummy *const pdummy = static_cast<Dummy*>(this->p);
>
>         return typeid(*pdummy);
>     }
>
>     void *GetMostDerived(void) const noexcept
>     {
>         struct Dummy { virtual ~Dummy(void) noexcept = default; };
>
>         Dummy *const pdummy = static_cast<Dummy*>(this->p);
>
>         return dynamic_cast<void*>(pdummy);
>     }
> };
>
> And here's a GodBolt testing it out:
>
> https://godbolt.org/z/6KMzEqaT9
>
> You'll be able to break it though by testing out different classes
> because my code to analyse the machine code isn't perfect. I tried to
> use it with a 'stringstream' and its sub-objects 'istream' and
> 'ostream' but it crashed -- presumably because I got the wrong offset
> for the vtable pointer. But it works in principle, and Microsoft can
> make this really easy by providing
> __get_rtti_complete_object_locator(T).
>
> So . . . whereas previously you dealt with a container of " void *
> " representing polymorphic objects, you can now use "polymorph_handle"
> instead, which has zero overhead on 99% of compilers, and only a tiny
> bit of overhead on the Microsoft compiler -- unless of course it can
> be made consteval.
>
> But here's the icing on the cake: It's not even an ABI break for 99%
> of compilers. On 99%, the " void * " for the whole object is the same
> as the pointer held by the polymorph_handle. And when it comes to
> Microsoft, well we're making something possible that was previously
> impossible, so you can't have an ABI break on a new feature!
>
> I might write a paper proposing the addition of std::polymorph_handle
> to the language.
>
> P.S. I'll admit that all that assembler messing took me about 6 hours
> to get right
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2025-06-02 17:04:39