PxxxxR0
std::polyhandle

New Proposal,

This version:
http://virjacode.com/papers/polyhandle_unpublished000.htm
Latest version:
http://virjacode.com/papers/polyhandle.htm
Author:
Thomas PK Healy <healytpk@vir7ja7code7.com> (Remove all sevens from email address)
Audience:
SG17, SG18
Project:
ISO/IEC 14882 Programming Languages — C++, ISO/IEC JTC1/SC22/WG21

Abstract

Add a class to the standard library to manage a handle to a polymorphic object, with zero overhead on most compilers.

1. Introduction

The C++ programming language provides powerful support for polymorphism via virtual method dispatch, runtime type identification, and dynamic casting. However, retrieving runtime type information or accessing the most-derived object requires compile-time knowledge of the static base types. This paper introduces std::polyhandle, a compact, type-erased polymorphic handle that encapsulates the identity of any polymorphic object in exactly one datum (void*), and allows runtime access to the address of the current sub-object, the address of the most-derived object, and the std::type_info of the most-derived object.

On platforms implementing the Itanium C++ ABI (such as GNU g++, LLVM clang++, Intel ICX), the construction of an std::polyhandle is a total no-op at runtime. On the Microsoft compiler, the constructor adds an offset to the address of the object in order to get the address of the object’s vtable pointer.

2. Motivation

There are many situations where it is useful to store an abstract reference to a polymorphic object in a type-erased form, such as in reflection, serialization, inspection, dynamic casting, and debugging tools. While std::any and other non-template classes offer type erasure, they require more memory to store additional information.

The key insight behind std::polyhandle is that all required runtime information about a polymorphic object is already available via the vtable pointer, and can be accessed with a single void*.

3. API Overview

class polyhandle {
    void *p;
public:
    template<class T> requires is_polymorphic_v< remove_cvref_t<T> >
    polyhandle(T &&obj) noexcept;

    void *object(void) const noexcept;
    void *most_derived(void) const noexcept;
    std::type_info const &typeinfo(void) const noexcept;
};

static_assert( sizeof (polyhandle) == sizeof (void*) );
static_assert( alignof(polyhandle) == alignof(void*) );

4. Compress three pointers into one

In order to fully describe a polymorphic object at runtime, one might store:

  1. The pointer to the current sub-object (void*)

  2. The pointer to the most-derived object (void*)

  3. The pointer to the most-derived object’s type_info

With std::polyhandle, all three can be recovered from a single pointer to the sub-object’s vtable pointer.

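On Itanium ABI systems, the vtable pointer within the object leads to enough information so that both the address of the most-derived object (via the vtable's offset-to-top entry) and the most-derived object's type_info (via the vtable's type_info entry) can be recovered.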

On Microsoft, std::polyhandle stores the adjusted pointer and calculates offsets using the data found within the RTTICompleteObjectLocator.

Hence, std::polyhandle effectively compresses the three pointers into one with no loss of information.
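For comparison, the uncompressed three-pointer representation can be sketched using only standard facilities. This is an illustration under stated assumptions, not part of the proposal: the names fat_handle and make_fat_handle and the example classes are hypothetical, and the sketch computes all three values eagerly instead of recovering them later from one pointer.

```cpp
#include <cassert>
#include <memory>     // addressof
#include <typeinfo>   // type_info, typeid

// Hypothetical three-pointer descriptor (not part of the proposal).
struct fat_handle {
    void                 *object;        // 1. current sub-object
    void                 *most_derived;  // 2. most-derived object
    std::type_info const *type;          // 3. type_info of the most-derived object
};

// T must be a non-const polymorphic class type.
template<class T>
fat_handle make_fat_handle(T &obj) noexcept
{
    T *const p = std::addressof(obj);
    return { p, dynamic_cast<void*>(p), &typeid(obj) };
}

// Hypothetical example hierarchy: with two non-empty polymorphic bases,
// the Right sub-object normally sits at a nonzero offset inside Derived.
struct Left    { virtual ~Left()  {} int a = 0; };
struct Right   { virtual ~Right() {} int b = 0; };
struct Derived : Left, Right {};

// The descriptor is strictly larger than the single void* of a polyhandle.
static_assert( sizeof(fat_handle) > sizeof(void*) );
```

Calling make_fat_handle on a Right& that refers to a Derived object yields the address of the Right sub-object, the address of the Derived object, and typeid(Derived): the same three facts that std::polyhandle is designed to recover from one pointer.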

5. ABI-Specific Behavior

5.1. Itanium ABI

The Itanium C++ ABI is used on most compilers, such as GNU g++, LLVM clang++, Intel ICX. The ABI defines the behavior of dynamic_cast<void*>(p) and typeid(*p) such that they rely on the vtable pointer embedded in the object, which conveniently is always located at address [base + 0x00].

Thus, on compilers which implement the Itanium ABI, std::polyhandle is nothing more than:

template<class T> requires is_polymorphic_v< remove_cvref_t<T> >
polyhandle(T &&obj) noexcept
{
    this->p = const_cast< remove_cvref_t<T> * >(  addressof(obj)  );
}

All subsequent queries (most_derived, typeinfo) consult the object’s vtable to find the required data.
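The two vtable lookups can be checked against dynamic_cast<void*> and typeid with a small sketch. It assumes the Itanium C++ ABI vtable layout (offset-to-top in the slot two pointers before the address point, type_info pointer in the slot immediately before it), pointer-sized slots, and unsigned (non-PAC) vtable pointers. The helper names load_vptr, offset_to_top, dynamic_type and most_derived_of, and the example classes, are hypothetical.

```cpp
#include <cassert>
#include <cstddef>    // ptrdiff_t
#include <cstring>    // memcpy
#include <typeinfo>   // type_info

#ifdef _MSC_VER
#error "This sketch assumes the Itanium C++ ABI vtable layout"
#endif

// Read the vtable pointer stored at offset 0 of a polymorphic sub-object.
// memcpy is used so the read does not depend on the pointer's declared type.
inline char const *load_vptr(void const *subobject) noexcept
{
    char const *vptr;
    std::memcpy(&vptr, subobject, sizeof vptr);
    return vptr;
}

// Slot two pointers before the address point: signed offset-to-top.
inline std::ptrdiff_t offset_to_top(void const *subobject) noexcept
{
    std::ptrdiff_t off;
    std::memcpy(&off, load_vptr(subobject) - 2 * sizeof(void*), sizeof off);
    return off;
}

// Slot one pointer before the address point: type_info of the dynamic type.
inline std::type_info const &dynamic_type(void const *subobject) noexcept
{
    std::type_info const *ti;
    std::memcpy(&ti, load_vptr(subobject) - sizeof(void*), sizeof ti);
    return *ti;
}

// Address of the most-derived object, computed from the sub-object alone.
inline void const *most_derived_of(void const *subobject) noexcept
{
    return static_cast<char const*>(subobject) + offset_to_top(subobject);
}

// Hypothetical example hierarchy used to exercise the helpers.
struct Left    { virtual ~Left()  {} int a = 0; };
struct Right   { virtual ~Right() {} int b = 0; };
struct Derived : Left, Right {};
```

For a Right& bound to a Derived object, most_derived_of should agree with dynamic_cast<void const*>, and dynamic_type should compare equal to typeid(Derived). Note that the offset-to-top is zero or negative, which is why it is read as a signed ptrdiff_t.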

5.2. Microsoft

Microsoft’s ABI requires a small adjustment, since the vtable pointer inside a polymorphic object might not be located at address [base + 0x00].

To support full recovery of the most-derived address and dynamic std::type_info, the constructor of std::polyhandle, instead of storing the address of the current object, stores the address of the current object’s vtable pointer, as follows:

template<class T> requires is_polymorphic_v< remove_cvref_t<T> >
polyhandle(T &&obj) noexcept
{
    // Pointer arithmetic is done on char* because it is ill-formed on void*
    this->p = reinterpret_cast<char*>( const_cast< remove_cvref_t<T> * >(addressof(obj)) )
            + __get_vtable_pointer_offset( remove_cvref_t<T> );
}

Note that the Microsoft compiler doesn’t currently provide a built-in operator, __get_vtable_pointer_offset, and so the implementation for Microsoft in this document intercepts calls to __RTCastToVoid in order to ascertain the class type’s offset to the vtable pointer.

All subsequent queries (most_derived, typeinfo) consult the object’s RTTICompleteObjectLocator to find the required data.
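To make the indices used with the RTTICompleteObjectLocator easier to follow, here is a sketch of its leading fields viewed as an array of 32-bit values. The layout is not officially documented by Microsoft; the field names and meanings below are assumptions based on common descriptions of the MSVC runtime, and the struct name complete_object_locator is hypothetical.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, size_t
#include <cstdint>   // uint32_t

// Assumed layout of the leading fields of RTTICompleteObjectLocator.
struct complete_object_locator {
    std::uint32_t signature;         // [0] assumed: 0 on 32-bit, 1 on 64-bit images
    std::uint32_t offset;            // [1] offset of this vtable pointer within the complete object
    std::uint32_t cd_offset;         // [2] constructor displacement offset
    std::uint32_t type_descriptor;   // [3] type_info: address (32-bit) or image-relative offset (64-bit)
    std::uint32_t class_descriptor;  // [4] class hierarchy descriptor
};

// Index of a field when the locator is read as an array of uint32_t.
constexpr std::size_t index_of(std::size_t byte_offset) noexcept
{
    return byte_offset / sizeof(std::uint32_t);
}

static_assert( index_of(offsetof(complete_object_locator, offset))          == 1 );
static_assert( index_of(offsetof(complete_object_locator, cd_offset))       == 2 );
static_assert( index_of(offsetof(complete_object_locator, type_descriptor)) == 3 );
```

Under these assumptions, an implementation that reads element [1] of the locator is reading the offset field, element [2] is the constructor displacement, and element [3] is the type descriptor, which on 64-bit images has to be added to the base address of the containing module.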

5.3. Apple arm64e

Apple computers with an Apple Silicon CPU compile C++ code for the arm64e architecture, which uses Pointer Authentication Code (PAC) technology for the 64-bit ARM instruction set (also known as the aarch64 instruction set). On these Apple machines, the vtable pointer inside an object is signed with a pointer authentication code, and must be authenticated using a 16-bit number called a discriminator before it can be used.

Therefore, inside the constructor for std::polyhandle, this 16-bit number must be stored somewhere in order to access the vtable later. 64-bit ARM CPUs can access a maximum of 512 terabytes of memory, and therefore only 49 bits of a pointer are needed. This leaves 15 spare bits, one short of the 16 needed for the discriminator. Furthermore, as any polymorphic object on the 64-bit Itanium C++ ABI will start with a vtable pointer, the alignment of any polymorphic class is always >= 8, and therefore the lowest 3 bits of the object’s address will always be zero. This gives a total of 15 + 3 = 18 bits, which is enough to store the 16-bit discriminator.
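The bit budget above can be illustrated with plain integer arithmetic. The following is only a sketch of one possible packing, assuming a 49-bit, 8-byte-aligned address as described; it operates on std::uint64_t values, not real pointers, and says nothing about whether the upper bits are actually free on a given system. The function names and the choice of which discriminator bits go where are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned      address_bits = 49;     // assumed usable address width
constexpr std::uint64_t low_mask     = 0x7;    // 3 bits freed by 8-byte alignment
constexpr std::uint64_t address_mask =
    ((std::uint64_t{1} << address_bits) - 1) & ~low_mask;

// Precondition: address fits in 49 bits and is 8-byte aligned.
constexpr std::uint64_t pack(std::uint64_t address, std::uint16_t disc) noexcept
{
    return (address & address_mask)
         | (std::uint64_t{disc} & low_mask)               // disc bits 0..2  -> bits 0..2
         | ((std::uint64_t{disc} >> 3) << address_bits);  // disc bits 3..15 -> bits 49..61
}

constexpr std::uint64_t unpack_address(std::uint64_t packed) noexcept
{
    return packed & address_mask;
}

constexpr std::uint16_t unpack_discriminator(std::uint64_t packed) noexcept
{
    return static_cast<std::uint16_t>(
        (packed & low_mask) | ((packed >> address_bits) << 3) );
}

// Round trip, checked at compile time.
static_assert( unpack_address(pack(0x0001'2345'6789'ABC8, 0xBEEF)) == 0x0001'2345'6789'ABC8 );
static_assert( unpack_discriminator(pack(0x0001'2345'6789'ABC8, 0xBEEF)) == 0xBEEF );
```

This packing uses 16 of the 18 available bits and leaves bits 62 and 63 untouched.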

The upper 15 bits of a pointer might however be used for Pointer Authentication Code (PAC), Memory Tagging Extension (MTE) or Address Space Layout Randomization (ASLR). I’m hoping I can get around this by applying attributes to the pointer, such as:

class polyhandle {
    void *p __attribute__((no_pac,no_mte));
public:
    . . .
    . . .
};

I’m eager to test this out but I don’t currently have access to a new Apple Silicon computer. Email me if you can give me SSH access to such a machine to test binaries.

6. Properties

7. Usage Example

#include <iostream>    // cout, endl
#include <polyhandle>  // polyhandle
using std::cout, std::endl;

struct Base1 { void *p; };               // not polymorphic
struct Base2 { virtual ~Base2(){} };     // polymorphic
struct Derived : Base1, virtual Base2 { virtual ~Derived(){} };

struct Base1z { void *p[600]; };         // not polymorphic
struct Base2z { virtual ~Base2z(){} };   // polymorphic
struct Base3z { virtual ~Base3z(){} };   // polymorphic
struct Derivedz : Base1z, virtual Base2z, virtual Base3z { virtual ~Derivedz(){} };

int main(void)
{
    Derived obj;
    Base2 &b2 = obj;
    std::polyhandle p(b2);
    cout << (void*)&obj << " == " << p.most_derived() << endl;
    cout << (void*)&b2  << " == " << p.object()       << endl;
    cout << p.typeinfo().name() << endl;

    cout << endl;

    Derivedz objz;
    Base3z &b3z = objz;
    std::polyhandle pz(b3z);
    cout << (void*)&objz << " == " << pz.most_derived() << endl;
    cout << (void*)&b3z  << " == " << pz.object()       << endl;
    cout << pz.typeinfo().name() << endl;
}

8. Implementations

8.1. Itanium ABI

#include <cstddef>        // ptrdiff_t
#include <memory>         // addressof
#include <type_traits>    // is_polymorphic, remove_cvref
#include <typeinfo>       // type_info

namespace std {
class polyhandle final {
    void *p;
public:
    template<class Tref> requires is_polymorphic_v< remove_cvref_t<Tref> >
    constexpr polyhandle(Tref &&obj) noexcept
      : p( const_cast< remove_cvref_t<Tref> * >(addressof(obj)) ) {}

    constexpr void *object(void) const noexcept { return this->p; }

    constexpr void *most_derived(void) const noexcept
    {
        // vtable[-2] is the offset-to-top, a signed value (zero or negative)
        return static_cast<char*>(this->p) + static_cast<ptrdiff_t**>(this->p)[0][-2];
    }

    constexpr type_info const &typeinfo(void) const noexcept
    {
        return static_cast<type_info***>(this->p)[0][-1][0];
    }
};

static_assert( sizeof (polyhandle) == sizeof (void*) );
static_assert( alignof(polyhandle) == alignof(void*) );

}  // close namespace std

Tested and working on Godbolt: https://godbolt.org/z/4vsdjhP7d

8.2. Microsoft

#include <cstdint>        // uint32_t
#include <memory>         // addressof
#include <type_traits>    // is_polymorphic, remove_cvref

extern "C" {
    // Including <Windows.h> is too much
    void *__stdcall GetModuleHandleA(char const*);
    void *__stdcall LoadLibraryA    (char const*);
    void *__stdcall GetProcAddress  (void*,char const*);
}

namespace std {
class polyhandle final {
    void *p;

    // We will use a thread_local variable to keep track of
    // the address that gets passed to __RTCastToVoid, because
    // this address will have been adjusted by the offset to
    // the location of the vtable pointer inside the object.
    inline static thread_local void const *argument_to_RTCTV = nullptr;
    friend void *::__RTCastToVoid(void *const arg) noexcept(false);

    template<class Tref>
    requires std::is_polymorphic_v< std::remove_cvref_t<Tref> >
    std::uint32_t GetOffsetToVftable(Tref &&obj)
    {
        typedef std::remove_cvref_t<Tref> T;
        T *const p = const_cast<T*>( std::addressof(obj) );
        (void)dynamic_cast<void*>(p);
        return (char*)argument_to_RTCTV - (char*)p;
    }

public:
    template<class Tref> requires is_polymorphic_v< remove_cvref_t<Tref> >
    constexpr polyhandle(Tref &&obj) noexcept
    {
        p = (char*)addressof(obj) + GetOffsetToVftable(obj);
    }

    constexpr void *object(void) const noexcept
    {
        uint32_t **const pvtable = *static_cast<uint32_t***>(this->p);
        return (char*)this->p - pvtable[-1][2];
    }

    void *most_derived(void) const noexcept
    {
        uint32_t **const pvtable = *static_cast<uint32_t***>(this->p);
        return (char*)this->p - pvtable[-1][1];
    }

    type_info const &typeinfo(void) const noexcept
    {
        uint32_t const n = static_cast<uint32_t***>(this->p)[0][-1][3];

#ifdef _WIN64
        return *(type_info*)(  (char*)GetModuleHandleA(nullptr) + n  );
#else
        return *(type_info*)n;
#endif
    }
};

static_assert( sizeof (polyhandle) == sizeof (void*) );
static_assert( alignof(polyhandle) == alignof(void*) );

} // close namespace std

// We intercept calls to this function:
extern "C" {
inline void *__RTCastToVoid(void *const arg) noexcept(false)
{
    std::polyhandle::argument_to_RTCTV = arg;  // save this to do a subtraction later!

    void *hRuntime = ::GetModuleHandleA("vcruntime140.dll");
    if ( nullptr == hRuntime ) hRuntime = ::LoadLibraryA("vcruntime140.dll");
    if ( nullptr == hRuntime ) return nullptr;
    auto const fp = (void*(*)(void *)) ::GetProcAddress(hRuntime, "__RTCastToVoid");
    if ( nullptr == fp ) return nullptr;
    return fp(arg);
}
}

Tested and working on Godbolt: https://godbolt.org/z/P9dzbhcq5

9. Use Cases

10. Conclusion

std::polyhandle provides a zero-overhead, type-erased handle to polymorphic objects that works transparently across platforms. On Itanium ABI systems, the construction of an std::polyhandle is a complete no-op. On Microsoft, it uses ABI-compliant mechanisms to extract and reconstruct the necessary information.