ISOCPP std-proposals List: [std-proposals] Expose architecture at compile-time (and more at runtime)

From: Frederick Virchanza Gotham <cauldwell.thomas_at_[hidden]>
Date: Thu, 27 Oct 2022 11:22:40 +0100

We can already garner a lot of information about the architecture at
compile-time, for example:

* Minimal addressable memory unit == CHAR_BIT
* Whether it's 32-Bit or 64-Bit == sizeof(void*)
* Whether it's big or little endian == std::endian::native

Also we can do stuff like:

    #ifdef _WIN32
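
For reference, the items listed above can already be gathered into compile-time constants, roughly as sketched below. The name `build_info` and its members are hypothetical, made up for this illustration and not part of any proposal. `std::endian` needs a C++20 library, so the sketch falls back to "unknown" when it is missing.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

#if defined(__has_include)
#  if __has_include(<bit>)
#    include <bit>   // std::endian (C++20)
#  endif
#endif

// Hypothetical helper: collects what the language already exposes today.
struct build_info
{
    static constexpr int         bits_per_byte = CHAR_BIT;
    static constexpr std::size_t pointer_bits  = sizeof(void*) * CHAR_BIT;

#ifdef __cpp_lib_endian
    static constexpr bool endianness_known = true;
    static constexpr bool little_endian    = (std::endian::native == std::endian::little);
#else
    static constexpr bool endianness_known = false;  // pre-C++20 library
    static constexpr bool little_endian    = false;  // meaningless when unknown
#endif

#ifdef _WIN32
    static constexpr bool targets_windows = true;
#else
    static constexpr bool targets_windows = false;
#endif
};

static_assert(build_info::bits_per_byte >= 8, "CHAR_BIT is at least 8");
```

None of these constants names the instruction set itself, which is the gap the rest of this post is about.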

I propose that there should be a class called "std::this_arch", with
static data members which can be used as follows:

    int main(void)
    {
        cout << std::this_arch::core_instruction_set << endl;
    }

Furthermore there could be a few static member functions which get
information at runtime (e.g. extended instruction sets). For example,
if you compile your program for x86, and then at runtime you want to
determine if the CPU supports the extended instruction set SSE3, then
you could do something like:

    int main(void)
    {
        void *p = nullptr;

        if constexpr ( "x86" == std::this_arch::core_instruction_set ) // compile-time constant
        {
            if ( std::this_arch::x86::supports_SSE3() ) // not a compile-time constant
            {
                p = dlopen("complex_graphs_sse3.so.2", RTLD_NOW); // POSIX 'dlopen' also takes a mode flag
                assert( nullptr != p );
            }
        }
    }
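
For comparison, the runtime half of that check can be approximated today with compiler builtins. The sketch below assumes GCC or Clang targeting x86 and uses their `__builtin_cpu_init` and `__builtin_cpu_supports` builtins. The names `cpu_feature` and `query_sse3` are hypothetical stand-ins for the proposed `std::this_arch::x86::supports_SSE3()`.

```cpp
// Three-state answer, because on other compilers or architectures
// this sketch simply cannot tell.
enum class cpu_feature { absent, present, unknown };

// Hypothetical stand-in for the proposed std::this_arch::x86::supports_SSE3().
inline cpu_feature query_sse3()
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // harmless to call repeatedly
    return __builtin_cpu_supports("sse3") ? cpu_feature::present
                                          : cpu_feature::absent;
#else
    return cpu_feature::unknown;
#endif
}
```

The preprocessor test here is doing the job that `if constexpr` on `core_instruction_set` would do in the proposal.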

I'm not talking simply about the characteristics of the CPU, but also
how the compiler implements things. For example, the 'std::this_arch'
class could give information about the ABI of member function pointers
(e.g. Microsoft does it very differently to GNU), and also about
calling conventions (e.g. __stdcall, __fastcall).

With the architecture more exposed to the programmer, I also think
there should be accompanying features:

( Feature 1 ) Allow the use of the 'sizeof' operator on a function to
get the byte count of the function's machine code. This would allow us
to copy a function in memory, edit a constant in the machine code, set
the memory page as executable and execute it. Currently we can pull
this off by first doing an 'objdump' on the object file to see the
size of the function, but it would be nice if we could do all this in
C++ code. Of course this all only makes sense on computers where
program memory and data memory are the same thing (and sizeof(void*)
== sizeof(void (*)(void))) -- which is the case on the majority of
computers. Something like the following:

    int Func(void); // defined elsewhere

    int main(void)
    {
        char unsigned machine_code[ sizeof Func ];

        std::memcpy_from_prog_mem( machine_code, &Func, sizeof machine_code );

        for ( size_t i = 0u; i != (sizeof machine_code - 1u); ++i )
        {
            if ( (0x72 == machine_code[i]) && (0xa9 == machine_code[i+1u]) )
            {
                machine_code[i + 0u] = 0x25;
                machine_code[i + 1u] = 0xaa;

                Set_Page_Executable(machine_code); // 'mprotect' or 'VirtualProtect'

                std::prog_pointer_cast<int (*)(void)>(machine_code)();

                return 0;
            }
        }
    }

( Feature 2 ) Provide a function to convert a member function pointer to a
generic function pointer. The GNU 'g++' compiler can already do this
if the member function pointer is a compile-time constant:

            void (*p)(void) = reinterpret_cast<void(*)(void)>(&SomeClass::SomeMethod);

however it fails if it's not a compile-time constant:

            void (SomeClass:: *mp)(void) = &SomeClass::SomeMethod;

            void (*p)(void) = reinterpret_cast<void(*)(void)>(mp); // This doesn't give you the memory address

The LLVM 'clang++' compiler won't allow you to do this conversion at
all -- even if you try using reinterpret_cast or a C-style cast. I
spent a few hours yesterday trying to write a function for it:

#include <cstdint> // uintptr_t
#include <cstring> // memcpy

template<class T>
void (*address_of_member_function( void (T:: *const mp)(void) ))(void)
{
    static T obj; // I hope it has a default constructor

    std::uintptr_t n;
    std::memcpy(&n, &mp, sizeof n);

    void (*const *const v_table)(void) =
        *static_cast<void(*const **)(void)>(static_cast<void*>(&obj));

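    // Assumes the 64-bit Itanium C++ ABI and a virtual member function: 'n' is
    // then 1 + the byte offset into the v-table, so 'n >> 3u' is the slot index.
    return v_table[n >> 3u]; // yes this works on 'g++' and also 'clang++'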
}

It would be nice if "std::address_of_member_function" were provided by
the C++ standard library. By the way, the Standard already says that
it's fine to use a function pointer to store the address of *any*
non-member function irrespective of the return type and parameters.
The following is fine:

    extern int Func1(void);
    extern double Func2(int);

    void (*p)(void) = reinterpret_cast<void(*)(void)>(&Func1); // no loss of data here
    void (*q)(void) = reinterpret_cast<void(*)(void)>(&Func2); // no loss of data here
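
As a small illustration of that guarantee: a function pointer converted to another function pointer type and back compares equal to the original. It may only be called after converting it back to its real type. The helper names `to_generic` and `from_generic`, and the function bodies, are made up for this sketch.

```cpp
#include <cassert>

int    Func1()      { return 42; }       // hypothetical bodies for the sketch
double Func2(int x) { return x * 0.5; }

using generic_fn = void (*)();

// Stores any non-member function pointer in a generic slot.
template<class F>
generic_fn to_generic(F* f) { return reinterpret_cast<generic_fn>(f); }

// Recovers the pointer; the caller must name the original function type.
template<class F>
F* from_generic(generic_fn p) { return reinterpret_cast<F*>(p); }

inline void round_trip_demo()
{
    generic_fn p = to_generic(&Func1);
    generic_fn q = to_generic(&Func2);

    // Round trip preserves the pointer value ...
    assert(from_generic<int()>(p) == &Func1);
    // ... and the restored pointer is callable with its real signature.
    assert(from_generic<double(int)>(q)(8) == 4.0);
}
```

Calling `p()` or `q()` directly, without the cast back, would be undefined behaviour.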

The C++ programming language supports multiple paradigms, including both
procedural programming and object-orientated programming. I think we
should maintain and expand upon the low-level control that C++
inherited from C.

With regard to arguments such as, "But what if the compiler doesn't
use v-tables at all?", well these features would only be available if
applicable. We already have stuff like this in the language, for
example "std::uintptr_t" isn't guaranteed to exist.
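
The "only available if applicable" pattern already looks roughly like this for `std::uintptr_t`: `<cstdint>` defines `UINTPTR_MAX` only when the type itself is provided, so code can test for it. The names `has_uintptr` and `pointer_round_trips` are hypothetical, chosen for this sketch.

```cpp
#include <cstdint>  // defines UINTPTR_MAX only if std::uintptr_t exists

#ifdef UINTPTR_MAX
constexpr bool has_uintptr = true;

// Only declared where the optional type exists.
inline bool pointer_round_trips(void* p)
{
    const std::uintptr_t n = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<void*>(n) == p;
}
#else
constexpr bool has_uintptr = false;
#endif
```

A 'std::this_arch' facility could presumably be guarded the same way, with a macro or constant saying whether a given member is present.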

Received on 2022-10-27 10:22:52