ISOCPP std-proposals List: Re: [std-proposals] Expose architecture at compile-time (and more at runtime)

From: Marcin Jaczewski <marcinjaczewski86_at_[hidden]>
Date: Fri, 28 Oct 2022 10:39:50 +0200

On Thu, 27 Oct 2022 at 12:22, Frederick Virchanza Gotham via Std-Proposals
<std-proposals_at_[hidden]> wrote:
>
> We can already garner a lot of information about the architecture at
> compile-time, for example:
>
> * Minimal addressable memory unit == CHAR_BIT
> * Whether it's 32-Bit or 64-Bit == sizeof(void*)
> * Whether it's big or little endian == std::endian::native
>
> Also we can do stuff like:
>
> #ifdef _WIN32
>
> I propose that there should be a class called "std::this_arch", with
> static data members which can be used as follows:
>
> int main(void)
> {
>     cout << std::this_arch::core_instruction_set << endl;
> }
>
> Furthermore there could be a few static member functions which get
> information at runtime (e.g. extended instructions sets). For example,
> if you compile your program for x86, and then at runtime you want to
> determine if the CPU supports the extended instruction set SSE3, then
> you could do something like:
>
> int main(void)
> {
>     void *p = nullptr;
>
>     if constexpr ( "x86" == std::this_arch::core_instruction_set ) // compile-time constant
>     {
>         if ( std::this_arch::x86::supports_SSE3() ) // not a compile-time constant
>         {
>             p = dlopen("complex_graphs_sse3.so.2");
>             assert( nullptr != p );
>         }
>     }
> }
>
> I'm not talking simply about the characteristics of the CPU, but also
> how the compiler implements things. For example, the 'std::this_arch'
> class could give information about the ABI of member function pointers
> (e.g. Microsoft does it very differently to GNU), and also about
> calling conventions (e.g. __stdcall, __fastcall).
>
> With the architecture more exposed to the programmer, I also think
> there should be accompanying features:
>
> ( Feature 1 ) Allow the use of the 'sizeof' operator on a function to
> get the byte count of the function's machine code. This would allow us
> to copy a function in memory, edit a constant in the machine code, set
> the memory page as executable and execute it. Currently we can pull
> this off by first doing an 'objdump' on the object file to see the
> size of the function, but it would be nice if we could do all this in
> C++ code. Of course this all only makes sense on computers where
> program memory and data memory are the same thing (and sizeof(void*)
> == sizeof(void (*)(void))) -- which it is on the majority of
> computers. Something like as follows:
>
> int Func(void); // defined elsewhere
>
> int main(void)
> {
>     char unsigned machine_code[ sizeof Func ];
>
>     std::memcpy_from_prog_mem( machine_code, &Func, sizeof machine_code );
>
>     for ( size_t i = 0u; i != (sizeof machine_code - 1u); ++i )
>     {
>         if ( (0x72 == machine_code[i]) && (0xa9 == machine_code[i+1u]) )
>         {
>             machine_code[i + 0u] = 0x25;
>             machine_code[i + 1u] = 0xaa;
>
>             Set_Page_Executable(machine_code); // 'mprotect' or 'VirtualProtect'
>
>             std::prog_pointer_cast<int (*)(void)>(machine_code)();
>
>             return;
>         }
>     }
> }
>

This should never work in C++: you can't assume anything about which
instructions the compiler emits.
There are cases where a complex function with hundreds of complex
statements is reduced to two instructions.
Or, in the other direction, a simple loop could be replaced with complex
SIMD operations.

Not to mention shifts in the binary caused by the optimization level,
a different compiler, or a different compiler version.
There could also be cases where a function contains co-processor or GPU code.
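
To make the first point concrete, here is a minimal sketch (the function
names are mine, purely for illustration). Both functions compute the same
value, and an optimizing compiler may turn the loop into something close to
the closed form, or into vectorized code, so the byte count and byte pattern
of `sum_loop` are not something source code can rely on:

```cpp
#include <cstdint>

// Hypothetical example: sum of 0..n-1 written as a plain loop.
// The machine code for this may look nothing like a loop after optimization.
constexpr std::uint64_t sum_loop(std::uint64_t n)
{
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i)
        total += i;
    return total;
}

// The same value in closed form (for n small enough that n * (n - 1)
// does not overflow).
constexpr std::uint64_t sum_closed(std::uint64_t n)
{
    return n == 0 ? 0 : n * (n - 1) / 2;
}

// The language only guarantees the observable result, not the instructions.
static_assert(sum_loop(10) == sum_closed(10), "both forms agree");
```

Searching the emitted code for a fixed byte pair, as in the quoted example,
depends on exactly the details the compiler is free to change.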

>
> ( 2 ) Provide a function to convert a member function pointer to a
> generic function pointer. The GNU 'g++' compiler can already do this
> if the member function pointer is a compile-time constant:
>
> void (*p)(void) = reinterpret_cast<void(*)(void)>(&SomeClass::SomeMethod);
>
> however it fails if it's not a compile-time constant:
>
> void (SomeClass:: *mp)(void) = &SomeClass::SomeMethod;
>
> void (*p)(void) = reinterpret_cast<void(*)(void)>(mp); // This doesn't give you the memory address
>
> The LLVM 'clang++' compiler won't allow you to do this conversion at
> all -- even if you try using reinterpret_cast or a C-style cast. I
> spent a few hours yesterday trying to write a function for it:
>
> #include <cstdint> // uintptr_t
> #include <cstring> // memcpy
>
> template<class T>
> void (*address_of_member_function( void (T:: *const mp)(void) ))(void)
> {
>     static T obj; // I hope it has a default constructor
>
>     std::uintptr_t n;
>     std::memcpy(&n, &mp, sizeof n);
>
>     void (*const *const v_table)(void) = *static_cast<void(*const **)(void)>(static_cast<void*>(&obj));
>
>     return v_table[n >> 3u]; // yes this works on 'g++' and also 'clang++'
> }

What exactly is the point of that? Besides, this is already possible with a
trivial template:

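```
#include <functional> // std::invoke

struct Type { void foo() {} }; // placeholder class, for illustration only

template<typename T, auto Foo>
auto MakeStatic(T p) // T could be made dependent, e.g. `get_arg_t<Foo>`
{
    return std::invoke(Foo, p);
}

int main()
{
    auto ptr = &MakeStatic<Type&, &Type::foo>; // plain function pointer: void (*)(Type&)
    (void)ptr; // silence the unused-variable warning
}
```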
Alternatively, you could employ stateless lambdas for this too.
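
A rough sketch of the lambda variant, assuming a hypothetical class `Type`
with a member function `foo` (both names are placeholders): a lambda with no
captures converts implicitly to an ordinary function pointer, so no cast and
no knowledge of the member-pointer ABI is needed.

```cpp
// Hypothetical class, for illustration only.
struct Type
{
    int value = 42;
    int foo() { return value; }
};

// A captureless lambda converts implicitly to a plain function pointer.
// The object is passed explicitly, replacing the hidden `this` argument.
int (*const call_foo)(Type&) = [](Type& obj) { return obj.foo(); };
```

`call_foo` can then be stored or passed anywhere an `int (*)(Type&)` is
expected, and `call_foo(obj)` calls `obj.foo()`.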

>
> It would be nice if "std::address_of_member_function" were provided by
> the C++ standard library. By the way, the Standard already says that
> it's fine to use a function pointer to store the address of *any*
> non-member function irrespective of the return type and parameters.
> The following is fine:
>
> extern int Func1(void);
> extern double Func2(int);
>
> void (*p)(void) = reinterpret_cast<void(*)(void)>(&Func1); // no loss of data here
> void (*q)(void) = reinterpret_cast<void(*)(void)>(&Func2); // no loss of data here
>
> The C++ programming language supports multiple paradigms, both
> procedural programming and object-orientated programming. I think we
> should maintain and expand upon the low-level control that C++
> inherited from C.
>
> With regard to arguments such as, "But what if the compiler doesn't
> use v-tables at all?", well these features would only be available if
> applicable. We already have stuff like this in the language, for
> example "std::uintptr_t" isn't guaranteed to exist.
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2022-10-28 08:39:59