Re: [std-proposals] Expose architecture at compile-time (and more at runtime)

From: Chris Ryan <chrisr98008_at_[hidden]>
Date: Thu, 27 Oct 2022 17:50:33 -0700

In https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2603r1.html
you said:
> "Even though the programmer explicitly stated that they want to call
> Base’s some_virtual_function,"

Not to be argumentative, but they did NOT 'explicitly state' that at all.
They said they wanted to call the function with that name and signature, and
since it is a virtual function, they got virtual dispatch (just like any
other use of that function), in this case through a virtual-call thunk
(Base::`vcall'{0}' in MSVC's naming).

First, I don't like what you are proposing. But if you were to do it, a
much cleaner (and symmetrical) way to implement it, rather than adding more
ugly syntax, would be to use what I will call, for lack of a better term,
the decayed version, like the unary '+' in front of a lambda that yields the
pointer to its invoker. Using '+' on a non-virtual function would just use
the static linkage.


#include <iostream>

struct Base { virtual void Foo() { std::cout << "Base::Foo()\n"; } };
struct Derived : Base { virtual void Foo() { std::cout << "Derived::Foo()\n"; } };

int main()
{
    auto usingLambda = []() { std::cout << "usingLambda\n"; };
    auto usingInvoker = +[]() { std::cout << "usingInvoker\n"; }; // decay to pointer to invoker using '+'

    usingLambda();
    usingInvoker(); // via a pointer to function to the invoker

    // [proposed example. Does not currently work]:
    void(Base::*pDecay)() = +&Base::Foo;  // decay to static linkage address using '+'
    void(Base::*pDefault)() = &Base::Foo; // use default linkage

    Derived d;

    (d.*pDecay)();   // non-virtual direct call
    (d.*pDefault)(); // virtual call using Base::`vcall'{0}'
}
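
For comparison, here is a minimal sketch of the two behaviours that already compile today: a pointer to a virtual member function dispatches on the dynamic type, while a qualified call names Base's implementation directly. The functions return strings instead of printing (a change made only so the result can be checked), and the helper names call_via_member_pointer, call_qualified and demo_dispatch are hypothetical.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string Foo() { return "Base::Foo()"; }
};

struct Derived : Base {
    std::string Foo() override { return "Derived::Foo()"; }
};

// A pointer to a virtual member function dispatches on the dynamic type.
inline std::string call_via_member_pointer(Base& b) {
    std::string (Base::*pDefault)() = &Base::Foo;
    return (b.*pDefault)();
}

// A qualified call names Base's implementation and suppresses virtual dispatch.
inline std::string call_qualified(Base& b) {
    return b.Base::Foo();
}

// Usage sketch: both calls on the same Derived object.
inline void demo_dispatch() {
    Derived d;
    assert(call_via_member_pointer(d) == "Derived::Foo()");
    assert(call_qualified(d) == "Base::Foo()");
}
```

The qualified form only works where the function name is written out at the call site; there is no way today to store that non-virtual choice in a pointer to member, which is the gap the '+' idea is aimed at.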

On Thu, Oct 27, 2022 at 5:45 AM Jarrad Waterloo via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> Your second reason is captured in the following. Please give it a look,
> your comments and your support.
>
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2603r1.html
>
> Another argument in favor of your first argument or at least playing in
> the same sandbox is the need to standardize [more runtime] metadata in
> order to improve efficiency and simplify the language.
> This could live at a negative offset from each function or in a constexpr
> map from function address to metadata, preferably the former because it is
> needed at runtime for function pointers.
> 1) maximum error type size and alignment for any given function; possibly
> needed for unified error handling
> 2) maximum error type size and alignment for an entire module; possibly
> needed for unified error handling
> 3) maximum size and alignment of coroutines; possibly for native coroutines
>
> Some reasons against being able to create new functions at runtime using
> existing functions as templates:
> 1) Currently "undefined behavior" and not permitted by many architectures
> for security reasons, though runtime code generation is one fundamental
> thing assembly can do that C can't. Java and .NET even have this capability.
> 2) This could violate that functions are constants; constant globals. New
> ones would need to be locals and would need a reference copied pointer to
> manage its life safely.
>
>
> On Thu, Oct 27, 2022 at 6:22 AM Frederick Virchanza Gotham via
> Std-Proposals <std-proposals_at_[hidden]> wrote:
>
>> We can already garner a lot of information about the architecture at
>> compile-time, for example:
>>
>> * Minimal addressable memory unit == CHAR_BIT
>> * Whether it's 32-Bit or 64-Bit == sizeof(void*)
>> * Whether it's big or little endian == std::endian::native
>>
>> Also we can do stuff like:
>>
>> #ifdef _WIN32
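
The compile-time facts listed above can be gathered into one constant today. The following is a rough sketch only: ArchSummary and summarize_arch are hypothetical names, not part of any proposal, and the endianness field is filled in only when the library reports std::endian (C++20) through the __cpp_lib_endian feature-test macro.

```cpp
#include <bit>      // std::endian, available from C++20
#include <cassert>
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t

// Hypothetical aggregate of compile-time architecture facts.
struct ArchSummary {
    int bits_per_byte;        // minimal addressable unit, in bits
    std::size_t pointer_bits; // width of void* in bits
    bool is_windows;          // true when _WIN32 is defined
    bool endianness_known;    // false when std::endian is unavailable
    bool is_little_endian;    // meaningful only if endianness_known
};

constexpr ArchSummary summarize_arch() {
    ArchSummary s{};
    s.bits_per_byte = CHAR_BIT;
    s.pointer_bits = sizeof(void*) * CHAR_BIT;
#ifdef _WIN32
    s.is_windows = true;
#endif
#ifdef __cpp_lib_endian
    s.endianness_known = true;
    s.is_little_endian = (std::endian::native == std::endian::little);
#endif
    return s;
}

// The standard requires a byte to hold at least 8 bits.
static_assert(summarize_arch().bits_per_byte >= 8, "CHAR_BIT is at least 8");
```

Everything here is evaluated at compile time, so the result can feed static_assert or if constexpr; what it cannot express is the instruction set name, which is the part the quoted proposal wants to add.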
>>
>> I propose that there should be a class called "std::this_arch", with
>> static data members which can be used as follows:
>>
>> int main(void)
>> {
>> cout << std::this_arch::core_instruction_set << endl;
>> }
>>
>> Furthermore there could be a few static member functions which get
>> information at runtime (e.g. extended instructions sets). For example,
>> if you compile your program for x86, and then at runtime you want to
>> determine if the CPU supports the extended instruction set SSE3, then
>> you could do something like:
>>
>> int main(void)
>> {
>>     void *p = nullptr;
>>
>>     if constexpr ( "x86" == std::this_arch::core_instruction_set ) // compile-time constant
>>     {
>>         if ( std::this_arch::x86::supports_SSE3() ) // not a compile-time constant
>>         {
>>             p = dlopen("complex_graphs_sse3.so.2", RTLD_NOW); // POSIX dlopen also takes a mode flag
>>             assert( nullptr != p );
>>         }
>>     }
>> }
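
Something close to this split between a compile-time target check and a runtime feature check can be approximated with existing compiler extensions. The sketch below assumes GCC or Clang on x86, where the builtin __builtin_cpu_supports is available; on any other compiler or target it simply reports false. The names targets_x86, cpu_supports_sse3 and pick_graph_library, and the fallback library name, are hypothetical.

```cpp
#include <cassert>

// Compile-time fact: does this translation unit target an x86 family CPU?
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
constexpr bool targets_x86 = true;
#else
constexpr bool targets_x86 = false;
#endif

// Runtime fact: does the CPU we are running on support SSE3?
// Uses a GCC/Clang builtin on x86; other toolchains get a conservative false.
inline bool cpu_supports_sse3() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("sse3") != 0;
#else
    return false;
#endif
}

// Chooses which shared library to load; the names are illustrative only.
inline const char* pick_graph_library() {
    if constexpr (targets_x86) {
        if (cpu_supports_sse3()) {
            return "complex_graphs_sse3.so.2";
        }
    }
    return "complex_graphs_generic.so.2"; // hypothetical fallback
}
```

The returned name would then be handed to dlopen or LoadLibrary. The preprocessor clutter is exactly what a standard facility would hide.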
>>
>> I'm not talking simply about the characteristics of the CPU, but also
>> how the compiler implements things. For example, the 'std::this_arch'
>> class could give information about the ABI of member function pointers
>> (e.g. Microsoft does it very differently to GNU), and also about
>> calling conventions (e.g. __stdcall, __fastcall).
>>
>> With the architecture more exposed to the programmer, I also think
>> there should be accompanying features:
>>
>> ( Feature 1 ) Allow the use of the 'sizeof' operator on a function to
>> get the byte count of the function's machine code. This would allow us
>> to copy a function in memory, edit a constant in the machine code, set
>> the memory page as executable and execute it. Currently we can pull
>> this off by first doing an 'objdump' on the object file to see the
>> size of the function, but it would be nice if we could do all this in
>> C++ code. Of course this all only makes sense on computers where
>> program memory and data memory are the same thing (and sizeof(void*)
>> == sizeof(void (*)(void))) -- which it is on the majority of
>> computers. Something like as follows:
>>
>> int Func(void); // defined elsewhere
>>
>> int main(void)
>> {
>>     char unsigned machine_code[ sizeof Func ];
>>
>>     std::memcpy_from_prog_mem( machine_code, &Func, sizeof machine_code );
>>
>>     for ( size_t i = 0u; i != (sizeof machine_code - 1u); ++i )
>>     {
>>         if ( (0x72 == machine_code[i]) && (0xa9 == machine_code[i+1u]) )
>>         {
>>             machine_code[i + 0u] = 0x25;
>>             machine_code[i + 1u] = 0xaa;
>>
>>             Set_Page_Executable(machine_code); // 'mprotect' or 'VirtualProtect'
>>
>>             std::prog_pointer_cast<int (*)(void)>(machine_code)();
>>
>>             return 0;
>>         }
>>     }
>> }
>>
>>
>> ( Feature 2 ) Provide a function to convert a member function pointer
>> to a generic function pointer. The GNU 'g++' compiler can already do
>> this if the member function pointer is a compile-time constant:
>>
>> void (*p)(void) =
>> reinterpret_cast<void(*)(void)>(&SomeClass::SomeMethod);
>>
>> however it fails if it's not a compile-time constant:
>>
>> void (SomeClass:: *mp)(void) = &SomeClass::SomeMethod;
>>
>> void (*p)(void) = reinterpret_cast<void(*)(void)>(mp); // This doesn't give you the memory address
>>
>> The LLVM 'clang++' compiler won't allow you to do this conversion at
>> all -- even if you try using reinterpret_cast or a C-style cast. I
>> spent a few hours yesterday trying to write a function for it:
>>
>> #include <cstdint> // uintptr_t
>> #include <cstring> // memcpy
>>
>> template<class T>
>> void (*address_of_member_function( void (T:: *const mp)(void) ))(void)
>> {
>>     static T obj; // I hope it has a default constructor
>>
>>     std::uintptr_t n;
>>     std::memcpy(&n, &mp, sizeof n);
>>
>>     void (*const *const v_table)(void) =
>>         *static_cast<void(*const **)(void)>(static_cast<void*>(&obj));
>>
>>     return v_table[n >> 3u]; // yes this works on 'g++' and also 'clang++'
>> }
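
When the member function is known at compile time, a plain function pointer can be obtained portably without touching the v-table. This is a different technique from the lookup quoted above, shown only as a sketch: it wraps the call in a function template instead of extracting the member's code address, so the resulting pointer takes the object as an explicit parameter. Counter, PlainFn, call_member and as_plain_function are hypothetical names.

```cpp
#include <cassert>

struct Counter {
    int value = 0;
    void bump() { ++value; }
};

// Ordinary function pointer type that receives the object explicitly.
using PlainFn = void (*)(Counter&);

// One free function is instantiated per member function named as Method.
template <void (Counter::*Method)()>
void call_member(Counter& c) {
    (c.*Method)();
}

// Returns a plain function pointer that forwards to Counter::bump.
inline PlainFn as_plain_function() {
    return &call_member<&Counter::bump>;
}

// Usage sketch: call the member twice through the plain pointer.
inline void demo_plain_call() {
    Counter c;
    PlainFn f = as_plain_function();
    f(c);
    f(c);
    assert(c.value == 2);
}
```

The limitation matches the one noted for 'g++' above: the member function pointer has to be a compile-time constant, since it is a template argument.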
>>
>> It would be nice if "std::address_of_member_function" were provided by
>> the C++ standard library. By the way, the Standard already says that
>> it's fine to use a function pointer to store the address of *any*
>> non-member function irrespective of the return type and parameters,
>> as long as it is cast back to its original type before being called.
>> The following is fine:
>>
>> extern int Func1(void);
>> extern double Func2(int);
>>
>> void (*p)(void) = reinterpret_cast<void(*)(void)>(&Func1); // no loss of data here
>> void (*q)(void) = reinterpret_cast<void(*)(void)>(&Func2); // no loss of data here
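
The round trip behind that guarantee can be sketched as follows. The stored pointer must be cast back to its original type before it is called; calling it through the generic type would be undefined behaviour. GenericFn, erase, restore, answer and half are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Generic storage type for any non-member function pointer.
using GenericFn = void (*)();

inline int answer() { return 42; }
inline double half(int x) { return x / 2.0; }

// Store a function pointer of any signature in the generic type.
template <class Fn>
GenericFn erase(Fn* f) {
    return reinterpret_cast<GenericFn>(f);
}

// Recover the original pointer; Fn must be the function's real type.
template <class Fn>
Fn* restore(GenericFn g) {
    return reinterpret_cast<Fn*>(g);
}

// Usage sketch: store two unrelated functions, restore them, call them.
inline void demo_round_trip() {
    GenericFn p = erase(&answer);
    GenericFn q = erase(&half);
    assert(restore<int()>(p) == &answer);   // same pointer value comes back
    assert(restore<int()>(p)() == 42);
    assert(restore<double(int)>(q)(5) == 2.5);
}
```

Nothing in the generic pointer records the original signature, so the caller has to track it separately.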
>>
>> The C++ programming language supports multiple paradigms, including
>> both procedural programming and object-oriented programming. I think
>> we should maintain and expand upon the low-level control that C++
>> inherited from C.
>>
>> With regard to arguments such as, "But what if the compiler doesn't
>> use v-tables at all?", well these features would only be available if
>> applicable. We already have stuff like this in the language, for
>> example "std::uintptr_t" isn't guaranteed to exist.
>> --
>> Std-Proposals mailing list
>> Std-Proposals_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>

Received on 2022-10-28 00:50:47