ISOCPP std-proposals List: Re: [std-proposals] Expose architecture at compile-time (and more at runtime)

From: Jarrad Waterloo <descender76_at_[hidden]>
Date: Thu, 27 Oct 2022 21:25:12 -0400

Thank you for the alternate syntax. Can I have the option of mentioning it
and your name in a future version of the paper? Besides +, I am also not
opposed to &Base::Base::Foo.

Short of syntactical differences, what don't you like?

The Pro(s)
Get rid of a dual dispatch and sometimes an unnecessary thunk
Make all functions bindable to a free function pointer
Mitigates the bifurcation between free and member functions
In future function_ref api(s) being able to control exactly what is bound.
In future runtime trait polymorphic api(s) being able to control exactly
what is bound.
Likely too late to change: some object to a deducing this member function
having a free function pointer type. Personally I prefer it over the member
function pointer type, but if we had this functionality it could be
made consistent with member functions as they are.
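
For reference, a minimal sketch of what can be done today without the proposal. The names call_via_member_pointer, bind_base_foo and FreeFn are illustrative only and do not come from P2603. The pointer to member function still dispatches virtually, while the captureless lambda decays to a free function pointer whose qualified call reaches Base::Foo directly, at the price of a hand-written wrapper:

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string Foo() { return "Base::Foo"; }
};

struct Derived : Base {
    std::string Foo() override { return "Derived::Foo"; }
};

// A pointer to member function that names a virtual function still
// dispatches on the dynamic type of the object it is applied to.
inline std::string call_via_member_pointer(Base& b) {
    std::string (Base::*pmf)() = &Base::Foo;
    return (b.*pmf)();
}

// Hypothetical helper: a captureless lambda converts to an ordinary function
// pointer, and the qualified call inside it suppresses virtual dispatch.
using FreeFn = std::string (*)(Base&);
inline FreeFn bind_base_foo() {
    return +[](Base& b) { return b.Base::Foo(); };
}

// Small self-check showing the two behaviours side by side.
inline void check_dispatch() {
    Derived d;
    assert(call_via_member_pointer(d) == "Derived::Foo");
    assert(bind_base_foo()(d) == "Base::Foo");
}
```

The wrapper lambda is exactly the kind of extra function the first two pros aim to make unnecessary.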

The Con(s)
TODO

On Thu, Oct 27, 2022 at 8:50 PM Chris Ryan <chrisr98008_at_[hidden]> wrote:

> In https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2603r1.html
> You said: >"Even though the programmer explicitly stated that they want to
> call Base’s some_virtual_function, "
>
> Not to be argumentative but they did NOT 'explicitly state' that at all.
> They said they wanted to call the function of that name/signature, and
> since it was a virtual function, they got virtual functionality (just like
> any other use of that function) but in this case using a virtual thunk type
> mechanism (Base::`vcall'{0}').
>
> First, I don't like what you are proposing. But if you were to do it, the
> much cleaner (symmetrical) way to implement it, rather than adding more ugly
> syntax, would be to use (for lack of a better term) the decayed version,
> like the pointer to a lambda invoker (aka '+' in front of it). Using the
> '+' on a non-virtual function would just use the static linkage.
>
>
> #include <iostream>
>
> struct Base           { virtual void Foo() { std::cout << "Base::Foo()\n"; } };
> struct Derived : Base { virtual void Foo() { std::cout << "Derived::Foo()\n"; } };
>
> int main()
> {
>     auto usingLambda = []() { std::cout << "usingLambda\n"; };
>     // decay to pointer to invoker using '+'
>     auto usingInvoker = +[]() { std::cout << "usingInvoker\n"; };
>
>     usingLambda();
>     usingInvoker(); // via a pointer to function to the invoker
>
>     // [proposed example. Does not currently work]:
>     void(Base::*pDecay)()   = +&Base::Foo; // decay to static linkage address using '+'
>     void(Base::*pDefault)() = &Base::Foo;  // use default linkage
>
>     Derived d;
>
>     (d.*pDecay)();   // non-virtual direct call
>     (d.*pDefault)(); // virtual call using Base::`vcall'{0}'
> }
>
> On Thu, Oct 27, 2022 at 5:45 AM Jarrad Waterloo via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
>
>> Your second reason is captured in the following. Please give it a look;
>> your comments and support are welcome.
>>
>> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2603r1.html
>>
>> Another argument in favor of your first point, or at least one playing in
>> the same sandbox, is the need to standardize [more runtime] metadata in
>> order to improve efficiency and simplify the language.
>> It could live at a negative offset from each function or in a constexpr map
>> from function address to metadata; I lean toward the former because it is
>> needed at runtime for function pointers.
>> 1) maximum error type size and alignment for any given function; possibly
>> needed for unified error handling
>> 2) maximum error type size and alignment for an entire module; possibly
>> needed for unified error handling
>> 3) maximum size and alignment of coroutines; possibly for native
>> coroutines
>>
>> Some reasons against being able to create new functions at runtime using
>> existing functions as templates:
>> 1) Currently ?undefined behavior? and not permitted by many architectures
>> for security reasons though runtime code generation is one fundamental
>> thing assembly can do that C can't. Java and .NET even have this capability.
>> 2) This could violate the rule that functions are constants (constant
>> globals). New ones would need to be locals and would need a reference
>> counted pointer to manage their lifetime safely.
>>
>>
>> On Thu, Oct 27, 2022 at 6:22 AM Frederick Virchanza Gotham via
>> Std-Proposals <std-proposals_at_[hidden]> wrote:
>>
>>> We can already garner a lot of information about the architecture at
>>> compile-time, for example:
>>>
>>> * Minimal addressable memory unit == CHAR_BIT
>>> * Whether it's 32-Bit or 64-Bit == sizeof(void*)
>>> * Whether it's big or little endian == std::endian::native
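
As a rough sketch, the three queries listed above can be written as compile-time constants. The variable names are illustrative; std::endian needs C++20, so that part is guarded by its feature-test macro and is simply omitted on older toolchains:

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t
#if __has_include(<bit>)
#  include <bit>     // std::endian (C++20)
#endif

// Number of bits in the smallest addressable unit.
inline constexpr int bits_per_byte = CHAR_BIT;

// Width of an object pointer in bits; commonly 32 or 64.
inline constexpr std::size_t pointer_bits = sizeof(void*) * CHAR_BIT;

#if defined(__cpp_lib_endian)
// True on little-endian targets, false on big-endian or mixed-endian ones.
inline constexpr bool is_little_endian =
    (std::endian::native == std::endian::little);
#endif

static_assert(bits_per_byte >= 8, "the standard requires at least 8 bits per byte");
```

Note that sizeof(void*) only describes the size of object pointers; it is a common proxy for "32-bit or 64-bit" rather than a guarantee.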
>>>
>>> Also we can do stuff like:
>>>
>>> #ifdef _WIN32
>>>
>>> I propose that there should be a class called "std::this_arch", with
>>> static data members which can be used as follows:
>>>
>>> int main(void)
>>> {
>>>     std::cout << std::this_arch::core_instruction_set << std::endl;
>>> }
>>>
>>> Furthermore there could be a few static member functions which get
>>> information at runtime (e.g. extended instructions sets). For example,
>>> if you compile your program for x86, and then at runtime you want to
>>> determine if the CPU supports the extended instruction set SSE3, then
>>> you could do something like:
>>>
>>> int main(void)
>>> {
>>>     void *p = nullptr;
>>>
>>>     // compile-time constant
>>>     if constexpr ( "x86" == std::this_arch::core_instruction_set )
>>>     {
>>>         // not a compile-time constant
>>>         if ( std::this_arch::x86::supports_SSE3() )
>>>         {
>>>             // POSIX dlopen also takes a mode flag
>>>             p = dlopen("complex_graphs_sse3.so.2", RTLD_NOW);
>>>             assert( nullptr != p );
>>>         }
>>>     }
>>> }
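
For comparison, a hedged sketch of how the runtime half of this check can be approximated today. It assumes GCC or Clang targeting x86, where the __builtin_cpu_init and __builtin_cpu_supports builtins are available; on any other compiler or target this sketch simply reports false. The function names and the fallback library name are hypothetical:

```cpp
#include <cassert>

// Returns true when the running CPU reports SSE3 support.
// Assumption: GCC or Clang on x86; everywhere else this returns false.
inline bool cpu_supports_sse3() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                        // safe to call more than once
    return __builtin_cpu_supports("sse3") != 0;  // runtime CPUID-based query
#else
    return false;
#endif
}

// Picks which shared object to load. The SSE3 name comes from the example
// above; the generic fallback name is made up for illustration.
inline const char* choose_library() {
    return cpu_supports_sse3() ? "complex_graphs_sse3.so.2"
                               : "complex_graphs_generic.so.2";
}

// The answer must be stable across calls within one process.
inline void check_stable() {
    assert(cpu_supports_sse3() == cpu_supports_sse3());
}
```

The compiler-specific guard is the point: each toolchain spells this differently, which is the gap a standard facility would close.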
>>>
>>> I'm not talking simply about the characteristics of the CPU, but also
>>> how the compiler implements things. For example, the 'std::this_arch'
>>> class could give information about the ABI of member function pointers
>>> (e.g. Microsoft does it very differently to GNU), and also about
>>> calling conventions (e.g. __stdcall, __fastcall).
>>>
>>> With the architecture more exposed to the programmer, I also think
>>> there should be accompanying features:
>>>
>>> ( Feature 1 ) Allow the use of the 'sizeof' operator on a function to
>>> get the byte count of the function's machine code. This would allow us
>>> to copy a function in memory, edit a constant in the machine code, set
>>> the memory page as executable and execute it. Currently we can pull
>>> this off by first doing an 'objdump' on the object file to see the
>>> size of the function, but it would be nice if we could do all this in
>>> C++ code. Of course this all only makes sense on computers where
>>> program memory and data memory are the same thing (and sizeof(void*)
>>> == sizeof(void (*)(void))) -- which it is on the majority of
>>> computers. Something like as follows:
>>>
>>> int Func(void); // defined elsewhere
>>>
>>> int main(void)
>>> {
>>>     char unsigned machine_code[ sizeof Func ];
>>>
>>>     std::memcpy_from_prog_mem( machine_code, &Func, sizeof machine_code );
>>>
>>>     for ( size_t i = 0u; i != (sizeof machine_code - 1u); ++i )
>>>     {
>>>         if ( (0x72 == machine_code[i]) && (0xa9 == machine_code[i+1u]) )
>>>         {
>>>             machine_code[i + 0u] = 0x25;
>>>             machine_code[i + 1u] = 0xaa;
>>>
>>>             Set_Page_Executable(machine_code); // 'mprotect' or 'VirtualProtect'
>>>
>>>             std::prog_pointer_cast<int (*)(void)>(machine_code)();
>>>
>>>             return 0;
>>>         }
>>>     }
>>> }
>>>
>>>
>>> ( Feature 2 ) Provide a function to convert a member function pointer to a
>>> generic function pointer. The GNU 'g++' compiler can already do this
>>> if the member function pointer is a compile-time constant:
>>>
>>>     void (*p)(void) = reinterpret_cast<void(*)(void)>(&SomeClass::SomeMethod);
>>>
>>> however it fails if it's not a compile-time constant:
>>>
>>>     void (SomeClass:: *mp)(void) = &SomeClass::SomeMethod;
>>>
>>>     // This doesn't give you the memory address
>>>     void (*p)(void) = reinterpret_cast<void(*)(void)>(mp);
>>>
>>> The LLVM 'clang++' compiler won't allow you to do this conversion at
>>> all -- even if you try using reinterpret_cast or a C-style cast. I
>>> spent a few hours yesterday trying to write a function for it:
>>>
>>> #include <cstdint> // uintptr_t
>>> #include <cstring> // memcpy
>>>
>>> template<class T>
>>> void (*address_of_member_function( void (T:: *const mp)(void) ))(void)
>>> {
>>>     static T obj; // I hope it has a default constructor
>>>
>>>     std::uintptr_t n;
>>>     std::memcpy(&n, &mp, sizeof n);
>>>
>>>     void (*const *const v_table)(void) =
>>>         *static_cast<void(*const **)(void)>(static_cast<void*>(&obj));
>>>
>>>     return v_table[n >> 3u]; // yes this works on 'g++' and also 'clang++'
>>> }
>>>
>>> It would be nice if "std::address_of_member_function" were provided by
>>> the C++ standard library. By the way, the Standard already says that
>>> it's fine to use a function pointer to store the address of *any*
>>> non-member function irrespective of the return type and parameters.
>>> The following is fine:
>>>
>>> extern int Func1(void);
>>> extern double Func2(int);
>>>
>>> void (*p)(void) = reinterpret_cast<void(*)(void)>(&Func1); // no loss of data here
>>> void (*q)(void) = reinterpret_cast<void(*)(void)>(&Func2); // no loss of data here
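
A small sketch of that round-trip guarantee. The helper names to_generic and from_generic, and the two sample functions, are illustrative. The pointer value survives the conversion to another function pointer type and back, but it has to be cast back to its original type before it is called; calling through the generic type would be undefined behaviour:

```cpp
#include <cassert>

inline int answer() { return 42; }
inline double half(int x) { return x / 2.0; }

// Generic slot that can hold the address of any non-member function.
using GenericFn = void (*)();

// Store a function pointer of any type in the generic slot.
template <class Fn>
GenericFn to_generic(Fn* f) { return reinterpret_cast<GenericFn>(f); }

// Recover the original pointer; Fn must be the function's real type.
template <class Fn>
Fn* from_generic(GenericFn g) { return reinterpret_cast<Fn*>(g); }

// Round-trip both functions and call them through the restored pointers.
inline void check_round_trip() {
    GenericFn p = to_generic(&answer);
    GenericFn q = to_generic(&half);
    assert(from_generic<int()>(p)() == 42);
    assert(from_generic<double(int)>(q)(3) == 1.5);
}
```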
>>>
>>> The C++ programming language supports multiple paradigms, both
>>> procedural programming and object-orientated programming. I think we
>>> should maintain and expand upon the low-level control that C++
>>> inherited from C.
>>>
>>> With regard to arguments such as, "But what if the compiler doesn't
>>> use v-tables at all?", well these features would only be available if
>>> applicable. We already have stuff like this in the language, for
>>> example "std::uintptr_t" isn't guaranteed to exist.
>>> --
>>> Std-Proposals mailing list
>>> Std-Proposals_at_[hidden]
>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>>
>>
>

Received on 2022-10-28 01:25:25