On Fri, Apr 14, 2023 at 9:45 AM Breno Guimarães via Std-Proposals <std-proposals@lists.isocpp.org> wrote:
It looks like there is work around supporting JIT in C++: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1609r1.html
I'm not sure what the status of that is.

On Fri, Apr 14, 2023 at 10:19 AM Frederick Virchanza Gotham via Std-Proposals <std-proposals@lists.isocpp.org> wrote:
Since C++11, there has been an implicit conversion from a lambda to a
function pointer so long as the lambda has no captures. If the lambda
has captures, the implicit conversion is disabled. However it's easy to
get a function pointer from a lambda-with-captures if we use global
variables or the heap, something like:

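    #include <functional>  // std::function
    #include <iostream>    // cout, endl
    using std::cout;
    using std::endl;

    std::function<void(void)> f;  // global object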

    void Func(void)
    {
        f();  // invokes the global object
    }

    void Some_Library_Func(void (*const pf)(void))
    {
        pf();
    }

    int main(int argc, char **argv)
    {
        auto mylambda = [argc](void) -> void
          {
            cout << "Hello " << argc << "!" << endl;
          };

        f = mylambda;

        Some_Library_Func(Func);
    }
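
For reference, here is a minimal sketch of the captureless conversion
mentioned at the top, next to a check that the conversion is absent once
a capture is added. The names 'IntCallback', 'Apply', 'DemoCaptureless'
and 'CapturingConverts' are made up for illustration only:

```cpp
#include <type_traits>

using IntCallback = int (*)(int);

// Stand-in for a C-style API that only accepts a plain function pointer.
inline int Apply(IntCallback cb, int x) { return cb(x); }

// No captures: the closure converts implicitly to IntCallback.
inline int DemoCaptureless()
{
    IntCallback cb = [](int v) { return v * 2; };
    return Apply(cb, 21);
}

// With a capture, the closure type has no conversion to IntCallback.
inline bool CapturingConverts()
{
    int k = 3;
    auto lam = [k](int v) { return v * k; };
    (void)lam;  // only the closure type is inspected
    return std::is_convertible<decltype(lam), IntCallback>::value;
}
```
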

It is possible though to procure a normal function pointer from a
lambda-with-captures without making use of global variables or the heap
-- it can all be kept on the stack.

To invoke a capture lambda, we need two pieces of data:
  Datum A: The address of the lambda object
  Datum B: The address of the 'operator()' member function

Datum A is a pointer into data memory.
Datum B is a pointer into code memory.

The technique described in this post will only work on CPUs where the
program counter can be set to an address in data memory, and therefore
we will use 'void*' for Datum B rather than 'void(*)(void)'. I'm open
to correction here but I think this technique will work on every
implementation of C++ in existence today, even on microcontrollers such
as the Texas Instruments F28069 and the Arduino Atmel sam3x8e.

We will define a simple POD struct to hold these two pieces of data:

    struct LambdaInfo {
        void *data, *code;
    };

Let's write a function that invokes a capture lambda, passing the 'this'
pointer as the first argument to the member function:

    void InvokeLambda(LambdaInfo const *const p)
    {
        void (*pf)(void*) = (void (*)(void*))p->code;

        return pf(p->data);
    }

And now let's check what this got compiled to on an x86_64 computer:

       mov    rdx,QWORD PTR [rdi]
       mov    rax,QWORD PTR [rdi+0x8]
       mov    rdi,rdx
       jmp    rax

What we've got here is four instructions. To see a little more clearly
what's going on here, I'm going to replace the function arguments with
numerical constants:

    void InvokeLambda(void)
    {
        void (*pf)(void*) = (void (*)(void*))0x1122334455667788;

        return pf( (void*)0x99aabbccddeeffee );
    }

gets compiled to:

       movabs rdi,0x99aabbccddeeffee
       movabs rax,0x1122334455667788
       jmp    rax

What we've got here now is three simple instructions. Here's the
assembler alongside the machine code:

    movabs rdi,0x99aabbccddeeffee   48 bf ee ff ee dd cc bb aa 99
    movabs rax,0x1122334455667788   48 b8 88 77 66 55 44 33 22 11
    jmp    rax                      ff e0

What we have here is 22 bytes worth of CPU instructions, which we can
put into a byte array as follows:

    char unsigned instructions[22u] = {
        0x48, 0xBF,
        0xEE, 0xFF, 0xEE, 0xDD, 0xCC, 0xBB, 0xAA, 0x99,
        0x48, 0xB8,
        0x88, 0x77, 0x66, 0x55, 0x44, 0x33, 0x22, 0x11,
        0xFF, 0xE0,
    };
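
As a rough sketch, the same 22 bytes could be produced for any pair of
64-bit values by a small helper. 'MakeThunkBytes' is a hypothetical name,
and the byte layout is simply the x86_64 encoding listed above (the
immediates are stored least-significant byte first):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: builds the bytes for
//     movabs rdi, data ; movabs rax, code ; jmp rax
// The result is plain data; nothing here executes it.
inline std::array<unsigned char, 22> MakeThunkBytes(std::uint64_t data,
                                                    std::uint64_t code)
{
    std::array<unsigned char, 22> b{};

    b[0] = 0x48; b[1] = 0xBF;    // movabs rdi, imm64
    b[10] = 0x48; b[11] = 0xB8;  // movabs rax, imm64
    b[20] = 0xFF; b[21] = 0xE0;  // jmp rax

    // Write each immediate in little-endian order, independent of the host.
    for (std::size_t i = 0; i < 8; ++i) {
        b[2 + i]  = static_cast<unsigned char>((data >> (8 * i)) & 0xFFu);
        b[12 + i] = static_cast<unsigned char>((code >> (8 * i)) & 0xFFu);
    }
    return b;
}
```

Calling it with 0x99aabbccddeeffee and 0x1122334455667788 should give
back exactly the array shown above.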

This 22-byte array can be our thunk. I'll write a class to manage the
thunk:

    class LambdaThunk {

        char unsigned instructions[22u];

        void SetData(void const volatile *const p) volatile
        {
            char unsigned const volatile *const q =
                (char unsigned const volatile *)&p;

            this->instructions[2] = q[0];
            this->instructions[3] = q[1];
            this->instructions[4] = q[2];
            this->instructions[5] = q[3];
            this->instructions[6] = q[4];
            this->instructions[7] = q[5];
            this->instructions[8] = q[6];
            this->instructions[9] = q[7];
        }

        void SetCode(void const volatile *const p) volatile
        {
            char unsigned const volatile *const q =
                (char unsigned const volatile *)&p;

            this->instructions[12] = q[0];
            this->instructions[13] = q[1];
            this->instructions[14] = q[2];
            this->instructions[15] = q[3];
            this->instructions[16] = q[4];
            this->instructions[17] = q[5];
            this->instructions[18] = q[6];
            this->instructions[19] = q[7];
        }

    public:

        LambdaThunk(void)  // set the opcodes
        {
            this->instructions[ 0u] = 0x48u;  // movabs rdi
            this->instructions[ 1u] = 0xBFu;
            this->instructions[10u] = 0x48u;  // movabs rax
            this->instructions[11u] = 0xB8u;
            this->instructions[20u] = 0xFFu;  // jmp rax
            this->instructions[21u] = 0xE0u;
        }

        template<typename LambdaType>
        void AdaptFrom(LambdaType &arg) volatile
        {
            this->SetData(&arg);
            this->SetCode( (void*)&LambdaType::operator() );
            // The previous line relies on a GNU g++ extension (casting a
            // pointer-to-member-function to 'void*'); not standard C++
        }

        template<typename LambdaType>
        LambdaThunk(LambdaType &arg) : LambdaThunk() // set opcodes
        {
            this->AdaptFrom<LambdaType>(arg);
        }

        void (*getfuncptr(void) const volatile)(void)
        {
            return (void(*)(void))&this->instructions;
        }
    };

And now let's write some test code to try it out:

    #include <iostream>  // cout, endl
    using std::cout;
    using std::endl;

    void Some_Library_Func( void (*const pf)(void) )
    {
        pf();
    }

    int main(int argc, char **argv)
    {
        auto mylambda = [argc](void) -> void
          {
            std::cout << "Hello " << argc << "!" << std::endl;
          };

        Some_Library_Func( LambdaThunk(mylambda).getfuncptr() );

        cout << "Last line in Main" << endl;
    }

This works fine; you can see it up on Godbolt here:

    https://godbolt.org/z/r84hEsG1G
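
One aside on Datum B: the cast of '&LambdaType::operator()' to 'void*'
is compiler-specific. A possible alternative, sketched here under the
assumption that an extra hop through a small function template is
acceptable, is to take the address of an ordinary function that
receives the lambda's address and calls it. This is a swapped-in
technique, not what the thunk class above does, and the names
'InvokeThroughThis', 'PortableLambdaInfo', 'MakeLambdaInfo' and
'DemoPortableInfo' are hypothetical:

```cpp
// Ordinary function with the signature void(void*); its address can be
// taken in standard C++ and used as the "code" pointer.
template <typename LambdaType>
void InvokeThroughThis(void* self)
{
    (*static_cast<LambdaType*>(self))();
}

struct PortableLambdaInfo {
    void* data;           // address of the lambda object
    void (*code)(void*);  // address of the trampoline for its type
};

template <typename LambdaType>
PortableLambdaInfo MakeLambdaInfo(LambdaType& arg)
{
    return { static_cast<void*>(&arg), &InvokeThroughThis<LambdaType> };
}

// Usage: invoke a capturing lambda through the two stored pointers.
inline int DemoPortableInfo(int value)
{
    int seen = 0;
    auto lam = [value, &seen] { seen = value; };

    PortableLambdaInfo info = MakeLambdaInfo(lam);
    info.code(info.data);  // same shape of call as InvokeLambda above
    return seen;
}
```
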

Things get a little more complicated if the lambda has a return value
and several parameters. For example, if the lambda returns a struct
containing 17 ints, 33 doubles, a function pointer and a std::string,
and if the lambda takes 18 parameters, then the assembler for
'InvokeLambda' is a little more complicated:

    #include <string>  // std::string

    struct ReturnType {
        int a[17];
        double b[33];
        void (*c)(int);
        std::string d;
    };

    ReturnType InvokeLambda(int arg1, double arg2, float arg3,
                     int arg4, double arg5, float arg6,
                     int arg7, double arg8, float arg9,
                     int arg10, double arg11, float arg12,
                     int arg13, double arg14, float arg15,
                     int arg16, double arg17, float arg18)
    {
        ReturnType (*pf)(void volatile *,int,double,float,
                                         int,double,float,
                                         int,double,float,
                                         int,double,float,
                                         int,double,float,
                                         int,double,float)
         = (ReturnType (*volatile)(void volatile*,
                                   int,double,float,
                                   int,double,float,
                                   int,double,float,
                                   int,double,float,
                                   int,double,float,
                                   int,double,float))0x1122334455667788;

        return pf( (void volatile *volatile)0x99aabbccddeeffee,
                   arg1,arg2,arg3,arg4,arg5,arg6,
                   arg7,arg8,arg9,arg10,arg11,arg12,
                   arg13,arg14,arg15,arg16,arg17,arg18);
    }


gets compiled to:

    push   rbx
    movss  xmm8,DWORD PTR [rsp+0x30]
    mov    rbx,rdi
    sub    rsp,0x8
    movss  DWORD PTR [rsp],xmm8
    push   QWORD PTR [rsp+0x30]
    mov    eax,DWORD PTR [rsp+0x30]
    push   rax
    movss  xmm8,DWORD PTR [rsp+0x30]
    movabs rax,0x1122334455667788
    sub    rsp,0x8
    movss  DWORD PTR [rsp],xmm8
    push   QWORD PTR [rsp+0x30]
    push   r9
    mov    r9d,r8d
    mov    r8d,ecx
    mov    ecx,edx
    mov    edx,esi
    movabs rsi,0x99aabbccddeeffee
    call   rax
    mov    rax,rbx
    add    rsp,0x30
    pop    rbx
    ret

which is 94 bytes worth of instructions instead of 22. Still manageable.

I'm not suggesting that a lambda-with-captures should convert implicitly
to a function pointer, because then we'd have the issue of the lifetime
of the thunk. But as a stepping-stone, maybe the standard library could
provide a function or class that does what I've done above?

I know there are some badly designed C APIs that take a callback that is specified solely as a function pointer and necessitate hacks to create new functions at runtime, but adding a feature to the standard library that makes it easier to use those APIs will just encourage people to write more of them. The proper solution is for the function that accepts the function pointer to also accept an additional `void*` cookie that will be passed to the function pointer upon invocation.

Other than that, I can't see any legitimate use cases for this proposal.

There are also substantial wording issues that would be involved in order to support any proposal that allows functions to be dynamically created and destroyed. (Note that platform-specific facilities such as `dlopen` are outside the scope of the standard, so the standard doesn't need to specify what happens to your program if you use them.)
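
To illustrate the cookie-based design described above, here is a minimal
sketch. The names 'CookieCallback', 'Library_Func_With_Cookie',
'CallWithLambda' and 'DemoCookie' are hypothetical and do not refer to
any real API:

```cpp
// Hypothetical C-style API: the callback receives an opaque cookie that
// the caller supplied alongside the function pointer.
using CookieCallback = void (*)(void* cookie);

inline void Library_Func_With_Cookie(CookieCallback pf, void* cookie)
{
    pf(cookie);
}

// Adapts any capturing lambda: a captureless lambda serves as the
// function pointer, and the lambda's address travels as the cookie.
template <typename LambdaType>
void CallWithLambda(LambdaType& lambda)
{
    Library_Func_With_Cookie(
        [](void* cookie) { (*static_cast<LambdaType*>(cookie))(); },
        &lambda);
}

// Usage: the lambda lives on the stack; no globals, heap or thunk needed.
inline int DemoCookie(int argc)
{
    int seen = 0;
    auto mylambda = [argc, &seen] { seen = argc; };
    CallWithLambda(mylambda);
    return seen;
}
```

No code is generated at runtime here, so the questions about creating
and destroying functions dynamically do not arise.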
 

--
Brian Bi