Re: [std-proposals] Calling C++ functions in a .so directly from foreign languages and C++ as an interface definition language

From: Sebastian Wittmeier <wittmeier_at_[hidden]>
Date: Sun, 31 Aug 2025 04:45:54 +0200
You want to standardize Itanium name mangling and some other parts. But they are already standardized, as the Itanium C++ ABI.

You want to standardize it in the C++ standard, not separately, although
 - each implementation of C++ may have its own name mangling
 - it goes into areas not covered by the C++ standard

And the reason is the hope that people would create C++ bindings because the Itanium ABI would then be more official?

-----Original Message-----
From: Adrian Johnston via Std-Proposals <std-proposals_at_[hidden]>
Sent: Sun 31.08.2025 01:55
Subject: [std-proposals] Calling C++ functions in a .so directly from foreign languages and C++ as an interface definition language
To: std-proposals_at_[hidden];
CC: Adrian Johnston <adrian3_at_[hidden]>;

Hello,

Calling a C++ API implemented in a shared object (.so/.dll) from a foreign language should be convenient instead of impossible. Foreign languages and their libraries are often implemented in C, so providing a mechanism for C to use the C++ function calling convention without requiring a translation layer around everything would reduce friction.

I would like to propose we somehow standardize what is part of the Itanium C++ ABI's section 5.1 External Names (a.k.a. Mangling) and certain other parts. This should allow a script to call C++ constructors/destructors on a buffer, use C++ operators and pass data by reference to C++. Providing C++ reflection data in a format accessible to a foreign language is also discussed as a second part to this proposal below. Once again, please forgive my contrarian tendencies.

https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling

This proposal is the result of an experiment I did lately where I used libclang to reimplement a C++ object model in Python by parsing C++ headers and then generating a script that could call the C++ symbols in a .so directly.
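As a rough illustration of the mangling scheme referenced above, the sketch below builds the Itanium external name for a constructor of a class at global scope. The helper `mangle_constructor` and the table `BUILTIN_CODES` are hypothetical names introduced here, not part of the proposal or of the author's script, and the class name `OperatorTest` is used only as an example. It covers a tiny subset of the grammar (no namespaces, templates, pointers, references or substitutions).

```python
# Hypothetical helper: handles only constructors of a global-scope class
# whose parameters are builtin types. Real mangling has many more rules.

# Itanium <builtin-type> codes for a few fundamental types.
BUILTIN_CODES = {
    "void": "v",
    "bool": "b",
    "char": "c",
    "int": "i",
    "unsigned int": "j",
    "long": "l",
    "float": "f",
    "double": "d",
}


def mangle_constructor(class_name, param_types=()):
    """Return the mangled name of the complete-object constructor (C1)."""
    # A <source-name> is the identifier prefixed by its decimal length.
    source_name = f"{len(class_name)}{class_name}"
    # An empty parameter list is encoded as a single 'v' (void).
    params = "".join(BUILTIN_CODES[t] for t in param_types) or "v"
    # "_Z" prefix, then a nested name N...E holding the class and "C1".
    return f"_ZN{source_name}C1E{params}"


print(mangle_constructor("OperatorTest"))           # _ZN12OperatorTestC1Ev
print(mangle_constructor("OperatorTest", ["int"]))  # _ZN12OperatorTestC1Ei
```

A foreign language needs to reproduce exactly this kind of encoding, for every type it passes, before it can even look up the right symbol in the shared object.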
The main challenge was that I had to hijack Python's support for calling C functions and use implementation-defined knowledge of the C++ ABI to do so. The only real problem I found was that it is impossible to return an object with a non-trivial destructor by value from a C++ function to a foreign language without that language knowing the C++ calling convention.

The motivation for this experiment was that the state of the art for binding C++ to Python is not too great. If you are willing to write a lot of C wrapper code using the Python C API then you will have a good solution. Otherwise, if you want to use C++ based script binding tools, the ones I tried have performance issues. I was burning 50% of my compile time and 50% of my .so size. I hate to name names, because they seem well written, are more mature than my efforts and represent a lot of effort, but that was pybind11 and nanobind. Possibly things will improve if someone uses the new C++ reflection code to generate bindings without using templates.

Overall, having a tool to write out my C++ script bindings in 0.5 seconds using C++ as an interface definition language was a great experience by comparison. The point of using a scripting language is to be able to iterate quickly and this helped enable that.

Here is an example of an overloaded C++ constructor being called directly from Python. The C++ symbols have been loaded into the Python module's global namespace and so they appear to be called directly. The self arg is a "this" pointer to a buffer.

    def __init__(self,*_Args,**_Kwargs):
        match _Len(_Args):
            case 0:
                return _ZN12OperatorTestC1Ev(_Ctypes.byref(self))
            case 1:
                return _ZN12OperatorTestC1Ei(_Ctypes.byref(self),_Args[0])

My biggest complaint was that I needed any of this technology at all.
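The constructor dispatch shown above can be tried without a compiled shared object. The following is only a sketch under stated assumptions: the class layout (a single `int` member named `value`) is invented, and the two mangled names are bound to Python stand-ins created with `ctypes.CFUNCTYPE` instead of symbols loaded from a real .so. In a real binding those names would presumably be looked up in a library opened with `ctypes.CDLL`.

```python
import ctypes


class OperatorTest(ctypes.Structure):
    # Assumed layout: one int member. A real binding would derive the
    # layout from the parsed C++ header.
    _fields_ = [("value", ctypes.c_int)]

    def __init__(self, *args):
        # Dispatch on argument count; 'self' is the buffer that the
        # constructor receives as its "this" pointer.
        if len(args) == 0:
            _ZN12OperatorTestC1Ev(ctypes.byref(self))
        elif len(args) == 1:
            _ZN12OperatorTestC1Ei(ctypes.byref(self), args[0])
        else:
            raise TypeError("no matching constructor")


# C-callable prototypes: void(OperatorTest*) and void(OperatorTest*, int).
CTOR_VOID = ctypes.CFUNCTYPE(None, ctypes.POINTER(OperatorTest))
CTOR_INT = ctypes.CFUNCTYPE(None, ctypes.POINTER(OperatorTest), ctypes.c_int)


# Stand-ins for the C++ constructors (assumption: not real library symbols).
@CTOR_VOID
def _ZN12OperatorTestC1Ev(this):
    # Writes -1 so that the call is observable on zero-initialized memory.
    this.contents.value = -1


@CTOR_INT
def _ZN12OperatorTestC1Ei(this, value):
    this.contents.value = value


default_obj = OperatorTest()
int_obj = OperatorTest(42)
print(default_obj.value, int_obj.value)
```

The stand-ins go through the same ctypes call path as a foreign function would, so the argument marshaling (`byref(self)` converted to an `OperatorTest*`) is exercised even though no C++ code runs.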
Arguably, the C/C++ compiler could emit a form of pre-compiled header that described a part of the C/C++ API found in a C/C++ header. Then every scripting engine could just load that instead of needing the normal script binding boilerplate that is used. The C++ symbols already have type information encoded in them and so it seems strange to be manually configuring marshaling code for them in another language. (Please forgive me if I am rubbing you the wrong way for the second time in this email.)

For those who are really curious, the script for parsing a C++ API and generating a direct call wrapper is here:

https://github.com/whatchamacallem/hatchlingplatform/blob/main/entanglement_example/src/entanglement.py

Nota Bene: There was one bug I can't fix with the current design of C++ and Python. Returning a class by value will result in it being destructed without being copied first.

I am happy to pull together the parts of the Itanium ABI that would need to be standardized into a proposal if anyone is interested. This is step 1: "float the idea."

The first part would be to allow C to identify and call C++ function pointers (in this case directly out of a .so, although that detail has not been standardized) with code written only in C that was not compiled with the types involved. The manner in which it is done could still be implementation-defined as long as there was agreement between the two languages as to how to operate the additional machinery C++ needs for the particular platform's ABI.

The second, optional part would be to standardize a set of requirements for reflection data for a subset of C++ that can be read from a shared object directly or stored alongside one.

Regards,
Adrian

--
Std-Proposals mailing list
Std-Proposals_at_[hidden]
https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2025-08-31 02:57:03