Re: [std-proposals] Calling C++ functions in a .so directly from foreign languages and C++ as an interface definition language

From: Adrian Johnston <adrian3_at_[hidden]>
Date: Sat, 30 Aug 2025 17:16:38 -0700
It is practically impossible to ask a team to develop software that is
dependent on a single platform or ABI just so that they get better
script bindings. I'd love it if the Python devs (for example) would
even consider using the C++ ABI but that is currently considered taboo
by the entire industry.

Otherwise, yes, the Itanium ABI is quite well documented.

On Sat, Aug 30, 2025 at 4:57 PM Andre Kostur <andre_at_[hidden]> wrote:
>
> That would seem to be something that a platform would document, not the C++ language. I think you’ve got the responsibility backwards.
>
> On Sat, Aug 30, 2025 at 4:55 PM Adrian Johnston via Std-Proposals <std-proposals_at_[hidden]> wrote:
>>
>> Hello,
>>
>> Calling a C++ API implemented in a shared object (.so/.dll) from a
>> foreign language should be convenient instead of impossible. Foreign
>> languages and their libraries are often implemented in C and so
>> providing a mechanism for C to use the C++ function calling convention
>> without requiring a translation layer around everything would reduce
>> friction.
>>
>> I would like to propose that we somehow standardize the contents of
>> section 5.1, External Names (a.k.a. Mangling), of the Itanium C++ ABI,
>> along with certain other parts. This should allow a script to call C++
>> constructors/destructors on a buffer, use C++ operators and pass data
>> by reference to C++. Providing C++ reflection data in a format
>> accessible to a foreign language is also discussed as a second part to
>> this proposal below. Once again, please forgive my contrarian
>> tendencies.
>>
>> https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling
>>
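To make the section 5.1 encoding concrete, here is a rough sketch that mangles constructor names for a non-template class in the global namespace. The helper `mangle_ctor` and its type table are hypothetical names used only for illustration. They cover a handful of builtin parameter types and ignore namespaces, templates, cv-qualifiers and substitutions.

```python
# Minimal sketch of Itanium-style mangling, constructors only.
# mangle_ctor and BUILTIN_CODES are illustrative names, not a library API.

BUILTIN_CODES = {
    "void": "v", "bool": "b", "char": "c", "int": "i",
    "unsigned int": "j", "long": "l", "float": "f", "double": "d",
}

def mangle_ctor(class_name, param_types=()):
    """Return the complete-object constructor (C1) symbol for a global class."""
    # An empty parameter list is encoded as a single 'v'.
    params = "".join(BUILTIN_CODES[t] for t in param_types) or "v"
    # _Z prefix, N...E nested name, <length><identifier>, C1 = constructor.
    return f"_ZN{len(class_name)}{class_name}C1E{params}"

print(mangle_ctor("OperatorTest"))           # _ZN12OperatorTestC1Ev
print(mangle_ctor("OperatorTest", ["int"]))  # _ZN12OperatorTestC1Ei
```

These are the two symbols used in the constructor example later in this message.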
>> This proposal is the result of an experiment I did recently where I used
>> libclang to reimplement a C++ object model in Python by parsing C++
>> headers and then generating a script that could call the C++ symbols
>> in a .so directly. The main challenge was that I had to hijack
>> Python's support for calling C functions and use
>> implementation-defined knowledge of the C++ ABI to do so. The only
>> real problem I found was that it is impossible to return an object
>> with a non-trivial destructor by value from a C++ function to a
>> foreign language without that language knowing the C++ calling
>> convention.
>>
>> The motivation for this experiment was that the state of the art for
>> binding C++ to Python is not too great. If you are willing to write a
>> lot of C wrapper code using the Python C API then you will have a good
>> solution. Otherwise, if you want to use C++ based script binding
>> tools, the ones I tried have performance issues. I was burning 50% of
>> my compile time and 50% of my .so size. I hate to name names, because
>> they seem well written, are more mature than my efforts and represent
>> a lot of effort, but that was pybind11 and nanobind. Possibly things
>> will improve if someone uses the new C++ reflection code to generate
>> bindings without using templates.
>>
>> Overall, having a tool to write out my C++ script bindings in 0.5
>> seconds using C++ as an interface definition language was a great
>> experience by comparison. The point of using a scripting language is
>> to be able to iterate quickly and this helped enable that.
>>
>> Here is an example of an overloaded C++ constructor being called
>> directly from Python. The C++ symbols have been loaded into the Python
>> module's global namespace and so they appear to be called directly.
>> The self arg is a "this" pointer to a buffer.
>>
>>     def __init__(self,*_Args,**_Kwargs):
>>         match _Len(_Args):
>>             case 0:
>>                 return _ZN12OperatorTestC1Ev(_Ctypes.byref(self))
>>             case 1:
>>                 return _ZN12OperatorTestC1Ei(_Ctypes.byref(self),_Args[0])
>>
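The constructor dispatch above can be tried without a compiled library using the sketch below. It makes several assumptions. The class layout (a single `int` member named `value`) is invented. The two mangled names are bound to Python callbacks created with `ctypes.CFUNCTYPE`, standing in for symbols that would normally be loaded from the .so. It uses `if`/`elif` instead of `match` so that it also runs on older Python versions.

```python
import ctypes

class OperatorTest(ctypes.Structure):
    # Assumed layout; the real one would be derived from the C++ header.
    _fields_ = [("value", ctypes.c_int)]

    def __init__(self, *args):
        # Choose the overload by argument count and pass the instance's
        # own buffer as the "this" pointer.
        if len(args) == 0:
            _ZN12OperatorTestC1Ev(ctypes.byref(self))
        elif len(args) == 1:
            _ZN12OperatorTestC1Ei(ctypes.byref(self), args[0])
        else:
            raise TypeError("no matching constructor")

THIS = ctypes.POINTER(OperatorTest)

# Stand-ins for the C++ constructors. With a real library these would be
# looked up by mangled name in a ctypes.CDLL object instead.
@ctypes.CFUNCTYPE(None, THIS)
def _ZN12OperatorTestC1Ev(this):
    this.contents.value = -1

@ctypes.CFUNCTYPE(None, THIS, ctypes.c_int)
def _ZN12OperatorTestC1Ei(this, n):
    this.contents.value = n

a = OperatorTest()
b = OperatorTest(42)
print(a.value, b.value)
```

The stand-in constructors write into the memory that the `ctypes.Structure` instance already owns, which is the same "constructor on a buffer" pattern described above.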
>> My biggest complaint was that I needed any of this technology at all.
>> Arguably, the C/C++ compiler could emit a form of pre-compiled header
>> that described a part of the C/C++ API found in a C/C++ header. Then
>> every scripting engine could just load that instead of needing the
>> normal script binding boilerplate that is used. The C++ symbols
>> already have type information encoded in them and so it seems strange
>> to be manually configuring marshaling code for them in another
>> language. (Please forgive me if I am rubbing you the wrong way for the
>> second time in this email.)
>>
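As a small illustration of the type information carried by the symbols, the following sketch recovers the class name and parameter types from a constructor symbol. The function `decode_ctor` is a hypothetical helper that handles only symbols of the form `_ZN<len><class>C1E<builtin codes>`. A real tool would need a full demangler.

```python
import re

# Builtin type codes from the Itanium mangling scheme (small subset).
TYPE_NAMES = {
    "v": "void", "b": "bool", "c": "char", "i": "int",
    "j": "unsigned int", "l": "long", "f": "float", "d": "double",
}

def decode_ctor(symbol):
    """Decode _ZN<len><class>C1E<params> into (class_name, [param types])."""
    m = re.fullmatch(r"_ZN(\d+)(.*)", symbol)
    if not m:
        raise ValueError(f"unsupported symbol: {symbol}")
    length, rest = int(m.group(1)), m.group(2)
    class_name, tail = rest[:length], rest[length:]
    if not tail.startswith("C1E"):
        raise ValueError(f"not a complete-object constructor: {symbol}")
    codes = tail[3:]
    # A lone 'v' means an empty parameter list.
    params = [] if codes == "v" else [TYPE_NAMES[c] for c in codes]
    return class_name, params

print(decode_ctor("_ZN12OperatorTestC1Ev"))
print(decode_ctor("_ZN12OperatorTestC1Ei"))
```

One limitation worth noting: for ordinary non-template functions the mangled name does not encode the return type, so the symbol alone is not a complete description of the call.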
>> For those who are really curious, the script for parsing a C++ API and
>> generating a direct call wrapper is here:
>>
>> https://github.com/whatchamacallem/hatchlingplatform/blob/main/entanglement_example/src/entanglement.py
>>
>> Nota Bene: There is one bug I can't fix with the current design of
>> C++ and Python. Returning a class by value will result in it being
>> destructed without being copied first.
>>
>> I am happy to pull together the parts of the Itanium ABI that would
>> need to be standardized into a proposal if anyone is interested. This
>> is step 1: "float the idea." The first part would be to allow C to
>> identify and call C++ function pointers (in this case directly out of
>> a .so, although that detail has not been standardized) with code
>> written only in C that was not compiled with the types involved. The
>> manner in which it is done could still be implementation defined as
>> long as there was agreement between the two languages as to how to
>> operate the additional machinery C++ needs for the particular
>> platform's ABI. The second optional part would be to standardize a set
>> of requirements for reflection data for a subset of C++ that can be
>> read from a shared object directly or stored alongside one.
>>
>> Regards,
>> Adrian
>> --
>> Std-Proposals mailing list
>> Std-Proposals_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2025-08-31 00:16:52