Re: [std-proposals] Calling C++ functions in a .so directly from foreign languages and C++ as an interface definition language

From: Adrian Johnston <adrian3_at_[hidden]>
Date: Sun, 31 Aug 2025 10:02:41 -0700
Thanks for unpacking that with me. Sorry for being this contrarian.

"If your language added FFI bindings for C++, I would assume that
means those bindings on all their primary platforms?" "If those
languages did want to support C++, they could."

It seems people don't normally want to limit open source to "primary
platforms." Instead they use C to abstract the ABI because that gives
them access to exotic microprocessors. If you want to get lectured
about proper behavior as an engineer then just try questioning that
one. You are describing what people could do when they aren't willing
to discuss the matter as it stands.

There are certainly challenges involved in calling C++. How about we
don't support calling a virtual function without knowing what you are
doing? I am only suggesting being able to call, from C, a C++ function
that is found in the linker map. The operation of a virtual table is
not part of that requirement. In my case I was literally using a hash
table to implement virtual functions in Python because that is how
Python works. So, yeah, I already did that thing you said.
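
To make "found in the linker map" concrete, here is a minimal sketch of
building the symbol name to look up. It assumes the Itanium mangling
scheme and only handles a non-nested class with a few builtin parameter
types. The helper name mangle_constructor and the library path in the
trailing comment are hypothetical and are not taken from my script.

```python
# Sketch only: Itanium-style mangled names for simple constructors.
# Assumes a non-nested class name and builtin parameter types.
_BUILTIN_CODES = {
    "void": "v",
    "bool": "b",
    "char": "c",
    "int": "i",
    "unsigned int": "j",
    "long": "l",
    "float": "f",
    "double": "d",
}


def mangle_constructor(class_name, param_types=()):
    """Return the complete-object constructor (C1) symbol for class_name."""
    # An empty parameter list is encoded as a single 'v' (void).
    params = "".join(_BUILTIN_CODES[t] for t in param_types) or "v"
    # _Z prefix, N...E nested name, <length><identifier>, C1 constructor.
    return f"_ZN{len(class_name)}{class_name}C1E{params}"


default_ctor = mangle_constructor("OperatorTest")
int_ctor = mangle_constructor("OperatorTest", ["int"])

# Hypothetical use with ctypes, assuming a library that exports the symbol:
#   import ctypes
#   ctor = getattr(ctypes.CDLL("./libexample.so"), default_ctor)
```

The two names produced are the same ones used in the constructor
example quoted further down. Anything beyond this subset (namespaces,
templates, substitutions) needs the full grammar from the ABI document.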

I am not suggesting that the authors of different languages want to
work in C++ and the only thing stopping them is the ABI. Arguably they
are motivated by wanting to use different high level abstractions
entirely. (And can you blame them when they can't even find a C++
symbol dynamically?) Although, I predict we will soon see a small
scripting engine using the C++26 reflection code to expose the
interpreter's C++ API itself directly back into script.

My concern right now is all the C++ developers who have to live in a
world where C++ is not supported at the system level anywhere else.
The moment your boss asks you to make your high performance C++
library available to the Python devs you end up going right back to
the stone age. Yes, I see the issue with backwards compatibility. In
that case we could have a look-up table added to the .so. I was just
asking for limited reflection data in the ELF file format too. As per
your objections, solving the details isn't the C++ standard's
problem.
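
As a rough illustration of the look-up table idea, the sketch below
maps a readable signature to the mangled symbol and stores the table as
JSON next to the shared object. The JSON layout and the function names
are purely hypothetical; nothing in ELF or in any existing toolchain
defines such a format.

```python
import json

# Sketch only: a hypothetical sidecar table from readable C++ signatures
# to the mangled symbols exported by a shared object.


def build_lookup_table(entries):
    """Build a dict of signature -> mangled symbol from (sig, symbol) pairs."""
    return {signature: symbol for signature, symbol in entries}


def resolve(sidecar_text, signature):
    """Return the mangled symbol for a signature, or None if it is absent."""
    return json.loads(sidecar_text).get(signature)


table = build_lookup_table([
    ("OperatorTest::OperatorTest()", "_ZN12OperatorTestC1Ev"),
    ("OperatorTest::OperatorTest(int)", "_ZN12OperatorTestC1Ei"),
])

# Serialized form that could be shipped alongside the .so.
sidecar_text = json.dumps(table, indent=2, sort_keys=True)
symbol = resolve(sidecar_text, "OperatorTest::OperatorTest(int)")
```

A foreign language would then only need a JSON reader and dlsym, with
no knowledge of the mangling rules themselves.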

As for the generality of this limitation, my counter argument would be
that C++ enjoys a special place relative to C as a system language.
C++ doesn't need to abandon being the more modern system language of
choice because new languages tend to end up with a moat built around
them.

Regards,
Adrian

On Sat, Aug 30, 2025 at 7:01 PM Oliver Hunt <oliver_at_[hidden]> wrote:
>
>
>
> > On Aug 30, 2025, at 5:16 PM, Adrian Johnston via Std-Proposals <std-proposals_at_[hidden]> wrote:
> >
> > It is practically impossible to ask a team to develop software that is
> > dependent on a single platform or ABI just so that they get better
> > script bindings.
>
> I’m not sure I understand this comment? If your language added FFI bindings for C++, I would assume that means those bindings on all their primary platforms? The FFI bindings for C also have to deal with different ABIs on different platforms.
>
> > I'd love it if the Python devs (for example) would
> > even consider using the C++ ABI but that is currently considered taboo
> > by the entire industry.
> >
> > Otherwise, yes, the Itanium ABI is quite well documented.
>
> The platform ABI is not part of the C++ specification, which does not specify details below the abstract machine. For example, the C++ specification does not talk about v-tables: the mechanism by which polymorphism is implemented is not specified. In principle you could implement C++ on top of the Objective-C runtime, in which case method lookup would be performed as a string lookup in a map. _If_ C++ specified an ABI, not only would it need to substantially increase how much of the underlying implementation it specifies, it would also be requiring every platform to ship multiple ABIs, which is simply not possible, as you could not rely on objects of the “same” type created in different places being binary compatible.
>
> The issue here is that other languages find it easy to write C bindings, largely because C does not meaningfully acknowledge types, does not support polymorphism, etc., and the C bindings that are presented essentially exist to pass either trivial types (structs, the primitives), or pass object references from host language->C->host language.
>
> If those languages did want to support C++, they could, it is simply much harder to do than C, because C is a tiny language with very few places where it _could_ represent higher level concepts. By the same token they could bind to other common languages directly: python->ruby, python to C#, etc. The reason you are dealing with mangling yourself, is because the language runtime/standard library chooses to only support C.
>
> The arguments you’re making about C++ here apply to every other language as well: if you want to call rust from your language of choice, you will have to deal with rust’s mangling as well.
>
> —Oliver
>
> >
> > On Sat, Aug 30, 2025 at 4:57 PM Andre Kostur <andre_at_[hidden]> wrote:
> >>
> >> That would seem to be something that a platform would document, not the C++ language. I think you’ve got the responsibility backwards.
> >>
> >> On Sat, Aug 30, 2025 at 4:55 PM Adrian Johnston via Std-Proposals <std-proposals_at_[hidden]> wrote:
> >>>
> >>> Hello,
> >>>
> >>> Calling a C++ API implemented in a shared object (.so/.dll) from a
> >>> foreign language should be convenient instead of impossible. Foreign
> >>> languages and their libraries are often implemented in C and so
> >>> providing a mechanism for C to use the C++ function calling convention
> >>> without requiring a translation layer around everything would reduce
> >>> friction.
> >>>
> >>> I would like to propose we somehow standardize what is part of the
> >>> Itanium C++ ABI's section 5.1 External Names (a.k.a. Mangling) and
> >>> certain other parts. This should allow a script to call C++
> >>> constructors/destructors on a buffer, use C++ operators and pass data
> >>> by reference to C++. Providing C++ reflection data in a format
> >>> accessible to a foreign language is also discussed as a second part to
> >>> this proposal below. Once again, please forgive my contrarian
> >>> tendencies.
> >>>
> >>> https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling
> >>>
> >>> This proposal is the result of an experiment I did lately where I used
> >>> libclang to reimplement a C++ object model in Python by parsing C++
> >>> headers and then generating a script that could call the C++ symbols
> >>> in a .so directly. The main challenge was that I had to hijack
> >>> Python's support for calling C functions and use
> >>> implementation-defined knowledge of the C++ ABI to do so. The only
> >>> real problem I found was that it is impossible to return an object
> >>> with a non-trivial destructor by value from a C++ function to a
> >>> foreign language without that language knowing the C++ calling
> >>> convention.
> >>>
> >>> The motivation for this experiment was that the state of the art for
> >>> binding C++ to Python is not too great. If you are willing to write a
> >>> lot of C wrapper code using the Python C API then you will have a good
> >>> solution. Otherwise, if you want to use C++ based script binding
> >>> tools, the ones I tried have performance issues. I was burning 50% of
> >>> my compile time and 50% of my .so size. I hate to name names, because
> >>> they seem well written, are more mature than my efforts and represent
> >>> a lot of effort, but that was pybind11 and nanobind. Possibly things
> >>> will improve if someone uses the new C++ reflection code to generate
> >>> bindings without using templates.
> >>>
> >>> Overall, having a tool to write out my C++ script bindings in 0.5
> >>> seconds using C++ as an interface definition language was a great
> >>> experience by comparison. The point of using script is to be able to
> >>> iterate quickly and this helped enable that.
> >>>
> >>> Here is an example of an overloaded C++ constructor being called
> >>> directly from Python. The C++ symbols have been loaded into the Python
> >>> module's global namespace and so they appear to be called directly.
> >>> The self arg is a "this" pointer to a buffer.
> >>>
> >>> def __init__(self,*_Args,**_Kwargs):
> >>>     match _Len(_Args):
> >>>         case 0:
> >>>             return _ZN12OperatorTestC1Ev(_Ctypes.byref(self))
> >>>         case 1:
> >>>             return _ZN12OperatorTestC1Ei(_Ctypes.byref(self),_Args[0])
> >>>
> >>> My biggest complaint was that I needed any of this technology at all.
> >>> Arguably, the C/C++ compiler could emit a form of pre-compiled header
> >>> that described a part of the C/C++ API found in a C/C++ header. Then
> >>> every scripting engine could just load that instead of needing the
> >>> normal script binding boilerplate that is used. The C++ symbols
> >>> already have type information encoded in them and so it seems strange
> >>> to be manually configuring marshaling code for them in another
> >>> language. (Please forgive me if I am rubbing you the wrong way for the
> >>> second time in this email.)
> >>>
> >>> For those who are really curious, the script for parsing a C++ API and
> >>> generating a direct call wrapper is here:
> >>>
> >>> https://github.com/whatchamacallem/hatchlingplatform/blob/main/entanglement_example/src/entanglement.py
> >>>
> >>> Nota Bene: There was one bug I can't fix with the current design of
> >>> C++ and Python. Returning a class by value will result in it being
> >>> destructed without being copied first.
> >>>
> >>> I am happy to pull together the parts of the Itanium ABI that would
> >>> need to be standardized into a proposal if anyone is interested. This
> >>> is step 1: "float the idea." The first part would be to allow C to
> >>> identify and call C++ function pointers (in this case directly out of
> >>> a .so, although that detail has not been standardized) with code
> >>> written only in C that was not compiled with the types involved. The
> >>> manner in which it is done could still be implementation defined as
> >>> long as there was agreement between the two languages as to how to
> >>> operate the additional machinery C++ needs for the particular
> >>> platform's ABI. The second optional part would be to standardize a set
> >>> of requirements for reflection data for a subset of C++ that can be
> >>> read from a shared object directly or stored alongside one.
> >>>
> >>> Regards,
> >>> Adrian
> >>> --
> >>> Std-Proposals mailing list
> >>> Std-Proposals_at_[hidden]
> >>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>

Received on 2025-08-31 17:02:56