Date: Sun, 31 Aug 2025 20:48:34 -0700
Thinking about our conversation, I realize I am really only talking
about interpreters. If you are doing code gen then you are in pretty
deep and may already have a relationship with the local C++ compiler's
back end (e.g. Objective-C++), which would make things a lot easier.
And I know it looks like we are just disagreeing, but I appreciate the
opportunity to have the conversation. I think I can adequately address
your concerns.
"a function symbol name is necessarily some combination of type +
name. You would need to do this work if you were target rust, swift,
Haskell, etc."
This doesn't seem like a significant barrier to me. If the host
language simply can't handle overloaded function names then it would
be unable to use them directly as such. Plain C would be an example of
a language where you would have to have a dispatch system, as I am
advocating.
In my experiment I was able to automatically generate Python code to
dynamically select C++ functions by arity and then by the type of the
first arg. Default parameters would have been easy to add. That level
of support is normal for C++ in Python. It wasn't perfect, but I am
not sure it has to be. If an implementation wants to reject corner
cases that doesn't seem fatal. And I don't see any immediate reason to
support function parameters, decltype return values or anything really
language specific. The main issue as I see it for function overloads
in languages with different rules is overhead.
"Because you are having to combine name+type that means, again not
just for C++, that you need to be sufficiently aware of the type
system to represent those types."
I am leading up to saying "just don't support templates if there is no
demand for that" but let me try and respond directly first.
The Itanium C++ ABI provides a very clear specification of a type
system for doing exactly what you are saying. It claims to be as
precise as you can be using C++ itself. Knowing class/struct sizeof is
also critical. Having the size and locations of their fields sounds
very useful but could actually be kept opaque.
"so things like std::string become std::basic_string<…> or something."
The ABI uses substitution rules to shorten things like std::string. To
quote the specification:
void f(std::string, std::string) { } // mangles as _Z1fSsB1XS_
So, as designed, it isn't quite so bad. It would be easy to add
convenience mechanisms to support the standard. As you pointed out,
aliases would be nice in general.
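
For the "C with classes" subset the name encoding is small enough to
sketch. The following is an illustration only, assuming non-template
functions whose parameters are all builtin types; it does not handle
substitutions, the std namespace abbreviations, cv-qualifiers or
class-type parameters. The helper names are hypothetical. The two
constructor symbols it reproduces are the ones in the Python example
from the original proposal.

```python
# Itanium ABI codes for a few builtin types.
_BUILTINS = {
    "void": "v", "bool": "b", "char": "c", "int": "i",
    "unsigned int": "j", "long": "l", "float": "f", "double": "d",
}

def _source_name(name):
    """Encode an identifier as <length><identifier>, e.g. 'foo' -> '3foo'."""
    return f"{len(name)}{name}"

def _encode_params(param_types):
    """Encode builtin parameter types; an empty list is encoded as 'v'."""
    return "".join(_BUILTINS[t] for t in param_types) or "v"

def mangle_function(qualified_name, param_types):
    """Mangle a non-template function that takes only builtin parameters."""
    parts = qualified_name.split("::")
    if len(parts) == 1:
        encoded_name = _source_name(parts[0])
    else:
        # Qualified names are wrapped in N ... E.
        encoded_name = "N" + "".join(_source_name(p) for p in parts) + "E"
    return "_Z" + encoded_name + _encode_params(param_types)

def mangle_constructor(class_name, param_types):
    """Mangle the complete-object constructor (C1) of a class."""
    parts = class_name.split("::")
    encoded_name = "N" + "".join(_source_name(p) for p in parts) + "C1E"
    return "_Z" + encoded_name + _encode_params(param_types)
```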
"doing so manually is hard because it requires determining the
mangling for a method, which means manually determining things like
the full real type signature (post alias expansion), etc. Which is not
fun for any language where you need to do so."
My ideal world would involve a small open source C library that could
load the C++ symbol table from a .so/.dll and unpack it as needed.
Then the host language could talk to that C library and obtain all the
facilities it needed. That way an interpreter could simply populate
its module hierarchy with the contents of the .so automatically. I'd
love to have C++ namespaces, enums, classes, structs, operators and
functions just show up in interpreted languages without users doing
any work.
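
A rough sketch of the "populate the module hierarchy" step, written in
Python rather than C for brevity. It assumes the mangled names have
already been read from the symbol table by some other means, and it
only understands plain and nested source names plus constructor and
destructor codes; anything else (templates, substitutions, qualifiers)
is skipped. parse_qualified_name() and build_hierarchy() are
hypothetical names, not part of any existing library.

```python
def parse_qualified_name(symbol):
    """Return the name components of a simple Itanium-mangled symbol, or None."""
    if not symbol.startswith("_Z"):
        return None  # not a mangled C++ name (e.g. a plain C symbol)
    pos = 2
    nested = symbol.startswith("N", pos)
    if nested:
        pos += 1
    parts = []
    while pos < len(symbol):
        if symbol[pos].isdigit():
            # <length><identifier>
            end = pos
            while end < len(symbol) and symbol[end].isdigit():
                end += 1
            length = int(symbol[pos:end])
            parts.append(symbol[end:end + length])
            pos = end + length
            if not nested:
                break
        elif nested and symbol[pos:pos + 2] in ("C1", "C2", "D0", "D1", "D2"):
            parts.append("<ctor>" if symbol[pos] == "C" else "<dtor>")
            pos += 2
        elif nested and symbol[pos] == "E":
            break
        else:
            return None  # outside the subset this sketch understands
    return parts or None

def _new_node():
    return {"scopes": {}, "symbols": {}}

def build_hierarchy(symbols):
    """Group mangled symbols into a tree of scopes; overloads share one entry."""
    root = _new_node()
    for symbol in symbols:
        parts = parse_qualified_name(symbol)
        if parts is None:
            continue
        node = root
        for scope in parts[:-1]:
            node = node["scopes"].setdefault(scope, _new_node())
        node["symbols"].setdefault(parts[-1], []).append(symbol)
    return root
```

Note that the symbol name alone cannot distinguish a namespace from a
class, nor does it carry sizeof, which is why the reflection data
discussed earlier in the thread would still be needed.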
Perhaps I should just go write it. It looks like a very hard product
to promote with the C++ standard essentially rejecting that kind of
activity. For example, something like Lua which gets used everywhere
would have a very hard time depending on a library for core
functionality that used undefined behavior all day long.
"anything that tries to depend on a single universal mangling scheme
is still going to encounter fundamental impedance mismatches between
the types as written and the actual types"
I don't think this problem is that important. Most of the standard
library is pretty useless in other languages. You don't really have
any business instantiating and then calling std::string methods in
Python. That just sounds insane. Similarly std::list and
std::unordered_map are totally redundant.
What is most valuable is having a straightforward object model where
you can pass arrays of fundamental types and arrays of classes/structs
efficiently.
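
As an illustration of what "efficiently" can mean on the Python side,
here is a small ctypes sketch. The Point struct is hypothetical and its
layout is an assumption that would have to match the C++ definition
field for field. The array is one contiguous block, so a C++ function
could be handed a pointer and a count with no per-element marshaling.

```python
import ctypes

class Point(ctypes.Structure):
    # Hypothetical layout; it must mirror the C++ struct exactly.
    _fields_ = [("x", ctypes.c_float), ("y", ctypes.c_float)]

# One contiguous block holding four Points.
points = (Point * 4)()
for i in range(len(points)):
    points[i].x = float(i)
    points[i].y = float(i) * 2.0

# An array of a fundamental type works the same way.
samples = (ctypes.c_double * 3)(1.0, 2.0, 3.0)

# A pointer view over the same memory, as a foreign function would receive it.
first = ctypes.cast(points, ctypes.POINTER(Point))

# Elements are laid out back to back, sizeof(Point) bytes apart.
stride = ctypes.addressof(points[1]) - ctypes.addressof(points[0])
```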
So my first suggestion would be to disallow using templates from other
languages in version one. I would rather be able to write a C++ API at
all than lose that fight over template typedef resolution.
You are completely right about the headache involved in instantiating
a template all the ways required and then getting it into the .so.
Templates are actually somewhat unnatural as a language feature to use
externally when other languages don't necessarily have the concept of
instantiating C++ code. That said, you can ship templates without
providing the source code. And in the rare use cases where that makes
sense to someone I can see the demand for using them externally too.
You could call my initial requirements for external C++ use "C with classes."
On Sun, Aug 31, 2025 at 2:34 PM Oliver Hunt <oliver_at_[hidden]> wrote:
>
>
>
> > On Aug 31, 2025, at 10:02 AM, Adrian Johnston <adrian3_at_[hidden]> wrote:
> >
> > Thanks for unpacking that with me. Sorry for being this contrarian.
> >
> > "If your language added FFI bindings for C++, I would assume that
> > means those bindings on all their primary platforms?" "If those
> > languages did want to support C++, they could."
> >
> > It seems people don't normally want to limit open source to "primary
> > platforms." Instead they use C to abstract the ABI because that gives
> > them access to exotic microprocessors. If you want to get lectured
> > about proper behavior as an engineer then just try questioning that
> > one. You are describing what people could do when they aren't willing
> > to discuss the matter as it stands.
>
> That still requires supporting things like calling conventions, struct layout, etc - which vary across platforms even on the same underlying system architecture.
>
> >
> > There are certainly challenges involved in calling C++. How about we
> > don't support making a call to a virtual function without knowing what
> > you are doing. I am only suggesting being able to call a C++ function
> > that is found from C in the linker map. The operation of a virtual
> > table is not part of that requirement. In my case I was literally
> > using a hash table to implement virtual functions in Python because
> > that is how Python works. So, yeah I already did that thing you said.
>
> If you want to call a language that permits type based method or function overloading you will out of necessity have to:
>
> * Have a basic understanding of the types involved - because they impact the method name, you might be able to get away with “just” knowing the name, but that may not be trivial either
> * Have an understanding of how to go from name+function type to symbol name
>
> This is not unique to C++, any language that supports parameter based overloading of functions presents the same issue: a function symbol name is necessarily some combination of type + name. You would need to do this work if you were target rust, swift, Haskell, etc.
>
> Because you are having to combine name+type that means, again not just for C++, that you need to be sufficiently aware of the type system to represent those types.
>
> I think the most irksome thing, and this applies to all languages that allow type aliases: The types of functions and methods necessarily have to fully resolve aliases which means that all the types being used - possibly the only ones even C++ devs interact with - get turned into the alias free definitions, so things like std::string become std::basic_string<…> or something.
>
> But again this applies to rust, swift, and similar languages. I _think_ Haskell supports opaque aliases, which might allow them to not expand such aliases. This is not to suggest Haskell is something people are clamoring to call from python :D
>
> > I am not suggesting that the authors of different languages want to
> > work in C++ and the only thing stopping them is the ABI.
>
>
> I don’t think anyone thought that’s what you were saying :D
>
> The issue is that most languages do not provide support for bridging to C++ automatically, and doing so manually is hard because it requires determining the mangling for a method, which means manually determining things like the full real type signature (post alias expansion), etc. Which is not fun for any language where you need to do so.
>
> > Arguably they
> > are motivated by wanting to use different high level abstractions
> > entirely. (And can you blame them when they can't even find a C++
> > symbol dynamically.) Although, I predict we will soon see a small
> > scripting engine using the C++26 reflection code to expose the
> > interpreter's C++ API itself directly back into script.
>
> I’m aware of existing projects that use clang tooling to extract interface information as well.
>
> > My concern right now is all the C++ developers who have to live in a
> > world where C++ is not supported at the system level anywhere else.
>
> I’m not sure what you mean by supported at a system level here sorry.
>
> > The moment your boss asks you to make your high performance C++
> > library available to the Python devs you end up going right back to
> > the stone age.
>
>
> Scripting or reflection are realistically the best path forward, because of the aforementioned type aliasing - anything that tries to depend on a single universal mangling scheme is still going to encounter fundamental impedance mismatches between the types as written and the actual types. e.g, consider a developer exporting a function
>
> `void foo(std::string)`
>
> The *actual* type is (with libstdc++, you get similar with libc++, or the windows stdlib)
>
> `foo(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)`
>
> There is something to be said for a “standard" attribute or similar (as stated any of this is outside the scope of the C++ spec) for declaring “exported for FFI”, “type name for FFI”, etc. Such an approach would also mitigate people trying to use internal methods rather than APIs.
>
> The JavaScriptCore objective-c bridge did this, but it’s only possible because objective-c allows exciting amounts of runtime type introspection (and modification :-O): https://developer.apple.com/documentation/javascriptcore/jsexport?language=objc
>
> > Yes, I see the issue with backwards compatibility. In
> > that case we could have a look-up table added to the .so. I was just
> > asking for limited reflection data in the ELF file format too. As per
> > your objections, it isn't the C++ standards problem to solve the
> > details.
> >
> > As for the generality of this limitation, my counter argument would be
> > that C++ enjoys a special place relative to C as a system language.
> > C++ doesn't need to abandon being the more modern system language of
> > choice because new languages tend to end up with a moat built around
> > them.
>
> While writing this I did realize that there is an additional potentially “real world expectation" issue when trying to bridge to C++ entirely at runtime: you have to work entirely in terms of specific instantiations of template methods and classes - e.g. (and ignore whatever aliases may be involved) your bridge can’t use `std::vector<some_type>` unless there are existing specializations of every method you end up needing for `vector<some_type>` in the already generated code, and even if that does exist, a function that the dev then tries to pass the vector to may not exist anyway.
>
> —Oliver
>
> > Regards,
> > Adrian
> >
> > On Sat, Aug 30, 2025 at 7:01 PM Oliver Hunt <oliver_at_[hidden]> wrote:
> >>
> >>
> >>
> >>> On Aug 30, 2025, at 5:16 PM, Adrian Johnston via Std-Proposals <std-proposals_at_[hidden]> wrote:
> >>>
> >>> It is practically impossible to ask a team to develop software that is
> >>> dependent on a single platform or ABI just so that they get better
> >>> script bindings.
> >>
> >> I’m not sure I understand this comment? If your language added FFI bindings for C++, I would assume that means those bindings on all their primary platforms? The FFI bindings for C also have to deal with different ABIs on different platforms.
> >>
> >>> I'd love it if the Python devs (for example) would
> >>> even consider using the C++ ABI but that is currently considered taboo
> >>> by the entire industry.
> >>>
> >>> Otherwise, yes, the Itanium ABI is quite well documented.
> >>
> >> The platform ABI is not part of the C++ specification; the specification does not specify details below the abstract machine. For example, the C++ specification does not talk about v-tables: the mechanism by which polymorphism is implemented is not specified. In principle you could implement C++ on top of the Objective-C runtime, in which case method lookup would be performed as a string lookup in a map. _If_ C++ specified an ABI, not only would it need to substantially increase how much of the underlying implementation it specifies, it would also be requiring every platform to ship multiple ABIs - which is simply not possible, as you could not rely on objects of the “same” type created in different places being binary compatible.
> >>
> >> The issue here is that other languages find it easy to write C bindings, largely because C does not meaningfully acknowledge types, does not support polymorphism, etc., and the C bindings that are presented essentially exist to pass either trivial types (structs, the primitives), or pass object references from host language->C->host language.
> >>
> >> If those languages did want to support C++, they could, it is simply much harder to do than C, because C is a tiny language with very few places where it _could_ represent higher level concepts. By the same token they could bind to other common languages directly: python->ruby, python to C#, etc. The reason you are dealing with mangling yourself, is because the language runtime/standard library chooses to only support C.
> >>
> >> The arguments you’re making about C++ here apply to every other language as well: if you want to call rust from your language of choice, you will have to deal with rust’s mangling as well.
> >>
> >> —Oliver
> >>
> >>>
> >>> On Sat, Aug 30, 2025 at 4:57 PM Andre Kostur <andre_at_[hidden]> wrote:
> >>>>
> >>>> That would seem to be something that a platform would document, not the C++ language. I think you’ve got the responsibility backwards.
> >>>>
> >>>> On Sat, Aug 30, 2025 at 4:55 PM Adrian Johnston via Std-Proposals <std-proposals_at_[hidden]> wrote:
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> Calling a C++ API implemented in a shared object (.so/.dll) from a
> >>>>> foreign language should be convenient instead of impossible. Foreign
> >>>>> languages and their libraries are often implemented in C and so
> >>>>> providing a mechanism for C to use the C++ function calling convention
> >>>>> without requiring a translation layer around everything would reduce
> >>>>> friction.
> >>>>>
> >>>>> I would like to propose we somehow standardize what is part of the
> >>>>> Itanium C++ ABI's section 5.1 External Names (a.k.a. Mangling) and
> >>>>> certain other parts. This should allow a script to call C++
> >>>>> constructors/destructors on a buffer, use C++ operators and pass data
> >>>>> by reference to C++. Providing C++ reflection data in a format
> >>>>> accessible to a foreign language is also discussed as a second part to
> >>>>> this proposal below. Once again, please forgive my contrarian
> >>>>> tendencies.
> >>>>>
> >>>>> https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling
> >>>>>
> >>>>> This proposal is the result of an experiment I did lately where I used
> >>>>> libclang to reimplement a C++ object model in Python by parsing C++
> >>>>> headers and then generating a script that could call the C++ symbols
> >>>>> in a .so directly. The main challenge was that I had to hijack
> >>>>> Python's support for calling C functions and use
> >>>>> implementation-defined knowledge of the C++ ABI to do so. The only
> >>>>> real problem I found was that it is impossible to return an object
> >>>>> with a non-trivial destructor by value from a C++ function to a
> >>>>> foreign language without that language knowing the C++ calling
> >>>>> convention.
> >>>>>
> >>>>> The motivation for this experiment was that the state of the art for
> >>>>> binding C++ to Python is not too great. If you are willing to write a
> >>>>> lot of C wrapper code using the Python C API then you will have a good
> >>>>> solution. Otherwise, if you want to use C++ based script binding
> >>>>> tools, the ones I tried have performance issues. I was burning 50% of
> >>>>> my compile time and 50% of my .so size. I hate to name names, because
> >>>>> they seem well written, are more mature than my efforts and represent
> >>>>> a lot of effort, but that was pybind11 and nanobind. Possibly things
> >>>>> will improve if someone uses the new C++ reflection code to generate
> >>>>> bindings without using templates.
> >>>>>
> >>>>> Overall, having a tool to write out my C++ script bindings in 0.5
> >>>>> seconds using C++ as an interface definition language was a great
> >>>>> experience by comparison. The point of using script is to be able to
> >>>>> iterate quickly and this helped enable that.
> >>>>>
> >>>>> Here is an example of an overloaded C++ constructor being called
> >>>>> directly from Python. The C++ symbols have been loaded into the Python
> >>>>> module's global namespace and so they appear to be called directly.
> >>>>> The self arg is a "this" pointer to a buffer.
> >>>>>
> > >>>>> def __init__(self,*_Args,**_Kwargs):
> > >>>>>     match _Len(_Args):
> > >>>>>         case 0:
> > >>>>>             return _ZN12OperatorTestC1Ev(_Ctypes.byref(self))
> > >>>>>         case 1:
> > >>>>>             return _ZN12OperatorTestC1Ei(_Ctypes.byref(self),_Args[0])
> >>>>>
> >>>>> My biggest complaint was that I needed any of this technology at all.
> >>>>> Arguably, the C/C++ compiler could emit a form of pre-compiled header
> >>>>> that described a part of the C/C++ API found in a C/C++ header. Then
> >>>>> every scripting engine could just load that instead of needing the
> >>>>> normal script binding boilerplate that is used. The C++ symbols
> >>>>> already have type information encoded in them and so it seems strange
> >>>>> to be manually configuring marshaling code for them in another
> >>>>> language. (Please forgive me if I am rubbing you the wrong way for the
> >>>>> second time in this email.)
> >>>>>
> >>>>> For those who are really curious, the script for parsing a C++ API and
> >>>>> generating a direct call wrapper is here:
> >>>>>
> >>>>> https://github.com/whatchamacallem/hatchlingplatform/blob/main/entanglement_example/src/entanglement.py
> >>>>>
> >>>>> Nota Bene: There was one bug I can't fix with the current design of
> >>>>> C++ and Python. Returning a class by value will result in it being
> >>>>> destructed without being copied first.
> >>>>>
> >>>>> I am happy to pull together the parts of the Itanium ABI that would
> >>>>> need to be standardized into a proposal if anyone is interested. This
> >>>>> is step 1: "float the idea." The first part would be to allow C to
> >>>>> identify and call C++ function pointers (in this case directly out of
> >>>>> a .so, although that detail has not been standardized) with code
> >>>>> written only in C that was not compiled with the types involved. The
> >>>>> manner in which it is done could still be implementation defined as
> >>>>> long as there was agreement between the two languages as to how to
> >>>>> operate the additional machinery C++ needs for the particular
> >>>>> platforms ABI. The second optional part would be to standardize a set
> >>>>> of requirements for reflection data for a subset of C++ that can be
> >>>>> read from a shared object directly or stored along side one.
> >>>>>
> >>>>> Regards,
> >>>>> Adrian
> >>>>> --
> >>>>> Std-Proposals mailing list
> >>>>> Std-Proposals_at_[hidden]
> >>>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
> >>> --
> >>> Std-Proposals mailing list
> >>> Std-Proposals_at_[hidden]
> >>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
> >>
>
about interpreters. If you are doing code gen then you are in pretty
deep and may already have a relationship with the local C++ compiler's
back end (e.g. Objective C++) which would make things a lot easier.
And I know it looks like we are just disagreeing, but I appreciate the
opportunity to have the conversation. I think I can adequately address
your concerns.
"a function symbol name is necessarily some combination of type +
name. You would need to do this work if you were target rust, swift,
Haskell, etc."
This doesn't seem like a significant barrier to me. If the host
language simply can't handle overloaded function names then it would
be unable to use them directly as such. Plain C would be an example of
a language where you would have to have a dispatch system, as I am
advocating.
In my experiment I was able to automatically generate Python code to
dynamically select C++ functions by arity and then by the type of the
first arg. Default parameters would have been easy to add. That level
of support is normal for C++ in Python. It wasn't perfect, but I am
not sure is has to be. If an implementation wants to reject corner
cases that doesn't seem fatal. And I don't see any immediate reason to
support function parameters, decltype return values or anything really
language specific. The main issue as I see it for function overloads
in languages with different rules is overhead.
"Because you are having to combine name+type that means, again not
just for C++, that you need to be sufficiently aware of the type
system to represent those types."
I am leading up to saying "just don't support templates if there is no
demand for that" but let me try and respond directly first.
The Itanium C++ ABI provides a very clear specification of a type
system for doing exactly what you are saying. It claims to be as
precise as you can be using C++ itself. Knowing class/struct sizeof is
also critical. Having the size and locations of their fields sounds
very useful but could actually be kept opaque.
"so things like std::string become std::basic_string<…> or something."
The ABI uses substitution rules to shorten things like std::string. To
quote the specification:
void f(std::string, std::string) { } // mangles as _Z1fSsB1XS_
So it isn't quite so bad as it was designed. It would be easy to add
convenience mechanisms to support the standard. As you pointed out,
aliases would be nice in general.
"doing so manually is hard because it requires determining the
mangling for a method, which means manually determining things like
the full real type signature (post alias expansion), etc. Which is not
fun for any language where you need to do so."
My ideal world would involve a small open source C library that could
load the C++ symbol table from a .so/.dll and unpack it as needed.
Then the host language could talk to that C library and obtain all the
facilities it needed. That way an interpreter could simply populate
its module hierarchy with the contents of the .so automatically. I'd
love to have C++ namespaces, enums, classes, structs, operators and
functions just show up in interpreted languages without users doing
any work.
Perhaps I should just go write it. It looks like a very hard product
to promote with the C++ standard essentially rejecting that kind of
activity. For example, something like Lua which gets used everywhere
would have a very hard time depending on a library for core
functionality that used undefined behavior all day long.
"anything that tries to depend on a single universal mangling scheme
is still going to encounter fundamental impedance mismatches between
the types as written and the actual types"
I don't think this problem is that important. Most of the standard
library is pretty useless in other languages. You don't really have
any business instantiating and then calling std::string methods in
Python. That just sounds insane. Similarly std::list and
std::unordered_map are totally redundant.
What is most valuable is having a straight forward object model where
you can pass arrays of fundamental types and arrays of classes/structs
efficiently.
So my first suggestion would be to disallow using templates from other
languages in version one. I would rather be able to write a C++ API at
all than lose that fight over template typedef resolution.
You are completely right about the headache involved in instantiating
a template all the ways required and then getting it into the .so.
Templates are actually somewhat unnatural as a language feature to use
externally when other languages don't necessarily have the concept of
instantiating C++ code. That said, you can ship templates without
providing the source code. And in the rare use cases where that makes
sense to someone I can see the demand for using them externally too.
You could call my initial requirements for external C++ use "C with classes."
On Sun, Aug 31, 2025 at 2:34 PM Oliver Hunt <oliver_at_[hidden]> wrote:
>
>
>
> > On Aug 31, 2025, at 10:02 AM, Adrian Johnston <adrian3_at_[hidden]> wrote:
> >
> > Thanks for unpacking that with me. Sorry for being this contrarian.
> >
> > "If your language added FFI bindings for C++, I would assume that
> > means those bindings on all their primary platforms?" "If those
> > languages did want to support C++, they could."
> >
> > It seems people don't normally want to limit open source to "primary
> > platforms." Instead they use C to abstract the ABI because that gives
> > them access to exotic microprocessors. If you want to get lectured
> > about proper behavior as an engineer then just try questioning that
> > one. You are describing what people could do when they aren't willing
> > to discuss the matter as it stands.
>
> That still requires supporting things like calling conventions, struct layout, etc - which vary across platforms even on the same underlying system architecture.
>
> >
> > There are certainly challenges involved in calling C++. How about we
> > don't support making a call to a virtual function without knowing what
> > you are doing. I am only suggesting being able to call a C++ function
> > that is found from C in the linker map. The operation of a virtual
> > table is not part of that requirement. In my case I was literally
> > using a hash table to implement virtual functions in Python because
> > that is how Python works. So, yeah I already did that thing you said.
>
> If you want to call a language that permits type based method or function overloading you will out of necessity have to:
>
> * Have a basic understanding of the types involved - because they impact the method name, you might be able to get away with “just” knowing the name, but that may not be trivial either
> * Have an understanding of how to go from name+function type to symbol name
>
> This is not unique to C++, any language that supports parameter based overloading of functions presents the same issue: a function symbol name is necessarily some combination of type + name. You would need to do this work if you were target rust, swift, Haskell, etc.
>
> Because you are having to combine name+type that means, again not just for C++, that you need to be sufficiently aware of the type system to represent those types.
>
> I think the most irksome thing, and this applies to all languages that allow type aliases: The types of functions and methods necessarily have to fully resolve aliases which means that all the types being used - possibly the only ones even C++ devs interact with - get turned into the alias free definitions, so things like std::string become std::basic_string<…> or something.
>
> But again this applies to rust, swift, and similar languages. I _think_ Haskell supports opaque aliases, which might allow them to not expand such aliases. This is not to suggest Haskell is something people are clamoring to call from python :D
>
> > I am not suggesting that the authors of different languages want to
> > work in C++ and the only thing stopping them is the ABI.
>
>
> I don’t think anyone thought that’s what you were saying :D
>
> The issue is that most languages do not provide support for bridging to C++ automatically, and doing so manually is hard because it requires determining the mangling for a method, which means manually determining things like the full real type signature (post alias expansion), etc. Which is not fun for any language where you need to do so.
>
> > Arguably they
> > are motivated by wanting to use different high level abstractions
> > entirely. (And can you blame them when they can't even find a C++
> > symbol dynamically.) Although, I predict we will soon see a small
> > scripting engine using the C++26 reflection code to expose the
> > interpreter's C++ API itself directly back into script.
>
> I’m aware of existing projects that use clang tooling to extract interface information as well.
>
> > My concern right now is all the C++ developers who have to live in a
> > world where C++ is not supported at the system level anywhere else.
>
> I’m not sure what you mean by supported at a system level here sorry.
>
> > The moment your boss asks you to make your high performance C++
> > library available to the Python devs you end up going right back to
> > the stone age.
>
>
> Scripting or reflection are realistically the best path forward, because of the aforementioned type aliasing - anything that tries to depend on a single universal mangling scheme is still going to encounter fundamental impedance mismatches between the types as written and the actual types. e.g, consider a developer exporting a function
>
> `void foo(std::string)`
>
> The *actual* type is (with libstdc++, you get similar with libc++, or the windows stdlib)
>
> `foo(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)`
>
> There is something to be said for a “standard" attribute or similar (as stated any of this is outside the scope of the C++ spec) for declaring “exported for FFI”, “type name for FFI”, etc. Such an approach would also mitigate people trying to use internal methods rather than APIs.
>
> The JavaScriptCore objective-c bridge did this, but it’s only possible because objective-c allows exciting amounts of runtime type introspection (and modification :-O): https://developer.apple.com/documentation/javascriptcore/jsexport?language=objc
>
> > Yes, I see the issue with backwards compatibility. In
> > that case we could have a look-up table added to the .so. I was just
> > asking for limited reflection data in the ELF file format too. As per
> > your objections, it isn't the C++ standards problem to solve the
> > details.
> >
> > As for the generality of this limitation, my counter argument would be
> > that C++ enjoys a special place relative to C as a system language.
> > C++ doesn't need to abandon being the more modern system language of
> > choice because new languages tend to end up with a moat build around
> > them.
>
> While writing this I did realize that there is an additional potentially “real world expectation" issue when trying to bridge to C++ entirely at runtime: you have to work entirely in terms of specific instantiations of template methods and classes - e.g. (and ignore whatever aliases may be involved) your bridge can’t use `std::vector<some_type>` unless there are existing specializations of every method you end up needing for `vector<some_type>` in the already generated code, and even if that does exist, a function that the dev then tries to pass the vector to may not exist anyway.
>
> —Oliver
>
> > Regards,
> > Adrian
> >
> > On Sat, Aug 30, 2025 at 7:01 PM Oliver Hunt <oliver_at_[hidden]> wrote:
> >>
> >>
> >>
> >>> On Aug 30, 2025, at 5:16 PM, Adrian Johnston via Std-Proposals <std-proposals_at_[hidden]> wrote:
> >>>
> >>> It is practically impossible to ask a team to develop software that is
> >>> dependent on a single platform or ABI just so that they get better
> >>> script bindings.
> >>
> >> I’m not sure I understand this comment. If your language added FFI bindings for C++, I would assume those bindings would exist on all of its primary platforms? The FFI bindings for C also have to deal with different ABIs on different platforms.
> >>
> >>> I'd love it if the Python devs (for example) would
> >>> even consider using the C++ ABI but that is currently considered taboo
> >>> by the entire industry.
> >>>
> >>> Otherwise, yes, the Itanium ABI is quite well documented.
> >>
> >> The platform ABI is not part of the C++ specification, which does not specify details below the abstract machine. For example, the C++ specification does not talk about v-tables: the mechanism by which polymorphism is implemented is not specified, and in principle you could implement C++ on top of the Objective-C runtime, in which case method lookup would be performed as a string lookup in a map. _If_ C++ specified an ABI, not only would it need to substantially increase how much of the underlying implementation it specifies, it would also be requiring every platform to ship multiple ABIs - which is simply not possible as you could not rely on objects of the “same” type created in different places being binary compatible.
> >>
> >> The issue here is that other languages find it easy to write C bindings, largely because C does not meaningfully acknowledge types, does not support polymorphism, etc., and the C bindings that are presented essentially exist to pass either trivial types (structs, the primitives), or pass object references from host language->C->host language.
> >>
> >> If those languages did want to support C++, they could; it is simply much harder to do than C, because C is a tiny language with very few places where it _could_ represent higher level concepts. By the same token they could bind to other common languages directly: Python->Ruby, Python->C#, etc. The reason you are dealing with mangling yourself is that the language runtime/standard library chooses to only support C.
> >>
> >> The arguments you’re making about C++ here apply to every other language as well: if you want to call rust from your language of choice, you will have to deal with rust’s mangling as well.
> >>
> >> —Oliver
> >>
> >>>
> >>> On Sat, Aug 30, 2025 at 4:57 PM Andre Kostur <andre_at_[hidden]> wrote:
> >>>>
> >>>> That would seem to be something that a platform would document, not the C++ language. I think you’ve got the responsibility backwards.
> >>>>
> >>>> On Sat, Aug 30, 2025 at 4:55 PM Adrian Johnston via Std-Proposals <std-proposals_at_[hidden]> wrote:
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> Calling a C++ API implemented in a shared object (.so/.dll) from a
> >>>>> foreign language should be convenient instead of impossible. Foreign
> >>>>> languages and their libraries are often implemented in C and so
> >>>>> providing a mechanism for C to use the C++ function calling convention
> >>>>> without requiring a translation layer around everything would reduce
> >>>>> friction.
> >>>>>
> >>>>> I would like to propose we somehow standardize what is part of the
> >>>>> Itanium C++ ABI's section 5.1 External Names (a.k.a. Mangling) and
> >>>>> certain other parts. This should allow a script to call C++
> >>>>> constructors/destructors on a buffer, use C++ operators and pass data
> >>>>> by reference to C++. Providing C++ reflection data in a format
> >>>>> accessible to a foreign language is also discussed as a second part to
> >>>>> this proposal below. Once again, please forgive my contrarian
> >>>>> tendencies.
> >>>>>
> >>>>> https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling
> >>>>>
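
To make the mangling scheme concrete, here is a minimal sketch of how the simplest external names are composed. It assumes the Itanium scheme and covers only free functions, functions in nested scopes and complete-object constructors whose parameters are builtin types. Substitutions, templates, cv-qualifiers and the rest of the ABI document are deliberately left out. The helper names (`mangle_function`, `mangle_constructor`) are my own for illustration and are not part of any library.

```python
# Itanium ABI codes for a handful of builtin parameter types.
BUILTIN_CODES = {
    "void": "v", "bool": "b", "char": "c", "int": "i",
    "unsigned int": "j", "long": "l", "unsigned long": "m",
    "float": "f", "double": "d",
}


def _source_name(name):
    """Encode an identifier as <length><identifier>, e.g. 'foo' -> '3foo'."""
    return f"{len(name)}{name}"


def _encode_params(params):
    """Encode builtin parameter types; an empty list is encoded as 'v'."""
    return "".join(BUILTIN_CODES[p] for p in params) or "v"


def mangle_function(scopes, name, params):
    """Mangle scopes::name(params) for builtin parameter types only."""
    if scopes:
        nested = "".join(_source_name(s) for s in scopes)
        return f"_ZN{nested}{_source_name(name)}E{_encode_params(params)}"
    return f"_Z{_source_name(name)}{_encode_params(params)}"


def mangle_constructor(class_name, params):
    """Mangle the complete-object constructor (C1) of a top-level class."""
    return f"_ZN{_source_name(class_name)}C1E{_encode_params(params)}"


print(mangle_constructor("OperatorTest", []))       # default constructor
print(mangle_constructor("OperatorTest", ["int"]))  # OperatorTest(int)
print(mangle_function(["ns"], "foo", ["int", "double"]))
```

The two constructor calls reproduce the symbols used in the Python `__init__` example later in this message. Anything involving class-type parameters would need the substitution rules, which is where most of the real work is.
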
> >>>>> This proposal is the result of an experiment I did recently where I used
> >>>>> libclang to reimplement a C++ object model in Python by parsing C++
> >>>>> headers and then generating a script that could call the C++ symbols
> >>>>> in a .so directly. The main challenge was that I had to hijack
> >>>>> Python's support for calling C functions and use
> >>>>> implementation-defined knowledge of the C++ ABI to do so. The only
> >>>>> real problem I found was that it is impossible to return an object
> >>>>> with a non-trivial destructor by value from a C++ function to a
> >>>>> foreign language without that language knowing the C++ calling
> >>>>> convention.
> >>>>>
> >>>>> The motivation for this experiment was that the state of the art for
> >>>>> binding C++ to Python is not too great. If you are willing to write a
> >>>>> lot of C wrapper code using the Python C API then you will have a good
> >>>>> solution. Otherwise, if you want to use C++ based script binding
> >>>>> tools, the ones I tried have performance issues. I was burning 50% of
> >>>>> my compile time and 50% of my .so size. I hate to name names, because
> >>>>> they seem well written, are more mature than my efforts and represent
> >>>>> a lot of effort, but that was pybind11 and nanobind. Possibly things
> >>>>> will improve if someone uses the new C++ reflection code to generate
> >>>>> bindings without using templates.
> >>>>>
> >>>>> Overall, having a tool to write out my C++ script bindings in 0.5
> >>>>> seconds using C++ as an interface definition language was a great
> >>>>> experience by comparison. The point of using a script is to be able to
> >>>>> iterate quickly and this helped enable that.
> >>>>>
> >>>>> Here is an example of an overloaded C++ constructor being called
> >>>>> directly from Python. The C++ symbols have been loaded into the Python
> >>>>> module's global namespace and so they appear to be called directly.
> >>>>> The self arg is a "this" pointer to a buffer.
> >>>>>
> >>>>>     def __init__(self,*_Args,**_Kwargs):
> >>>>>         match _Len(_Args):
> >>>>>             case 0:
> >>>>>                 return _ZN12OperatorTestC1Ev(_Ctypes.byref(self))
> >>>>>             case 1:
> >>>>>                 return _ZN12OperatorTestC1Ei(_Ctypes.byref(self),_Args[0])
> >>>>>
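
The generated `match` statement above generalizes to a small dispatch table. What follows is a minimal sketch, not the code my tool emits: it selects an overload by arity and then by the exact type of the first argument. `make_overload_dispatcher` is a hypothetical helper, and the three `construct_*` functions are plain Python stand-ins for ctypes function objects such as `_ZN12OperatorTestC1Ei`, so the example runs without a shared object.

```python
def make_overload_dispatcher(overloads):
    """Build a callable that picks an overload by arity, then first-arg type.

    `overloads` maps (arity, first_arg_type) to a callable. Use None as the
    type to accept any first argument (or for zero-argument overloads).
    """
    def dispatch(*args):
        first_type = type(args[0]) if args else None
        # Try the exact first-argument type, then the arity-only fallback.
        for key in ((len(args), first_type), (len(args), None)):
            if key in overloads:
                return overloads[key](*args)
        raise TypeError(f"no overload takes {len(args)} argument(s)")
    return dispatch


# Stand-ins for the mangled C++ symbols (assumptions for illustration).
def construct_default():
    return "OperatorTest()"


def construct_from_int(value):
    return f"OperatorTest(int {value})"


def construct_from_float(value):
    return f"OperatorTest(double {value})"


construct = make_overload_dispatcher({
    (0, None): construct_default,
    (1, int): construct_from_int,
    (1, float): construct_from_float,
})

print(construct())
print(construct(3))
print(construct(2.5))
```

The lookup costs two dictionary probes per call, which is the kind of overhead a host language pays when its overload rules differ from those of C++. Matching on the exact type is a simplification; a real binding would also need a policy for implicit conversions.
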
> >>>>> My biggest complaint was that I needed any of this technology at all.
> >>>>> Arguably, the C/C++ compiler could emit a form of pre-compiled header
> >>>>> that described a part of the C/C++ API found in a C/C++ header. Then
> >>>>> every scripting engine could just load that instead of needing the
> >>>>> normal script binding boilerplate that is used. The C++ symbols
> >>>>> already have type information encoded in them and so it seems strange
> >>>>> to be manually configuring marshaling code for them in another
> >>>>> language. (Please forgive me if I am rubbing you the wrong way for the
> >>>>> second time in this email.)
> >>>>>
> >>>>> For those who are really curious, the script for parsing a C++ API and
> >>>>> generating a direct call wrapper is here:
> >>>>>
> >>>>> https://github.com/whatchamacallem/hatchlingplatform/blob/main/entanglement_example/src/entanglement.py
> >>>>>
> >>>>> Nota Bene: There was one bug I can't fix with the current design of
> >>>>> C++ and Python. Returning a class by value will result in it being
> >>>>> destructed without being copied first.
> >>>>>
> >>>>> I am happy to pull together the parts of the Itanium ABI that would
> >>>>> need to be standardized into a proposal if anyone is interested. This
> >>>>> is step 1: "float the idea." The first part would be to allow C to
> >>>>> identify and call C++ function pointers (in this case directly out of
> >>>>> a .so, although that detail has not been standardized) with code
> >>>>> written only in C that was not compiled with the types involved. The
> >>>>> manner in which it is done could still be implementation defined as
> >>>>> long as there was agreement between the two languages as to how to
> >>>>> operate the additional machinery C++ needs for the particular
> >>>>> platform's ABI. The second optional part would be to standardize a set
> >>>>> of requirements for reflection data for a subset of C++ that can be
> >>>>> read from a shared object directly or stored alongside one.
> >>>>>
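
As a rough illustration of the second part, reflection data stored alongside a shared object could be as simple as a table from declarations to symbols. The JSON schema below is entirely hypothetical, invented for this sketch, and the `sizeof` value is a placeholder. Only the two constructor symbols come from the example above. `load_constructor_table` is likewise my own name.

```python
import json

# Hypothetical sidecar file contents; the schema and the sizeof value are
# assumptions made for illustration, not an existing or proposed format.
SIDECAR_JSON = """
{
  "classes": {
    "OperatorTest": {
      "sizeof": 4,
      "constructors": [
        {"params": [], "symbol": "_ZN12OperatorTestC1Ev"},
        {"params": ["int"], "symbol": "_ZN12OperatorTestC1Ei"}
      ]
    }
  }
}
"""


def load_constructor_table(text):
    """Return {class_name: {param_type_tuple: symbol}} from sidecar JSON."""
    data = json.loads(text)
    table = {}
    for class_name, info in data["classes"].items():
        table[class_name] = {
            tuple(ctor["params"]): ctor["symbol"]
            for ctor in info["constructors"]
        }
    return table


table = load_constructor_table(SIDECAR_JSON)
print(table["OperatorTest"][()])
print(table["OperatorTest"][("int",)])
```

A scripting engine could read such a table at load time and look the symbols up in the shared object, instead of parsing headers or decoding mangled names itself.
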
> >>>>> Regards,
> >>>>> Adrian
> >>>>> --
> >>>>> Std-Proposals mailing list
> >>>>> Std-Proposals_at_[hidden]
> >>>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
> >>
>
Received on 2025-09-01 03:48:51