std-proposals

Re: [std-proposals] Calling C++ functions in a .so directly from foreign languages and C++ as an interface definition language

From: Paul Caprioli <paul_at_[hidden]>
Date: Mon, 1 Sep 2025 03:54:23 +0000
> [Need to have] an understanding of how to go from name+function type to symbol name

The C++ compiler knows how to do this. Why do it yourself? The Python C API can be called from C++ code. I would argue that Python's ctypes, which provides C compatible data types and allows calling functions in shared libraries, is the wrong tool.

> The issue is that most languages do not provide support for bridging to C++ automatically, and doing so manually is hard because it requires determining the mangling for a method, which means manually determining things like the full real type signature (post alias expansion), etc.

Well, support is provided by packages like pybind11 (requiring C++11) and nanobind (requiring C++17). It's true that they're not part of the Python standard library, but that's not so important. I have seen additions made to the Python C API specifically at the request of nanobind. I have seen Python core developers reach out to pybind11, nanobind, and other interoperability frameworks to request their moving away from APIs that Python would like to deprecate years in the future. These requests have been mutually agreed upon and honored. I've experienced it to be a cooperative, mutually respectful community of developers.

>> The moment your boss asks you to make your high performance C++
>> library available to the Python devs you end up going right back to
>> the stone age.

My company's FFT library has been made available to Python developers using nanobind. A clean build using -O3 and LTO takes less than 15 seconds on a high-end desktop that is only a couple of years old. That includes building libnanobind-static.a, which is 429KB. I am using Intel's LLVM compiler 2025.2.0. The output, hpk.cpython-311-x86_64-linux-gnu.so, is 1.1MB. The Python command `import hpk` loads this module (this shared library) into Python.

Nanobind is not magic--it's just C++ code. It's easier for me than calling the Python C API directly.
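
To illustrate the point that the compiler already maps name plus function type to a symbol, here is a minimal sketch in plain C++. It is not nanobind or the Python C API; the names `demo::scale` and `make_registry` are hypothetical. It only shows that glue code written in C++ can pick an overload by type and hand it out under a string name, without ever spelling a mangled symbol.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical library functions. Each overload gets its own mangled
// symbol, but nothing below needs to know what those symbols are.
namespace demo {
double scale(double x) { return 2.0 * x; }
float scale(float x) { return 2.0f * x; }
}  // namespace demo

// Hypothetical registry: exported name -> callable on doubles.
// A real binding layer would register with the foreign runtime instead.
std::map<std::string, std::function<double(double)>> make_registry() {
    std::map<std::string, std::function<double(double)>> reg;
    // The cast selects the overload by its type; the compiler and linker
    // resolve the corresponding symbol.
    reg["scale_f64"] = static_cast<double (*)(double)>(&demo::scale);
    // The float overload is wrapped so both entries share one call signature.
    reg["scale_f32"] = [](double x) {
        return static_cast<double>(demo::scale(static_cast<float>(x)));
    };
    return reg;
}
```

A caller would look up `make_registry().at("scale_f64")` and invoke it; the foreign side only ever sees the string names chosen by the glue code.
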
But one does not have to use these binding tools. The point I'm making is that the Python C API functions can be called from C++ code, and the C++ code can be compiled into a shared library that is loaded into Python.

> While writing this I did realize that there is an additional potentially “real world expectation" issue when trying to bridge to C++ entirely at runtime: you have to work entirely in terms of specific instantiations of template methods and classes - e.g. (and ignore whatever aliases may be involved) your bridge can’t use `std::vector<some_type>` unless there are existing specializations of every method you end up needing for `vector<some_type>` in the already generated code

Yes, but that's true for pure C++ as well (unless a library has a header-only implementation). I had to explicitly instantiate templated classes and functions to build libhpk_fft_avx512_fp64.so. That library is for type double and contains, for example, the class hpk::fft::InplaceCC<double> for computing in-place FFTs in complex double precision. If somebody wanted hpk::fft::InplaceCC<std::float64_t>, they would not find it in any shared library, and they could not use it.

Using nanobind, I had to explicitly bind hpk::fft::InplaceCC<T> for each T, i.e., for double, float, _Float16. Similarly, factory functions had to be bound three times, once for each T. These three factory functions are overloads from the Python perspective. In other words, C++ template argument deduction is accomplished using Python function overloading.

These C++ factory functions return a std::unique_ptr. Nanobind does the work of transferring ownership to Python and arranging to call the C++ Deleter when the Python reference count drops to zero.

Furthermore, hpk::fft::InplaceCC<double> is a C++ abstract base class. Calling its compute methods is making a C++ virtual function call. Derived classes, for example, have optimized code for small, medium, and large size FFTs and have hidden visibility.
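
The pattern described above (an abstract base class template, a factory returning std::unique_ptr, and explicit instantiation per T) can be sketched in plain C++ as follows. This is an illustrative sketch only, not the hpk code: `Scaler`, `SimpleScaler`, `make_scaler`, and `scaled` are hypothetical names, and multiplying by a factor stands in for the real FFT work.

```cpp
#include <memory>
#include <vector>

namespace sketch {

// Abstract base class: callers only ever see this interface.
template <typename T>
class Scaler {
public:
    virtual ~Scaler() = default;  // lets unique_ptr destroy through the base
    virtual void compute(std::vector<T>& data) const = 0;  // virtual call
};

// Concrete implementation; in a shared library it could have hidden visibility.
template <typename T>
class SimpleScaler final : public Scaler<T> {
public:
    explicit SimpleScaler(T factor) : factor_(factor) {}
    void compute(std::vector<T>& data) const override {
        for (T& x : data) x *= factor_;  // stand-in for the real computation
    }
private:
    T factor_;
};

// Factory template: in C++, T is deduced from the argument.
template <typename T>
std::unique_ptr<Scaler<T>> make_scaler(T factor) {
    return std::make_unique<SimpleScaler<T>>(factor);
}

// Explicit instantiations: only these would exist in a compiled library.
// A binding layer would register each one separately, as one overload per T.
template std::unique_ptr<Scaler<double>> make_scaler<double>(double);
template std::unique_ptr<Scaler<float>> make_scaler<float>(float);

// Hypothetical helper showing use through the base-class pointer.
template <typename T>
std::vector<T> scaled(std::vector<T> data, T factor) {
    auto scaler = make_scaler(factor);  // owns the object; freed on return
    scaler->compute(data);              // dispatches virtually
    return data;
}

}  // namespace sketch
```

For example, `sketch::scaled<double>({1.0, 2.0}, 3.0)` goes through the double instantiation. A type with no explicit instantiation would fail to link if the template definitions lived only inside the shared library, which is the same restriction the quoted text describes for runtime bridging.
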
It all just works, today, without standardizing name mangling.

Received on 2025-09-01 03:54:25