std-proposals

Re: [std-proposals] ABI

From: Hans <hguijtra_at_[hidden]>
Date: Fri, 12 Jul 2024 12:35:20 +0200
On 11/07/2024 22:16, Thiago Macieira via Std-Proposals wrote:
> On Thursday 11 July 2024 10:31:02 GMT-7 Hans via Std-Proposals wrote:
>> If the proposal is accepted, classes that are newly added to the
>> standard will be open for on-going optimisation and evolution. Existing
>> classes will not be because of backward compatibility reasons, but those
>> can be replaced by newer versions (std::regex2, std2::regex, something
>> like that), and the new versions will be newly added and therefore open
>> to improvement.
>
> You need to talk about the transition phase, when 99.9999% of the C++ programs
> are still using the non-stable API/ABI. Given that, the number of codebases
> that could benefit from the newly-declared-stable API/ABI is virtually
> indistinguishable from zero. So why would anyone use the new content? And who
> would they use it with?

It is likely that there will always be libraries that stick to passing
unstable classes through their public API. There is no way to stop this
from happening. We can only make it more attractive to choose the
standard tools (the ones I proposed, which enforce class stability),
and document more clearly that you should not assume an ordinary class
stays stable once it appears in your public interface.

If a library adopts the standard tools for creating public interfaces,
it will be safe to call no matter what ABI changes occur thereafter. And
if a library disregards all the warning signs and just happily passes
std::unordered_map<std::jthread, std::pair<std::regex, std::mutex>>
across a public interface - well, at least they were warned.

Earlier you stated that my plan requires a time machine. It does not; it
is a plan for the future. It applies to new classes (not even to
existing classes, since those are already in use by public APIs today).
It won't magically fix every problem overnight, but rather over years,
as libraries slowly adopt the new mechanisms.

> I also think you're trying to solve a non-problem. First, who is your
> audience? Who benefits from this? Who are they interacting with that needs this
> to be solved?

The audience is library authors. They have to modify their public
interfaces to only pass stable types.

And everybody benefits. Once adopted, new standard library classes can
benefit from ongoing optimisation work, unburdened by the need to
maintain a fixed ABI. The performance benefits will be available to
everyone.

> Second, why isn't the current status quo sufficient? The libstdc++ ABI has been
> kept stable for 20 years now. By that I mean it was intended to be kept
> stable: all breakages that have happened (and they have) were unintentional,
> including that of std::__cxx11::basic_string. If they were the result of
> mistakes, there's nothing preventing mistakes from happening using the newly-
> marked classes too, so why would your proposal make a more stable ABI than
> what libstdc++ has delivered for two decades?

The current status quo is not sufficient because it fossilizes every
class the moment it is first released. Look at the woes Linux went
through when the standard mandated SSO over CoW in std::string! Look at
the performance complaints related to std::regex, std::deque,
std::unordered_map, etc.! These things happen because of the current
status quo.

> Third, what is the cost of this? As in both performance cost for the code in
> question, the cost of porting code over to it, and the cost of writing new
> classes using this method, which itself is more costly to the developers?

Performance: any transfer through an std::stable class does incur a
cost, but if carefully designed, that cost should not be more than that
of an std::move. That is, std::stable::string, if it has a large enough
SSO buffer, can handle a move from std::string either by copying the SSO
buffer, or by just moving the pointer. This still implies two moves take
place (into, and out of std::stable::string), potentially copying as
much as 64 bytes.
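
As a rough illustration of that cost model, here is a minimal sketch of
a fixed-layout string. Everything in it is hypothetical: the name
stable_string, the 48-byte inline buffer (chosen so the object is 64
bytes on a typical 64-bit target) and the member layout are assumptions
made for the example, not anything specified by the proposal. Portable
code also cannot take over std::string's heap buffer, so the conversion
in this sketch copies; a standard library implementation would have the
internal access needed to move the pointer instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical stand-in for the proposed std::stable::string.
// The layout is fixed by hand: pointer, size, then an inline buffer.
class stable_string {
public:
    static constexpr std::size_t sso_capacity = 48;  // assumed size

    stable_string() noexcept : data_(sso_), size_(0) { sso_[0] = '\0'; }

    // Entering the stable type. This sketch always copies the characters.
    explicit stable_string(const std::string& s) : size_(s.size()) {
        data_ = size_ < sso_capacity ? sso_ : new char[size_ + 1];
        std::memcpy(data_, s.c_str(), size_ + 1);
    }

    // Move: copy the inline buffer, or take over the heap pointer.
    stable_string(stable_string&& other) noexcept : size_(other.size_) {
        if (other.is_inline()) {
            std::memcpy(sso_, other.sso_, size_ + 1);
            data_ = sso_;
        } else {
            data_ = other.data_;
        }
        // Leave the source as a valid empty string.
        other.data_ = other.sso_;
        other.size_ = 0;
        other.sso_[0] = '\0';
    }

    stable_string(const stable_string&) = delete;
    stable_string& operator=(const stable_string&) = delete;
    stable_string& operator=(stable_string&&) = delete;

    ~stable_string() { if (!is_inline()) delete[] data_; }

    bool is_inline() const noexcept { return data_ == sso_; }
    const char* data() const noexcept { return data_; }
    std::size_t size() const noexcept { return size_; }

    // Leaving the stable type again.
    std::string to_std() const { return std::string(data_, size_); }

private:
    char* data_;
    std::size_t size_;
    char sso_[sso_capacity];
};

// Round trip: into the stable type, one move, and out again.
std::string round_trip(const std::string& s) {
    stable_string in(s);
    stable_string out(std::move(in));
    assert(in.size() == 0);  // the moved-from object is empty
    return out.to_std();
}
```

For a short string the move copies the inline buffer; for a long one it
only hands over the pointer, so the heap block itself is never copied
between two stable_string objects.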

It may be difficult (and even undesirable) to accommodate custom
allocators in std::stable::string. In that case, the use of custom
allocators may necessitate an allocation when passing through
std::stable::string.

std::stable::string is the most complex of all classes currently
proposed for std::stable. std::vector does not have an SSO buffer, and
the two view classes are essentially trivial.
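
To show why a view class is close to trivial, here is a small sketch in
today's C++. The names stable_string_view, count_spaces and as_stable
are hypothetical, and the layout (pointer first, then length) is an
assumption for the example rather than the proposal's definition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a stable string view: two words, fixed order.
struct stable_string_view {
    const char* data;
    std::size_t size;
};

// Pin down the properties a stable type would rely on.
static_assert(std::is_standard_layout<stable_string_view>::value,
              "layout must be predictable");
static_assert(std::is_trivially_copyable<stable_string_view>::value,
              "must be copyable as raw bytes");
// Holds on mainstream targets where pointers and size_t have equal size.
static_assert(sizeof(stable_string_view) ==
                  sizeof(const char*) + sizeof(std::size_t),
              "no padding expected");

// Hypothetical public API function: only the view crosses the boundary.
std::size_t count_spaces(stable_string_view text) noexcept {
    assert(text.data != nullptr || text.size == 0);
    std::size_t n = 0;
    for (std::size_t i = 0; i < text.size; ++i) {
        if (text.data[i] == ' ') ++n;
    }
    return n;
}

// Caller side: any std::string implementation can produce the view.
stable_string_view as_stable(const std::string& s) noexcept {
    return {s.data(), s.size()};
}
```

The caller keeps its own std::string, whatever its internal layout; the
library only ever sees the pointer and the length.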

Porting: this requires adding markup to functions in the public API,
ensuring that all parameters involved are stable (the compiler will not
allow unstable parameters), and likely a change to the build script,
since the toolset has to be told it is creating a static or dynamic
library. This can potentially affect code generation: any function that
is not part of the public API can be more aggressively optimized (see
section "new optimisation opportunities" in the paper).

Ensuring that all parameters are stable may require the introduction of
stable wrapper classes that encapsulate access to unstable classes.
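
One way such a wrapper could look with current language features is an
opaque handle. This is only a sketch: stable_regex and its members are
hypothetical names, and the proposed 'stable' markup is not used since
no compiler implements it. The wrapper's only data member is a pointer
to a type defined inside the library, so the layout of std::regex never
appears in the public interface.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>

// --- Public header part (hypothetical) --------------------------------
// Only pointers and sizes cross the boundary; std::regex is not visible.
class stable_regex {
public:
    stable_regex(const char* pattern, std::size_t length);
    ~stable_regex();
    stable_regex(const stable_regex&) = delete;
    stable_regex& operator=(const stable_regex&) = delete;

    // True if the whole of [text, text + length) matches the pattern.
    bool matches(const char* text, std::size_t length) const;

private:
    struct impl;  // defined only inside the library
    impl* impl_;
};

// --- Library-internal part --------------------------------------------
// The unstable class lives here and can change without affecting callers.
struct stable_regex::impl {
    std::regex re;
};

stable_regex::stable_regex(const char* pattern, std::size_t length)
    : impl_(new impl{std::regex(pattern, length)}) {}

stable_regex::~stable_regex() { delete impl_; }

bool stable_regex::matches(const char* text, std::size_t length) const {
    assert(impl_ != nullptr);
    return std::regex_match(text, text + length, impl_->re);
}
```

The price is one allocation and one indirection per object, which is the
kind of cost a library author would weigh when deciding which types to
wrap.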

The only cost in writing new classes is considering if they should have
a stable ABI (and if so, what that ABI should be), and adding the
contextual keyword 'stable' to such stable classes. While designing a
stable class requires careful consideration, I would argue that this was
always the case, even without formalizing the concept of stability in
the standard.

As a final cost, the standard may wish to adopt new classes that replace
existing classes with problematic performance: std::deque2, std::regex2,
etc. (or name them something else, it's not important). These can start
life as straight-up copies of std::deque, std::regex, etc., and be
optimized over time, as they are no longer burdened by a fixed ABI.

> Fourth, assuming there is a need that isn't solved, is your audience best
> served with a different type of solution? It almost sounds like the data
> exchange should happen on a network-like serialisation protocol.

Ok, that's wild. Let's see:

- Serialisation to a network-like protocol would incur absolutely
massive overhead, compared to just passing a few variables on the stack.

- It doesn't solve the problem, it just moves it to a different spot.
Different implementations of unstable classes would still serialize
differently from each other, unless you specify their ABI. This can be
done much more cheaply using the method I proposed.

- It also doesn't solve the problem that people would still use
unstable classes in public interfaces, unless you plan to specify the
serialisation ABI for every class in the known universe. My solution
calls for a handful of stable classes, not millions, and as such seems
more practical.


Hans Guijt

Received on 2024-07-12 10:35:21