Date: Wed, 17 Jul 2024 20:00:47 +0200
On 14/07/2024 01:10, Thiago Macieira via Std-Proposals wrote:
> You're asking that the Standard Library add a new set of classes that promises
> stability. But you have not provided any evidence or method by which such new
> classes would be more stable than the current classes are right now and have
> been for two decades.
Existing classes are de facto stable: not because they are defined to be
stable, but because we decided changing them would be worse than leaving
them as-is. Using such a class in a public interface works...
sometimes... and could conceivably stop working tomorrow, because no
guarantee of it remaining stable indefinitely was ever provided.
The proposed stable classes aren't stable thanks to any deep magic in
their implementation, but because the standard formally declares them to
be stable (and provides the conditions for making that happen). You can
use them in a public interface, and code written a century from now will
still correctly interoperate with it.
>> You can find the proposal here:
>> https://docs.google.com/document/d/1P1mL1J0rXJlRnLYrcquzLVqE3jPd_IC6uQYnMIE6
>> 8vA/edit
>
> Fair.
>
> In "What are the issues with ABI?", you say that there is no guarantee of
> interoperability between different versions of a single implementation of the
> standard library. That's false. All three major implementations do promise it.
Only in a very shallow sense, though. If I switch msvc build mode from
release to debug, sizeof (std::string) changes from 32 bytes to 40
bytes. I also had to get two 3rd-party manufacturers to upgrade their
compilers (from msvc2010 to msvc2017, IIRC) because they were using
std::strings in their public interface, and their std::strings were not
the same as mine.
Of course gcc famously changed from CoW to SSO, and caused quite a few
problems on Linux as a result. And finally, there is no agreement
between compilers on what an std::string looks like, see
https://devblogs.microsoft.com/oldnewthing/20240510-00/?p=109742
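To make the build-mode mismatch concrete, here is a minimal sketch of the kind of check a library could do at its boundary. The names layout_info, local_string_layout and same_layout are made up for this illustration and are not part of the proposal or of any standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type: size and alignment of a type as seen by one build.
struct layout_info {
    std::size_t size;
    std::size_t align;
};

// Layout of std::string in *this* translation unit. The result depends on
// the compiler, the standard library and the build mode in use.
constexpr layout_info local_string_layout() {
    return layout_info{sizeof(std::string), alignof(std::string)};
}

// Two builds can only hope to exchange std::string objects directly if at
// least these values agree.
constexpr bool same_layout(layout_info a, layout_info b) {
    return a.size == b.size && a.align == b.align;
}
```

A library could export the value it was built with, and a client could compare it against its own at load time. Agreement is necessary but by no means sufficient: two strings of equal size can still arrange their members differently.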
> Next, you claim that implementors have reluctance in optimising. That's false
> too: there is a lot of optimisation work that goes in all three in every
> release. Yes, their hands are tied by the ABI, but that does not preclude all
> optimisation. You are correct that some forms would be disallowed, but your
> statement does not say that. It's overly broad. Plus, you have to show
> evidence that such disallowed optimisation would be wanted: please provide a
> few examples of where that could have happened but for the need to keep ABI.
It does qualify that statement by saying "...if that breaks ABI" though,
so it isn't talking about all optimisation.
Other examples would be std::deque, which is almost the same as
std::list on msvc but cannot change because of ABI concerns. std::string
could probably do better on all platforms (by leveraging more space for
the SSO buffer). std::unordered_<*> has been indicated in the past as
sub-optimal but that may primarily be because of its ill-advised bucket
interface. std::jthread wouldn't have been needed if std::thread could
have been extended. I'm not aware of other classes.
> std::regex is not such an example. The reason it isn't being changed is not
> ABI, but behaviour, in particular of the flavour of regular expression that
> class supports. Applications are written with an expectation in mind and that
> is what the Standard is for. Changing std::regex would break the Standard's
> promise, so it would be far easier to add a new class and deprecate the old
> one until it can be safely removed (q.v. Qt5 adding QRegularExpression that
> replaced the old QRegExp, and Qt6 removed the latter).
Searching for 'slow std::regex' returns numerous hits, including one
that explains the slow performance as something that cannot be fixed
because of ABI concerns:
https://stackoverflow.com/questions/70583395/why-is-stdregex-notoriously-much-slower-than-other-regular-expression-librarie
Maybe he's wrong, of course...
> The next bullet point says there's a reluctance in the committee members in
> evolving classes for fear of breaking the ABI. While that is true to some
> extent, your paper would benefit from more examples. You cannot generalise on
> one example alone.
Fair enough. Having said that, I do not think I can change the opinion
of the committee on this issue. Either they already agree it is a
problem that needs looking at, or they don't, and what I say won't
change that either way. The section was added out of a desire to
describe the status quo; I never had the illusion that I would change
anyone's mind on this.
> You refer to two strategies to keeping ABI from C, but you failed to note that
> as a derivative from C, C++ already has the two available. More importantly,
> C++ has been using those two strategies for 30 years. The majority of the
> Standard Library class implementations are actually of the "set in stone" type
> of structure, while the opaque handle type is what Qt has very successfully
> used for 30 years too. So saying "good candidate for adoption into C++" is
> unnecessary, since C++ already adopted the two 30 years ago.
>
> And given that C++ *has* adopted them and you claim that there are ABI
> problems, you need to discuss why.
That's a good point. I don't think I'm clear enough in this area. I'll
work on rephrasing it.
> Your solution is proposed to declare which classes are stable and which may
> not. I'm claiming we already have that, it just happens that the entirety of
> the Standard Library is in the former category, while the experimental stuff is
> found in Technical Reports. Moreover, library implementors have availed
> themselves of techniques to mark their implementations of the future-stable
> Standard Library as "in development". Therefore, I am claiming that your
> proposal was adopted with great success, 30 years ago.
>
> Providing tools to enforce by the compiler so one knows whether they are using
> unstable ABI in their content that is itself marked for stability would be nice.
> Can you expand your paper with the actual mechanisms by which the compiler
> should do that? Not a placeholder.
I will try to come up with something that has minimal syntactic
disruption, but it's also something that offers considerable opportunity
for bikeshedding, of course...
> You're saying that a stable class must be standard layout. Why? That seems
> unnecessary.
It's a first basic attempt at defining what the requirements on stable
types would be, and I acknowledge that it requires refinement. I
deferred that work because it is a lot of effort, and I wanted to make
sure it would be necessary before committing to it.
My reason for choosing standard layout is because it is a concept that
already exists, and because it rules out the presence of a vptr/vtable.
I do not know if these are implemented identically in various compilers
(at least, within the same platform), and demanding that they all adopt
the same ABI for this would be a non-starter.
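As a small illustration of that last point, the existing type trait already rejects any class with a virtual function. The two structs below are hypothetical examples, not types from the paper.

```cpp
#include <type_traits>

// Hypothetical plain type: only data members, no virtual functions.
struct plain_handle {
    void*         data;
    unsigned long size;
};

// Hypothetical polymorphic type: the virtual destructor forces a vptr.
struct with_virtual {
    virtual ~with_virtual() = default;
    void* data;
};

// A class with virtual functions is never standard layout, so requiring
// standard layout keeps vptr/vtable details out of a stable interface.
static_assert(std::is_standard_layout<plain_handle>::value,
              "plain_handle is standard layout");
static_assert(!std::is_standard_layout<with_virtual>::value,
              "with_virtual is not standard layout");
```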
> You should answer your question on pointers too; don't leave the question
> open. In doing so, you should also ponder what happens to a class like
> std::unique_ptr<T> where T is not explicitly marked stable.
Good point.
> In discussing the stability of the standard library, you say that it does not
> help with interfacing with other standard libraries. That's true. It's not
> desired, so it's also not a problem.
That's a matter of opinion, isn't it? Interoperability, in the broadest
sense of the word (i.e. between standard libraries, between build modes,
and between languages) seems a desirable trait to me.
> You then propose that the current classes in the Standard Library not be
> marked as stable. That has a huge effect that you're not addressing at all in
> your paper. Developers *expect* stability and have come to rely on it for the
> Standard Library as it is today. Voiding that contract is either going to be
> shot down very quickly, or you need to provide a path to mitigating the
> effects.
I am aware that proposing an ABI break is not going to be accepted, and
I'm not proposing that.
If you mark existing library classes as stable, you cannot use 'stable'
to indicate a known, fixed ABI, and so you are effectively removing its
interoperability guarantee. This is why I choose not to mark these
classes as stable.
However, not marking these classes as stable does not imply they can or
must be changed. It merely preserves the status quo: they can
theoretically be changed, but in practice it won't ever happen.
This is not the same for _future_ classes though! Once stability marking
is part of the standard, new classes that are not marked stable should be
expected to change.
> In the std::stable namespace section, you say "committee-supplied ABI" but
> that's woefully short of details. Is it just a description of the class layout
> in the form of a C structure (and hence the "standard layout" requirement from
> above)? If so, say so. For me, ABI implies much more, including the size and
> alignment of types, which the committee could never provide for all possible
> target architectures.
It's in the form of a C structure. The ABI that you talk about here is
the platform ABI, which is out of scope for both this paper and the
standard, for the reason that you give.
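To show what I mean by "in the form of a C structure", here is a rough sketch. The struct name and its members are invented for this example; the paper does not prescribe this particular layout.

```cpp
#include <cstddef>
#include <type_traits>

extern "C" {
// Hypothetical layout description for a stable string: the standard would
// fix the members, their order and their types, and nothing else.
struct stable_string_layout {
    char*       data;      // pointer to the characters
    std::size_t size;      // number of characters in use
    std::size_t capacity;  // number of characters allocated
};
}

// The description is plain C, so it is standard layout by construction.
static_assert(std::is_standard_layout<stable_string_layout>::value,
              "layout description must be standard layout");

// Member order is fixed by the description; the actual sizes, alignment and
// padding still come from the platform ABI.
static_assert(offsetof(stable_string_layout, data) == 0,
              "first member sits at offset 0");
static_assert(offsetof(stable_string_layout, size) <
                  offsetof(stable_string_layout, capacity),
              "members appear in declaration order");
```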
> There are other ABI problems that would need addressing. For example, whether
> a parameter passed by value is destroyed by the caller or by the callee. This
> differs between the MinGW ABI used by Clang and GCC on Windows and the MSVC
> ABI. If your intent was to allow passing of non-trivially-destructible types
> as parameters across Standard Libraries (assuming the name mangling is not an
> impediment), you'd need to somehow fix this problem. Another one is for the two
> classes that allocate memory in your proposal: std::string and std::vector.
> Without solving the problem of name mangling, there may be two ::operator
> new / ::operator delete pairs, and therefore deleting memory allocated by the
> other cannot be guaranteed to work.
This is an excellent point, and one I had not considered sufficiently.
As you say, operator new / delete are specific to their standard
library, and mixing this across libraries is going to be tricky or
impossible. I was worried about allocator support, but the problem is
much more fundamental than that.
I suppose one option is to include some kind of allocator-like
interface. This would also solve the more general allocator problem, but
it would come with a cost, of course.
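A rough sketch of what such an allocator-like interface might look like: the object carries a pointer to the functions that allocated its memory, so whichever side destroys it releases the memory through the matching function. Every name here is hypothetical, and std::malloc / std::free merely stand in for whatever the owning library would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical allocator table: plain function pointers, so it can cross a
// library boundary without depending on either side's operator new/delete.
struct stable_allocator {
    void* (*allocate)(std::size_t bytes);
    void  (*deallocate)(void* p);
};

// Hypothetical owning buffer that remembers who allocated its memory.
struct stable_buffer {
    char*                   data;
    std::size_t             size;
    const stable_allocator* alloc;
};

inline void* malloc_allocate(std::size_t bytes) { return std::malloc(bytes); }
inline void  malloc_deallocate(void* p) { std::free(p); }

// Stand-in for the allocator of the library that creates the buffer.
inline const stable_allocator* default_allocator() {
    static const stable_allocator table = {&malloc_allocate, &malloc_deallocate};
    return &table;
}

inline stable_buffer make_buffer(const char* text, const stable_allocator* a) {
    assert(text != nullptr && a != nullptr);
    const std::size_t n = std::strlen(text);
    char* p = static_cast<char*>(a->allocate(n + 1));
    if (p == nullptr) return stable_buffer{nullptr, 0, a};
    std::memcpy(p, text, n + 1);  // copy including the terminating '\0'
    return stable_buffer{p, n, a};
}

// Safe to call from any library: the memory goes back to its own allocator.
inline void destroy_buffer(stable_buffer& b) {
    if (b.data != nullptr) b.alloc->deallocate(b.data);
    b.data = nullptr;
    b.size = 0;
}
```

The cost is visible in the struct itself: one extra pointer per object, plus an indirect call on every allocation and deallocation.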
> I am also very skeptical that there is a problem to be solved in the first
> place. The implementations are stable, the cases where a change was desired
> but not permitted due to stability are few and far between, usually minor, and
> communication between Standard Libraries may be desired by users, but not by
> the implementors themselves so any attempt to move in that direction is unlikely
> to gain traction.
That this is simply not considered a problem was always a possibility. The
community at large seems to believe there is a problem (at least, that's
the impression one could get from reading /r/cpp), but it may not be as
bad as they think.
> Finally, your working on the proposal may spur other ideas in parallel tracks
> that are workable. In particular, I am very interested in a way to mark
> functions and classes exported from libraries and, by consequence, marking
> what is internal and thus subject to change.
Although I added it in support of stability marking, I also think it's
useful to have on its own.
> But here's a suggestion: your paper seems to be addressing three things in
> one. It might be advisable to split and address each in sequence:
>
> a) a core language change to the syntax that would allow marking of classes
> and functions as stable
> (I'm not particularly interested in the stability argument, but as I said
> above, a syntax for exporting and importing across dynamic library boundaries
> is very much needed)
I would consider this to be the main purpose of this paper.
> b) an addition to the Standard Library of some classes whose memory layout is
> more tightly defined by the Standard, with an eye towards communication between
> Standard Library implementations and possibly with other languages. You can
> rely on all the C ABI, but you'll need to define every single behaviour above
> that of C, like when destructors get run.
> (Personally, I think this is the most interesting portion)
I thought that providing an initial set of classes would be helpful, but
deferring it to a follow-on paper is fine too. I'm fine with removing
that section.
> c) a discussion of relaxing the current stability requirements in the regular
> classes in the Standard Library, with the possible fall-outs and transition
> mechanisms.
> (this one I would vehemently oppose)
That's easy enough: I'm not proposing relaxing anything in this area. I
just don't want to formally mark them as stable, since they are de facto
stable, instead of formally stable.
Hans Guijt
Received on 2024-07-17 18:00:51