
Re: [std-proposals] Possible clean API break solution

From: Marcin Jaczewski <marcinjaczewski86_at_[hidden]>
Date: Thu, 1 Sep 2022 20:37:10 +0200
On Thu, 1 Sep 2022 at 03:28, Thiago Macieira <thiago_at_[hidden]> wrote:
>
> On Wednesday, 31 August 2022 19:00:09 -03 you wrote:
> > I do not think that `#ifdef` will be a problem, as any error will have
> > clear link error as each version has a distinct
> > mangled name.
>
> It's required for the old language/epoch to parse the file that declares stuff
> in the first place.
>
> > I would see problem more in that:
> > ```
> > extern "C++20"
> > {
> > std::string a;
> > }
> >
> > extern "C++30"
> > {
> > std::string b;
> > }
> >
> > static_assert(!std::is_same_v <decltype(a), decltype(b)>);
> > ```
>
> I'm not understanding your position.
>
> I am assuming the above is guaranteed. I understand some people may find that a
> problem, but doing otherwise is a recipe for bigger problems.
>

I mean that this is a bigger problem than the need for `#ifdef` in
headers, but I do not think it is big enough to be a deal breaker.


> > `std::_30::string` and `std::_20::string` are two different types, with
> > completely different name mangling.
>
> Sure. And they would be so today.
>
> > `std::string` is one of these two based on what context of `extern
> > "???"` is used.
>
> No. It's one thing only and never changes. As I said above, things that change
> depending on context are a recipe for trouble.
>

If every user needed to use this every day then yes, but most users
should never touch `extern`.
This is an expert-level feature for handling multiple ABI versions at once.
Most programs need only one.
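
For comparison, the "two distinct types, one default name" arrangement
can already be approximated today with an inline namespace, which is
roughly how libstdc++ keeps its old and new `std::string` apart. A
minimal sketch; the names `lib`, `v20` and `v30` are made up for
illustration and stand in for `std`, `std::_20` and `std::_30`:

```cpp
#include <type_traits>

namespace lib {

// Old layout, kept only so that old object files still link.
namespace v20 {
struct string { const char* data; };
}

// New layout. `inline` makes lib::string name this version by default.
inline namespace v30 {
struct string { const char* data; unsigned long size; };
}

} // namespace lib

// Two unrelated types whose qualified names (and hence mangled names) differ.
static_assert(!std::is_same_v<lib::v20::string, lib::v30::string>);
// The unqualified name picks exactly one of them, fixed by the library.
static_assert(std::is_same_v<lib::string, lib::v30::string>);
```

The difference from the `extern "C++20"` idea is that here the default
is fixed once by the library, whereas the proposal would let the
surrounding `extern` block choose it.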

> > And what version is `std::string` is controlled by the compiler as you
> > can now choose `-std=c++20` or `-std=c++23`.
>
> No, it should not change.
>
> And it should ESPECIALLY never be the -std= option. If you want an ABI-
> breaking option, choose something else. May I recommend the -x option, which
> selects the language?

It needs to, once `-std=c++23` (or c++30) breaks all mangling as
described in p2028r0. Besides, `-x` does not allow easy mixing of
different ABIs, for example when only one function needs a specific ABI.


>
> > Code NEED to behave differently and have different `std::string`
> > otherwise compiling old versions will be impossible.
>
> No, code needs to remain unchanged, otherwise recipe for trouble due to subtle
> bugs and incorrect things getting mixed. If my code is written for one
> language, don't make it be subtly different with another, which would then try
> to call non-existent functions in my shared library.

If old code needs an old type, then it should get the old mangling
unless you explicitly override it.
Any confusion will result in a link error, not a silent error. With
modules it would even be reported during compilation.

>
> > This will need a new compiler to compile everything or at least to
> > prepare only specific objects files that could be correctly linked
> > in the old compiler.
>
> That didn't make sense. Please clarify.
>
> I presume you're not talking about rebuilding the entire C++ world.
>

Imagine we have a big code base compiled by an old compiler, and you
need to add some new code that is only available in C++30.
In my proposal you do not need to recompile everything, only the new
part, which you explicitly force to publish part of its ABI in the old
version. Then the object file or module will link correctly with the
rest of the old objects.

Alternatively, if you can modify the code base and recompile everything
with the new compiler but can't fully use the new version of the
standard library, you can simply add `extern "C++20"` everywhere to
force the compiler to use the old version of the standard library.
Again, any error in this process will be detected at link time,
because by default the mangling change is viral.

> > Yes, for current compilers it would need `#ifdef` but this use case
> > will happen in C++23 or C++26
>
> That assumes the new core language has no new syntactic changes that would
> thwart the parsing of the old compiler. That's a very unlikely assumption.
>
> Unless the extern "Lang" { .. } is parsed as "balanced token soup", which was
> rejected for the if constexpr case.
>

I only suggest breaking the `std` library, not the whole of C++. In
theory it could affect more things (like how `extern "C"` prevents
overloads), but I think the behavior of pure user code (code that does
not touch any standard types) should never change, aside from different
name mangling in each version of the standard.
Maybe not exactly every version, more a "decade"/epoch of standards:
C++17 and C++20 have the same mangling, while C++23 and C++30 have a
different one compared to C++20.

> > > > #if __cplusplus > 202002L
> > > > extern "C++20" {
> > > > #endif
> > > >
> > > > //...
> > > >
> > > > #if __cplusplus > 202002L
> > > > } // extern "C++20"
> > > > #endif
> > >
> > > This doesn't look correct. Either you meant extern "C++" or the ... should
> > > be inside the #if too.
> >
> > Example is correct, this is for C++ code that needs work now and
> > should have linkage like C++20 even when compiled in C++26 or newer
> > version.
>
> Then I don't think it'll work. As I said, given a library that is already
> compiled, it works with exactly one ABI. You can't apply another in the
> header.
>

Yes, an object file is compiled for a specific ABI, but who compiles
the header? A new compiler with a different ABI.
Putting `extern "C++20" {` around it when compiling with the new
compiler forces that compiler to obey the old ABI, and as a result
allows the code to link correctly with the old object file. This could
even be done by the end user if they only have very old object files
and headers.
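
A sketch of how such a guard could be packaged so that the same header
still parses with today's compilers. `OLDLIB_HAS_EXTERN_EPOCH`,
`OLDLIB_ABI_BEGIN`, `OLDLIB_ABI_END` and `oldlib::answer` are
hypothetical names; no compiler accepts `extern "C++20"` today, so the
fallback branch uses the plain `extern "C++"` that already exists:

```cpp
// Hypothetical feature test: nothing defines this macro today.
#if defined(OLDLIB_HAS_EXTERN_EPOCH) && __cplusplus > 202002L
#  define OLDLIB_ABI_BEGIN extern "C++20" {   // proposed syntax, not valid C++ yet
#  define OLDLIB_ABI_END   }
#else
#  define OLDLIB_ABI_BEGIN extern "C++" {     // valid today, changes nothing
#  define OLDLIB_ABI_END   }
#endif

OLDLIB_ABI_BEGIN
namespace oldlib {
// Declarations that must keep the linkage of the already-compiled library.
constexpr int answer() { return 42; }
}
OLDLIB_ABI_END

// The wrapped declarations are used exactly as before.
static_assert(oldlib::answer() == 42);
```

With the macros, the `#if` lives in one place instead of being repeated
at the top and bottom of every header.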


> > I would see rolling out this break would take two standard versions,
> > first would introduce all these new features that are required for it
> > to work.
> > Next version will make breaking changes by bumping name mangling.
>
> See above.
>
> And you can't say "mangling" in the standard.

OK, the standard could use a different word for this and specify that
`extern "C++20"` in a declaration makes it a different entity from one
declared using `extern "C++30"`.

>
> > Another thing if we already have some control over mangling then why
> > not allow some thing like `extern "C++VS"` or `extern "C++GCC"` as
> > extension.
> > This would improve portability between different compilers.
>
> We don't have control over mangling and this type of compatibility will not
> happen. All compilers that have an interest in being compatible with one
> another already are. The one that doesn't (Microsoft) has no interest in
> making the lives of everyone else easier or being compatible with them.
>

This is not a requirement but a possibility: if it gets used, fine; if
not, that is fine too.
Something like this could be used by Cygwin, as it lives in both the
GCC and Windows worlds.
Right now they manage to do this anyway, but the difference would be
that it could rely less on custom extensions and more on standard code.

> --
> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
> Software Architect - Intel DCAI Cloud Engineering
>
>
>



Here is an example of how a trimmed-down std header (or module, as C++
should migrate there over the next 10 years) would look:
```
namespace std {

extern "C++20" class vector { }; // old vector with the `bool` specialization
extern "C++30" class vector { };

// `extern` here does not affect mangling, only visibility and which names
// are visible by default on the right-hand side
extern "C++20" using string = ...;
extern "C++30" using string = ...;

namespace _20 {
   // these usings need some marking that makes them visible in any version,
   // maybe something like `extern "C++ALL"`?
   using vector = extern "C++20" ::std::vector;
   using string = extern "C++20" ::std::string;
}

namespace _30 {
   using vector = extern "C++30" ::std::vector;
   using string = extern "C++30" ::std::string ;
}

}
```
In the same namespace scope we declare duplicated names, but only one
is visible by default, based on the context where it is used. Any
confusion about which version is used could be easily detected by the
compiler (during compilation or linking), as the old and new mangling
could be similar enough that, having one, you can check for the other
and correctly report mismatches.

Another thing is that the existence of all `extern "C++20"` types
could be implementation defined.



------------------------------other email---------------------------



On Thu, 1 Sep 2022 at 14:51, Thiago Macieira via Std-Proposals
<std-proposals_at_[hidden]> wrote:
>
> On Wednesday, 31 August 2022 20:13:00 -03 Marcin Jaczewski via Std-Proposals
> wrote:
> > Basic idea from p2028r0 was that it will handle this case too. We do
> > not tweak some std types, we break everything, user types too.
> > in C++30 `myFunction` will mangle as `_Y10myFunction4Name` and will
> > not link with `_Z10myFunction4Name`.
>
> Why do you think breaking everything is a good idea?
>
> If the already-compiled library provides only the old symbol, you get a linker
> error. The problem is that the old library's header doesn't know that someone
> down the line wants to use the new ABI.

I would say this is a necessary evil, as Arthur showed that tweaking
only the standard types does not solve the problem.
On its own it would even be a bad idea, but with the parts I propose it
becomes "ok".

>
> Now make this three libraries:
> 1) one using the old ABI
> 2) one using the new ABI
> 3) one needs to use the two above
>
> How do you mix them in the same translation unit now? Bear in mind that the
> out-of-line content of each of libraries (1) and (2) is already compiled, but
> you're parsing and compiling the headers of them right now.
>

I have already shown an example of this case:
```
extern "C++20" {
#include <oldApi.hpp>
}
#include <newApi.hpp>
```
This will force everything from the first header to have the same
mangling as old C++, even when compiled as C++30.

> Now make this worse by having 3 ABIs, not just two, because you opened the
> door.
>

This method would allow even 10 ABI versions; if we break C++ at
10-year intervals, then we have the next 100 years covered.


> At a minimum, the new ABI must be opt-in everywhere. That's why I said this
> can never be triggered by the same thing that selects new core language
> features. Requiring that ALL the existing C++ headers mark themselves as the
> old ABI is not going to happen, which leaves the marking to the new ABI.
>

This is mainly a problem for header files, as on their own they do not
carry any information about how to mangle type names.
In the case of precompiled modules all the information is available and
each symbol has a predefined mangled name.
Linking a module to any other module will not change it in any way.

As for which ABI is the default: the whole point of breaking it is to
have a `std::vector<bool>` that behaves like other vectors, or a
`std::unique_ptr` that has `[[trivial_abi]]`. If you compile as C++26
you expect this behavior, not the old one that is compatible with the
old ABI. At that point it would probably be better to simply drop the
whole `std::` and use `std2::` instead.
Doing a mass ABI break makes sense only when source compatibility is
maintained in 99.9% of cases and you need some hacks only for the last
0.1%.
If my "hello world" program for C++26 needed to include `extern` to
fully use the new standard, then a feature like this would miss the
mark completely.
In the long run C++ needs to switch to a new ABI at some point, and
this means the ABI will be strongly linked to the C++ version.
> Anyway, please try some experiments. You don't need to modify the compiler, we
> already have the concept of ABI tag in GCC because of the last time we broke
> ABI. The abi_tag attribute propagates somewhat.
>

Right now the more important problem is how my `extern` would interact
with other language features: we have ADL, friend functions, and types
used without a prior declaration (like `void foo(struct B*);`).

If it cannot be made to interact cleanly, the whole idea should be scrapped.

> > When a type involving an ABI tag is used as the type of a variable or return
> > type of a function where that tag is not already present in the signature
> > of the function, the tag is automatically applied to the variable or
> > function.
>
> But it doesn't propagate to other types that use this type as members. It
> should have done that, which would have shown how much breakage that decision
> was. See https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Attributes.html
>

In some way this is why tagging everything by default has some merits.
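
A small experiment along the lines suggested above, using the existing
`abi_tag` attribute (a GCC extension that Clang also implements). The
names `Tagged`, `Wrapper`, `make_tagged` and `read_value` and the tag
string "v2" are invented for this sketch:

```cpp
// GCC/Clang extension; other compilers may ignore or reject the attribute.
struct [[gnu::abi_tag("v2")]] Tagged { int value = 0; };

// The return type is not part of a function's mangled name, so the
// compiler applies the "v2" tag to make_tagged itself.
constexpr Tagged make_tagged() { return Tagged{}; }

// Wrapper has a Tagged member but does not pick up the tag.
struct Wrapper { Tagged member; };

// Only Wrapper appears in this signature, so the tag does not show up
// in the function's mangled name either.
constexpr int read_value(const Wrapper& w) { return w.member.value; }

static_assert(read_value(Wrapper{make_tagged()}) == 0);
```

Inspecting the symbol table of an object file that actually emits these
functions (for example with `nm`) is one way to see which names carry
the tag.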


Received on 2022-09-01 18:37:22