Date: Tue, 22 Oct 2024 16:00:55 -0700
On Tuesday 22 October 2024 11:54:18 Pacific Daylight Time Federico Kircheis via
Std-Discussion wrote:
> > ODR violations only appear if the variable has global linkage.
>
> What is global linkage?
> I know only external, internal and weak.
Imprecise wording on my part. I meant external linkage.
> const in the first example implies internal linkage, so why should it be
> an ODR violation?
It isn't one: by definition, internal linkage symbols can't have ODR violations.
You can still violate ODR by having the same class or enum defined in two
different ways, even with no external linkage symbols appearing, but we've long
ignored those IFNDR problems and lived happily with them.
> I'm going to ignore this part because I have no idea what it means.
> dynamic linking, for example for windows executables, works differently.
And I've already said this discussion is not about Windows. We're exclusively
discussing "shared libraries on Unix behave as if 'just another TU'".
> > And I've pointed out it does NOT work when compiled as an application if
> > you compile it in the way that library linking would: the issue is that
> > you're duplicating lib0.cpp in your executable. This is the thesis here:
> > that libraries are "just another TU" and all the effects of it apply. And
> > that includes the ill effects of ODR violations.
>
> I (the programmer) am not duplicating lib0.cpp !
Yes, you are. You're responsible for the final linking, therefore you're
responsible. The fact you don't know you are and there's no diagnostic to help
you detect it does not absolve you from the responsibility.
We may argue that the tooling should be improved to detect this and/or the
Standard language or extensions should be improved to make the situation less
likely and more detectable. I'd welcome those discussions. But that's neither
here nor there. The standard says "ODR violations are IFNDR" and the
implementation produced no diagnostic, but it's still ill-formed.
> A translation unit is a source file after preprocessing.
> I did not create two translation units with the same code.
"You" did by compiling and linking everything together. The standard doesn't
care who typed "make" or when, only about the well-formedness of the final
content. The fact you obtained some content compiled by others does not
absolve you from the need to observe the One Definition Rule and not violate
it.
> The toolchain is duplicating it for making it possible to create the two
> shared libraries.
> The standard does not say it has to behave that way.
No, it doesn't. But the fact is that this is how it behaves and because it's
your responsibility to ensure ODR is not violated, you must know whether any
duplication happened.
> I do not want to say that this is how dynamic libraries should work, but
> it is a possible alternate behavior for dynamic libraries on GNU/Linux
> systems that is available today.
And in a different universe, stars are powered by gravitational collapse. It's
an interesting theoretical exercise but irrelevant because it's not the
universe we live in.
Unlike the universe, we can change dynamic library linking on Unix systems.
But the barrier to doing that is nearly as high. So for all intents and
purposes, it's immutable law.
> Where did I say the problem is in C++?
> I wrote multiple times it is outside of C++.
> It seems to me that we are talking past each other...
I'm arguing against your assertion that "shared libraries break C++". They by
themselves do not and all the problems that we have with them are either in
C++ already or are caused by things outside of C++.
One common argument is that dynamic_cast or exceptions don't work across
library boundaries. Yes, they do, if you stick to pure C++ and don't apply
hidden visibility. But you *should* apply hidden visibility.
> >> If I do not use dynamic libraries then the code of lib0,lib1,lib2,main
> >> works as expected.
> >> If dynamic libraries where "something else", then the second and third
> >> example could work without UB too.
> >
> > "If I solve the problem, the problem is solved'. Circular reasoning.
>
> Huh?
The point is that if you do things that solve the problem, then the problem is
solved and there's nothing to discuss. And specifically, the same solution
applies whether you're using shared libraries or not. That's what I am
arguing: the problems you're relating are inside of C++ because you violated
the One Definition Rule.
> >> So my point still is:
> >>
> >> In the first example, I define multiple globals.
> >> When using libraries, I get 4 instead of 3 without changing any code.
> >
> > "If I violate ODR, the program becomes ill-formed". Right
>
> There is no ODR violation.
I've shown multiple times how you're violating it. You've argued against it by
changing the build system to remove the violation. That's great, it means you
know how to solve the problem. But the fact you *can* solve the problem does
not mean the problem didn't exist in the original case.
> See the mail of Jens and Lauri.
They seem to agree with me.
> The globals are const in different translation units.
const implies internal linkage, as Lauri pointed out. Internally-linked
symbols don't have ODR violations.
> They define different objects at difference places with internal linkage.
Correct and that's fine.
How hard is it to understand that you are allowed to have two variables of
different types named the same thing if they are static but not if they are
extern?
> Since you insist there is an ODR violation, can you show me according to
> what rule from the standard I'm breaking?
I have. https://eel.is/c++draft/basic.def.odr#15
> > MSVC doesn't count because no one is claiming that DLLs on Windows operate
> > "just like other TUs". This discussion is exclusively about how shared
> > libraries work on (modern) Unix systems.
>
> This discussion was about shared libraries in general, and I've shown
> examples on Linux.
> MSVC behaves differently, in particular for the example provided it does
> not cause UB.
> Why should it be dismissed?
> Because it does not "operate like other TU"?
> AFAIK it is not mandated in the C++ standard.
This discussion is exclusively about Unix systems because someone claimed they
are "just another TU" and I agreed, but you didn't. No one is claiming Windows
obeys the same rules.
We could be having a discussion on how to do shared libraries properly
everywhere, but that's not the discussion we're having. We could be discussing
what Physics mechanism is responsible for making stars bright, but that's not
the discussion we're having. Those things are out of scope for this
discussion, but we can switch to discussing them if you want.
> Also TU are a property of compile and link-time, not runtime (happy to
> be proven wrong).
Runtime linking is still linking.
> I consider dynamic linking out of the standard, I think I already wrote
> it more than once.
Then I don't think we need to discuss anything any more.
BTW, I also suggest you not post anything about your solving problems with
code compiled by others that you need to patch to make work. Obviously that is
an out-of-standard problem, as the standard only deals with source code and
no-library single-application linking.
> > No, it applies to C too, just in a much more limited fashion because they
> > don't have some of the causes of the problem. But it could happen with:
> >
> > lib1:
> > char *answer_of_life = NULL;
> >
> > lib2:
> > typedef struct
> > {
> >     size_t size;
> >     char *ptr;
> > } String;
> > String answer_of_life = {};
>
> I was writing about "vendoring"/"bundling", not this issue.
> Since C has no way to execute code before main is entered, and has no
> way to execute code when a dynamic library is loaded, it does not have
> the same issue with vendoring and global variables that C++ has.
It has the very same issue, as exemplified in the code above. The fact that
some content reads or writes to answer_of_life after main() is not the point.
Your problem wasn't the initialisation or order thereof, but the ODR
violation. The code above has exactly the same violation.
> >> lib0 provide the library as source code or as static library.
> >> lib1 and lib2 wants to provide something precompiled (might even be
> >> close sourced)
> >
> > Those are excuses.
>
> I believe that delivering a precompiled shared library is a valid and
> important use-case, not just an excuse.
It's an important use-case, but irrelevant to the problem at hand. The
violation happened, so it's irrelevant how the code was compiled.
Similarly, overriding some memory allocations is a valid use-case, but if a
library overrides the global operator new() in ways that don't work for other
libraries or the main application, it's a problem and it's irrelevant how this
library was compiled.
> I do not think, for example, you can install multiple versions of the
> C++ standard library on a Linux system and have it work "out-of-the-box".
Yes, you can. libstdc++ and libc++ are designed to be loadable in the same
process address space and work without stomping over each other's symbols.
Both of their low-level C++ support libraries (libsupc++ and libc++abi) are
designed to be exactly compatible with each other and interchangeable.
You can't have C++ code linking to both at the same time, but you can exchange
some data via a C glue layer. This allows, for example, a C++ application to
dynamically load a plugin using a different C++ standard library, provided the
application only accesses its C entry function and passes C types to it.
C API libraries can also do it.
> > What I said still applies: do not link a dynamic library to
> > a static library. I don't care if lib1 and lib2 are closed source: they
> > shall not include a copy of lib0 inside or they will use techniques not
> > in the Standard to hide the copy from the dynamic symbol table. This is
> > required for a quality library.
>
> It does not sound like a possible thing to provide a precompiled shared
> library without vendoring.
They shall do it if they want the label of "quality library". Conversely,
libraries that don't get the label of "poor quality library".
Better yet, don't provide precompiled. Open-source it and let others compile.
> > Which is loaded once per address space because it's a dynamic library.
> > Even if you load both libstdc++ and libc++, it works because libc++
> > namespaces itself so all its symbols are different from libstdc++'s.
>
> That is good to know, but the problem would be two libraries linked
> against libc++ (or two linked against libstdc++)
Which, as I said above and I quote, "is loaded once per address space because
it's a dynamic library".
Unless you linked a dynamic library to libstdc++.a or libc++.a. But that
violates the rule of "do not link dynamic libraries to static libraries".
> We have apparently different experiences (not with the standard library,
> but with other libraries).
I'm neither disputing nor even doubting that. I do know the quality of closed-
source libraries and how poor their implementations are. I've had the
experience with teams who didn't want to open-source their code because they
were ashamed of the quality of it. And to be clear: that's a good team that
recognises they took shortcuts in the name of expediency. Most other
closed-source teams would be oblivious to this.
Yet ignorance of the fact you've produced low-quality content does not raise
said quality.
> I'll just note that I've never seen this rule or guideline about
> "vendoring"/"bundling" written anywhere.
> If you have a source, I'll read it gladly.
More like experience and my personal pet peeve.
Just think about how many projects copied zlib into their source code for
expediency's sake as they needed to decompress something. How do downstream
users of said libraries go about fixing the copies of zlib because a new
security issue has been reported?
Now imagine this was liblzma.
> > Because shared libraries themselves are not the problem. The point of this
> > sub-thread is that Unix dynamic library linking is "just another TU" from
> > the point of view of the C++ standard. You can use all C++ constructs the
> > same way as if you weren't using libraries, provided you accept that all
> > content in other libraries are "just another TU".
>
> The last sentence seems like a contradiction to me.
> If I "accept that all content in other libraries are "just another TU"",
> then I cannot use some constructs; for example having lib1 and lib2
> (dynamic) depending on lib0(static) with one extern global variable.
What I said works just fine. You accept lib1 and lib2 as just other TUs. As
they both have the same "defined item" with external linkage (as per
[basic.def.odr]), linking to both implies a violation of ODR, making your
application ill-formed, no diagnostic required.
How they came about that defined item is irrelevant. Only the fact that they do
have that symbol is a problem.
-- Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org Principal Engineer - Intel DCAI Platform & System Engineering
Received on 2024-10-22 23:00:59