Date: Wed, 05 Feb 2025 08:52:08 -0800
On Tuesday 4 February 2025 23:37:13 Pacific Standard Time Tiago Freire wrote:
> It may make things harder, but I don't see it as a deal breaker.
> Do you really need to know the mangled name at compile time?
I think so. The only place where we don't need the mangling of the type would
be in non-static pointer or reference members. Here are a few cases
where the type's own mangling will show up:
* formal parameters to functions
* template parameters (to functions, classes, or variables)
* global or static members
Usually, template classes and functions are inline, but they don't have to be.
They can be extern templates that are implemented elsewhere and this couldn't
be done with a typename-only type. More importantly, they need to be merged
with the equivalent declaration where the type's full mangling is known,
especially when taking the address of a function or variable template.
As for global or static members, this depends on the mangling scheme used by
the compiler. MSVC mangles the type of a variable.
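To make the cases above concrete, here is a minimal sketch of how a type's own encoding ends up inside other symbols under the Itanium C++ ABI. It is a toy that only handles plain identifiers, a single pointer parameter and void-returning function templates; the helper names (sourceName, typeEncoding, functionSymbol, templateFunctionSymbol) are made up for illustration and are not any compiler's API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// <source-name>: decimal length followed by the identifier, e.g. "3Foo".
inline std::string sourceName(const std::string &id)
{
    assert(!id.empty());
    return std::to_string(id.size()) + id;
}

// Encoding of a class type given its enclosing scopes, innermost last.
// {"Foo"} gives "3Foo"; {"V2", "Foo"} gives "N2V23FooE".
inline std::string typeEncoding(const std::vector<std::string> &path)
{
    assert(!path.empty());
    if (path.size() == 1)
        return sourceName(path.front());
    std::string result = "N";
    for (const std::string &component : path)
        result += sourceName(component);
    return result + "E";
}

// Symbol for "void name(T*)" at global scope: the parameter type is part of it.
inline std::string functionSymbol(const std::string &name, const std::string &type)
{
    return "_Z" + sourceName(name) + "P" + type;
}

// Symbol for "template <typename T> void name()" instantiated with T:
// the template argument is part of it too.
inline std::string templateFunctionSymbol(const std::string &name, const std::string &type)
{
    return "_Z" + sourceName(name) + "I" + type + "Evv";
}
```

A TU that only knows the spelling "Foo" cannot pick between 3Foo and N2V23FooE, so it cannot produce either symbol on its own.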
> Couldn't the compiler just use the fully declared name, mark it as an alias,
> and when it actually needs to know the name (which is at most when
> exporting symbols from a library), and then as long as it is marked
> somewhere when linking the linker can go "oh yeah, I know what this type
> is, this is how I'm going to name it"?
You're not talking about the compiler here. You're talking about the linker.
Let's take a simple example of the most likely scenario:
a.cpp:
    typename Foo;
    void f(Foo*);
b.cpp:
    inline namespace V2 { struct Foo {}; }
    void f(Foo*) {}
First question here is how the linker can find the symbol it wants among the
exported/mangled names? I suspect you were thinking that the compiler would
emit in b.o something saying that "plain Foo = N2V23FooE" (or "UFoo@V2@" for
MSVC). Then the linker might do a search-and-replace of some pattern in a.o's
symbols and match those to what came from b.o. There's a problem with this
because the to-be-replaced pattern may occur in other, unrelated symbols,
causing existing code to suddenly break. It might be possible for the ABIs to
come up with an expansion that is impossible in any mangled name that didn't
start as UB in the first place - for example, __ anywhere in a symbol is UB,
but I saw yesterday __ in a mangled symbol generated by GCC from plain content
(a requires clause) - so I wouldn't bet on it.
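As a rough illustration of that collision problem, the sketch below does a purely textual replacement over symbol names. It assumes, only for the sake of the example, that a.o spells the unknown type as "3Foo"; naiveReplace is a hypothetical helper, not something any linker provides.

```cpp
#include <cassert>
#include <string>

// Replace every occurrence of `pattern` in `symbol` with `replacement`,
// the way a purely textual fix-up in the linker would.
inline std::string naiveReplace(std::string symbol, const std::string &pattern,
                                const std::string &replacement)
{
    assert(!pattern.empty());
    std::string::size_type pos = symbol.find(pattern);
    while (pos != std::string::npos) {
        symbol.replace(pos, pattern.size(), replacement);
        // Continue after the inserted text so the replacement is not rescanned.
        pos = symbol.find(pattern, pos + replacement.size());
    }
    return symbol;
}
```

Applied to _Z1fP3Foo this yields the wanted _Z1fPN2V23FooE, but applied to _ZN5other3FooEv (an unrelated function other::Foo()) it also rewrites the function's own name, which is the breakage described above.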
It might work with a double replacement: the emitted symbol contains a random
token and the compiler emits that "15__L8HcoRlRqqfA = searching for plain 'Foo'", so
when the linker finds that "plain 'Foo' = N2V23FooE" it replaces
15__L8HcoRlRqqfA with it. But this is a lot of work for the linker, because
the replacement has to be done per .o, instead of as a global replace.
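A minimal sketch of that double replacement, to show why it is per-object work. The ObjectFile layout, the resolvePlaceholders function and the idea that the token sits exactly where the type's encoding would go are all assumptions made for illustration; no object format has such a table today.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-object data: undefined symbols plus a table mapping each
// random placeholder token to the plain type name it stands for.
struct ObjectFile {
    std::vector<std::string> undefinedSymbols;
    std::map<std::string, std::string> placeholders; // token -> plain name
};

// Rewrites one object's undefined symbols using the "plain name -> mangling"
// records collected from the objects that define the types.
inline std::vector<std::string>
resolvePlaceholders(const ObjectFile &object,
                    const std::map<std::string, std::string> &definitions)
{
    std::vector<std::string> resolved = object.undefinedSymbols;
    for (const auto &entry : object.placeholders) {
        const std::string &token = entry.first;
        assert(!token.empty());
        auto definition = definitions.find(entry.second);
        if (definition == definitions.end())
            continue; // nobody defined this type: the symbol stays unresolved
        for (std::string &symbol : resolved) {
            std::string::size_type pos = symbol.find(token);
            while (pos != std::string::npos) {
                symbol.replace(pos, token.size(), definition->second);
                pos = symbol.find(token, pos + definition->second.size());
            }
        }
    }
    return resolved;
}
```

Every object carries its own token table, so the linker has to run this once per .o before it can even start its normal symbol resolution.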
Second, how do we tell among two different types called "plain Foo" which one
you want? Suppose I now add to my build:
compat.cpp:
    struct Foo {};
    void f(Foo*) {}
If this is automatic, then this .o will also say "plain 'Foo' = 3Foo". How
will the linker know to disregard this one? So no, this solution of yours
could definitely not be automatic; we'd need b.cpp to explicitly say that this
is the "Foo" that can be found by pattern replacement. Is the feature worth it
with this requirement?
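The ambiguity can be sketched as a lookup that returns more than one answer. The Definitions table and both functions are hypothetical; they only model "every .o automatically announces its plain names".

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// "plain name -> mangling" records gathered from every object in the link.
using Definitions = std::multimap<std::string, std::string>;

// All manglings that some object announced for this plain name.
inline std::vector<std::string> candidates(const Definitions &definitions,
                                           const std::string &plainName)
{
    assert(!plainName.empty());
    std::vector<std::string> result;
    auto range = definitions.equal_range(plainName);
    for (auto it = range.first; it != range.second; ++it)
        result.push_back(it->second);
    return result;
}

// The linker can only proceed when exactly one candidate exists.
inline bool isUnambiguous(const Definitions &definitions, const std::string &plainName)
{
    return candidates(definitions, plainName).size() == 1;
}
```

With only b.o in the link there is one candidate; once compat.o is added there are two, and nothing in the records says which one a.o meant.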
Third, this is just not how linkers work. What happens if b.o is in a different
library? Let's start with static libraries: the way they work is that the
linker looks at its list of unresolved references and searches for a .o among
its .a libraries for one that supplies it. We'd need a way for it to find a
given .o that contains the replacement information, so that it can then
search again for _Z1fPN2V23FooE. I think this is doable, only difficult.
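Here is a rough model of that static-library lookup. The Archive struct is a simplification of a real archive index, and the typeAliases table is an assumed extension that does not exist in any archive format I know of; the placeholder symbol spelling is likewise made up.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Simplified static library: the index maps each defined symbol to the
// member (.o) that provides it.
struct Archive {
    std::map<std::string, std::string> index;       // symbol -> member name
    std::map<std::string, std::string> typeAliases; // ASSUMED extension: plain type name -> member name
};

// Classic behaviour: a member is extracted only if it defines a symbol that
// is currently unresolved.
inline std::set<std::string> membersToExtract(const Archive &archive,
                                              const std::set<std::string> &unresolved)
{
    std::set<std::string> members;
    for (const std::string &symbol : unresolved) {
        assert(!symbol.empty());
        auto found = archive.index.find(symbol);
        if (found != archive.index.end())
            members.insert(found->second);
    }
    return members;
}

// With the assumed extension: find the members that can tell us what a plain
// type name really is, so the symbol search can be repeated afterwards.
inline std::set<std::string> membersForTypes(const Archive &archive,
                                             const std::set<std::string> &plainTypeNames)
{
    std::set<std::string> members;
    for (const std::string &name : plainTypeNames) {
        auto found = archive.typeAliases.find(name);
        if (found != archive.typeAliases.end())
            members.insert(found->second);
    }
    return members;
}
```

The unresolved name from a.o never matches the index entry _Z1fPN2V23FooE, so without the extra table b.o is simply never pulled out of the archive.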
But shared libraries / DLLs? Now you need more information somewhere in the
library file format. It can't be the regular symbol tables because symbols map
to addresses / values, not other symbols. The file format itself would need to
be extended to supply this (for ELF, it could be a symbol using a new type,
whose value is an index or offset into the dynamic string table).
Fourth, the combination of the second and third scenarios: what happens if I
am linking to libMine.so.1 and libMine.so.2, both of which provide a "plain
Foo" that differs only in the inline namespace? How is the linker to know which
one you meant?
Fifth and finally, you require linker changes. The linkers have adapted to C++
requirements, such as vague symbols, unique linkages, etc., but they're slow
to adapt. And quite often, the linkers are provided by the OS vendor, not the
compiler vendor, with the most visible case being macOS's ld64, so compiler
vendors can't affect their functionality, at least not in a timely manner. This
functionality would need several years of "sorry, linking failed" lead time
before it went into widespread use. We may even have got modules by then.
> > I think that's overengineering and dramatically reduces the chance of
> > adoption. If we require binary format and toolchain updates, the barrier
> > of adoption goes up. It will have ripple effects many years later, as we
> > learn all of the effects (intentional or not) of having them.
>
> I wouldn't say it's overengineering, it's a difficult problem to address that
> requires extra work to achieve. It's a matter of does the reward offset
> the cost?
> Compilers had to be remade to accommodate things like constexpr, and I think
> we are now better off for it.
Yes, but those were compiler-only changes. You're talking about linker changes
and affecting things across TUs. This is an area where the Standard usually just
washes its hands and says IFNDR.
> struct A;
> struct A::B;
> void* func(A::B&);
>
> One thing that I think is important when compiling code is consistency.
> This piece of snippet shouldn't all of a sudden stop working when the actual
> definition is provided.
> But what if "struct A::B" is private? Or even non-existent?
Syntax error. That's easy for the compiler to detect, by A's closing brace.
However, what happens if A's full definition is never seen in this TU? Then it
would be possible to declare functions that take it as formal or template
parameters, or variables of pointers or references to this type, that the code
wouldn't otherwise have been able to declare. The Standard would say it's IFNDR
if you get it wrong, because a) the mangling of a private B might be different
from that of a public B (it isn't in either ABI) and b) link-time code generation
can detect this situation like it can detect ODR violations. In fact, this
would be a particular case of ODR violation.
> 2. Now let's consider that "struct A::B" exists but it is not public to
> "void* func(A::B&);" Again, the application must compile, symbols must be
> generated.
> You would think, when someone would try to use it the compiler would need to
> see the definition and catch this problem, but actually no. While the type
> can be private to "void* func(A::B&);" it can be public to something else
> that uses "func", at which point it may no longer have that information.
> Thus, leading to a scenario where struct A::B is private, but you have
> function that can reference the type, and the application compiles. Sure,
> you might not be able to do more than "take address", maybe that's ok. But
> whether or not you get actually generated code can change depending on
> whether or not both things are visible in the same translation unit.
A more worrisome scenario is
    template <typename T> void f();
with struct A::B, I can call f<A::B>() and no one will be the wiser.
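For contrast, this is roughly how it behaves in today's C++, where A's definition must be visible before A::B can be named. The sketch uses made-up members (value, make, sizeViaDecltype) and gives f a return value only so there is something to check; it shows that access control is a compile-time check on the name, and that the instantiation itself carries no trace of it.

```cpp
#include <cassert>
#include <cstddef>

class A {
    struct B { int value; };          // private nested type
public:
    static B make() { return B{42}; }
};

template <typename T> std::size_t f() { return sizeof(T); }

// std::size_t bad = f<A::B>();       // rejected today: A::B is private here

// Access applies to the name "A::B", not to the type: reaching the same type
// without naming it is accepted, and it is the same instantiation of f.
inline std::size_t sizeViaDecltype()
{
    std::size_t size = f<decltype(A::make())>();
    assert(size >= sizeof(int));
    return size;
}
```

If A's definition is never seen in the TU, the compiler has nothing to run that check against, which is why the forward-declared form would let the call through.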
> Unless.... you hoist the access information to link time. In that case you
> would still get an error, just at a different stage.
> Linker work might be necessary.
It would be a mangling change for A::B - instead of N1A1BE, we'd need
something inserted between 1A and 1B to indicate that B is a private struct.
Likewise for MSVC, where right now it's just B@A@. If we really wanted this
to be caught, we'd need to break the world.
That will never happen, especially because the Standard would probably just
declare this as IFNDR and, that being the case, the mangling schemes would not
change.
-- Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org Principal Engineer - Intel DCAI Platform & System Engineering
Received on 2025-02-05 16:52:14