Date: Sun, 24 Sep 2023 12:01:08 -0400
On Sun, Sep 24, 2023 at 11:47 AM Frederick Virchanza Gotham via
Std-Proposals <std-proposals_at_[hidden]> wrote:
>
> On Sun, Sep 24, 2023 at 12:23 AM Jason McKesson wrote:
> >
> > Why do you think the number of `typeinfo` objects is relevant here?
> >
> > `typeinfo`s are only generated if you statically do something that
> > *requires* such a type to exist. Like calling `typeid` on it, or maybe
> > using `dynamic_cast`. If your program never uses `typeid` on those
> > types, then `typeinfo` objects won't exist for them.
>
>
> There are a few more scenarios in which the typeinfo will be
> generated. Consider the following translation unit:
>
> enum Animal { Frog, Cat, Monkey };
> void Func(void) {}
>
> If I use the latest version of g++ on Linux to make an object file out
> of this, and then run "nm -C" on it, I get:
>
> 0000000000000000 T Func()
>
> Now if I make an alteration to the translation unit as follows:
>
> enum Animal { Frog, Cat, Monkey };
> void Func(void) { throw Frog; }
>
> I compile this and run "nm -C", and I get:
>
> U __cxa_allocate_exception
> U __cxa_throw
> 0000000000000000 T Func()
> 0000000000000000 V typeinfo for Animal
> 0000000000000000 V typeinfo name for Animal
> U vtable for __cxxabiv1::__enum_type_info
>
> So if you ever throw an object of any type, then the typeinfo for that
> type is generated and put inside the object file. The same thing
> happens if you use any type with 'std::any', like in the following
> translation unit:
>
> #include <any>
> enum Animal { Frog, Cat, Monkey };
> void Func(std::any &arg) { arg = Frog; }
>
> This too results in the typeinfo for 'Animal' being generated and put
> inside the object file.
>
> I'm going to try to implement "std::visit( Visitor, std::any& )" for
> my program which has 41 object files. First thing I'll do is create a
> source file that includes all the C++ standard library header files:
>
> echo "algorithm<any<array<atomic<barrier<bit<bitset<cassert<ccomplex<cctype<cerrno<cfenv<cfloat<charconv<chrono<cinttypes<ciso646<climits<clocale<cmath<codecvt<compare<complex<concepts<condition_variable<coroutine<csetjmp<csignal<cstdalign<cstdarg<cstdbool<cstddef<cstdint<cstdio<cstdlib<cstring<ctgmath<ctime<cuchar<cwchar<cwctype<deque<exception<execution<expected<filesystem<format<forward_list<fstream<functional<future<initializer_list<iomanip<ios<iosfwd<iostream<istream<iterator<latch<limits<list<locale<map<memory<memory_resource<mutex<new<numbers<numeric<optional<ostream<queue<random<ranges<ratio<regex<scoped_allocator<semaphore<set<shared_mutex<source_location<span<spanstream<sstream<stack<stacktrace<stdexcept<stdfloat<stop_token<streambuf<string<string_view<strstream<syncstream<system_error<thread<tuple<typeindex<typeinfo<type_traits<unordered_map<unordered_set<utility<valarray<variant<vector<version"
> | tr '<' '\n' | awk '{print "#include <" $s ">"}' > allheaders.cpp
>
> Next I will append an inclusion directive for all my own header files:
>
> find -maxdepth 1 -iname "*.h" -or -iname "*.hpp" | sort | awk
> '{print "#include \"" $s "\""}' >> allheaders.cpp
>
> In tandem to this, I'll get a list of all typeinfo's across all the
> object files:
>
> nm -C *.o | grep "typeinfo for" | grep -v " U typeinfo for" | cut
> -d ' ' -f 5- | sort | uniq > alltypes.txt
>
> Next I wrote a program that would read in all the types from
> "alltypes.txt" and compose a series of 'if' statements:
>
> http://www.virjacode.com/download/make_visit_any.cpp
>
> I built this program and ran it, and it generated the following header
> file for me:
>
> http://www.virjacode.com/download/visit_any.hpp
>
> I included this header file in a source code file with one simple
> function as follows:
>
> extern void SomeOtherFuncInSomeOtherTranslationUnit(void const*);
>
> extern void Func(std::any &arg)
> {
> visit<void>( [](auto &&obj){
> SomeOtherFuncInSomeOtherTranslationUnit(&obj); }, arg );
> }
>
> I compiled this source file with "-DNDEBUG -Os", and got an object
> file of size 8.3 megabytes. The assembler is 24 megabytes which you
> can see here:
>
> http://www.virjacode.com/download/assembler_for_visit_any.txt
>
> So anyway the whole point in me doing all this is just to get people
> thinking. It _is_ possible to implement "std::visit" for "std::any",
> and it _is_ possible to have a template catch-block, it's just that
> we'll have a few megabytes of machine code, and also that we might
> have to have some way of enumerating all of the types present in a
> given project (like how I scoured through all the object files with
> "nm -C").

Define "possible".

The world you describe requires that the compiler can see object
files. Which... it can't, because compilers *generate* object files.
They don't read them.

As such, at the moment the compiler sees your `template catch` block,
it must generate something. And that "something" *cannot* be related
to what is happening in other object files. That's just not how
"compilation" works.
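
A minimal sketch of what that constraint looks like in practice, assuming
C++17. The names `type_list` and `visit_any` are hypothetical; they are not
part of the standard library, nor taken from the linked visit_any.hpp. The
point is only that the candidate types have to be spelled out in the
translation unit being compiled:

```cpp
#include <any>
#include <string>
#include <typeinfo>

// Hypothetical tag type carrying the closed set of candidate types.
template <typename... Ts>
struct type_list {};

// Hypothetical helper: tries each Ts in order and calls vis on the first
// one whose typeid matches the object held by a. Returns false if none
// of the listed types match.
template <typename Visitor, typename... Ts>
bool visit_any(Visitor&& vis, std::any& a, type_list<Ts...>)
{
    return ((a.type() == typeid(Ts)
                 ? (static_cast<void>(vis(*std::any_cast<Ts>(&a))), true)
                 : false) || ...);
}
```

A call such as `visit_any(vis, a, type_list<int, std::string>{})` only
instantiates `vis` for `int` and `std::string`. A type stored in the
`std::any` by some other translation unit is never tried, because nothing
in this translation unit names it.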

Maybe you're hypothesizing that a compiler will generate some abstract
syntax tree of the contents of the `template catch` block, upon which
the linker (the thing that actually sees object files) will perform
the actual act of compilation based on the object files it actually
sees. But that's not a thing that happens in C++. And indeed, with
DLLs/SOs, that's basically impossible.

Furthermore, you're talking about putting a compiler *inside* the
linker. That compiler would need to have access to not just the AST of
the contents of the block, but of *everything else* that block looks
at or depends on. It would need to modify the binary machine language
of the function it is in, effectively recompiling that function. Which
in turn requires recompilation of everything that function depends on.
And if that function is itself a template function or has been
inlined... well, that's going to get ugly *really* fast.

Or, you know, we could just not break the entire C++ compilation model
for a feature that has no actual merit. That too.
Received on 2023-09-24 16:01:19