Date: Sat, 2 Mar 2019 18:03:37 +0000
The last thread wandered off topic pretty quickly, so I'll try this again and be more aggressive in telling people to take off-topic conversations to another thread.
I would like to find a way for users to decouple the upgrading of tools from the migration to modules. This is not intended as a long-term idealized solution, but as a way to ease migration pains. We should also work on the ideal solution, but in parallel with the legacy solution. I think this has the potential to make the upgrade from C++17 to C++20 roughly the same cost to users as the upgrade from C++14 to C++17. This was discussed briefly in the impromptu tooling session on Friday at Kona 2019.
The no-build-system-upgrade constraint implies other constraints:
1. No up-front scanning of the source to find module name and dependency information, because a lot of current build systems don't currently have a scan step.
2. No dynamic dependencies between TUs. Many current build systems assume that the .cpp -> .o[bj] transformation is trivially parallelizable.
3. No upgrade of build tool executables. This has to work with versions of "make", "ninja", and "scons" from 10+ years ago.
4. You _can_ add compiler / linker flags.
Some quick notes on this implementation strategy:
* Uses TEXTUAL inclusion
* Compiler assumes that the build system knows nothing of BMIs
* Compiler needs to be able to do module mapping with minimal input from users.
The scheme I have in mind would result in no build throughput improvements with the old bad build systems, but I think it would still provide the isolation benefits of modules and be conforming. When the user is able to upgrade their build system, they can start getting the build throughput improvements.
The general idea is to treat the module interface file as a glorified header (Gaby has mentioned this possibility in various venues). When the user passes --strawman-slow-modules (or perhaps -frewrite-imports...) to the compiler, the compiler does a textual inclusion of the module interface file (no BMI involved at all). The textual inclusion would likely involve placing a #pragma strawman-module begin(name-of-module) directive at the start of the module text, with a matching #pragma strawman-module end(name-of-module) directive at the end. Each TU that imports the module will duplicate this work. If the compiler can emit this text file, then it can be distributed using existing technologies that are expecting preprocessed files. This is similar in nature to clang's -frewrite-imports.
So this requires that compilers support this textual modules approach. It also requires that the compiler be able to find the module interface files without requiring the (dumb) build system to scan in advance. The "easiest" (and slow) way to make this happen is to require that module names correspond to file names, and that compilers provide a search path. I am well aware that this isn't fast, but this general scheme is intended for build system compatibility. Vendors should also provide a faster thing that can be used by newer build systems. Compilers can also provide a command line override to say where a creatively named module can be found.
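As a sketch of what the lookup could look like: the function below maps a module name to a file by trying each search directory in order, with an explicit override table consulted first. Everything here is an assumption for illustration: the name find_module_interface, the ".cpp" extension (borrowed from the example layout in this mail), and the shape of the override table. No compiler is claimed to work this way.

```cpp
#include <cassert>
#include <filesystem>
#include <map>
#include <optional>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical lookup: module "foo" is expected in "<dir>/foo.cpp" for some
// <dir> on the search path. `overrides` models a command line option that
// names the file for a creatively named module.
std::optional<fs::path> find_module_interface(
    const std::string& module_name,
    const std::vector<fs::path>& search_dirs,
    const std::map<std::string, fs::path>& overrides = {}) {
    assert(!module_name.empty());

    // An explicit override wins over the search path.
    if (const auto it = overrides.find(module_name); it != overrides.end()) {
        return it->second;
    }

    // Otherwise probe each directory in order; first hit wins.
    for (const fs::path& dir : search_dirs) {
        const fs::path candidate = dir / (module_name + ".cpp");
        std::error_code ec;  // treat filesystem errors as "not found here"
        if (fs::is_regular_file(candidate, ec)) return candidate;
    }
    return std::nullopt;
}
```

The cost is one filesystem probe per directory per import, repeated in every TU, which is the slowness mentioned above. The benefit is that the build system never has to know which file provides which module.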
Example input:
// ~/all_libs/my_lib/data.cpp
// interface + implementation
export module data;
export int x;
// ~/all_libs/my_lib/foo.cpp
// interface
export module foo;
import data;
export int func();
// ~/all_libs/my_lib/foo_impl.cpp
// implementation
module foo;
int func() {return x;};
// ~/my_exe/bar.cpp
// non-modular
import foo;
int main() {return func();}
Example invocation 1 (give me object files for all implementations):
my_compiler --strawman-slow-modules -I ~/all_libs/my_lib bar.cpp -o bar.o
my_compiler --strawman-slow-modules -I ~/all_libs/my_lib ~/all_libs/my_lib/foo_impl.cpp -o ~/all_libs/my_lib/foo_impl.o
my_compiler --strawman-slow-modules -I ~/all_libs/my_lib ~/all_libs/my_lib/data.cpp -o ~/all_libs/my_lib/data.o
Example invocation 2 (give me textually processed files for tooling / distribution):
my_compiler --strawman-slow-modules -E -I ~/all_libs/my_lib bar.cpp -o bar.i
my_compiler --strawman-slow-modules -E -I ~/all_libs/my_lib ~/all_libs/my_lib/foo_impl.cpp -o ~/all_libs/my_lib/foo_impl.i
my_compiler --strawman-slow-modules -E -I ~/all_libs/my_lib ~/all_libs/my_lib/data.cpp -o ~/all_libs/my_lib/data.i
Example outputs:
// ~/my_exe/bar.i
#pragma strawman-module begin(foo)
export module foo;
#pragma strawman-module begin(data)
export module data;
export int x;
#pragma strawman-module end(data)
export int func();
#pragma strawman-module end(foo)
int main() {return func();}
// ~/all_libs/my_lib/foo_impl.i
module foo;
#pragma strawman-module begin(foo)
export module foo;
#pragma strawman-module begin(data)
export module data;
export int x;
#pragma strawman-module end(data)
export int func();
#pragma strawman-module end(foo)
int func() {return x;};
// ~/all_libs/my_lib/data.i
export module data;
export int x;
Users would still need to build each module implementation file (just as they have to build each .cpp today) in order for all symbols to get defined. This might disappoint some people who think that textual modules will provide behavior similar to "unity" / "blob" builds. Non-inline function definitions in an imported module wouldn't have object code emitted in importers... the object code would only be provided in the TU that defines that module.
All of this is intended to allow a fully conforming modules implementation. It also does not preclude additional build options intended for new, smart, fast build systems. To the contrary, this is an area in which I encourage investigation and research.
Let me know if there are holes in this plan, and if it sounds reasonable to implement. Also let me know if this sounds like it won't help in keeping your existing tool or build system chugging along.
Received on 2019-03-02 19:03:45