Date: Sun, 12 Nov 2023 13:17:28 -0500
I think you have the premise correct, Hassan. To emphasize, though, we
cannot expect all users of a given library to convert from textual
inclusion to explicit use of named modules in the same commit. Even for
well governed monorepos, if the repo includes enough code and enough active
contributors, it is unreasonable to change more than a thousand source
files in a single commit.

This means it's a strong requirement that a library be consumable through
both textual inclusion and through a named module import in the same
program.

In our discussion on Friday, we talked about a few options. It would be
good to get some writeups on the pros and cons of each of the options so we
can educate the community on which work and which are recommended.
Initially, less formal writeups to drive discussion seem great. Eventually
we can write more formal ISO papers to function as references.

All of the approaches we discussed likely require additional work on
standards for defining metadata for how to consume modular C++. Even
monorepos ship shared libraries with C++ interfaces for other build systems
to consume! Writeups and case studies for this module metadata would also
help with allowing the community to practically adopt modules.

Bret

On Sun, Nov 12, 2023, 12:42 Hassan Sajjad via SG15 <sg15_at_[hidden]>
wrote:
> Hi Gaby,
>
> I would like to rephrase the comment that I made on the server.
>
> Problem:
> I don't fully understand the proposal, but I am getting the gist.
> Suppose a monorepo project has multiple libraries, all compiled with the
> same baseline compile command, and one of them, library "Cat", supports
> being consumed both as a module and as header-files. At the moment only a
> few consumers can use "Cat" as a module while the others consume it as
> header-files. The problem happens when a consumer that uses "Cat" as a
> module also consumes some other library that uses "Cat" as header-files.
> Now this consumer is inadvertently consuming "Cat" both as header-files
> and as a module.
>
> I am guessing that the solution the paper proposes is along these lines:
> once "Cat" is converted to modules, the build-system can reliably
> communicate to the compiler that if any header-file or header-unit of
> library "Cat" is observed to be included by any library, it should be
> replaced with "import Cat;" and a macro include-file should be
> force-included.
>
> So, we want to map the list of header-files of the library "Cat" with the
> module "Cat".
>
> I propose a way to signal this reliably to the compiler. Most header-files
> today come from include-directories. Many build-systems gather source-files
> in sets which is generally called a target. A target has the same baseline
> compile command + local preprocessor arguments for all its files. A target
> can have other targets as dependencies. When a target adds another target
> as a dependency, its public / usage-requirement / interface
> include-directories get added to the set of include-directories of the
> dependent target.
>
> This means that whichever target uses "Cat", adds the public
> include-directories of the "Cat" to its own set. Now, in
> build-configuration, the user can mark "Cat" that it supports being
> consumed as a module and also provide some extra metadata (TBDL). Now, when
> the "Cat" consumer target adds "Cat"'s public include-directories to its
> own include-directories set, it also saves the meta-data and associates
> that with these directories. Then it can use this meta-data in
> compile-command construction.
>
> The meta-data is manually specified by the user in the build-configuration
> file. It is the set of names of modules the "Cat" introduces. It also
> includes the set of header-unit / header-file containing all macros
> squeezed out of "Cat". Whether a particular file of the set is a
> header-file or header-unit is determined by the header-units.json file in
> the directory the header-file is found.
>
> For most cases, there will only be the need for one entity in the set of
> modules as a single module can have multiple partitions in it. Also, a
> single macro containing header-file can be the amalgamation of multiple
> header-files, but the user has the liberty.
>
> e.g. a sample compile-command for a file "dog.cpp" of target "Dog" could
> be ```cl.exe /c /MIinclude/cat(cat)(true, cat_macro.h) dog.cpp
> /reference cat=build/cat.ifc ...```
>
> I chose the syntax arbitrarily. Compilers can design it according to the
> environment and terminal they operate in. Instead of the /I flag, the /MI
> flag is used to specify the include-directory, and the meta-data specified
> by the user for the target "Cat" is embedded in it. Each set is given in
> its own parentheses:
> cat --> the module name the library is introducing
> true, cat_macro.h --> true if the file is to be found as <cat_macro.h>
> and false if it is to be found as "cat_macro.h". This file must exist in
> one of the "Cat" include directories. It could be a header-unit or a
> header-file.
> The ... at the end stands for any dependencies of the modules specified in
> the set, if any.
>
> Now, with this information, whenever the compiler hits an include of a
> header-file / header-unit from an include-directory of "Cat", it ignores
> the include and instead processes the above set, if it has not already
> done so.
>
> This way compiler, with the help of build-system, reliably ensures that
> the library "Cat" is either consumed by the conventional approach in all of
> its dependents or as a module. Just a little user intervention is needed to
> specify the meta-data. In bigger projects, CMake constructs like
> ```add_executable``` are not directly specified. Instead, they are
> specified through a wrapper. This wrapper can by default specify the
> meta-data such as a module name same as the target name and macro
> header-file such as targetname_macro.h automatically with the setting that
> whether it supports being consumed as a module exposed to the user. Hence
> this metadata specification part could be automated as well.
>
> If this is implemented in MSVC, my build-system HMake will be able to
> support it in no time (in 2 days at most). Thus, with one switch you will
> be able to switch from conventional to modules across your project
> considering your library supports both simultaneously. Actually, if the
> library is a closed domain, a part of monorepo, the user can move code from
> includes to the modules. Those empty header files should just exist there
> to hint to the compiler. Once all references of the header-files of the
> target "Cat" are removed from the dependent targets and "#include" gets
> replaced by "import", the header-files could be safely deleted. Multiple
> libraries can simultaneously convert to modules. As soon as a library
> converts to the module, all of its dependents use that module.
>
> My build-system HMake also supports header-units so that you can consume
> the library in any of the 3 ways. I compiled SFML with C++20 header-units
> and can experiment with the consumption of "std" module in the whole of
> SFML if this gets accepted.
>
> Another point is that for the ```/showIncludes``` flag, the compiler
> should not show any of header-files from /MI include-directories as those
> are no longer the dependencies. Those are just the hints.
>
> While I currently see no issues, I acknowledge the possibility of
> limitations and potential errors.
>
> Thank you for your proposal. The most interesting part for me is "the
> suggested implementation for transition strategy relies purely on build set
> up and requires no language rule changes"
>
> Best,
> Hassan Sajjad
>
> On Thu, Nov 9, 2023 at 2:28 AM Gabriel Dos Reis via SG15 <
> sg15_at_[hidden]> wrote:
>
>> Hello,
>>
>>
>>
>> I have a draft paper addressing a problem that some C++23 implementations
>> are having: https://isocpp.org/files/papers/D3041R0.pdf
>>
>>
>>
>> Is there a way for me to present that tomorrow?
>>
>>
>>
>> Thanks,
>>
>>
>>
>> -- Gaby
>>
>>
>> _______________________________________________
>> SG15 mailing list
>> SG15_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg15
>>
Received on 2023-11-12 18:17:41