sg15

Re: #include to modules transition

From: Hassan Sajjad <hassan.sajjad069_at_[hidden]>
Date: Mon, 13 Nov 2023 01:22:15 +0500
>
> it is unreasonable to change more than a thousand source files in a
> single commit
>

I think such a library can be broken down into smaller libraries of about
10 files each. These can then be converted to separate modules and combined
into a single module. This could be an option.

> This means it's a strong requirement that a library be consumable through
> both textual inclusion and through a named module import in the same
> program.
>

One advantage of this approach is that a library is consumed either as a
module or as header-files, which saves the compiler some trouble (based on
a few comments that I have read online). However, it does not prevent
header-files and modules of the same library from coexisting in the same
program, because the switch in the build configuration that marks the
library as consumable as a module is opt-in. Only if it is enabled will the
build-system conspire with the compiler to replace, in every dependent
target, all includes of that library with the module and the header-unit /
header-file specified in the configuration file. If it is not enabled, the
user is on their own: they can use either "#include" or "import" in their
code.
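
As a rough illustration of that opt-in behaviour, here is a minimal C++
sketch of the decision that would be made for a single include. The names
LibraryMeta, Resolution, isUnder and resolveInclude are hypothetical and
made up for this example; no compiler or build-system exposes such an
interface today. For simplicity the sketch always looks the macro header up
with angle brackets and does not track whether the set was already
processed in the translation unit.

```cpp
#include <string>
#include <vector>

// Hypothetical per-library metadata, as a build-system might store it.
struct LibraryMeta {
    std::string includeDir;   // public include-directory, e.g. "include/cat"
    std::string moduleName;   // module that replaces the headers, e.g. "cat"
    std::string macroHeader;  // header carrying the macros of the library
    bool consumeAsModule;     // the opt-in switch from the build configuration
};

// Outcome for one #include directive.
struct Resolution {
    bool replaced;                   // true if the include became an import
    std::vector<std::string> lines;  // directives that take effect instead
};

// Returns true if `path` names a file somewhere below directory `dir`.
bool isUnder(const std::string& path, const std::string& dir) {
    return path.size() > dir.size() &&
           path.compare(0, dir.size(), dir) == 0 &&
           path[dir.size()] == '/';
}

// If the included file belongs to a library whose switch is enabled, swap the
// include for an import plus the macro header; otherwise keep it untouched.
Resolution resolveInclude(const std::string& resolvedPath,
                          const std::vector<LibraryMeta>& libs) {
    for (const LibraryMeta& lib : libs) {
        if (lib.consumeAsModule && isUnder(resolvedPath, lib.includeDir)) {
            return {true, {"import " + lib.moduleName + ";",
                           "#include <" + lib.macroHeader + ">"}};
        }
    }
    return {false, {"#include \"" + resolvedPath + "\""}};
}
```

With the switch enabled for "Cat", an include that resolves to a file under
include/cat yields the import and the macro header. With the switch
disabled, the same include passes through unchanged, which is the "user is
on their own" case.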

I never thought about this approach at all. I always thought that the
transition to modules could only be bottom-up, and this, combined with the
incompatibility of a library's headers and module in the same program, was
a major limitation in my head, at least for the time being. Gaby's paper
brought this notion of a top-down transition to light, and I shared my
perspective as a build-system dev. I still think that we should start from
the bottom so that minimal code is impacted, or maybe we can use this
approach with some small library at the very top. Just to test the waters. :)

If the compiler devs don't see any issues with it, I request them to
support it. I can do a case study with SFML and then maybe we can build on
it further.

In the process, we can document as well.

Best,
Hassan Sajjad

On Sun, Nov 12, 2023 at 11:17 PM Bret Brown via SG15 <sg15_at_[hidden]>
wrote:

> I think you have the premise correct, Hassan. To emphasize, though, we
> cannot expect all users of a given library to convert from textual
> inclusion to explicit use of named modules in the same commit. Even for
> well governed monorepos, if the repo includes enough code and enough active
> contributors, it is unreasonable to change more than a thousand source
> files in a single commit.
>
> This means it's a strong requirement that a library be consumable through
> both textual inclusion and through a named module import in the same
> program.
>
> In our discussion on Friday, we talked about a few options. It would be
> good to get some writeups on the pros and cons of each of the options so we
> can educate the community on which work and which are recommended.
> Initially less formal writeups to drive discussion seem great. Eventually
> we can write more formal ISO papers to function as references.
>
> All of the approaches we discussed likely require additional work on
> standards for defining metadata for how to consume modular C++. Even
> monorepos ship shared libraries with C++ interfaces for other build systems
> to consume! Writeups and case studies for this module metadata would also
> help with allowing the community to practically adopt modules.
>
> Bret
>
> On Sun, Nov 12, 2023, 12:42 Hassan Sajjad via SG15 <sg15_at_[hidden]>
> wrote:
>
>> Hi Gaby,
>>
>> I would like to rephrase the comment that I made on the server.
>>
>> Problem:
>> I don't fully understand the proposal, but I am getting the gist.
>> Suppose a monorepo project has multiple libraries, all compiled with the
>> same baseline compile command, and one of them, library "Cat", supports
>> being consumed both as a module and as header-files. At the moment only a
>> few consumers use "Cat" as a module while the others use it as
>> header-files. The problem arises when a consumer that uses "Cat" as a
>> module also consumes some other library that uses "Cat" as header-files.
>> That consumer is now inadvertently consuming "Cat" both as header-files
>> and as a module.
>>
>> I am guessing that the solution the paper proposes is along these lines:
>> once "Cat" is converted to modules, the build-system can reliably
>> communicate to the compiler that whenever any header-file or header-unit
>> of library "Cat" is included by any library, the include should be
>> replaced with "import Cat;" and a macro include-file should be
>> force-included as well.
>>
>> So, we want to map the list of header-files of the library "Cat" to the
>> module "Cat".
>>
>> I propose a way to signal this reliably to the compiler. Most
>> header-files today come from include-directories. Many build-systems
>> gather source-files into sets, generally called targets. A target has the
>> same baseline compile command + local preprocessor arguments for all its
>> files. A target can have other targets as dependencies. When a target
>> adds another target as a dependency, the public / usage-requirement /
>> interface include-directories of the dependency get added to the set of
>> include-directories of the dependent target.
>>
>> This means that whichever target uses "Cat" adds the public
>> include-directories of "Cat" to its own set. Now, in the
>> build-configuration, the user can mark "Cat" as supporting consumption as
>> a module and also provide some extra metadata (TBDL). When the "Cat"
>> consumer target adds the public include-directories of "Cat" to its own
>> include-directories set, it also saves the meta-data and associates it
>> with these directories. It can then use this meta-data when constructing
>> the compile-command.
>>
>> The meta-data is manually specified by the user in the
>> build-configuration file. It is the set of names of the modules that "Cat"
>> introduces. It also includes the set of header-units / header-files
>> containing all the macros squeezed out of "Cat". Whether a particular file
>> of the set is a header-file or a header-unit is determined by the
>> header-units.json file in the directory in which the header-file is found.
>>
>> In most cases, the set of modules will need only one entry, as a single
>> module can have multiple partitions in it. Likewise, a single
>> macro-containing header-file can be the amalgamation of multiple
>> header-files, but the user is free to choose otherwise.
>>
>> e.g. a sample compile-command for a file "dog.cpp" of target "Dog" could
>> be ```cl.exe /c /MIinclude/cat(cat)(true, cat_macro.h) dog.cpp
>> /reference cat=build/cat.ifc ...```
>>
>> I chose the syntax arbitrarily. Compilers can design it to suit the
>> environment and terminal they operate in. Instead of the /I flag, the /MI
>> flag is used to specify the include-directory, with the meta-data that
>> the user specified for the target "Cat" embedded in it. Each set is given
>> in its own parentheses:
>> cat --> the name of the module the library introduces
>> true, cat_macro.h --> true if the file is to be found as <cat_macro.h>
>> and false if it is to be found as "cat_macro.h". This file must exist in
>> one of the "Cat" include-directories. It could be a header-unit. It could
>> be a header-file.
>> The ... at the end stands for the dependencies, if any, of the modules
>> specified in the set.
>>
>> Now, with this information, whenever the compiler hits an include of a
>> header-file / header-unit from an include-directory of "Cat", it ignores
>> the include and instead processes the above set, if it has not already
>> done so.
>>
>> This way the compiler, with the help of the build-system, reliably
>> ensures that the library "Cat" is consumed either by the conventional
>> approach in all of its dependents or as a module. Only a little user
>> intervention is needed to specify the meta-data. In bigger projects,
>> CMake constructs like ```add_executable``` are not called directly.
>> Instead, they are called through a wrapper. This wrapper can supply the
>> meta-data by default, such as a module name equal to the target name and
>> a macro header-file named targetname_macro.h, and expose to the user only
>> the setting that says whether the target supports being consumed as a
>> module. Hence this metadata specification part could be automated as well.
>>
>> If this is implemented in MSVC, my build-system HMake will be able to
>> support it in no time (in 2 days at most). Thus, with one switch you will
>> be able to move from conventional headers to modules across your project,
>> provided your library supports both simultaneously. Actually, if the
>> library is a closed domain, a part of a monorepo, the user can move code
>> from the includes to the modules. The now-empty header-files should
>> remain in place only as hints to the compiler. Once all references to the
>> header-files of the target "Cat" are removed from the dependent targets
>> and "#include" is replaced by "import", the header-files can be safely
>> deleted. Multiple libraries can convert to modules simultaneously. As
>> soon as a library converts to a module, all of its dependents use that
>> module.
>>
>> My build-system HMake also supports header-units so that you can consume
>> the library in any of the 3 ways. I compiled SFML with C++20 header-units
>> and can experiment with the consumption of "std" module in the whole of
>> SFML if this gets accepted.
>>
>> Another point is that for the ```/showIncludes``` flag, the compiler
>> should not show any of the header-files from /MI include-directories, as
>> those are no longer dependencies. They are just hints.
>>
>> While I currently see no issues, I acknowledge the possibility of
>> limitations and potential errors.
>>
>> Thank you for your proposal. The most interesting part for me is "the
>> suggested implementation for transition strategy relies purely on build set
>> up and requires no language rule changes"
>>
>> Best,
>> Hassan Sajjad
>>
>> On Thu, Nov 9, 2023 at 2:28 AM Gabriel Dos Reis via SG15 <
>> sg15_at_[hidden]> wrote:
>>
>>> Hello,
>>>
>>> I have a draft paper addressing a problem that some C++23
>>> implementations are having: https://isocpp.org/files/papers/D3041R0.pdf
>>>
>>> Is there a way for me to present that tomorrow?
>>>
>>> Thanks,
>>>
>>> -- Gaby
>>>
>>>
>>> _______________________________________________
>>> SG15 mailing list
>>> SG15_at_[hidden]
>>> https://lists.isocpp.org/mailman/listinfo.cgi/sg15
>>>

Received on 2023-11-12 20:22:16