Re: #include to modules transition

From: Hassan Sajjad <hassan.sajjad069_at_[hidden]>
Date: Sun, 12 Nov 2023 22:42:10 +0500

Hi Gaby,

I would like to rephrase the comment that I made on the server.

Problem:
I don't fully understand the proposal, but I think I get the gist.
Suppose a monorepo project has multiple libraries, all compiled with the
same baseline compile command, and one of them, library "Cat", supports
being consumed both as a module and as header-files. At the moment, only a
few libraries consume "Cat" as a module, while the others consume it as
header-files. The problem arises when a consumer that uses "Cat" as a
module also consumes some other library that uses "Cat" as header-files.
This consumer then inadvertently consumes "Cat" both as header-files and as
a module.
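
To make the problem concrete, here is a minimal sketch of how a
build-system could detect such a consumer. The data layout (a "consumes"
map and a "deps" list per target) and the function name are hypothetical,
and the sketch assumes that a dependency's way of consuming "Cat" always
leaks to its dependents, which in practice is only true when it happens in
the dependency's public headers.

```python
def consumption_modes(target, library, targets, seen=None):
    """Collect every way `library` reaches `target`, directly or transitively."""
    seen = set() if seen is None else seen
    if target in seen:
        return set()
    seen.add(target)
    info = targets[target]
    modes = set()
    if library in info["consumes"]:
        modes.add(info["consumes"][library])
    for dep in info["deps"]:
        modes |= consumption_modes(dep, library, targets, seen)
    return modes


# Hypothetical project: "Dog" still includes Cat's headers, "App" imports Cat.
targets = {
    "Cat": {"consumes": {}, "deps": []},
    "Dog": {"consumes": {"Cat": "header"}, "deps": ["Cat"]},
    "App": {"consumes": {"Cat": "module"}, "deps": ["Cat", "Dog"]},
}

# "App" sees Cat in both forms, which is the situation described above.
mixed = consumption_modes("App", "Cat", targets)
```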

I am guessing that the solution the paper proposes is along these lines:
once "Cat" is converted to modules, the build-system can reliably tell the
compiler that if any header-file or header-unit of library "Cat" is
included by any library, the include is to be replaced with "import Cat;",
and an include-file containing the macros is force-included as well.

So, we want to map the list of header-files of the library "Cat" to the
module "Cat".

I propose a way to signal this reliably to the compiler. Most header-files
today come from include-directories. Many build-systems group source-files
into sets, generally called targets. A target has the same baseline
compile command + local preprocessor arguments for all its files. A target
can have other targets as dependencies. When a target adds another target
as a dependency, the dependency's public / usage-requirement / interface
include-directories get added to the set of include-directories of the
dependent target.

This means that whichever target uses "Cat" adds the public
include-directories of "Cat" to its own set. Now, in the
build-configuration, the user can mark "Cat" as supporting consumption as
a module and also provide some extra meta-data (TBDL). When the consumer
target adds the public include-directories of "Cat" to its own
include-directories set, it also saves the meta-data and associates it
with these directories. Then it can use this meta-data when constructing
the compile-command.
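
As a rough illustration of this association, a build-system could keep the
meta-data next to each inherited include-directory. The class and field
names below are made up for this sketch, and only direct dependencies are
handled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModuleMeta:
    modules: list         # names of the modules the library introduces
    macro_headers: list   # header-files / header-units carrying its macros


@dataclass
class Target:
    name: str
    public_includes: list = field(default_factory=list)
    module_meta: Optional[ModuleMeta] = None  # None: header-only consumption
    deps: list = field(default_factory=list)

    def include_set(self):
        """Pair every include-directory inherited from a direct dependency
        with that dependency's module meta-data (or None)."""
        return [(d, dep.module_meta)
                for dep in self.deps
                for d in dep.public_includes]


cat = Target("Cat", ["include/cat"], ModuleMeta(["cat"], ["cat_macro.h"]))
dog = Target("Dog", deps=[cat])
inherited = dog.include_set()  # one (directory, meta-data) pair for "Cat"
```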

The meta-data is specified manually by the user in the build-configuration
file. It consists of the set of names of the modules that "Cat" introduces
and the set of header-units / header-files containing all the macros
squeezed out of "Cat". Whether a particular file of the latter set is a
header-file or a header-unit is determined by the header-units.json file in
the directory where the header-file is found.

In most cases, only one entry is needed in the set of modules, as a single
module can have multiple partitions. Likewise, a single macro-containing
header-file can be the amalgamation of multiple header-files, but the user
is free to keep them separate.

e.g. a sample compile-command for a file "dog.cpp" of target "Dog" could
be
```
cl.exe /c /MIinclude/cat(cat)(true, cat_macro.h) dog.cpp /reference cat=build/cat.ifc ...
```

I chose the syntax arbitrarily. Compilers can design it to suit the
environment and terminal they operate in. Instead of the /I flag, the /MI
flag is used to specify the include-directory, and the meta-data specified
by the user for the target "Cat" is embedded in it. The two sets are given
in their own parentheses:
cat --> the name of the module the library introduces
true, cat_macro.h --> true if the file is to be found as <cat_macro.h>
and false if it is to be found as "cat_macro.h". This file must exist in
one of the "Cat" include-directories. It could be a header-unit or a
header-file.
The ... at the end stands for any dependencies of the modules specified in
the set.
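
A build-system could assemble such a command roughly as follows. This is
only a sketch: /MI is the hypothetical flag from this mail, not an existing
MSVC option, and the function names and argument layout are my own
assumptions.

```python
def mi_flag(include_dir, modules, macro_headers):
    """Format the hypothetical /MI flag described in this mail.

    macro_headers is a list of (angle, filename) pairs, where angle is True
    if the file is to be found as <filename> and False for "filename".
    """
    mods = ", ".join(modules)
    hdrs = ", ".join(f"{str(angle).lower()}, {name}"
                     for angle, name in macro_headers)
    return f"/MI{include_dir}({mods})({hdrs})"


def compile_command(source, include_dir, modules, macro_headers, references):
    """Return the argument list; references maps module name -> .ifc path."""
    refs = []
    for name, ifc in references.items():
        refs += ["/reference", f"{name}={ifc}"]
    return ["cl.exe", "/c", mi_flag(include_dir, modules, macro_headers),
            source, *refs]


cmd = compile_command("dog.cpp", "include/cat", ["cat"],
                      [(True, "cat_macro.h")], {"cat": "build/cat.ifc"})
```

Returning an argument list rather than a single string sidesteps the
question of how the parentheses would have to be quoted in a given shell.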

Now, with this information, whenever the compiler hits an include of a
header-file / header-unit from an include-directory of "Cat", it ignores
the include and instead processes the above set, if it has not done so
already.
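
The intended compiler behaviour can be simulated in a few lines. This is a
toy model under my own assumptions, not how any compiler works today: the
includes are given as already-resolved paths, the macro header is always
emitted in angle-bracket form, and the macro header itself is exempt from
the rewrite.

```python
def owning_dir(path, mi_dirs):
    """Return the /MI directory that contains `path`, or None."""
    for d in mi_dirs:
        if path.startswith(d.rstrip("/") + "/"):
            return d
    return None


def rewrite_includes(includes, mi_dirs):
    """includes: resolved header paths in inclusion order.
    mi_dirs: {directory: (module_names, macro_header_names)}.
    Returns the directives the translation unit effectively sees."""
    out, processed = [], set()
    for path in includes:
        d = owning_dir(path, mi_dirs)
        if d is None:
            out.append(f'#include "{path}"')  # ordinary include, untouched
            continue
        if d not in processed:                # process the set only once
            processed.add(d)
            modules, macro_headers = mi_dirs[d]
            out += [f"import {m};" for m in modules]
            out += [f"#include <{h}>" for h in macro_headers]
    return out


seen_by_tu = rewrite_includes(
    ["include/cat/cat.h", "include/dog/dog.h", "include/cat/tail.h"],
    {"include/cat": (["cat"], ["cat_macro.h"])},
)
```

Both includes from "include/cat" collapse into a single import plus the
macro header, while the include from "include/dog" is kept as it is.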

This way the compiler, with the help of the build-system, reliably ensures
that the library "Cat" is consumed either conventionally by all of its
dependents or as a module by all of them. Only a little user intervention
is needed to specify the meta-data. In bigger projects, CMake constructs
like ```add_executable``` are not called directly but through a wrapper.
This wrapper can fill in the meta-data by default, e.g. a module name equal
to the target name and a macro header-file named targetname_macro.h, and
expose to the user only the setting of whether the target supports being
consumed as a module. Hence the meta-data specification could be automated
as well.
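
Such a wrapper default could look like the sketch below. The function name
and the dictionary layout are hypothetical, and searching for the macro
header in angle-bracket form is an assumption of mine.

```python
def default_module_meta(target_name, supports_modules=False):
    """Derive the meta-data from the target name; the user only flips
    supports_modules. Returns None for header-only consumption."""
    if not supports_modules:
        return None
    return {
        "modules": [target_name],
        # (True, name): assume the header is found as <name>
        "macro_headers": [(True, f"{target_name}_macro.h")],
    }


meta = default_module_meta("cat", supports_modules=True)
```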

If this is implemented in MSVC, my build-system HMake will be able to
support it quickly (in 2 days at most). Thus, with one switch you will be
able to move from conventional includes to modules across your project,
provided your library supports both simultaneously. Actually, if the
library is a closed domain, i.e. a part of a monorepo, the user can move
code from the header-files into the modules. The then-empty header-files
only need to remain in place as hints to the compiler. Once all references
to the header-files of the target "Cat" are removed from the dependent
targets and "#include" is replaced by "import", the header-files can be
safely deleted. Multiple libraries can convert to modules simultaneously.
As soon as a library converts to modules, all of its dependents use that
module.

My build-system HMake also supports header-units, so you can consume the
library in any of the 3 ways. I compiled SFML with C++20 header-units and
can experiment with consuming the "std" module throughout SFML if this
gets accepted.

Another point is that with the ```/showIncludes``` flag, the compiler
should not list any header-files from /MI include-directories, as those
are no longer dependencies. They are just hints.
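
In terms of the dependency list, this amounts to a simple filter. The
sketch below is illustrative only; the function name is hypothetical and
the paths are assumed to be already resolved with forward slashes.

```python
def reported_includes(resolved_headers, mi_dirs):
    """Drop headers that live under a /MI include-directory, since they
    only act as hints and are not real dependencies of the source file."""
    prefixes = tuple(d.rstrip("/") + "/" for d in mi_dirs)
    return [h for h in resolved_headers if not h.startswith(prefixes)]


deps = reported_includes(
    ["include/cat/cat.h", "include/dog/dog.h"],
    ["include/cat"],
)
```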

While I currently see no issues, I acknowledge the possibility of
limitations and potential errors.

Thank you for your proposal. The most interesting part for me is "the
suggested implementation for transition strategy relies purely on build set
up and requires no language rule changes".

Best,
Hassan Sajjad

On Thu, Nov 9, 2023 at 2:28 AM Gabriel Dos Reis via SG15 <
sg15_at_[hidden]> wrote:

> Hello,
>
>
>
> I have a draft paper addressing a problem that some C++23 implementations
> are having: https://isocpp.org/files/papers/D3041R0.pdf
>
>
>
> Is there a way for me to present that tomorrow?
>
>
>
> Thanks,
>
>
>
> -- Gaby
>
>
> _______________________________________________
> SG15 mailing list
> SG15_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg15
>

Received on 2023-11-12 17:42:11