Date: Sun, 3 Mar 2019 10:39:06 -0500
On Sun, Mar 3, 2019, 9:34 AM Boris Kolpackov <boris_at_[hidden]>
wrote:
> Ben Craig <ben.craig_at_[hidden]> writes:
>
> > The scheme I have in mind would result in no build throughput
> > improvements with the old bad build systems, but I think it
> > would still provide the isolation benefits of modules and be
> > conforming.
>
> I wonder if it could result in worse build throughput compared
> to headers, though?
>
Depends on how it is implemented. It is certainly possible though, especially
when compiling a few "leaf" TUs. In particular, it is almost guaranteed to be
slower when compiling a single .cpp file with a clean cache, but that is
probably true for module-aware builds as well. I would expect compilers that
implement this mode to implicitly cache module interfaces. This should
provide much of the performance benefit of module-aware builds, assuming
there is a good overlap of module-affecting build flags. This may also help
solve the distributed build issue, by keeping the cache between compiles,
indexed by content hashes, including the chained hashes of imported modules
(I assume this is what JF meant by "blockchain").
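To make the chained-hash idea concrete, here is a minimal sketch of how such
a cache key could be computed. It is not taken from any compiler: ModuleInput,
cache_key and the mixing helpers are hypothetical names, and FNV-1a is only a
stand-in for whatever content hash a real implementation would choose
(presumably a cryptographic one).

```cpp
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// FNV-1a constants (64-bit). Assumption: a stand-in for a real content hash.
constexpr std::uint64_t kFnvOffset = 14695981039346656037ull;
constexpr std::uint64_t kFnvPrime = 1099511628211ull;

// Fold a run of bytes into the running hash.
inline std::uint64_t mix_bytes(std::uint64_t h, std::string_view bytes) {
    for (unsigned char c : bytes) {
        h ^= c;
        h *= kFnvPrime;
    }
    return h;
}

// Fold a 64-bit value in, least significant byte first, so the result does
// not depend on the endianness of the machine computing it.
inline std::uint64_t mix_u64(std::uint64_t h, std::uint64_t v) {
    for (int i = 0; i < 8; ++i) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= kFnvPrime;
    }
    return h;
}

// Hypothetical view of one module interface as the cache sees it.
struct ModuleInput {
    std::string source;                  // full text of the interface unit
    std::string flags;                   // only the flags that affect the interface
    std::vector<std::uint64_t> imports;  // cache keys of direct imports, in import order
};

// The key covers the source, the flags, and the keys of the direct imports.
// Each imported key already covers that module's own imports, so a change
// anywhere further down the import graph changes every key above it.
inline std::uint64_t cache_key(const ModuleInput& m) {
    std::uint64_t h = kFnvOffset;
    h = mix_u64(h, m.source.size());  // length prefixes keep fields from running together
    h = mix_bytes(h, m.source);
    h = mix_u64(h, m.flags.size());
    h = mix_bytes(h, m.flags);
    for (std::uint64_t k : m.imports) {
        h = mix_u64(h, k);
    }
    return h;
}
```

With this shape, two builds that agree on a module's source, its
module-affecting flags and the keys of everything it imports get the same
key and can share a cached interface, whether they ran on the same machine
or not.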
> To be conforming, not only the importing TU will have to be
> isolated from any macros defined by the module interface, but
> the module interface will have to be isolated from any macros
> defined by the importing TU (whether it should also be
> isolated from macros defined on the command line is an
> interesting question, BTW). And this isolation will have to
> happen recursively, for modules imported by module interfaces.
>
I keep seeing comments like this. It isn't just macros. I'm pretty sure
modules also offer full phase 7 isolation. This means that overload
resolution (among other things) must ignore any entities that shouldn't be
seen by the imported module, even if they are defined earlier in the single
stream. This also applies to header units, not just "full modules". I'm not
an implementer, but phase 7 isolation sounds harder to me than emulating
macro isolation, which is just a better form of the push/pop macro mechanism
most (all?) preprocessors already have. I know clang's current solution
involves launching a new copy of the compiler to compile each module.
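For reference, the existing push/pop mechanism looks like the sketch below,
using the push_macro / pop_macro pragmas that GCC, Clang and MSVC accept.
BUFFER_SIZE and the constant names are made up for illustration. It works
one macro name at a time, which is why real macro isolation would need a
"better form" of it, and it does nothing at all for phase 7.

```cpp
// Macro defined by the importing TU before the "import".
#define BUFFER_SIZE 64

// Save the importer's definition and hide it from the text that follows.
#pragma push_macro("BUFFER_SIZE")
#undef BUFFER_SIZE

// --- the text of the imported interface would sit here ---
#ifdef BUFFER_SIZE
constexpr bool importer_macro_leaked_in = true;
#else
constexpr bool importer_macro_leaked_in = false;
#endif
#define BUFFER_SIZE 4096  // the interface's own, unrelated macro
constexpr int module_buffer = BUFFER_SIZE;
// --- end of the imported interface ---

// Restore the importer's definition, discarding the interface's one.
#pragma pop_macro("BUFFER_SIZE")
constexpr int importer_buffer = BUFFER_SIZE;

static_assert(!importer_macro_leaked_in, "importer's macro was visible inside");
static_assert(module_buffer == 4096, "interface saw its own definition");
static_assert(importer_buffer == 64, "importer got its definition back");
```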
> Now consider a TU that imports a bunch of modules which in
> turn each import a bunch more and all of them include some
> common header, say <functional>. The above isolation rules
> mean that each of those "module interface fragments" (for
> the lack of better term) will include their own full copy
> of <functional> (because the include guards will not be
> defined; how this feature interacts with #pragma once is
> an interesting question, BTW).
>
This actually seems easy, relative to everything else required of a
compiler to support this. For one thing, the std headers are (likely to be)
defined as importable headers, so they are required to be treated as header
units, which means they are always compiled as if they were their own TU.
Each "real" module needs to be treated the same way. They must behave as if
they #include their own copy of each header. And that describes how you
would handle non-importable headers. Implementations should optimize this
case by only including a single copy of each unique file in the stream, and
stitching together the equivalent (potentially synthesized) translation
units by combining the files as needed. Maybe the best solution is to
combine the files in a container such as a tar or zip archive, rather than
the current mechanism of directly including the text with some #pragma-like
separators.
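A rough sketch of the "store each file once, stitch per TU" idea, with an
in-memory struct standing in for the tar/zip-like container. Bundle and its
members are hypothetical names, and files are keyed by path here purely for
brevity; keying by content hash would be the other obvious choice.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container: every unique file is stored once, and each
// (possibly synthesized) translation unit is an ordered list of references.
struct Bundle {
    std::vector<std::string> files;                         // unique file texts
    std::map<std::string, std::size_t> index;               // path -> slot in files
    std::map<std::string, std::vector<std::size_t>> units;  // unit name -> file slots

    // Store the text only the first time a path is seen; return its slot.
    std::size_t add_file(const std::string& path, const std::string& text) {
        auto [it, inserted] = index.try_emplace(path, files.size());
        if (inserted) {
            files.push_back(text);
        }
        return it->second;
    }

    // Record a unit as the ordered sequence of (path, text) parts it needs.
    void add_unit(const std::string& name,
                  const std::vector<std::pair<std::string, std::string>>& parts) {
        std::vector<std::size_t>& slots = units[name];
        for (const auto& part : parts) {
            slots.push_back(add_file(part.first, part.second));
        }
    }

    // Rebuild the text one unit would see, as if it had its own copy of
    // every file it includes.
    std::string stitch(const std::string& name) const {
        std::string out;
        for (std::size_t slot : units.at(name)) {
            out += files[slot];
        }
        return out;
    }
};
```

Two module interfaces that both include <functional> would then share one
stored copy of it, while stitch() still hands each of them a full copy when
its unit is compiled.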
> Note also that the same kind of duplication applies to the
> module interfaces themselves: if a TU imports a bunch of
> modules which in turn each import the same module, its
> interface fragment will be duplicated as well.
> _______________________________________________
> Modules mailing list
> Modules_at_[hidden]
> Subscription: http://lists.isocpp.org/mailman/listinfo.cgi/modules
> Link to this post: http://lists.isocpp.org/modules/2019/03/0116.php
>
Received on 2019-03-03 16:39:22