ISOCPP SG15 List: Re: P2898R0: Importable Headers are Not Universally Implementable

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 22 May 2023 16:17:37 -0400

I've read this paper several times, but I'm afraid I'm still struggling
to understand the concerns.

First, I don't know what is meant by "universally implementable". The
paper doesn't offer a definition of the term and its use within the
paper doesn't present an intuitive meaning, at least not for me (note
that "universally implementable" appears only in the title and in an
assertion in section 2.1 in the discussion of #pragma once). If the
concern is that the result of dependency scanning for importable headers
depends on the implementation that the resulting build will target, then
I agree that dependency scanning does not produce an
implementation-independent result. But I don't find that particularly
concerning; absent a (possibly implementation-dependent and often
complicated) include search path, it is not possible, in general, to
universally map #include directives to source files either.
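
To make the search path point concrete, here is a minimal sketch in
Python. It is a toy model and not any compiler's actual lookup
algorithm; the function name resolve_include, the directory names, and
the file set are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, Optional, Set


def resolve_include(name: str, search_path: Iterable[str], files: Set[str]) -> Optional[str]:
    """Return the first existing candidate for an include, or None.

    Toy model: `files` stands in for the file system, and directories are
    tried strictly in order. Real implementations add further rules
    (quote vs. angle forms, sysroots, and so on).
    """
    for directory in search_path:
        candidate = f"{directory}/{name}"
        if candidate in files:
            return candidate
    return None


# Hypothetical layout: two packages both ship a header named config.h.
FILES = {"/opt/a/include/config.h", "/opt/b/include/config.h"}

# The same directive, `#include <config.h>`, under two different search paths.
first = resolve_include("config.h", ["/opt/a/include", "/opt/b/include"], FILES)
second = resolve_include("config.h", ["/opt/b/include", "/opt/a/include"], FILES)
```

In this model the same directive names a different file as soon as the
search path order changes, so a scanner has to be given the same search
path (and lookup rules) as the implementation it is scanning for.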

Section 2.1 states:

    This identity problem has always been a complicated topic for the
    C++ specification, the `#pragma once` directive has been supported
    by various implementations in varying degrees of compatibility, but
    it cannot be universally implemented because we don’t have a way of
    specifying what is the thing that should be included only “once”
    given the way that header search works.

This issue does exist (and has for a very long time) but I don't see how
it is relevant. Since header to source file mapping is
implementation-defined, a dependency scanner or build system must match
the implementation-defined behavior for the targeted implementations.

Section 2.2 states:

    The cost of that approach, however, is that we create a significant
    bottleneck in the dependency chain of any given object. *Changing
    the list of Importable Headers or the Local Preprocessor Arguments
    for any one of them* will result in a complete invalidation of the
    dependency scanning for all translation units in the project.

I think the bold text is not quite correct. Changing whether a header is
importable invalidates only the TUs that include or import it. Likewise,
changing how a header unit is built invalidates only the TUs that import
it (which may affect additional TUs if the importing TU is itself an
importable header). So, instead of "will result", I would substitute
"might result". I don't see any reason why a build system can't cache
these results and update them when they are found to be stale; I've
implemented similar caching in GNU make-based build systems in the past.
I do agree that it is necessary to construct the
dependency tree for importable headers bottom up (because of imported
macros) and that restricts the embarrassingly parallel opportunities
available to named modules. I would like to see some real numbers before
concluding that there is a problem to be solved though. Header units
don't have to perform as well as named modules; they just have to
perform better than source inclusion and have an adoption cost less than
named modules to be a potentially attractive and viable solution.
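
As a rough illustration of the "might result" reading, the following
Python sketch computes which cached scan results become stale when one
header changes. It is a toy model under stated assumptions: the function
name invalidated_by, the file names, and the shape of the dependency map
are hypothetical, and a real build system would track much more
(arguments, timestamps or hashes, and so on).

```python
from collections import deque
from typing import Dict, Set


def invalidated_by(changed: str, imports: Dict[str, Set[str]]) -> Set[str]:
    """Return every node whose scan result may be stale when `changed` changes.

    `imports` maps each TU or importable header to the set of headers it
    includes or imports directly. The walk follows those edges in reverse.
    """
    # Build the reverse edges: header -> things that directly use it.
    users: Dict[str, Set[str]] = {}
    for user, used in imports.items():
        for header in used:
            users.setdefault(header, set()).add(user)

    stale: Set[str] = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for user in users.get(node, ()):
            if user not in stale:
                stale.add(user)
                queue.append(user)
    return stale


# Hypothetical project: c.cpp never reaches util.h, so it is unaffected.
IMPORTS = {
    "a.cpp": {"util.h"},
    "b.cpp": {"widget.h"},
    "widget.h": {"util.h"},
    "c.cpp": {"other.h"},
}

stale = invalidated_by("util.h", IMPORTS)
```

Here a change to util.h marks a.cpp, widget.h, and b.cpp as stale and
leaves c.cpp alone. Complete invalidation happens in this model only
when every TU reaches the changed header.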

Section 2.3 states:

    This is going to be particularly challenging if the ecosystem ends
    up in a situation where different compilers make different choices
    about how to handle the implicit replacement of `#include` by the
    equivalent `import`.

Every implementation will need to be told which header files are
importable. An implementation-independent mechanism to specify this
would be beneficial (e.g., something like Clang's module maps or MSVC's
new header-unit-JSON thing). But the alternative is that dependency
scanners and build systems will have to be aware of the mechanisms used
by the implementations they target. Since such implementation dependence
is present anyway, I don't see this as being a big issue.

Section 3.1 states:

    The main restriction that enables a interoperable use of
    pre-compiled headers is that the translation unit has to use it as a
    preamble to the translation unit, meaning the precompiled header is
    guaranteed not to be influenced by any other code in the translation
    unit

I think I understand what you are trying to say here, but I find the
wording awkward. In my mind, the restriction is that the initial
preprocessor state used to build a PCH and the initial preprocessor
state used to build a TU that uses the PCH must be such that the
translation of the TU that uses the PCH matches the translation that would
be performed if the PCH files were directly included (with the exception
of the precise preprocessor state at the end of translation). If stated
in this way, I believe the differences between PCH and Clang header
modules described in the following section disappear.

Section 3.2 states:

    While they don’t share the same restriction as pre-compiled headers,
    Clang Header Modules are implemented in situations where there is an
    assumption that the headers are not supposed to be influenced by the
    state of the preprocessor, and it is considered an user error if
    that restriction is violated. In other words, they are applicable in
    code-bases that are using a subset of the C++ Language. For code
    that follows that convention, falling back to plain source inclusion
    is considered a valid interpretation of the code.

The first sentence is not quite true. Clang modules allow BMI creation
and selection to depend on macros defined for the importing TU. See the
config_macros declaration:
<https://clang.llvm.org/docs/Modules.html#configuration-macros-declaration>

The paper asserts "not acceptable" and "unacceptable" in various places
without qualification. What criteria were used to determine what would be
acceptable? Please update the paper accordingly.

I recommend substituting "distributed builds" for use of "remote
execution" in the paper since the latter can mean just running the build
on a (single) remote system. The concern is really the packaging and
distribution of (a subset of) the input files needed to perform a subset
of the build.

The paper uses "command line" as a proxy for identifying the set of
macros that will be merged into the preprocessor state of an importing
TU. The set of such macros is actually those that are still defined when
translation of a header unit completes. For clarity, I recommend
avoiding discussion of a command line since the existence of such is an
implementation detail and at best only provides a subset of the
information needed.
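
A small Python sketch of why the command line is only a proxy. This is
a toy model, not a preprocessor: it handles only object-like #define
and #undef lines, and the function name macros_at_end and the macro
names are hypothetical. It simply applies the "still defined when
translation completes" rule and takes no position on how a particular
implementation treats predefined or command-line macros.

```python
from typing import Dict, Iterable


def macros_at_end(initial: Dict[str, str], lines: Iterable[str]) -> Dict[str, str]:
    """Track object-like #define/#undef directives and return what is still defined.

    Toy model only: no conditionals, no function-like macros, no includes.
    `initial` stands in for macros supplied before translation starts
    (for example via a command line).
    """
    defined = dict(initial)
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[0] == "#define":
            defined[parts[1]] = parts[2].strip() if len(parts) == 3 else ""
        elif len(parts) >= 2 and parts[0] == "#undef":
            defined.pop(parts[1], None)
    return defined


# Hypothetical header contents.
HEADER = [
    "#define LIB_VERSION 3",
    "#define LIB_DETAIL_TMP 1",
    "#undef LIB_DETAIL_TMP",
    "#undef TRACE",
]

# TRACE and NDEBUG play the role of macros given before translation starts.
result = macros_at_end({"TRACE": "1", "NDEBUG": ""}, HEADER)
```

In this model the final set contains LIB_VERSION, which never appeared
in the initial state, and lacks TRACE, which did. Looking at the initial
state alone would get both wrong.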

Tom.

On 5/18/23 4:46 PM, Daniel Ruoso via SG15 wrote:
> Hello,
>
> After lots of discussions and trying to find solutions on how to make
> Importable Headers work, I have come to the unfortunate conclusion
> that they are not implementable in a way that would be acceptable in
> all environments where C++ is used.
>
> Moreover, I believe the goals that were set for the specification of
> Importable Headers can be achieved in a much simpler way, without
> introducing the special semantics that those provide.
>
> In other words, we can make the standard allow the optimizations
> achieved by Explicit Clang Header Modules as well as the early
> adoption reports from MSVC, while at the same time maintaining the
> semantics of source inclusion as the "canonical" behavior for when a
> given environment cannot make use of those optimizations.
>
> This revision of the paper does not include wording changes yet, as I
> want to build consensus on this prior to going through the effort of
> figuring out what wording needs changing.
>
> See the paper at: https://wg21.link/P2898R0
>
> daniel

Received on 2023-05-22 20:17:38