Date: Thu, 19 May 2022 00:35:57 -0600
On Tue, May 17, 2022 at 5:09 AM Boris Kolpackov <boris_at_[hidden]>
wrote:
> Michael Spencer <bigcheesegs_at_[hidden]> writes:
>
> > Yes, we perfectly model the preprocessor state as it pertains to
> > dependencies. Currently this is done by actually building a header unit
> > that is empty except for the exported macros and then importing that,
>
> Interesting, thanks for the information. Do you build such "macro-only
> BMIs" once and then share them across all the TUs being scanned or do
> you have to build them from scratch per-TU (thus potentially redoing
> the same thing multiple times)?
>
Currently they are shared where possible, but I don't think you actually
need to do this; scanning all of LLVM + Clang in non-modules mode takes
milliseconds, and that's with full duplication between translation units.
>
> Also, does the build system have any input (e.g., compiler options)
> into how such BMIs are built? Feels like it would be natural for
> the scanner to ask the build system to build these BMIs... (you
> can probably see where I am going with this ;-)).
>
The scanner is invoked by the build system and is given the full command
line. It would actually be really bad to have the compiler build these, as
it would defeat all the caching we do when generating minified files. The
BMIs really are just an artifact of how we got the scanner working for
modules; they use a mode of the compiler that we intend to remove as soon
as possible because of how many issues it causes.
What we would like to do, if absolutely necessary for performance, is
build a bytecode interpreter for the bits of the preprocessor that can
impact dependencies. Scanning would then have a shared step that generates
the bytecode once for each file on disk that gets included, and then
executes the bytecode for each TU, caching state where possible. This can
trivially be done in parallel and can easily handle tens of millions of
lines of code in seconds, really only limited by I/O (and the build system
starts dispatching work as soon as the first leaf module is discovered; it
doesn't need to scan the entire build). The ability to do this kind of
scanning is the primary reason I wrote P1857 (https://wg21.link/P1857).
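As a rough illustration of the "dependency directives" idea (this is only a
sketch, not the actual scanner code; the DepDirective record and the function
name are invented for the example), a minimizer only has to keep the handful
of directive kinds that can affect the dependency graph and drop everything
else:

    #include <cctype>
    #include <string>
    #include <string_view>
    #include <vector>

    // Illustrative record: one entry per directive that can affect dependencies.
    struct DepDirective {
        enum Kind { Include, Import, Define, Undef, If, Elif, Else, Endif } kind;
        std::string text; // the directive line, verbatim
    };

    // Keep only the lines that can influence the dependency graph; ordinary
    // code is dropped (assume comments were already stripped by the caller).
    // The result is the "minified" form that can be re-scanned per TU cheaply.
    std::vector<DepDirective> minimizeForDeps(std::string_view source) {
        std::vector<DepDirective> out;
        size_t pos = 0;
        while (pos < source.size()) {
            size_t eol = source.find('\n', pos);
            if (eol == std::string_view::npos) eol = source.size();
            std::string_view line = source.substr(pos, eol - pos);
            pos = eol + 1;
            size_t i = 0;
            while (i < line.size() && std::isspace((unsigned char)line[i])) ++i;
            std::string_view rest = line.substr(i);
            auto starts = [&](std::string_view p) { return rest.substr(0, p.size()) == p; };
            DepDirective::Kind kind;
            if (starts("#include"))     kind = DepDirective::Include;
            else if (starts("import") || starts("export import"))
                                        kind = DepDirective::Import;
            else if (starts("#define")) kind = DepDirective::Define;
            else if (starts("#undef"))  kind = DepDirective::Undef;
            else if (starts("#if"))     kind = DepDirective::If;   // #if/#ifdef/#ifndef
            else if (starts("#elif"))   kind = DepDirective::Elif;
            else if (starts("#else"))   kind = DepDirective::Else;
            else if (starts("#endif"))  kind = DepDirective::Endif;
            else continue; // not dependency-relevant; drop it
            out.push_back({kind, std::string(line)});
        }
        return out;
    }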
> [...] but it's possible to do this directly.
>
> One potential complication of recreating the macro isolation semantics
> directly is that you will also isolate include guards which means that
> common headers might have to be scanned multiple times for the same
> TU (the same situation would occur in your current approach if you
> are building the "macro-only BMIs" from scratch for each TU). Here
> is an example:
>
> // importer.cpp
> //
> #include <functional>
> import "header-unit1.hpp"; // Also includes <functional>.
> import "header-unit2.hpp"; // Also includes <functional>.
>
> In this example a naive implementation would end up scanning
> <functional> multiple times.
>
This isn't a concern for us. We minimize <functional> once (and build a
data structure that lets us skip past preprocessor blocks), and then parse
the minified version each time it is included. Parsing the minified version
is essentially free.
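As a hedged sketch of what such a skip structure could look like (the
scanner's real representation may well differ; the names here are invented),
each conditional directive in the minified file can record where a re-parse
should jump when its branch is not taken, computed once per file on disk:

    #include <cstddef>
    #include <vector>

    // Illustrative: one entry per directive in the minified file, in order.
    struct MinifiedDirective {
        enum Kind { If, Elif, Else, Endif, Include, Import, Define, Undef } kind;
        // Where a re-parse jumps when this branch is not taken: the next
        // #elif/#else of the group, or one past the matching #endif.
        std::size_t skipTo = 0;
    };

    // Precompute the skip targets once per file by matching conditional
    // directives with a stack; every later per-TU re-parse reuses the result.
    void computeSkips(std::vector<MinifiedDirective>& dirs) {
        std::vector<std::size_t> open; // current branch index at each nesting level
        for (std::size_t i = 0; i < dirs.size(); ++i) {
            switch (dirs[i].kind) {
            case MinifiedDirective::If:
                open.push_back(i);
                break;
            case MinifiedDirective::Elif:
            case MinifiedDirective::Else:
                if (!open.empty()) {
                    dirs[open.back()].skipTo = i; // previous branch skips to here
                    open.back() = i;
                }
                break;
            case MinifiedDirective::Endif:
                if (!open.empty()) {
                    dirs[open.back()].skipTo = i + 1; // last branch skips past #endif
                    open.pop_back();
                }
                break;
            default:
                break;
            }
        }
    }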
>
> Gaby suggested to me (in a private email) that this can be avoided
> but (according to my understanding, at least) it will require a
> pretty sophisticated macro dependency analysis (I can elaborate
> if there is interest).
>
>
> > The only issue is if you want to automatically detect `importable header`s
> > by seeing `import <header>;`.
>
> To me this looks like a recipe for build non-determinism. Even if you
> ignore the case where parts are built independently, I believe to get
> decent performance a build system will need to scan in parallel. Which
> means the decision to consider a header importable will be racy.
>
There's no non-determinism in the way I presented this. Each TU makes its
own decision on whether a header is importable based on whether it ever saw
`import <header>;` at the start of phase 4.
Also, we already scan in parallel, with one scan task per TU.
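A small sketch of how that per-TU decision stays deterministic (illustrative
names only, not the scanner's actual interface): each scan task keeps its own
set of headers it has seen in an import directive and consults only that set,
so no other TU's scan can influence the outcome:

    #include <set>
    #include <string>
    #include <string_view>

    // Illustrative per-TU scan state: no shared or global data, so the result
    // is deterministic no matter how many TUs are scanned in parallel.
    struct TUScanState {
        std::set<std::string> importableHeaders; // targets of `import <...>;` / `import "...";`
    };

    // Called for each dependency directive found while scanning this TU during
    // (the equivalent of) translation phase 4.
    void noteDirective(TUScanState& tu, std::string_view directive) {
        if (directive.substr(0, 6) != "import")
            return;
        auto open = directive.find_first_of("<\"");
        if (open == std::string_view::npos)
            return;
        auto close = directive.find_first_of(">\"", open + 1);
        if (close == std::string_view::npos)
            return;
        tu.importableHeaders.insert(std::string(directive.substr(open + 1, close - open - 1)));
    }

    // A later `#include <header>` in this same TU is treated as an import only
    // if the header is in this TU's own set.
    bool treatIncludeAsImport(const TUScanState& tu, const std::string& header) {
        return tu.importableHeaders.count(header) != 0;
    }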
- Michael Spencer
Received on 2022-05-19 06:36:08