sg15: Re: [Tooling] Modules feedback

From: JF Bastien <cxx_at_[hidden]>
Date: Fri, 8 Feb 2019 21:33:37 -0800

On Fri, Feb 8, 2019 at 1:59 PM Ben Boeckel <ben.boeckel_at_[hidden]> wrote:

> On Fri, Feb 08, 2019 at 13:34:43 -0800, JF Bastien wrote:
> > Let us know if you have any feedback!
>
> §2
> > It is no slower to determine the dependencies of a modular C++ file
> > than a normal C++ file today, and current proposals aimed at
> > restricting the preamble wouldn’t make a scanner any faster.
>
> Header dependencies are subtly, but crucially, different from module
> dependencies. One can determine header dependencies *while compiling*
> the file (though that is not necessarily how it is implemented), and this
> can be done at any time once the source file is up-to-date. The first run
> does not need this information because we already know we need to produce
> the output; storing discovered dependency information is a nice side effect.
>
> Module dependencies must be present *during compile*, so must be
> determined at build time (since source files may not be available before
> the build has started) *before* compilation starts.
>
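
To make the ordering constraint above concrete, here is a minimal sketch of what a build tool has to do once module dependencies are known. It assumes a scanner has already produced, for each module, the list of modules it imports; the names `ModuleGraph` and `build_order` are hypothetical and not taken from any existing tool. The ordering uses Kahn's algorithm, which is one reasonable choice rather than the method any particular build system is known to use.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <queue>
#include <string>
#include <vector>

// Hypothetical scanner output: module name -> names of the modules it imports.
using ModuleGraph = std::map<std::string, std::vector<std::string>>;

// Returns an order in which every module appears after all of its imports,
// or std::nullopt if the imports form a cycle or name an unknown module.
std::optional<std::vector<std::string>> build_order(const ModuleGraph& graph) {
    std::map<std::string, int> pending;  // imports not yet built, per module
    std::map<std::string, std::vector<std::string>> importers;
    for (const auto& [name, imports] : graph) {
        pending[name] = static_cast<int>(imports.size());
        for (const auto& dep : imports) {
            if (graph.find(dep) == graph.end()) return std::nullopt;
            importers[dep].push_back(name);
        }
    }

    // Modules with no imports can be compiled immediately.
    std::queue<std::string> ready;
    for (const auto& [name, count] : pending)
        if (count == 0) ready.push(name);

    std::vector<std::string> order;
    while (!ready.empty()) {
        std::string name = ready.front();
        ready.pop();
        order.push_back(name);
        // Building this module may unblock the modules that import it.
        for (const auto& user : importers[name])
            if (--pending[user] == 0) ready.push(user);
    }
    assert(order.size() <= graph.size());
    if (order.size() != graph.size()) return std::nullopt;  // import cycle
    return order;
}
```

With header dependencies no such ordering step is needed before the first compile, which is the difference being pointed out.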

Maybe it would help anchor our discussion if we had a graph of "build
activity over time" where we quantify parallelism versus sequentialism, and
what each operation is doing? Put another way, a step which blocks many
others is fine as long as it's pretty fast. How fast would dependency
scanning have to be on a project of LLVM's size to make you comfortable?

It might also be useful to separate concerns between clean build and
incremental build. The costs won't be the same in both, so we should
probably discuss them independently (or rather, in the same paper but not
the same paragraph).
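
As a starting point for that kind of quantification, here is a deliberately crude sketch of a worst-case model in which every scan must finish before any compile starts, and all tasks of one kind cost the same. The struct `BuildModel` and the functions `waves` and `makespan_ms` are hypothetical names, and the model itself is an assumption for discussion, not a measurement of any real build.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model parameters; none of these are measured values.
struct BuildModel {
    std::uint64_t workers;     // parallel jobs available
    std::uint64_t scan_ms;     // assumed cost to scan one file
    std::uint64_t compile_ms;  // assumed cost to compile one file
};

// Number of sequential "waves" needed to run `jobs` equal-cost tasks
// on `workers` parallel workers (ceiling division).
std::uint64_t waves(std::uint64_t jobs, std::uint64_t workers) {
    assert(workers > 0);
    return (jobs + workers - 1) / workers;
}

// Worst case: the scan phase fully blocks the compile phase.
// `scanned` and `compiled` are the file counts in each phase, so a clean
// build passes the whole project and an incremental build passes only the
// files that changed and the files that must be rebuilt.
std::uint64_t makespan_ms(const BuildModel& m,
                          std::uint64_t scanned,
                          std::uint64_t compiled) {
    return waves(scanned, m.workers) * m.scan_ms +
           waves(compiled, m.workers) * m.compile_ms;
}
```

Calling `makespan_ms` once with clean-build counts and once with incremental-build counts gives two separate numbers, which matches the suggestion to discuss the two cases independently. A real graph would also need to account for scans and compiles overlapping.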

> §2.2
> > <clang-scan-deps>
>
> Assuming there is consensus on D1483§7, I think this tool can be worked
> to satisfy it. There may be issues around having the tool emulate other
> compilers, but we have experience with that as well[1]. I think we may
> even be able to drop some features from the list (e.g., pcm generation
> at scan time and "installed listeners") in your paper.
>

We're not saying that this specific implementation should be the only one
(or that it shouldn't!). I'd certainly be interested in seeing other
projects implement similar tools.

> §2.3
> > <mapping files>
>
> The approach we describe doesn't require mapping files at all. Other
> tools may find them useful however. I'm thinking mainly static analysis
> tools (versus those that piggyback on the compiler like IWYU and
> `clang-tidy`) since they can't just say "run us with your build tool".
>

That can certainly be an optional thing.

> §2.4
> > We believe modules should be built (not shipped or distributed) as
> > part of a build, and potentially shared in the same environment for
> > other compiler invocations that end up using non-conflicting setups.
>
> Linux distros aren't going to like this… Nor are ports-based systems
> (like Homebrew). Go, Rust, and other languages can get away with it
> because given source code, how to build it is dictated by convention and
> available tooling. C++ is a wild west of solutions for the "source ->
> binary" transformation and given a set of sources, there's no "good way"
> to just know how to compile it today.
>
> That said, I don't think it's something the language standard can
> dictate, but compilers can work together to provide something shippable
> beside compiled libraries.
>

I don't think we're talking about the same thing: our paper talks about
shipping something the compiler created between source code and native
binaries, and we don't think that's necessary.

Linux, Homebrew, and other platforms (such as, say, the one I support)
currently ship headers with native binaries. We believe that modules allow
them to continue doing so, both for their own code which yields said native
binaries, and for developers on these platforms who link to the native
binaries (by referring to the headers) while using modules for their own
code.

Agreed it's a bit of a wild west, but again modules aren't The C++ Savior,
and they don't need to solve this particular problem in our opinion.
There are plenty of people who are looking at a variety of solutions,
including shipping LLVM IR, using WebAssembly, or putting your compiler's
artifacts in the blockchain. I wouldn't want C++ modules to come in and
remove all the fun innovation.

I agree that we could standardize some C++ module format that every
toolchain agrees to, and somehow fix the problem with multiple
configurations (the one we describe with -D, -Werror, optimization levels,
etc). That would indeed be an easy distribution format. It would, however,
severely restrict what implementations can do. I don't think it's a
valuable tradeoff to make at this point in time.
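
To illustrate the multiple-configuration problem, here is a minimal sketch of how a build could decide whether a built module may be reused by another compiler invocation. It assumes the most conservative rule possible: reuse only when the module name and the exact flag list match. `Invocation` and `cache_key` are hypothetical names, and real compilers may apply finer-grained compatibility rules than this.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one compiler invocation that produces or
// consumes a built module.
struct Invocation {
    std::string module_name;
    std::vector<std::string> flags;  // e.g. "-DNDEBUG", "-O2", "-Werror"
};

// Builds a key from the module name and the flags in their original order.
// ASSUMPTION: order is kept because a later flag can override an earlier
// one, and flags are assumed not to contain newline characters.
std::string cache_key(const Invocation& inv) {
    assert(!inv.module_name.empty());
    std::string key = inv.module_name;
    for (const auto& flag : inv.flags) {
        key += '\n';  // separator between components
        key += flag;
    }
    return key;
}
```

Two invocations can share a built module in this sketch only when their keys are equal. A standardized, shippable format would have to cover every such key a consumer might need, or define which differences are safe to ignore.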

I'll draw a parallel with LLVM IR: it *can* be stabilized in some way, but
that has a bunch of issues (some solvable!). Were LLVM IR actually
stabilized we'd lose plenty of flexibility as a compiler. It's not a silly
idea, it's been tried plenty of times, but so far LLVM has seen many valid
reasons to change its IR over time. We can figure things out like
auto-upgrading from older versions to newer ones, how to handle semantic
changes, how to encode information which isn't fully relevant (such as
debug info and various metadata), but... Well, my experience tells me we
won't get it right, and it'll take quite a while to get something somewhat
sensible.

> Thanks,
>
> --Ben
>
> [1]https://github.com/CastXML/CastXML

Received on 2019-02-09 06:33:52