sg15: Re: [Tooling] Modules feedback

From: Steve Downey <sdowney_at_[hidden]>
Date: Mon, 11 Feb 2019 15:45:07 -0500

I think you can linearize the scan of all the translation units by assuming
all of the unknown modules will be in $PROJECT_MODULE_DIR/module_name.bmi,
and when you find the interface unit that produces module_name, write out
the build dependency there. This works less well for external dependencies.
There's also the additional complication that an imported module may not
exist yet. That should be a build error, but shouldn't necessarily be a
hard error during scanning and collating. Building unrelated sources should
still be possible, as with a build tool's keep-going option.
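
A minimal sketch of the scheme described above, in Python purely for
illustration. The names ScanResult, assumed_bmi and collate, the ".o"
suffix, and the file names in the usage lines are all hypothetical and not
taken from CMake or any compiler; only the
$PROJECT_MODULE_DIR/module_name.bmi convention comes from this message. It
assumes a prior scan step has already reported, per translation unit, which
module it provides and which modules it imports.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class ScanResult:
    """Per-TU scan output: module identities only, no locations."""
    source: str
    provides: Optional[str] = None            # module this TU exports, if any
    imports: List[str] = field(default_factory=list)


def assumed_bmi(module_dir: str, module_name: str) -> str:
    # Assumed convention: every project module's BMI is written to
    # <module_dir>/<module_name>.bmi, so its path is known before the
    # interface unit that produces it has been found.
    return str(PurePosixPath(module_dir) / f"{module_name}.bmi")


def collate(
    scans: Iterable[ScanResult], module_dir: str
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Build dependency rules in passes that are linear in the scan results.

    Returns (rules, missing). `rules` maps each output file to its inputs.
    `missing` maps each imported module that no scanned TU provides to the
    sources importing it; it is reported rather than raised, so unrelated
    targets can still be built.
    """
    scans = list(scans)
    rules: Dict[str, List[str]] = {}
    producers: Dict[str, str] = {}

    for scan in scans:
        deps = [scan.source] + [assumed_bmi(module_dir, m) for m in scan.imports]
        rules[scan.source + ".o"] = list(deps)
        if scan.provides is not None:
            # Found the interface unit: write the rule for its BMI here.
            rules[assumed_bmi(module_dir, scan.provides)] = list(deps)
            producers[scan.provides] = scan.source

    missing: Dict[str, List[str]] = {}
    for scan in scans:
        for name in scan.imports:
            if name not in producers:
                missing.setdefault(name, []).append(scan.source)
    return rules, missing


# Usage: "net" is imported but no scanned TU provides it.
scans = [
    ScanResult("app.cpp", imports=["util", "net"]),
    ScanResult("util.cppm", provides="util"),
]
rules, missing = collate(scans, "build/modules")
```

Here `missing` ends up holding "net", which the build would later surface
as an error for app.cpp only. As noted, this does not cover external
dependencies, whose BMIs would not live under the project's module
directory.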

On Mon, Feb 11, 2019 at 3:30 PM Matthew Woehlke <mwoehlke.floss_at_[hidden]>
wrote:

> On 11/02/2019 14.05, Ben Boeckel wrote:
> > On Mon, Feb 11, 2019 at 11:56:23 -0500, Matthew Woehlke wrote:
> >> I think there might be some communication confusion here. "Determining
> >> dependencies" can mean two things: determining the *identity* of
> >> dependencies, or determining the *location* of dependencies. Determining
> >> the *identity* is, indeed, as easy for modules as for includes. It's the
> >> *location* that's a problem. For includes, the compiler already has a
> >> set of include paths which it can scan (relatively quick; this is just
> >> directory listing) to determine location.
> >>
> >> For modules... I'm not sure how this is going to work. This actually
> >> relates to the open question of how we "ship" modules. But ignore that
> >> for a second.
> >
> > In CMake, the identity is output by the `scan` step. `collate` puts this
> > together with output paths to create full paths to locations of modules.
>
> For project-internal stuff, yes. What about externals? Do those need to
> be scanned also? If so, how?
>
> I think/hope the answer is going to be "no", but while I think we
> generally have consensus how we would *like* that to work, I don't know
> that we have anything that *does* work yet.
>
> I'm not saying that needs to be a "blocker", just that it should be kept
> in mind.
>
> >> Compared to includes, we have more trouble with finding the location of
> >> modules provided by the project. To do that, we have to scan every
> >> source file in the project (or at least, in every library of the project
> >> that the current TU uses, including the TU's own component). This most
> >> assuredly *is* slower. (Unless we use module maps...)
> >
> > The `collate` step handles modules generated by other source files and
> > outputting the information required by the build tool to get the
> > dependency graph correct.
>
> Right. That's the "scan[ning] every source file in [...] every library
> of the project that the current TU uses". The good news is we don't need
> to do that *per TU* (we can do it "once"¹ and reuse the result), but as
> you originally noted, it's a non-trivial difference compared to what we
> do now.
>
> Mostly I was picking at the claim that "It is no slower to determine the
> dependencies of a modular C++ file than a normal C++ file today".
> Strictly speaking, this is true iff you want the dependency *identities*
> but don't care about their *locations*. For a single TU, getting the
> *locations* is decidedly slower. In the context of a complete build,
> however, it's harder to say, since the extra work can be done in
> parallel and the cost is effectively spread out across multiple TUs.
>
> (¹ Modulo need to recompute when things change, but at worst, once per
> build.)
>
> >> What about modules *external* to the project? If they are shipped
> >> already built, then presumably things aren't much different than
> >> includes; we expect them to just be there already.
> >
> > I think there is consensus that compiled modules do *not* get shipped.
> > Some other format (possibly IPR[1]) would be provided.
>
> Right... at which point we have a caching problem, but (despite being
> one of the two hard problems in programming) I think we can deal with
> that :-).
>
> Hmm... but, is it then the worst case that the build tool will have to
> find the module interface files for all modules that are needed but do
> *not* come from local sources, and generate build rules for those? (But
> at least that's back to being a directory-search operation and not a
> file-parse operation. At least I *assume*/hope that's the intent of IPR
> or whatever!) Or do we expect this to be the compiler's job?
>
> (In any case, we're straying from IS territory, though it's probably
> still useful to have an idea of the answers for such questions...)
>
> >> Where I could see this getting *really* ugly is if a) we have no module
> >> maps, and b) we have no solution for shipping already-built modules. If,
> >> in order to use the module from some external library, I first need to
> >> *build* that module for my own project... how do I do that? Scan every
> >> source file *on the entire system*? That's not just slower, it's not
> >> even feasible. Fortunately, I think we all want to not have to go
> >> there...
> >
> > A module is what you need to consume the interface. You don't need the
> > source itself for that. The IPR would just say what symbols are provided
> > via that module (analogous to headers saying what symbols are available)
> > and the linker would do the same job it has always done and hook them up
> > when linking the providing library.
>
> Right... the point was more that we *need* such a thing :-).
>
> More specifically, we need *something* which is a) portable, and b)
> ensures that we can get BMIs from external packages. The ways I see to
> do that are:
>
> - Per (my) above, ship sources and *scan* sources to figure out how to
> build modules. I think we all *really don't* want this.
>
> - Ship module maps to cut out the scanning step of the above. These
> would be *build* artifacts, i.e. build/packaging produces them.
>
> - Per (your) above, ship something that has a 1:1 mapping to modules
> that is a portable representation from which BMIs can be generated. (Or
> which the compiler / tooling can consume directly. Some tooling
> applications may be able to use these directly and never need a BMI.) I
> think most (all?) of us prefer this.
>
> For the last, we *may* still want/need some sort of "module map" to map
> names to module representation artifacts, but as per the middle option,
> this would be generated, not user-maintained.
>
> --
> Matthew

Received on 2019-02-11 21:45:22