sg15: Re: [Tooling] Clang Modules and build system requirements

From: Matthew Woehlke <mwoehlke.floss_at_[hidden]>
Date: Fri, 8 Feb 2019 14:35:09 -0500

On 08/02/2019 14.01, Tom Honermann wrote:
> Clang modules has demonstrated that 1) programmers are willing to author
> module map files, 2) build systems don't have to be aware of them, and
> 3) dependencies can still be inferred.

The second point is true *if* compilers can build modules on demand, as
they are used.

Now... let's be clear. I think that is a *viable* strategy. It makes
things IMMENSELY easier for certain forms of tooling.

The downside is that my build tool may fire off build jobs for several
TUs that all use the same module. Depending on how the shared cache (if
there is one) is implemented, either each instance will build¹ the same
module and only one will be persisted, or one will be chosen to build
the module while the others wait. (Both options *should* take a similar
amount of wall time, though one will generate more heat.) This will
work, but the jobs that are either doing redundant work or are stalled
represent "wasted" time when we probably could have been doing something
else.

(@Tom, this is the "lost parallelism" I mentioned in my other message.)
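
To put rough numbers on the two cache behaviours, here is a toy cost
model. It is only a sketch under simplifying assumptions that are mine:
every job gets its own core, the module costs the same to build
everywhere, and the function names and cost units are invented for
illustration.

```python
def redundant_build(jobs, module_cost, tu_cost):
    """Every job builds the module itself; only one result is persisted."""
    wall = module_cost + tu_cost          # all jobs run side by side
    cpu = jobs * (module_cost + tu_cost)  # the module is built `jobs` times
    return wall, cpu


def build_once_and_wait(jobs, module_cost, tu_cost):
    """One job builds the module; the other jobs block until it is cached."""
    wall = module_cost + tu_cost          # waiters start their TU afterwards
    cpu = module_cost + jobs * tu_cost    # the module is built exactly once
    return wall, cpu


def unusable_slot_time(jobs, module_cost):
    """Job-slot time that does no new work in either strategy.

    All but one slot is either stalled or repeating the module build.
    """
    return (jobs - 1) * module_cost
```

With 4 jobs, a module cost of 10 and a TU cost of 5, both strategies
take 15 units of wall time, but the first burns 60 units of CPU against
30 for the second. In both cases 30 units of job-slot time did nothing
new, which is the parallelism a smarter schedule could reclaim.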

However, if the compiler also offers a mode to "pre-cache" a module
build (this should not be difficult), then we can have the best of both
worlds; build systems that care about this lost parallelism can generate
rules to "pre-cache" a module and inject the dependencies into their
build graph so that this happens before building any TU that needs that
module. This does mean that tools doing so *do* need to understand the
module map. *But*, they could also opt out of doing so, at the cost of
the build potentially doing extra work.
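
As a sketch of what "inject the dependencies" could look like on the
build system side, assume the tool has already worked out (from the
module map plus whatever scanning it does) which modules each TU uses.
The `tu_imports` mapping and the `precache:` / `compile:` node names
below are hypothetical and do not correspond to any real build system
or compiler flag.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def precache_graph(tu_imports):
    """Map each build node to the set of nodes that must run before it.

    tu_imports: dict of TU file name -> set of module names it uses
    (assumed to be known already; how it is obtained is out of scope).
    """
    graph = {}
    for tu, modules in tu_imports.items():
        deps = set()
        for module in modules:
            node = f"precache:{module}"
            graph.setdefault(node, set())  # one shared rule per module
            deps.add(node)
        graph[f"compile:{tu}"] = deps      # TU waits for its modules only
    return graph


def schedule(tu_imports):
    """Return one valid execution order for the generated rules."""
    return list(TopologicalSorter(precache_graph(tu_imports)).static_order())
```

Each module gets exactly one "pre-cache" node no matter how many TUs
use it, so a TU never stalls on (or duplicates) a module build that the
scheduler could have started earlier. A tool that opts out simply never
generates the `precache:` nodes and lets the compiler build on demand.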

CMake probably wouldn't *need* to change, or would need only minor
changes, to support this mode; however, we might *want* to support
"pre-caching". This would work along the lines previously outlined,
*except* I believe it would avoid the bottleneck of having to collate
module output information, because the module map already provides that.

(¹ The term "build" is used rather loosely throughout.)

> On 2/8/19 12:59 PM, Corentin wrote:
>> Any project beyond a simple Hello World has some of its state in a
>> build system, and the only accurate way to build or parse a TU is to
>> ask the build system for the relevant info.
>
> I don't agree with that. It is common to configure tools with include
> paths and macro definitions gleaned from compilers without involving
> build systems.

I could possibly quibble with whether that's "accurate", but I think
it's moot. Either this comes from the build system, or the user provides
it by hand (or something else?). In whichever case, I don't see why
providing the information for module dependencies would be any more
difficult than providing the comparable info for includes. IOW, we
aren't making this any harder than today.

> If manifest files have to reflect a build configuration, then I think
> the problems we face have been made worse.

This is plausible, e.g. if some module doesn't exist or if a different
set of source files contribute to a module, depending on configuration.
However, I see two possible fixes:

- Use the preprocessor on module maps.
- Treat such module maps as generated source files.

The latter means that some of your tooling won't work until you build,
but that is already true for generated headers, so IMHO we are no worse
off than today. IOW, I don't see a problem.
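
For the generated-file option, a configure-time step could be as small
as the sketch below. It emits the basic Clang module map form (a
`module` block with `header` lines and `export *`); the module name,
header names and the `with_tls` option are made up for illustration.

```python
def headers_for(config):
    """Pick the headers that contribute to the module for this config.

    `config` is a plain dict of build options (hypothetical).
    """
    headers = ["net_core.h"]
    if config.get("with_tls"):
        headers.append("net_tls.h")  # only present in TLS-enabled builds
    return headers


def render_module_map(name, headers):
    """Return module map text for one module with the given headers."""
    lines = [f"module {name} {{"]
    lines += [f'  header "{h}"' for h in headers]
    lines += ["  export *", "}"]
    return "\n".join(lines) + "\n"
```

Calling `render_module_map("Net", headers_for({"with_tls": True}))`
yields a map that lists both headers, while an empty configuration
yields one that lists only `net_core.h`. The build system would write
the result out exactly as it already does for a generated header.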

-- 
Matthew

Received on 2019-02-08 20:35:13