Re: P2898R0: Importable Headers are Not Universally Implementable

From: Gabriel Dos Reis <gdr_at_[hidden]>
Date: Tue, 30 May 2023 20:56:29 +0000
[Mathias]

  * I think part of our disagreement is about what percentage of that cost is due to header units. I think it is a small % of the overall cost of supporting modules, but you are implying that it is a high %, possibly even the majority of the cost. I would be very surprised if that is the case.

As an implementer, owner of the toolset that has the most complete implementation of C++20 Modules+Header Units, I can positively report that header units represent a much higher proportion of the cost of the implementation, both in the compiler and in build system support. I am also not at all in favor of header unit removal.

-- Gaby

From: SG15 <sg15-bounces_at_lists.isocpp.org> On Behalf Of Mathias Stearn via SG15
Sent: Tuesday, May 30, 2023 1:05 PM
To: Daniel Ruoso <daniel_at_[hidden]>
Cc: Mathias Stearn <redbeard0531+isocpp_at_gmail.com>; sg15_at_[hidden]
Subject: Re: [SG15] P2898R0: Importable Headers are Not Universally Implementable



On Tue, May 30, 2023 at 6:17 PM Daniel Ruoso <daniel_at_[hidden]> wrote:
On Tue, May 30, 2023 at 11:28, Mathias Stearn <redbeard0531+isocpp_at_[hidden]> wrote:
Please don't move the goalposts. You asked me what was needed to support importable headers *in ninja*, so that is what I provided. This is especially notable in the context where I was saying that different build strategies will apply to different build systems (specifically the execution component, like make/ninja/msbuild). *Of course* a strategy that was specifically designed for ninja to take advantage of restat will not work unmodified with make, which doesn't have it.

Sorry, I took too many steps. I skipped acknowledging that this would work in ninja (which you educated me on). And moved on to the next point, which is the impact of requiring a feature like what ninja does.

So I did a bit of digging and I now think it is possible to get restat-like behavior out of make. The "trick" is to use a separate recursive make invocation for scanning vs building. So for example, in your user-facing Makefile, you will have entries like this:

all:
    $(MAKE) -f Makefile2 scan_all
    $(MAKE) -f Makefile2 build_all
.PHONY : all

The scan_all task will do all of the scanning (possibly by deferring to sub-tasks if you aren't using a megascan approach) and then exit that instance of make. The build_all task will run with a new instance of make, so it won't see that the scanning tasks were considered dirty and will only look at the mtimes of the outputs. If the scan_all task (and its dependencies) only touches output files whose contents have changed, then the downstream build tasks will not need to rerun.

Note that you will want your internal build_foo tasks to depend on the internal build_bar tasks rather than the user-facing bar tasks. This will ensure that you only end up with make invoking itself (and doing all stat calls) twice, rather than many times.

This should work with POSIX make. There may be some extensions in GNU make to make this easier or more efficient. I assume that most non-make build systems that support recursively executing themselves will support the same technique.
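For concreteness, a hypothetical Makefile2 implementing the two-phase scheme described above might look something like this. The `scanner` command and the file names are placeholders, not real tools, and the pattern rules are a GNU make convenience rather than strict POSIX (recipe lines must be indented with tabs):

```make
# Makefile2 -- hypothetical sketch of the two-phase scheme.
# Phase 1: `make -f Makefile2 scan_all` runs the scanner, but replaces each
# .ddi output only when its content actually changed (restat-like behavior).
# Phase 2: `make -f Makefile2 build_all` is a fresh make instance, so it
# judges staleness purely by output mtimes and skips unaffected compiles.

SRCS := foo.cpp bar.cpp

scan_all: $(SRCS:.cpp=.ddi)
.PHONY: scan_all

# Write scan results to a temp file; move it over the old output only if
# it differs, so downstream mtime comparisons see no change on a no-op.
%.ddi: %.cpp
	scanner $< -o $@.tmp        # 'scanner' is a stand-in for a dep scanner
	cmp -s $@.tmp $@ || mv $@.tmp $@
	rm -f $@.tmp

build_all: $(SRCS:.cpp=.o)
.PHONY: build_all

%.o: %.cpp %.ddi
	$(CXX) -c $< -o $@
```

Because phase 2 starts from a clean make instance, an unchanged .ddi file (same mtime) means the corresponding .o is considered up to date even though the scan rule ran.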

It essentially means that `make` is no longer a viable implementation to drive a C++ build (not even generated make, as CMake does). As I mentioned before, we use ninja for development builds, but it doesn't satisfy all our requirements for the production builds (e.g.: https://github.com/ninja-build/ninja/issues/1139 ).
You also said that you'd be willing to move to ninja if they took that patch. Given that you have a working (I assume?) patch that meets your needs, I suspect that using a patched ninja is a tiny amount of work relative to the other build system work required to support modules.

Yes. I'm looking at the ecosystem as a whole. This is a whole lot of work for a whole lot of people, if they all have to convert from one build system to another as part of this process.

If you are directly using make, rather than using something like cmake to generate your makefiles, then I suspect that teaching make to understand modules and get correct and precise builds will be a similar amount of effort to porting off of make. Unless of course gmake gets a built-in feature to build modules.

And that's the crux of what the paper is pointing at: Only a subset of build systems in use today will be able to viably support Importable Headers, which will exclude systems where Named Modules could be implemented.
I'm still not sure how any of this is specific to header units. The scan step is already required for named modules, and if you have a single megascan as a normal (ie not order-only) dep of every compile, AND you are unable to do the equivalent of restat=1, it is going to hurt.

The list of named modules is not an input to the dependency scan, because they don't affect the preprocessor state.

The list of header units and their local preprocessor arguments have to be an input to the dependency scan ("megascan" or not). Which means when the list changes it invalidates the entire build (for a build system that doesn't know how to stop the invalidation from propagating).

Sure, but if you are doing a megascan then every source file in the build is already an input to the scan task. And I assume you are editing source files a lot more often than you are changing the list of header units.

If you are doing many microscans, and you don't want to rely on the scanner to do its own caching for fast rescans, then you can use a modified form of the above technique with a `scandeps_all` step before the `scan` step. That step would look at the dynamic dependencies of each scan (i.e. which header files were actually included/imported) and only touch a `scan_foo.scandeps` file if the list of header units changed in a way that affected that scan, or if the file didn't exist. The `scan_foo` task would then be declared to depend on `scan_foo.scandeps` and NOT on the actual file containing the list of header units. If you are using make, you might already be doing something similar to detect when command lines have changed, in order to force granular rebuilds of exactly the tasks whose commands changed, which is another critical feature of ninja that make doesn't have built in.
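The core idiom in both the restat trick and the scandeps trick is "rewrite a file only when its content changes, so its mtime stays put on a no-op". A minimal sketch of that helper in portable shell (the `update_if_changed` name and the `scan_foo.scandeps`/`ref` file names are made up for illustration):

```shell
#!/bin/sh
# update_if_changed OUT: read new content from stdin and replace OUT only
# when the content differs, preserving OUT's mtime on a no-op rewrite.
update_if_changed() {
  out="$1"
  tmp="$out.tmp"
  cat > "$tmp"
  if [ -f "$out" ] && cmp -s "$tmp" "$out"; then
    rm "$tmp"            # identical content: keep the old file and mtime
  else
    mv "$tmp" "$out"     # new or changed content: bump the mtime
  fi
}

# Demo: the first write creates the file; an identical rewrite leaves
# its (backdated) mtime alone, so 'ref' stays strictly newer.
printf 'a.h b.h\n' | update_if_changed scan_foo.scandeps
touch -t 200001010000 scan_foo.scandeps   # backdate so a bump would show
touch ref
printf 'a.h b.h\n' | update_if_changed scan_foo.scandeps
```

A build rule that routes its output through such a helper gets restat-like behavior out of any mtime-based tool, because downstream consumers compare against an mtime that only moves when the content does.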


I guess the fundamental question is whether we have consensus on "GNU Make is no longer a viable C++ build driver". I, personally, strongly oppose that direction.
Personally, I'd love it if we could decouple from broken build systems of the past. There is so much fundamentally wrong with make that I'd love to see the adoption of modules, and the need to do build system work anyway, be used as a reason to wholesale move off of it. I'm not saying that it *can't* be made to work, or even work well, I'm just saying that we should instead invest any effort in less fundamentally broken tools.

And it may be the case that this is the direction that we'll go, however I'd like us to be very explicit about it. After all there are tons of projects using a variety of build systems, and we'd be creating a whole lot of costs for those people.

My point is that introducing modules *already* is creating a lot of costs. We have already accepted this, and I don't think we would consider backing out of modules. I think part of our disagreement is about what percentage of that cost is due to header units. I think it is a small % of the overall cost of supporting modules, but you are implying that it is a high %, possibly even the majority of the cost. I would be very surprised if that is the case. However, given that I don't think we have any implementations of scanners and compilers that are production ready (or if there are, I haven't personally tried to use them yet), I must admit that this is more of a gut feeling rather than based on evidence.


daniel

Received on 2023-05-30 20:56:33