ISOCPP SG15 List: Re: [P3033] Should we import function bodies to get better optimizations?

From: Chuanqi Xu <chuanqi.xcq_at_[hidden]>
Date: Wed, 01 Nov 2023 16:22:58 +0800

Hi Mathias,
Thanks for your valuable reply!
> Can't you keep the 2-phase model for parallelism AND have minimal rebuilds? ... You could also produce 2 BMIs, one with just the interface for importers, and one with preparsed function bodies for phase 2.
Great idea! I had never thought of that approach. I'll add it to the TODO list. The user interface is also easy to design: we can add a new flag `-freduced-BMI-output=<path>`. Then, no matter which compilation model we're in (one-phase or two-phase), the compiler can write the reduced BMI to the specified location. The interface is easy to adopt in CMake, and it also leaves room for CMake to support the two-phase compilation model in the future.
> I suppose an alternative would be for compilers to store their own metadata about which specific functions were considered and their hash, so that they can no-op themselves (like ccache on steroids) if they detect that there is no change. While the build system would still invoke the compiler, it would be nearly free as long as you only changed bodies not transitively called by that TU. To me, that seems like the gold standard, although it is radically different from what we do today.
I think the compiler still needs to parse the function to get the hash, so it might not be a no-op, IIUC. Am I misunderstanding?
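To make sure we are talking about the same thing, here is a minimal sketch of how I read the idea. All names (`BodyHashes`, `Bodies`, `hash_body`, `can_skip_recompile`) are hypothetical, and hashing the raw body text is only a stand-in; this is not how clang or any other compiler works today.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical record stored next to a TU's output: for each imported
// function whose body the compiler actually looked at (e.g. as an inlining
// candidate), the hash of that body at the time of the last compilation.
using BodyHashes = std::map<std::string, std::size_t>;

// Stand-in for "the current bodies visible through the imported BMIs".
using Bodies = std::map<std::string, std::string>;

// Assumption: hash the body text. A real compiler would more likely hash a
// normalized token stream or AST, which needs at least lexing or parsing.
inline std::size_t hash_body(const std::string& body) {
    return std::hash<std::string>{}(body);
}

// True if the previous output can be reused: every body this TU considered
// last time still exists and still hashes to the recorded value.
inline bool can_skip_recompile(const BodyHashes& recorded, const Bodies& current) {
    for (const auto& entry : recorded) {
        const auto it = current.find(entry.first);
        if (it == current.end() || hash_body(it->second) != entry.second) {
            return false;
        }
    }
    return true;
}

// Usage example: the TU only considered f, so a change to g is a no-op.
inline void example() {
    Bodies v1{{"f", "return 1;"}, {"g", "return 2;"}};
    BodyHashes recorded{{"f", hash_body(v1.at("f"))}};

    Bodies v2 = v1;
    v2["g"] = "return 3;";  // body not considered by this TU
    assert(can_skip_recompile(recorded, v2));

    v2["f"] = "return 4;";  // body that was considered has changed
    assert(!can_skip_recompile(recorded, v2));
}
```

Even in this toy version, `hash_body` has to run over every recorded body on each invocation, which is the cost I am asking about above.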
> I'd rather we leave the option of whether I care more about rebuilds and ABI stability vs optimization to the end project doing the compilation rather than mandating a specific set of tradeoffs in WG21/SG15.
My understanding is that SG15 is a place where vendors can try to reach consensus. Once vendors reach consensus, users get a uniform experience. That is the point. But yes, if SG15 can't reach consensus, it is not too bad that users need to remember and understand the different models of the tools. That is already the case now : |
My current position on the question of `faster rebuilds and ABI stability vs optimization` is that I prefer faster rebuilds. I've seen at least 3 people say that it can be a disaster to touch a module unit deep in the dependency chain. Headers are better in this case, since all the .cpp files can be compiled in parallel. So they complain that they don't feel modules are better than headers : (
> I'd even like to leave the door open to compilers implicitly inlining any function from imported units if its heuristics think it is likely to benefit from inlining. I've been told that compilers don't have a good way to know today at phase 1 whether it will be beneficial to make any given function body available to importers. But I really hope within a few years they can learn some simple heuristics so that 90%+ of the time developers don't need to worry about manually deciding what to mark as inline, and only need to use it when the heuristics guess incorrectly. To me, manually marking inline in modules feels almost as silly as telling the compiler what to put in registers. (In headers it serves a purpose as a misspelled in_header keyword)
Yeah, totally agreed. It may be implemented by combining techniques from PGO or LTO. But it may not happen this year or the next year : )
Thanks,
Chuanqi
------------------------------------------------------------------
From:Mathias Stearn <redbeard0531_at_[hidden]>
Send Time:2023 Nov. 1 (Wed.) 15:10
To:Mathias Stearn via SG15 <sg15_at_[hidden]>
Cc:Chuanqi <chuanqi.xcq_at_[hidden]>
Subject:Re: [SG15] [P3033] Should we import function bodies to get better optimizations?
Can't you keep the 2-phase model for parallelism AND have minimal rebuilds? It just means that the input to phase 2 is both the raw source file and the BMI. You could also produce 2 BMIs, one with just the interface for importers, and one with preparsed function bodies for phase 2, so you don't have to throw away any work spent on parsing. Ideally this would also take the source file as input so that the precompile can stop immediately when it hits the private module fragment, leaving the rest of the parsing to phase 2, unlocking importers ASAP. It may even make sense to have 3 BMIs, splitting the one for importers into one with pure interface (including constexpr and template bodies) and one with the other inline function bodies. Then even if you change an inline function body, consumers with -O0 or -fno-inline won't need to rebuild. This would also allow importers to skip phase 1 rebuilds if only the bodies of importers changed and go straight to phase 2 (potentially saving transitive importers rebuilds!)
I suppose an alternative would be for compilers to store their own metadata about which specific functions were considered and their hash, so that they can no-op themselves (like ccache on steroids) if they detect that there is no change. While the build system would still invoke the compiler, it would be nearly free as long as you only changed bodies not transitively called by that TU. To me, that seems like the gold standard, although it is radically different from what we do today.
I'd rather we leave the option of whether I care more about rebuilds and ABI stability vs optimization to the end project doing the compilation rather than mandating a specific set of tradeoffs in WG21/SG15. I'd even like to leave the door open to compilers implicitly inlining any function from imported units if its heuristics think it is likely to benefit from inlining. I've been told that compilers don't have a good way to know today at phase 1 whether it will be beneficial to make any given function body available to importers. But I really hope within a few years they can learn some simple heuristics so that 90%+ of the time developers don't need to worry about manually deciding what to mark as inline, and only need to use it when the heuristics guess incorrectly. To me, manually marking inline in modules feels almost as silly as telling the compiler what to put in registers. (In headers it serves a purpose as a misspelled in_header keyword)
I know there are use cases for preventing inlining for ABI resilience. However, those use cases can always be solved by not making the body available to the compiler at all by putting them in a non-imported implementation file rather than a module unit transitively imported by the primary. That should guarantee that there will be no inlining, as long as you don't make object files available for LTO. This is how it has worked before modules, and I don't think we should be favoring that use case with easier syntax over end developers who want their compilers to give them good perf automatically.
On Wed, 1 Nov 2023, 06.10 Chuanqi Xu via SG15, <sg15_at_[hidden]> wrote:
Hi,
See https://github.com/llvm/llvm-project/issues/60996 for the motivating issue.
I found clang's current behavior when exposing imported functions to the optimizer interesting.
At O0, clang won't generate the bodies of non-inline functions from imported modules. But with optimization enabled, clang generates the bodies of non-inline functions so that they can be inlined. Daniel pointed out that this is not only an implementation choice but is also related to library ABI and ODR violations. He suggested writing a paper to present the problem clearly and to try to reach consensus among vendors. Here is the paper: https://isocpp.org/files/papers/P3033R0.html
Since I won't be in Kona in person, it may be better to send opinions/comments/questions to the thread directly.
Thanks,
Chuanqi
_______________________________________________
SG15 mailing list
SG15_at_[hidden]
https://lists.isocpp.org/mailman/listinfo.cgi/sg15

Received on 2023-11-01 08:23:09