Re: A Different Approach To Compiling C++

From: Hassan Sajjad <hassan.sajjad069_at_[hidden]>
Date: Sun, 3 Sep 2023 04:57:24 +0500
Hi.

> First of all, great job on your writeups. They are very well considered. I
> appreciate how much effort you're putting into communicating your ideas and
> developing better C++ tools.
>

Thank you. This compliment means a lot. It would be an immense pleasure if
I or my project could be helpful in improving C++ tooling.

> I think your design could work
>

This was so good to hear :). Thank you again.


> though I expect the node identities of the imported entities need to be
> tweaked to better model a combination of the file and the flags used to
> parse the file. The build system would ideally know when to just reuse a
> particular parse of a particular entity, especially when optimizing for
> build speed as you are.
>

With the new modifications my build system now covers 50% of the use case
you have described in your email. Please see the link
https://github.com/HassanSajjad-302/Example12/blob/main/hmake.cpp.

The file might look long, but without the comments I think it is quite
concise for what it achieves. It covers scenarios where, in a mono-repo
style project, a dependency is built with different flags than its consumer,
e.g. when the dependency supports only C++20 but the project uses C++23, or
when a dependency has to be built with a different compiler for performance
reasons.
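
The node identity discussed above (a file combined with the flags used to
parse it) might be modelled as in the following sketch. ParseKey and BmiCache
are hypothetical names used only for illustration, not HMake's actual API, and
the sketch assumes the flags have already been reduced to those that affect
the parse and are stored in a canonical order.

```cpp
#include <map>
#include <optional>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical node identity: an importable file plus the flags used to parse it.
struct ParseKey {
    std::string file;                // module interface or header unit
    std::vector<std::string> flags;  // parse-affecting flags, in a canonical order

    // Lexicographic ordering over (file, flags) so the key can be used in std::map.
    bool operator<(const ParseKey& other) const {
        return std::tie(file, flags) < std::tie(other.file, other.flags);
    }
};

// Hypothetical cache mapping each distinct parse to the ifc file it produced.
class BmiCache {
public:
    // Returns the ifc path if this exact (file, flags) pair was already parsed.
    std::optional<std::string> find(const ParseKey& key) const {
        auto it = bmis_.find(key);
        if (it == bmis_.end()) return std::nullopt;
        return it->second;
    }

    // Remembers the ifc file produced for this (file, flags) pair.
    void record(const ParseKey& key, std::string ifcPath) {
        bmis_[key] = std::move(ifcPath);
    }

private:
    std::map<ParseKey, std::string> bmis_;
};
```

With such a key, the same file parsed with -std=c++20 and with -std=c++23
becomes two separate nodes, while a repeated request with identical flags
reuses the recorded ifc file.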

However, this is a manual approach, and we cannot detect whether the ifc
files from a prebuilt target are compatible with the target we are
building.

This paper
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2581r2.pdf
provides good guidance on how this can be achieved in the build system.

The paper states:

> Those categories, however, represent higher-level semantics. It is not the
> case that the build system can introspect the command line used to produce
> a BMI on another project and decide which arguments fall on which of the
> categories. They need to be authored by the engineer maintaining the build
> system.


> This paper proposes that build systems should offer a mechanism to identify
> which of the options used in a translation unit are a Local Preprocessor
> Argument.


It also states:

> The scope of compatibility for consuming an existing built module interface
> file is defined by the union of the Basic Toolchain Configuration Arguments
> and the BMI-Sensitive arguments. And it should explicitly exclude Local
> Preprocessor Arguments. Those arguments are the ones that should be used to
> define the compatibility identifier.


> The build system should take the compiler invocation for a translation
> unit, remove the Local Preprocessor Arguments and the reference to the
> specific translation unit and invoke the compiler in that mode in order to
> obtain the compatibility identifier for BMIs produced and consumed by that
> translation unit.
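
As a rough illustration of the procedure in the last quoted paragraph, the
sketch below strips the Local Preprocessor Arguments and the translation unit
from a command line and derives an identifier from what remains. Two things
here are assumptions rather than anything the paper specifies: the Local
Preprocessor Arguments are taken to be single-token -D, -U and -I options,
and a 64-bit FNV-1a hash stands in for the identifier that the paper expects
the compiler itself to report. All function names are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Assumption: Local Preprocessor Arguments are GCC/Clang-style -D, -U and -I
// options written as a single token (e.g. -DFOO=1, -Iinclude).
bool isLocalPreprocessorArg(const std::string& arg) {
    return arg.rfind("-D", 0) == 0 || arg.rfind("-U", 0) == 0 ||
           arg.rfind("-I", 0) == 0;
}

// Drops the Local Preprocessor Arguments and the translation unit itself,
// keeping every other argument in its original order.
std::vector<std::string> compatibilityArgs(const std::vector<std::string>& command,
                                           const std::string& translationUnit) {
    std::vector<std::string> kept;
    for (const std::string& arg : command) {
        if (arg == translationUnit || isLocalPreprocessorArg(arg)) continue;
        kept.push_back(arg);
    }
    return kept;
}

// Stand-in for the compiler-reported identifier: a 64-bit FNV-1a hash over the
// remaining arguments. The result depends on argument order.
std::uint64_t compatibilityId(const std::vector<std::string>& args) {
    std::uint64_t hash = 14695981039346656037ull;  // FNV-1a offset basis
    for (const std::string& arg : args) {
        for (unsigned char c : arg) {
            hash ^= c;
            hash *= 1099511628211ull;  // FNV-1a prime
        }
        hash ^= 0xff;  // separator byte so argument boundaries affect the hash
        hash *= 1099511628211ull;
    }
    return hash;
}
```

Two translation units compiled with the same -std= level but different -D and
-I options then map to the same identifier, whereas changing -std= changes it.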


I was thinking that, instead of changing the build system to expose an API
that lets the user specify flags in the different categories, this job could
also be given to the compiler. That is, the compiler, when invoked with a
specific option and the full compile command excluding input files, outputs:
1) Concatenation of Basic Toolchain Configuration Arguments and
BMI-Sensitive Arguments.
2) Compiler Definitions
3) Include Directories
4) Other Arguments
5) Compatibility Identifier

Following the example mentioned in the paper, I broke Local Preprocessor
Arguments into 2 and 3. Can there be more besides these? Also, maybe 5 can
be left out and 1 can be used instead.
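
The five outputs listed above could take a shape like the one sketched below.
Everything in it is hypothetical: no compiler offers such an option today, the
struct and function names are invented, and the prefix matching is only a toy
stand-in for the compiler's own knowledge of its flags. In particular,
treating every -std= and -f option as belonging to category 1 is an
assumption made to keep the example short.

```cpp
#include <string>
#include <vector>

// Hypothetical result of asking the compiler to categorize a command line.
struct CategorizedFlags {
    std::vector<std::string> toolchainAndBmiSensitive;  // 1)
    std::vector<std::string> definitions;               // 2)
    std::vector<std::string> includeDirectories;        // 3)
    std::vector<std::string> other;                     // 4)
    std::string compatibilityIdentifier;                // 5)
};

static bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Toy classifier for GCC/Clang-style single-token flags (an assumption).
CategorizedFlags categorize(const std::vector<std::string>& flags) {
    CategorizedFlags out;
    for (const std::string& flag : flags) {
        if (startsWith(flag, "-D") || startsWith(flag, "-U")) {
            out.definitions.push_back(flag);
        } else if (startsWith(flag, "-I")) {
            out.includeDirectories.push_back(flag);
        } else if (startsWith(flag, "-std=") || startsWith(flag, "-f")) {
            out.toolchainAndBmiSensitive.push_back(flag);
        } else {
            out.other.push_back(flag);
        }
    }
    // Derive 5) directly from 1) by joining the category-1 flags with spaces.
    for (const std::string& flag : out.toolchainAndBmiSensitive) {
        if (!out.compatibilityIdentifier.empty()) out.compatibilityIdentifier += ' ';
        out.compatibilityIdentifier += flag;
    }
    return out;
}
```

Deriving 5 from 1 here reflects the idea that a separate compatibility
identifier may be unnecessary if category 1 is reported on its own.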

Relieving the build system of the responsibility for categorizing flags
keeps its user-facing API more concise, since the user does not need to
specify the compiler flags in 4 different categories instead of 1.

It seems that there is acceptance regarding the dynamic loading of the
compiler shared-library and the mentioned API. Would it be appropriate to
prepare a formal proposal for consideration?

Best,
Hassan Sajjad

On Sat, Aug 26, 2023 at 3:48 AM Bret Brown <mail_at_[hidden]> wrote:

> Hi Hassan,
>
> First of all, great job on your writeups. They are very well considered. I
> appreciate how much effort you're putting into communicating your ideas and
> developing better C++ tools.
>
> I have one concern with the last design you shared with us. As far as I
> understand currently, we expect that both imported header units and public
> module interfaces will need to be parsed multiple times per build graph.
> This has been discussed in the ISO Tooling Study Group a bit already, so
> it's not a new idea. In short, it's expected that in typical scenarios, the
> build system would need to model multiple parses of a particular importable
> entity. This is because, unfortunately, some compilation options of the
> importing translation unit will need to be used to parse the imported
> translation unit.
>
> I think your design could work, though I expect the node identities of the
> imported entities need to be tweaked to better model a combination of the
> file and the flags used to parse the file. The build system would ideally
> know when to just reuse a particular parse of a particular entity,
> especially when optimizing for build speed as you are.
>
> Daniel is hinting at this challenge upthread in his reference to
> preprocessor state. You can read more about that in his ISO papers that
> have been published in the last year, including https://wg22.link/p2898.
> Daniel also elaborated on this with diagrams in his C++Now talk this
> spring. You can find it here:
> https://youtu.be/_LGR0U5Opdg?si=H92b0eyGQd8Vsr0v. Note that Daniel said
> upthread that his concerns are satisfied by some new ideas that were
> discussed in Varna. But the need for build systems to model multiple parses
> per imported interface remains given current understanding.
>
> I suspect certain codebases that are carefully governed might have exactly
> one parse per imported unit, but a build system that wants to support
> things like existing package management ecosystems, among other examples,
> would need to consider a more complicated approach.
>
> Bret Brown
>
>
> On Thu, Aug 24, 2023, 23:53 Hassan Sajjad via SG15 <sg15_at_[hidden]>
> wrote:
>
>>> Thanks for reaching out. I have gone through the thread and I am
>>> still a little confused probably because the approach is quite different
>>> from what we have been used to.
>>>
>>
>> Thank you so much for commenting. Yes, it is a quite different approach.
>>
>>> It seems that there are two aspects that you are proposing: a) the
>>> way the build system describes the build rules; b) how we can
>>> efficiently translate the build rules into action. IIUC in some
>>> languages (duck typing predominantly) they use the language itself to
>>> describe the rules, too.
>>>
>>
>> I just wanted to let you know that I am not commenting on what language
>> the build-system should use. I am just proposing a new approach with the
>> potential of a good speed-up. I am only partially aware of the other
>> build-systems design, specifically the design around C++20 modules /
>> header-units. Because this is a different approach, non-trivial changes
>> might be needed in other build-systems to support it.
>>
>>> The second part is more interesting to me. AFAICT your approach
>>> inverts the build graph /somehow/. Is the aim to provide a "symbol
>>> registry" which upon of the symbol to compile the relevant subgraph? I'd
>>> appreciate if you could describe more verbosely (with examples maybe)
>>> how the process works.
>>>
>>
>> My pleasure.
>>
>> Please see the attached document. In it, I analyze how I plan to add
>> support for this in my build-system. I also provide an example. Based on
>> the analysis, I am confident in my ability to implement this within one
>> month if it's approved and a compiler with an API becomes available. While
>> I currently see no issues, I acknowledge the possibility of limitations and
>> potential errors. I am currently awaiting feedback from other stakeholders
>> before proceeding.
>>
>> A slight correction in my above email. I mentioned that my build-system
>> now is limitation-free with the adoption of the new consensus. Well, it is
>> for a clean build. But there is a small bug for rebuild which will be fixed
>> soon.
>>
>> Best,
>> Hassan Sajjad
>>
>> On Tue, Aug 22, 2023 at 8:49 PM Vassil Vassilev <v.g.vassilev_at_[hidden]>
>> wrote:
>>
>>> Hi Hassan,
>>>
>>> Thanks for reaching out. I have gone through the thread and I am
>>> still a little confused probably because the approach is quite different
>>> from what we have been used to.
>>>
>>> It seems that there are two aspects that you are proposing: a) the
>>> way the build system describes the build rules; b) how we can
>>> efficiently translate the build rules into action. IIUC in some
>>> languages (duck typing predominantly) they use the language itself to
>>> describe the rules, too.
>>>
>>> The second part is more interesting to me. AFAICT your approach
>>> inverts the build graph /somehow/. Is the aim to provide a "symbol
>>> registry" which upon of the symbol to compile the relevant subgraph? I'd
>>> appreciate if you could describe more verbosely (with examples maybe)
>>> how the process works.
>>>
>>> Best, Vassil
>>>
>>> On 7/29/23 3:28 PM, Hassan Sajjad via SG15 wrote:
>>> > Hi.
>>> >
>>> > I would like to showcase my build-system HMake
>>> > https://github.com/HassanSajjad-302/HMake.
>>> >
>>> > It has C++20 modules and header-units support. It also supports
>>> > drop-in header-files to header-units replacement.
>>> >
>>> > Using it with MSVC, I compiled an SFML example with C++20 header-units.
>>> > https://github.com/HassanSajjad-302/SFML
>>> >
>>> > HMake however has a flaw and there is no easy solution for it. To fix
>>> > this, I would like to propose a new way to compile source files.
>>> >
>>> https://github.com/HassanSajjad-302/HMake/wiki/HMake-Flaw-And-Its-Possible-Fixes
>>> >
>>> > I am very confident that the adoption of this will result in flawless
>>> > module and header-unit support in HMake which will translate to a very
>>> > good user experience while converting their code base to C++20 modules.
>>> >
>>> > Please share your thoughts.
>>> >
>>> > Best,
>>> > Hassan Sajjad
>>> >
>>> > _______________________________________________
>>> > SG15 mailing list
>>> > SG15_at_[hidden]
>>> > https://lists.isocpp.org/mailman/listinfo.cgi/sg15
>>>
>>>
>>
>

Received on 2023-09-02 23:57:33