sg15

Re: A Different Approach To Compiling C++

From: Hassan Sajjad <hassan.sajjad069_at_[hidden]>
Date: Tue, 5 Sep 2023 09:46:07 +0500
Hi.

Please share your thoughts on this.

Best,
Hassan Sajjad


On Sun, Sep 3, 2023, 04:57 Hassan Sajjad <hassan.sajjad069_at_[hidden]> wrote:

> Hi.
>
>> First of all, great job on your writeups. They are very well considered. I
>> appreciate how much effort you're putting into communicating your ideas and
>> developing better C++ tools.
>>
>
> Thank you. This compliment means a lot. It would be an immense pleasure if
> I or my project could be helpful in improving C++ tooling.
>
>> I think your design could work
>>
>
> This was so good to hear :). Thank you again.
>
>
>> though I expect the node identities of the imported entities need to be
>> tweaked to better model a combination of the file and the flags used to
>> parse the file. The build system would ideally know when to just reuse a
>> particular parse of a particular entity, especially when optimizing for
>> build speed as you are.
>>
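The node identity described in the quoted paragraph can be pictured as a key that combines the file with the flags used to parse it. The following is only a minimal sketch under stated assumptions: `ParseNodeKey`, `ParseCache`, and `getOrBuild` are hypothetical names invented for illustration, not HMake's API, and the flags are assumed to have already been reduced to one canonical string.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <tuple>

// Hypothetical node identity: one importable entity parsed under one set of
// flags that affect the parse.
struct ParseNodeKey {
    std::string sourcePath;     // header unit or module interface file
    std::string parseFlagsKey;  // assumed canonical string of the relevant flags

    bool operator<(const ParseNodeKey& other) const {
        return std::tie(sourcePath, parseFlagsKey) <
               std::tie(other.sourcePath, other.parseFlagsKey);
    }
};

// Hypothetical cache from node identity to the BMI (e.g. ifc file) path.
class ParseCache {
public:
    // Returns the BMI path for `key`, calling `build` only the first time
    // this exact (file, flags) combination is requested.
    const std::string& getOrBuild(
        const ParseNodeKey& key,
        const std::function<std::string(const ParseNodeKey&)>& build) {
        auto it = bmis_.find(key);
        if (it == bmis_.end()) {
            it = bmis_.emplace(key, build(key)).first;
        }
        return it->second;
    }

    // Number of distinct parses recorded so far.
    std::size_t size() const { return bmis_.size(); }

private:
    std::map<ParseNodeKey, std::string> bmis_;
};
```

With such a key, two importers that agree on the relevant flags share one parse of a file, while an importer with different flags gets its own node.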
>
> With the new modifications my build system now covers 50% of the use case
> you have described in your email. Please see the link
> https://github.com/HassanSajjad-302/Example12/blob/main/hmake.cpp.
>
> This might appear big, but without comments, I think this is quite concise
> for what it is achieving. This covers the scenarios where in a mono-repo
> style project a dependency builds with different flags than the consumer
> e.g. when the dependency supports only C++20 but the project is using
> C++23, or when a dependency has to be built by a different compiler for
> performance reasons.
>
> However, this is a manual approach, and we cannot detect whether the ifc
> files from a prebuilt target are compatible for use in a target we are
> building or not.
>
> This paper
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2581r2.pdf
> provides good guidance on how this can be achieved in the build system.
>
> It is mentioned in the paper
>
>> Those categories, however, represent higher-level semantics. It is not the
>> case that the build system can introspect the command line used to produce
>> a BMI on another project and decide which arguments fall on which of the
>> categories. They need to be authored by the engineer maintaining the build
>> system.
>
>
>> This paper proposes that build systems should offer a mechanism to
>> identify which of the options used in a translation unit are a Local
>> Preprocessor Argument.
>
>
> It is also mentioned
>
>> The scope of compatibility for consuming an existing built module
>> interface file is defined by the union of the Basic Toolchain Configuration
>> Arguments and the BMI-Sensitive arguments. And it should explicitly exclude
>> Local Preprocessor Arguments. Those arguments are the ones that should be
>> used to define the compatibility identifier.
>
>
>> The build system should take the compiler invocation for a translation
>> unit, remove the Local Preprocessor Arguments and the reference to the
>> specific translation unit and invoke the compiler in that mode in order to
>> obtain the compatibility identifier for BMIs produced and consumed by that
>> translation unit.
>
>
> I was thinking that, instead of changing the build system to offer an API
> through which the user specifies the flags of each category, this job could
> also be given to the compiler. The compiler, when invoked with a specific
> option and the full compile command excluding input files, would output:
> 1) Concatenation of Basic Toolchain Configuration Arguments and
> BMI-Sensitive Arguments.
> 2) Compiler Definitions
> 3) Include Directories
> 4) Other Arguments
> 5) Compatibility Identifier
>
> Following the example mentioned in the paper, I broke Local Preprocessor
> Arguments into 2 and 3. Can there be more besides these? Also, maybe 5 can
> be left out and 1 can be used instead.
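The five outputs listed above could be modelled roughly as follows. This is a sketch under loud assumptions, not a real compiler interface: `CategorizedArguments` and `categorize` are hypothetical names, the classification rules (`-D`, `-I`, and `-std=` prefixes in their joined spelling) only stand in for knowledge the compiler itself would have, and a `std::hash` of category 1 stands in for whatever compatibility identifier a compiler would actually report.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical shape of the five outputs listed above.
struct CategorizedArguments {
    std::vector<std::string> toolchainAndBmiSensitive;  // output 1
    std::vector<std::string> definitions;               // output 2
    std::vector<std::string> includeDirectories;        // output 3
    std::vector<std::string> other;                     // output 4
    std::size_t compatibilityIdentifier = 0;            // output 5
};

static bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Illustrative stand-in for the compiler-side classification. Assumption:
// only "-std=" is treated as BMI-sensitive here, and only the joined forms
// "-DNAME" and "-Ipath" are recognized.
CategorizedArguments categorize(const std::vector<std::string>& args) {
    CategorizedArguments out;
    std::string joined;  // category 1 arguments, one per line
    for (const std::string& arg : args) {
        if (startsWith(arg, "-D")) {
            out.definitions.push_back(arg);
        } else if (startsWith(arg, "-I")) {
            out.includeDirectories.push_back(arg);
        } else if (startsWith(arg, "-std=")) {
            out.toolchainAndBmiSensitive.push_back(arg);
            joined += arg;
            joined += '\n';
        } else {
            out.other.push_back(arg);
        }
    }
    // The identifier is derived from category 1 only, so definitions and
    // include directories do not influence it.
    out.compatibilityIdentifier = std::hash<std::string>{}(joined);
    return out;
}
```

In this sketch the identifier is a pure function of category 1, which matches the remark that 5 could be left out and 1 used in its place.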
>
> Relieving the build-system of the responsibility for categorizing flags
> keeps its user-facing API more concise, as the user can specify the
> compiler flags in 1 category instead of 4.
>
> It seems that there is acceptance regarding the dynamic loading of the
> compiler shared-library and the mentioned API. Would it be appropriate to
> prepare a formal proposal for consideration?
>
> Best,
> Hassan Sajjad
>
> On Sat, Aug 26, 2023 at 3:48 AM Bret Brown <mail_at_[hidden]> wrote:
>
>> Hi Hassan,
>>
>> First of all, great job on your writeups. They are very well considered.
>> I appreciate how much effort you're putting into communicating your ideas
>> and developing better C++ tools.
>>
>> I have one concern with the last design you shared with us. As far as I
>> understand currently, we expect that both imported header units and public
>> module interfaces will need to be parsed multiple times per build graph.
>> This has been discussed in the ISO Tooling Study Group a bit already, so
>> it's not a new idea. In short, it's expected that in typical scenarios, the
>> build system would need to model multiple parses of a particular importable
>> entity. This is because, unfortunately, some compilation options of the
>> importing translation unit will need to be used to parse the imported
>> translation unit.
>>
>> I think your design could work, though I expect the node identities of
>> the imported entities need to be tweaked to better model a combination of
>> the file and the flags used to parse the file. The build system would
>> ideally know when to just reuse a particular parse of a particular entity,
>> especially when optimizing for build speed as you are.
>>
>> Daniel is hinting at this challenge upthread in his reference to
>> preprocessor state. You can read more about that in his ISO papers that
>> have been published in the last year, including https://wg21.link/p2898.
>> Daniel also elaborated on this with diagrams in his C++Now talk this
>> spring. You can find it here:
>> https://youtu.be/_LGR0U5Opdg?si=H92b0eyGQd8Vsr0v. Note that Daniel said
>> upthread that his concerns are satisfied by some new ideas that were
>> discussed in Varna. But the need for build systems to model multiple parses
>> per imported interface remains given current understanding.
>>
>> I suspect certain codebases that are carefully governed might have
>> exactly one parse per imported unit, but a build system that wants to
>> support things like existing package management ecosystems, among other
>> examples, would need to consider a more complicated approach.
>>
>> Bret Brown
>>
>>
>> On Thu, Aug 24, 2023, 23:53 Hassan Sajjad via SG15 <sg15_at_[hidden]>
>> wrote:
>>
>>>> Thanks for reaching out. I have gone through the thread and I am
>>>> still a little confused probably because the approach is quite different
>>>> from what we have been used to.
>>>>
>>>
>>> Thank you so much for commenting. Yes, it is a quite different approach.
>>>
>>>> It seems that there are two aspects that you are proposing: a) the
>>>> way the build system describes the build rules; b) how we can
>>>> efficiently translate the build rules into action. IIUC in some
>>>> languages (duck typing predominantly) they use the language itself to
>>>> describe the rules, too.
>>>>
>>>
>>> I just wanted to let you know that I am not commenting on what language
>>> the build-system should use. I am just proposing a new approach with the
>>> potential of a good speed-up. I am only partially aware of the other
>>> build-systems design, specifically the design around C++20 modules /
>>> header-units. Because this is a different approach, non-trivial changes
>>> might be needed in other build-systems to support it.
>>>
>>>> The second part is more interesting to me. AFAICT your approach
>>>> inverts the build graph /somehow/. Is the aim to provide a "symbol
>>>> registry" which, upon request of a symbol, compiles the relevant subgraph? I'd
>>>> appreciate if you could describe more verbosely (with examples maybe)
>>>> how the process works.
>>>>
>>>
>>> My pleasure.
>>>
>>> Please see the attached document. In it, I analyze how I plan to add
>>> support for this in my build-system. I also provide an example. Based on
>>> the analysis, I am confident in my ability to implement this within one
>>> month if it's approved and a compiler with an API becomes available. While
>>> I currently see no issues, I acknowledge the possibility of limitations and
>>> potential errors. I am currently awaiting feedback from other stakeholders
>>> before proceeding.
>>>
>>> A slight correction to my email above. I mentioned that my build-system
>>> is now limitation-free with the adoption of the new consensus. Well, it is
>>> for a clean build. But there is a small bug for rebuild which will be fixed
>>> soon.
>>>
>>> Best,
>>> Hassan Sajjad
>>>
>>> On Tue, Aug 22, 2023 at 8:49 PM Vassil Vassilev <v.g.vassilev_at_[hidden]>
>>> wrote:
>>>
>>>> Hi Hassan,
>>>>
>>>> Thanks for reaching out. I have gone through the thread and I am
>>>> still a little confused probably because the approach is quite different
>>>> from what we have been used to.
>>>>
>>>> It seems that there are two aspects that you are proposing: a) the
>>>> way the build system describes the build rules; b) how we can
>>>> efficiently translate the build rules into action. IIUC in some
>>>> languages (duck typing predominantly) they use the language itself to
>>>> describe the rules, too.
>>>>
>>>> The second part is more interesting to me. AFAICT your approach
>>>> inverts the build graph /somehow/. Is the aim to provide a "symbol
>>>> registry" which, upon request of a symbol, compiles the relevant subgraph?
>>>> I'd appreciate if you could describe more verbosely (with examples maybe)
>>>> how the process works.
>>>>
>>>> Best, Vassil
>>>>
>>>> On 7/29/23 3:28 PM, Hassan Sajjad via SG15 wrote:
>>>> > Hi.
>>>> >
>>>> > I would like to showcase my build-system HMake
>>>> > https://github.com/HassanSajjad-302/HMake.
>>>> >
>>>> > It has C++20 modules and header-units support. It also supports
>>>> > drop-in header-files to header-units replacement.
>>>> >
>>>> > Using it with MSVC, I compiled an SFML example with C++20 header-units.
>>>> > https://github.com/HassanSajjad-302/SFML
>>>> >
>>>> > HMake, however, has a flaw, and there is no easy solution for it. To fix
>>>> > this, I would like to propose a new way to compile source files.
>>>> >
>>>> https://github.com/HassanSajjad-302/HMake/wiki/HMake-Flaw-And-Its-Possible-Fixes
>>>> >
>>>> > I am very confident that the adoption of this will result in flawless
>>>> > module and header-unit support in HMake, which will translate to a very
>>>> > good user experience for those converting their code bases to C++20
>>>> > modules.
>>>> >
>>>> > Please share your thoughts.
>>>> >
>>>> > Best,
>>>> > Hassan Sajjad
>>>> >
>>>> > _______________________________________________
>>>> > SG15 mailing list
>>>> > SG15_at_[hidden]
>>>> > https://lists.isocpp.org/mailman/listinfo.cgi/sg15
>>>>
>>>>
>>>
>>

Received on 2023-09-05 04:46:22