sg15

Re: A Different Approach To Compiling C++

From: Hassan Sajjad <hassan.sajjad069_at_[hidden]>
Date: Mon, 11 Sep 2023 19:02:39 +0500
Hi.

I emailed this https://lists.isocpp.org/sg15/2023/09/2040.php a few days
ago.
I am looking forward to your review.

Best,
Hassan Sajjad

On Tue, Sep 5, 2023, 09:46 Hassan Sajjad <hassan.sajjad069_at_[hidden]> wrote:

> Hi.
>
> Please share your thoughts on this.
>
> Best,
> Hassan Sajjad
>
>
> On Sun, Sep 3, 2023, 04:57 Hassan Sajjad <hassan.sajjad069_at_[hidden]>
> wrote:
>
>> Hi.
>>
>>> First of all, great job on your writeups. They are very well considered.
>>> I appreciate how much effort you're putting into communicating your ideas
>>> and developing better C++ tools.
>>>
>>
>> Thank you. This compliment means a lot. It would be an immense pleasure
>> if I or my project could be helpful in improving C++ tooling.
>>
>>> I think your design could work
>>>
>>
>> This was so good to hear :). Thank you again.
>>
>>
>>> though I expect the node identities of the imported entities need to be
>>> tweaked to better model a combination of the file and the flags used to
>>> parse the file. The build system would ideally know when to just reuse a
>>> particular parse of a particular entity, especially when optimizing for
>>> build speed as you are.
>>>
>>
>> With the new modifications my build system now covers 50% of the use case
>> you have described in your email. Please see the link
>> https://github.com/HassanSajjad-302/Example12/blob/main/hmake.cpp.
>>
>> This might appear long, but without the comments I think it is quite
>> concise for what it achieves. It covers scenarios in a mono-repo style
>> project where a dependency builds with different flags than the
>> consumer, e.g. when the dependency supports only C++20 but the project
>> uses C++23, or when a dependency has to be built by a different compiler
>> for performance reasons.
>>
>> However, this is a manual approach, and we cannot detect whether the ifc
>> files from a prebuilt target are compatible with the target we are
>> building.
>>
>> This paper
>> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2581r2.pdf
>> provides good guidance on how this can be achieved in the build system.
>>
>> It is mentioned in the paper
>>
>>> Those categories, however, represent higher-level semantics. It is not
>>> the case that the build system can introspect the command line used to
>>> produce a BMI on another project and decide which arguments fall on which
>>> of the categories. They need to be authored by the engineer maintaining the
>>> build system.
>>
>>
>>> This paper proposes that build systems should offer a mechanism to
>>> identify which of the options used in a translation unit are a Local
>>> Preprocessor Argument.
>>
>>
>> It is also mentioned
>>
>>> The scope of compatibility for consuming an existing built module
>>> interface file is defined by the union of the Basic Toolchain Configuration
>>> Arguments and the BMI-Sensitive arguments. And it should explicitly exclude
>>> Local Preprocessor Arguments. Those arguments are the ones that should be
>>> used to define the compatibility identifier.
>>
>>
>>> The build system should take the compiler invocation for a translation
>>> unit, remove the Local Preprocessor Arguments and the reference to the
>>> specific translation unit and invoke the compiler in that mode in order to
>>> obtain the compatibility identifier for BMIs produced and consumed by that
>>> translation unit.
>>
>>
>> I was thinking that, instead of changing the build system to have an API
>> that lets the user specify flags in different categories, this job could
>> be given to the compiler. When invoked with a specific option and the
>> full compile command (excluding input files), the compiler would output:
>> 1) Concatenation of Basic Toolchain Configuration Arguments and
>> BMI-Sensitive Arguments.
>> 2) Compiler Definitions
>> 3) Include Directories
>> 4) Other Arguments
>> 5) Compatibility Identifier
>>
>> Following the example mentioned in the paper, I broke Local Preprocessor
>> Arguments into items 2 and 3. Are there other kinds besides these? Also,
>> maybe item 5 can be left out and item 1 used instead.
>>
>> Relieving the build system of the responsibility for categorizing
>> flags keeps its user-facing API more concise, as the user specifies
>> the compiler flags once instead of in 4 different categories.
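
To make the five outputs listed above concrete, here is a minimal C++ sketch of what such a report might look like. Everything in it is an assumption for illustration: `FlagReport`, `classify`, and the prefix-based rules are hypothetical, not an existing compiler option or API, and a real compiler would decide from its own knowledge which flags are BMI-sensitive. The compatibility identifier here is simply a hash of category 1, so the Local Preprocessor Arguments (categories 2 and 3) do not affect it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical shape of what the compiler could report for one command line.
struct FlagReport {
    std::vector<std::string> bmiRelevant;  // 1) toolchain + BMI-sensitive arguments
    std::vector<std::string> definitions;  // 2) compiler definitions
    std::vector<std::string> includeDirs;  // 3) include directories
    std::vector<std::string> other;        // 4) everything else
    std::size_t compatibilityId = 0;       // 5) derived from category 1 only
};

static bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Toy classifier. The spelling-based rules below are an assumption made for
// this sketch; they stand in for knowledge only the compiler really has.
FlagReport classify(const std::vector<std::string>& args) {
    FlagReport r;
    for (const std::string& a : args) {
        assert(!a.empty());  // precondition: no empty arguments
        if (startsWith(a, "-D")) {
            r.definitions.push_back(a);
        } else if (startsWith(a, "-I")) {
            r.includeDirs.push_back(a);
        } else if (startsWith(a, "-std=")) {
            r.bmiRelevant.push_back(a);  // assumed BMI-sensitive for illustration
        } else {
            r.other.push_back(a);
        }
    }
    // Hash only category 1, so definitions and include directories
    // never change the identifier.
    std::string joined;
    for (const std::string& a : r.bmiRelevant) {
        joined += a;
        joined += '\n';
    }
    r.compatibilityId = std::hash<std::string>{}(joined);
    return r;
}
```

Under these assumptions, two command lines that differ only in `-D` or `-I` arguments get the same identifier, while changing `-std=` changes category 1 and therefore the input to the hash.
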
>>
>> It seems that there is acceptance regarding the dynamic loading of the
>> compiler shared-library and the mentioned API. Would it be appropriate to
>> prepare a formal proposal for consideration?
>>
>> Best,
>> Hassan Sajjad
>>
>> On Sat, Aug 26, 2023 at 3:48 AM Bret Brown <mail_at_[hidden]> wrote:
>>
>>> Hi Hassan,
>>>
>>> First of all, great job on your writeups. They are very well considered.
>>> I appreciate how much effort you're putting into communicating your ideas
>>> and developing better C++ tools.
>>>
>>> I have one concern with the last design you shared with us. As far as I
>>> understand currently, we expect that both imported header units and public
>>> module interfaces will need to be parsed multiple times per build graph.
>>> This has been discussed in the ISO Tooling Study Group a bit already, so
>>> it's not a new idea. In short, it's expected that in typical scenarios, the
>>> build system would need to model multiple parses of a particular importable
>>> entity. This is because, unfortunately, some compilation options of the
>>> importing translation unit will need to be used to parse the imported
>>> translation unit.
>>>
>>> I think your design could work, though I expect the node identities of
>>> the imported entities need to be tweaked to better model a combination of
>>> the file and the flags used to parse the file. The build system would
>>> ideally know when to just reuse a particular parse of a particular entity,
>>> especially when optimizing for build speed as you are.
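
One way to picture the node identity described above is a key made of the file and the flags that affect its parse. The sketch below is hypothetical: `ParseKey` and `ParseGraph` are illustrative names, not HMake's API, and it assumes the build system already knows which flags are parse-relevant. Flag order is kept as given, since order can matter for some options.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical node identity: the importable file plus the flags used to parse it.
struct ParseKey {
    std::string file;
    std::vector<std::string> parseFlags;

    bool operator<(const ParseKey& o) const {
        return std::tie(file, parseFlags) < std::tie(o.file, o.parseFlags);
    }
};

class ParseGraph {
public:
    // Returns the node id for this (file, flags) pair. An existing node is
    // reused when the same file was already parsed with the same flags.
    int nodeFor(const std::string& file, const std::vector<std::string>& flags) {
        assert(!file.empty());  // precondition: a real file name
        ParseKey key{file, flags};
        auto it = nodes_.find(key);
        if (it != nodes_.end()) {
            return it->second;  // reuse the earlier parse
        }
        int id = static_cast<int>(nodes_.size());
        nodes_.emplace(std::move(key), id);
        return id;
    }

    // Number of distinct parses the graph currently models.
    std::size_t parseCount() const { return nodes_.size(); }

private:
    std::map<ParseKey, int> nodes_;
};
```

With this keying, importing the same header unit from two targets that agree on the parse-relevant flags yields one node, while a target using different flags gets a second node for the same file.
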
>>>
>>> Daniel is hinting at this challenge upthread in his reference to
>>> preprocessor state. You can read more about that in his ISO papers that
>>> have been published in the last year, including https://wg21.link/p2898.
>>> Daniel also elaborated on this with diagrams in his C++Now talk this
>>> spring. You can find it here:
>>> https://youtu.be/_LGR0U5Opdg?si=H92b0eyGQd8Vsr0v. Note that Daniel said
>>> upthread that his concerns are satisfied by some new ideas that were
>>> discussed in Varna. But the need for build systems to model multiple parses
>>> per imported interface remains given current understanding.
>>>
>>> I suspect certain codebases that are carefully governed might have
>>> exactly one parse per imported unit, but a build system that wants to
>>> support things like existing package management ecosystems, among other
>>> examples, would need to consider a more complicated approach.
>>>
>>> Bret Brown
>>>
>>>
>>> On Thu, Aug 24, 2023, 23:53 Hassan Sajjad via SG15 <
>>> sg15_at_[hidden]> wrote:
>>>
>>>>> Thanks for reaching out. I have gone through the thread and I am
>>>>> still a little confused probably because the approach is quite
>>>>> different from what we have been used to.
>>>>>
>>>>
>>>> Thank you so much for commenting. Yes, it is quite a different approach.
>>>>
>>>>> It seems that there are two aspects that you are proposing: a) the
>>>>> way the build system describes the build rules; b) how we can
>>>>> efficiently translate the build rules into action. IIUC in some
>>>>> languages (duck typing predominantly) they use the language itself to
>>>>> describe the rules, too.
>>>>>
>>>>
>>>> I just wanted to let you know that I am not commenting on what language
>>>> the build-system should use. I am just proposing a new approach with the
>>>> potential for a good speed-up. I am only partially aware of the design of
>>>> other build-systems, specifically their design around C++20 modules /
>>>> header-units. Because this is a different approach, non-trivial changes
>>>> might be needed in other build-systems to support it.
>>>>
>>>>> The second part is more interesting to me. AFAICT your approach
>>>>> inverts the build graph /somehow/. Is the aim to provide a "symbol
>>>>> registry" which upon of the symbol to compile the relevant subgraph?
>>>>> I'd appreciate if you could describe more verbosely (with examples
>>>>> maybe) how the process works.
>>>>>
>>>>
>>>> My pleasure.
>>>>
>>>> Please see the attached document. In it, I analyze how I plan to add
>>>> support for this in my build-system. I also provide an example. Based on
>>>> the analysis, I am confident in my ability to implement this within one
>>>> month if it's approved and a compiler with an API becomes available. While
>>>> I currently see no issues, I acknowledge the possibility of limitations and
>>>> potential errors. I am currently awaiting feedback from other stakeholders
>>>> before proceeding.
>>>>
>>>> A slight correction to my email above. I mentioned that my build-system
>>>> is now limitation-free with the adoption of the new consensus. It is for
>>>> a clean build, but there is a small bug for rebuilds, which will be fixed
>>>> soon.
>>>>
>>>> Best,
>>>> Hassan Sajjad
>>>>
>>>> On Tue, Aug 22, 2023 at 8:49 PM Vassil Vassilev <v.g.vassilev_at_[hidden]>
>>>> wrote:
>>>>
>>>>> Hi Hassan,
>>>>>
>>>>> Thanks for reaching out. I have gone through the thread and I am
>>>>> still a little confused probably because the approach is quite
>>>>> different from what we have been used to.
>>>>>
>>>>> It seems that there are two aspects that you are proposing: a) the
>>>>> way the build system describes the build rules; b) how we can
>>>>> efficiently translate the build rules into action. IIUC in some
>>>>> languages (duck typing predominantly) they use the language itself to
>>>>> describe the rules, too.
>>>>>
>>>>> The second part is more interesting to me. AFAICT your approach
>>>>> inverts the build graph /somehow/. Is the aim to provide a "symbol
>>>>> registry" which upon of the symbol to compile the relevant subgraph?
>>>>> I'd appreciate if you could describe more verbosely (with examples
>>>>> maybe) how the process works.
>>>>>
>>>>> Best, Vassil
>>>>>
>>>>> On 7/29/23 3:28 PM, Hassan Sajjad via SG15 wrote:
>>>>> > Hi.
>>>>> >
>>>>> > I would like to showcase my build-system HMake
>>>>> > https://github.com/HassanSajjad-302/HMake.
>>>>> >
>>>>> > It has C++20 modules and header-units support. It also supports
>>>>> > drop-in header-files to header-units replacement.
>>>>> >
>>>>> > With it, using MSVC, I compiled an SFML example with C++20
>>>>> > header-units.
>>>>> > https://github.com/HassanSajjad-302/SFML
>>>>> >
>>>>> > HMake, however, has a flaw, and there is no easy solution for it.
>>>>> > To fix this, I would like to propose a new way to compile source files.
>>>>> > https://github.com/HassanSajjad-302/HMake/wiki/HMake-Flaw-And-Its-Possible-Fixes
>>>>> >
>>>>> > I am very confident that the adoption of this will result in flawless
>>>>> > module and header-unit support in HMake, which will translate to a very
>>>>> > good experience for users converting their code bases to C++20
>>>>> > modules.
>>>>> >
>>>>> > Please share your thoughts.
>>>>> >
>>>>> > Best,
>>>>> > Hassan Sajjad
>>>>> >
>>>>> > _______________________________________________
>>>>> > SG15 mailing list
>>>>> > SG15_at_[hidden]
>>>>> > https://lists.isocpp.org/mailman/listinfo.cgi/sg15
>>>>>
>>>>>
>>>>
>>>

Received on 2023-09-11 14:02:54