Re: A Different Approach To Compiling C++

From: Tom Honermann <tom_at_[hidden]>
Date: Fri, 22 Sep 2023 10:53:40 -0400
On 9/22/23 3:49 AM, Hassan Sajjad via SG15 wrote:
> Hi.
>
> I think you have plenty of rigor here to justify a numbered ISO
> paper to drive an agenda item for the Tooling Study Group (SG-15).
>
> Thank you. I would like to present my paper
> https://htmlpreview.github.io/?https://github.com/HassanSajjad-302/iso-papers/blob/main/generated/my-paper.html.

Some additional things that I think would be helpful to address in the
paper:

 1. Modern build systems manage multiple concurrently running jobs. How
    would concurrency be handled in your model? Would all compilations
    occur within the (single?) build system process? If so, how would
    you expect the increased memory and CPU utilization to affect
    administrative resource limitations imposed by POSIX ulimits or
    Windows job restrictions? What would the effect be on the Linux OOM
    process killer? How would the build system handle a compiler crash
    that occurs during the build? How would an administrator be expected
    to diagnose a hung or exceptionally long running compilation job
    when they can't rely on process listing to spot individual compiler
    processes?
 2. Would you require the compiler shared library to be able to perform
    a compilation without use of global state (so that multiple
    compilations can be performed concurrently)? All C++ compilers that
    I'm aware of use global state today and cannot handle multiple
    compilations within the same process.
 3. Coverity and similar tools that rely on functionality like that
    provided by Bear <https://github.com/rizsotto/Bear> would not work
    with your model without some protocol that enables them to interact
    with the build system as jobs start and end.
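
One possible shape for the protocol mentioned in point 3, as a minimal sketch only: the build system reports each in-process compilation as a start event and an end event, and an observing tool turns those events into entries in the style of a Clang JSON compilation database. The names `JobObserver`, `job_started`, `job_finished`, and `running_jobs` are hypothetical and are not part of HMake, Bear, or any compiler. The same event stream could also stand in for a process listing when looking for a hung job.

```python
import json
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobRecord:
    """One compilation job as seen by an observing tool (hypothetical)."""
    job_id: int
    directory: str
    source: str
    arguments: List[str]
    started_at: float
    finished_at: Optional[float] = None
    exit_status: Optional[int] = None


class JobObserver:
    """Collects start/end events that a build system would emit."""

    def __init__(self):
        self._jobs = {}

    def job_started(self, job_id, directory, source, arguments):
        # Called by the build system just before it begins a compilation.
        self._jobs[job_id] = JobRecord(
            job_id, directory, source, list(arguments), time.monotonic()
        )

    def job_finished(self, job_id, exit_status):
        # Called by the build system when the compilation returns.
        record = self._jobs[job_id]
        record.finished_at = time.monotonic()
        record.exit_status = exit_status

    def running_jobs(self):
        # Jobs with a start event but no end event yet.
        return [r for r in self._jobs.values() if r.finished_at is None]

    def compilation_database(self):
        # Entries use the "directory" / "file" / "arguments" keys of the
        # JSON compilation database format.
        return [
            {"directory": r.directory, "file": r.source, "arguments": r.arguments}
            for r in sorted(self._jobs.values(), key=lambda r: r.job_id)
        ]


observer = JobObserver()
observer.job_started(1, "/build", "a.cpp", ["c++", "-std=c++20", "-c", "a.cpp"])
observer.job_started(2, "/build", "b.cpp", ["c++", "-std=c++20", "-c", "b.cpp"])
observer.job_finished(1, 0)
print([r.source for r in observer.running_jobs()])
print(json.dumps(observer.compilation_database(), indent=2))
```

How such events would be delivered to an external tool (socket, pipe, plugin) is left open here.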

Tom.

>
> I have emailed Nevin Liber for the number following the procedures
> guidelines from the following link,
> https://isocpp.org/std/standing-documents/sd-7-mailing-procedures-and-how-to-write-papers
>
> Though I expect finding some dedicated attention from the author
> of the papers you cite would be a productive use of time
> beforehand, so at least you two are on the same line of thinking
> before starting a larger group discussion. But it's worth
> submitting a paper to drive an interactive discussion even if you
> cannot reasonably arrange that sidebar discussion.
>
>
> Sorry, I could not think of any papers to cite.
>
> I would be very grateful if anyone is interested in co-authoring the
> next revisions of the paper.
>
> Best,
> Hassan Sajjad
>
> On Tue, Sep 12, 2023 at 7:00 PM Bret Brown <mail_at_[hidden]> wrote:
>
> I think you have plenty of rigor here to justify a numbered ISO
> paper to drive an agenda item for the Tooling Study Group (SG-15).
> Though I expect finding some dedicated attention from the author
> of the papers you cite would be a productive use of time
> beforehand, so at least you two are on the same line of thinking
> before starting a larger group discussion. But it's worth
> submitting a paper to drive an interactive discussion even if you
> cannot reasonably arrange that sidebar discussion.
>
> As to the compiler categorizing flags as local or not, I'm open to
> the idea I suppose. Though I suspect there are a lot of strange
> edge cases that would justify a feature in which authored
> information is respected about which preprocessor definitions are
> explicitly local and which ones explicitly are not.
>
> But I agree with you that in most cases, nobody should want to
> maintain detailed build system configurations about how to
> categorize each flag. Though I expect that normally what will
> happen is people won't consider this at all and use whatever
> defaults the build system supports, which I expect to be non-local
> flags since that's the more conservative option with respect to
> accurate parsing.
>
> Bret
>
> On Mon, Sep 11, 2023, 10:02 Hassan Sajjad
> <hassan.sajjad069_at_[hidden]> wrote:
>
> Hi.
>
> I emailed this https://lists.isocpp.org/sg15/2023/09/2040.php
> a few days ago.
> I am looking forward to your review.
>
> Best,
> Hassan Sajjad
>
> On Tue, Sep 5, 2023, 09:46 Hassan Sajjad
> <hassan.sajjad069_at_[hidden]> wrote:
>
> Hi.
>
> Please share your thoughts on this.
>
> Best,
> Hassan Sajjad
>
>
> On Sun, Sep 3, 2023, 04:57 Hassan Sajjad
> <hassan.sajjad069_at_[hidden]> wrote:
>
> Hi.
>
> First of all, great job on your writeups. They are
> very well considered. I appreciate how much effort
> you're putting into communicating your ideas and
> developing better C++ tools.
>
>
> Thank you. This compliment means a lot. It would be an
> immense pleasure if I or my project could be helpful
> in improving C++ tooling.
>
> I think your design could work
>
>
> This was so good to hear :). Thank you again.
>
> though I expect the node identities of the
> imported entities need to be tweaked to better
> model a combination of the file and the flags used
> to parse the file. The build system would ideally
> know when to just reuse a particular parse of a
> particular entity, especially when optimizing for
> build speed as you are.
>
>
> With the new modifications my build system now covers
> 50% of the use case you have described in your email.
> Please see the link
> https://github.com/HassanSajjad-302/Example12/blob/main/hmake.cpp.
>
> This might appear big, but without comments, I think
> it is quite concise for what it achieves. It
> covers the scenarios where, in a mono-repo style
> project, a dependency builds with different flags than
> the consumer, e.g. when the dependency supports only
> C++20 but the project uses C++23, or when a
> dependency has to be built by a different compiler for
> performance reasons.
>
> However, this is a manual approach, and we cannot
> detect whether the ifc files from a prebuilt target
> are compatible for use in a target we are building or not.
>
> This paper
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2581r2.pdf
> provides good guidance on how this can be achieved in
> the build system.
>
> It is mentioned in the paper
>
> Those categories, however, represent higher-level
> semantics. It is not the case that the build
> system can introspect the command line used to
> produce a BMI on another project and decide which
> arguments fall on which of the categories. They
> need to be authored by the engineer maintaining
> the build system.
>
>
> This paper proposes that build systems should
> offer a mechanism to identify which of the options
> used in a translation unit are a Local
> Preprocessor Argument.
>
>
> It is also mentioned
>
> The scope of compatibility for consuming an
> existing built module interface file is defined by
> the union of the Basic Toolchain Configuration
> Arguments and the BMI-Sensitive arguments. And
> it should explicitly exclude Local Preprocessor
> Arguments. Those arguments are the ones that
> should be used to define the compatibility identifier.
>
>
> The build system should take the compiler
> invocation for a translation unit, remove
> the Local Preprocessor Arguments and the reference
> to the specific translation unit and invoke the
> compiler in that mode in order to obtain the
> compatibility identifier for BMIs produced and
> consumed by that translation unit.
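
The procedure quoted above can be sketched as follows. This is only an illustration: in the quoted text the compiler itself produces the identifier, whereas this sketch stands in for that step by hashing the remaining arguments with SHA-256. The function name, the `local_prefixes` parameter, and the choice of `-D` and `-I` as the Local Preprocessor Arguments are assumptions made for the example.

```python
import hashlib


def compatibility_identifier(command, translation_unit, local_prefixes=("-D", "-I")):
    """Hash a compile command after dropping local arguments and the TU.

    Hypothetical helper: a real identifier would come from the compiler.
    """
    kept = []
    skip_next = False
    for arg in command:
        if skip_next:
            # Value of a separated local flag, e.g. the NAME in "-D NAME".
            skip_next = False
            continue
        if arg == translation_unit:
            continue
        if arg in local_prefixes:
            skip_next = True
            continue
        if arg.startswith(local_prefixes):
            # Joined form, e.g. "-DNAME=1" or "-Iinclude".
            continue
        kept.append(arg)
    # Argument order is preserved, so reordered flags give a different hash.
    digest = hashlib.sha256("\0".join(kept).encode("utf-8")).hexdigest()
    return digest[:16]


a = compatibility_identifier(
    ["c++", "-std=c++20", "-DFOO=1", "-Iinclude", "-c", "a.cpp"], "a.cpp"
)
b = compatibility_identifier(["c++", "-std=c++20", "-D", "BAR", "-c", "b.cpp"], "b.cpp")
c = compatibility_identifier(["c++", "-std=c++23", "-c", "a.cpp"], "a.cpp")
print(a == b, a == c)
```

Under these assumptions, two translation units that differ only in definitions and include directories share an identifier, while a different `-std=` value yields a different one.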
>
>
> I was thinking that, instead of changing the build
> system to have an API that lets the user specify the
> flags of different categories, this job could also be given
> to the compiler. The compiler, when invoked with a
> specific option and the full compile command
> excluding input files, would output:
> 1) Concatenation of Basic Toolchain Configuration
> Arguments and BMI-Sensitive Arguments.
> 2) Compiler Definitions
> 3) Include Directories
> 4) Other Arguments
> 5) Compatibility Identifier
>
> Following the example mentioned in the paper, I broke
> Local Preprocessor Arguments into 2 and 3. Can there
> be more besides these? Also, maybe 5 can be left out
> and 1 can be used instead.
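
To make the five outputs concrete, here is a minimal sketch of the record such a compiler mode might return. The field names, the `classify` function, and especially the prefix rule used to decide which flags are BMI-sensitive are placeholders invented for this example; a real compiler would apply its own knowledge of its options.

```python
from dataclasses import dataclass, field
from typing import List

# Placeholder rule, not taken from any compiler: flags with these prefixes
# are treated as toolchain-configuration or BMI-sensitive arguments.
BMI_SENSITIVE_PREFIXES = ("-std=", "-f", "-m")


@dataclass
class ClassifiedCommand:
    toolchain_and_bmi_sensitive: List[str] = field(default_factory=list)  # output 1
    definitions: List[str] = field(default_factory=list)                  # output 2
    include_directories: List[str] = field(default_factory=list)          # output 3
    other: List[str] = field(default_factory=list)                        # output 4

    @property
    def compatibility_identifier(self):
        # Output 5, derived here from output 1 as suggested in the message.
        return " ".join(self.toolchain_and_bmi_sensitive)


def classify(command):
    """Split a compile command (without input files) into the categories."""
    result = ClassifiedCommand()
    args = iter(command)
    compiler = next(args, None)
    if compiler is None:
        return result
    result.toolchain_and_bmi_sensitive.append(compiler)
    for arg in args:
        if arg.startswith("-D"):
            # Accept both "-DNAME" and "-D NAME".
            result.definitions.append(arg[2:] or next(args, ""))
        elif arg.startswith("-I"):
            result.include_directories.append(arg[2:] or next(args, ""))
        elif arg.startswith(BMI_SENSITIVE_PREFIXES):
            result.toolchain_and_bmi_sensitive.append(arg)
        else:
            result.other.append(arg)
    return result


parsed = classify(
    ["c++", "-std=c++23", "-DNDEBUG", "-I", "include", "-O2", "-fno-exceptions"]
)
print(parsed)
print(parsed.compatibility_identifier)
```

With this split, the build system would accept a single flat list of flags from the user and would not need its own table of flag categories.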
>
> Relieving the build-system of the responsibility of
> categorizing the different flags keeps
> the user-facing API of the build-system more concise,
> as the user specifies the compiler flags in
> 1 category instead of 4.
>
> It seems that there is acceptance regarding the
> dynamic loading of the compiler shared-library and the
> mentioned API. Would it be appropriate to prepare a
> formal proposal for consideration?
>
> Best,
> Hassan Sajjad
>
> On Sat, Aug 26, 2023 at 3:48 AM Bret Brown
> <mail_at_[hidden]> wrote:
>
> Hi Hassan,
>
> First of all, great job on your writeups. They are
> very well considered. I appreciate how much effort
> you're putting into communicating your ideas and
> developing better C++ tools.
>
> I have one concern with the last design you shared
> with us. As far as I understand currently, we
> expect that both imported header units and public
> module interfaces will need to be parsed multiple
> times per build graph. This has been discussed in
> the ISO Tooling Study Group a bit already, so it's
> not a new idea. In short, it's expected that in
> typical scenarios, the build system would need to
> model multiple parses of a particular importable
> entity. This is because, unfortunately, some
> compilation options of the importing translation
> unit will need to be used to parse the imported
> translation unit.
>
> I think your design could work, though I expect
> the node identities of the imported entities need
> to be tweaked to better model a combination of the
> file and the flags used to parse the file. The
> build system would ideally know when to just reuse
> a particular parse of a particular entity,
> especially when optimizing for build speed as you are.
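
A minimal sketch of that node identity, assuming a hypothetical `ParseCache` inside the build system: each imported entity is keyed by its path together with the flags that affect its parse, so importers with matching flags reuse one parse and any other combination triggers a new one. Deciding which flags belong in the key is left to the caller, and the `parse` callback is a stand-in for invoking the compiler.

```python
class ParseCache:
    """Reuses one parse per (file, parse-affecting flags) pair (hypothetical)."""

    def __init__(self, parse):
        self._parse = parse      # callback: (path, flags) -> parse result
        self._nodes = {}         # node identity -> parse result
        self.parse_count = 0

    def get(self, path, parse_flags):
        # Flag order is kept in the key because order can matter,
        # e.g. "-DX" followed by "-UX".
        key = (path, tuple(parse_flags))
        if key not in self._nodes:
            self._nodes[key] = self._parse(path, key[1])
            self.parse_count += 1
        return self._nodes[key]


def fake_parse(path, flags):
    # Stand-in for producing a BMI; returns a label instead.
    return "bmi(" + path + " | " + " ".join(flags) + ")"


cache = ParseCache(fake_parse)
first = cache.get("widget.hpp", ["-std=c++20"])
second = cache.get("widget.hpp", ["-std=c++20"])   # reused
third = cache.get("widget.hpp", ["-std=c++23"])    # parsed again
print(cache.parse_count, first == second, first == third)
```

A codebase with exactly one parse per imported unit is then the special case where every importer passes the same flags.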
>
> Daniel is hinting at this challenge upthread in
> his reference to preprocessor state. You can read
> more about that in his ISO papers that have been
> published in the last year, including
> https://wg21.link/p2898. Daniel also elaborated on
> this with diagrams in his C++Now talk this spring.
> You can find it here:
> https://youtu.be/_LGR0U5Opdg?si=H92b0eyGQd8Vsr0v.
> Note that Daniel said upthread that his concerns
> are satisfied by some new ideas that were
> discussed in Varna. But the need for build systems
> to model multiple parses per imported interface
> remains given current understanding.
>
> I suspect certain codebases that are carefully
> governed might have exactly one parse per imported
> unit, but a build system that wants to support
> things like existing package management
> ecosystems, among other examples, would need to
> consider a more complicated approach.
>
> Bret Brown
>
>
> On Thu, Aug 24, 2023, 23:53 Hassan Sajjad via SG15
> <sg15_at_[hidden]> wrote:
>
> Thanks for reaching out. I have gone
> through the thread and I am
> still a little confused probably because
> the approach is quite different
> from what we have been used to.
>
>
> Thank you so much for commenting. Yes, it is a
> quite different approach.
>
> It seems that there are two aspects that
> you are proposing: a) the
> way the build system describes the build
> rules; b) how we can
> efficiently translate the build rules into
> action. IIUC in some
> languages (duck typing predominantly) they
> use the language itself to
> describe the rules, too.
>
>
> I just wanted to let you know that I am not
> commenting on what language the build-system
> should use. I am just proposing a new approach
> with the potential of a good speed-up. I am
> only partially aware of the other
> build-systems design, specifically the design
> around C++20 modules / header-units. Because
> this is a different approach, non-trivial
> changes might be needed in other build-systems
> to support it.
>
> The second part is more interesting to me.
> AFAICT your approach
> inverts the build graph /somehow/. Is the
> aim to provide a "symbol
> registry" which upon of the symbol to
> compile the relevant subgraph? I'd
> appreciate if you could describe more
> verbosely (with examples maybe)
> how the process works.
>
> My pleasure.
>
> Please see the attached document. In it, I
> analyze how I plan to add support for this in
> my build-system. I also provide an example.
> Based on the analysis, I am confident in my
> ability to implement this within one month if
> it's approved and a compiler with an API
> becomes available. While I currently see no
> issues, I acknowledge the possibility of
> limitations and potential errors. I am
> currently awaiting feedback from other
> stakeholders before proceeding.
>
> A slight correction in my above email. I
> mentioned that my build-system now is
> limitation-free with the adoption of the new
> consensus. Well, it is for a clean build. But
> there is a small bug for rebuild which will be
> fixed soon.
>
> Best,
> Hassan Sajjad
>
> On Tue, Aug 22, 2023 at 8:49 PM Vassil
> Vassilev <v.g.vassilev_at_[hidden]> wrote:
>
> Hi Hassan,
>
> Thanks for reaching out. I have gone
> through the thread and I am
> still a little confused probably because
> the approach is quite different
> from what we have been used to.
>
> It seems that there are two aspects
> that you are proposing: a) the
> way the build system describes the build
> rules; b) how we can
> efficiently translate the build rules into
> action. IIUC in some
> languages (duck typing predominantly) they
> use the language itself to
> describe the rules, too.
>
> The second part is more interesting to
> me. AFAICT your approach
> inverts the build graph /somehow/. Is the
> aim to provide a "symbol
> registry" which upon of the symbol to
> compile the relevant subgraph? I'd
> appreciate if you could describe more
> verbosely (with examples maybe)
> how the process works.
>
> Best, Vassil
>
> On 7/29/23 3:28 PM, Hassan Sajjad via SG15
> wrote:
> > Hi.
> >
> > I would like to showcase my build-system
> HMake
> > https://github.com/HassanSajjad-302/HMake.
> >
> > It has C++20 modules and header-units
> support. It also supports
> > drop-in header-files to header-units
> replacement.
> >
> > With it, with MSVC I compiled an SFML
> example with C++ 20 header-units.
> > https://github.com/HassanSajjad-302/SFML
> >
> > HMake however has a flaw and there is no
> easy solution for it. To fix
> > this, I would like to propose a new way
> to compile source files.
> >
> https://github.com/HassanSajjad-302/HMake/wiki/HMake-Flaw-And-Its-Possible-Fixes
> >
> > I am very confident that the adoption of
> this will result in flawless
> > module and header-unit support in HMake
> which will translate to a very
> > good user experience while converting
> their code base to C++20 modules.
> >
> > Please share your thoughts.
> >
> > Best,
> > Hassan Sajjad
> >
> >
> _______________________________________________
> > SG15 mailing list
> > SG15_at_[hidden]
> >
> https://lists.isocpp.org/mailman/listinfo.cgi/sg15
>

Received on 2023-09-22 14:53:41