Date: Sun, 10 Feb 2019 09:00:23 +0100
BMIs can be put in a single folder for the whole build; there is no reason
to separate them into different folders. The number of folders is therefore
only an issue while scanning.
On Sun, Feb 10, 2019, 4:17 AM Scott Wardle <swardle_at_[hidden]> wrote:
> Nice story. Was this on Linux?
>
> We sorted our include paths by the number of hits to each path and it was
> a 10% gain to clean build times. But this was a long time ago, before
> Windows 10 even, so I am not sure whether the better caching in Windows 10
> has helped since. We have lost the tech to do this, however, and have not
> bothered to rewrite it.
>
> This is a pain, as having one include path per library really works very
> well: you can’t include things you should not, and your layering stays
> enforced by your build system.
>
> I am not sure how many modules we will have per library yet, but if it is
> close to 1 to 1, this problem might be big enough to want to fix. It would
> help if we had a directory caching service that could very quickly tell
> you which of the 100s of include directories you really needed to check.
> The way I look at it right now, we will have as many module search paths
> as include paths (as this would just be the easiest thing to do in the
> build system), so whatever this time is now will double.
>
> Scott
>
> > On Feb 9, 2019, at 17:31, Ben Craig <ben.craig_at_[hidden]> wrote:
> >
> > The project I build most often has roughly 150 include paths. We've got
> > a crazy conglomeration of GNU make generating FASTBuild projects, held
> > together with Python, Perl, and developer tears.
> >
> > I recently improved the preprocessing time of that project from 7
> > minutes 30 seconds to 4 minutes 30 seconds by moving the boost include
> > from position ~100 in the include path to position ~4. So yeah, long
> > include paths can have an impact on build times.
> >
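
A rough way to see why the position of a directory in the include path
matters is to count how many directories get probed before each header is
found. The sketch below is only a model: the directory names, header names
and include counts are made up, it assumes every header lives in exactly one
directory, and it ignores any caching a real compiler or OS does. It also
shows the "sort by number of hits" idea mentioned above.

```python
from collections import Counter


def count_probes(search_path, header_home, includes):
    """Total directories probed when every include is resolved by
    scanning search_path from the front until the header is found."""
    position = {d: i + 1 for i, d in enumerate(search_path)}
    return sum(position[header_home[h]] for h in includes)


def sort_by_hits(search_path, header_home, includes):
    """Reorder search_path so the most frequently hit directories come
    first. sorted() is stable, so ties keep their original order."""
    hits = Counter(header_home[h] for h in includes)
    return sorted(search_path, key=lambda d: -hits[d])


# Hypothetical project: 150 include directories, boost at position 100.
search_path = [f"lib{i}/include" for i in range(1, 151)]
search_path[99] = "boost"

# Hypothetical mapping from header to the one directory that contains it.
header_home = {
    "boost/optional.hpp": "boost",
    "a.h": "lib1/include",
    "b.h": "lib2/include",
}
includes = ["boost/optional.hpp"] * 50 + ["a.h"] * 5 + ["b.h"] * 5

probes_before = count_probes(search_path, header_home, includes)
reordered = sort_by_hits(search_path, header_home, includes)
probes_after = count_probes(reordered, header_home, includes)
print(probes_before, probes_after)
```

Reordering is only safe if no header name exists in more than one
directory; with duplicate names, changing the order changes which file wins.
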
> >> -----Original Message-----
> >> From: tooling-bounces_at_[hidden] <tooling-bounces_at_[hidden]> On
> >> Behalf Of Scott Wardle
> >> Sent: Saturday, February 9, 2019 6:53 PM
> >> To: ben.boeckel_at_[hidden]
> >> Cc: WG21 Tooling Study Group SG15 <tooling_at_[hidden]>;
> >> michael_spencer_at_[hidden]
> >> Subject: [EXTERNAL] Re: [Tooling] Modules feedback
> >>
> >>
> >>
> >>> On Feb 9, 2019, at 2:12 PM, Ben Boeckel <ben.boeckel_at_[hidden]> wrote:
> >>>
> >>>> On Sat, Feb 09, 2019 at 00:01:07 -0800, Scott Wardle wrote:
> >>>> I think you are totally right; that is what I could use here. I would
> >>>> love a diagram that shows what processes or stages are needed for the
> >>>> use cases you are thinking of: clean build vs incremental, or maybe
> >>>> some others.
> >>>
> >>> I'll look at adding graphs for two possible implementations (1:1:1
> >>> source:scan:ddi versus N:1:N, where N can be all-at-once or per-target).
> >>>
> >>>> Here is what I was thinking of:
> >>>> - Clean builds vs incremental builds
> >>>> - Linux, where processes are cheap, vs Windows, with fewer processes
> >>>>   and more threads
> >>>> - Multi-computer builds: what data is pushed over the network and
> >>>>   what data is pulled over the network
> >>>> - Object/Module BMI/Binary/Module Map caching vs no caching
> >>>
> >>> These are probably good as notes on when one approach might be
> >>> preferred over the other. I'm more than willing to go over build
> >>> graphs on a whiteboard at Kona, but I don't know that multiple pages
> >>> of incremental diagrams while trying to describe execution strategies
> >>> of them is going to be easy to digest.
> >>
> >> You might be right; it is hard to say how much detail to write here. I
> >> think we are all stuck in our own little worlds, so it is hard to see
> >> how this works for everyone. Maybe one good example is better than a lot
> >> of small half-done ones.
> >>
> >>>
> >>>> Maybe even make this more concrete and talk about the command lines
> >>>> of some of these use cases. At least we should talk about:
> >>>> - do modules change anything with:
> >>>>   - Include paths -I<dir> vs -isystem
> >>>>   - Object/Library path -L<dir>
> >>>> - is there overlap with:
> >>>>   - module BMI paths (-fmodules-cache-path=<directory> vs
> >>>>     -fprebuilt-module-path=<directory>)
> >>>>   - module map path/files -fmodule-map-file=
> >>>
> >>
> >> I was hoping that if we enumerated the different ways of making modules
> >> and their different command lines, I would get the data I was thinking of.
> >>
> >> I am trying to understand where we are with the merged modules proposal.
> >> The last time I looked at things was the Modules TS. From my reading of
> >> the merged proposal, it sounds like we can do what we did in the Modules
> >> TS and what we could do with older Clang modules. These two work very
> >> differently, but now maybe we can do both at once with
> >> import "some-header.h"; vs import foo;?
> >>
> >> What I want to know is: what do we think the possible inputs and outputs
> >> are for each phase of the build process? In trying to figure out what
> >> styles of inputs and outputs we like for our build tools, enumerating
> >> the current set of possible inputs and outputs seems like a good idea.
> >> At the very least, some day we will have to teach some or all of these
> >> possibilities.
> >>
> >> The kind of issue I am looking for is, for example: with the Modules TS,
> >> why did we type "export module foo;"? Why do we need the “foo”; would
> >> this not be the name of the file? I.e. there was a command line where
> >> you needed to write the name of the module's BMI file,
> >> /module:output obj\foo.ifc, so what is the name of the module: the name
> >> of the BMI file or the name in the .cppm/ixx? Why do we need both? (I
> >> think 2.3 seems to call out some reasons, but Clang's documentation on
> >> module maps talks about making many modules out of many headers with one
> >> map, a very different thing than the one-to-one relation I was playing
> >> with in the Modules TS anyway.)
> >>
> >> Sorry I am asking so many questions. This is great stuff.
> >>
> >>
> >>> Eh, these might be too low-level for what we're describing. We list what
> >>> is important for compilers to provide in §7.1. The actual flag spellings
> >>> aren't (that) important. In any case, I would hope that it would not.
> >>> Changing semantics of flags as fundamental as `-I` or `-L` based on
> >>> `-fmodules` or `-std=c++2a` is not going to be fun for build tools to
> >>> implement.
> >>>
> >>>> -Artifact Hashing (?? How do dependencies work with this? See
> >>>> what the process writes out and assume a dependency? Maybe I
> >>>> don’t understand this.)
> >>>
> >>> This is strictly a dependency detection strategy. `mtime` is common, but
> >>> hashing before saying "dirty" is another viable strategy.
> >>>
> >>> The rest of this is off-topic here, but I'll add my 2¢.
> >>
> >> I see, I was thinking you would name the output BMI based on the hash of
> >> the input or something. So this is like dependency in SCons.
> >>
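
To make the "hashing before saying dirty" strategy concrete, here is a
minimal sketch. It is not how any particular build tool does it: the
snapshot format, the function names and the file name foo.cppm are all
assumptions for illustration. The idea is to use `mtime` as the cheap first
check and only fall back to a content hash when the timestamp has changed.

```python
import hashlib
import os
import tempfile


def content_hash(path):
    """SHA-256 of the file contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def snapshot(path):
    """What a (hypothetical) build tool records after a successful build."""
    return {"mtime_ns": os.stat(path).st_mtime_ns, "sha256": content_hash(path)}


def is_dirty(path, recorded):
    """Cheap mtime check first; hash only when the timestamp differs."""
    if os.stat(path).st_mtime_ns == recorded["mtime_ns"]:
        return False
    return content_hash(path) != recorded["sha256"]


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "foo.cppm")  # hypothetical module source
    with open(src, "w") as f:
        f.write("export module foo;\n")
    recorded = snapshot(src)

    # "touch": newer timestamp, identical contents, so nothing to rebuild.
    later = recorded["mtime_ns"] + 10**9
    os.utime(src, ns=(later, later))
    dirty_after_touch = is_dirty(src, recorded)

    # Real edit: new timestamp and new contents, so a rebuild is needed.
    with open(src, "w") as f:
        f.write("export module foo;\nexport int f();\n")
    much_later = recorded["mtime_ns"] + 2 * 10**9
    os.utime(src, ns=(much_later, much_later))
    dirty_after_edit = is_dirty(src, recorded)

print(dirty_after_touch, dirty_after_edit)
```

A pure `mtime` strategy would have reported the touched file as dirty; the
hash avoids that rebuild at the cost of reading the file.
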
> >>>
> >>>> Note the use case I am trying to understand is that EA uses include
> >>>> paths as a layer enforcement mechanism. I.e. lower-layer rendering
> >>>> can’t include high-level gameplay, but gameplay can include rendering.
> >>>> We currently have a different set of includes for each library. A game
> >>>> is built out of about 300 to 400 of these libraries. Since we know
> >>>> which library uses which other libraries, we can use this to work out
> >>>> which include paths are necessary. These include path dependencies are
> >>>> different from a library's linkage dependencies. You might use a header
> >>>> from a library, but as you only use inline functions you don’t need to
> >>>> link to it and therefore you don’t need to build it first. This can be
> >>>> a good speed up when building DLLs.
> >>>
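
The layering idea above can be sketched as deriving each library's include
paths from the dependency graph. The library names and directory layout
below are hypothetical, and taking the transitive closure of dependencies is
an assumption on my part; the thread does not say whether EA's system is
transitive.

```python
def include_paths_for(lib, deps, include_dir):
    """Include directories visible to lib: its own directory plus those of
    every library it depends on, directly or transitively."""
    seen, stack, paths = set(), [lib], []
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        paths.append(include_dir[current])
        stack.extend(deps.get(current, []))
    return paths


# Hypothetical three-layer game: gameplay -> rendering -> core.
deps = {"gameplay": ["rendering"], "rendering": ["core"], "core": []}
include_dir = {name: f"{name}/include" for name in deps}

rendering_paths = include_paths_for("rendering", deps, include_dir)
gameplay_paths = include_paths_for("gameplay", deps, include_dir)
print(rendering_paths)
print(gameplay_paths)
```

Because gameplay/include never appears on rendering's command line, a stray
include of a gameplay header from rendering fails at compile time, which is
the enforcement being described.
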
> >>> For CMake, there is a potential to add `$<COMPILE_ONLY>` genex to add
> >>> usage requirements, but ignore the target at link time. Other build
> >>> tools could certainly implement analogous semantics.
> >>>
> >>> https://gitlab.kitware.com/cmake/cmake/issues/18049#note_496112
> >>>
> >>>> What I am worried about with the EA include path layering enforcement
> >>>> is:
> >>>> -We are very close to running out of command line (on Windows) as we
> >>>> will have 100s of include paths. (A high-level, application-level
> >>>> module will need just about every library after all.) With modules
> >>>> I am not sure what the equivalent of include paths is, but it would
> >>>> seem like we need 2x the command line for module paths, if not more.
> >>>> -We have had this system for a long time so we probably have duplicate
> >>>> include file names. If we reduced the number of include paths we
> >>>> might hit these problems.
> >>>
> >>> Response files help a lot here.
> >>
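
For anyone who has not used them: GCC, Clang and MSVC all accept an `@file`
argument and read further arguments from that file. The sketch below only
builds the command lines and compares their lengths; it does not run a
compiler. The directory layout, the file name includes.rsp and the flag
count are made up, and the paths contain no spaces so no quoting is needed.

```python
import os
import tempfile


def write_response_file(path, args):
    """Write one argument per line; the compiler reads them via @path."""
    with open(path, "w") as f:
        f.write("\n".join(args) + "\n")


# Hypothetical: 400 libraries, one include directory each.
include_flags = [f"-Ipackages/lib{i:03d}/include" for i in range(400)]

# Everything passed directly on the command line.
inline_cmd = ["clang++", "-c", "main.cpp"] + include_flags

with tempfile.TemporaryDirectory() as tmp:
    rsp = os.path.join(tmp, "includes.rsp")
    write_response_file(rsp, include_flags)
    # The same flags, passed indirectly through the response file.
    rsp_cmd = ["clang++", "-c", "main.cpp", "@" + rsp]
    with open(rsp) as f:
        flags_in_file = f.read().split()

inline_len = len(" ".join(inline_cmd))
rsp_len = len(" ".join(rsp_cmd))
print(inline_len, rsp_len)
```

The inline form here is already past the 8191 character limit of cmd.exe,
while the response file form stays short however many paths are added.
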
> >> Yes, this seems like an easy problem at first glance. The problem is we
> >> generate Visual Studio SLN files and these do not really support the
> >> huge projects that we are trying to build very well. If EA moved away
> >> from SLN files it would be easy to fix. We have done this before and
> >> then came back to Visual Studio SLN files, as then we get a GUI to do
> >> non-permanent changes to the SLN files. (Since we glob source files, we
> >> regen our SLN on a sync.)
> >>
> >>>
> >>>> -We have 100s of include paths. I worry this is not very efficient.
> >>>> If the OS has a good directory cache maybe this is good enough; however,
> >>>> it could be very slow otherwise. I am not sure if other companies do
> >>>> this type of thing.
> >>>
> >>> VTK's build can have ~100 `-I` flags for parts using "lots" of VTK. It
> >>> certainly has *an* effect, but I don't know how much number-wise.
> >>>
> >>> --Ben
> >>
> >> It is nice to hear that other people are hitting similar issues here.
> >>
> >> Scott
> >>
> >> _______________________________________________
> >> Tooling mailing list
> >> Tooling_at_[hidden]
> >> http://www.open-std.org/mailman/listinfo/tooling
>
Received on 2019-02-10 09:00:38