Date: Sun, 1 Dec 2024 05:15:18 +0100
Many users who work on systems with parallel filesystems (supercomputers,
for example) will build their code on those filesystems as well. And
parallel filesystems are notoriously slow at metadata operations and in
sheer IOPS, at least compared to a local NVMe drive.
That said, I'm not suggesting the solution is to support those parallel
filesystems out of the box. But it is something to keep in mind.
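If anyone wants to see how bad the metadata penalty is on their own
setup, a rough sketch along these lines can make it concrete (the
directory name and iteration count are placeholders; point it at a
directory on the filesystem you actually build on):

#include <chrono>
#include <cstdio>
#include <filesystem>
#include <ratio>
#include <string>

int main() {
    namespace fs = std::filesystem;
    // Placeholder: put this on the filesystem under test.
    const fs::path base = "metadata_bench_tmp";
    const int iterations = 1000;

    fs::create_directory(base);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        const fs::path dir = base / ("tu_" + std::to_string(i));
        fs::create_directory(dir);  // mkdir: one metadata operation
        (void)fs::exists(dir);      // stat: another one
        fs::remove(dir);            // rmdir: and a third
    }
    const auto stop = std::chrono::steady_clock::now();
    const double total_us =
        std::chrono::duration<double, std::micro>(stop - start).count();
    std::printf("average create/stat/remove cycle: %.1f us\n",
                total_us / iterations);
    fs::remove(base);
}

Build it with something like "g++ -std=c++17 metadata_bench.cpp", run it
once on the parallel filesystem and once on a local disk, and compare
the per-cycle numbers. That difference is roughly the extra cost every
per-TU mkdir or stat pays in such an environment.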
On Sat, Nov 30, 2024, 20:27 Jussi Pakkanen via SG15 <sg15_at_[hidden]>
wrote:
> On Fri, 29 Nov 2024 at 12:35, Mathias Stearn via SG15
> <sg15_at_[hidden]> wrote:
>
> > We are talking about compiling C++ here. Just to be clear, are you
> > saying that making a directory per TU is a significant overhead
> > relative to actually doing the compile? I could _maybe_ see that if
> > you have 10k very small TUs and you are using a remote filesystem from
> > halfway around the world. But in any realistic scenario, I would
> > expect an mkdir to be well under 1% or even 0.1% of the cost of the
> > compile. Plus, you only need to mkdir on clean builds. This should
> > have no impact on incremental build times.
>
> We have gotten reports that some people are building huge code bases
> over NFS, where absolutely everything is slow. I personally would
> never do that, and the advice I'd give them is "migrate off of that
> setup as fast as possible", but giving that advice is not very productive.
>
> Which brings us to a much more important topic, a sort of "meta
> discussion" if you will: are the things we are doing here meant to be
> aspirational, as in "these are the workflows and setups we recommend;
> they might take some work by people with bad build setups today, but
> they are much better for everybody in the long and even medium term",
> or are they meant to be conservative, as in "we can never recommend
> any course of action that some project somewhere in unknown space
> might not be able to follow"?
>
> Because if it is the latter, then our hands are tied and we can never
> do anything. Developers are _exceedingly_ good at coming up with weird
> corner cases that might be a problem under some circumstances. This
> thread itself has several examples of this phenomenon (yes, this
> includes me, guilty as charged). This makes the tooling situation even
> worse than the language's, at least from a certain point of view. The
> language maintains backwards compatibility with old code (and pays a
> heavy price for it), but that is by choice: it has been their design
> goal since day one, and they have had several highly skilled people
> working on this problem for decades. Their backwards compatibility is
> also visible: you can inspect and verify it using only public
> information. None of this is true for tooling.
>
> Nobody has, at any time, promised or even implied that any specific
> method of building code would keep on working, especially in the face
> of major changes. Yet, we are in a situation where we have to keep
> backwards compatibility with build setups that are hidden behind
> corporate firewalls. Trying to guess what they are doing via
> divination and tea leaves and then doing design work based on that is
> not productive, especially if the most common reply you are going to
> get is "well you know it might be that X could cause Y due to A, B or
> C and somebody might be doing it, or a related thing Z which might or
> might not be affected so NACK".
>
> I could go on, but I'd rather not. Instead, let me try to boil this
> down to a single question:
>
> "Updating code bases to use modules requires changes to the source
> code. Are we fine with accepting that it might require changes to
> build setups as well?"
> _______________________________________________
> SG15 mailing list
> SG15_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg15
>
Received on 2024-12-01 04:15:35