Date: Fri, 11 Feb 2022 20:00:47 -0500
I think the open question is whether we can assume that "third-party" (and
second-party) modules are going to be delivered with something like a
library. I know we've talked internally about shipping component modules,
possibly rolled up into package modules, possibly further rolled up into
package group modules, where the package group is usually the library
equivalent: the large thing that is shipped.

At the core, we're trying to find a way to bootstrap from `import
some_module` to something that describes some_module, and across all the
build systems out there I do not see any practical alternative to having a
locatable file named some_module.something. Finding it in an open search is
the problem. It's either in a place mechanically derived from a file we can
already find, like the library that is the package, or described by
pkg-config metadata, or in some well-known location, like
/usr/share/c++/modules/{module-name}. Honestly, as long as I am not
required to scan all files on every accessible filesystem, I could even
live with all of the above, and have a treaty about which to look for
first. Which I suppose would be the TR we're kicking about.

I also realize I don't spend that much time on Windows, where it's less
usual to have shared locations like {/,/usr/,/usr/local}/lib to look for
things in, and instead you point at a directory, somehow, that is the
package. Or Conan does some magic.
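
To make the "treaty" idea concrete, here is a minimal sketch in Python of one possible lookup order. Everything in it is an assumption for illustration: the function name, the `pkgconfig_hint` parameter (pkg-config has no such field today), and the particular ordering are hypothetical; only the ".modules" suffix and the /usr/share/c++/modules path come from this thread.

```python
from pathlib import PurePosixPath

# Well-known directory taken from the example path in this message.
WELL_KNOWN_DIR = PurePosixPath("/usr/share/c++/modules")


def candidate_locations(module_name, library_files, pkgconfig_hint=None):
    """Return candidate metadata paths in priority order.

    Pure path arithmetic: nothing is read from the filesystem, so the
    caller decides how to probe each candidate.
    """
    candidates = []
    # 1. Mechanically derived from a file we can already find: swap the
    #    library's last extension for ".modules".
    for lib in library_files:
        candidates.append(PurePosixPath(lib).with_suffix(".modules"))
    # 2. Explicitly described by package metadata, if the package manager
    #    supplied a path (hypothetical hint, not an existing pkg-config field).
    if pkgconfig_hint is not None:
        candidates.append(PurePosixPath(pkgconfig_hint))
    # 3. Well-known location keyed by module name.
    candidates.append(WELL_KNOWN_DIR / module_name)
    return [str(p) for p in candidates]
```

The point of returning an ordered list is that the search stays bounded: a build system probes a handful of known paths rather than scanning. Versioned names such as libbarbaz.so.15 are not handled here, since only the last extension is replaced.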
On Fri, Feb 11, 2022 at 5:12 PM Daniel Ruoso via SG15 <sg15_at_[hidden]>
wrote:
> Hello,
>
> The conversations we've been having over the past few weeks have been
> challenging, but I think it has been very good to allow time for us to
> mature the ideas.
>
> After the latest round of conversations, something crossed my mind
> that I hadn't connected before, so here goes a summary of it. I want
> to get a general feeling from you all before I sit down to write a
> proper paper.
>
> When I started the original proposal, my main constraint was that I
> couldn't find a way to build on top of the build systems and package
> managers for distributing a C++ module library because there's not
> enough convergence there.
>
> The thing that only now occurred to me is that there is one thing
> where there is a convergence, even if just a very thin one, mostly
> surrounded by implementation-defined semantics, and that is "the link
> line".
>
> What I mean by this is that pretty much any
> build-system+package-manager combo currently needs to find some
> solution to assembling the link line when consuming a library that was
> shipped as a prebuilt artifact.
>
> So, even if we don't have convergence on how that "link line" is
> assembled, we do have convergence that it needs to be.
>
> The breakthrough I had was to understand that it would be fine to
> build the convention for module libraries entirely on top of the
> implementation-defined bits.
>
> Ok, enough preamble, here's the actual idea:
>
> From the publishing side
> ===================
>
> When shipping a C++ library with modules, you will ship a metadata
> file alongside the library artifact (e.g.: .a, .so, .dll, etc) where
> the implementation-defined library extension is replaced by
> ".modules".
>
> That file will contain the metadata necessary to find which BMIs were
> provided off-the-shelf (with information to match the compatibility
> requirements) as well as all the necessary information to produce your
> own BMI when needed -- including path to the interface unit, module
> dependencies, include directories, and compile definitions.
>
> That metadata will cover all modules provided by that library.
>
> From the consuming side
> ===================
>
> The package manager and the build system will use their
> implementation-defined methods to assemble the link line.
>
> Given a link line, they can use the same implementation-defined method
> used to find the library artifacts (e.g.: .a, .so), and look for a
> file that is alongside that artifact with the implementation-defined
> extension replaced by ".modules".
>
> In a POSIX system, using pkg-config, it would look something like this:
>
> Step 1, find the link line:
>
> $ pkg-config --libs --static foobar
> -L/usr/lib -lfoobar -lbarbaz -lbazqux -lquxqux
>
> Step 2, find the library files
>
> /usr/lib/libfoobar.a
> /usr/lib/libbarbaz.so
> /usr/lib/libbazqux.a
> /usr/lib/libquxqux.so
>
> Step 3, locate .modules files alongside those libraries, if those
> exist (non-module libraries wouldn't provide them)
>
> /usr/lib/libfoobar.modules
> /usr/lib/libbarbaz.modules
>
> Step 4, read the metadata files and assemble the entire module graph
> for the modules external to the build system.
>
> I am not familiar with Windows development, but I am moderately
> confident an analogous set of steps can be found.
>
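As a rough illustration of steps 1 to 3 above, the following Python sketch parses a link line, resolves each -l name against the -L directories, and collects the adjacent .modules files. It is a sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, only -L and -l tokens are understood, the ".so before .a" preference is a guess at typical linker behaviour, and the linker's default search directories are ignored.

```python
import os
import posixpath
import shlex

# Assumed library suffixes to probe; real linkers are configurable here.
LIB_SUFFIXES = (".so", ".a")


def parse_link_line(link_line):
    """Split a link line into (-L directories, -l library names)."""
    dirs, names = [], []
    for tok in shlex.split(link_line):
        if tok.startswith("-L") and len(tok) > 2:
            dirs.append(tok[2:])
        elif tok.startswith("-l") and len(tok) > 2:
            names.append(tok[2:])
    return dirs, names


def find_modules_files(link_line, exists=os.path.exists):
    """Return the .modules files that sit next to the linked libraries.

    `exists` is injectable so the logic can be exercised without a real
    filesystem.
    """
    dirs, names = parse_link_line(link_line)
    found = []
    for name in names:
        for d in dirs:
            stem = posixpath.join(d, "lib" + name)
            if any(exists(stem + suffix) for suffix in LIB_SUFFIXES):
                meta = stem + ".modules"
                if exists(meta):  # non-module libraries ship no metadata
                    found.append(meta)
                break  # first directory that provides the library wins
    return found
```

With a fake filesystem holding the six files listed in steps 2 and 3, calling `find_modules_files("-L/usr/lib -lfoobar -lbarbaz -lbazqux -lquxqux", exists=files.__contains__)` yields the two .modules paths from step 3. Step 4 (reading the metadata) is left out because the file format is still undecided.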
> Interesting side effects
> =================
>
> 1. It strengthens the relationship between the library artifact and
> the parsing of the module interface unit: since the metadata file is
> made available alongside the library file, we can be confident that
> the parsing you're doing is consistent with that library artifact.
>
> 2. Because it's a very thin extension to a lot of
> implementation-defined bits, it requires very little to no change at
> all to the package management side. Even something like pkg-config
> would remain fully useful as-is in this case.
>
> 3. In the case where libraries contain lots of modules, it could
> actually represent a performance gain when compared to finding each
> module independently, since you would read the information about all
> the modules in the library in one go.
>
> 4. Since it's limited to the things in the link line, this cost is not
> affected by the number of libraries available in the system.
>
> Important caveats
> ==============
>
> 1. Libraries that want to avoid shipping object code (i.e.: the module
> equivalent to header-only libraries) will still need to ship an
> archive for them to be discoverable.
>
> 2. If you have a shared object that encapsulates the link of other
> libraries, your modules will need to encapsulate modules provided by
> those. E.g.: if libbarbaz.so.15 is listed as a DT_NEEDED entry of your
> shared object, and you don't want folks to have -lbarbaz in their link
> line, you need to make sure your module interface units don't import
> the modules from barbaz.
>
> Other considerations
> ================
>
> Could we just put it inside of the library archive? Yes. But that
> would not be necessarily true for shared objects, and furthermore it
> would cause the build system to do lots of sparse reads on a
> potentially very large file. Therefore, I contend that having it as an
> additional file is more interesting.
>
> Can it be relocatable? It likely makes sense for it to be specified
> that the metadata file can find files relative to itself, e.g.: either
> using something like $ORIGIN or just assuming that non-absolute paths
> are relative to the metadata file. But as with my other proposal,
> package managers may need to replace additional variables at install
> time.
>
> Could we just put everything in a zip file? Probably, but again, this
> comes at the cost of doing lots of sparse reads on potentially very
> large files, so it's likely better to have the library installed
> unpacked.
>
> What is the format of that metadata file? It likely needs to be
> something similar to P1689R4, although it needs to account for either
> not having a BMI or even having multiple BMIs.
>
> Shouldn't the file be c++ specific? Probably not. In fact, IIRC, the
> Kitware proposal went through some lengths to make sure we weren't
> unnecessarily diverging from Fortran modules, so maybe this could host
> information for both C++ and Fortran.
>
> So... what do y'all think?
>
> daniel
> _______________________________________________
> SG15 mailing list
> SG15_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg15
>
Received on 2022-02-12 01:01:03