Re: Exploring another alternative for Distributing C++ Module Libraries

From: David Blaikie <dblaikie_at_[hidden]>
Date: Fri, 11 Feb 2022 14:28:39 -0800
Plausible. I meant to ask about something like this, as it seemed that
the filename/module-name-based lookup wasn't an intrinsic part of the
proposal/requirements. The requirement is just that the cost be
O(libraries) rather than O(modules), so it's consistent with existing
costs. Something that requires /some/ extra work (like another flag) would
be plausible, but equally, using an existing mechanism like the library
search path might work too.

It seems unfortunate to lose some of the portability provided by relative
paths to interface definitions when having the metadata file in the include
path instead of the library search path, but if you reckon that having to
edit/modify/generate these files at install time is feasible/not
prohibitive, I could believe that.

On Fri, Feb 11, 2022 at 2:12 PM Daniel Ruoso via SG15 <sg15_at_[hidden]>
wrote:

> Hello,
>
> The conversations we've been having over the past few weeks have been
> challenging, but I think it has been very good to allow time for us to
> mature the ideas.
>
> After the latest round of conversations, something crossed my mind
> that I hadn't connected before, so here goes a summary of it. I want
> to get a general feeling from you all before I sit down to write a
> proper paper.
>
> When I started the original proposal, my main constraint was that I
> couldn't find a way to build on top of the build systems and package
> managers for distributing a C++ module library because there's not
> enough convergence there.
>
> The thing that only now occurred to me is that there is one thing
> where there is a convergence, even if just a very thin one, mostly
> surrounded by implementation-defined semantics, and that is "the link
> line".
>
> What I mean by this is that pretty much any
> build-system+package-manager combo currently needs to find some
> solution to assembling the link line when consuming a library that was
> shipped as a prebuilt artifact.
>
> So, even if we don't have convergence on how that "link line" is
> assembled, we do have convergence that it needs to be.
>
> The breakthrough I had was to understand that it would be fine to
> build the convention for module libraries entirely on top of the
> implementation-defined bits.
>
> Ok, enough preamble, here's the actual idea:
>
> From the publishing side
> ===================
>
> When shipping a C++ library with modules, you will ship a metadata
> file alongside the library artifact (e.g.: .a, .so, .dll, etc) where
> the implementation-defined library extension is replaced by
> ".modules".
>
> That file will contain the metadata necessary to find which BMIs were
> provided off-the-shelf (with information to match the compatibility
> requirements) as well as all the necessary information to produce your
> own BMI when needed -- including path to the interface unit, module
> dependencies, include directories, and compile definitions.
>
> That metadata will cover all modules provided by that library.
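
As a rough illustration of the kind of record such a metadata file would need to carry, here is a minimal sketch. The proposal does not define a format, so every class name, field name, and compatibility key below is hypothetical; only the categories of information (shipped BMIs with compatibility data, interface unit path, module dependencies, include directories, compile definitions) come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class PrebuiltBMI:
    # Path to a BMI shipped with the library (hypothetical field name).
    path: str
    # Whatever is needed to decide compatibility; the keys are made up here.
    compatibility: dict


@dataclass
class ModuleEntry:
    name: str                     # module name
    interface_unit: str           # path to the module interface unit
    imports: list = field(default_factory=list)       # module dependencies
    include_dirs: list = field(default_factory=list)  # include directories
    definitions: list = field(default_factory=list)   # compile definitions
    bmis: list = field(default_factory=list)          # zero or more shipped BMIs


def pick_bmi(entry, wanted):
    """Return the path of a shipped BMI whose compatibility data matches
    exactly, or None if the consumer has to build its own BMI."""
    for bmi in entry.bmis:
        if bmi.compatibility == wanted:
            return bmi.path
    return None
```

Exact equality on the compatibility data is the simplest possible rule and is only an assumption; a real matching rule would be compiler-specific.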
>
> From the consuming side
> ===================
>
> The package manager and the build system will use their
> implementation-defined methods to assemble the link line.
>
> Given a link line, they can use the same implementation-defined method
> used to find the library artifacts (e.g.: .a, .so), and look for a
> file that is alongside that artifact with the implementation-defined
> extension replaced by ".modules".
>
> In a POSIX system, using pkg-config, it would look something like this:
>
> Step 1, find the link line:
>
> $ pkg-config --libs --static foobar
> -L/usr/lib -lfoobar -lbarbaz -lbazqux -lquxqux
>
> Step 2, find the library files
>
> /usr/lib/libfoobar.a
> /usr/lib/libbarbaz.so
> /usr/lib/libbazqux.a
> /usr/lib/libquxqux.so
>
> Step 3, locate .modules files alongside those libraries, if those
> exist (non-module libraries wouldn't provide them)
>
> /usr/lib/libfoobar.modules
> /usr/lib/libbarbaz.modules
>
> Step 4, read the metadata files and assemble the entire module graph
> for the modules external to the build system.
>
> I am not familiar with Windows development, but I am moderately
> confident an analogous set of steps can be found.
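
Steps 1 to 3 above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: it handles just the joined `-L<dir>` and `-l<name>` forms, it assumes the `lib<name>.so` / `lib<name>.a` naming convention with `.so` preferred, and the function names are invented for this example. A real toolchain's library lookup is implementation-defined and more involved.

```python
import shlex
from pathlib import Path

# Assumed preference order for library artifacts; real linkers differ.
LIB_SUFFIXES = (".so", ".a")


def parse_link_line(link_line):
    """Split a link line into (search directories, library names)."""
    dirs, names = [], []
    for token in shlex.split(link_line):
        if token.startswith("-L") and len(token) > 2:
            dirs.append(Path(token[2:]))
        elif token.startswith("-l") and len(token) > 2:
            names.append(token[2:])
    return dirs, names


def find_library(name, dirs):
    """Return the first lib<name><suffix> found in dirs, or None."""
    for d in dirs:
        for suffix in LIB_SUFFIXES:
            candidate = d / f"lib{name}{suffix}"
            if candidate.is_file():
                return candidate
    return None


def find_module_metadata(link_line):
    """Return the .modules files located next to libraries on the link line."""
    dirs, names = parse_link_line(link_line)
    found = []
    for name in names:
        lib = find_library(name, dirs)
        if lib is None:
            continue  # library not found in the given directories
        metadata = lib.with_suffix(".modules")
        if metadata.is_file():  # non-module libraries ship no such file
            found.append(metadata)
    return found
```

The cost of this lookup grows with the number of libraries on the link line, not with the number of modules they provide.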
>
> Interesting side effects
> =================
>
> 1. It strengthens the relationship between the library artifact and
> the parsing of the module interface unit. Since the metadata file is
> made available alongside the library file, we can be confident that
> you're doing the parsing that is consistent with that library
> artifact.
>
> 2. Because it's a very thin extension to a lot of
> implementation-defined bits, it requires very little to no change at
> all to the package management side. Even something like pkg-config
> would remain fully useful as-is in this case.
>
> 3. In the case where libraries contain lots of modules, it could
> actually represent a performance gain when compared to finding each
> module independently, since you would read the information about all
> the modules in the library in one go.
>
> 4. Since it's limited to the things in the link line, this cost is not
> affected by the number of libraries available in the system.
>
> Important caveats
> ==============
>
> 1. Libraries that want to avoid shipping object code (i.e.: the module
> equivalent to header-only libraries) will still need to ship an
> archive for them to be discoverable.
>
> 2. If you have a shared object that encapsulates the link of other
> libraries, your modules will need to encapsulate the modules provided
> by those. E.g.: if libbarbaz.so.15 is a DT_NEEDED entry of your shared
> object, and you don't want folks to have -lbarbaz in their link line,
> you need to make sure your module interface units don't import the
> modules from barbaz.
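
The second caveat amounts to a check that can be run once the metadata for every library on the link line has been read: every imported module must be provided by some library on that same link line. A minimal sketch, assuming the metadata has already been reduced to a plain mapping (the input shape, function name, and module names are made up for illustration):

```python
def missing_imports(libraries):
    """Report imports that no library on the link line provides.

    `libraries` maps a library name to a dict of
    {module name: [names of modules it imports]}.
    Returns {"<library>:<module>": [unresolved imports]}.
    """
    # Every module provided by any library on the link line.
    provided = {mod for mods in libraries.values() for mod in mods}
    missing = {}
    for lib, mods in libraries.items():
        for mod, imports in mods.items():
            gaps = sorted(i for i in imports if i not in provided)
            if gaps:
                missing[f"{lib}:{mod}"] = gaps
    return missing
```

With only `foobar` on the link line and a `foobar` module that imports a module from `barbaz`, the check reports the gap; adding `barbaz` to the link line clears it.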
>
> Other considerations
> ================
>
> Could we just put it inside of the library archive? Yes. But that
> would not be necessarily true for shared objects, and furthermore it
> would cause the build system to do lots of sparse reads on a
> potentially very large file. Therefore, I contend that having it as an
> additional file is more interesting.
>
> Can it be relocatable? It likely makes sense for it to be specified
> that the metadata file can find files relative to itself, e.g.: either
> using something like $ORIGIN or just assuming that non-absolute paths
> are relative to the metadata file. But as with my other proposal,
> package managers may need to replace additional variables at install
> time.
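
The two relocation options mentioned above could be handled by one small resolver. This is a sketch only: the `$ORIGIN/` prefix handling and the function name are assumptions, no other variables are substituted, and `..` components are not normalized.

```python
from pathlib import PurePosixPath


def resolve(metadata_file, entry_path):
    """Resolve a path found in a metadata file.

    - "$ORIGIN/..." is taken relative to the metadata file's directory.
    - Other non-absolute paths are also taken relative to that directory.
    - Absolute paths are returned unchanged.
    """
    base = PurePosixPath(metadata_file).parent
    prefix = "$ORIGIN/"
    if entry_path.startswith(prefix):
        return base / entry_path[len(prefix):]
    path = PurePosixPath(entry_path)
    return path if path.is_absolute() else base / path
```

For example, `resolve("/usr/lib/libfoobar.modules", "modules/foobar.cppm")` yields `/usr/lib/modules/foobar.cppm`, so the library and its metadata can be moved together without editing the file.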
>
> Could we just put everything in a zip file? Probably, but again, this
> comes at the cost of doing lots of sparse reads on potentially very
> large files, so it's likely better to have the library installed
> unpacked.
>
> What is the format of that metadata file? It likely needs to be
> something similar to P1689R4, although it needs to account for either
> not having a BMI or even having multiple BMIs.
>
> Shouldn't the file be C++-specific? Probably not. In fact, IIRC, the
> Kitware proposal went to some lengths to make sure we weren't
> unnecessarily diverging from Fortran modules, so maybe this could host
> information for both C++ and Fortran.
>
> So... what do y'all think?
>
> daniel
> _______________________________________________
> SG15 mailing list
> SG15_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg15
>

Received on 2022-02-11 22:28:50