Re: Proposal for module metadata format to be used by the std library and others

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 13 Dec 2023 18:11:31 -0500
On 12/13/23 6:04 PM, Olga Arkhipova via SG15 wrote:
>
> >> The question is how to refer to that other library manifest. How does
> the provider of a manifest file for one project know where the
> manifest file for another project will be located post deployment? I think
> the build system will need to provide that information (preferably as
> provided by some package manager).
>
> Yes, this is the idea.
>
> The library manifest would just state the dependency on other
> libraries by their names, without knowing/caring where they actually
> are on the machine.
>
> The build/package manager will know somehow (could be specified by
> multiple mechanisms, including user provided location(s)) where to
> find library manifests, so they can “resolve” the library dependency
> (specified by library name) to specific manifest and concrete files on
> disk for module, etc. dependencies.
>
Good, yes, we're on the same page.
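A minimal sketch of that resolution step, for illustration only. The file-naming convention (`<name>.manifest.json`) and the idea of an ordered list of search directories are assumptions made up for this example; nothing in the proposal fixes either of them.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_manifest(library: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Resolve a library name to a manifest file on disk.

    Hypothetical convention: the manifest for library "foo" is a file
    named "foo.manifest.json" in one of the search directories, which
    would be supplied by the build system or package manager.
    """
    for directory in search_dirs:
        candidate = Path(directory) / f"{library}.manifest.json"
        if candidate.is_file():
            # First match wins; earlier directories take priority.
            return candidate
    return None
```

The manifest itself only names its dependencies; the caller decides which directories to search and in what order.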

Tom.

> Olga
>
> *From:*Tom Honermann <tom_at_[hidden]>
> *Sent:* Wednesday, December 13, 2023 2:37 PM
> *To:* sg15_at_[hidden]
> *Cc:* Olga Arkhipova <olgaark_at_[hidden]>
> *Subject:* Re: [SG15] Proposal for module metadata format to be used
> by the std library and others
>
> On 12/13/23 2:19 PM, Olga Arkhipova via SG15 wrote:
>
> >> I think it will be necessary to support paths that are relative
> to some other parameterized location to accommodate dependencies
> on header files or modules provided by other projects/packages.
>
> If a library can reference other libraries, this is not needed.
> All other dependencies (including module dependencies) will be
> resolved by using referenced library manifest.
>
> We may be thinking similarly here. The question is how to refer to
> that other library manifest. How does the provider of a manifest file
> for one project know where the manifest file for another project will be
> located post deployment? I think the build system will need to provide
> that information (preferably as provided by some package manager).
>
> >> Do implementation module units need to be mentioned at all? I'm
> not opposed to allowing for them, but if I'm following correctly,
> since the metadata file is consumed to satisfy module imports,
> they don't contribute to anything that is relevant to import. I
> would expect is-interface to always be true.
>
> Currently, is-interface=false means that this is an internal
> partition, which has to be built to a BMI and passed as a module
> reference on the command line to be able to use the primary module
> in the code.
>
> But it does not look to me that we need “is-interface” metadata, as
> the build will scan the sources and determine exactly what kind of
> unit each source is.
>
> Actually, I think we should not have “is-interface” as module
> source metadata at all to avoid any conflicts with scan data.
>
> This kind of data (together with dependencies) would be useful for
> BMI, but not the source.
>
> Yes, that matches my intuition.
>
> Tom.
>
> Olga
>
> *From:* SG15 <sg15-bounces_at_[hidden]> *On Behalf Of* Tom
> Honermann via SG15
> *Sent:* Wednesday, December 13, 2023 10:46 AM
> *To:* sg15_at_[hidden]
> *Cc:* Tom Honermann <tom_at_[hidden]>
> *Subject:* Re: [SG15] Proposal for module metadata format to be
> used by the std library and others
>
> On 12/12/23 4:56 PM, Daniel Ruoso via SG15 wrote:
>
> As discussed in the meeting on 2023-12-12, I'm putting
> together a proposal for the metadata format to be used both
> when discovering the std modules as well as for pre-built
> libraries in general.
>
> There was general agreement that while the std library always
> needs special cases, we will need some metadata format to
> describe the std modules, and it would be unfortunate if the
> format used for the std library was different than that used
> in other scenarios.
>
> There was some desire to have this be a starting point for a
> wider metadata format to cover other aspects beyond modules,
> so this proposal will be split in two. First covering modules,
> and later offering an option on how that information could be
> embedded into a wider format.
>
> Before jumping into the format itself, here's the list of
> requirements that guided the design:
>
> * Module discovery should be subordinate to ABI decisions
> having been made already. The outcome is that we don't expect
> the module metadata to be used to discover which ABI settings
> are available for a given standard library or how to choose
> between them. The way that this manifests is that, following
> the lead from P2577R2, we expect the libraries to ship those
> metadata files on a one-to-one mapping with the library binary
> used as input to the linker.
>
> I think someone mentioned it yesterday, but this will presumably
> have to account for multilib libraries in some way.
>
> * The path to the module source files should be discovered
> via the metadata file. The outcome is that we don't expect a
> mechanism to search for those module files in a directory, nor
> a specific standard on how they should be named; rather,
> the build system will be given that information directly. This
> also allows the choice of the standard library splitting its
> implementation into interface partitions for ease of
> maintainability.
>
> * The compiler should offer a mechanism to introspect the
> path to the metadata file of the standard library for the
> settings the compiler will be used with. That means build
> systems don't need to perform the lookup of the metadata.
>
> * The format should be extensible to cover vendor-specific settings.
>
> * The format should be versioned to allow future
> backward-compatible changes.
>
> * The format should allow indicating when the intention is to
> build the std module, since the compiler should reject a
> module accidentally named in a way that conflicts with the std
> module.
>
> Requirements that are likely to become important in the
> future, but that are not otherwise included in this proposal:
>
> * Pre-built libraries, including the std library, may want to
> advertise pre-built BMIs to be reused by the build system.
> This requires more convergence on the mechanisms to identify
> BMI compatibility, which have been discussed in P2581R2, but
> not yet supported by implementors.
>
> * Future integration with more general package management
> facilities. Although my expectation is that this is a step in
> that direction, where it may be possible to either make the
> data described here embedded there, or have the path to this
> metadata referenced instead. Particularly this proposal is not
> trying to specify requirements that would allow discovering
> when using this would be appropriate or not.
>
> Some a-priori decisions that are assumed in this format:
>
> * JSON: while there are many competing alternatives for the
> serialization format for this metadata, various other parts of
> the tooling ecosystem (such as compile_commands.json and the
> dependency scanning output) are already using this format,
> therefore I chose to just stick to it, rather than consider
> introducing a new serialization format.
>
> * Relative file paths: Any non-absolute path described in
> this file will be presumed to have the directory where the
> metadata file was found as the base for the lookup.
>
> I think it will be necessary to support paths that are relative to
> some other parameterized location to accommodate dependencies on
> header files or modules provided by other projects/packages. This
> would help to reduce the otherwise common practice of a build
> system having to concatenate include paths for essentially every
> project it knows about when building any package it knows about.
>
> Since the goal of this proposal is to evaluate specific usage,
> I will prioritize describing the file with examples, rather
> than writing a JSON schema for it. The final design should
> still be encoded that way, but I feel the format of JSON
> schema would make this conversation harder to maintain.
>
> I agree. JSON schema is great ... for later :)
>
> # Envelope
>
> The first thing I want to address is the envelope that will
> contain the module-specific metadata. For that there are two
> options:
>
> ## Module-specific file
>
> If we decide to go with a file that refers only to module
> metadata, the envelope could look like:
>
> {
>
> "version": 1,
>
> "revision": 0,
>
> "modules": []
>
> }
>
> ## Wider "library manifest" file
>
> If we decide we should go with a wider format for future
> extensibility, the envelope could look like:
>
> {
>
> "version": 1,
>
> "revision": 0,
>
> "c++": {
>
> "modules": []
>
> }
>
> }
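A small sketch of how a consumer might accept either envelope, for illustration only. The function name is hypothetical, and trying the module-specific envelope first is an arbitrary choice for this example rather than anything the proposal specifies.

```python
import json


def modules_from_envelope(text: str) -> list:
    """Return the "modules" array from either envelope shape.

    Accepts both the module-specific file ({"modules": [...]}) and the
    wider library manifest ({"c++": {"modules": [...]}}).
    """
    doc = json.loads(text)
    if "modules" in doc:
        # Module-specific file: the array sits at the top level.
        return doc["modules"]
    # Wider manifest: the array is nested under the "c++" key.
    return doc.get("c++", {}).get("modules", [])
```

Either way the consumer ends up with the same array, so the choice of envelope mostly affects what else can live in the file.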
>
> # The modules value
>
> In both possible envelopes, module metadata is represented as
> an array of objects describing all the importable module units
> provided by this library that may be reachable by a consumer
> of this library.
>
> While we currently don't expect the std module to depend on
> any module not provided by the std library itself at this
> point, there are already situations where the std library has
> external dependencies (e.g.: tbb for libstdc++). This is an
> area that needs further exploration in the future, and it may
> be the case that the compiler may need to report several
> module metadata files, rather than just one.
>
> ## Describing a module
>
> The following keys and values are expected in the object in
> the modules array:
>
> * logical-name (mandatory): The name of the module being
> provided; the same semantics as in P1689 apply.
>
> * is-interface (optional, defaults to true): Whether this unit
> contributes to the external interface of the module; the same
> semantics as in P1689 apply.
>
> Do implementation module units need to be mentioned at all? I'm
> not opposed to allowing for them, but if I'm following correctly,
> since the metadata file is consumed to satisfy module imports,
> they don't contribute to anything that is relevant to import. I
> would expect is-interface to always be true.
>
> * source-path (mandatory): The path to the source code of the
> importable unit. If expressed as a relative path, lookup is
> done from the directory where the module metadata was found.
>
> * is-std-library (optional, default to false): Indicates that
> the module is allowed to use names that are reserved to the
> standard library.
>
> * local-arguments (optional): an object describing arguments
> that should be applied when translating this particular
> importable unit, but that don't need to be used in the
> compilation of the translation unit importing this module:
>
> Since local-arguments (which I presume to mean compiler command
> line options) are necessarily implementation specific, I think
> this should either be generalized or named such that it reflects
> an implementation dependency. Perhaps:
>
> "local-arguments": [
> { "gcc-compatible": [ "-fconstexpr-depth=512" ] },
> { "cl-compatible": [ "/constexpr:depth512" ] },
> { "circle": [ "--ftemplate-depth=768" ] }
> ]
>
> Tom.
>
> ** include-directories (optional): an array of paths that
> need to be appended to the compilation include search path,
> same semantics as appending -I in gcc and clang.
>
> ** system-include-directories (optional): an array of paths
> that need to be appended to the compilation include path as
> system locations, same semantics as appending -isystem in gcc
> and clang.
>
> ** definitions: an array of objects to be appended in order.
>
> *** name (mandatory): the name of the definition to be used
>
> *** value (optional): the value to be set. If missing, it is
> the equivalent of -DFOO in gcc and clang.
>
> *** undef (optional, defaults to false, incompatible with
> value): The equivalent of -UFOO in gcc and clang.
>
> *** vendor (optional): extension point for vendor-specific
> configurations. This is an object where the key is the name of
> the vendor, and the value is implementation-defined.
>
> * vendor (optional): extension point for vendor-specific
> information. This is an object where the key is the name of
> the vendor and the value is implementation-defined.
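To make the local-arguments mapping concrete, here is a rough sketch of how a build system might turn such an object into gcc/clang-style flags, following the -I, -isystem, -D and -U equivalences given in the list above. The function name is hypothetical, vendor keys are ignored, and paths are passed through without being resolved against the metadata file's directory.

```python
from typing import List


def local_arguments_to_flags(local_args: dict) -> List[str]:
    """Translate a local-arguments object into gcc/clang-style flags."""
    flags: List[str] = []
    for path in local_args.get("include-directories", []):
        flags.append(f"-I{path}")
    for path in local_args.get("system-include-directories", []):
        flags.extend(["-isystem", path])
    # Definitions are appended in order, as the list above requires.
    for definition in local_args.get("definitions", []):
        name = definition["name"]  # mandatory key
        if definition.get("undef", False):
            if "value" in definition:
                raise ValueError(f"{name}: undef is incompatible with value")
            flags.append(f"-U{name}")
        elif "value" in definition:
            flags.append(f"-D{name}={definition['value']}")
        else:
            flags.append(f"-D{name}")
    return flags
```

A cl-compatible driver would need its own mapping, which is where a vendor- or family-keyed form of local-arguments would come in.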
>
> # Example
>
> Here's how I would expect that to look for a standard
> library (assuming the modules file for now), such as libc++:
>
> {
>
> "version": 1,
>
> "revision": 1,
>
> "modules": [
>
> {
>
> "logical-name": "std",
>
> "source-path": "modules/std.cppm",
>
> "is-std-library": true
>
> },
>
> {
>
> "logical-name": "std.compat",
>
> "source-path": "modules/std.compat.cppm",
>
> "is-std-library": true
>
> },
>
> {
>
> "logical-name": "std:someinterfacepartition",
>
> "source-path": "modules/std-someinterfacepartition.cppm",
>
> "is-std-library": true
>
> }
>
> ]
>
> }
>
> Note that this specifically doesn't use any of the local
> arguments, because I don't really think that's going to be
> needed for the standard library case. The only special case is
> the is-std-library key, to allow the build system to know
> this is not an accidental collision with the reserved names.
> We may decide not to settle the local-arguments part of the
> proposal now for that reason.
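For illustration, a minimal sketch of how a build system might consume a module-specific file like the example above: it applies the documented defaults for the optional keys and resolves relative source paths against the directory containing the metadata file. The function name and the shape of the returned records are assumptions for this example only.

```python
import json
from pathlib import Path


def load_modules(manifest_path) -> list:
    """Read a module-specific metadata file and normalize its entries."""
    manifest_path = Path(manifest_path)
    base = manifest_path.parent  # base directory for relative paths
    doc = json.loads(manifest_path.read_text(encoding="utf-8"))
    modules = []
    for entry in doc.get("modules", []):
        source = Path(entry["source-path"])  # mandatory key
        if not source.is_absolute():
            source = base / source
        modules.append({
            "logical-name": entry["logical-name"],  # mandatory key
            "source-path": source,
            # Optional keys, with the defaults from the key list.
            "is-interface": entry.get("is-interface", True),
            "is-std-library": entry.get("is-std-library", False),
        })
    return modules
```

With this, a build system only needs the path to the metadata file in order to locate every importable unit it describes.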
>
> Daniel
>
> _______________________________________________
>
> SG15 mailing list
>
> SG15_at_[hidden]
>
> https://lists.isocpp.org/mailman/listinfo.cgi/sg15

Received on 2023-12-13 23:11:32