
Re: Proposal for module metadata format to be used by the std library and others

From: Olga Arkhipova <olgaark_at_[hidden]>
Date: Wed, 13 Dec 2023 07:17:16 +0000
Thanks, Daniel, for writing all this down.

It looks like there is currently not much difference between a library manifest and module metadata. So I think there is not much harm in treating it as library manifest v0; we can always ignore fields later if we decide otherwise.

If we do it as a library manifest, we have to have the library name there.
Besides being useful in diagnostic messages and output locations (and we do use it in the current MS implementation), it is needed to allow future library reference resolution. At least on Windows, the location of the manifest does not give any information about the library it represents, so we need the name in the manifest itself.

The library name has to be unique to allow unambiguous referencing in other libraries. I don’t know if there are other libraries in the world which provide the same functionality besides STL, but to allow the replacement, we should also have something like “library-type” or “library-common-name”: something similar to is-std-library, but more generic. For std libs it will be set to “std”, so referencing libraries can just use “std” if they allow the substitution. I also think that this is a library attribute and not a module attribute, but if there are some nuances I am not aware of, we can certainly have additional attributes for each module overriding the general library one.

So, to make it visual, I am proposing to have

{
   "version": 1,
   "revision": 0,
   "library": "unique name like Microsoft.STL",
   "c++": {
      "modules": []
   },
   "referenced-libraries": []   <-- not present for std libs, but other libraries can reference "std" and other libs.
}
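
To make the substitution idea concrete, here is a minimal sketch of how a build system might resolve one entry of "referenced-libraries" against a set of already-loaded manifests. Everything beyond the "library" and "referenced-libraries" keys is an assumption for illustration: the key name "library-common-name" is only one of the two names floated above, the lookup order (unique name first, common name second) is not specified anywhere in this thread, and the manifest contents are made up.

```python
def resolve_library_reference(reference, manifests):
    """Return the manifest that a "referenced-libraries" entry points to.

    Hypothetical lookup order: exact unique name first, then the
    substitutable common name (for example "std").
    """
    for manifest in manifests:
        if manifest.get("library") == reference:
            return manifest
    # "library-common-name" is an assumed key name, not an agreed one.
    matches = [m for m in manifests
               if m.get("library-common-name") == reference]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"no library matches {reference!r}")
    raise LookupError(f"reference {reference!r} is ambiguous")


# Illustrative manifests only; the second library name is invented.
manifests = [
    {"library": "Microsoft.STL", "library-common-name": "std"},
    {"library": "Example.Networking", "referenced-libraries": ["std"]},
]
std_manifest = resolve_library_reference("std", manifests)
```

With a unique name such as "Microsoft.STL" the first loop matches directly; with "std" the common-name fallback is used, which is what would let a different standard library be substituted.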


From: SG15 <sg15-bounces_at_[hidden]> On Behalf Of Daniel Ruoso via SG15
Sent: Tuesday, December 12, 2023 13:57
To: sg15_at_[hidden]ocpp.org
Cc: Daniel Ruoso <daniel_at_[hidden]>
Subject: [SG15] Proposal for module metadata format to be used by the std library and others

As discussed in the meeting on 2023-12-12, I'm putting together a proposal for the metadata format to be used both when discovering the std modules as well as for pre-built libraries in general.

There was general agreement that while the std library always needs special cases, we will need some metadata format to describe the std modules, and it would be unfortunate if the format used for the std library was different than that used in other scenarios.

There was some desire to have this be a starting point for a wider metadata format to cover other aspects beyond modules, so this proposal will be split in two: first covering modules, and later offering an option on how that information could be embedded into a wider format.

Before jumping into the format itself, here's the list of requirements that guided the design:

 * Module discovery should be subordinate to ABI decisions having been made already. The outcome is that we don't expect the module metadata to be used to discover which ABI settings are available for a given standard library or how to choose between them. The way that this manifests is that, following the lead from P2577R2, we expect the libraries to ship those metadata files on a one-to-one mapping with the library binary used as input to the linker.

 * The path to the module source files should be discovered via the metadata file. The outcome is that we don't expect a mechanism to search for those module files in a directory, nor a specific standard on how they should be named; rather, the build system will be given that information directly. This also leaves the standard library free to split its implementation into interface partitions for ease of maintenance.

 * The compiler should offer a mechanism to introspect the path to the metadata file of the standard library, given the settings that the compiler will be used with. That means build systems don't need to perform the lookup of the metadata.

 * The format should be extensible to cover vendor-specific settings.

 * The format should be versioned to allow future backward-compatible changes.

 * The format should allow indicating when the intention is to build the std module, since the compiler should reject a module accidentally named in a way that conflicts with the std module.

Requirements that are likely to become important in the future, but that are not otherwise included in this proposal:

 * Pre-built libraries, including the std library, may want to advertise pre-built BMIs to be reused by the build system. This requires more convergence on the mechanisms to identify BMI compatibility, which have been discussed in P2581R2, but not yet supported by implementors.

 * Future integration with more general package management facilities. My expectation is that this is a step in that direction: it may be possible either to embed the data described here in such a format, or to have the path to this metadata referenced from it instead. In particular, this proposal is not trying to specify requirements that would allow discovering when using this would be appropriate or not.

Some a-priori decisions that are assumed in this format:

 * JSON: while there are many competing alternatives for the serialization format for this metadata, various other parts of the tooling ecosystem (such as compile_commands.json and the dependency scanning output) are already using this format, therefore I chose to just stick to it, rather than consider introducing a new serialization format.

 * Relative file paths: Any non-absolute path described in this file will be presumed to have the directory where the metadata file was found as the base for the lookup.
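
The relative-path rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated above; the function name and the metadata file name used in the example are hypothetical.

```python
from pathlib import Path


def resolve_metadata_path(metadata_file, entry):
    """Resolve a path that appears inside a metadata file.

    Absolute paths are returned unchanged; any other path is taken
    relative to the directory containing the metadata file.
    """
    base = Path(metadata_file).parent
    candidate = Path(entry)
    return candidate if candidate.is_absolute() else base / candidate


# Hypothetical location of a metadata file shipped next to the library.
metadata_file = Path("/usr/lib/libexample.modules.json")
std_source = resolve_metadata_path(metadata_file, "modules/std.cppm")
```

The lookup is purely lexical here: no symlink resolution or existence check is done, since the text above does not say whether either is expected.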

Since the goal of this proposal is to evaluate specific usage, I will prioritize describing the file with examples, rather than writing a JSON Schema for it. The final design should still be encoded that way, but I feel the JSON Schema format would make this conversation harder to maintain.

# Envelope

The first thing I want to address is the envelope that will contain the module-specific metadata. For that there are two options:

## Module-specific file

If we decide to go with a file that refers only to module metadata, the envelope could look like:

{
   "version": 1,
   "revision": 0,
   "modules": []
}

## Wider "library manifest" file

If we decide we should go with a wider format for future extensibility, the envelope could look like:

{
   "version": 1,
   "revision": 0,
   "c++": {
      "modules": []
   }
}
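
Whichever envelope is chosen, the consumer side stays small. The sketch below accepts both shapes and returns the modules array; the function name is hypothetical, and treating a missing "modules" key as an empty list is an assumption, since the proposal does not say whether the key is mandatory.

```python
import json


def modules_from_envelope(text):
    """Return the "modules" array from either proposed envelope."""
    doc = json.loads(text)
    if "c++" in doc:
        # Wider "library manifest" envelope: modules live under "c++".
        return doc["c++"].get("modules", [])
    # Module-specific envelope: modules live at the top level.
    return doc.get("modules", [])


module_specific = '{"version": 1, "revision": 0, "modules": []}'
library_manifest = '{"version": 1, "revision": 0, "c++": {"modules": []}}'
```

Both sample strings yield the same (empty) list, so the rest of a build system would not need to know which envelope was on disk.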

# The modules value

In both possible envelopes, module metadata is represented as an array of objects describing all the importable module units provided by this library that may be reachable by a consumer of this library.

While we currently don't expect the std module to depend on any module not provided by the std library itself, there are already situations where the std library has external dependencies (e.g.: tbb for libstdc++). This is an area that needs further exploration in the future, and the compiler may need to report several module metadata files, rather than just one.

## Describing a module

The following keys and values are expected in the object in the modules array:

 * logical-name (mandatory): The name of the module being provided; the same semantics as in P1689 apply.

 * is-interface (optional, defaults to true): Describes whether this unit contributes to the external interface of the module; the same semantics as in P1689 apply.

 * source-path (mandatory): The path to the source code of the importable unit. If expressed as a relative path, lookup is done from the directory where the module metadata was found.

 * is-std-library (optional, default to false): Indicates that the module is allowed to use names that are reserved to the standard library.

 * local-arguments (optional): an object describing arguments that should be applied when translating this particular importable unit, but that don't need to be applied when compiling the translation units importing this module:

 ** include-directories (optional): an array of paths that need to be appended to the compilation include search path, same semantics as appending -I in gcc and clang.

 ** system-include-directories (optional): an array of paths that need to be appended to the compilation include path as system locations, same semantics as appending -isystem in gcc and clang.

 ** definitions: an array of objects to be appended in order.

 *** name (mandatory): the name of the definition to be used

 *** value (optional): the value to be set. If missing, it is the equivalent of -DFOO in gcc and clang.

 *** undef (optional, defaults to false, incompatible with value): The equivalent of -UFOO in gcc and clang.

 *** vendor (optional): extension point for vendor-specific configurations. This is an object where the key is the name of the vendor, and the value is implementation-defined.

 * vendor (optional): extension point for vendor-specific information. This is an object where the key is the name of the vendor and the value is implementation-defined.
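
As a rough illustration of how a build system might consume local-arguments, the sketch below turns one such object into gcc/clang-style flags, following the -I, -isystem, -D and -U equivalences given above. The function name is hypothetical, vendor entries are ignored, and relative include paths are passed through unresolved; a real implementation would first resolve them against the metadata file's directory.

```python
def local_arguments_to_flags(local_arguments):
    """Translate a "local-arguments" object into gcc/clang-style flags."""
    flags = []
    for path in local_arguments.get("include-directories", []):
        flags += ["-I", path]
    for path in local_arguments.get("system-include-directories", []):
        flags += ["-isystem", path]
    # Definitions are appended in the order in which they are listed.
    for definition in local_arguments.get("definitions", []):
        name = definition["name"]
        if definition.get("undef", False):
            if "value" in definition:
                raise ValueError(f"{name}: 'undef' is incompatible with 'value'")
            flags.append(f"-U{name}")
        elif "value" in definition:
            flags.append(f"-D{name}={definition['value']}")
        else:
            flags.append(f"-D{name}")
    return flags


# Illustrative input; the paths and macro names are made up.
example_flags = local_arguments_to_flags({
    "include-directories": ["include"],
    "system-include-directories": ["third_party/include"],
    "definitions": [
        {"name": "FOO"},
        {"name": "BAR", "value": "1"},
        {"name": "BAZ", "undef": True},
    ],
})
```

Emitting "-I" and the path as separate arguments avoids any quoting concerns when the list is handed to a process-spawning API rather than a shell.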

# Example

Here's what I would expect that to look like for a standard library (assuming the modules file for now), such as libc++:

{
  "version": 1,
  "revision": 1,
  "modules": [
    {
      "logical-name": "std",
      "source-path": "modules/std.cppm",
      "is-std-library": true
    },
    {
      "logical-name": "std.compat",
      "source-path": "modules/std.compat.cppm",
      "is-std-library": true
    },
    {
      "logical-name": "std:someinterfacepartition",
      "source-path": "modules/std-someinterfacepartition.cppm",
      "is-std-library": true
    }
  ]
}

Note that this specifically doesn't use any of the local arguments, because I don't really think that's going to be needed for the standard library case. The only special case is the is-std-library key, to allow the build system to know this is not an accidental collision with the reserved names. We may decide not to settle the local-arguments part of the proposal now for that reason.
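
A build system could use the key to catch accidental collisions along these lines. This is a simplified sketch with a hypothetical function name: it treats a module name as reserved when its first dot-separated component is "std" followed by zero or more digits, and it does not check the other reserved-identifier rules of the language.

```python
import re


def find_reserved_name_collisions(modules):
    """List logical names that look reserved for the standard library
    but do not set "is-std-library" (which defaults to false)."""
    collisions = []
    for module in modules:
        name = module["logical-name"]
        primary = name.split(":", 1)[0]       # drop a partition suffix
        first_component = primary.split(".", 1)[0]
        reserved = re.fullmatch(r"std\d*", first_component) is not None
        if reserved and not module.get("is-std-library", False):
            collisions.append(name)
    return collisions


# Illustrative entries; "std.mine" and "stdx" are made-up module names.
collisions = find_reserved_name_collisions([
    {"logical-name": "std", "source-path": "modules/std.cppm",
     "is-std-library": True},
    {"logical-name": "std.mine", "source-path": "mine.cppm"},
    {"logical-name": "stdx", "source-path": "stdx.cppm"},
])
```

Here only "std.mine" is reported: "std" is explicitly marked, and "stdx" does not match the reserved pattern.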


Received on 2023-12-13 07:17:20