Date: Wed, 13 Dec 2023 17:37:15 -0500
On 12/13/23 2:19 PM, Olga Arkhipova via SG15 wrote:
>
> >> I think it will be necessary to support paths that are relative to
> some other parameterized location to accommodate dependencies on
> header files or modules provided by other projects/packages.
>
> If a library can reference other libraries, this is not needed. All
> other dependencies (including module dependencies) will be resolved by
> using the referenced library's manifest.
>
We may be thinking similarly here. The question is how to refer to that
other library manifest. How does the provider of a manifest file for one
project know where the manifest file for another project will be located
post deployment? I think the build system will need to provide that
information (preferably as provided by some package manager).
>
> >> Do implementation module units need to be mentioned at all? I'm not
> opposed to allowing for them, but if I'm following correctly, since
> the metadata file is consumed to satisfy module imports, they don't
> contribute to anything that is relevant to import. I would expect
> is-interface to always be true.
>
> Currently, is-interface=false means that this is an internal
> partition, which has to be built to a BMI and passed as a module
> reference on the command line to be able to use the primary module in
> the code.
>
> But it does not look to me like we need “is-interface” metadata, as
> the build will scan the sources and determine exactly what each source is.
>
> Actually, I think we should not have “is-interface” as module source
> metadata at all to avoid any conflicts with scan data.
>
> This kind of data (together with dependencies) would be useful for
> BMI, but not the source.
>
Yes, that matches my intuition.
Tom.
> Olga
>
> *From:*SG15 <sg15-bounces_at_[hidden]> *On Behalf Of *Tom
> Honermann via SG15
> *Sent:* Wednesday, December 13, 2023 10:46 AM
> *To:* sg15_at_[hidden]
> *Cc:* Tom Honermann <tom_at_[hidden]>
> *Subject:* Re: [SG15] Proposal for module metadata format to be used
> by the std library and others
>
> On 12/12/23 4:56 PM, Daniel Ruoso via SG15 wrote:
>
> As discussed in the meeting on 2023-12-12, I'm putting together a
> proposal for the metadata format to be used both when discovering
> the std modules as well as for pre-built libraries in general.
>
> There was general agreement that while the std library always
> needs special cases, we will need some metadata format to describe
> the std modules, and it would be unfortunate if the format used
> for the std library was different than that used in other scenarios.
>
> There was some desire to have this be a starting point for a wider
> metadata format to cover other aspects beyond modules, so this
> proposal will be split in two. First covering modules, and later
> offering an option on how that information could be embedded into
> a wider format.
>
> Before jumping into the format itself, here's the list of
> requirements that guided the design:
>
> * Module discovery should be subordinate to ABI decisions having
> been made already. The outcome is that we don't expect the module
> metadata to be used to discover which ABI settings are available
> for a given standard library or how to choose between them. The
> way that this manifests is that, following the lead from P2577R2,
> we expect the libraries to ship those metadata files on a
> one-to-one mapping with the library binary used as input to the
> linker.
>
> I think someone mentioned it yesterday, but this will presumably have
> to account for multilib libraries in some way.
>
> * The path to the module source files should be discovered via
> the metadata file. The outcome is that we don't expect a
> mechanism to search for those module files in a directory, nor a
> specific standard on how they should be named, but rather the
> build system will be given that information directly. This also
> allows the choice of the standard library splitting its
> implementation into interface partitions for ease of maintainability.
>
> * The compiler should offer a mechanism to introspect the path to
> the metadata file of the standard library, with the settings that
> the compiler will be used with. That means build systems don't need to
> perform the lookup of the metadata.
>
> * The format should be extensible to cover vendor-specific settings.
>
> * The format should be versioned to allow future backward-compatible
> changes.
>
> * The format should allow indicating when the intention is to
> build the std module, since the compiler should reject a module
> accidentally named in a way that conflicts with the std module.
>
> Requirements that are likely to become important in the future,
> but that are not otherwise included in this proposal:
>
> * Pre-built libraries, including the std library, may want to
> advertise pre-built BMIs to be reused by the build system. This
> requires more convergence on the mechanisms to identify BMI
> compatibility, which have been discussed in P2581R2, but not yet
> supported by implementors.
>
> * Future integration with more general package management
> facilities. Although my expectation is that this is a step in that
> direction, where it may be possible to either make the data
> described here embedded there, or have the path to this metadata
> referenced instead. Particularly this proposal is not trying to
> specify requirements that would allow discovering when using this
> would be appropriate or not.
>
> Some a-priori decisions that are assumed in this format:
>
> * JSON: while there are many competing alternatives for the
> serialization format for this metadata, various other parts of the
> tooling ecosystem (such as compile_commands.json and the
> dependency scanning output) are already using this format,
> therefore I chose to just stick to it, rather than consider
> introducing a new serialization format.
>
> * Relative file paths: Any non-absolute path described in this
> file will be presumed to have the directory where the metadata
> file was found as the base for the lookup.
>
> I think it will be necessary to support paths that are relative to
> some other parameterized location to accommodate dependencies on
> header files or modules provided by other projects/packages. This
> would help to reduce the otherwise common practice of a build system
> having to concatenate include paths for essentially every project it
> knows about when building any package it knows about.
>
> Since the goal of this proposal is to evaluate specific usage,
> I will prioritize describing the file with examples, rather
> than writing a JSON schema for it. The final design should still
> be encoded that way, but I feel the format of JSON schema would
> make this conversation harder to maintain.
>
> I agree. JSON schema is great ... for later :)
>
> # Envelope
>
> The first thing I want to address is the envelope that will
> contain the module-specific metadata. For that there are two options:
>
> ## Module-specific file
>
> If we decide to go with a file that refers only to module
> metadata, the envelope could look like:
>
> {
>
> "version": 1,
>
> "revision": 0,
>
> "modules": []
>
> }
>
> ## Wider "library manifest" file
>
> If we decide we should go with a wider format for future
> extensibility, the envelope could look like:
>
> {
>
> "version": 1,
>
> "revision": 0,
>
> "c++": {
>
> "modules": []
>
> }
>
> }
>
> # The modules value
>
> In both possible envelopes, module metadata is represented as an
> array of objects, where all the importable module units provided
> by this library that may be reachable by a consumer in this
> library will be described.
>
> While we currently don't expect the std module to depend on any
> module not provided by the std library itself at this point, there
> are already situations where the std library has external
> dependencies (e.g.: tbb for libstdc++). This is an area that needs
> further exploration in the future, and it may be the case that the
> compiler may need to report several module metadata files, rather
> than just one.
>
> ## Describing a module
>
> The following keys and values are expected in the object in the
> modules array:
>
> * logical-name (mandatory): The name of the module being provided;
> the same semantics as in P1689 apply.
>
> * is-interface (optional, defaults to true): This describes
> whether this contributes to the external interface of the module;
> the same semantics as in P1689 apply.
>
> Do implementation module units need to be mentioned at all? I'm not
> opposed to allowing for them, but if I'm following correctly, since
> the metadata file is consumed to satisfy module imports, they don't
> contribute to anything that is relevant to import. I would expect
> is-interface to always be true.
>
> * source-path (mandatory): The path to the source code of the
> importable unit. If expressed as a relative path, lookup is done
> from the directory where the module metadata was found.
>
> * is-std-library (optional, default to false): Indicates that the
> module is allowed to use names that are reserved to the standard
> library.
>
> * local-arguments (optional): an object describing arguments that
> should be applied for translating this particular importable unit,
> but that don't need to be in the compilation of the translation
> unit importing this module:
>
> Since local-arguments (which I presume to mean compiler command line
> options) are necessarily implementation specific, I think this should
> either be generalized or named such that it reflects an implementation
> dependency. Perhaps:
>
> "local-arguments": [
> { "gcc-compatible": [ "-fconstexpr-depth=512" ] },
> { "cl-compatible": [ "/constexpr:depth512" ] },
> { "circle": [ "--ftemplate-depth=768" ] }
> ]
>
> Tom.
>
> ** include-directories (optional): an array of paths that need to
> be appended to the compilation include search path, same semantics
> as appending -I in gcc and clang.
>
> ** system-include-directories (optional): an array of paths that
> need to be appended to the compilation include path as system
> locations, same semantics as appending -isystem in gcc and clang.
>
> ** definitions: an array of objects to be appended in order.
>
> *** name (mandatory): the name of the definition to be used
>
> *** value (optional): the value to be set. If missing, it is the
> equivalent of -DFOO in gcc and clang.
>
> *** undef (optional, defaults to false, incompatible with value):
> The equivalent of -UFOO in gcc and clang.
>
> *** vendor (optional): extension point for vendor-specific
> configurations. This is an object where the key is the name of the
> vendor, and the value is implementation-defined.
>
> * vendor (optional): extension point for vendor-specific
> information. This is an object where the key is the name of the
> vendor and the value is implementation-defined.
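The local-arguments keys listed above map fairly directly onto gcc/clang-style options. As a minimal sketch only, assuming the object shape described in the list (the function name `local_arguments_to_flags` is hypothetical, and resolving relative paths and handling the vendor key are left out), a build system might translate them like this:

```python
def local_arguments_to_flags(local_arguments):
    """Translate a local-arguments object into gcc/clang-style flags.

    Sketch only: relative-path resolution and the "vendor" extension
    point are intentionally not handled here.
    """
    flags = []
    # include-directories: same semantics as appending -I.
    for path in local_arguments.get("include-directories", []):
        flags.append("-I" + path)
    # system-include-directories: same semantics as appending -isystem.
    for path in local_arguments.get("system-include-directories", []):
        flags.extend(["-isystem", path])
    # definitions are appended in the order given.
    for definition in local_arguments.get("definitions", []):
        name = definition["name"]
        if definition.get("undef", False):
            # undef is described as incompatible with value.
            if "value" in definition:
                raise ValueError(f"'undef' is incompatible with 'value': {name}")
            flags.append("-U" + name)
        elif "value" in definition:
            flags.append(f"-D{name}={definition['value']}")
        else:
            flags.append("-D" + name)
    return flags
```

For example, a definitions entry of {"name": "FOO"} would become -DFOO, and one with "undef": true would become -UFOO.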
>
> # Example
>
> Here's what I would expect that to look like for a standard
> library (assuming the modules file for now), such as libc++:
>
> {
>
> "version": 1,
>
> "revision": 1,
>
> "modules": [
>
> {
>
> "logical-name": "std",
>
> "source-path": "modules/std.cppm",
>
> "is-std-library": true
>
> },
>
> {
>
> "logical-name": "std.compat",
>
> "source-path": "modules/std.compat.cppm",
>
> "is-std-library": true
>
> },
>
> {
>
> "logical-name": "std:someinterfacepartition",
>
> "source-path": "modules/std-someinterfacepartition.cppm",
>
> "is-std-library": true
>
> }
>
> ]
>
> }
>
> Note that this specifically doesn't use any of the local
> arguments, because I don't really think that's going to be needed
> for the standard library case. The only special case is the
> is-std-library key, to allow the build system to know this is
> not an accidental collision with the reserved names. We may decide
> not to settle the local-arguments part of the proposal now for
> that reason.
>
> Daniel
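To make the relative-path rule concrete, here is a minimal sketch, not part of the proposal, of how a build system might read such a file and resolve each source-path against the directory containing the metadata file. The function name `load_module_sources` is hypothetical, and the sketch assumes only the two envelope shapes shown earlier in this thread:

```python
import json
from pathlib import Path


def load_module_sources(manifest_path):
    """Map each logical-name to its resolved source path.

    Accepts either envelope discussed in the thread: a top-level
    "modules" array, or one nested under a "c++" key.
    """
    manifest_path = Path(manifest_path)
    with manifest_path.open(encoding="utf-8") as f:
        data = json.load(f)

    # The wider "library manifest" envelope nests modules under "c++".
    modules = data.get("c++", data).get("modules", [])

    base_dir = manifest_path.parent
    resolved = {}
    for module in modules:
        source = Path(module["source-path"])
        # Non-absolute paths use the metadata file's directory as base.
        if not source.is_absolute():
            source = base_dir / source
        resolved[module["logical-name"]] = source
    return resolved
```

How a path relative to some other, parameterized location would be expressed is still an open question in this thread, so the sketch does not attempt it.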
>
> _______________________________________________
>
> SG15 mailing list
>
> SG15_at_[hidden]
>
> https://lists.isocpp.org/mailman/listinfo.cgi/sg15
Received on 2023-12-13 22:37:17