Date: Sun, 6 Feb 2022 16:31:12 -0500
See clang’s model here:
https://clang.llvm.org/docs/Modules.html#module-map-language
It seems simple and appropriate. The idea of using modulemap files, found beside their accompanying header files, works and is neatly simple. In other words, can we propose to standardize something very similar to clang’s method of finding and consuming modules? The benefit of a BMI is lost on me, especially if it will eventually be superseded by packages in C++.
Is there anything amiss with how clang is handling modules?
WL
> On Feb 6, 2022, at 9:33 AM, Tom Honermann via SG15 <sg15_at_[hidden]> wrote:
>
>
> On 2/4/22 1:23 PM, Gabriel Dos Reis wrote:
>> [Tom]
>> As I mentioned, a .ifc for module M may be passed on the command line or be present in a search path (one that might be present in order to find other module BMIs).
>> Yes, but I am not understanding how that is the scenario, since the IFC for the same module cannot be specified twice. Could you make the scenario a little bit more concrete (and then we can abstract back once I get the hang of it)?
>>
>> Additionally, if there are transitive dependencies, BMIs for those transitive dependencies would presumably also be included.
>>
>> Again, I don’t follow how this changes anything. Could you make the scenario more concrete?
> Imagine this scenario:
> - An external library, libA, with an interface provided via a module interface unit for module A that imports module M1. libA contains an embedded BMI for modules A and M1.
> - An external library, libB, with an interface provided via a module interface unit for module B that imports modules M1 and M2. libB contains an embedded BMI for modules B, M1, and M2.
> - Another external library, libC, with an interface provided via a module interface unit for module C that imports modules M1 and M2. libC does not contain embedded BMIs; they are instead provided in a libC\modules directory.
> - An internal executable, X.exe, with a source file, X.cpp, that depends on libA, libB, and libC and imports M2.
> When compiling X.exe, a command line such as one of the following might be used. In the first example, I'm assuming that use of an embedded BMI would be implicit for link libraries specified on the command line. For the second, I'm assuming that an embedded BMI is specified via /reference. Both of these are intended to be equivalent. I acknowledge that neither may match your expectations for embedded BMI support.
>
> cl /std:c++20 X.cpp /Fex.exe libA.lib libB.lib /ifcSearchDir libC\modules libC.lib /reference:M2.ifc
> cl /std:c++20 X.cpp /Fex.exe libA.lib /reference:libA.lib libB.lib /reference:libB.lib /ifcSearchDir libC\modules libC.lib /reference M2.ifc
> The question is, when the compiler goes searching for a BMI for modules M1 and M2, which ones win? The BMIs in libA.lib, the BMIs in libB.lib, or the BMIs in libC\modules? Possible answers that I can imagine include the following (commentary here suggests that /ifcSearchDir specifies fallback directories to search if a BMI isn't identified by a /reference option, so I'm not considering some other possible answers).
>
> 1. It doesn't matter. If the BMIs aren't identical, then there is an ODR violation that may or may not be diagnosed.
> 2. Not specified; libA and libB should only embed the BMIs for modules A and B respectively since modules M1 and M2 are "owned" elsewhere.
> 3. The BMI for M1 is found in libA.lib and the BMI for M2 is found in libB.lib. This result occurs if selection favors explicit references that appear earlier on the command line.
> 4. The BMI for M1 is found in libA.lib and the BMI for M2 is found in M2.ifc. This result occurs if selection favors explicit references that appear earlier on the command line unless a more specific reference (e.g., M2.ifc) follows.
> 5. The BMI for M1 is found in libB.lib and the BMI for M2 is found in M2.ifc. This result occurs if selection favors explicit references that appear later on the command line.
> 6. None of the above; the build system is responsible for identifying and extracting embedded BMIs and passing explicit references to the extracted files on the command line.
> 7. None of the above; distinct /reference options would be required for each embedded BMI (e.g., /reference A=libA.lib /reference M1=libA.lib, etc...).
> Tom.
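
To make the three ordering-based answers above concrete, here is a minimal sketch in Python. It is not from the thread and does not describe any real compiler's behavior: the `Reference` type, the `resolve` function, and the policy names are all hypothetical, and the only inputs are the explicit references from the scenario (libA.lib, libB.lib, M2.ifc) in command-line order.

```python
from typing import NamedTuple, Optional, Sequence


class Reference(NamedTuple):
    """One explicit BMI source on the command line (hypothetical model)."""
    path: str
    modules: tuple   # names of the modules whose BMIs this source contains
    specific: bool   # True for a standalone BMI file, False for a library


# Explicit references in command-line order, as described in the scenario.
# libC's BMIs live in a search directory, so they are not listed here.
REFERENCES = [
    Reference("libA.lib", ("A", "M1"), specific=False),
    Reference("libB.lib", ("B", "M1", "M2"), specific=False),
    Reference("M2.ifc", ("M2",), specific=True),
]


def resolve(module: str, refs: Sequence[Reference], policy: str) -> Optional[str]:
    """Return the path of the source that supplies `module` under `policy`."""
    candidates = [r for r in refs if module in r.modules]
    if not candidates:
        # No explicit reference; a compiler might fall back to a search directory.
        return None
    if policy == "earliest":
        return candidates[0].path
    if policy == "earliest_unless_specific":
        # A standalone BMI file overrides library-embedded BMIs.
        specific = [r for r in candidates if r.specific]
        return (specific[0] if specific else candidates[0]).path
    if policy == "latest":
        return candidates[-1].path
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    for policy in ("earliest", "earliest_unless_specific", "latest"):
        winners = {m: resolve(m, REFERENCES, policy) for m in ("M1", "M2")}
        print(policy, winners)
```

Under these assumptions the three policies pick different BMIs for M1 and M2 from the same command line, which is the ambiguity the question is about.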
>
>>
>> -- Gaby
>>
>> From: Tom Honermann <tom_at_[hidden]>
>> Sent: Friday, February 4, 2022 10:21 AM
>> To: Gabriel Dos Reis <gdr_at_[hidden]>; sg15_at_[hidden]; Olga Arkhipova <olgaark_at_[hidden]>
>> Subject: Re: [SG15] Meeting on February 4th at 9AM Pacific
>>
>> On 2/4/22 1:10 PM, Gabriel Dos Reis wrote:
>> [Tom]
>> Because those module artifacts may be present in multiple such artifacts or also be present in a .bmi or .ifc file specified on the command line or present in a search path. Which one wins?
>>
>> I am not sure I understand.
>>
>> Consider that today, you have libA.a and A.h for a component A. That component gets modularized into a module M, where libA.a now contains the BMI for module M. Where are the multiple artifacts coming from?
>> As I mentioned, a .ifc for module M may be passed on the command line or be present in a search path (one that might be present in order to find other module BMIs).
>>
>> Additionally, if there are transitive dependencies, BMIs for those transitive dependencies would presumably also be included. If so, they would likely be present in other libraries as well.
>>
>> Tom.
>>
>>
>> -- Gaby
>>
>>
>> From: Tom Honermann <tom_at_[hidden]>
>> Sent: Friday, February 4, 2022 9:58 AM
>> To: sg15_at_[hidden]; Olga Arkhipova <olgaark_at_[hidden]>
>> Cc: Gabriel Dos Reis <gdr_at_[hidden]>
>> Subject: Re: [SG15] Meeting on February 4th at 9AM Pacific
>>
>> On 2/4/22 12:08 PM, Gabriel Dos Reis via SG15 wrote:
>> [Tom]
>> I'm struggling with this one though. Modules are needed at parse time. Placing them in link-time artifacts seems too late in the tooling pipeline; a compiler may not know how to extract them from archives or shared objects.
>>
>> I don’t follow. At install time (or when the module is built and distributed), exactly why does putting the needed information in the distributed artifact matter when its use is at compile time? All you need is that the toolset gives you the means to extract the information you need. For example, GCC already provides some of that. MSVC has always been planned to do that too.
>> My (possibly incorrect) understanding of your statement was that the compiler would have the ability to extract module artifacts directly from the link-time artifacts. That would create friction with other tools by imposing on them a requirement to be able to do such extraction themselves, or to use an implementor-specific tool to extract the information. In either case, that imposes implementation-specific requirements on such tools. I'd prefer a solution that allows tools to perform module artifact discovery without having to be customized for every implementation.
>>
>>
>> It also complicates the search for module artifacts by adding more sources;
>>
>> Why is that an additional source if the information is embedded in the needed artifact (.a or .so or whatever)?
>> Because those module artifacts may be present in multiple such artifacts or also be present in a .bmi or .ifc file specified on the command line or present in a search path. Which one wins?
>>
>>
>> Is the .d.json the additional source you’re worried about?
>> No.
>>
>> Tom.
>>
>>
>> -- Gaby
>>
>>
>> From: Tom Honermann <tom_at_[hidden]>
>> Sent: Friday, February 4, 2022 9:03 AM
>> To: sg15_at_[hidden]; Olga Arkhipova <olgaark_at_[hidden]>
>> Cc: Gabriel Dos Reis <gdr_at_[hidden]>
>> Subject: Re: [SG15] Meeting on February 4th at 9AM Pacific
>>
>> On 2/4/22 11:39 AM, Gabriel Dos Reis via SG15 wrote:
>> We also need to account for scenarios where one single container hosts several BMIs (for several modules),
>> That seems reasonable.
>>
>>
>>
>> or where the BMI is embedded either in the .a or the .so (or equivalent) to provide self-describing artifacts that are less disconnected (your header units aren’t out of sync with the object files anymore). I don’t think that does violence to the unixy world.
>> I'm struggling with this one though. Modules are needed at parse time. Placing them in link-time artifacts seems too late in the tooling pipeline; a compiler may not know how to extract them from archives or shared objects. It also complicates the search for module artifacts by adding more sources; do you search for matching modules by BMIs named on the command line first, then in linker artifacts? Or perhaps in the order that they appear on the command line? If so, that has implications for '-lmylib' vs '-Wl,-lmylib'.
>>
>> Tom.
>>
>>
>> -- Gaby
>>
>> From: SG15 <sg15-bounces_at_[hidden]> On Behalf Of Steve Downey via SG15
>> Sent: Thursday, February 3, 2022 5:44 PM
>> To: Olga Arkhipova <olgaark_at_[hidden]>
>> Cc: Steve Downey <sdowney_at_[hidden]>; ISO C++ Tooling Study Group <sg15_at_[hidden]>
>> Subject: Re: [SG15] Meeting on February 4th at 9AM Pacific
>>
>>
>>
>> On Thu, Feb 3, 2022 at 8:07 PM Olga Arkhipova <olgaark_at_[hidden]> wrote:
>> The compiler will have to find all BMIs so their locations should be defined by some command line options.
>> My point is that the same options can be used to find the .d.json files.
>>
>> Thanks,
>> Olga
>> I agree that if the build system can figure out how to do one, it can do the other, as long as there is some discernible relationship between the BMI and the .d.json file. But in a typical unixy environment, libraries and other artifacts to be consumed are not separated out. Perhaps, though, the BMI and .d.json could both live together in an isolated filesystem-like thing based on the module name? E.g., a directory or zip file, or some such. On the other hand, since .d.json is intended to be portable, I would expect to find it in something like /usr/share/module_${name} in an FHS-style system? Or, if a library provides multiple modules, underneath /usr/share/lib${name}/?
>> Replace /usr with /usr/local/, ~, ${etcetera}, etc above.
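
As a rough illustration of the directory layout floated above, the following Python sketch lists candidate directories for a module's .d.json file. Nothing here is an agreed convention: the function name `candidate_dirs`, the `PREFIXES` tuple, and the idea of probing prefixes in a fixed order are assumptions, and the name of the file inside each directory is deliberately left unspecified.

```python
from pathlib import PurePosixPath
from typing import List, Optional

# Prefixes named in the message; any further prefixes are left to the installer.
PREFIXES = ("/usr", "/usr/local", "~")


def candidate_dirs(module: str, library: Optional[str] = None) -> List[str]:
    """List directories that might hold the .d.json for `module` (hypothetical)."""
    # A library that provides several modules groups them under lib<name>;
    # otherwise each module gets its own module_<name> directory.
    leaf = f"lib{library}" if library else f"module_{module}"
    return [str(PurePosixPath(prefix) / "share" / leaf) for prefix in PREFIXES]


if __name__ == "__main__":
    print(candidate_dirs("M1"))
    print(candidate_dirs("M1", library="A"))
```

A build system could probe these directories in order, but which prefix takes precedence is exactly the kind of question that would need to be specified.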
>>
>> (Sorry, I sent this only to Olga and am now replying on list. Olga, if you reply, could you either reply here or add the list back?)
>>
>>
>>
>>
>> _______________________________________________
>> SG15 mailing list
>> SG15_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg15
Received on 2022-02-06 21:31:15