Date: Sat, 12 Jan 2019 00:04:55 +0100
On Fri, 11 Jan 2019 at 23:31 Tom Honermann <tom_at_[hidden]> wrote:
> On 1/11/19 9:29 AM, Ben Craig wrote:
>
> > The manifest file could be generated (based on information in source code)
> > and usable by multiple tools, IDEs, and build systems. It need not be a
> > statically maintained file. I like this approach because it appropriately
> > separates the steps of identifying/building the module name/file map vs
> > using the module name/file map (most tools don't want to be build systems).
>
>
>
> This sounds like a clang compilation database.
>
>
> I don't think so. It is a database, but not one that stores the history
> of how a build was performed. Rather, it stores some of the information
> that would be needed to construct a compiler invocation (or for a tool/IDE
> to parse source files that require resolving module names to source files).
>
It's exactly what a compilation database is (it includes information such
as include directories, defines and some flags) for every translation unit.
> Such a thing is useful for non-build tools, and can be generated
> automatically, as you said. I am concerned that generating such a file
> early in the build process would cause performance problems. Each source
> file would need to be partially parsed before the build DAG could be fully
> formed. However, I don’t have benchmarks indicating how expensive an extra
> pass over all the source files is.
>
> Without such a file (or equivalent information statically encoded in "the"
> build system), the only alternative is to examine every source file in
> order to construct the build DAG in memory anyway. How else could such a
> scan be avoided?
>
Build systems will always need to parse every file at least once before
invoking the build.
They can hopefully extract both the module name and the dependencies in a
single pass.
The issue with the lack of a module mapping is that the build system can't
really take a top-down approach that would ensure some nodes of the build
graph are fully resolved soon after the build starts.
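
To illustrate the single-pass idea, here is a minimal sketch of a scanner
that reads a file once and collects both the module it declares and the
modules it imports. Everything in it is an assumption made for illustration:
ScanResult and scan_source are hypothetical names, the matching is a
line-based regex, and it ignores comments, preprocessor conditionals, header
units and partitions, all of which a real scanner would have to handle.

```cpp
#include <istream>
#include <optional>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// What one pass over a translation unit yields (hypothetical type).
struct ScanResult {
    std::optional<std::string> module_name;  // set if the file declares a module
    bool is_interface = false;               // true for "export module ...;"
    std::vector<std::string> imports;        // names of imported modules
};

// Simplified, line-based scan. "module;" (the global module fragment
// introducer) is deliberately not matched because no name follows it.
ScanResult scan_source(std::istream& in) {
    static const std::regex module_decl(
        R"(^\s*(export\s+)?module\s+([A-Za-z_][A-Za-z0-9_.]*)\s*;)");
    static const std::regex import_decl(
        R"(^\s*(export\s+)?import\s+([A-Za-z_][A-Za-z0-9_.]*)\s*;)");

    ScanResult result;
    std::string line;
    while (std::getline(in, line)) {
        std::smatch m;
        if (std::regex_search(line, m, module_decl)) {
            result.module_name = m[2].str();
            result.is_interface = m[1].matched;
        } else if (std::regex_search(line, m, import_decl)) {
            result.imports.push_back(m[2].str());
        }
    }
    return result;
}
```

The point is only that the name and the dependencies come out of the same
read; it does not remove the need to open every file before the graph is
complete.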
> (There is another alternative; implicitly building modules on demand as is
> done for Clang modules today. But to my knowledge, no implementors are
> pursuing that for standard proposed modules).
>
I imagine this would solve a lot of issues: give the compiler a bunch of
source files and let it figure things out. It's more or less how Rust and
Go work, unless I'm mistaken?
I highly recommend this article about how that problem can be solved in D:
https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-am-forking-the-compiler/
I don't know if implementers would be willing to move in this direction?
However, it's important to note that even if the compiler handles the module
dependencies itself, it would also benefit from a fast module -> file
mapping, even more so than build systems.
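
As a rough sketch of what a fast module -> file mapping could look like,
compare the two lookups below. The names (Manifest, find_in_manifest,
path_from_convention) and the ".cppm" extension are assumptions made for this
example, not anything a compiler or build system provides today. The first
lookup needs a manifest that someone has already built; the second derives
the path directly from the module name, following the
<module identifier>.<extension> idea from the original proposal.

```cpp
#include <filesystem>
#include <optional>
#include <string>
#include <unordered_map>

namespace fs = std::filesystem;

// Hypothetical manifest: module name -> interface file, filled by a prior
// scan or read from a generated file.
using Manifest = std::unordered_map<std::string, fs::path>;

// Manifest lookup: only works once the manifest has been built.
std::optional<fs::path> find_in_manifest(const Manifest& manifest,
                                         const std::string& module_name) {
    auto it = manifest.find(module_name);
    if (it == manifest.end()) return std::nullopt;
    return it->second;
}

// Convention lookup: <module identifier>.<extension> under a source root.
// No scan is needed; the ".cppm" default is an assumption for this example.
fs::path path_from_convention(const fs::path& root,
                              const std::string& module_name,
                              const std::string& extension = ".cppm") {
    return root / (module_name + extension);
}
```

Either way, whoever resolves an import (build system or compiler) can go
straight to one file instead of opening all of them.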
> Tom.
>
>
>
>
>
> *From:* tooling-bounces_at_[hidden] <tooling-bounces_at_[hidden]>
> <tooling-bounces_at_[hidden]> *On Behalf Of *Tom Honermann
> *Sent:* Thursday, January 10, 2019 9:27 PM
> *To:* WG21 Tooling Study Group SG15 <tooling_at_[hidden]>
> <tooling_at_[hidden]>
> *Subject:* Re: [Tooling] Modules naming
>
>
>
> On 1/10/19 5:17 PM, Ben Craig wrote:
>
> Can you elaborate more on the kinds of historical pains caused by tying a
> #include directive to a file name? I know of issues with #pragma once, but
> that feels like a distinct problem from file names.
>
>
>
> > This is based on decades of experience caused by header files.
>
> I think most of the participants in wg21 have years, often decades of
> experience with header files. I know of plenty of issues with the
> preprocessor, but I am not yet aware of any major problems on the file name
> front. (ok, getting the right slashes can be annoying… but it’s not a huge
> problem for me personally).
>
>
>
> I’m not a fan of the MANIFEST / module map approach in general. It
> requires duplicating information that is already in the source. I get that
> it has the potential to speed up builds, but I’d rather not have to update
> another location when I add a new .cpp file to my project. Many build
> systems allow for the user to make the tradeoff in whether they will use a
> file system glob to enumerate their source, or require the user to list the
> source manually. I usually fall into the file system glob crowd.
>
> The manifest file could be generated (based on information in source code)
> and usable by multiple tools, IDEs, and build systems. It need not be a
> statically maintained file. I like this approach because it appropriately
> separates the steps of identifying/building the module name/file map vs
> using the module name/file map (most tools don't want to be build systems).
>
> Tom.
>
>
>
> *From:* tooling-bounces_at_[hidden] <tooling-bounces_at_[hidden]>
> <tooling-bounces_at_[hidden]> * On Behalf Of *Gabriel Dos Reis
> *Sent:* Thursday, January 10, 2019 3:15 PM
> *To:* WG21 Tooling Study Group SG15 <tooling_at_[hidden]>
> <tooling_at_[hidden]>
> *Subject:* Re: [Tooling] Modules naming
>
>
>
> Microsoft strongly encourages its developers and customers to NOT tie a
> module name with the containing source file of its interface. This is
> based on decades of experience caused by header files. I would rather see
> us move in the direction of some sort of MANIFEST file that maps modules to
> source files and artifacts.
>
>
>
> *From:* tooling-bounces_at_[hidden] <tooling-bounces_at_[hidden]> *On
> Behalf Of *Corentin
> *Sent:* Thursday, January 10, 2019 6:53 AM
> *To:* WG21 Tooling Study Group SG15 <tooling_at_[hidden]>
> *Subject:* [Tooling] Modules naming
>
>
>
> Hello.
>
> I would like to suggest two modules related proposals that I think SG15
> should look at.
>
>
>
> - *Compiler-enforced mapping between module names and module interface
> file (resource) names.*
>
>
>
> Currently, module interfaces can be declared in any file, which makes
> dependency scanning more tedious than it needs to be and has performance
> implications (the build system needs to open all files to gather a list of
> modules), notably when the build system tries to start building while the
> dependency graph isn't yet complete.
>
>
>
> Tools (IDEs, code servers, indexers, refactoring tools) may also greatly
> benefit from an easier way to locate the source file which declares a module.
>
>
>
> The specifics of the mapping are open to bikeshedding. However, I think we
> would have better luck sticking to something simple like <module
> identifier> <=> <file name>.<extension>
>
> (The standardese would mention *resource identifier* rather than filename)
>
>
>
> - *A standing document giving guidelines for modules naming.*
>
>
>
> The goal is to take everything the community had to learn the hard way
> about header naming over the past 30 years and apply it to modules by
> providing a set of guidelines that could be partially enforced by build
> system vendors.
>
> Encouraging consistency and uniqueness of module identifiers across the
> industry is I think a necessary step towards sane package management.
>
> Note that the standard requires uniqueness of modules identifiers within
> (the standard definition of) a program but says little about a way to
> ensure this uniqueness.
>
>
>
> Here is a rough draft of what I think would be good guidelines, partially
> inspired by what is done by other languages facing similar issues.
>
> ·         *Prefix module names with an entity and/or a project name to
> prevent modules from different companies, entities and projects from
> declaring the same module names.*
>
> ·         *Exported top-level namespaces should have a name identical to
> the project name used as part of the name of the module(s) from which it is
> exported.*
>
> · *Do not export multiple top-level namespaces*
>
> · *Do not export entities in the global namespace outside of the
> global module fragment.*
>
> ·         *Organize modules hierarchically.* For example, if both modules
> example.foo and example.foo.bar exist as part of the public API of example,
> example.foo should reexport example.foo.bar
>
> ·         *Avoid common names such as "util" and "core" for module name
> prefixes and top-level namespace names.*
>
> · *Use lower-case module names*
>
> · *Do not use characters outside of the basic source character
> set in module name identifiers.*
>
> My hope is that these 2 proposals (whose impact on the standard is
> minimal) would make it easier for current tooling to deal with modules
> while making it possible, for example, to design dependency managers and
> build systems able to work at the module level.
>
>
>
> I'd love to gather feedback and opinions before going further in that
> direction.
>
> Thanks a lot!
>
>
>
> Corentin
>
>
>
> PS: For a bit of background, I talked about these issues there
>
>
>
> https://cor3ntin.github.io/posts/modules_mapping/
>
> https://cor3ntin.github.io/posts/modules_naming/
>
>
>
>
>
>
>
> _______________________________________________
>
> Tooling mailing list
>
> Tooling_at_[hidden]
>
> http://www.open-std.org/mailman/listinfo/tooling
>
Received on 2019-01-12 00:05:10