Date: Fri, 11 Jan 2019 14:29:54 +0000
> The manifest file could be generated (based on information in source code) and
> usable by multiple tools, IDEs, and build systems. It need not be a statically
> maintained file. I like this approach because it appropriately separates the
> steps of identifying/building the module name/file map vs using the module
> name/file map (most tools don't want to be build systems).
This sounds like a Clang compilation database. Such a thing is useful for non-build tools, and can be generated automatically, as you said. I am concerned that generating such a file early in the build process would cause performance problems. Each source file would need to be partially parsed before the build DAG could be fully formed. However, I don’t have benchmarks indicating how expensive an extra pass over all the source files is.
From: tooling-bounces_at_[hidden] <tooling-bounces_at_[hidden]> On Behalf Of Tom Honermann
Sent: Thursday, January 10, 2019 9:27 PM
To: WG21 Tooling Study Group SG15 <tooling_at_open-std.org>
Subject: Re: [Tooling] Modules naming
On 1/10/19 5:17 PM, Ben Craig wrote:
Can you elaborate more on the kinds of historical pains caused by tying a #include directive to a file name? I know of issues with #pragma once, but that feels like a distinct problem from file names.
> This is based on decades of experience caused by header files.
I think most of the participants in WG21 have years, often decades, of experience with header files. I know of plenty of issues with the preprocessor, but I am not yet aware of any major problems on the file name front. (OK, getting the right slashes can be annoying… but it’s not a huge problem for me personally.)
I’m not a fan of the MANIFEST / module map approach in general. It requires duplicating information that is already in the source. I get that it has the potential to speed up builds, but I’d rather not have to update another location when I add a new .cpp file to my project. Many build systems allow for the user to make the tradeoff in whether they will use a file system glob to enumerate their source, or require the user to list the source manually. I usually fall into the file system glob crowd.
The manifest file could be generated (based on information in source code) and usable by multiple tools, IDEs, and build systems. It need not be a statically maintained file. I like this approach because it appropriately separates the steps of identifying/building the module name/file map vs using the module name/file map (most tools don't want to be build systems).
Tom.
From: tooling-bounces_at_[hidden] <tooling-bounces_at_[hidden]> On Behalf Of Gabriel Dos Reis
Sent: Thursday, January 10, 2019 3:15 PM
To: WG21 Tooling Study Group SG15 <tooling_at_[hidden]>
Subject: Re: [Tooling] Modules naming
Microsoft strongly encourages its developers and customers to NOT tie a module name with the containing source file of its interface. This is based on decades of experience caused by header files. I would rather see us move in the direction of some sort of MANIFEST file that maps modules to source files and artifacts.
From: tooling-bounces_at_[hidden] <tooling-bounces_at_[hidden]> On Behalf Of Corentin
Sent: Thursday, January 10, 2019 6:53 AM
To: WG21 Tooling Study Group SG15 <tooling_at_[hidden]>
Subject: [Tooling] Modules naming
Hello.
I would like to suggest two modules-related proposals that I think SG15 should look at.
- Compiler-enforced mapping between module names and module interface file (resource) names.
Currently, module interfaces can be declared in any file, which makes dependency scanning more tedious than it needs to be and has performance implications
(the build system needs to open all files to gather a list of modules), notably when the build system tries to start building while the dependency graph isn't yet complete.
Tools (IDEs, code servers, indexers, refactoring tools) may also greatly benefit from an easier way to locate the source file which declares a module.
The specifics of the mapping are open to bikeshedding. However, I think we would have better luck sticking to something simple like <module identifier> <=> <file name>.<extension>
(The standardese would mention resource identifier rather than filename)
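As an illustration only, the simple <module identifier> <=> <file name>.<extension> mapping could be sketched as below. The function names and the "cppm" extension are assumptions made for the example; the proposal leaves the specifics open.

```cpp
#include <string>

// Hypothetical forward mapping: module identifier -> interface file name.
// The default extension "cppm" is an assumption, not part of the proposal.
std::string interface_file_for(const std::string& module_name,
                               const std::string& extension = "cppm") {
    return module_name + "." + extension;
}

// Hypothetical reverse mapping: interface file name -> module identifier.
// Strips everything from the last '.' onwards; returns an empty string when
// the file name has no extension.
std::string module_for_file(const std::string& file_name) {
    const std::string::size_type dot = file_name.rfind('.');
    return dot == std::string::npos ? std::string() : file_name.substr(0, dot);
}
```

With a mapping like this, a tool can locate the interface of example.foo by name alone (here, example.foo.cppm) without opening any other file.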
- A standing document giving guidelines for module naming.
The goal is to take everything the community had to learn the hard way about header naming over the past 30 years and apply it to modules by providing a set of guidelines
that could be partially enforced by build system vendors.
Encouraging consistency and uniqueness of module identifiers across the industry is, I think, a necessary step towards sane package management.
Note that the standard requires uniqueness of module identifiers within (the standard definition of) a program but says little about a way to ensure this uniqueness.
Here is a rough draft of what I think would be good guidelines, partially inspired by what is done by other languages facing similar issues.
· Prefix module names with an entity and/or a project name to prevent modules from different companies, entities and projects from declaring the same module names.
· Exported top-level namespaces should have a name identical to the project name used as part of the name of the module(s) from which it is exported.
· Do not export multiple top-level namespaces.
· Do not export entities in the global namespace outside of the global module fragment.
· Organize modules hierarchically. For example, if both modules example.foo and example.foo.bar exist as part of the public API of example, example.foo should re-export example.foo.bar.
· Avoid common names such as util and core for module name prefix and top-level namespace names.
· Use lower-case module names.
· Do not use characters outside of the basic source character set in module name identifiers.
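Some of these guidelines are mechanical enough that a build system could check them. The sketch below is one possible reading, not a definitive rule set: the function naming_issues is hypothetical, it only covers the prefix, common-name, lower-case, and character-set guidelines, and it assumes an ASCII execution character set where 'a'..'z' and '0'..'9' are contiguous.

```cpp
#include <string>
#include <vector>

// Hypothetical checker: returns one message per guideline the name violates.
// An empty result means none of the checked guidelines were violated.
std::vector<std::string> naming_issues(const std::string& name) {
    std::vector<std::string> issues;

    // Lower-case letters, digits, '_' and the '.' separator only.
    for (unsigned char c : name) {
        const bool ok = (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') ||
                        c == '_' || c == '.';
        if (!ok) {
            issues.push_back("use only lower-case basic characters");
            break;
        }
    }

    // A name without any '.' has no entity or project prefix.
    const std::string::size_type dot = name.find('.');
    if (dot == std::string::npos) {
        issues.push_back("add an entity or project prefix");
    }

    // The first component should not be an overly common name.
    const std::string prefix = name.substr(0, dot);
    if (prefix == "util" || prefix == "core") {
        issues.push_back("avoid a common name as prefix");
    }
    return issues;
}
```

For instance, example.foo.bar passes all three checks, while a bare util is flagged both for the missing prefix and for the common name.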
My hope is that these two proposals (whose impact on the standard is minimal) would make it easier for current tooling to deal with modules
while making it possible, for example, to design dependency managers and build systems able to work at the module level.
I'd love to gather feedback and opinions before going further in that direction.
Thanks a lot!
Corentin
PS: For a bit of background, I wrote about these issues here:
https://cor3ntin.github.io/posts/modules_mapping/
https://cor3ntin.github.io/posts/modules_naming/
_______________________________________________
Tooling mailing list
Tooling_at_[hidden]
http://www.open-std.org/mailman/listinfo/tooling
Received on 2019-01-11 15:30:00