Re: [Tooling] Modules naming

From: Tom Honermann <tom_at_[hidden]>
Date: Sat, 12 Jan 2019 00:22:58 -0500
On 1/11/19 6:04 PM, Corentin wrote:
>
>
> On Fri, 11 Jan 2019 at 23:31 Tom Honermann <tom_at_[hidden]> wrote:
>
> On 1/11/19 9:29 AM, Ben Craig wrote:
>>
>> > The manifest file could be generated (based on information in
>> > source code) and usable by multiple tools, IDEs, and build
>> > systems. It need not be a statically maintained file. I like this
>> > approach because it appropriately separates the steps of
>> > identifying/building the module name/file map vs using the module
>> > name/file map (most tools don't want to be build systems).
>>
>> This sounds like a clang compilation database.
>>
> I don't think so. It is a database, but not one that stores the
> history of how a build was performed. Rather, it stores some of
> the information that would be needed to construct a compiler
> invocation (or for a tool/IDE to parse source files that require
> resolving module names to source files).
>
>
>
> It's exactly what a compilation database is (it includes information
> such as include directories, defines, and some flags) for every
> translation unit.
I think of the manifest file as more of a meta-build thing. I think of
a compilation database as something that records the operations that
were done as part of a build, not the instructions for how to do a
build. Perhaps my perspective on this is unique.
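
To make the "generated manifest" idea concrete, here is a minimal sketch of the generating side. Everything in it is an assumption for illustration: the names declared_module and build_manifest are hypothetical, the input is an in-memory map of file path to file contents, and a regular expression stands in for a real parser (it ignores comments, macros, and module partitions).

```cpp
#include <map>
#include <optional>
#include <regex>
#include <string>

// Module name -> path of the file that declares its interface.
using Manifest = std::map<std::string, std::string>;

// Hypothetical helper: returns the module name if `source` contains a
// line such as `export module example.foo;`. This is a rough textual
// match, not a parse of the translation unit.
std::optional<std::string> declared_module(const std::string& source) {
    static const std::regex decl(
        R"((?:^|\n)\s*export\s+module\s+([A-Za-z_][A-Za-z0-9_.]*)\s*;)");
    std::smatch match;
    if (std::regex_search(source, match, decl)) {
        return match[1].str();
    }
    return std::nullopt;
}

// Builds the module name/file map from a set of sources
// (file path -> file contents). Files that declare no module interface
// are simply skipped.
Manifest build_manifest(const std::map<std::string, std::string>& sources) {
    Manifest manifest;
    for (const auto& [path, text] : sources) {
        if (auto name = declared_module(text)) {
            manifest.emplace(*name, path);
        }
    }
    return manifest;
}
```

The point of the sketch is only that the map is derived from the sources rather than maintained by hand; whatever produces it can be rerun whenever the sources change.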
>
>> Such a thing is useful for non-build tools, and can be generated
>> automatically, as you said. I am concerned that generating such
>> a file early in the build process would cause performance
>> problems. Each source file would need to be partially parsed
>> before the build DAG could be fully formed. However, I don’t
>> have benchmarks indicating how expensive an extra pass over all
>> the source files is.
>>
> Without such a file (or equivalent information statically encoded
> in "the" build system), the only alternative is to examine every
> source file in order to construct the build DAG in memory anyway.
> How else could such a scan be avoided?
>
>
> Build systems will always need to parse every file at least once
> before invoking the build.
> They can hopefully extract both the name and the dependency in a
> single pass.

Unless the information is already available elsewhere.

I feel like this point keeps getting missed. We use many tools that
require semantic analysis. Many of those tools today do not have
anything that we would call a build system. They might be configured
with a collection of include paths and macro definitions, but that is
pretty much it. I think too much of this discussion is in regard to
"the" build system and not enough on how IDEs/tools will function
(preferably collaboratively) in a modular world. How are vim and emacs
going to resolve module imports?
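
As a sketch of the consuming side, an editor or indexer with no build system of its own could resolve an import with little more than a lookup. The manifest format here (one "<module-name> <path>" pair per line) and the names parse_manifest and resolve_import are assumptions made up for this example; no such format has been agreed on.

```cpp
#include <map>
#include <optional>
#include <sstream>
#include <string>

// Module name -> source file, as read from a manifest.
using ModuleMap = std::map<std::string, std::string>;

// Parses a hypothetical line-based manifest: "<module-name> <path>".
// Blank or malformed lines are ignored; paths containing spaces are
// not handled in this sketch.
ModuleMap parse_manifest(const std::string& text) {
    ModuleMap map;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string name, path;
        if (fields >> name >> path) {
            map[name] = path;
        }
    }
    return map;
}

// What a tool would call when it encounters `import <module_name>;`.
std::optional<std::string> resolve_import(const ModuleMap& map,
                                          const std::string& module_name) {
    auto it = map.find(module_name);
    if (it == map.end()) {
        return std::nullopt;
    }
    return it->second;
}
```

A tool using this never needs to scan the source tree itself, which is the separation between building the map and using the map that I am arguing for.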

>
> The issue with lack of module mapping is that the build system can't
> really have a top-down
> approach that would ensure some nodes of the build graph are fully
> resolved soon after
> the build starts.
I don't see this issue as an artifact of a module mapping requirement
but rather the separate compilation model for modules. It is exactly
equivalent to complications with generated headers today. Knowing the
module interface unit file name doesn't tell you which source files have
a dependency on it.
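
To illustrate why the name/file map alone does not give you the dependency edges, here is a rough sketch of a single pass that records both what a file provides and what it depends on, then inverts the result. The names ScanResult, scan_unit, and dependents are hypothetical, the regular expression is a stand-in for real parsing, and header-unit imports such as `import <vector>;` are deliberately not matched.

```cpp
#include <map>
#include <regex>
#include <set>
#include <string>
#include <vector>

struct ScanResult {
    std::string provides;              // empty if no interface is declared
    std::vector<std::string> imports;  // modules this file depends on
};

// One pass over the text collects both the declared module and the
// imports. `module foo;` without `export` is an implementation unit,
// which implicitly depends on the interface of foo.
ScanResult scan_unit(const std::string& source) {
    static const std::regex stmt(
        R"((?:^|\n)\s*(export\s+)?(module|import)\s+([A-Za-z_][A-Za-z0-9_.]*)\s*;)");
    ScanResult result;
    for (std::sregex_iterator it(source.begin(), source.end(), stmt), end;
         it != end; ++it) {
        const bool exported = (*it)[1].matched;
        const std::string keyword = (*it)[2].str();
        const std::string name = (*it)[3].str();
        if (keyword == "module" && exported) {
            result.provides = name;
        } else {
            result.imports.push_back(name);
        }
    }
    return result;
}

// Inverts the scan: module name -> files that depend on it. Every file
// has to be examined to build this, regardless of how module names map
// to file names.
std::map<std::string, std::set<std::string>> dependents(
    const std::map<std::string, std::string>& sources) {
    std::map<std::string, std::set<std::string>> result;
    for (const auto& [path, text] : sources) {
        for (const auto& name : scan_unit(text).imports) {
            result[name].insert(path);
        }
    }
    return result;
}
```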
>
> (There is another alternative; implicitly building modules on
> demand as is done for Clang modules today. But to my knowledge, no
> implementors are pursuing that for standard proposed modules).
>
>
> I imagine this would solve a lot of issues: give the compiler a bunch
> of source files and let it figure things out. It's more or less how
> Rust and Go work, unless I'm mistaken?

It solves some, but it has some scaling issues as well, particularly
when a module must be built (and cached) multiple times to satisfy
different sets of compilation options used by different consumers, and
because parallel compiler invocations need to synchronize.
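
A small sketch of the caching side of that problem, under stated assumptions: the cache key is the module name plus the exact option list, ModuleCache and its members are hypothetical names, and the ".bmi" artifact paths in the usage below are invented. A real implementation would have to decide which options actually affect the built module, and would need locking across processes, which this sketch does not attempt.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// (module name, compilation options) identifies one built variant.
// Options are compared as given; no normalization is attempted.
using CacheKey = std::pair<std::string, std::vector<std::string>>;

struct ModuleCache {
    std::map<CacheKey, std::string> artifacts;  // key -> built artifact path

    // Returns the cached artifact for this exact option set, or nullptr
    // if the module must be built again for this consumer.
    const std::string* find(const std::string& name,
                            const std::vector<std::string>& options) const {
        auto it = artifacts.find(CacheKey{name, options});
        return it == artifacts.end() ? nullptr : &it->second;
    }

    void store(const std::string& name,
               const std::vector<std::string>& options,
               const std::string& artifact_path) {
        artifacts[CacheKey{name, options}] = artifact_path;
    }

    // Number of distinct builds of one module currently held.
    std::size_t variant_count(const std::string& name) const {
        std::size_t count = 0;
        for (const auto& entry : artifacts) {
            if (entry.first.first == name) {
                ++count;
            }
        }
        return count;
    }
};
```

Two consumers that import the same module with different options end up with two entries, which is where the scaling concern comes from.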

Tom.

>
> I highly recommend this article about how that problem can be solved in D:
> https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-am-forking-the-compiler/
>
> I don't know if implementers would be willing to move in this direction?
>
> However, it's important to note that even if the compiler handles the
> module dependencies itself,
> it would also benefit from a fast module -> file mapping, even more so
> than build systems.
>
> Tom.
>
>> *From:* tooling-bounces_at_[hidden] *On Behalf Of* Tom Honermann
>> *Sent:* Thursday, January 10, 2019 9:27 PM
>> *To:* WG21 Tooling Study Group SG15 <tooling_at_[hidden]>
>> *Subject:* Re: [Tooling] Modules naming
>>
>> On 1/10/19 5:17 PM, Ben Craig wrote:
>>
>> Can you elaborate more on the kinds of historical pains
>> caused by tying a #include directive to a file name? I know
>> of issues with #pragma once, but that feels like a distinct
>> problem from file names.
>>
>> > This is based on decades of experience caused by header files.
>>
>> I think most of the participants in wg21 have years, often
>> decades of experience with header files. I know of plenty of
>> issues with the preprocessor, but I am not yet aware of any
>> major problems on the file name front. (ok, getting the
>> right slashes can be annoying… but it’s not a huge problem
>> for me personally).
>>
>> I’m not a fan of the MANIFEST / module map approach in
>> general. It requires duplicating information that is already
>> in the source. I get that it has the potential to speed up
>> builds, but I’d rather not have to update another location
>> when I add a new .cpp file to my project. Many build systems
>> allow for the user to make the tradeoff in whether they will
>> use a file system glob to enumerate their source, or require
>> the user to list the source manually. I usually fall into
>> the file system glob crowd.
>>
>> The manifest file could be generated (based on information in
>> source code) and usable by multiple tools, IDEs, and build
>> systems. It need not be a statically maintained file. I like
>> this approach because it appropriately separates the steps of
>> identifying/building the module name/file map vs using the module
>> name/file map (most tools don't want to be build systems).
>>
>> Tom.
>>
>> *From:* tooling-bounces_at_[hidden] *On Behalf Of* Gabriel Dos Reis
>> *Sent:* Thursday, January 10, 2019 3:15 PM
>> *To:* WG21 Tooling Study Group SG15 <tooling_at_[hidden]>
>> *Subject:* Re: [Tooling] Modules naming
>>
>> Microsoft strongly encourages its developers and customers to
>> NOT tie a module name with the containing source file of its
>> interface. This is based on decades of experience caused by
>> header files. I would rather see us move in the direction of
>> some sort of MANIFEST file that maps modules to source files
>> and artifacts.
>>
>> *From:* tooling-bounces_at_[hidden] *On Behalf Of* Corentin
>> *Sent:* Thursday, January 10, 2019 6:53 AM
>> *To:* WG21 Tooling Study Group SG15 <tooling_at_[hidden]>
>> *Subject:* [Tooling] Modules naming
>>
>> Hello.
>>
>> I would like to suggest two module-related proposals that I
>> think SG15 should look at.
>>
>> -*Compiler enforced mapping between module names and module
>> interface file (resource) name. *
>>
>> Currently, module interfaces can be declared in any file,
>> which makes dependency scanning more tedious than it needs to
>> be and has performance implications
>>
>> (the build system needs to open all files to gather a list of
>> modules), notably when the build system tries to start
>> building while the dependency graph isn't yet complete.
>>
>> Tools (IDEs, code servers, indexers, refactoring tools) may also
>> greatly benefit from an easier way to locate the source file
>> which declares a module.
>>
>> The specifics of the mapping are open to bikeshedding.
>> However, I think we would have better luck sticking to
>> something simple like <module identifier> <=> <file
>> name>.<extension>
>>
>> (The standardese would mention /resource identifier/ rather
>> than filename)
>>
>> - *A standing document giving guidelines for modules naming.*
>>
>> The goal is to take everything the community had to learn the
>> hard way about header naming over the past 30 years and apply
>> it to modules by providing a set of guidelines
>>
>> that could be partially enforced by build system vendors.
>>
>> Encouraging consistency and uniqueness of module identifiers
>> across the industry is I think a necessary step towards sane
>> package management.
>>
>> Note that the standard requires uniqueness of module
>> identifiers within (the standard definition of) a program but
>> says little about a way to ensure this uniqueness.
>>
>> Here is a rough draft of what I think would be good
>> guidelines, partially inspired by what is done by other
>> languages facing similar issues.
>>
>> ·*Prefix module names with an entity and/or a project name to
>> prevent modules from different companies, entities and
>> projects from declaring the same module names.*
>>
>> ·*Exported top-level namespaces should have a name identical to
>> the project name used as part of the name of the module(s)
>> from which it is exported.*
>>
>> ·*Do not export multiple top-level namespaces*
>>
>> ·*Do not export entities in the global namespace outside of
>> the global module fragment.*
>>
>> ·*Organize modules hierarchically.* For example, if both
>> modules |example.foo| and |example.foo.bar| exist as part of
>> the public API of |example|, |example.foo| should reexport
>> |example.foo.bar|
>>
>> ·*Avoid common names such as *|util|* and *|core|* for module
>> name prefixes and top-level namespace names.*
>>
>> ·*Use lower-case module names*
>>
>> ·*Do not use characters outside of the basic source character
>> set in module name identifiers.*
>>
>> My hope is that these 2 proposals (whose impact on the
>> standard is minimal) would make it easier for current tooling
>> to deal with modules
>>
>> while making it possible, for example, to design dependency
>> managers and build systems able to work at the module level.
>>
>> I'd love to gather feedback and opinions before going further
>> in that direction.
>>
>> Thanks a lot!
>>
>> Corentin
>>
>> PS: For a bit of background, I talked about these issues there
>>
>> https://cor3ntin.github.io/posts/modules_mapping/
>>
>> https://cor3ntin.github.io/posts/modules_naming/
>>
>>
>>
>> _______________________________________________
>>
>> Tooling mailing list
>>
>> Tooling_at_[hidden] <mailto:Tooling_at_[hidden]>
>>
>> http://www.open-std.org/mailman/listinfo/tooling
>>
>>



Received on 2019-01-12 06:23:08