On Mar 5, 2019, at 9:14 AM, Mathias Stearn <redbeard0531+isocpp@gmail.com> wrote:



On Tue, Mar 5, 2019, 11:19 AM Gabriel Dos Reis via Modules <modules@lists.isocpp.org> wrote:


> On Mar 5, 2019, at 8:12 AM, Ben Craig <ben.craig@ni.com> wrote:

>
> I think the textual inclusion format will still be very useful to distribution and caching tools though, as they don't need to understand the code. 

I'd suggest using a term like "single stream format" rather than "textual inclusion format". It emphasizes what is important (the ability to pass losslessly from stdout to stdin, transmit over the network, hash and cache, etc.) and avoids confusion with #include-style textual inclusion.
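As a rough illustration of the "hash and cache" point, here is a minimal Python sketch of how a caching tool might derive a cache key from a single stream without understanding the code in it. The `cache_key` helper, the sample module text, and the `hello.o` artifact name are all hypothetical, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
import io


def cache_key(stream, chunk_size=65536):
    """Return the SHA-256 hex digest of everything readable from a binary stream."""
    digest = hashlib.sha256()
    # Read in chunks so the tool never needs the whole stream in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Hypothetical flattened translation unit, standing in for compiler output on stdout.
flattened = b"export module hello;\nexport int answer() { return 42; }\n"

key = cache_key(io.BytesIO(flattened))

# A toy cache: identical streams map to the same (hypothetical) build artifact.
cache = {key: "hello.o"}
```

In a real tool the stream would be `sys.stdin.buffer` or a network socket rather than an in-memory buffer; the point is only that the tool treats the input as opaque bytes.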

There is an additional use case for this format that I've been considering, although it is quite a bit less baked. We could use it as a replacement for "umbrella headers" as a distribution/consumption format. So each library would ship just two or three files: a binary library file with ELF/PE/Mach-O symbols (possibly both static and dynamic files), and an "interface" file that contains the source for all of the module interfaces. Optionally, platforms could even define a way to embed that file in the dynamic and static libs, so we can have the ideal of a single-file library distribution format. Combining all of the internal module interfaces into a single file (even if separate from the binary lib) insulates consumers from the internal structure of the module.

See my 2015 CppCon presentation (near the end) :-)

— Gaby

It should also be a nice perf boost on platforms where opening files is expensive because it can reduce the number of opened files by over 100x.

Those tools frequently lean on the compiler's preprocessor today, and don't know how to do include lookups.

I don’t know that there is an actual spec for such a flattened text file that isn’t at least as involved as the module spec, if not more involved.

I think one simple solution would be to just use an existing format designed for this purpose, such as tar or zip. ar may be a low-friction choice because build tools already need to know how to handle it to work with static libraries (I was surprised to learn today that it is also the format used for Windows .lib files!). However, something like zip is better because it has a centralized index, so you don't need to scan the whole file to figure out where each sub-file is. To aid in name-mapping, it may be best to directly use the module names where these formats would normally use file names, so that no translation needs to occur.
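To make the zip idea concrete, here is a small Python sketch using the standard `zipfile` module. It assumes a hypothetical library with two modules, `mylib.core` and `mylib.util`, and stores each interface under its module name instead of a file name. The helper names `pack_interfaces` and `read_interface` are made up for illustration.

```python
import io
import zipfile

# Hypothetical module interface sources, keyed by module name rather than file name.
interfaces = {
    "mylib.core": "export module mylib.core;\nexport int answer();\n",
    "mylib.util": "export module mylib.util;\nexport void log(const char*);\n",
}


def pack_interfaces(sources):
    """Pack all interface sources into one zip archive held in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for module_name, text in sources.items():
            # The entry name is the module name itself, so no name translation is needed.
            archive.writestr(module_name, text)
    return buffer.getvalue()


def read_interface(blob, module_name):
    """Look up one module via the archive's central directory and return its source."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return archive.read(module_name).decode("utf-8")


blob = pack_interfaces(interfaces)

# Listing the entries only consults the central directory; no entry is decompressed.
with zipfile.ZipFile(io.BytesIO(blob)) as archive:
    names = archive.namelist()

core_source = read_interface(blob, "mylib.core")
```

A consumer that needs only `mylib.core` can fetch that one entry directly, which is the advantage of a centralized index over a format like ar or tar that has to be scanned sequentially.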


I worry that we may be creating a more complex problem than the issue we set out to address.

That is fair. However, this is a feature we have today, with gcc's -fdirectives-only and clang's -frewrite-includes. Many tools are designed to take advantage of this ability, so it would be nice if we could provide something similar in modules-land. And as stated above, I think that if done well, it will provide some nice advantages even in a future fully modularized world, without adding too much complexity.