On Sun, Oct 17, 2021 at 11:39 AM Bjarne Stroustrup <bjarne@stroustrup.com> wrote:

and if I understand the discussion correctly, objections are that

(1) makes the meaning of a module name dependent on all details of a file system convention

Not at all. The standard defines which identifiers are valid as well as how they're normalized. This proposal would not change that.

There are specific caveats called out in the paper. Specifically, modules whose names differ only in case would likely conflict on case-insensitive filesystems, and Unicode codepoints in module names would be subject to portability issues depending on how the filesystem encodes file names.

We could, however, easily guard against both cases by replacing any codepoint outside of [a-z0-9] in the identifier parts of the module name with a simple escape like %UDEADBEEF% when forming the filename, where DEADBEEF would be the hex value of the Unicode codepoint.
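
A minimal sketch of such an escaping convention, only to make the idea concrete. The function name encode_module_identifier is hypothetical, and the fixed width of eight uppercase hex digits is an assumption taken from the shape of the %UDEADBEEF% example; the paper does not specify either.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper (not from the paper): encodes one identifier part of
// a module name into a filename-safe string. Codepoints in [a-z0-9] pass
// through unchanged; everything else becomes %UXXXXXXXX%.
// Assumption: eight uppercase hex digits per escaped codepoint.
std::string encode_module_identifier(std::u32string_view id) {
    std::string out;
    for (char32_t cp : id) {
        const bool plain = (cp >= U'a' && cp <= U'z') ||
                           (cp >= U'0' && cp <= U'9');
        if (plain) {
            out.push_back(static_cast<char>(cp));
        } else {
            char buf[12];  // "%U" + 8 hex digits + "%" + terminating NUL
            std::snprintf(buf, sizeof buf, "%%U%08lX%%",
                          static_cast<unsigned long>(cp));
            out += buf;
        }
    }
    return out;
}
```

Under these assumptions, "foo" stays "foo" while "Foo" becomes "%U00000046%oo", so the two names no longer collide on a case-insensitive filesystem, and the result is plain ASCII regardless of how the filesystem encodes file names.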

(2) significantly slows down compilation by forcing lookup of many long file names

This is a bold claim. We have plenty of prior art in C++-aware build systems as well as in other languages demonstrating that this is not a significant factor in build times, at least not on POSIX systems. I would like to see specific evidence that this is as big a problem as is being claimed.

Moreover, the proposal specifically states that an early step of the build configuration would be to perform that discovery, at which point you would have the full mapping of all the relevant files related to the modules. So this objection is limited to the performance impact of discovering how to consume modules from the system, which should be a cost paid only once in the build workspace.

Nothing in this proposal forces the discovery cost to be paid for every compiler invocation.

Daniel