Re: [isocpp-ext] Can we expect that all C++ source files can have the same suffix?

From: Gabriel Dos Reis <gdr_at_[hidden]>
Date: Sat, 16 Apr 2022 19:26:34 +0000
[Iain]
> We have some compromises here...
>
> For the convenience of the user, in reducing command line keystrokes, we recognise some set of extensions as having default compilation behaviour.

Yes, we are in violent agreement on this principle. It is part of the reason why MSVC introduced one more extension (".ixx").

The new extension isn't necessary to make the compiler work; the compiler has the proper command-line options in place to cover all scenarios.
As you say, it is for convenience. The next problem is to define "convenience for what?"

Modules were designed to help with software architecture at scale, in particular "componentization" (the first of the four goals I consistently list in my presentations of Modules). Componentization implies delineation of "interface", that is, of abstraction boundaries. Here, by interface, I don't just mean "export module Blah;". Prior to modules, we used "header files" to express interfaces (roughly: things people need to call or depend on go into header files, details go into implementation files). The C and C++ communities have built tools and practices around that notion. Many tools (including compilers!) don't need to read the content of interface files in order to "understand" that header files have special meaning. So, the notion of "interface" is central here, not just from the language definition perspective. That is why MSVC made "module interface file" processing the default for that suffix. It isn't because of intrinsic compiler needs like "I need to produce a metadata file", which would make the extension compiler-centric, as opposed to software-development-centric.

I hope the teaching of modules to professionals and students alike doesn't focus too much on compiler needs, but rather helps them understand how to use Modules for better software construction.

> So if the user renames the file with an extension with alternate default meaning (as above), then the compiler needs to be told the revised intention -
> so -xc++-header-unit-header 1.cc (in this particular case).

I agree. My initial response was to draw attention to the fact that the principle "every compiler should be able to process C++ translation and module units just based on their content" needs a huge asterisk in order to display any relationship with reality.
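
To make the asterisk concrete, here is a minimal sketch of what "just based on their content" could look like, under loud assumptions: it scans line by line, ignores block comments and macros, and the names UnitKind, normalize, starts_with and classify_by_content are hypothetical, not taken from any compiler. It can tell a module interface from a module implementation unit, but a header unit and an ordinary source file with the same text both come out as NonModule, so something outside the content still has to decide.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// What a scan of the source text alone can conclude (hypothetical names).
enum class UnitKind { NonModule, ModuleInterface, ModuleImplementation };

// Collapse runs of whitespace to a single space and trim both ends.
static std::string normalize(const std::string& line) {
    std::string out;
    bool pending_space = false;
    for (unsigned char c : line) {
        if (std::isspace(c)) {
            pending_space = !out.empty();  // leading whitespace is dropped
        } else {
            if (pending_space) out += ' ';
            pending_space = false;
            out += static_cast<char>(c);
        }
    }
    assert(out.empty() || out.back() != ' ');  // trailing whitespace is dropped
    return out;
}

static bool starts_with(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Assumption: declarations start on their own line; block comments and
// macro-generated module declarations are not handled in this sketch.
UnitKind classify_by_content(const std::string& source) {
    std::istringstream in(source);
    std::string raw;
    while (std::getline(in, raw)) {
        const std::string s = normalize(raw);
        // Skip blank lines, preprocessing directives and line comments.
        if (s.empty() || s[0] == '#' || starts_with(s, "//")) continue;
        // "module;" only opens the global module fragment; keep looking.
        if (s == "module;" || s == "module ;") continue;
        if (starts_with(s, "export module ")) return UnitKind::ModuleInterface;
        if (starts_with(s, "module ")) return UnitKind::ModuleImplementation;
        // The first real declaration is not a module declaration.
        return UnitKind::NonModule;
    }
    return UnitKind::NonModule;
}
```

Feeding it the 1.h example from this thread yields NonModule whether the file is meant as a header unit, a PCH input, or a regular translation unit, which is the case the command line has to cover.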

> In the case of header units, there is no specific content that allows the compiler to determine that it should be a HU c.f. a regular source (unlike the case of module/partition sources).

Agreed.
I wasn't being facetious when I said the "main()" function would be defined in a header file. I've seen "frameworks" where all the programmer has to do is include a header file, add their functions following a specific naming scheme, and voilà. For starters, think of "header-only test frameworks".

> Yes, we do need that the driver can be told that the “mode” is C++20 but PCH … (at present, -fmodules on the command line will continue to mean ‘clang modules’). FWIW I’m not particularly happy that we’ve got the modality properly tamed yet, and there’s some on-going discussion in both phab reviews and amongst the modules folks.

Fun, fun, fun...
Please keep up the good work! I'm really itching to be able to show more module examples in Godbolt using Clang and GCC.

-- Gaby

-----Original Message-----
From: Iain Sandoe <iain_at_sandoe.co.uk>
Sent: Friday, April 15, 2022 3:55 PM
To: Gabriel Dos Reis <gdr_at_microsoft.com>
Cc: sg15_at_[hidden]; Nathan Sidwell <nathan_at_acm.org>; Nico Josuttis <nico_at_josuttis.de>
Subject: Re: [SG15] [isocpp-ext] Can we expect that all C++ source files can have the same suffix?



> On 15 Apr 2022, at 21:47, Gabriel Dos Reis <gdr_at_microsoft.com> wrote:
>
>> For header units I’ve followed the same pattern as GCC (we would consume your example header as a HU for -std=c++20; but as a PCH otherwise), that includes removing the “deprecated behaviour” warning in the C++20 “mode”***
>
> Ah, but that means that if the content was
>
> // 1.h
> struct S { int i; };
>
> constexpr S s { 3 };
>
> int main() { return s.i; }
>
> the behavior of the compiler will change when the file is renamed from 1.h to 1.cc, right?

We have some compromises here...

For the convenience of the user, in reducing command line keystrokes, we recognise some set of extensions as having default compilation behaviour.

So if the user renames the file with an extension with alternate default meaning (as above), then the compiler needs to be told the revised intention -
so -xc++-header-unit-header 1.cc (in this particular case).

In the case of header units, there is no specific content that allows the compiler to determine that it should be a HU c.f. a regular source (unlike the case of module/partition sources).

A similar problem occurs with “clang++ -fmodule-header=system vector”, where the default behaviour of the driver is to claim such inputs as “for the linker”.

So one needs, as a minimum, “clang++ -fmodule-header=system -xc++-header vector” there.

One can default (from the build tools) to being specific for every case - the convenience factor is not important there.
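
The compromise described above (a suffix picks a default, and an explicit option overrides it) can be sketched roughly as follows. This is only an illustration under stated assumptions: the suffix table is made up for the example and is not the actual clang, GCC or MSVC driver logic, and the names InputMode, default_mode_for and effective_mode are hypothetical. FromSuffix stands in for "no -x style option was given".

```cpp
#include <cassert>
#include <string>

// Hypothetical input kinds a driver might distinguish.
enum class InputMode { FromSuffix, CxxSource, CxxHeader, ModuleInterface, LinkerInput };

// Default guess from the file name alone (illustrative table only).
InputMode default_mode_for(const std::string& filename) {
    assert(!filename.empty());
    const std::string::size_type slash = filename.find_last_of("/\\");
    const std::string::size_type dot = filename.rfind('.');
    // A dot that belongs to a directory name does not count as a suffix.
    const bool no_suffix = dot == std::string::npos ||
                           (slash != std::string::npos && dot < slash);
    if (no_suffix) return InputMode::LinkerInput;  // e.g. "vector"

    const std::string ext = filename.substr(dot);
    if (ext == ".h" || ext == ".hpp") return InputMode::CxxHeader;
    if (ext == ".cc" || ext == ".cpp" || ext == ".cxx") return InputMode::CxxSource;
    if (ext == ".ixx") return InputMode::ModuleInterface;
    return InputMode::LinkerInput;  // unknown suffixes are left to the linker
}

// An explicit request always wins; the suffix is only a convenience.
InputMode effective_mode(const std::string& filename,
                         InputMode requested = InputMode::FromSuffix) {
    return requested == InputMode::FromSuffix ? default_mode_for(filename)
                                              : requested;
}
```

With this shape, renaming 1.h to 1.cc changes the result of effective_mode("1.cc") but not of effective_mode("1.cc", InputMode::CxxHeader), which is why a build tool that always passes the explicit mode does not depend on suffixes at all.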

>
>> *** Real Life is made more complex by the need for multiple modules implementations to co-exist - which is also completely outside the remit of the standard.
>
> I fully agree 😊

> Oh, I forgot to add: in real life, header units have to co-exist with PCHs, all compiled in C++20 mode. That is what we found the hard way with Office codebase, for example. So, the command line has to determine the compiler's behavior.

Yes, we do need that the driver can be told that the “mode” is C++20 but PCH … (at present, -fmodules on the command line will continue to mean ‘clang modules’). FWIW I’m not particularly happy that we’ve got the modality properly tamed yet, and there’s some on-going discussion in both phab reviews and amongst the modules folks.

Iain


>
> -- Gaby
>
> -----Original Message-----
> From: Iain Sandoe <iain_at_[hidden]>
> Sent: Friday, April 15, 2022 1:12 PM
> To: Gabriel Dos Reis <gdr_at_microsoft.com>
> Cc: sg15_at_[hidden]; Nathan Sidwell <nathan_at_acm.org>; Nico Josuttis <nico_at_josuttis.de>
> Subject: Re: [SG15] [isocpp-ext] Can we expect that all C++ source files can have the same suffix?
>
>
>
>> On 15 Apr 2022, at 21:01, Gabriel Dos Reis <gdr_at_microsoft.com> wrote:
>>
>> [Iain]
>>> +1 on that.
>>
>> What does clang++ say when invoked as "clang++ 1.h" on the following translation unit stored in the source file 1.h?
>>
>> // 1.h
>> struct S { int i; };
>>
>> constexpr S s { 3 };
>>
>> ?
>>
>> I get a diagnostic saying something about 'c-header', 'c++-header', and deprecated behavior. I don't have that diagnostic when I rename 1.h to 1.cc.
>> Is that to be expected? Why?
>
> Work In Progress; actually I hope to land the patches for driver header unit work over the next few days (the FE part is landed already).
>
> For header units I’ve followed the same pattern as GCC (we would consume your example header as a HU for -std=c++20; but as a PCH otherwise), that includes removing the “deprecated behaviour” warning in the C++20 “mode”***
>
> clang will get the -fmodule-header{=} command line switches as part of that set; which allows the user to specify that headers for HUs are searched for in the system / user header search paths.
>
> Iain
>
> *** Real Life is made more complex by the need for multiple modules implementations to co-exist - which is also completely outside the remit of the standard.
>
>>
>> -- Gaby
>>
>> -----Original Message-----
>> From: SG15 <sg15-bounces_at_lists.isocpp.org> On Behalf Of Iain Sandoe via SG15
>> Sent: Friday, April 15, 2022 12:02 PM
>> To: sg15_at_[hidden]
>> Cc: Iain Sandoe <iain_at_[hidden]>; Nathan Sidwell <nathan_at_acm.org>; Fred J. Tydeman via Ext <ext_at_lists.isocpp.org>; WG21 Tooling Study Group SG15 <tooling_at_open-std.org>; Nico Josuttis <nico_at_josuttis.de>
>> Subject: Re: [SG15] [isocpp-ext] Can we expect that all C++ source files can have the same suffix?
>>
>>
>>
>>> On 15 Apr 2022, at 19:45, Nico Josuttis via SG15 <sg15_at_lists.isocpp.org> wrote:
>>>
>>> Thanks Nathan.
>>> And as I see it now I agree.
>>>
>>> While conventions for file suffixes help, every compiler should be able to process C++ translation and module units just based on their content.
>>> That gives BY FAR the best flexibility.
>>
>> +1 on that.
>>>
>>> I opened a bug for Visual C++ to enable that.
>>> I'd assume there is no problem to fulfill that request, because you only have to find the module declaration to know which option to turn on.
>>> And with "module;" at the beginning you know whether such a declaration will come.
>>>
>>> Once Visual C++ has the fix, suddenly all the suffix and options discussion is gone.
>>> (ok, clang could then follow).
>>
>> As far as my current work on the clang implementation goes this is the case - including an implementation of the scheme where a BMI is generated "on demand" as the source parsing discovers the need to emit an interface (which is what Nathan implemented in GCC, avoiding multiple invocations of the compiler for most cases where a BMI _and_ an object are needed). The latter is somewhat cleaner with the module-mapper scheme (but is also workable without). There is also an implementation of the module mapper in the pipeline.
>>
>> Iain
>>
>>>
>>> On 15 April 2022 at 20:20:45 CEST, Nathan Sidwell <nathan_at_acm.org> wrote:
>>> On 4/13/22 17:10, Nico Josuttis via SG15 wrote:
>>> I should add that the fact that we need
>>> module;
>>> at the beginning of the global module fragment was only introduced to let a file identify itself as a module file.
>>> If we required different suffixes, that would not have been necessary.
>>>
>>> But correct me if I am wrong.
>>>
>>> I shall correct you :)
>>>
>>> Here's the history (as I recall, all persons mentioned are real, and not to be confused with fictitious characters)
>>>
>>> * prior to me doing things with gcc, there was only 'module FOO;' as a module declaration at-most once within a TU. MSVC (the only compiler with module smarts at the time), had a flag to tell it 'this is an interface' vs 'this is an implementation'.
>>>
>>> * I found this unsatisfying, as it meant that there was something outside the source tokens that told you how to interpret them. In effect we had two languages.
>>>
>>> * IIRC, Gaby, Jason (Merrill) and I came up with the 'export module FOO;' vs 'module foo;' distinction. But still this could be anywhere in the source stream. I was able to implement this functionality to a working system.
>>>
>>> * Daveed proposed an early signifier of 'hey, this is gonna be a module', should the actual module declaration not be first. Hence 'module;' was born. (My understanding was that this was driven by implementors, as they had difficulty entering a module-like mode not at start of compilation, and indeed it was a little tricky to do that. I do not know if this was also a user request.)
>>>
>>> * post p1103, the requirement that everything between 'module;' and the module decl come from #include came to be.
>>>
>>> Hope that helps.
>>>
>>> I know I've mentioned it more than once, but I find it unsettling, given there was great opposition to there being a (two way?) mapping between file names and module names, that there is a move in the direction of making file names 'significant'. ISTM that the desire for bob.$REGULARSUFFIX and alice.$MODULESUFFIX is taking us all the way back to the first objection above about having two languages.
>>>
>>> nathan
>>>
>>>
>>>
>>>
>>> On 13 April 2022 at 22:58:13 CEST, Nicolai Josuttis via Ext <ext_at_lists.isocpp.org> wrote:
>>>
>>> What I teach about modules is compelling. Programmers like and want to use it.
>>> However, they ask how they should organize module files in practice.
>>>
>>> So far I cannot recommend a specific suffix (and I might never be able to do
>>> that).
>>> However there is one important question that IMO the standard should answer:
>>> *Do we _need_ different suffixes?*
>>>
>>> I understand that a suffix discussion is only of practical value.
>>> But IMO the standard has to give an answer here (which has nothing to do
>>> with which suffixes are used).
>>>
>>> Let me elaborate that in detail:
>>>
>>> Not having a standard suffix has interesting consequences.
>>> So far we have header files and translation units.
>>> But once we know what a C++ translation unit is, we can just compile them
>>> all with the same compiler options or commands. Because in practice we have
>>> different suffixes for header and source files, we can set-up generic rules
>>> to compile our code.
>>>
>>> This works for any suffix, provided you know the way to tell the compiler
>>> that we have a C++ file here:
>>> (use /Tp with VC++ and -xc++ with gcc and you are done).
>>>
>>> Is this still true with modules?
>>> That is: Can we expect that identifying a file as a C++ file is enough to be
>>> able to (pre)compile it as a C++ file?
>>>
>>> Current compilers give different answers (AFAIK):
>>>
>>> - *gcc* says the same suffix is possible. There is no special option for
>>> modules.
>>> I can still have my own suffixes and use -xc++ though.
>>>
>>> - *VC++* currently requires different suffixes or different command-line
>>> arguments.
>>> Identifying a file as C++ file is not enough.
>>> For example
>>> - This is not enough: /Tp mymod.cppm
>>> - You need: /interface /Tp mymod.cppm
>>>
>>> I wonder whether the behavior of VC++ is standard conforming.
>>>
>>> I see no place in the C++ standard saying that there has to be different
>>> treatment of C++ source files to make them work.
>>> Or do we require this somewhere?
>>> We do not require different treatment just because we have templates,
>>> namespaces, or exceptions used inside.
>>> Therefore, I would expect that also using modules does not require special
>>> handling.
>>> (This is independent from the question whether different suffixes help to
>>> deal with these files).
>>>
>>> If I am right, VC++ is not standard conforming.
>>>
>>>
>>> In any case it would help a lot to clarify:
>>> Can all C++ source files expect that treating them the same way works fine?
>>>
>>> If not, we obviously need different suffixes. But then we should clearly say
>>> so (without necessarily saying which suffix it is).
>>>
>>>
>>> I hope this question brings us a bit forward, so that we can teach the first
>>> *portable* "hello, modules" example.
>>>
>>> Thanks
>>>
>>> Nico
>>>
>>> -- ---
>>> Nicolai M. Josuttis
>>> http://www.josuttis.de/
>>> +49 (0)531 / 129 88 86
>>> +49 (0)700 / JOSUTTIS
>>>
>>> Books:
>>> C++: http://cppstd20.com/,
>>> http://cppstdlib.com/
>>>
>>> --
>>> Nico Josuttis
>>> (sent from my mobile phone)
>>> SG15 mailing list
>>> SG15_at_lists.isocpp.org
>>> https://lists.isocpp.org/mailman/listinfo.cgi/sg15
>>>
>>>
>>> --
>>> Nathan Sidwell
>>> --
>>> Nico Josuttis
>>> (sent from my mobile phone)
>>

Received on 2022-04-16 19:26:38