sg15

Re: [isocpp-ext] Can we expect that all C++ source files can have the same suffix?

From: Ben Boeckel <ben.boeckel_at_[hidden]>
Date: Wed, 20 Apr 2022 14:46:48 -0400
On Wed, Apr 20, 2022 at 20:10:09 +0200, Nicolai Josuttis via Ext wrote:
> But if we are interested in the success of modules, we should care.
> Simplicity is a key for the success of new ideas.
> So, while I agree that we need build systems in projects of significant
> size, the question is whether we still keep things as simple as
> possible. And the question is which burden we put on the build tools.
>
> IMO a compiler should parse a C++ file while a build system should not
> have to do that.

Yes. I agree. CMake doesn't parse C++ code. It asks the compiler what
it thinks of a given source file. Because `#ifdef __GNUC__` is still a
thing, I *have* to ask the compiler because chasing built-in
preprocessor state and magic like `__has_feature` for a build system is
just not possible without *being* the compiler in the end. Otherwise,
one would have to wait for a new build system release for each new
compiler that comes out that isn't trivially the same as its previously
supported version(s). But then I question the value of such a release,
so that is likely to be the empty set in practice.

> I assume we all know what interesting corner cases can occur to find out
> the first two tokens of a C++ unit (outside leading (nested) comment).
> Do we really want that all tools that call compilers (build systems,
> scripts, and yes, even programmers) find out the module type by parsing
> code?
> And how long do we want to wait until all possible tools have written
> the necessary C++ parser?

Again, compilers should offer a way to ask for this information. It should
ideally be communicated in the format described by P1689 (though it
doesn't distinguish between `-interface`, `-internalPartition`, and
flagless because the scanning needs that information up front anyways).
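
To illustrate why guessing the unit type from source text is fragile, here is a minimal sketch of a text-only classifier. It is not clmod.py and not what any compiler does; the function name `classify` and the five category labels are assumptions made for illustration. It strips comments, then looks for the first module declaration at the start of a line.

```python
import re

# Strip /* ... */ and // ... comments. Naive: ignores string literals,
# raw strings, and line continuations, which a real lexer must handle.
_COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)

# A module declaration at the start of a line:
#   [export] module name[:partition];
# A bare `module;` (global module fragment) and `module :private;`
# deliberately do not match, because a name is required.
_MODULE_DECL = re.compile(
    r"^[ \t]*(?P<export>export[ \t]+)?module[ \t]+"
    r"(?P<name>[A-Za-z_][\w.]*)"
    r"(?:[ \t]*:[ \t]*(?P<part>[A-Za-z_][\w.]*))?[ \t]*;",
    re.MULTILINE,
)


def classify(source: str) -> str:
    """Guess the kind of translation unit from its text alone (hypothetical labels)."""
    text = _COMMENT.sub(" ", source)
    match = _MODULE_DECL.search(text)
    if match is None:
        return "non-module"
    exported = match.group("export") is not None
    partition = match.group("part") is not None
    if exported:
        return "interface-partition" if partition else "primary-interface"
    return "internal-partition" if partition else "implementation"


# The preprocessor defeats this approach: the first textual match wins,
# regardless of which branch the compiler would actually take.
conditional = (
    "#ifdef __GNUC__\n"
    "export module simple;\n"
    "#else\n"
    "module simple;\n"
    "#endif\n"
)
print(classify(conditional))
```

The last call reports an interface unit for every compiler, even though only one branch of the `#ifdef` is ever active. The sketch also ignores attributes on the module declaration and declarations produced by macros, so it can only ever be an approximation of what the compiler knows.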

> My workaround, clmod.py at https://github.com/josuttis/cppmodules took
> me half a day, but it is not perfect and I already had to react on
> requests to fix possible issues (so the time I need for it grows)?
> Do we really pass such a buck to all programmers and tool vendors?

Do I wish compiling C++ as `$CXX -Dsimple=1 -Iflags -Wallowed *.cxx`
just worked? Yes. Would everyone be happy with such a thing? Given how
common per-source flags, ABI-affecting `-f` flags, and custom linker woo
actually are in the world, I think that ship has sailed around the
world multiple times by this point.

> I learned yesterday that the/one reason Microsoft has a problem with
> supporting no specific extensions and no specific command-line options
> is that they have options to inject headers into c++ source files (/FI
> and -include) and want to support that still for module units. They have
> to know when starting the compilation, whether it is a module to decide
> whether to inject the header file at the front or in the global module
> fragment.

It is the existence (and use) of flags like this that makes building C++
such a tedious and hard problem (IMO). I understand *why* these flags
exist, but I think they'd be better specified with the source code
itself than in sidecar metadata files (Makefile, CMake, shell scripts,
.vcxproj files, whatever) because if that `-ffast-math` flag is *so*
important, I'd appreciate a way to put it beside the code that cares so
much in a more meaningful way than a comment.

> I don't know whether that approach is valid at all. But it seems to
> hinder Microsoft from coming up with a simple clean solution for the problem
> (yes, Gaby, formally this is *not* a "problem", you do everything
> standard conforming).
>
> This problem can be solved and should be solved by the compiler. So I
> strongly recommend to do that.

At what point do compilers say "your build graph is too complicated, go
use a build system"? When you have more than one binary artifact
involved (because now you need two compiler commands to communicate in
some way about module caches)? When you have non-stdlib dependencies
(because this just piles onto the "more than one artifact" use case)?

> If we would agree on "compilers parse; build tools call compilers", I
> see only the following options:
>
> *a) There is a portable way to signal (different) module files*
>
> Visual C++ has .ixx for that, but unfortunately that is not standardized.
> In addition, there is no suffix for internal partitions although VC++
> needs a special option for them (and only for them).
>
> From what I hear, nobody uses internal partitions yet. So OK, we might
> come up with a suffix later.
> However, I see a problem of agreeing on a standard suffix like .ixx
> (after 5 years we were not successful).

Especially when it will likely need to be a *new* suffix to avoid
applying new semantics to some extension that got popular in one corner
of the world.

> *b) Compilers find out themselves which kind of modules unit they get.*
>
> This is an approach gcc/g++ already runs successfully AFAIU.
>
> As I am not aware of any technical reason not to be able to follow option b)
> (with minor effort it is even possible to still inject header files) I
> strongly recommend that we all agree on option b).

I do as well. Because this is what using C++ modules in CMake looks
like (today) because of the pre-classification:

add_library(simple)
target_sources(simple
  PRIVATE
    use.mpp
  PRIVATE
    FILE_SET cxx_modules TYPE CXX_MODULES FILES
      duplicate.mpp
      another.mpp)

Note that `use.mpp` can also be in the `add_library` call itself; just
moved for uniformity here. The semantics are that if any source file not
in a `CXX_MODULES` fileset provides a module (or
`CXX_MODULE_INTERNAL_PARTITIONS` providing a non-exported partition), we
error (at build time) to aid in cross-platform projects. If we allow
what worked with GCC:

add_library(simple
  use.mpp
  duplicate.mpp
  another.mpp)

then porting to MSVC after migrating with GCC is a pain for projects
because the former classification becomes necessary.
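
The build-time check described above (a source outside a `CXX_MODULES` file set that provides a module is an error) can be sketched roughly as follows. This is not CMake's implementation; the function name, the shape of the scan results, and the module names are all assumptions made for illustration.

```python
def misplaced_module_sources(provides_by_source, module_file_set):
    """Return sources that provide a module but are not in the module file set.

    provides_by_source: dict mapping a source path to the list of module
        names the compiler's scan reported as provided (hypothetical shape).
    module_file_set: iterable of source paths listed in the module file set.
    """
    listed = set(module_file_set)
    return sorted(
        source
        for source, provided in provides_by_source.items()
        if provided and source not in listed
    )


# Hypothetical scan results for the three sources in the example above;
# the module names are invented for illustration.
scan = {
    "use.mpp": [],
    "duplicate.mpp": ["duplicate"],
    "another.mpp": ["another"],
}

# Classified up front: nothing to report.
print(misplaced_module_sources(scan, ["duplicate.mpp", "another.mpp"]))

# Listed as plain sources, GCC-style: both providers are flagged.
print(misplaced_module_sources(scan, []))
```

Reporting the flagged files as errors even on a compiler that could cope without the classification is what keeps a project that was migrated with GCC from breaking when it is later built with MSVC.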

Note that the visibility classification is helpful *anyways* to know
whether a module is private to a library or is eligible to be consumed
by targets which link to `simple`. It would just be far nicer if we could
remove `CXX_MODULE_INTERNAL_PARTITIONS` by just saying
`-internalPartition` is not supported (by removing the ambiguous
"partition implementation unit" interpretation). Alas, the flag exists
and people will want to use it, so it has to be considered in the
design.

> Yes, we all know now that Microsoft does not agree on option b).
>
> As we don't have two new standard suffixes yet (not even one), that has
> the following consequences:
>
> To use modules we have to wait for all possible tools we use to
> establish non-portable support for module files. That is each and every
> tool needs one way or the other to categorize the various C++ source files.
>
> IIUC, for cmake, the consequence is that "in order to specify C++20
> modules, one /must/ use |FILE_SETS| to list the sources"
> (see
> https://discourse.cmake.org/t/api-design-c-modules-source-listings-and-interface-properties/5389?s=09).

Yes, see above for the difference in what a user has to do. I think
filesets still have benefits, but having two flavors beyond "named" and
"header unit" is…not great.

> I am not aware of what the other tool vendors do (except me; now
> implementing a simple build tool for Visual C++);
> but please allow me to ask when do you think that all the major tools
> are in a state that we can use modules everywhere.

There's still a lot of work to be done. SG15 is discussing how to
communicate module information between projects (be it `.pc` files, JSON
documents, etc.). Until something like that gets decent adoption,
modules are locked to project-local usage so far AFAICT.

> I currently tell that modules are not ready for commercial use yet. This
> is not the only problem. But by waiting until all tool vendors have
> implemented a way to categorize or parse C++ source files, we give the
> success into hands outside our control.
> I suggest we keep control and solve the other issues and let modules
> become a success story.

--Ben

Received on 2022-04-20 18:46:51