Subject: Re: [Tooling] BMI distribution and reading BMI data
From: Corentin (corentin.jabot_at_[hidden])
Date: 2019-05-24 04:14:27


On Fri, 24 May 2019 at 04:44, Ben Craig <ben.craig_at_[hidden]> wrote:

> What are the current restrictions with regards to static libraries and
> link time optimization? Do those restrictions also apply to modules and
> link time optimizations?
>
> ------------------------------
> *From:* tooling-bounces_at_[hidden] <tooling-bounces_at_[hidden]> on
> behalf of Olga Arkhipova <olgaark_at_[hidden]>
> *Sent:* Thursday, May 23, 2019 7:30:57 PM
> *To:* Tooling_at_[hidden]; Gabriel Dos Reis; Anna Gringauze;
> Lukasz Mendakiewicz; Cameron DaCamara
> *Subject:* [EXTERNAL] [Tooling] BMI distribution and reading BMI data
>
>
> Hi all,
>
> I’d like to discuss the BMI usage and distribution topics – can we do that
> at tomorrow’s SG15 meeting? Or later?
>
> Thanks,
>
> Olga
>
>
>
>
>
> *BMI distribution*
>
>
>
> Currently, built modules (BMIs) are very similar to static libraries from
> a build perspective:
>
> 1. They are specific to the compiler version (i.e. they can only be used
> by a compiler that is binary compatible with the one that produced them).
>
> 2. If they depend on other modules, the BMIs of those modules need to be
> present too for a successful build.
>
> 3. A number of compiler switches which were used to build the
> module should match the compiler switches for the source which uses this
> module.
>
>
>
> So distribution of BMIs currently has limitations similar to those of
> the distribution of built static libraries:
>
> · has strict requirements on the compiler and other used libraries
> versions
>
> · limited to the platforms, architectures and #defines it is built
> for.
>
>
>
> BMI distribution definitely has a performance advantage for the builds
> which meet all restrictions and requirements, i.e. the same ones which can
> use built static libraries.
>

That advantage only applies to the first build.

>
>
> If the restriction of a BMI to a specific compiler and exact command line
> can be weakened somehow, or at least some data can be extracted from all
> BMIs, the performance advantage of BMI distribution can be wider.
>

I really do not think that we should make any effort to ease the
portability of BMIs: it opens the door to more ODR and ABI issues than we
already have.

>
>
> *Scenarios where extracting at least some data from BMIs is needed*
>
>
>
> VS IntelliSense (EDG)
>
> Visual Studio and VS Code support not only MSVC, but also clang and gcc.
>
> VS uses the EDG compiler as its IntelliSense engine, which currently
> supports MSVC, Clang and gcc modes. As the performance of EDG compilation
> is critical, ideally, EDG should be able to use modules already built by
> MSVC, clang and gcc.
>

Isn't that brittle? All these compilers may change their module formats
(and do so) at any time. How will EDG keep up?

>
>
> Linters (as-you-type code analysis) require additional data specified in
> the source (annotations, pragmas, attributes, contracts). Ideally, this
> information should always be present in the BMI, regardless of whether the
> code has been compiled for analysis or code generation.
>
> o Alternative: producing a new BMI for linters and IntelliSense, making
> it slower.
>
>
>
> Note: MSVC will have an option to include the original input source (not
> TU-expanded) into the IFCs.
>
>
>
> Clang-cl and clang-gcc
>
> Currently, clang-cl is (almost) ABI compatible with cl. I believe the same
> is true for clang and gcc.
>
> Should Clang-cl be able to use modules produced by MSVC and vice versa?
>
> Should Clang-gcc be able to use modules produced by gcc and vice versa?
>

Very personal opinion: no. Modules were designed to be transient build
artifacts.

>
>
> Build systems
>
> If BMIs are distributed together with their sources (like modules for MS
> standard libs), build systems might want to check whether the available
> BMIs are actually compatible with the current build settings and, if not,
> produce a different BMI from the source.
>
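
To make the build-system scenario above concrete, here is a minimal sketch of such a check. It assumes the build system can read some metadata next to a shipped BMI; `BmiInfo`, `Action`, `choose_action` and the idea of an exact-match rule are all hypothetical, and no compiler exposes this interface today.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical metadata a build system might read next to a shipped BMI.
struct BmiInfo {
    std::string compiler;             // compiler identity, e.g. "msvc-19.21" (illustrative)
    std::set<std::string> abi_flags;  // switches that must match in the consumer
};

enum class Action { ReuseBmi, RebuildFromSource };

// Assumption: the BMI is only reusable on an exact match of compiler and
// ABI-relevant switches. Real compilers may apply looser or stricter rules.
Action choose_action(const BmiInfo& shipped, const BmiInfo& current) {
    const bool compatible = shipped.compiler == current.compiler &&
                            shipped.abi_flags == current.abi_flags;
    return compatible ? Action::ReuseBmi : Action::RebuildFromSource;
}
```

Using a set for the switches means their order on the command line does not matter; any mismatch falls back to rebuilding from the distributed source.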

I have a question about Microsoft's plans. A few compilation flags are
routinely used (optimization levels, threading, debug, iterator debug and
so forth), and these could really easily lead to 64 or a lot more
different configurations.
Does Microsoft plan to ship a set of modules for each?
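
As a rough illustration of where a number like 64 comes from: the count is the product of the number of values per switch, so six independent on/off switches already give 2^6 = 64. The sketch below assumes such a set of switches; the axis names (including "exceptions" and "rtti", which I added to reach six) are purely illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One build axis: a switch and the values it can take (illustrative only).
struct Axis {
    std::string name;
    std::vector<std::string> values;
};

// Number of distinct configurations = product of the value counts per axis.
std::size_t count_configurations(const std::vector<Axis>& axes) {
    std::size_t total = 1;
    for (const Axis& axis : axes) {
        assert(!axis.values.empty() && "every axis needs at least one value");
        total *= axis.values.size();
    }
    return total;
}

// Hypothetical set of six two-valued switches.
std::vector<Axis> example_axes() {
    return {
        {"optimization", {"off", "on"}},
        {"threading", {"single", "multi"}},
        {"debug info", {"off", "on"}},
        {"iterator debug", {"off", "on"}},
        {"exceptions", {"off", "on"}},
        {"rtti", {"off", "on"}},
    };
}
```

Each additional switch multiplies the total, which is why shipping one prebuilt module set per configuration gets expensive quickly.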

>
>
> Static analysis (background code analysis, code analysis at build)
>
> Static analysis often requires additional data computed from the source
> which is normally not stored in the BMI. Such additional information is
> produced by static analysis tools during a separate analysis phase of the
> module and needs to be stored into a different BMI file.
>
> o Alternative 1: Always adding extra info which is not always needed is
> an unjustifiable performance expense.
>
> o Alternative 2: Adding this information to already created BMI file
> creates build system complications.
>
> The format of additional information is not defined at this point – the
> tools decide how to read and write the data. We recommend storing the data
> in a way that can be consumed by other tools and compilers.
>
>
>
>
>
> Other scenarios?
>
>
>
> *BMI data to extract*
>
>
>
> At minimum, the following data should be extractable from any BMI
>
>
>
> · Module source file name/location (as it was during the module
> build)
>
> · Compilation options used to build the BMI, especially the ones which
> have to match in the source using this module (#defines, etc.)
>
> · Referenced modules and their BMIs (unless included into
> compilation options)
>
> · Static analysis data
>
> o Imported header units and the referenced header
>
> o Additional information stored by static analysis tools
>
> o Anything that is possible to extract from header files using a
> compiler frontend – i.e. types, symbols, ASTs, source information,
> annotations (declspecs), pragmas, attributes, etc. (for consumption by code
> analysis)
>
>
>
> Should we encourage the compiler vendors to provide a way to do this for
> the BMIs they produce?
>

I think there should be a separate, compiler-independent, agreed-upon,
stable format for static analysis and completion purposes.
Gabriel and Bjarne's IPR seems to be an interesting area to explore.
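
To sketch what a compiler-independent record could carry, here is a plain struct covering the items from the "BMI data to extract" list above. This is not IPR and not any vendor's format; `ModuleSummary`, its field names, and the choice of which fields count as required are my assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, compiler-independent summary of one built module.
struct ModuleSummary {
    std::string source_path;                    // module source location at build time
    std::vector<std::string> compile_options;   // options consumers must match (#defines, etc.)
    std::vector<std::string> imported_modules;  // referenced modules
    std::vector<std::string> header_units;      // imported header units
    std::vector<std::string> annotations;       // attributes, pragmas, contracts for analysis tools
};

// Returns the names of the fields treated as required that are still empty.
// Assumption: only the source path and the compile options are mandatory.
std::vector<std::string> missing_required_fields(const ModuleSummary& m) {
    std::vector<std::string> missing;
    if (m.source_path.empty()) missing.push_back("source_path");
    if (m.compile_options.empty()) missing.push_back("compile_options");
    return missing;
}
```

A tool consuming such records would not need to understand any BMI format, only this summary.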

Regards,
Corentin



