std-proposals: Re: Resolving Static Initialization Order Fiasco by standardizing `weak` and `naked` symbols

From: Henry Miller <hank_at_[hidden]>
Date: Thu, 06 Jun 2019 17:02:37 -0500

I don't know what others will think, but I support this. Right now violations of the one definition rule are undefined by the standard. Tools like undefined behavior sanitizer will flag all cases where someone has two definitions. By definition the tool is right to do this, but in fact some cases are implementation defined in a useful way. Just getting the standard to acknowledge that cases where two different definitions exist are allowed, and that the results are implementation defined, would be a big win.

I would recommend you work this in phases: first just get existing practice (implementation defined) into the standard, then work on getting stronger definitions into the standard where possible. I think getting implementations to document their current process would help frame the discussion.

Do we actually need a different definition of weak and naked? It seems the only difference is you would have only one weak definition and maybe a strong one. With naked you have many possible definitions but no strong one. Is there any reason to maintain this, or can we just say "A definition is weak if a stronger definition supersedes it, otherwise the implementation shall choose one".

Note that I said implementation not linker - I find weak symbols useful when they are chosen by the runtime linker, but generally I think of the compile time linker when the term linker is used.
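For reference, existing practice on GCC and Clang looks roughly like the sketch below. It uses the non-standard `__attribute__((weak))` extension; the function names are made up for illustration. When no other object file provides a non-weak `backend_name`, the weak definition is the one that gets called; linking in a non-weak definition would replace it without any source change here.

```cpp
#include <cassert>
#include <cstring>

// Weak default definition (GCC/Clang extension, not standard C++).
// A non-weak definition of backend_name() in another object file would
// take precedence at link time; with none present, this one is used.
__attribute__((weak)) const char* backend_name() {
    return "default";
}

// Hypothetical caller: it cannot tell from the source which definition
// it will end up calling.
bool using_default_backend() {
    return std::strcmp(backend_name(), "default") == 0;
}
```

Whether the choice is made by the static linker or the runtime loader is exactly the kind of detail that differs between implementations today.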

I took a stab at doing this in standard-ese to help me think about what it means. This is a strawman in that it is probably wrong, but it should help people think. There are some things that I don't like about it, but I hope it codifies existing practice.

6.2.1 A variable, function [...] shall not be defined where a prior definition is necessarily reachable unless at most one of the definitions is not marked "weak"; or all are marked "naked" in an implementation-defined manner. [rest of paragraph]

6.2.10 Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program outside of a discarded statement. If there are weak definitions and one non-weak definition the non-weak definition shall be considered the definition and the program shall ignore the other definitions. If there are only weak or naked definitions the implementation shall choose one in an implementation-defined manner and not use the others. No diagnostic is required for all other cases. [rest of paragraph]

6.2.?-a If a weak or naked definition is marked inline the implementation may inline that definition as per the normal rules of inline without considering the one definition rule. If more than one definition is marked inline it is unspecified which to inline. A diagnostic is recommended.

6.2.12? Where more than one definition exists in a program all definitions shall be compatible by implementation-defined rules; no diagnostic required.

Notes on above

(6.2.1) This is the new one definition rule. I decided not to define a syntax to mark weak or naked. By not forcing anything we make this round easier on the vendors and have a better chance of getting somewhere quick.

(6.2.10) I think I got the rules right, I hope in a way that all vendors can agree to.

(6.2.?-a) This is probably the most controversial. Inline is one place where I think we need to be careful or things fall apart. Basically I'm saying automatic inlining of a weak definition is a stupid idea and no self-respecting compiler would do it. However, if the user marks a weak symbol as inline they must know what they are doing, and it is on them if the result breaks. A diagnostic is recommended because I think most coding standards will ban this practice, so a diagnostic to help them enforce the ban is helpful, even if the compiler chooses not to inline. I don't think we need to note that inlining of weak symbols is generally illegal since a stronger definition might exist.

(6.2.12?) You need to define your ABI well enough that I can write two different implementations and substitute one for the other. The standard currently says "same sequence of tokens" for this, but since we want to allow different sequences of tokens, I'm not sure how else to write this. No diagnostic is required because at least some cases cannot be detected.
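To make the selection rule in the strawman easier to argue about, here is one possible reading of it written as a small C++ model. It is only a sketch: the `Marking` enum and `choose_definition` are hypothetical names, the strawman defines no syntax, and "pick the first" merely stands in for the implementation-defined choice.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model of how each definition of one entity is marked.
enum class Marking { Strong, Weak, Naked };

// Returns the index of the definition the program would use under this
// reading of the strawman, or std::nullopt when the set is not allowed.
std::optional<std::size_t> choose_definition(const std::vector<Marking>& defs) {
    std::size_t strong = 0, naked = 0, strong_index = 0;
    for (std::size_t i = 0; i < defs.size(); ++i) {
        if (defs[i] == Marking::Strong) { ++strong; strong_index = i; }
        if (defs[i] == Marking::Naked)  { ++naked; }
    }
    if (defs.empty()) return std::nullopt;              // nothing to choose from
    if (strong > 1) return std::nullopt;                // ordinary ODR violation
    if (strong == 1 && naked > 0) return std::nullopt;  // strong mixed with naked (6.2.1 as drafted)
    if (strong == 1) return strong_index;               // the non-weak definition wins
    // Only weak and/or naked definitions remain: the implementation chooses.
    // Taking the first one is an assumption made purely for this sketch.
    return 0;
}
```

Writing it down this way exposes at least one open question: whether a naked definition should be preferred over a weak one when both are present and there is no strong definition.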

I hope this helps you write your paper.

On Wed, Jun 5, 2019, at 4:01 PM, Omer Rosler via Std-Proposals wrote:
> Hello,
> This post is supposed to be a motivation for standardized `weak` and `naked` attributes.
> If this motivation deemed worthy, I will write a proposal.
> Note I am not a compiler/linker expert and I have no idea on the implementability of the proposed feature and will need help.
>
> I want to start a discussion both on `weak` and `naked` as well as the resolution to SIOF using the described construct.
>
> *Terminology*
> * a definition is said to be weak if we could provide another definition of the same entity, and if another non weak definition is given, the non weak definition is used by the linker.
> * a definition is said to be naked if we could provide more naked definitions of the same entity and the linker will choose one of them.
>
> I attempted to implement a library solution for SIOF, and unfortunately found none of the major compilers support both "weak" and "naked" definitions of the same symbol.
> I hope the motivation will inspire vendors to support both simultaneously, regardless of standardization attempts.
>
> Note that most vendors support these in some way or another (because it is useful), but the behavior is slightly different in each one.
> This is why the standard exists - to standardize common practice!
>
> Now for the real reason I needed *both* (overriding a weak definition with multiple naked ones, which none of the major compilers support).
>
> *The Problem*
>
> The static initialization order fiasco
>
> *Approach*
>
> Any solution directly in the language would either be a complete breaking change, or would need an automated refactor tool (or a spec that enables one to exist).
> As breaking changes on this scale are not acceptable, we need an automated refactoring tool, which needs to be scalable to operate on large code bases.
> The solution would be completely non breaking.
>
> Here, scalability means no "whole program" analysis, but per TU (or module? I'm not too familiar with them yet) code transformation that is non breaking if only some TUs refactored (no recompiling of the entire project to refactor).
>
> *Scalability*
> If we manage to define the SIO (static initialization order) for all globals/statics of one TU, how do we handle TUs depending on it?
> If we have a circular dependency in initialization of globals, there really is nothing to do (this behavior should be undefined, NDR).
> In any other case, the order is clear: the order of initialization should be according to the order of dependencies of the TUs.
>
> Whatever code transformation we do per TU, it need not know what other TUs depend on it.
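The ordering rule described above (each TU after the TUs it depends on, with circular dependencies having no valid order) amounts to a topological sort. The following is a minimal sketch of that idea using Kahn's algorithm; `DependencyGraph` and `initialization_order` are illustrative names, not part of any proposed tool, and a real tool would work per TU rather than on the whole graph.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <queue>
#include <string>
#include <vector>

// deps[tu] lists the TUs whose globals `tu` needs during its own initialization.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// Kahn's algorithm: returns an order in which every TU comes after its
// dependencies, or std::nullopt if the dependencies are circular.
std::optional<std::vector<std::string>> initialization_order(const DependencyGraph& deps) {
    std::map<std::string, int> unmet;                            // unmet dependency counts
    std::map<std::string, std::vector<std::string>> dependents;  // reverse edges
    for (const auto& [tu, needs] : deps) {
        unmet[tu] += static_cast<int>(needs.size());
        for (const auto& dep : needs) {
            unmet.try_emplace(dep, 0);  // a TU that only appears as a dependency
            dependents[dep].push_back(tu);
        }
    }
    std::queue<std::string> ready;
    for (const auto& [tu, count] : unmet)
        if (count == 0) ready.push(tu);

    std::vector<std::string> order;
    while (!ready.empty()) {
        std::string tu = ready.front();
        ready.pop();
        order.push_back(tu);
        for (const auto& user : dependents[tu])
            if (--unmet[user] == 0) ready.push(user);
    }
    if (order.size() != unmet.size()) return std::nullopt;  // a cycle remains
    return order;
}
```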
>
> *Single Translation Unit*
> There are two parts of resolving this per translation unit.
> 1. Detecting the "correct" order which is the tool's job and not the standard:
> This is a hard problem, but it seems this can be solved (in a single TU) assuming there aren't many globals defined in a single TU (unless there are many inline variables in the codebase, we're probably fine).
>
> 2. Apply a code transformation on the globals that guarantees their initialization order.
>
> *Scalable code transformation*
> This is the resolution for 2, and originally I envisioned this as a library only solution, but it required several ODR violations that seem reasonable, but that none of the major compilers implement - a combination of `weak` and `naked` attributes.
> I will describe this library solution here, and it is based on the existence of `weak` and `naked` context sensitive keywords.
>
> *Library Design Guidelines*
> * Zero runtime overhead (no static reference counters like the Nifty Counter Idiom, which is cited as a solution to SIOF in the C++ FAQ)
> * Define the deinitialization order (not necessarily resolve the deinitialization fiasco)
> * The only required change to the content of TU is where we access the global variables.
> * Works equally (ish) for static member variables.
> *Library Description*
> We assume the TU knows the correct order of initialization (we assume a tool analyzed this before the code transformation)
> and so we only need to group the globals into a structure where the order of deinitialization is defined.
> //lib's most basic API
> template<char... TU>
> struct globals_t {};
>
> template<char... TU>
> weak globals_t<TU...>& globals() {
>     static globals_t<TU...> ret; //globals ctor is here by default
>     return ret;
> }
>
> inline auto& get_globals()
> {
>     using return_t = globals_t<__SOME_COMPILER_SPECIFIC_MACRO_FOR_CURRENT_TU_NAME>;
>     static_assert(std::is_default_constructible<return_t>::value, "Your specialization is wrong");
>     return globals<__SOME_COMPILER_SPECIFIC_MACRO_FOR_CURRENT_TU_NAME>();
> }
> The TU can access the global variables after initialization using `get_globals` and the TU would define the globals by specializing `globals_t` and replacing the implementation of `globals()`
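The grouping idea can be tried in today's C++ without the proposed `weak` keyword. The sketch below is an approximation under stated assumptions: a tag type (`FilesystemTU`) stands in for the TU-name template argument, `Traced` and `construction_log` exist only to make the order observable, and a function-local static is used, which carries an initialization guard and therefore does not meet the zero-overhead goal.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records construction order for the demonstration only.
std::vector<std::string>& construction_log() {
    static std::vector<std::string> log;
    return log;
}

struct Traced {
    explicit Traced(const char* name) { construction_log().push_back(name); }
};

// Tag type standing in for the "current TU name" template argument.
struct FilesystemTU {};

template <class TU>
struct globals_t {};

// Members are initialized in declaration order, regardless of the order
// in which a constructor's initializer list mentions them.
template <>
struct globals_t<FilesystemTU> {
    Traced block_device{"block_device"};
    Traced fs{"fs"};
};

// Function-local static: constructed on the first call, so every caller
// sees fully constructed globals. The proposed `weak` marking is omitted.
template <class TU>
globals_t<TU>& globals() {
    static globals_t<TU> instance;
    return instance;
}
```

Calling `globals<FilesystemTU>()` any number of times constructs the members once, in declaration order.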
>
> For scalability, every TU's `globals_t` must contain the `globals_t` structure of every dependency it has, but this means the definition of `globals<Base>` is wrong when compiling the dependent TU. This is where the `weak` definition comes in: we override the definition of the `globals()` function. For example:
> //Filesystem.h
> #include "MemoryDevice.h"
>
> struct Filesystem {
>     Filesystem(MemoryDevice& memory) {}
> };
>
> //declare all global variables in a struct for assured order of initialization
> template<>
> struct globals_t<"Filesystem"> {
>     ProtectedMemoryDevice protected_block_device;
>     Filesystem protected_fs;
>     MemoryDevice unprotected_block_device;
>     Filesystem unprotected_fs;
>
>     //initialization code goes here
>     globals_t(): protected_block_device(),
>                  protected_fs(protected_block_device),
>                  unprotected_block_device(),
>                  unprotected_fs(unprotected_block_device) {}
> };
>
> //Add legacy declarations for the global variables, for TUs depending on this one that were not refactored yet
> [[deprecated("Use globals().protected_block_device")]] extern ProtectedMemoryDevice& protected_block_device;
> [[deprecated("Use globals().protected_fs")]] extern Filesystem& protected_fs;
> [[deprecated("Use globals().unprotected_block_device")]] extern MemoryDevice& unprotected_block_device;
> [[deprecated("Use globals().unprotected_fs")]] extern Filesystem& unprotected_fs;
>
> //Filesystem.cpp
> ProtectedMemoryDevice& protected_block_device = get_globals().protected_block_device;
> Filesystem& protected_fs = get_globals().protected_fs;
> MemoryDevice& unprotected_block_device = get_globals().unprotected_block_device;
> Filesystem& unprotected_fs = get_globals().unprotected_fs;
>
> //Logger.h
>
> //`Logger` depends on Filesystem, therefore the globals of `Logger` must be initialized afterwards.
> #include "Filesystem.h"
>
> //define LogFile and LoggerTaskThread here
>
> //the inheritance is explained below
> template<>
> struct globals_t<"Logger"> : public virtual globals_t<"Filesystem">
> {
>     LogFile log_file;
>     LoggerTaskThread logger_task;
>
>     globals_t(): log_file(this->protected_fs.open("log", "w")),
>                  logger_task(log_file) {}
> };
> //legacy declarations here...
>
> //override the weak default definition; the naked is explained below
> template<>
> extern naked globals_t<"Filesystem">& globals();
>
> //Logger.cpp
> //legacy definitions here...
> template<>
> globals_t<"Filesystem">& globals() {
>     return static_cast<globals_t<"Filesystem">&>(globals<"Logger">());
> }
>
>
> Overriding the weak definition of `globals<"Filesystem">` enables control of the order of initializations across translation units.
> Now why the virtual inheritance and naked definitions: to resolve diamond dependencies "automagically".
> Assume Four TUs: A, B, C, D
> D depends on B, C
> B and C both depend on A
>
> All have global variables. Now if D's globals_t inherits from both B's and C's globals_t, the overridden definitions of `globals<"A">()` return the same address, but without the `naked` specifier we'll have an ODR violation, even though we don't really care about it.
>
> The scalability here is the fact that we only need to override the definitions of the direct dependencies' `globals()` function and provide legacy definitions for our own TU.
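The diamond case can be checked with plain virtual inheritance in standard C++. In the sketch below the four structs are hypothetical stand-ins for the per-TU `globals_t` specializations, and the two accessor functions play the role of the two overriding definitions of `globals<"A">()`, one reached through B and one through C. Both yield the same `GlobalsA` subobject, and the shared base is constructed first and only once.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records construction order for the demonstration only.
std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical per-TU global blocks; A has no dependencies.
struct GlobalsA { GlobalsA() { init_log().push_back("A"); } };
struct GlobalsB : virtual GlobalsA { GlobalsB() { init_log().push_back("B"); } };
struct GlobalsC : virtual GlobalsA { GlobalsC() { init_log().push_back("C"); } };
struct GlobalsD : GlobalsB, GlobalsC { GlobalsD() { init_log().push_back("D"); } };

GlobalsD& globals_d() {
    static GlobalsD instance;
    return instance;
}

// Two paths to A's globals, one through B and one through C.
GlobalsA& globals_a_via_b() { return static_cast<GlobalsB&>(globals_d()); }
GlobalsA& globals_a_via_c() { return static_cast<GlobalsC&>(globals_d()); }
```

What standard C++ cannot express is having both accessors share one name across TUs, which is the gap the `naked` marking is meant to fill.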
>
> This is the main idea.
> For static globals, we can put them in the `globals_t` struct as private members, and befriend everything dependent on it (this is one file only, so this is scalable).
> Inline globals (or static globals in header files) can be handled with template metaprogramming.
> Static members are the hardest to support:
> As a static member's initialization may depend on a global variable's initialization and vice versa, we have to put their initialization inside the `globals_t` struct.
> This complicates the whole thing slightly as now we need to support access specifiers (`protected` is the hardest to support in a scalable way, I think it requires getting a typelist of all bases of a given type at compile time which is not standard as of yet), "templatize" `globals_t` somewhat (for static non-inline members of template classes) and handle name collisions (as we put static members of different classes in the same `globals_t` structure).
> Note this is achievable using heavy template metaprogramming and introducing global template function `statics<T>()` to get the static members of some type.
>
> Note that for backward compatibility, we can replace any global variable by a reference to the member inside `get_globals()` and the order is definitely defined, i.e. if we recompile A after the transformation and add all these references, the old B, C or D would still compile just fine, even if they are not updated (due to the legacy reference members).
>
> I'm leaving the details of all of these out of here for now, as I want to discuss the merit of this proposal before delving into a complicated library implementation.
>
> *Advantages*
> * It solves the static initialization order fiasco
> * It has zero runtime overhead
> * It can be automated
> * Every access to global variables would look like `globals().var`- i.e. the reader (both human and compiler) could detect `pure`ness very simply.
> * We get static constructors/destructors for free: put them in the body of the ctor of `globals_t`
> *Disadvantages*
> * It does not solve the static *de*initialization order fiasco, but forces an order instead of an undefined one (which is better IMO). But this is a breaking change for a code base - if the "correct" deinitialization order is necessarily different from the initialization one, then it would stay broken. It can be supported with some ugliness (separating the ctor/dtor of the globals from their space's allocation).
> * It requires one place in the translation unit where all globals are known (the place where `globals_t` is defined).
> * The template metaprogramming might take a toll on compile times (preserving access specifiers of static members through an inheritance hierarchy is not very scalable)
> * We have to wrap a lot of the code here in ugly macros to hide all the gory template metaprogramming details (and the naked, weak tricks, and the multiple inheritance etc.). Reflection and Metaclasses would remove all this ugliness, and provide a (complicated) library only solution.
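The forced deinitialization order mentioned in the first disadvantage is simply the reverse of member declaration order, as the small sketch below shows. `Traced`, `ExampleGlobals` and `teardown_log` are illustrative names only.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records destruction order for the demonstration only.
std::vector<std::string>& teardown_log() {
    static std::vector<std::string> log;
    return log;
}

struct Traced {
    std::string name;
    explicit Traced(std::string n) : name(std::move(n)) {}
    ~Traced() { teardown_log().push_back(name); }
};

// Declared in dependency order, so construction goes device, fs, logger.
struct ExampleGlobals {
    Traced device{"device"};
    Traced fs{"fs"};
    Traced logger{"logger"};
};
```

Destroying an `ExampleGlobals` object tears down `logger`, then `fs`, then `device`; a code base that needs any other order would have to decouple storage from construction and destruction, as noted above.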
>
> *Proposals*
> This comprises three parts:
> 1. [Core] Add `weak` and `naked` context sensitive keywords (they have an observable effect on the program, therefore cannot be attributes according to the guidelines).
> The behaviour of those keywords in the standard would be defined by exceptions to the ODR.
> 2. [Library] Add to the standard library the needed ingredients for this construct.
> 3. [SG15] Some description of the code transformation of the tool (I'm not sure what can actually be proposed there).
>
> The motivation to put the library in the standard as well is that in the (optimistic, non realistic) future where the access to globals everywhere is via `lib::globals()` this "lib" should very definitely be "std"
> especially as this library would be a very complex template machine which might need compiler hooks for optimized implementation.
>
> Note I have not written any details for the proposal as I want to discuss merit beforehand.

Received on 2019-06-06 17:04:25