Hello,

This post is supposed to be a motivation for standardized `weak` and `naked` attributes.

If this motivation deemed worthy, I will write a proposal.

Note I am not a compiler/linker expert and I have no idea on the implementability of the proposed feature and will need help.

I want to start a discussion both on `weak` and `naked` as well as the resolution to SIOF using the described construct.

Terminology

a definition is said to be weak if we could provide another definition of the same entity, and if another non weak definition is given, the non weak definition is used by the linker.
a definition is said to be naked if we could provide more naked definitions of the same entity and the linker will choose one of them.

I attempted to implement a library solution for SIOF, and unfortunately found none of the major compilers support both "weak" and "naked" definitions of the same symbol.

I hope the motivation will inspire vendors to support both simultaneously, regardless of standardization attempts.

Note that most vendors support these in some way or another (because it is useful, but the behavior is slightly different in each one.

This is why the standard exists - to standardize common practice!

Now for the real reason I needed both (override a weak definition with multiple naked ones, which none of the major compiler support)

The Problem

The static initialization order fiasco

Approach

Any solution directly in the language would either be a complete breaking change, or would need an automated refactor tool (or a spec that enables one to exist).

As breaking changes on this scale are not acceptable, we need an automated refactoring tool, which needs to be scalable to operate on large code bases.

The solution would be completely non breaking.

Here, scalability means no "whole program" analysis, but per TU (or module? I'm not too familiar with them yet) code transformation that is non breaking if only some TUs refactored (no recompiling of the entire project to refactor).

Scalability

If we manage to define the SIO (static initialization order) for all globals/statics of one TU, how do we handle TUs depending on it?

If we have a circular dependency in initialization of globals, there really is nothing to do (this behavior should be undefined, NDR).

In any other case, the order is clear: the order of initialization should be according to the order of dependencies of the TUs.

Whatever code transformation we do per TU, it needs not know what other TUs depend on it.

Single Translation Unit

There are two parts of resolving this per translation unit.

1. Detecting the "correct" order which is the tool's job and not the standard:

This is a hard problem, but it seems this can be solved (in a single TU) assuming there aren't many globals defined in a single TU (unless there are many inline variables in the codebase, we're probably fine).

2. Apply a code transformation on the globals that guarantees their initialization order.

Scalable code transformation

This is the resolution for 2, and originally I envisioned this as a library only solution, but it required several ODR violations that seem reasonable, but none of the major compiler implements - a combination of `weak` and `naked` attributes.

I will describe this library solution here, and it is based on the existence of `weak` and `naked` context sensitive keywords.

Library Design Guidelines

Zero runtime overhead (no static reference counters like the Nifty Counter Idiom, which is sites as a solution to SIOF in the C++ FAQ)
Define the deinitialization order (not necessarily resolve the deinitialization fiasco)
The only required change to the content of TU is where we access the global variables.
Works equally (ish) for static member variables.

Library Description

We assume the TU knows the correct order of initialization (we assume a tool analyzed this before the code transformation)

and so we only need to group the globals into a structure where the order of deinitialization is defined.

//lib's most basic API

template<char... TU>
struct globals_t {};


template<char... TU>
weak globals_t<TU>& globals() {

    static globals_t<TU...> ret; //globals ctor is here in default

    return ret;
}

inline auto& get_globals()
{
    using return_t = globals_t<__SOME_COMPILER_SPECIFIC_MACRO_FOR_CURRENT_TU_NAME>;
    static_assert(std::is_default_constructible<return_t>, "Your specialization is wrong");
    return globals<__SOME_COMPILER_SPECIFIC_MACRO_FOR_CURRENT_TU_NAME>();
}

The TU can access the global variables after initialization using `get_globals` and the TU would define the globals by specializing `globals_t` and replacing the implementation of `globals()`

For scalability, every TU's `globals_t` must contain the the `global_t` structure of every dependency it has, but this means the definition of `globals<Base>` is wrong when compiling the dependent TU. This is where the `weak` definition comes in, we override the definition of the `globals()` function. For example:

//Filesystem.h

#include "MemoryDevice.h"

struct Filesystem {
    Filesystem(MemoryDevice& memory) {}
};

//declare all global variables in a struct for assured order of initialization
template<>
struct globals_t<"Filesystem"> {
    ProtectedMemoryDevice protected_block_device;
    Filesystem protected_fs;
    MemoryDevice unprotected_block_device;
    Filesystem unprotected_fs;

    //initialization code goes here
    globals_t(): protected_block_device(),
                 protected_fs(protected_block_device),
                 unprotected_block_device(),
                 unprotected_fs(unprotected_block_device) {}
};

//Add legacy definitions for the global variables, which are for TUs depending on this one, which were not refactored yet

extern ProtectedMemoryDevice& [[deprecated("Use globals().protected_block_device")]] protected_block_device;
extern Filesystem& [[deprecated("Use globals().protected_fs")]] protected_fs;
extern MemoryDevice& [[deprecated("Use globals().unprotected_block_device")]] unprotected_block_device;
extern Filesystem& [[deprecated("Use globals().unprotected_fs")]] unprotected_fs;

//Filesystem.cpp

ProtectedMemoryDevice& protected_block_device = get_globals().protected_block_device;
Filesystem& protected_fs = get_globals().protected_fs;
MemoryDevice& unprotected_block_device = get_globals().unprotected_block_device;
Filesystem& unprotected_fs = get_globals().unprotected_fs;



 
//Logger.h

//`Logger` depends on Filesystem, therefore the globals of `Logger` must be initialized afterwards.
#include "Filesystem.h"

//define LogFile and LoggerTaskThread here

//the inheritance explained below
template<>
struct globals_t<"Logger"> : public virtual globals_t<"Filesystem">
{
    LogFile log_file;
    LoggerTaskThread logger_task;

    globals_t(): log_file(this->protected_fs.open("log", "w")),
                 logger_task(log_file) {}
};
//legacy declerations here...

// override the weak default definition, the naked explained below
template<>
extern naked globals_t<"Filesystem">& globals();

//Logger.cpp

//legacy definitions here...

template<>
globals_t<"Filesystem">& globals() {
    return static_cast<globals_t<"FileSystem">&>(globals<"Logger">());
}

Overriding the weak definition of `globals<"Filesystem">` enables control of the order of initializations across translation units.

Now why the virtual inheritance and naked definitions: to resolve diamond dependencies "automagically".

Assume Four TUs: A, B, C, D

D depends on B, C

B and C both depend on A

All have global variables, now if D's globals_t inherit from both B and C's globals_t, the overriden definitions of `globals<"A">()` return the same address, but without the `naked` specifier we'll have an ODR violation, even though we don't really care about it.

The scalability here is the fact that we only need to override the definitions of the direct dependencies's `globals()` function and provide legacy definitions for our own TU.

This is the main idea.

For static globals, we can put them in the `globals_t` struct as private members, and befriend everything dependent on it (this is one file only, so this is scalable).

Inline globals (or static globals in header files) can be handled with template metaprogramming.

Static members are the hardest to support:

As a static member's initalization may depend on a global variable's initialization and vice versa, we have to put their initialization inside the `globals_t` struct.

This complicates the whole thing slightly as now we need to support access specifiers (`protected` is the hardest to support in a scalable way, I think it requires getting a typelist of all bases of a given type at compile time which is not standard as of yet), "templatize" `globals_t` somewhat (for static non-inline members of template classes) and handle name collisions (as we put static members of different classes in the same `globals_t` structure).

Note this is achievable using heavy template metaprogramming and introducing global template function `statics<T>()` to get the static members of some type.

Note that for backward compatibility, we can replace any global variable by a reference to the member inside `get_globals()` and the order is definitely defined, i.e. if we recompile A after the transformation and add all these references, the old B, C or D would still compile just fine, even if they are not updated (due to the legacy reference members).

I'm leaving the details of all of these out of here for now, as I wan't to discuss the merit of this proposal before delving into complicated library implementation.

Advantages

It solves the static initialization order fiasco
It has zero runtime overhead
It can be automated
Every access to global variables would look like `globals().var`- i.e. the reader (both human and compiler) could detect `pure`ness very simply.
We get static constructors/destructors for free: put them in the body of the ctor of `globals_t`

Disadvantages

It does not solve the static deinitialization order fiasco, but forces an order instead of an undefined one (which is better IMO). But this is a breaking change for a code base - if the "correct" deinitialization order is nesscesarily different from the initialization one, then it would stay broken. It can be supported with some ugliness (seperating the ctor/dtor of the globals from their space's allocation).
It requires one place in the translation unit where all globals are known (the place where `globals_t` is defined).
The template metaprogramming might take a toll on compile times (preserving access specifiers of static members through inheritance hieriechy is not very scalable)
We have to wrap a lot of the code here in ugly macros to hide all the gory template metaprogramming details (and the naked, weak tricks, and the multiple inheritance etc.). Reflection and Metaclasses would remove all this ugliness, and provide a (complicated) library only solution.

Proposals

This is comprised of three parts

1. [Core] Add `weak` and `naked` context sensitive keywords (they have observable effect on the program, therefore cannot be attributes according to the guidelines).

The behaviour of those keyword in the standard would be defined by exceptions to the ODR rule.

2. [Library] Add to the standard library the needed ingredients for this construct.

3. [SG15] Some description of the code transformation of the tool (I'm not sure what can actually be proposed there).

The motivation to put the library in the standard as well is that in the (optimistic, non realistic) future where the access to globals everywhere is via `lib::globals()` this "lib" should very definitely be "std"

especially as this library would be a very complex template machine which might need compiler hooks for optimized implementation.

Note I have not written any details for the proposal as I want to discuss merit beforehand.