Date: Thu, 6 Jun 2019 00:00:54 +0300
Hello,
This post is supposed to be a motivation for standardized `weak` and
`naked` attributes.
If this motivation is deemed worthy, I will write a proposal.
Note that I am not a compiler/linker expert; I do not know how
implementable the proposed feature is, and I will need help.
I want to start a discussion both on `weak` and `naked` as well as the
resolution to SIOF using the described construct.
*Terminology*
- a definition is said to be weak if we could provide another definition
of the same entity, and if another non-weak definition is given, the
non-weak definition is used by the linker.
- a definition is said to be naked if we could provide more naked
definitions of the same entity and the linker will choose one of them.
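As a rough illustration of the existing practice these terms refer to, here is a minimal sketch using vendor extensions. It assumes GCC or Clang on an ELF platform, where `__attribute__((weak))` marks a weak definition. For "naked" in the sense above (several definitions, the linker keeps one), the closest standard analogue I am aware of is `inline`; note that GCC's own `naked` attribute means something unrelated (omitting the function prologue/epilogue). The names `backend_name` and `shared_seed` are hypothetical.

```cpp
#include <cassert>
#include <string>

// Weak default (GCC/Clang extension): a non-weak definition of
// backend_name() in any other TU would replace this one at link time.
__attribute__((weak)) const char* backend_name() { return "default"; }

// `inline` allows the same definition to appear in many TUs;
// the linker keeps a single copy.
inline int shared_seed() { return 7; }
```

With only this TU linked, `backend_name()` resolves to the weak default. What none of these spellings provide is the combination: a weak default replaced by several equivalent overriding definitions.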
I attempted to implement a library solution for SIOF, and unfortunately
found none of the major compilers support both "weak" and "naked"
definitions of the same symbol.
I hope the motivation will inspire vendors to support both simultaneously,
regardless of standardization attempts.
Note that most vendors support these in some way or another (because it is
useful), but the behavior is slightly different in each one.
This is why the standard exists - to standardize common practice!
Now for the real reason I needed *both* (overriding a weak definition with
multiple naked ones, which none of the major compilers support).
*The Problem*
The static initialization order fiasco
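For readers who have not run into it, a minimal sketch of the problem and of the usual construct-on-first-use workaround. The names `greeting` and `banner` are hypothetical. The unsafe variant is left as a comment because it only misbehaves when the two variables live in different TUs.

```cpp
#include <cassert>
#include <string>

// Unsafe pattern: if `greeting_var` were defined in another TU, its
// constructor might not have run yet when `banner_var` is initialized.
//   extern std::string greeting_var;
//   std::string banner_var = greeting_var + "!";

// Construct-on-first-use: the function-local static is initialized the
// first time control passes through its declaration.
inline std::string& greeting() {
    static std::string value = "hello";
    return value;
}

// Safe regardless of which TU's initializers run first.
std::string banner = greeting() + "!";
```

This workaround pays for a guard check on every access, which is the kind of runtime overhead the design below tries to avoid.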
*Approach*
Any solution directly in the language would either be a complete breaking
change, or would need an automated refactor tool (or a spec that enables
one to exist).
As breaking changes on this scale are not acceptable, we need an automated
refactoring tool, which needs to be scalable to operate on large code bases.
The solution would be completely non breaking.
Here, scalability means no "whole program" analysis, but a per-TU (or
per-module? I'm not too familiar with them yet) code transformation that is
non-breaking if only some TUs are refactored (no recompiling of the entire
project to refactor).
*Scalability*
If we manage to define the SIO (static initialization order) for all
globals/statics of one TU, how do we handle TUs depending on it?
If we have a circular dependency in initialization of globals, there really
is nothing to do (this behavior should be undefined, NDR).
In any other case, the order is clear: the order of initialization should
be according to the order of dependencies of the TUs.
Whatever code transformation we do per TU, it need not know what other TUs
depend on it.
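The ordering rule above is a topological sort of the TU dependency graph, with a cycle being the unsolvable case. A minimal sketch, assuming the dependencies are available as a map from TU name to the names it depends on (the `DepGraph` type and `init_order` function are hypothetical, not part of the proposed library):

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <queue>
#include <string>
#include <vector>

// Maps each TU name to the TUs it depends on (hypothetical input format).
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Returns an order in which every TU comes after its dependencies,
// or std::nullopt when the graph contains a cycle (Kahn's algorithm).
std::optional<std::vector<std::string>> init_order(const DepGraph& deps) {
    std::map<std::string, int> pending;  // unmet dependency count per TU
    std::map<std::string, std::vector<std::string>> dependents;
    for (const auto& [tu, needs] : deps) {
        pending.try_emplace(tu, 0);
        for (const auto& d : needs) {
            pending.try_emplace(d, 0);
            ++pending[tu];
            dependents[d].push_back(tu);
        }
    }
    std::queue<std::string> ready;
    for (const auto& [tu, n] : pending)
        if (n == 0) ready.push(tu);

    std::vector<std::string> order;
    while (!ready.empty()) {
        std::string tu = ready.front();
        ready.pop();
        order.push_back(tu);
        for (const auto& next : dependents[tu])
            if (--pending[next] == 0) ready.push(next);
    }
    if (order.size() != pending.size()) return std::nullopt;  // cycle
    return order;
}
```

This is a whole-program view, used here only to state the intended order; the point of the construct described below is that no tool has to compute it globally.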
*Single Translation Unit*
There are two parts of resolving this per translation unit.
1. Detecting the "correct" order, which is the tool's job and not the
standard's:
This is a hard problem, but it seems this can be solved (in a single
TU) assuming there aren't many globals defined in a single TU (unless there
are many inline variables in the codebase, we're probably fine).
2. Apply a code transformation on the globals that guarantees their
initialization order.
*Scalable code transformation*
This is the resolution for 2. Originally I envisioned this as a library-only
solution, but it required several ODR violations that seem reasonable
but that none of the major compilers implement - a combination of `weak` and
`naked` attributes.
I will describe this library solution here, and it is based on the
existence of `weak` and `naked` context sensitive keywords.
*Library Design Guidelines*
- Zero runtime overhead (no static reference counters like the Nifty
Counter Idiom, which is cited as a solution to SIOF in the C++ FAQ)
- Define the deinitialization order (not necessarily resolve the
deinitialization fiasco)
- The only required change to the content of TU is where we access the
global variables.
- Works equally (ish) for static member variables.
*Library Description*
We assume the TU knows the correct order of initialization (we assume a
tool analyzed this before the code transformation)
and so we only need to group the globals into a structure where the order
of deinitialization is defined.
//lib's most basic API
template<char... TU> struct globals_t {};
template<char... TU> weak globals_t<TU...>& globals() {
    static globals_t<TU...> ret; //globals ctor is here in default
    return ret;
}
inline auto& get_globals() {
    using return_t =
        globals_t<__SOME_COMPILER_SPECIFIC_MACRO_FOR_CURRENT_TU_NAME>;
    static_assert(std::is_default_constructible<return_t>::value,
                  "Your specialization is wrong");
    return globals<__SOME_COMPILER_SPECIFIC_MACRO_FOR_CURRENT_TU_NAME>();
}
The TU can access the global variables after initialization using
`get_globals`, and the TU would define the globals by specializing
`globals_t` and replacing the implementation of `globals()`.
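To make the ordering guarantee concrete, here is a sketch of the same shape in today's standard C++, without `weak`. It assumes tag types in place of the TU-name template arguments; `FilesystemTU`, `Tracked` and `trace()` are hypothetical helpers that only exist to make the order observable. Members are initialized in declaration order, which is what the construct relies on.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records construction and destruction events so the order is observable.
inline std::vector<std::string>& trace() {
    static std::vector<std::string> events;
    return events;
}

struct Tracked {
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n)) {
        trace().push_back("+" + name);
    }
    ~Tracked() { trace().push_back("-" + name); }
};

// Tag type standing in for the TU name used as a template argument.
struct FilesystemTU {};

template <class TU> struct globals_t {};  // primary template: no globals

template <> struct globals_t<FilesystemTU> {
    Tracked block_device{"block_device"};  // initialized first
    Tracked fs{"fs"};                      // initialized second
};

// Default accessor. Without `weak`, another TU cannot replace it.
template <class TU> globals_t<TU>& globals() {
    static globals_t<TU> instance;
    return instance;
}
```

The function-local static in this sketch still carries a guard check, so it is not yet the zero-overhead version; it only shows the grouping and the per-TU ordering.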
For scalability, every TU's `globals_t` must contain the `globals_t`
structure of every dependency it has, but this means the default definition
of `globals<Base>` is wrong when compiling the dependent TU. This is where
the `weak` definition comes in: we override the definition of the
`globals()` function. For example:
//Filesystem.h
#include "MemoryDevice.h"
struct Filesystem {
    Filesystem(MemoryDevice& memory) {}
};
//declare all global variables in a struct for assured order of initialization
template<> struct globals_t<"Filesystem"> {
    ProtectedMemoryDevice protected_block_device;
    Filesystem protected_fs;
    MemoryDevice unprotected_block_device;
    Filesystem unprotected_fs;
    //initialization code goes here
    globals_t(): protected_block_device(),
                 protected_fs(protected_block_device),
                 unprotected_block_device(),
                 unprotected_fs(unprotected_block_device) {}
};
//Add legacy declarations for the global variables, for TUs depending on
//this one that have not been refactored yet
[[deprecated("Use globals().protected_block_device")]]
extern ProtectedMemoryDevice& protected_block_device;
[[deprecated("Use globals().protected_fs")]]
extern Filesystem& protected_fs;
[[deprecated("Use globals().unprotected_block_device")]]
extern MemoryDevice& unprotected_block_device;
[[deprecated("Use globals().unprotected_fs")]]
extern Filesystem& unprotected_fs;
//Filesystem.cpp
ProtectedMemoryDevice& protected_block_device =
get_globals().protected_block_device;
Filesystem& protected_fs = get_globals().protected_fs;
MemoryDevice& unprotected_block_device = get_globals().unprotected_block_device;
Filesystem& unprotected_fs = get_globals().unprotected_fs;
//Logger.h
//`Logger` depends on Filesystem, therefore the globals of `Logger`
//must be initialized afterwards.
#include "Filesystem.h"
//define LogFile and LoggerTaskThread here
//the inheritance is explained below
template<> struct globals_t<"Logger"> : public virtual globals_t<"Filesystem"> {
    LogFile log_file;
    LoggerTaskThread logger_task;
    globals_t(): log_file(this->protected_fs.open("log", "w")),
                 logger_task(log_file) {}
};
//legacy declarations here...
// override the weak default definition, the naked is explained below
template<>
extern naked globals_t<"Filesystem">& globals();
//Logger.cpp
//legacy definitions here...
template<>
globals_t<"Filesystem">& globals() {
    return static_cast<globals_t<"Filesystem">&>(globals<"Logger">());
}
Overriding the weak definition of `globals<"Filesystem">` enables control
of the order of initializations across translation units.
Now why the virtual inheritance and naked definitions: to resolve diamond
dependencies "automagically".
Assume four TUs: A, B, C, D.
D depends on B and C.
B and C both depend on A.
All have global variables. Now if D's globals_t inherits from both B's and
C's globals_t, the overridden definition*s* of `globals<"A">()` return the
same address, but without the `naked` specifier we'll have an ODR violation,
even though we don't really care about it.
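The "same address" claim can be checked in standard C++ today. A minimal sketch, assuming plain structs in place of the `globals_t` specializations (`A_globals` through `D_globals`, `a_via_b` and `a_via_c` are hypothetical names): with virtual inheritance there is a single A subobject, so both routes to it agree.

```cpp
#include <cassert>

struct A_globals { int a = 1; };
struct B_globals : virtual A_globals { int b = 2; };
struct C_globals : virtual A_globals { int c = 3; };
struct D_globals : B_globals, C_globals { int d = 4; };

// The single most-derived object that owns every TU's globals.
inline D_globals& d_globals() {
    static D_globals g;
    return g;
}

// What B's and C's overrides of `globals<"A">()` would each return:
// the A subobject reached through the B path and through the C path.
inline A_globals& a_via_b() { return static_cast<B_globals&>(d_globals()); }
inline A_globals& a_via_c() { return static_cast<C_globals&>(d_globals()); }
```

Both functions yield a reference to the same object, so the two overriding definitions are interchangeable even though they are textually different, which is the situation `naked` is meant to permit.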
The scalability here is the fact that we only need to override the
definitions of the direct dependencies' `globals()` functions and provide
legacy definitions for our own TU.
This is the main idea.
For static globals, we can put them in the `globals_t` struct as private
members, and befriend everything dependent on it (this is one file only, so
this is scalable).
Inline globals (or static globals in header files) can be handled with
template metaprogramming.
Static members are the hardest to support:
As a static member's initialization may depend on a global variable's
initialization and vice versa, we have to put their initialization inside
the `globals_t` struct.
This complicates the whole thing slightly as now we need to support access
specifiers (`protected` is the hardest to support in a scalable way, I
think it requires getting a typelist of all bases of a given type at
compile time which is not standard as of yet), "templatize" `globals_t`
somewhat (for static non-inline members of template classes) and handle
name collisions (as we put static members of different classes in the same
`globals_t` structure).
Note this is achievable using heavy template metaprogramming and
introducing global template function `statics<T>()` to get the static
members of some type.
Note that for backward compatibility, we can replace any global variable by
a reference to the member inside `get_globals()` and the order is
definitely defined, i.e. if we recompile A after the transformation and add
all these references, the old B, C or D would still compile just fine, even
if they are not updated (due to the legacy reference members).
I'm leaving the details of all of these out of here for now, as I want to
discuss the merit of this proposal before delving into a complicated library
implementation.
*Advantages*
- It solves the static initialization order fiasco
- It has zero runtime overhead
- It can be automated
- Every access to global variables would look like `globals().var`, i.e.
the reader (both human and compiler) could detect `pure`ness very simply.
- We get static constructors/destructors for free: put them in the body
of the ctor of `globals_t`
*Disadvantages*
- It does not solve the static *de*initialization order fiasco, but
forces an order instead of an undefined one (which is better IMO). But this
is a breaking change for a code base: if the "correct" deinitialization
order is necessarily different from the initialization one, then it would
stay broken. It can be supported with some ugliness (separating the
ctor/dtor of the globals from their space's allocation).
- It requires one place in the translation unit where all globals are
known (the place where `globals_t` is defined).
- The template metaprogramming might take a toll on compile times
(preserving access specifiers of static members through an inheritance
hierarchy is not very scalable)
- We have to wrap a lot of the code here in ugly macros to hide all the
gory template metaprogramming details (and the naked, weak tricks, and the
multiple inheritance etc.). Reflection and Metaclasses would remove all
this ugliness, and provide a (complicated) library only solution.
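The "ugliness" mentioned in the first disadvantage could look roughly like the following sketch, which separates the storage of each global from its lifetime so that the destruction order can be chosen freely. It is only one possible shape; `slot`, `example_globals`, `Tracked` and `events()` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Records construction and destruction events so the order is observable.
inline std::vector<std::string>& events() {
    static std::vector<std::string> e;
    return e;
}

struct Tracked {
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n)) {
        events().push_back("+" + name);
    }
    ~Tracked() { events().push_back("-" + name); }
};

// Raw storage for a T whose lifetime is started and ended by hand.
template <class T> struct slot {
    alignas(T) unsigned char bytes[sizeof(T)];
    T* ptr = nullptr;
    template <class... Args> T& construct(Args&&... args) {
        ptr = ::new (static_cast<void*>(bytes)) T(std::forward<Args>(args)...);
        return *ptr;
    }
    void destroy() { ptr->~T(); ptr = nullptr; }
};

struct example_globals {
    slot<Tracked> device, fs;
    example_globals() {
        device.construct("device");
        fs.construct("fs");
    }
    // Deliberately not the reverse of the construction order.
    ~example_globals() {
        device.destroy();
        fs.destroy();
    }
};
```

The cost is that every access goes through a pointer and the constructor/destructor bodies must be kept in sync by hand, which is why I list this as a disadvantage rather than as part of the basic design.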
*Proposals*
This comprises three parts:
1. [Core] Add `weak` and `naked` context sensitive keywords (they have
an observable effect on the program, therefore cannot be attributes
according to the guidelines).
The behaviour of those keywords in the standard would be defined by
exceptions to the ODR.
2. [Library] Add to the standard library the needed ingredients for this
construct.
3. [SG15] Some description of the code transformation of the tool (I'm not
sure what can actually be proposed there).
The motivation for putting the library in the standard as well is that, in
the (optimistic, unrealistic) future where access to globals everywhere
goes through `lib::globals()`, this "lib" should very definitely be "std",
especially as this library would be a very complex template machine which
might need compiler hooks for an optimized implementation.
Note I have not written any details for the proposal as I want to discuss
merit beforehand.
Received on 2019-06-05 16:02:45