Date: Sun, 27 Oct 2024 11:15:18 -0700
On Sunday 27 October 2024 11:02:37 Pacific Daylight Time Peter C++ via
Std-Proposals wrote:
> I wonder if the optimization of something happening once is actually
> measurable.
The point is that it doesn't happen only once. The guard check happens on
every access, with the necessary emission of the initialisation code too.
However, the guard variable is usually very near the data being accessed
anyway, so the penalty for a cache miss is almost always amortised. Likewise,
the Branch Predictor Unit usually makes a good call on this branching that
almost never changes, and speculation can run ahead to load the contents
anyway. However, there are targets with very limited BPUs and speculative
execution that would benefit from this.
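As a rough illustration of the per-access cost, here is a hand-written sketch of what a guarded function-local static amounts to. The names Widget, sketch::guard, sketch::initMutex and instanceExpanded are made up for this example. Real compilers use ABI-specific runtime helpers (on Itanium C++ ABI targets, __cxa_guard_acquire and __cxa_guard_release) rather than a std::mutex, and also register the destructor, which is left out here.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <new>

struct Widget {
    int value;
    Widget() : value(42) {}   // non-trivial constructor, so initialisation is dynamic
};

// What the programmer writes: a thread-safe function-local static (C++11 and later).
Widget &instance()
{
    static Widget w;
    return w;
}

// Hypothetical hand expansion of the same thing. Illustration only.
namespace sketch {
alignas(Widget) unsigned char storage[sizeof(Widget)];
Widget *object = nullptr;          // set once, before the guard is released
std::atomic<bool> guard{false};    // the guard variable, sitting next to the data
std::mutex initMutex;
}

Widget &instanceExpanded()
{
    // Fast path: this load and branch run on every call, not just the first.
    if (!sketch::guard.load(std::memory_order_acquire)) {
        // Slow path: the initialisation code still has to be emitted here.
        std::lock_guard<std::mutex> lock(sketch::initMutex);
        if (!sketch::guard.load(std::memory_order_relaxed)) {
            sketch::object = new (sketch::storage) Widget;
            sketch::guard.store(true, std::memory_order_release);
        }
    }
    assert(sketch::object != nullptr);
    return *sketch::object;
}
```

After the first call the branch is always taken the same way, which is why a decent branch predictor hides most of the cost.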
The first public Q_GLOBAL_STATIC design from Qt 4.x was there because C++98
didn't have thread-safe statics, so we had our own thread-safe tracking for
the state. This led to Q_G_S getting a feature to detect pre-initialisation
and post-destruction states, which are useful in a lot of places. When C++11
came with thread-safe statics, we thought to disable the feature to save on
some cycles, exactly like Frederick is proposing (we ran into compiler bugs).
Nowadays, we incorporate the thread-safe statics into our design so the
features are still available without double-locking.
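A minimal sketch of that idea, and not the actual Qt implementation: the class GlobalStatic, the State enum and the member names below are hypothetical. The compiler's thread-safe static does the locking, and a separate atomic state is only there so callers can ask whether the object exists yet or is already gone.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical state values, chosen for this sketch only.
enum class State : signed char { Destroyed = -2, Initialized = -1, Uninitialized = 0 };

template <typename T>
class GlobalStatic
{
    struct Holder {
        T value;
        // Runs after 'value' is constructed, under the compiler's own guard.
        Holder() { state.store(State::Initialized, std::memory_order_release); }
        // Runs at static destruction, before 'value' itself is destroyed.
        ~Holder() { state.store(State::Destroyed, std::memory_order_release); }
    };

    static std::atomic<State> state;   // constant-initialised, needs no guard

public:
    // Returns nullptr once the object has been destroyed instead of touching
    // dead storage. No second lock: the function-local static provides it.
    static T *get()
    {
        if (state.load(std::memory_order_acquire) == State::Destroyed)
            return nullptr;
        static Holder holder;
        return &holder.value;
    }

    static bool exists()
    { return state.load(std::memory_order_acquire) == State::Initialized; }

    static bool isDestroyed()
    { return state.load(std::memory_order_acquire) == State::Destroyed; }
};

template <typename T>
std::atomic<State> GlobalStatic<T>::state{State::Uninitialized};

// Example payload type, also hypothetical.
struct Config {
    int level = 3;
};
```

Calling GlobalStatic<Config>::exists() before the first get() reports false without triggering initialisation, which is the pre-initialisation check; isDestroyed() covers the post-destruction case.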
Some compilers are also bad at tracking that initialisation has happened, so
will emit the check more than once for the same function, in conditions where
one of them implies the other. That's definitely a QoI issue, though.
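For illustration, this is the kind of code where that can show up. Registry, registry(), touchTwice() and touchTwiceCached() are hypothetical names, and whether the redundant check is actually emitted depends on the compiler and optimisation level.

```cpp
#include <cassert>

struct Registry {
    int hits;
    Registry() : hits(0) {}   // user-provided constructor: dynamic initialisation
};

inline Registry &registry()
{
    static Registry r;        // guarded initialisation
    return r;
}

// After inlining, each registry() call may carry its own guard check, even
// though the first call having returned implies the later checks must pass.
int touchTwice()
{
    registry().hits += 1;
    registry().hits += 1;
    return registry().hits;
}

// Binding the reference once leaves a single guard check in this function,
// regardless of how well the compiler tracks the initialisation state.
int touchTwiceCached()
{
    Registry &r = registry();
    r.hits += 1;
    r.hits += 1;
    return r.hits;
}
```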
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Principal Engineer - Intel DCAI Platform & System Engineering
Received on 2024-10-27 18:15:20