ISOCPP Std Discussion List: Re: Is forward progress guarantee still useful?

From: David Brown <david.brown_at_[hidden]>
Date: Wed, 17 Sep 2025 10:17:35 +0200

On 17/09/2025 04:11, Yongwei Wu via Std-Discussion wrote:
> The most surprising UB-based wrong optimization to me, ever, was that
> the following program would crash, instead of looping infinitely, when
> compiled by certain versions of Clang:
>
> int main()
> {
>     for (;;) {
>     }
> }
>
> Of course, I understand the reason now (and am aware that it is now
> "fixed"), and I am really trying hard to justify this optimization when
> teaching people about undefined behaviour. Unfortunately, I have not
> found a case where a compiler can do loop fusion, which I suppose was a
> reason for the forward progress guarantee. (On the contrary, I have
> found that loop fission is more common, supposedly beneficial due to
> cache locality and vectorization.)
>
> I am wondering what are the real-world benefits of the forward progress
> guarantee today.... They probably existed, but are they still there? (If
> not, should we ...?)
>

I think this all comes down to the concept of "observable behaviour" and
the "as-if" rule.

C and C++ are defined in terms of an "abstract machine", and the actual
visible effects produced by the program. (Both the C and C++ standards
term these "observable behaviour"; in C++ the definition is in
[intro.abstract].) If you write a program that
uses a prime sieve to calculate the first 40 primes and print them out,
the compiler can generate any object code it likes as long as the end
result is the same - the same list of numbers is printed out. It can
pre-calculate the results, or use repeated trial division, or use
Euler's "n^2 + n + 41" generator function. All that matters for
correctness is that the observable behaviour from the generated object
code is consistent with what you could get "as if" the hypothetical
abstract machine executed the source code directly.
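
As a rough illustration of the "as-if" rule, here is a sketch of two ways
of producing the first N primes. The function names and the sieve bound of
200 are my own choices for the example, not anything from the standards. A
compiler would be free to turn one into the other, or into a constant
table, because the list that gets printed is the same.

```cpp
#include <cstddef>
#include <vector>

// Sieve of Eratosthenes over a fixed range. The bound of 200 is an
// assumption for this sketch; it covers any count up to 46.
std::vector<int> primes_by_sieve(std::size_t count)
{
    constexpr int limit = 200;
    std::vector<bool> composite(limit + 1, false);
    std::vector<int> primes;
    for (int n = 2; n <= limit && primes.size() < count; ++n) {
        if (composite[n]) continue;
        primes.push_back(n);
        for (int m = n * n; m <= limit; m += n) composite[m] = true;
    }
    return primes;
}

// Repeated trial division against the primes found so far.
std::vector<int> primes_by_trial_division(std::size_t count)
{
    std::vector<int> primes;
    for (int n = 2; primes.size() < count; ++n) {
        bool is_prime = true;
        for (int p : primes) {
            if (p * p > n) break;                       // no divisor found
            if (n % p == 0) { is_prime = false; break; }
        }
        if (is_prime) primes.push_back(n);
    }
    return primes;
}
```

Calling either function with 40 and printing the result gives the same
output, so as far as the language is concerned the two are interchangeable.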

(Of course real-world compilers have other considerations as well as
correctness, such as runtime efficiency, debuggable code, compile speed,
etc.)

Observable behaviour in C and C++ includes volatile accesses, I/O
functions, start and stop of the program, and atomic or synchronisation
operations. Time is not observable behaviour.

Thus as far as the language is concerned, for the sequence "do A, wait
time T, do B", there is no distinction based on how long "T" is. "T"
can be 0, an hour, or infinite time - it is all just "do A, do B" to the
language. If the compiler can determine that part of the code is doing
nothing useful, just wasting time (perhaps it is calculating a result
that is never used), it can simply remove that code without bothering to
figure out if it is an infinite loop or not. It's just normal dead-code
elimination.
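
A minimal sketch of the "do A, wait time T, do B" case, with names I have
made up for the purpose. The middle step is a Collatz iteration: nobody
knows whether it terminates for every input, so the compiler cannot work
that out either, and with the forward progress assumption it does not
need to.

```cpp
#include <cstdint>

// Counts Collatz steps from n down to 1. No I/O, no volatile or atomic
// accesses. For n == 0 the loop never ends on the abstract machine.
std::uint64_t collatz_steps(std::uint64_t n)
{
    std::uint64_t steps = 0;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}

// "do A, wait time T, do B". The result of the middle call is discarded,
// so an optimiser may treat the call as dead code and drop it.
int a_then_b(std::uint64_t seed)
{
    int log = 1;                    // A
    (void)collatz_steps(seed);      // T: computed, never used
    log = log * 10 + 2;             // B
    return log;
}
```

Whether a given compiler actually removes the call depends on the
optimisation level and on whether it can see the body of collatz_steps.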

So really, this is not about "why do the standards allow removal of
infinite loops?", or "what do compilers gain by such removals?". It is
about how the standards have had to be changed to say that such removals
were /not/ allowed - trying to make infinite loops part of the
observable behaviour would be very difficult because it is, in general,
impossible for the compiler to determine if a loop is infinite. (And
making "time" observable would be impossible without massive changes to
the language and implementations.)

It is certainly the case that infinite do-nothing loops turn up in code
for various reasons. So the C standards have, since C11, had a specific
clause in the description of iteration statements that excludes
do-nothing loops from being removed if they have a constant controlling
expression (like "while (true) { ; }"). The C++ standards introduced a
similar idea in C++26 - though in true C++ fashion, it comes with a new
term "trivial infinite loop" and a good deal more detail than in the
C standards.
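
To make the distinction concrete, here is a sketch of three do-nothing
loops as I understand the rules. The function names are hypothetical, and
the comments are my reading of the standards rather than quotations.

```cpp
#include <atomic>

// (1) Empty body, constant true condition. C (since C11) and C++26
// require this loop to be kept. Defined here but not meant to be called
// in a test, since it never returns.
[[noreturn]] void halt_forever()
{
    while (true) { }
}

// (2) Non-constant condition, nothing observable in the loop. The
// compiler may assume the loop terminates, so it may read `done` once
// or drop the loop altogether.
void wait_plain(const bool& done)
{
    while (!done) { }
}

// (3) Each iteration performs an atomic load, which counts as making
// progress, so the loop has to stay.
void wait_atomic(const std::atomic<bool>& ready)
{
    while (!ready.load(std::memory_order_acquire)) { }
}
```

If another thread is supposed to set the flag, only (3) is a correct way
to wait for it; (2) would be a data race as well.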

As to what benefits it gives compilers, it means they can eliminate
loops that don't do anything useful, without having to figure out if
they stop or not (unless they are clearly and simply infinite do-nothing
loops in C or C++26). As Nate pointed out, not all C and C++ code is
hand-written. A great deal of code is generated in some way. Some comes
from external source-code generators (more common for C than C++), such
as transpilers (turning another language into C to use existing C
compilers) and simulation generators. Some comes from generation inside
the language itself, via macros and of course templates. It is also
common that while the code you write does not have any obvious dead code
(why would you write code that does nothing?), by the time the compiler
has finished inlining, function cloning, inter-procedural optimisations,
constant propagation, link-time optimisation, and so on, large clumps of
code can be simplified or eliminated. There are lots of optimisations
where it is impossible to write short, realistic hand-written code
sequences that trigger the optimisation, yet they turn up in real programs.
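
Here is a small sketch of how such dead loops can appear without anyone
writing them on purpose. The policy types and function name are invented
for the example. With the do-nothing policy, inlining leaves a list
traversal with an empty body. The list could be circular for all the
compiler knows, so it cannot prove the loop ends, but it is allowed to
assume so and delete it.

```cpp
#include <cstddef>

struct Node {
    int value;
    const Node* next;
};

// Hypothetical audit policies, for illustration only.
struct NoAudit {
    static void record(int) {}                 // does nothing
};
struct CountingAudit {
    static std::size_t& seen()
    {
        static std::size_t count = 0;          // nodes recorded so far
        return count;
    }
    static void record(int) { ++seen(); }
};

template <typename Audit>
int first_value_after_audit(const Node* head)
{
    // With NoAudit this loop has an empty body after inlining.
    for (const Node* p = head; p != nullptr; p = p->next) {
        Audit::record(p->value);
    }
    return head != nullptr ? head->value : 0;
}
```

The source of first_value_after_audit looks perfectly sensible; it is only
the NoAudit instantiation that turns the loop into dead code.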

Received on 2025-09-17 08:17:39
