
Re: Is forward progress guarantee still useful?

From: Yongwei Wu <wuyongwei_at_[hidden]>
Date: Thu, 18 Sep 2025 14:41:38 +0800
On Wed, 17 Sept 2025 at 16:17, David Brown via Std-Discussion
<std-discussion_at_[hidden]> wrote:
>
>
> On 17/09/2025 04:11, Yongwei Wu via Std-Discussion wrote:
> > The most surprising UB-based wrong optimization to me, ever, was that
> > the following program would crash, instead of looping infinitely, when
> > compiled by certain versions of Clang:
> >
> > int main()
> > {
> >     for (;;) {
> >     }
> > }
> >
> > Of course, I understand the reason now (and am aware that it is now
> > "fixed"), and I am really trying hard to justify this optimization when
> > teaching people about undefined behaviour. Unfortunately, I have not
> > found a case where a compiler can do loop fusion, which I suppose was a
> > reason for the forward progress guarantee. (On the contrary, I have
> > found that loop fission is more common, supposedly beneficial due to
> > cache locality and vectorization.)
> >
> > I am wondering what the real-world benefits of the forward progress
> > guarantee are today.... They probably existed, but are they still there? (If
> > not, should we ...?)
> >
>
> I think this all comes down to the concept of "observable behaviour" and
> the "as-if" rule.
>
> C and C++ are defined in terms of an "abstract machine", and the actual
> visible effects produced by the program. (In the C standards, these are
> termed "observable behaviour". The C++ standard does not use the same
> term, AFAICS, but has the same concept.) If you write a program that
> uses a prime sieve to calculate the first 40 primes and print them out,
> the compiler can generate any object code it likes as long as the end
> result is the same - the same list of numbers is printed out. It can
> pre-calculate the results, or use repeated trial division, or use
> Euler's "n^2 + n + 41" prime-generating polynomial. All that matters for
> correctness is that the observable behaviour from the generated object
> code is consistent with what you could get "as if" the hypothetical
> abstract machine executed the source code directly.
>
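
(To make that concrete for myself, a minimal sketch of such a program;
the sieve bound of 180 is my own choice, since the 40th prime is 173.
Under the as-if rule, an optimiser could legally replace the whole
sieve with a precomputed table, as long as the same 40 numbers come
out.)

#include <cstdio>

int main()
{
    // Sieve of Eratosthenes; 180 exceeds the 40th prime (173).
    bool composite[180] = {};
    int found = 0;
    for (int i = 2; i < 180 && found < 40; ++i) {
        if (composite[i])
            continue;
        std::printf("%d\n", i); // the only observable behaviour
        ++found;
        for (int j = i * i; j < 180; j += i)
            composite[j] = true; // mark multiples as composite
    }
}
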
> (Of course real-world compilers have other considerations as well as
> correctness, such as runtime efficiency, debuggable code, compile speed,
> etc.)
>
> Observable behaviour in C and C++ includes volatile accesses, I/O
> functions, start and stop of the program, and atomic or synchronisation
> operations. Time is not observable behaviour.
>
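
(As an aside, this is also why the usual volatile workaround works, as
I understand it: a volatile access is observable behaviour, so a loop
containing one can no longer be treated as doing nothing. A sketch,
with a variable name of my own choosing:)

volatile int sink;

void spin_observably()
{
    for (;;) {
        sink = 0; // volatile store each iteration: observable
                  // behaviour, so this loop must be kept
    }
}
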
> Thus as far as the language is concerned, for the sequence "do A, wait
> time T, do B", there is no distinction based on how long "T" is. "T"
> can be 0, an hour, or infinite time - it is all just "do A, do B" to the
> language. If the compiler can determine that part of the code is doing
> nothing useful, just wasting time (perhaps it is calculating a result
> that is never used), it can simply remove that code without bothering to
> figure out if it is an infinite loop or not. It's just normal dead-code
> elimination.
>
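
(A sketch of the kind of removal I take you to mean; the Collatz
function here is my own illustration, not from any real codebase. The
compiler cannot prove the loop terminates, but the forward progress
guarantee lets it assume so, and since the result is unused the whole
call can be dropped as ordinary dead code.)

unsigned collatz_steps(unsigned n)
{
    unsigned steps = 0;
    while (n > 1) { // termination is famously hard to prove (Collatz)
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}

int f(unsigned n)
{
    collatz_steps(n); // result unused: the call may be eliminated
    return 42;        // so f may legally compile to just "return 42"
}
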
> So really, this is not about "why do the standards allow removal of
> infinite loops?", or "what do compilers gain by such removals?". It is
> about how the standards have had to be changed to say that such removals
> were /not/ allowed - trying to make infinite loops part of the
> observable behaviour would be very difficult because it is, in general,
> impossible for the compiler to determine if a loop is infinite. (And
> making "time" observable would be impossible without massive changes to
> the language and implementations.)
>
> It is certainly the case that infinite do-nothing loops turn up in code
> for various reasons. So the C standards have, since C11, had a specific
> clause in the description of iteration statements that excludes
> do-nothing loops from being removed if they have a constant controlling
> expression (like "while (true) { ; }"). The C++ standards introduced a
> similar idea in C++26 - though in true C++ fashion, it comes with a new
> term "trivially infinite loops" and a good deal more details than in the
> C standards.
>
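
(My reading of that distinction, sketched below; I may be off on the
exact C++26 wording. keep_going() stands for any opaque condition.)

bool keep_going(); // some condition the compiler cannot see through

void spin_forever()
{
    while (true) {  // constant controlling expression: exempt from the
    }               // termination assumption in C, and a "trivially
}                   // infinite loop" in C++26

void spin_until_done()
{
    while (keep_going()) { // not trivially infinite: the implementation
    }                      // may still assume this loop terminates
}
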
> As to what benefits it gives compilers, it means they can eliminate
> loops that don't do anything useful, without having to figure out if
> they stop or not (unless they are clearly and simply infinite do-nothing
> loops in C or C++26). As Nate pointed out, not all C and C++ code is
> hand-written. A great deal of code is generated in some way - perhaps
> by external source-code generation (more common for C than for C++),
> such as transpilers (turning another language into C to reuse existing
> C compilers) or simulation generators, or by internal source-code
> generation from macros and of course templates. It is also
> common that while the code you write does not have any obvious dead code
> (why would you write code that does nothing?), by the time the compiler
> has finished inlining, function cloning, inter-procedural optimisations,
> constant propagation, link-time optimisation, and so on, large clumps of
> code can be simplified or eliminated. There are lots of optimisations
> where it is impossible to write short, realistic hand-written code
> sequences that trigger the optimisation, yet they turn up in real programs.

I might accept that do-nothing loops can be eliminated, if we cannot
find a compromise between the theoretical model and the naïve
programmer's view. However, that is still different from the pre-C++26
UB of empty loops. In that specific case, not only was the empty loop
eliminated, but the program would crash as well, because the compiler
was in the Schrödingerian state of believing two incompatible things at
once: that the loop would finish, and that the code after the loop was
unreachable. As the code was "undefined", one could not even blame the
compiler.
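
(For the record, the kind of program that demonstrated the crash looked
roughly like this; the second function and its message are my own
reconstruction. With the affected Clang versions, main compiled to no
instructions at all, not even a return, so at run time execution simply
fell through into whatever code happened to come next.)

#include <cstdio>

int main()
{
    while (true) { // no side effects, never terminates: UB before C++26
    }
}

void never_called() // main emitted no "ret", so control fell in here,
{                   // printed the message, and then crashed on return
    std::puts("surprise!");
}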

So the forward progress guarantee permits more dangerous things than
what you described, as far as I understand it.

-- 
Yongwei Wu
URL: http://wyw.dcweb.cn/

Received on 2025-09-18 06:41:54