std-discussion

Re: Is forward progress guarantee still useful?

From: Yongwei Wu <wuyongwei_at_[hidden]>
Date: Thu, 18 Sep 2025 14:28:45 +0800
On Wed, 17 Sept 2025 at 22:59, Nate Eldredge <nate_at_[hidden]>
wrote:

>
>
> On Sep 17, 2025, at 00:21, Yongwei Wu <wuyongwei_at_[hidden]> wrote:
>
> I can hardly imagine it is "common". Who would have written such code,
> especially when it never terminates? I would argue that an infinite loop is
> better, in that it would alert the programmer that something is broken. To
> me (and I believe to most C++ programmers), this is a surprising
> optimization. I really have difficulty imagining it is truly useful.
>
>
> A very common reason for *humans* to write code to compute a value that is
> then unused: when the code that would use the value has been turned off for
> this build.
>

I'll reply to this post to describe my points.

First, programmers in general do not want to *rely on* compiler
optimizations to get rid of unused code. Do not forget it is also difficult
for compilers to prove something is unused, especially when the compiler
does not see all code at once.

Another issue is unexpected results. Programmers quite often want to
measure how long a calculation takes. However, when no volatile variables,
locks/atomics, or I/O operations are involved, the calculation *sometimes*
gets eliminated (and sometimes not). This looks to me like a burden when
teaching the language.
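
To make the measurement problem concrete, here is a minimal sketch. The names `sum_of_squares`, `time_discarding_result`, `time_keeping_result` and `g_sink` are hypothetical and chosen only for illustration. Whether the first timing function really measures nothing depends on the compiler and the optimization level, so treat this as an illustration of the risk, not as a guaranteed outcome.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical workload: pure arithmetic, no I/O, no volatile, no atomics.
std::uint64_t sum_of_squares(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        total += i * i;
    }
    return total;
}

// The result is discarded, so an optimizing compiler is allowed to remove
// the whole computation; the measured time may then be close to zero.
std::chrono::nanoseconds time_discarding_result(std::uint64_t n) {
    const auto start = std::chrono::steady_clock::now();
    (void)sum_of_squares(n);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}

// A volatile write is observable behaviour, so the value must be produced.
// (The compiler may still simplify how the value is computed.)
volatile std::uint64_t g_sink = 0;

std::chrono::nanoseconds time_keeping_result(std::uint64_t n) {
    const auto start = std::chrono::steady_clock::now();
    g_sink = sum_of_squares(n);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

Comparing the two timings at `-O0` and `-O2` is one way to see the "sometimes eliminated, sometimes not" effect on a given compiler.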

> Here's an example that I think is quite plausible, though not taken from
> actual code. Think of this as low-level embedded code for a freestanding
> implementation (no standard library).
>
> constexpr bool is_debug_build = false; // could be true in other builds
>
> void output_to_serial_port(const char *buf, unsigned len);
>
> void debug_output_buffer(const char *buf, unsigned len) {
>     if (is_debug_build) {
>         output_to_serial_port(buf, len);
>     }
> }
>
> unsigned my_strlen(const char *s) {
>     unsigned i;
>     for (i = 0; s[i]; i++) { }
>     return i;
> }
>
> void debug_output_string(const char *s) {
>     debug_output_buffer(s, my_strlen(s));
> }
>
> Suppose that, in the entire program, there are no strings of length
> `std::numeric_limits<unsigned>::max()` (henceforth UINT_MAX for brevity) or
> greater. The programmer is well aware of this, and is willing to promise
> it under penalty of UB. Also suppose that all functions shown here can be
> inlined into each other, but that `debug_output_string` is not inlined into
> its callers, so the compiler knows nothing statically about `s`.
>
> Under [intro.progress p1] as it stands (either C++23 or C++26), the
> defined observable semantics of `debug_output_string()` are "do nothing".
> This is obviously what the programmer intended, and they would like it to
> be done as quickly as possible. And indeed, compilers will compile it to a
> flat return (https://godbolt.org/z/5sjTYd6dx), optimizing out the loop
> from `my_strlen`. Great. (Of course, the programmer does not want to
> manually rewrite `debug_output_string()` as { }, because in a debug build,
> it should actually do something.)
>
> But without [intro.progress p1], the defined observable semantics of
> `debug_output_string()` are not simply "do nothing". Rather, they are "do
> nothing, unless s is a string of length UINT_MAX or greater, in which case
> loop forever and do not proceed with the rest of the program". (Recall
> that unsigned integer overflow is not UB and is defined to simply wrap
> around.) If the compiler must provide those semantics, then it must
> actually iterate over the string, just in case it should happen to have
> length UINT_MAX. Again, you can see this in practice by using
> `-fno-finite-loops`: https://godbolt.org/z/GqqePv8Th.
>
> The programmer already knows that the latter case will never happen, but
> AFAIK there is no simple way for them to communicate this to the compiler.
> So without [intro.progress p1], the compiler is required to waste a lot of
> runtime pointlessly looping over strings.
>
> (I know the programmer could avoid this in other ways, e.g. by putting `if
> (is_debug_build)` around the body of `debug_output_string()`. But we could
> suppose that `debug_output_buffer()` is called from several different
> functions: `debug_output_int()`, `debug_output_struct_foo()`, etc. By the
> principle of DRY, the programmer should prefer to put the test of
> `is_debug_build` in just one place.)
>

First, it is a valid example, and since you have already supplied the
counterarguments yourself, I need not repeat them. I just want to
emphasize that C++ programmers already have many things to be careful
about. For example, we should try to declare string/vector variables
outside a loop, because the existence of global allocation/deallocation
functions makes it generally impossible for the compiler to optimize away
the allocations and deallocations. (And I truly want deterministic behaviour
and do not like magical optimizations here.)
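
As a sketch of the string/vector advice above: the two functions below compute the same result, but the second declares the buffer once, outside the loop. The function names and the "hello, " prefix are hypothetical, made up for illustration. How much the hoisting saves depends on the standard library (short strings may not allocate at all), and `clear()` keeping the capacity is common practice rather than something relied on here as a guarantee.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Buffer declared inside the loop: a string is constructed and destroyed
// on every iteration, which may allocate and deallocate each time.
std::size_t total_length_fresh(const std::vector<std::string>& names) {
    std::size_t total = 0;
    for (const auto& name : names) {
        std::string line;
        line += "hello, ";
        line += name;
        total += line.size();
    }
    return total;
}

// Buffer declared outside the loop: the programmer removes the repeated
// construction explicitly instead of hoping the optimizer will do it.
std::size_t total_length_reused(const std::vector<std::string>& names) {
    std::size_t total = 0;
    std::string line;
    for (const auto& name : names) {
        line.clear();  // typically keeps the existing capacity
        line += "hello, ";
        line += name;
        total += line.size();
    }
    return total;
}
```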

. . . . . . . . .
>
> Having said all this, let's step back. It seems to me that, among all
> loops with no side effects, there are two types: (A) intended by the
> programmer to always terminate; (B) intended by the programmer to possibly
> loop forever and prevent further execution.
>
> The approach of C++23 was to assume all loops are of type A. If the
> programmer desires a loop of type B, they must manually add a side effect.
>
> C++26 adds the exception for "trivially empty loops" (
> https://eel.is/c++draft/stmt.iter.general#3) which are assumed to be of
> type B. This does not cover your Fermat program, though I don't think it's
> a good example: as long as memory is finite, the search *must* terminate
> somehow after a finite number of iterations, so it can't really be an
> infinite loop in any case. But I'm willing to concede that there exist
> other type B loops that are not "trivially empty".
>
> One thought would be to have an attribute or something, to allow the
> programmer to specify what type of loop is desired. And I can see an
> argument that B ought to be the default, so that a program always "does
> what it says". The programmer could then "opt in" to A with
> [[finite_loop]] or something like that, in cases where they can prove
> statically that the loop always terminates.
>
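
To illustrate the type A / type B distinction in the quoted text, here is a minimal sketch under the current rules. The function names are hypothetical. The second loop performs an atomic operation on every iteration, which is one of the things [intro.progress] lists, so the implementation cannot assume that it terminates.

```cpp
#include <atomic>

// Type A: no side effects in the loop. The implementation may assume it
// terminates, and may drop it entirely if the result is unused.
unsigned count_chars(const char* s) {
    unsigned i = 0;
    while (s[i]) {
        ++i;
    }
    return i;
}

// Type B: intended to possibly spin forever. Each iteration performs an
// atomic load, so the loop cannot be assumed to terminate or be removed.
void wait_until_set(const std::atomic<bool>& flag) {
    while (!flag.load(std::memory_order_acquire)) {
        // deliberately empty: keep polling the flag
    }
}
```

With an opt-in scheme such as the suggested [[finite_loop]] (not an existing attribute), the annotation burden would move from the second kind of loop to the first.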

For the sake of programmer friendliness (by the way, UB is *extremely
programmer-unfriendly*), I would like the forward progress assumption to be
opt-in, rather than the default as it is now. Do not surprise your users
(programmers here). Do not expect every programmer to read the C++ standard
(just as most people avoid reading user manuals wherever possible).

I am preparing a lecture on UB, and it is really painful to find how many
surprises there are in the C++ language. I concur that UB is unavoidable
at present, but I would argue it is better to have less UB where possible,
especially when defining the behaviour does no real harm.

While I agree that optimizations like the one you mentioned do exist, I
still can hardly imagine people will *rely* on them. They are too fragile:
a single memory allocation can make the optimization impossible at run time.

-- 
Yongwei Wu
URL: http://wyw.dcweb.cn/

Received on 2025-09-18 06:28:59