ISOCPP liaison List: Re: [wg14/wg21 liaison] Semantics of angelic nondeterminism

From: Jens Gustedt <jens.gustedt_at_[hidden]>
Date: Sat, 13 Apr 2024 14:15:06 +0200

Am 13. April 2024 11:22:55 MESZ schrieb Jens Maurer <jens.maurer_at_[hidden]>:
>
>
> On 13/04/2024 10.38, Jens Gustedt via Liaison wrote:
> > Am 13. April 2024 10:03:29 MESZ schrieb Martin Uecker via Liaison <liaison_at_[hidden]>:
> >
> >> I do not see why you cannot avoid the conclusion. We can simply
> >> say "once an operation which has UB is encountered, the standard
> >> allows arbitrary behavior." This eliminates a lot of problems
> >> and is far simpler to understand and to reason about. In
> >> particular, it makes the specification useful even in cases
> >> where some component is involved that does not have a full
> >> formal specification and where it is not clear what "game" this
> >> component would play with some angels and demons in the
> >> language spec, i.e. in 99% of all real world applications of C.
> >
> > I agree with most of what you are saying, and actually the other Jens' argument how function calls obfuscate behavior shows to me how useless all of this Platonic idealism is. It just doesn't help to explain or predict program behavior in real life.
>
> On the other hand, recent improvements in optimizer technologies
> (such as link-time optimization) do introduce a whole-program
> view, with very real consequences. The whole point of the
> specification is to be abstract enough so that all these
> improvements in optimizer technology are actually possible.

I don't think that I understand your position, here. Upthread you draw the card that compilers don't know what is happening within function calls, and now you are claiming the opposite?

In any case, I don't see were platonic assumptions about malloc help. If we don't know what is happening inside calls, you, well, just don't know, and assuming that it is one or the other does not change anything. In the case of full knowledge as you describe here, just use that knowledge, and assume a type (and a change thereof) where you observe this. Assumption of an angelic position is only useful when you are an angel, anyhow.

More specifically, I don't believe in whole program optimisation for the case of a program that is composed of TU that are written in different languages, and where one language makes false claims about the memory or type model that the other language has.

> For example, declaring arbitrary non-inline function calls
> as barriers for undefined behavior isn't a good approach once
> "inline" becomes meaningless because the compiler has a whole-
> program view due to LTO.
>
> I think we'll have a lot more long-term maintenance effort
> with a more "real-world" specification rather than having a
> clear abstract model.

I don't think so. In C we have made good progress in pushing things to specification and function boundaries.

> For example, the multi-threading aspects of the memory model
> feel rather abstract to me, but have aged quite well so far.

I never made such a claim. In the contrary, I insisted that we clearly talk about "happens before" and erroneous operations. Resource exhaustion (like stack overflow) and unfortunate events (like race conditions) are obviously much more difficult to model.

> > But I think the above should be formulated a bit more carefully, something like
> >
> > "All visible side effects that happened before an erroneous operation that has UB remain visible to the execution environment. Effects that are perceived after such an operation are not restricted and may partially or completely obfuscate the effects that happened prior to the operation."
> >
> > By such a statement we'd guarantee for example that inside a debugger the rules of the language apply, even if the execution hits an erroneous operation at some point.
>
> I don't think we can meaningfully specify what happens in a debugger.
> We can rely on implementations not being intentionally user-hostile,
> including for interactions with the environment, though.

Probably not debugging in full generality, e.g if there are multiple threads or if the state of the program is changed by the debugger.

But people can assume (and do) that tracing of a sequential execution shows the observable side effects in exactly the order in which they appear, and that no erroneous operation will ever make them unsee what they have observed. I think that this is completely reasonable and that any language standard should give such a guarantee.

Jens

-- 
Jens Gustedt - INRIA & ICube, Strasbourg, France

Received on 2024-04-13 12:15:13