C++ Logo


Advanced search

[wg14/wg21 liaison] Semantics of angelic nondeterminism

From: Davis Herring <herring_at_[hidden]>
Date: Fri, 12 Apr 2024 01:08:21 -0500
When I presented P2434R0 ("Nondeterministic pointer provenance") to SG22 in February, the reasonable objection was raised that it's less than obvious how to define rigorous semantics for a program with a mixture of angelic and demonic nondeterminism. As I said at the time, that is not strictly P2434's problem: two versions of C++ have been published with the same semantics defined for the types of objects implicitly created by malloc and company. However, it is therefore all the more important to verify that the specification technique is sound.

The difficulty is real: indeed, what seems to me to be the obvious candidate, an extensive-form game between the angel and the demon, where the behavior is defined iff the angel has a winning strategy, does not work. Consider, using the existing malloc situation, the following program:


  int incr(int *p) {return ++*p;}
  int id(int i) {return i;}
  int flip() {
    int i=0;
    return id(i)+incr(&i); // unspecified: 1 or 2

  int sink_i(void *p) {
    int *const x=(int*)p;
    x[0]=1; x[1]=-1;
    return x[0]+x[1];
  int sink_f(void *p) {
    float *const x=(float*)p;
    x[0]=1.6; x[1]=1.5;
    return x[0]+x[1];

  int main() {
    void *p=malloc(2*(sizeof(int)+sizeof(float)));
    const int ret=flip()==1 ? sink_i(p) : sink_f(p);
    return ret;

We presumably want this to be well defined: whatever flip returns, the program consistently treats the allocated memory as an int[2] or a float[2]. However, interpreted as a C++ program this would have undefined behavior under the extensive-form game model, because whatever objects the angel picks during the call to malloc, the demon merely has to choose to evaluate the call to id or incr first to select the wrong such consistent usage.

We can't simply say that the demon doesn't get to see the angel's choices (even assuming it possible to word such a restriction), because it might be lucky enough to pick the fatal outcome without such knowledge. It is tempting to require the demon to move first, but then it's not clear how to proceed with evaluation of the program: the demon's next opportunity to move might depend on the semantics of an evaluation that can be fixed only with knowledge of the angel's choices, but the angel cannot move yet because the demon has another chance to guide the program wrong for that choice later. This is not unlike the sort of circularity mentioned in the meeting, where it's not clear how to combine the existential quantifier of the angel with the universal quantifier of the demon.

However, there is a meaningful way to make all the demon's moves known to the angel even though the semantics of the program formally depend on the angel's choices. Instead of having the demon play the game directly, we have it commit to a strategy: a comprehensive plan that produces a choice for every occurrence of unspecified behavior in every possible execution (including those occurrences that take place only in the branches selected by certain outcomes of previous occurrences). Then the angel is given that strategy and attempts to find a choice (here, of objects created in the allocated memory) that gives the program defined behavior. If the angel can do so for every possible strategy, the program does have defined behavior (but it could have the observable behavior associated with any strategy); otherwise, the demon selects a strategy for which no such choice exists and the program has undefined behavior.

Lest the space of possible strategies be considered as nebulous and ill-founded as the existing combination of existential and universal quantifiers might be said to be, let me point out that there is an easy, concrete means of describing that space: each strategy can be encoded as a real number on [0,1), with the interpretation that at each decision in the execution (including the unbounded number of partial orders in which multiple threads can execute) a suitable number of leading bits are consumed from the number and the remainder rescaled to take their place. We know the meaning of any such number only after the angel has made its choices for an execution whose unspecified behavior is determined by that number, but plainly any set of choices the demon can make (given the interference of the angel) corresponds to some real number. (Real numbers can of course store infinite amounts of information each.)

I don't claim at the moment to know a nice way of wording this without spelling out in ugly detail the mechanics of implementing the demon's strategy, but I thought it was worth sharing with this audience evidence that the general direction is viable such that it need not be a source of divergence between the languages. (One relevant example of such potential divergence is the complex interaction that would result from C specifying printf and scanf in terms of explicit provenance attributes of pointer values that are not meaningful when the library is incorporated by reference in C++.)

Skeptical comments welcome,

Received on 2024-04-12 06:08:33