C++ Logo


Advanced search

Re: [wg14/wg21 liaison] Semantics of angelic nondeterminism

From: Davis Herring <herring_at_[hidden]>
Date: Wed, 17 Apr 2024 15:18:16 -0600
> Yes, but we can describe the properties of the arrays as
> preconditions in terms of the state of the abstract system
> on function entry.

"on function entry" is simplistic in the presence of multiple threads,
but I don't think there's really any disagreement here. Obviously local
reasoning is what's most practical, and we want to enable and empower it
where we can, but I don't see what I've written as standing in
opposition to that direction. In particular, the approach in P2434R1
(which I'm still writing but is basically "P2434R0 + notes about acausal
pointer values") is meant to describe (however indirectly) what
reasoning _can_ be applied locally given code like

   int* use_zombie(int *z) {
     int *const p=(int*)malloc(sizeof *p);
     if((uintptr_t)p==(uintptr_t)z) *z=1;
     else *p=0;
     return p;

That code is of course silly, but it's meant to illustrate in a simple
context the sort of interactions across threads in the lock-free
linked-list algorithms that have so often been discussed lately. I
claim that we want it to be well-defined (although there's no guarantee
which branch is taken) in isolation because z might be the old pointer
stored in an atomic, but that

   int use_zombie(int *z) {
     int *const p=(int*)malloc(sizeof *p);
     const int ret=*p;
     return ret;

can be reduced to

   int use_zombie(int *z) {
     return 0;

because z must point to something that exists independently of the
malloc (at an unknown address) and so must not alias p.

> I does not have to, but it seems obvious to me that it is
> much more useful when it is.

Certainly a specification that maps more directly onto actual
implementation operations is easier to deal with, but I am not at all
sure that there is such a "direct specification" available here.
Already the Provenance TS has to talk about pointers with a
superposition of provenances, and that seems consistent with the
effective type approach, but I suspect that attempting to formally
define those notions in the context of multiple threads will require
specification gymnastics similar in amount (if not kind) to those in
P2434 since there might not be any happens-before edges between the uses
of such pointers on multiple threads. Extending it thereafter to
support the pointer-zap case will not improve things.

> Indeed, C does not have this. We want to be able to
> reason about executions where flip() returns 1 and we also
> want to be able to reason about executions where flip() returns
> 2 until the point the operation that has UB is encountered.

I think this point is best interpreted as being the motivation for your
WG14:N3128, and indeed you wrote to me a year ago (sorry I never got
back to it until now!) suggesting that my std::observable (from P1494)
might be more profitably considered implicit in every library I/O
function (recalling that that set includes fflush rather than printf).
I plan to present a revision of P1494 that includes that idea, as it
indeed seems that the value of the paper might be realized without the
syntax. That might make the distinction between the different models of
undefined behavior a question of phrasing rather than substance, which
is an improvement.

> I do not understand why WG21 goes down the path of essentially
> producing a specification that gives "no semantics" to almost
> any real world program, while seemingly providing (almost)
> no advantages in terms of optimization because compilers
> can not reason globally about all paths of executions anyway.

Is it meaningfully different to give semantics to such programs but only
up to the first undefined operation?

> In particular, it makes the specification useful even in cases where
> some component is involved that does not have a full formal
> specification and where it is not clear what "game" this component
> would play with some angels and demons in the language spec, i.e. in
> 99% of all real world applications of C.
I don't think it's actually any more clear in the absence of "quantifier
semantics" what happens when you call a function whose specification is
imprecise: it might call abort, or open an arbitrary file, or anything.
That's not an impediment to practical work getting done, of course, and
that means it's also not an impediment to precise specification of what
we can specify.


This product is sold by volume, not by mass.  If it appears too dense or 
too sparse, it is because mass-energy conversion has occurred during 

Received on 2024-04-17 21:18:41