Date: Thu, 11 Jun 2026 04:20:37 +0000
> This is what a conforming implementation can do under the "as-if" rule; however, reordering is not the semantics of the abstract machine.
Well, what does the "abstract machine" mean at this point?
C++ is a practical language meant to generate machine code to run on concrete machines, not theoretical ones.
And as we have discussed, those concrete machines don't really respect "order of execution" unless you put barriers everywhere, and nobody would want that because it would make things painfully slow.
And the fact that the standard provides atomics with memory order semantics and explicit ways to use a barrier is at least an implicit acknowledgement that these sorts of shenanigans exist.
If you are talking about consteval functions, which the compiler must evaluate at compile time, computing the values as if run on a theoretical computer, then yeah, you can technically say that A always happens before B. But then again, consteval doesn't have threads or clocks, so this kind of behavior would never be observable.
It used to be that devices had a single CPU that always did things in the order they were given, and the semantics of the programming languages developed at the time reflected that kind of workflow.
But hardware has evolved and doesn't do that sort of thing anymore.
Programming languages where you write "this happens" then "this happens" are still, for all intents and purposes, the most efficient way of writing programs (at least it makes humans happy to think that they still understand the code they have written).
But we have to deal with the reality that when you are writing multi-threaded software, you are going to see the side effects of the CPU treating the order of your instructions more like a suggestion.
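As a minimal sketch of what those memory orders buy you (the names `message_passing_demo`, `payload`, and `ready` are made up for illustration and are not from this thread): a release store paired with an acquire load is what lets a second thread rely on the order in which the first thread wrote things.

````cpp
#include <atomic>
#include <thread>

// Hypothetical helper: returns the value the consumer thread saw in `payload`.
int message_passing_demo() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag used to publish the data
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // A: ordinary write
        ready.store(true, std::memory_order_release);  // B: publishes A
    });
    std::thread consumer([&] {
        // Spin until the acquire load reads the value written by B.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        // The acquire load synchronizes with B, so A happens-before this read.
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
````

With `memory_order_relaxed` on both sides instead, the read of `payload` would be a data race, which is the "order as a suggestion" effect showing up at the language level.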
________________________________
From: Std-Discussion <std-discussion-bounces_at_[hidden]ocpp.org> on behalf of jim x via Std-Discussion <std-discussion_at_[hidden]>
Sent: Thursday, 11 June 2026 04:41:14
To: Jennifier Burnett <jenni_at_[hidden]>
Cc: jim x <xmh970252187_at_[hidden]>; jim x via Std-Discussion <std-discussion_at_[hidden]>
Subject: Re: [std-discussion] Does the C++ abstract machine recognize a temporal order of execution?
> Almost certainly it won't represent the time at the moment of the function call, nor will it represent the exact moment after the call completes (in both cases the thread may get preempted or interrupted for a non-trivial amount of time whilst in the function.)
`now()` is a function invocation, and according to [intro.execution] p12:
> each evaluation that does not occur within F but is evaluated on the same thread and as part of the same signal handler (if any) is either sequenced before all evaluations that occur within F or sequenced after all evaluations that occur within F;
That is, within a single thread, the control flow must first execute `#0` and then execute the evaluation occurring within `now()` that samples the global time point. If the sampled time point is named `now1`, the control flow must execute `#0` at a time point that is no later than `now1`.
> t1 may well execute the store and then read the time point after, but there's no guarantee that the store need become visible to t2 until an infinitely far point in the future.
This is what a conforming implementation can do under the "as-if" rule; however, reordering is not the semantics of the abstract machine. For instance, the modification order of an atomic object is not observable behavior, and a conforming implementation doesn't need to emulate this structure, but the structure is still meaningful semantics in the abstract machine.
On Thu, Jun 11, 2026 at 1:50 AM Jennifier Burnett <jenni_at_[hidden]> wrote:
The result of now() being defined as "the current point in time" is inherently ambiguous. Almost certainly it won't represent the time at the moment of the function call, nor will it represent the exact moment after the call completes (in both cases the thread may get preempted or interrupted for a non-trivial amount of time whilst in the function.) Attempting to mix the formal parts of the standard (the memory model) and the normative parts (the definition of "current point in time") usually doesn't go well.
Besides this, your example is trivially defeated by store buffering. t1 may well execute the store and then read the time point after, but there's no guarantee that the store need become visible to t2 until an infinitely far point in the future. Thus t1 and t2 may temporally execute in that order, but the read on t2 may still observe a 0. The only way that you could avoid that would be if you specified that any read of a time point happens-before any read of a time point that observes a higher value. That's not even possible to guarantee with a read on AArch64's memory model; you'd have to use a read-modify-write instruction that doesn't modify the memory value (such as a fetch_add with zero), which I would be highly surprised if any current implementation did.
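To make the distinction between a plain read and a non-modifying read-modify-write concrete, here is a small sketch. The helper names `read_plain` and `read_via_rmw` are invented for illustration. The only rule relied on is [atomics.order]'s requirement that an atomic read-modify-write operation reads the last value in the modification order before its own write; a plain load only has to respect coherence.

````cpp
#include <atomic>

// Plain relaxed load: may return any value that coherence permits,
// which is not necessarily the newest one in the modification order.
long read_plain(const std::atomic<long>& v) {
    return v.load(std::memory_order_relaxed);
}

// Read-modify-write that leaves the value unchanged: it reads the last
// value in the modification order that precedes its own (no-op) write.
long read_via_rmw(std::atomic<long>& v) {
    return v.fetch_add(0, std::memory_order_relaxed);
}
````

Neither helper creates a happens-before edge on its own, and whether a given compiler or CPU keeps the `fetch_add(0)` as a genuine read-modify-write is outside what this sketch shows.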
On 10 June 2026 10:46:37 BST, jim x via Std-Discussion <std-discussion_at_[hidden]> wrote:
Consider this example:
````cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Nanoseconds elapsed since the steady_clock epoch.
std::uint64_t timestamp() {
    auto now = std::chrono::steady_clock::now().time_since_epoch();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(now).count();
}

int main() {
    std::atomic<long int> val = 0;
    std::uint64_t now1, now2;
    auto t1 = std::thread([&]() {
        val.store(1, std::memory_order_relaxed); // #1
        now1 = timestamp();                      // #2
    });
    auto t2 = std::thread([&]() {
        now2 = timestamp();                      // #3
        val.load(std::memory_order_relaxed);     // #4
    });
    t1.join();
    t2.join();
}
````
This question arises from whether we can determine if a specific execution outcome is caused by inter-thread latency within the abstract machine. A possible execution of the above example is that #4 reads 0 even when now1 < now2.
Both [intro.execution] p8 <https://eel.is/c++draft/intro.execution#8>
> Given any two evaluations A and B, if A is sequenced before B (or, equivalently, B is sequenced after A), then the execution of A shall precede the execution of B.
and [stmt.pre] p1
> Except as indicated, statements are executed in sequence ([intro.execution]).
state that the control flow executes expressions in sequential order within a single thread, provided one evaluation is sequenced before another.
Furthermore, [time.clock.steady] p1 states:
> Objects of class steady_clock represent clocks for which values of time_point never decrease as physical time advances and for which values of time_point advance at a steady rate relative to real time. That is, the clock may not be adjusted.
and [time.clock.req] p2 states:
> C1::now(): Returns a time_point object representing the current point in time.
This implies that calling now() samples a global time point when the control flow executes it. Since the control flow cannot reach #2 without first executing #1, #1 must be executed by the control flow at a point in time no later than the time point returned by #2. The same logic applies to #3 and #4.
Therefore, when now1 < now2, does it imply that #1 is executed by the control flow of t1 at a point in time strictly earlier than when #4 is executed by the control flow of t2, from the perspective of the abstract machine? (Note that this does not refer to a happens-before relationship, but rather a temporal comparison of the control flows executing these expressions.)
As a minor clarification, this is not a question about physical implementations (which are governed by the "as-if" rule), but rather a conceptual question about the formal behavior defined by the C++ abstract machine.
The deduction above is based entirely on existing rules within the standard, and there seems to be no explicit rule that contradicts this interpretation. Consequently, this appears to be a gray area in the specification. If this reasoning is indeed flawed, where exactly does the flaw lie? Furthermore, are there any specific rules in the standard that would directly negate this conclusion?
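The outcome described in the example (#4 reading 0 while now1 < now2) can be probed empirically with a sketch like the one below. `Outcome`, `timestamp_ns`, `run_once`, and `count_surprising` are hypothetical names introduced here. Such a run only shows what one particular implementation happens to do; it says nothing either way about the abstract machine.

````cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Result of one run of the two-thread example.
struct Outcome {
    std::int64_t now1;  // time sampled at #2
    std::int64_t now2;  // time sampled at #3
    long loaded;        // value read at #4
};

// Nanoseconds elapsed since the steady_clock epoch.
std::int64_t timestamp_ns() {
    auto d = std::chrono::steady_clock::now().time_since_epoch();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(d).count();
}

Outcome run_once() {
    std::atomic<long> val{0};
    Outcome out{0, 0, -1};
    std::thread t1([&] {
        val.store(1, std::memory_order_relaxed);           // #1
        out.now1 = timestamp_ns();                         // #2
    });
    std::thread t2([&] {
        out.now2 = timestamp_ns();                         // #3
        out.loaded = val.load(std::memory_order_relaxed);  // #4
    });
    // Each thread writes distinct members of `out`, so there is no data race;
    // the joins make those writes visible to the caller.
    t1.join();
    t2.join();
    return out;
}

// Counts the runs in which now1 < now2 and yet #4 read 0.
int count_surprising(int runs) {
    int n = 0;
    for (int i = 0; i < runs; ++i) {
        Outcome o = run_once();
        if (o.now1 < o.now2 && o.loaded == 0) {
            ++n;
        }
    }
    return n;
}
````

A count of zero on some machine would not show that the outcome is forbidden, and a non-zero count would not settle whether the abstract machine assigns any temporal meaning to it.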
Received on 2026-06-11 04:20:43
