Date: Fri, 10 Jun 2022 11:10:20 -0400
The "sense" that a program makes is given by the rules of the language, and
the rules of the language have always said, since volatile was introduced,
that reading or writing a volatile variable is a side effect that is
sequenced with all other side effects, and that access to volatile
variables is supposed to happen by following the abstract machine. The
languages said nothing to differentiate automatic volatiles from any other
volatiles.
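As a minimal sketch of what that wording covers (count_reads is a hypothetical name used only for illustration, not something from the standard or from this thread): the function below reads an automatic volatile in a loop. On the abstract machine each evaluation of flag is a separate volatile access, exactly as it would be for a namespace-scope volatile. Whether a given compiler actually emits every load is the question under dispute here.

```cpp
// Hypothetical illustration; count_reads is not from the standard or the thread.
int count_reads(int n) {
    volatile int flag = 0;      // automatic (block-scope) volatile object
    int reads = 0;
    for (int i = 0; i < n; ++i) {
        if (flag == 0) {        // one volatile read per iteration on the abstract machine
            ++reads;
        }
    }
    return reads;
}
```

The return value is the same whether or not the loads are emitted. The difference only shows up in the generated code, which is where the disagreement lies.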
It is not esoteric nonsense to expect a program to behave the way the
language specifies. And it is not as if the C++ committee hasn't added
enormous amounts of esoteric nonsense to the language that people are
expected to understand. Here's my favorite:
#include <cstdio>
int &a() { printf("a"); static int i; return i; }
int &b() { printf("b"); static int i; return i; }
void foo() {
    a() <<= b();
    a() << b();
    a() < b();
}
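For anyone who wants to see the rules in action, here is a small sketch, assuming a C++17 (or later) compiler. It uses the same three expressions, but a() and b() append to a string instead of printing so that the order can be inspected. The names trace, order_shift_assign, order_shift and order_less are made up for this illustration. As I read C++17, the right operand of <<= is sequenced before the left, the left operand of << is sequenced before the right, and the operands of < are evaluated in no specified order.

```cpp
#include <string>

// Hypothetical helpers for illustration only; not part of the original example.
static std::string trace;

int &a() { trace += 'a'; static int i = 1; return i; }
int &b() { trace += 'b'; static int i = 0; return i; }  // shift count of 0 keeps the values small

// Compound assignment: C++17 sequences the right operand before the left.
std::string order_shift_assign() { trace.clear(); a() <<= b(); return trace; }

// Shift: C++17 sequences the left operand before the right.
std::string order_shift() { trace.clear(); (void)(a() << b()); return trace; }

// Relational: the order of the two calls is unspecified, so either may come first.
std::string order_less() { trace.clear(); (void)(a() < b()); return trace; }
```

Under those rules order_shift_assign() yields "ba" and order_shift() yields "ab", while order_less() may yield either. Before C++17 none of the three orders was guaranteed.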
On Fri, Jun 10, 2022, 1:37 AM Jason McKesson via Std-Proposals <
std-proposals_at_[hidden]> wrote:
> On Thu, Jun 9, 2022 at 9:10 PM Edward Catmur <ecatmur_at_[hidden]>
> wrote:
> >
> > On Fri, 10 Jun 2022 at 01:34, Jason McKesson via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
> >>
> >> On Thu, Jun 9, 2022 at 7:40 PM Edward Catmur <ecatmur_at_[hidden]>
> wrote:
> >> >
> >> > On Thu, 9 Jun 2022 at 18:29, Jason McKesson via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
> >> >>
> >> >> On Thu, Jun 9, 2022 at 11:45 AM Hyman Rosen via Std-Proposals
> >> >> <std-proposals_at_[hidden]> wrote:
> >> >> >
> >> >> > The problem started with code like
> >> >> >
> >> >> > volatile char v[4] = {0};
> >> >> > ++v[2];
> >> >> >
> >> >> > On architectures that do not provide byte-level memory access, the
> abstract machine cannot be followed exactly. Reading or writing one byte
> from that array could require reading or writing all four. Rather than
> trying to come up with wording to cover such situations, the C standard
> made volatile access implementation-defined, with the understanding that
> compilers should implement volatile semantics as best they can in weird
> situations. But once the optimizationists captured the standardization
> processes, they took that implementation-defined behavior as permission to
> disregard volatile semantics altogether even on normal platforms.
> >> >> >
> >> >> > Here is an interesting case.
> >> >> >
> >> >> > void foo(int *p) {
> >> >> > if (!p) { printf("null pointer\n"); }
> >> >> > volatile bool b = false;
> >> >> > if (b) { abort(); }
> >> >> > *p = 0;
> >> >> > }
> >> >> >
> >> >> > It's implementing a poor-man's contract check that the argument is
> not null and trying to print a message if it is (but continue going). With
> correctly implemented volatile semantics, the message will print because
> printing is a side effect that must happen before the volatile access which
> is also a side effect. But the Microsoft compiler elides the volatile
> variable and test altogether, then sees that if the initial test is true
> then undefined behavior must result, and eliminates that test and the print.
> >> >>
> >> >> OK, I'm a normal C++ programmer and I read over that code. My first
> >> >> thought will be "why is there a pointless variable here?" The idea
> >> >> that the presence of `if(b)` should change *anything* about the
> >> >> execution of this code is absurd. This only makes sense to those who
> >> >> already know way too much about the language.
> >> >>
> >> >> It would make far more sense if you could just stick some kind of
> >> >> syntax that obviously says, "Follow the abstract machine exactly"
> >> >> there. Like:
> >> >>
> >> >> > void foo(int *p) {
> >> >> > [[exact_code]] {
> >> >> > if (!p) { printf("null pointer\n"); }
> >> >> > *p = 0;
> >> >> > }
> >> >> > }
> >> >>
> >> >> That is readable. It tells you what is happening. It makes it much
> >> >> more clear not just what is happening but why it is there. And the
> >> >> scope applied to the attribute tells you how much of the function it
> >> >> covers.
> >> >
> >> >
> >> > I am not sure what it means to "follow the abstract machine exactly".
> Could you explain in a manner that could be followed by a compiler?
> >>
> >> Not really. That's why it is an attribute. The idea is to say "don't
> >> mess around with this code." What that means is up to compilers and
> >> QOI.
> >
> >
> > So why would any compiler writer respect that attribute, if they can
> just hoist the banner of QOI?
> >
> >> The closest I could say to an actual, standard-defined set of behavior
> >> would be to not have UB propagate backwards in time. That is, if there
> >> is some runtime construct that generates UB, every observable
> >> side-effect that happens-before the UB-invoking construct should
> >> execute as directed, whenever possible.
> >
> >
> > Have you seen S. Davis Herring's P1494R2, partial program correctness
> (std::observable())? (I think that might be where Hyman Rosen's example
> comes from; it's more or less the same idea, anyway.) That might help in
> this specific case, but not for the debug case or for benchmarking.
> >
> > A practical problem is that actually flushing the relevant buffers to
> the console can take an arbitrary amount of time, and during that time may
> be dependent on the program continuing to exist.
>
> Things like that are why my wording would be more of a suggestion than
> a requirement. The implementation would attempt, with the best of its
> abilities, to act as though side effects that are ordered to happen
> before something that causes UB will still be visible.
>
> The idea of an explicit fence like that might be better than my scoped
> tool. But at the end of the day, you can't guarantee any of it. Not
> even with a `volatile` under prior wording.
>
> >> > As for scope, we really aren't interested in covering more than a
> single read or write (in the benchmarking case) of an object (which may
> involve more than one read or write of scalars). Quite possibly this could
> be accomplished by std::<bikeshed>_load and std::<bikeshed>_store functions
> acting as an optimizer barrier.
> >>
> >> The example that was presented is about the propagation of UB. That
> >> is, because `if(!p)` can only ever be true if the final statement
> >> yields UB. And since compilers are allowed to do *anything* in the
> >> event of UB, if the user passes a NULL `p`, then UB will result.
> >> Therefore, the compiler may freely assume that `p` will never be NULL,
> >> since UB doesn't promise anything about the state of the abstract
> >> machine *before* the UB was triggered. And therefore, it can optimize
> >> out the `if`.
> >
> > No, the compiler can only do this if it has already decided to elide the
> volatile read.
>
> Which is a perfectly valid thing for it to do, because by any
> reasonable metric, the read should accomplish *nothing*. That's what
> brings us here.
>
> Remember: the primary point of this is to make it so that people can
> write code where its behavior can be reasonably understood without
> resorting to esoteric nonsense. It doesn't make sense for a read of a
> local variable, `volatile` or not, to cause the behavior you want it
> to cause. So even if you wanted to find a way to restore that behavior
> (which I'm not sure you could), it would still be a bad way of causing
> that particular behavior. And the only justification you have for it
> is that we've always done it that way.
>
> > If the test were changed to e.g. getchar() it would not be elided.
>
> That's only because the result of `getchar` is beyond the compiler's
> purview. The result of a variable read whose domain is entirely
> visible to the compiler is within the compiler's purview.
>
> >> There is no way to make this a property of a variable or of a memory
> >> read. It's a property of the relationship between the `if(!p)` and the
> >> `*p` code. You need to tell the compiler that you want it to actually
> >> do the `if` test and you actually want to see the results if possible,
> >> even if you will get UB in a moment.
> >
> >
> > If the compiler is required to perform the read and - perhaps this is
> crucial - not permitted to assume that it retains a previously stored
> value, even if there is no way for the address to escape - then it would
> have to perform the test.
>
> Again, you're not justifying why we should use `volatile` for this;
> only that historically it has been possible (in theory). That doesn't
> make it good; indeed, the whole point of deprecating these kinds of
> things is to keep people from using the feature in nonsensical
> contexts that require expert-level understanding to even know what's
> going on.
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>
Received on 2022-06-10 15:10:34