Date: Fri, 10 Jun 2022 02:10:20 +0100
On Fri, 10 Jun 2022 at 01:34, Jason McKesson via Std-Proposals <
std-proposals_at_[hidden]> wrote:
> On Thu, Jun 9, 2022 at 7:40 PM Edward Catmur <ecatmur_at_[hidden]>
> wrote:
> >
> > On Thu, 9 Jun 2022 at 18:29, Jason McKesson via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
> >>
> >> On Thu, Jun 9, 2022 at 11:45 AM Hyman Rosen via Std-Proposals
> >> <std-proposals_at_[hidden]> wrote:
> >> >
> >> > The problem started with code like
> >> >
> >> > volatile char v[4] = {0};
> >> > ++v[2];
> >> >
> >> > On architectures that do not provide byte-level memory access, the
> abstract machine cannot be followed exactly. Reading or writing one byte
> from that array could require reading or writing all four. Rather than
> trying to come up with wording to cover such situations, the C standard
> made volatile access implementation-defined, with the understanding that
> compilers should implement volatile semantics as best they can in weird
> situations. But once the optimizationists captured the standardization
> processes, they took that implementation-defined behavior as permission to
> disregard volatile semantics altogether even on normal platforms.
> >> >
> >> > Here is an interesting case.
> >> >
> >> > void foo(int *p) {
> >> >     if (!p) { printf("null pointer\n"); }
> >> >     volatile bool b = false;
> >> >     if (b) { abort(); }
> >> >     *p = 0;
> >> > }
> >> >
> >> > It's implementing a poor-man's contract check that the argument is
> not null and trying to print a message if it is (but continue going). With
> correctly implemented volatile semantics, the message will print because
> printing is a side effect that must happen before the volatile access which
> is also a side effect. But the Microsoft compiler elides the volatile
> variable and test altogether, then sees that if the initial test is true
> then undefined behavior must result, and eliminates that test and the print.
> >>
> >> OK, I'm a normal C++ programmer and I read over that code. My first
> >> thought will be "why is there a pointless variable here?" The idea
> >> that the presence of `if(b)` should change *anything* about the
> >> execution of this code is absurd. This only makes sense to those who
> >> already know way too much about the language.
> >>
> >> It would make far more sense if you could just stick some kind of
> >> syntax that obviously says, "Follow the abstract machine exactly"
> >> there. Like:
> >>
> >> > void foo(int *p) {
> >> >     [[exact_code]] {
> >> >         if (!p) { printf("null pointer\n"); }
> >> >         *p = 0;
> >> >     }
> >> > }
> >>
> >> That is readable. It tells you what is happening. It makes it much
> >> more clear not just what is happening but why it is there. And the
> >> scope applied to the attribute tells you how much of the function it
> >> covers.
> >
> >
> > I am not sure what it means to "follow the abstract machine exactly".
> Could you explain in a manner that could be followed by a compiler?
>
> Not really. That's why it is an attribute. The idea is to say "don't
> mess around with this code." What that means is up to compilers and
> QOI.
>
So why would any compiler writer respect that attribute, if they can just
hoist the banner of QOI?
> The closest I could say to an actual, standard-defined set of behavior
> would be to not have UB propagate backwards in time. That is, if there
> is some runtime construct that generates UB, every observable
> side-effect that happens-before the UB-invoking construct should
> execute as directed, whenever possible.
>
Have you seen S. Davis Herring's P1494R2, partial program correctness
(std::observable())? (I think that might be where Hyman Rosen's example
comes from; it's more or less the same idea, anyway.) That might help in
this specific case, but not for the debug case or for benchmarking.
A practical problem is that actually flushing the relevant buffers to the
console can take an arbitrary amount of time, and during that time may be
dependent on the program continuing to exist.
As for scope, we really aren't interested in covering more than a single
read or write (in the benchmarking case) of an object (which may involve
more than one read or write of scalars). Quite possibly this could be
accomplished by std::<bikeshed>_load and std::<bikeshed>_store functions
acting as an optimizer barrier.
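
A rough sketch of what such load/store functions might look like, for
scalars only. The names forced_load and forced_store are hypothetical
stand-ins for the bikeshed names; nothing like them exists in the standard
library. The sketch assumes that an access through a volatile glvalue is
treated as a volatile access even when the object itself is not declared
volatile, which is what compilers do in practice but is exactly the kind of
guarantee under discussion here.

```cpp
#include <type_traits>

// Hypothetical helpers standing in for std::<bikeshed>_load / _store.

// Read obj through a volatile glvalue, so the read is meant to be
// performed rather than folded into a previously known value.
template <class T>
T forced_load(const T& obj) {
    static_assert(std::is_scalar_v<T>, "sketch covers scalar types only");
    return *static_cast<const volatile T*>(&obj);
}

// Write value to obj through a volatile glvalue, so the write is meant
// to be performed rather than treated as a dead store.
template <class T>
void forced_store(T& obj, T value) {
    static_assert(std::is_scalar_v<T>, "sketch covers scalar types only");
    *static_cast<volatile T*>(&obj) = value;
}
```

In a benchmark one might call forced_store(sink, result) once per iteration
to keep the computation of result alive. An object made of several scalars
would need one such access per scalar.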
>
> The example that was presented is about the propagation of UB. That
> is, because `if(!p)` can only ever be true if the final statement
> yields UB. And since compilers are allowed to do *anything* in the
> event of UB, if the user passes a NULL `p`, then UB will result.
> Therefore, the compiler may freely assume that `p` will never be NULL,
> since UB doesn't promise anything about the state of the abstract
> machine *before* the UB was triggered. And therefore, it can optimize
> out the `if`.
>
No, the compiler can only do this if it has already decided to elide the
volatile read. If the test were changed to e.g. getchar() it would not be
elided.
> There is no way to make this a property of a variable or of a memory
> read. It's a property of the relationship between the `if(!p)` and the
> `*p` code. You need to tell the compiler that you want it to actually
> do the `if` test and you actually want to see the results if possible,
> even if you will get UB in a moment.
>
If the compiler is required to perform the read and - perhaps this is
crucial - not permitted to assume that it retains a previously stored
value, even if there is no way for the address to escape - then it would
have to perform the test.
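
For contrast, here is a variant of the example in which the flag lives
outside the function, so the compiler has much less basis for assuming what
value the read yields. The name abort_requested is made up for this sketch.
This only shows the shape of the code: whether the print survives for a null
argument still depends on the compiler honouring the volatile read, and the
standard does not currently promise that.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical flag, not part of the original example. It has external
// linkage, so code in other translation units may modify it.
bool abort_requested = false;

void foo(int* p) {
    if (!p) { std::printf("null pointer\n"); }
    // Read the flag through a volatile glvalue. If this read must be
    // performed and its result cannot be assumed, the abort() branch stays
    // reachable, and the null check above cannot be discarded on the
    // grounds that the store below is always reached.
    if (*static_cast<volatile bool*>(&abort_requested)) { std::abort(); }
    *p = 0;  // still undefined behaviour when p is null
}
```

With a valid pointer, foo simply stores 0 through it and returns.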
Received on 2022-06-10 01:10:33