sg12: Re: [ub] Objectives and tasks for SG12

From: Gabriel Dos Reis <gdr_at_[hidden]>
Date: Thu, 30 May 2013 20:23:42 -0500

Lawrence Crowl <crowl_at_[hidden]> writes:

| On 5/30/13, Jens Maurer <Jens.Maurer_at_[hidden]> wrote:
| > On 05/29/2013 10:36 PM, Nevin Liber wrote:
| > > On 29 May 2013 14:35, Jens Maurer <Jens.Maurer_at_[hidden]> wrote:
| > > > (1) Is a compiler diagnostic acceptable? Yes.
| > > > (2) Is a run-time abort acceptable? Yes.
| > > > (3) Is an unspecified result value acceptable? Yes.
| > > > (4) Is it acceptable that your compiler changes the behavior
| > > > of unrelated code that follows the overflow? That's very
| > > > surprising.
| > > >
| > > > Giving compilers latitude to choose among 1-3 (depending on
| > > > the target audience) is fine, but, in my opinion, prohibiting
| > > > option 4 would be an improvement.
| > >
| > > The counter argument is usually that (4) has a run time cost
| > > in that the overflow must now be detected instead of just
| > > assumed that it cannot happen. This effectively penalizes
| > > correct programs.
| >
| > This statement seems to be inaccurate for the majority of
| > current hardware. Signed integer overflow will just "work" on
| > the hardware level and give you some result, i.e. implement (3).
| > No extra checking is needed.
| >
| > This is exactly the reason why I think "signed integer overflow" is
| > a good example for the discussion: Current hardware exhibits only
| > a limited set of behavior, yet the C++ standard does not reflect
| > that, but gives permission to the compiler to do anything it wants.
| >
| > Can we quantify what we give up if we model current hardware
| > behavior more closely?
|
| Consider a program that adds a constant to a signed integer
| in a loop. Under the current model, the compiler can assume
| that the variable is monotonically increasing, and can therefore
| eliminate comparisons, which leads to simpler loops, which leads
| to vectorization, which leads to implementation in GPUs, ....
| Without that assumption, the chain of optimizations disappears.
| We really won't know the true tradeoff on real code until such
| optimizations are widely deployed.
|
| > > What does "behavior of unrelated code" even mean once we've
| > > invoked undefined behavior?
| >
| > Well, I thought the goal of SG12 was to discuss whether the
| > current definition of "undefined behavior" should be retained
| > for some (which?) cases, or whether something could be done to
| > restrain the set of valid executions. For example, we already
| > have the concept of "unspecified behavior", e.g. the sequencing
| > of evaluation of function arguments is unspecified. This means
| > the implementation is restricted to choose among a set of possible
| > behaviors, and not exhibit arbitrary behavior.
|
| My hope for SG12 was to get a clear understanding of the risks
| and benefits of undefined behavior. At present, the C++ community
| seems to be filled with fear of the risks, with no understanding of
| the benefits. The risks and benefits may vary between different
| language features, so I expect the present state to change.
|
| I think we would be wise to consider unspecified behavior as well.
| The issue is that with unspecified behavior, programmers have the
| same vulnerability to bugs, but the compiler has far less latitude
| to help find them.
|
| One good result from SG12 is a list of features that have more than
| one possible outcome, so that programmers have a better understanding
| of when their programs stray from perfectly portable.
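
To make the "unspecified" case mentioned in the quoted text concrete,
here is a minimal sketch of argument evaluation order. The names
(first, second, add, call_log) are hypothetical and purely illustrative.
The point is only that the implementation picks among a small set of
outcomes instead of being free to do anything:

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers: each appends a tag to a shared log and returns a value.
static std::string call_log;

int first()  { call_log += 'a'; return 1; }
int second() { call_log += 'b'; return 2; }

int add(int x, int y) { return x + y; }

// The two argument expressions are evaluated in an unspecified order,
// but the two calls do not interleave, and the sum is the same either way.
int run_unspecified_order() {
    call_log.clear();
    int result = add(first(), second());
    // Exactly two calls happened, whichever order was chosen.
    assert(call_log.size() == 2);
    return result;
}
```

After the call, call_log holds either "ab" or "ba", depending on the
implementation; a portable program must not depend on which.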

I am hoping SG12's recommendations and resolutions will be more than
just "educate programmers on undefined behavior". We have been educating
C++ programmers about these issues for more than 25 years, and I am not
sure we can say we have won on this front. In fact, I think we haven't
and we are not winning; and even where we have made progress, it has not
kept pace with the growing sophistication of optimizers.

We also have to consider the cases where the potential risks posed by
"undefined behavior" outweigh the potential optimizations it may enable.
We would need hard numbers and, of course, a healthy dose of skepticism too.
(Many C++ programs aren't SPEC benchmarks.)

Another path to explore is to try to spell out in broad terms the
conditions certain valuable optimizations need, without requiring
unrestricted behavior. The example of loop vectorization you gave does
not require signed integer overflow to be undefined behavior everywhere
else in the program. Nor does it require that a function that isn't
subject to vectorization behave in a way that is most surprising;
see the various examples in the references.
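
As a rough sketch of the two situations (this is not one of the examples
from the references, and the names sum_strided and checked_add are
hypothetical), consider a loop of the shape discussed above next to an
ordinary addition elsewhere in the program. Only the first benefits from
the "no overflow" assumption; the second can detect overflow before it
happens and give a well-defined outcome:

```cpp
#include <cassert>
#include <climits>

// Hypothetical loop of the shape in question: a signed index advanced by
// a constant stride. Because overflow of `i` is undefined, a compiler may
// assume `i` only increases, which simplifies the loop for vectorization.
long sum_strided(const int* data, int n, int stride) {
    assert(stride > 0);
    long total = 0;
    for (int i = 0; i < n; i += stride) {
        total += data[i];
    }
    return total;
}

// Code that is not a vectorization candidate: the overflow test is done
// before the addition, so no undefined behavior ever occurs.
// Returns false and leaves `out` untouched if a + b is not representable.
bool checked_add(int a, int b, int& out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    out = a + b;
    return true;
}
```

What the caller does when checked_add returns false (diagnose, abort,
or substitute some value) corresponds to the milder options discussed
earlier in the thread; none of them needs unrestricted behavior.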

-- Gaby

Received on 2013-05-31 03:30:35