On Thu, Mar 15, 2018 at 6:44 PM, Nick Lewycky <nlewycky@google.com> wrote:

As-if transformations maintain the actions of the written program under the assumption that undefined behaviour does not occur. The two are fundamentally intertwined.

Here's a few thought experiments: [...]
Today, all these optimizations are justified by the fact that any attempt to observe the difference before and after optimization must necessarily execute undefined behaviour, and executing undefined behaviour is a priori assumed to be impossible.

It's possible that there's another way to show that these and other optimizations are valid, but I would need to see a proposal on how to do that. I've tried to create one myself and failed. Frankly, I'm starting to believe that our current situation with UB really is the best possible design choice.

This is beautiful, by the way. Hear hear.

Can you fix this without adding a listing of CPU instructions to the language standard
and without fully defining it?

I don't need to.  The C++ Standard already specifies that atomic integers obey the rules
of 2's-complement arithmetic, and it doesn't spell out what those rules are.

I would consider it shall "obey the rules of 2's complement arithmetic" to be fully defined, as 1's complement machines would have to simulate 2's complement arithmetic. (Offhand, I think this is what happens with atomics today; 1's complement machines are expected to use a library call to implement the atomic and ensure 2's complement?)

This is relevant to JF's P0907R0 (and to my UB-preserving fork too), so I'm curious to explore this.
What N4700 says on the subject is...

> There are specializations of the `atomic` template for the integral types [...] `int`, [...]
>     template<> struct atomic<integral> {
>         [...]
>         integral operator++() noexcept;
>         integral operator+=(integral) noexcept;
>         [...]
>     };
> Descriptions are provided below only for members that differ from the primary template.
> The following operations perform arithmetic computations. [... a table follows; it does not include operator++ ...]
> [...]
> T fetch_key(T operand, memory_order order = memory_order_seq_cst) noexcept;
> [...]
> Remarks: For signed integer types, arithmetic is defined to use two's complement representation. There are no undefined results.
> T operator op=(T operand) noexcept;
> Effects: Equivalent to: return fetch_key(operand) op operand;

So, first of all, there's no official semantics provided for e.g.
    std::atomic<int> x(42);
But we can intuitively assume that `++x` should be equivalent to `x+=1`, and move on from there.
`x+=1` is officially defined to be equivalent to `x.fetch_add(1) + 1`.

Now things start breaking down, on non-two's-complement machines. We have to somehow make `x.fetch_add(1)` perform an operation which is equivalent to a "two's complement" addition, i.e., it needs to modify the bitwise representation of `x` in some way that corresponds to a two's-complement increment.
    std::atomic<int> y(INT_MAX);  // for the sake of argument, let's use a 16-bit sign-magnitude machine
This means that if we have the bit-pattern 0x7FFF (a 16-bit INT_MAX), incrementing it must produce the bit-pattern 0x8000, because that's what a two's complement addition would produce.
And then we do the second `+=`. It adds 1 to the bit-pattern 0x8000 and stores 0x8001 safely back into it.
But look at the return type of `y.fetch_add(1)`!  It is specified to return integral, i.e. int, i.e. it will return 0x8000, which is "negative zero" on this machine.
And then operator+= is specified to behave as-if `y.fetch_add(1) + 1`!
So we have "negative zero" 0x8000, and we're trying to "+1" to it, in the native ones'-complement arithmetic. If negative zero is a trap representation, we will get a trap. Otherwise, we will observe that "INT_MAX plus 1 plus 1" is "negative zero plus 1" is int(1) — anyway, it is certainly not INT_MIN as we were naively hoping!

The (intermediate) result is that atomics do not visibly behave in a two's-complement manner on a ones'-complement machine.

Now let's do the same operation on a two's-complement machine.

    std::atomic<int> y(INT_MAX);

We add 1 to the bit-pattern 0x7FFF and store 0x8000 safely back into it.
But look at the return type of `y.fetch_add(1)`!  It is specified to return integral, i.e. int, i.e. it will return 0x7FFF, which is INT_MAX on this machine.
And then operator+= is specified to behave as-if `y.fetch_add(1) + 1`!
So we have "INT_MAX" 0x7FFF, and we're trying to "+1" to it, in the native arithmetic. If overflow checking is enabled, we will get a trap. In general, we have undefined behavior here.  (That is, fetch_add has defined behavior on overflow, but operator+= has undefined behavior on overflow.)

The (ultimate) result is that atomics do not visibly behave in a two's-complement manner even on a two's-complement machine.

Now, I don't think this is fatal to C++. I think this is fatal to platforms attempting to implement a dialect of C++ where native `int` has an arithmetic other than two's complement. I strongly support the idea of dropping non-two's-complement arithmetic from C++ altogether. (And, incidentally, the wording for atomics is broken as demonstrated above; but that's a separate issue IMHO.)