On Fri, May 15, 2020 at 5:20 PM Hyman Rosen via Std-Discussion <std-discussion@lists.isocpp.org> wrote:
On Fri, May 15, 2020 at 9:30 AM J Decker via Std-Discussion <std-discussion@lists.isocpp.org> wrote:
On Wed, May 13, 2020 at 7:03 AM yo mizu via Std-Discussion <std-discussion@lists.isocpp.org> wrote:
The issue really stems from big endian platforms... in memory (starting at 100d)

100  - 00  00 01 A4

The address of the 'char' that holds the integer's low-order byte is +3 from the start of the pointer.
This means that casting an int* to a char* may cause a shift in the pointer value; worse, the conversion back from char* to int* may not know how many chars to unwind to get back to the start (and what if it goes through a short* in between?).

It's not so painful in a little endian world

100  - A4 01 00 00

100 is the same for char*, short*, int*, ...

The C++ object model is nonsense, and you can't reason from nonsense.

The correct way to think about this, the way it's been done from the
early days of C, is that an object occupies a region of storage in
memory, a pointer to the object is the lowest address of that region
of storage, reinterpret_cast of that pointer to a different pointer
type is a no-op at runtime, and indirecting through such a pointer
treats the sizeof(TO_TYPE) lowest bytes of the storage as if it held
an object of TO_TYPE.

In my vision, memory is a bag of bits, and you can look at that bag
through the lens of any type and get whatever object of that type that
is represented by those bits. There is no such thing as "strict aliasing".
Any write to memory invalidates all cached values (e.g., in registers)
unless the compiler can prove that the write wouldn't affect them.

The optimizationist compiler community has committed itself to willfully
disobeying what programmers write in their code, in patterns that go back
half a century.  Instead of optimization being the transformation of code
into semantically equivalent forms that improve some metric, it has become
the task of finding more and more clever ways to break user intentions,
blithely saying "oh, you couldn't possibly have meant that" and discarding
swathes of code.


I think I would take a middle ground here. I think the strict aliasing rule, which we had all the way back in C++98, is fine. The compiler gets to make more aggressive optimizations and if you want to reinterpret an `int` as a `float` or whatever, you just have to go through some function that will copy the bytes into a `float`, which the compiler can optimize out anyway.
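
As a minimal sketch of the "copy the bytes" approach, the helpers below move an object representation between an integer and a `float` with `std::memcpy`. The names `bits_to_float` and `float_to_bits` are hypothetical, and the code assumes a 32-bit `float`, using `std::uint32_t` in place of `int` to make the size explicit. Mainstream optimizing compilers typically reduce each call to a single register move.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration; not taken from the thread.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Reinterpret the bytes of a 32-bit integer as a float by copying them.
// No float is ever read through an integer lvalue (or vice versa), so the
// strict aliasing rule is not violated.
float bits_to_float(std::uint32_t bits) {
    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}

// The inverse direction: obtain the bytes of a float as a 32-bit integer.
std::uint32_t float_to_bits(float value) {
    std::uint32_t result;
    std::memcpy(&result, &value, sizeof result);
    return result;
}
```

On an implementation with IEEE 754 floats, `bits_to_float(0x3F800000u)` should yield `1.0f`. C++20 offers `std::bit_cast` in `<bit>` for the same purpose.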

But what happened in C++17 is that, by the letter of the standard, the language became effectively unusable in many ways. For example, it seems that `memset` and `memcpy` are now magic library functions that would be UB if they were user-defined, and most uses of `offsetof` are not allowed anymore. When this kind of situation happens, the result is that the C++ community ignores the standard. Some, like me, assume that the standard will eventually be fixed to resolve this issue. Others may feel that the standard has lost its legitimacy.
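
To make the concern concrete, here is a sketch of the two patterns in question. `my_memcpy`, `Point`, and `member_y` are hypothetical names chosen for illustration. Mainstream compilers accept this code and it behaves as expected in practice. The issue is that, under a strict reading of the C++17 wording, the pointer arithmetic across an object's bytes does not appear to be sanctioned, because no array of `unsigned char` formally exists there.

```cpp
#include <cstddef>

// Hypothetical user-written equivalent of memcpy (illustrative only).
void* my_memcpy(void* dest, const void* src, std::size_t n) {
    auto* d = static_cast<unsigned char*>(dest);
    const auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i) {
        // Stepping past the first byte is the questionable part under the
        // strict C++17 reading: d and s are not known to point into arrays
        // of unsigned char.
        d[i] = s[i];
    }
    return dest;
}

// Hypothetical standard-layout type used to show a typical offsetof idiom.
struct Point {
    int x;
    int y;
};

// Reach member y by adding its byte offset to the address of the object.
// This relies on the same byte-wise pointer arithmetic as my_memcpy.
int* member_y(Point* p) {
    auto* bytes = reinterpret_cast<unsigned char*>(p);
    return reinterpret_cast<int*>(bytes + offsetof(Point, y));
}
```

The library `memcpy` has the behavior the standard specifies for it no matter how it is implemented, which is why the user-written version is the one in doubt.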

The change that happened in C++17 did not even happen for a good reason. It was caused by P0137, whose real objective was not to break the C++ object model this badly; it just did so accidentally. WG21 can, and should, fix this problem, and ideally as soon as possible.

Unfortunately, I have encountered resistance to this in the past. One person's point of view was that despite the brokenness of the current situation, allowing pointer arithmetic within objects as if they were `char` arrays would be even worse. Obviously, I strongly disagree with that view.

--
Brian Bi