Subject: Re: Accessing an object with a char pointer.
From: Yongwei Wu (wuyongwei_at_[hidden])
Date: 2020-05-17 10:09:50
On Sat, 16 May 2020 at 05:41, Brian Bi via Std-Discussion <
std-discussion_at_[hidden]> wrote:
> On Fri, May 15, 2020 at 5:20 PM Hyman Rosen via Std-Discussion <
> std-discussion_at_[hidden]> wrote:
>> On Fri, May 15, 2020 at 9:30 AM J Decker via Std-Discussion <
>> std-discussion_at_[hidden]> wrote:
>>> On Wed, May 13, 2020 at 7:03 AM yo mizu via Std-Discussion <
>>> std-discussion_at_[hidden]> wrote:
>>> The issue really stems from big endian platforms... in memory (starting
>>> at 100d)
>>> 100 - 00 00 01 A4
>>> the address of the 'char' that is in that integer is +3 from the start
>>> of the pointer.
>>> This means that casting an int* to a char* may cause a shift to the
>>> pointer value; and worse, the conversion back from char* to int* may not
>>> know how many chars to unwind to get back to the start (or what if it
>>> goes through a short* in between?).
>>> It's not so painful in a little endian world
>>> 100 - A4 01 00 00
>>> 100 is all the same for char*, short*, int* ,...
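
On the implementations I am aware of, converting an int* to a char* does not shift the address on either byte order; what differs is which byte of the value sits at that address. A minimal sketch to check this on a given platform (the function names are mine, and it assumes a plain big- or little-endian machine):

```cpp
#include <cstdint>

// True if the lowest-addressed byte of a multi-byte integer holds its
// least significant byte, i.e. the platform is little endian.
bool is_little_endian() {
    std::uint32_t probe = 1;
    return *reinterpret_cast<unsigned char*>(&probe) == 1;
}

// Returns the byte stored at the lowest address of `value`.
unsigned char first_byte(const std::uint32_t& value) {
    return *reinterpret_cast<const unsigned char*>(&value);
}

// Casts to unsigned char* and back again; true if the round trip yields
// the original pointer value.
bool round_trips(std::uint32_t* p) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(p);
    return reinterpret_cast<std::uint32_t*>(bytes) == p;
}
```

For the value 0x000001A4 from the example, first_byte should give 0x00 on a big-endian machine and 0xA4 on a little-endian one, while round_trips should hold on both.
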
>> The C++ object model is nonsense, and you can't reason from nonsense.
>> The correct way to think about this, the way it's been done from the
>> early days of C, is that an object occupies a region of storage in
>> memory, a pointer to the object is the lowest address of that region
>> of storage, reinterpret_cast of that pointer to a different pointer
>> type is a no-op at runtime, and indirecting through such a pointer
>> treats the sizeof(TO_TYPE) lowest bytes of the storage as if it held
>> an object of TO_TYPE.
>> In my vision, memory is a bag of bits, and you can look at that bag
>> through the lens of any type and get whatever object of that type that
>> is represented by those bits. There is no such thing as "strict aliasing".
>> Any write to memory invalidates all cached values (e.g., in registers)
>> unless the compiler can prove that the write wouldn't affect them.
>> The optimizationist compiler community has committed itself to willfully
>> disobeying what programmers write in their code, in patterns that go back
>> half a century. Instead of optimization being the transformation of code
>> into semantically equivalent forms that improve some metric, it has become
>> the task of finding more and more clever ways to break user intentions,
>> blithely saying "oh, you couldn't possibly have meant that" and discarding
>> swathes of code.
> I think I would take a middle ground here. I think the strict aliasing
> rule, which we had all the way back in C++98, is fine. The compiler gets to
> make more aggressive optimizations and if you want to reinterpret an `int`
> as a `float` or whatever, you just have to go through some function that
> will copy the bytes into a `float`, which the compiler can optimize out.
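
The copying function mentioned here is usually written with std::memcpy. A minimal sketch, assuming a 32-bit `float` (the helper names are hypothetical, not from any library):

```cpp
#include <cstdint>
#include <cstring>

// Assumption of this sketch: float and std::uint32_t have the same size.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Reinterprets the bits of a 32-bit integer as a float by copying bytes,
// rather than by reading the integer through a float*.
float bits_to_float(std::uint32_t bits) {
    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}

// The inverse: extracts the bit pattern of a float.
std::uint32_t float_to_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Mainstream optimizing compilers typically turn each of these into a single register move, so the copy tends to cost nothing in practice. On an IEEE 754 platform, bits_to_float(0x3F800000) yields 1.0f.
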
> But what happened in C++17 is that, by the letter of the standard, the
> language became effectively unusable in many ways. For example, it seems
> that `memset` and `memcpy` are now magic library functions that would be UB
> if they were user-defined, and most uses of `offsetof` are not allowed
> anymore. When this kind of situation happens, the result is that the C++
> community ignores the standard. Some, like me, assume that the standard
> will eventually be fixed to resolve this issue. Others may feel that the
> standard has lost its legitimacy.
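
To make the `memcpy` point concrete, this is the kind of hand-written byte copy in question. It is only a sketch (copy_bytes is a hypothetical name), and whether the C++17 wording really leaves the indexing below undefined is the claim under discussion, not something I am asserting:

```cpp
#include <cstddef>

// A user-defined equivalent of memcpy: copies `count` bytes from `src`
// to `dest` one unsigned char at a time. The regions must not overlap.
void* copy_bytes(void* dest, const void* src, std::size_t count) {
    auto* d = static_cast<unsigned char*>(dest);
    const auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < count; ++i) {
        d[i] = s[i];  // walks the bytes of whatever object lives here
    }
    return dest;
}
```

Every compiler I know of handles this as programmers expect, for example copying one `int` into another with copy_bytes(&b, &a, sizeof a).
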
> The change that happened in C++17 did not even happen for a good reason.
> It was caused by P0137 <https://wg21.link/P0137>, whose real objective
> was not to break the C++ object model this badly; it just did so
> accidentally. WG21 can, and should, fix this problem, and ideally as soon
> as possible.
It must have been accidental, IMHO. Telling any programmer that applying
reinterpret_cast to a pointer can change the pointer value is astounding.
Accessing an object through a char* is established usage, and I cannot think
of a reason why the committee would want to break that.
-- Yongwei Wu URL: http://wyw.dcweb.cn/