Date: Wed, 29 Jan 2025 12:39:00 +0000
On Tue, Jan 28, 2025 at 3:40 PM Jens Maurer wrote:
>
> > double *ptr = reinterpret_cast<double*>(0x7fff934a9d83);
> >
> > Then I want "++ptr" to add 8 to it, giving 0x7FFF934A9D8B. I want this
> > to be well-defined behaviour.
>
> It is an important optimizer assumption that all values
> of a given type "pointer to T" are suitably aligned for "T".
I've been mulling this over. Let's say we have a double*, and that
alignof(double) == 8.
If we increment this pointer or do any arithmetic on it, then an
x86_64 CPU will simply do:
add rdi, 8
There isn't any optimisation that can be performed here. I can't think
of another CPU whose pointer arithmetic would work differently in a way
that lets an optimiser exploit the alignment.
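A minimal sketch of that arithmetic, done on std::uintptr_t so that it is already well-defined today and no misaligned double* is ever formed. The helper name advance_as_double is made up for illustration, and the sketch assumes a 64-bit target with an 8-byte double:

```cpp
#include <cstddef>
#include <cstdint>

static_assert(sizeof(double) == 8, "this sketch assumes an 8-byte double");
static_assert(sizeof(std::uintptr_t) >= 8, "this sketch assumes 64-bit addresses");

// Hypothetical helper: advance an address by n doubles using integer
// arithmetic only. Unsigned arithmetic wraps, so negative n also works.
constexpr std::uintptr_t advance_as_double(std::uintptr_t addr, std::ptrdiff_t n)
{
    return addr + static_cast<std::uintptr_t>(n) * sizeof(double);
}
```

With the address from the quoted example, advance_as_double(0x7fff934a9d83, 1) is the integer counterpart of "++ptr".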
But when it comes to dereferencing the pointer, this is where
optimisation can come into it. A scalar load with "movsd" works at
any address:
movsd xmm0, [address]
But the packed moves come in two forms. There is "movapd", which
faults unless the address is 16-byte aligned:
movapd xmm0, [aligned address]
And then there's "movupd" for loading from unaligned memory:
movupd xmm0, [unaligned address]
I'm not saying that we should ever dereference an unaligned pointer.
All I'm asking for is that pointer arithmetic will be well-defined for
all pointer types even when the address is unaligned -- and I think
this won't affect any optimisers. In fact it won't even require a
change to any modern compilers.
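For completeness, the usual way to read a double at an unaligned address without dereferencing a double* is to copy the bytes. This is only a sketch; the helper name load_unaligned_double is hypothetical:

```cpp
#include <cstring>

// Hypothetical helper: read a double from an address that carries no
// alignment guarantee. std::memcpy copies bytes, so the source address
// never has to be a valid, aligned double*.
inline double load_unaligned_double(const void* p)
{
    double value;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

Compilers commonly turn this memcpy into a single load on targets that permit unaligned access, though that is an optimisation and not something the language promises.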
By the way I do realise that this will all fall apart when it comes to
"pointer tagging". If someone uses the lower 3 bits of a pointer to
store extra info, and if I then relocate the object into unaligned
memory and do arithmetic on a pointer inside it, then I'll corrupt the
pointer tag. I realise that.
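One way to see the conflict is a sketch of a low-bits tagging scheme on integer addresses. All the names here are made up for illustration, and the scheme assumes that untagged addresses are 8-byte aligned:

```cpp
#include <cstdint>

// The lower 3 bits are free for a tag only if every real address is a
// multiple of 8.
constexpr std::uintptr_t tag_mask = 0x7;

// Store a 3-bit tag in the low bits of an address.
constexpr std::uintptr_t with_tag(std::uintptr_t addr, unsigned tag)
{
    return (addr & ~tag_mask) | (tag & tag_mask);
}

// Recover the address by clearing the tag bits.
constexpr std::uintptr_t strip_tag(std::uintptr_t tagged)
{
    return tagged & ~tag_mask;
}

// Recover the tag from the low bits.
constexpr unsigned tag_of(std::uintptr_t tagged)
{
    return static_cast<unsigned>(tagged & tag_mask);
}
```

For an aligned address such as 0x7fff934a9d80 the round trip through with_tag and strip_tag is lossless. For 0x7fff934a9d83 the low bits of the address itself are overwritten by the tag, so the original address cannot be recovered.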
Received on 2025-01-29 12:39:12