Re: [ub] Is memmove a general way to change the type of memory?

From: Lawrence Crowl <Lawrence_at_[hidden]>
Date: Thu, 7 Nov 2013 15:06:51 -0800
On 11/5/13, Jeffrey Yasskin <jyasskin_at_[hidden]> wrote:
> Adam (cc'ed) found an interesting result yesterday:
> Say we have a region of memory written by an unknown process:
> int* content = new int[sizeof(Foo)/sizeof(int)];
> initialize(content);
> We want to interpret the bytes in that region of memory as the object
> representation of a 'Foo'. We can't just reinterpret_cast them to a
> Foo, since then we're violating [basic.lval]. However, we know how to
> reinterpret the bytes of a float as the bytes of an int:
> float f = 3.14f;
> uint32_t i;
> static_assert(sizeof(i) == sizeof(f), "...");
> memcpy(&i, &f, sizeof(i));
> use(i); // Unspecified value, but implementations often define it.
> So:
> Foo f;
> memcpy(&f, content, sizeof(Foo));
> use(f); // Similar caveat.
> We can certainly transform this to:
> char tmp[sizeof(Foo)];
> memcpy(tmp, content, sizeof(Foo));
> memcpy(&f, tmp, sizeof(Foo));
> use(f);
> But in between the memcpy()s there, we're no longer using 'content',
> so to save memory let's transform that to:
> char tmp[sizeof(Foo)];
> memcpy(tmp, content, sizeof(Foo));
> Foo* foo = reinterpret_cast<Foo*>(content);
> memcpy(foo, tmp, sizeof(Foo));
> use(*foo);

The reinterpret_cast is suspicious to me because there is no aliasing
permitted between int[] and Foo.
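
For contrast, here is a minimal sketch of the variant that avoids the cast entirely: the bytes are copied into an object that was declared as a Foo, so no int storage is ever accessed through a Foo lvalue. The struct Foo and the helper load_foo are hypothetical stand-ins, and the sketch assumes Foo is trivially copyable and has no padding.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the Foo of the quoted message.
struct Foo {
    int a;
    int b;
};
static_assert(std::is_trivially_copyable<Foo>::value,
              "memcpy requires a trivially copyable type");
static_assert(sizeof(Foo) == 2 * sizeof(int), "sketch assumes no padding");

// Copies the object representation out of the int buffer into a real Foo.
// The int storage is only read as bytes; it is never accessed as a Foo.
Foo load_foo(const int* content) {
    assert(content != nullptr);
    Foo f;
    std::memcpy(&f, content, sizeof(Foo));
    return f;
}
```

This costs one Foo-sized copy, which is the memory the quoted transformation was trying to save.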

> But memmove() is defined as "The memmove function copies n characters
> from the object pointed to by s2 into the object pointed to by s1.
> Copying takes place as if the n characters from the object pointed to
> by s2 are first copied into a temporary array of n characters that does
> not overlap the objects pointed to by s1 and s2, and then the n
> characters from the temporary array are copied into the object pointed
> to by s1." in C99, and C++14 delegates to C99 for memmove()'s
> definition.

I think that phrasing was intended to deal with overlap, not aliasing.
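
A small sketch of the overlap case that wording seems to be written for: source and destination are ranges of the same char buffer, and the "as if through a temporary array" rule is what makes the result predictable. The helper shift_right is a hypothetical name used only for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Shifts the first n chars of buf right by `shift` positions, in place.
// Source [0, n) and destination [shift, shift + n) overlap, which memmove
// permits and memcpy does not.
std::string shift_right(std::string buf, std::size_t n, std::size_t shift) {
    assert(n + shift <= buf.size());
    std::memmove(&buf[shift], &buf[0], n);
    return buf;
}
```

Calling shift_right("abcdef", 4, 2) yields "ababcd". Every object involved is a char, so nothing here touches the question of changing the type of the memory.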

> So we can optimize our implementation to:
> Foo* foo = reinterpret_cast<Foo*>(content);
> memmove(foo, content, sizeof(Foo));
> use(*foo);
> The type of a variable isn't generally thought to help with aliasing
> violations, so we should be fine transforming this to:
> memmove(content, content, sizeof(Foo));
> use(*reinterpret_cast<Foo*>(content));
> Leading to the conclusion that memmove() is the explicit way to reset
> the type of any block of memory. Clang and gcc successfully optimize
> this self-copy away, so it's even a free reinterpretation.

But by optimizing it away, any boundary in lifetimes is probably lost.

> Now, this is a little crazy, and Clang's TBAA annotations on the
> program below appear to indicate an aliasing violation, so I'd like to
> invite this list to point out where my logic's broken.

I think there may be interpretations of [basic.life] that affect your

Perhaps your last step is an issue in the compiler, but frankly I think
the standard is weak in mechanisms for programmers to indicate when an
object lifetime ends.
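
One mechanism that does exist is storage reuse via placement new. The following is a hedged sketch, under my reading of [basic.life], of how the two-memcpy version could mark the lifetime boundary explicitly instead of relying on a cast. The struct Foo and the helper retype_as_foo are hypothetical, and the sketch assumes the storage is suitably aligned and large enough for a Foo and that Foo is trivially copyable.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical stand-in for the Foo of the quoted message.
struct Foo {
    int a;
    int b;
};
static_assert(sizeof(Foo) == 2 * sizeof(int), "sketch assumes no padding");

// Reuses int storage for a Foo while preserving its bytes.
// Assumption: content points to storage aligned and sized for a Foo.
Foo* retype_as_foo(int* content) {
    assert(content != nullptr);
    unsigned char tmp[sizeof(Foo)];
    std::memcpy(tmp, content, sizeof(Foo));  // save the bytes
    Foo* foo = new (content) Foo;            // reuse the storage: a Foo now lives here
    std::memcpy(foo, tmp, sizeof(Foo));      // restore the bytes into the new object
    return foo;
}
```

The temporary buffer is needed because the default-initialized Foo created by the placement new has indeterminate values until the second memcpy writes them. After the call, the storage should be accessed only through the returned pointer.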

> Jeffrey
> --------
> // Compiled with `clang++ -O1 test.cc -o - -S -emit-llvm`
> #include <stdio.h>
> #include <string.h>
> void foo(void* p, size_t len) {
> memmove(p, p, len);
> }
> __attribute__((noinline)) int convert(float* f) {
> memmove(f, f, sizeof(*f));
> return *reinterpret_cast<int*>(f);
> }
> int main() {
> float f = 3.14;
> printf("%d\n", convert(&f));
> }

Lawrence Crowl

Received on 2013-11-08 00:06:53