SG12

Subject: Re: [ub] Is memmove a general way to change the type of memory?
From: Lawrence Crowl (Lawrence_at_[hidden])
Date: 2013-11-07 17:06:51


On 11/5/13, Jeffrey Yasskin <jyasskin_at_[hidden]> wrote:
> Adam (cc'ed) found an interesting result yesterday:
>
> Say we have a region of memory written by an unknown process:
> int* content = new int[sizeof(Foo)/sizeof(int)];
> initialize(content);
>
> We want to interpret the bytes in that region of memory as the object
> representation of a 'Foo'. We can't just reinterpret_cast them to a
> Foo, since then we're violating [basic.lval]. However, we know how to
> reinterpret the bytes of a float as the bytes of an int:
>
> float f = 3.14f;
> uint32_t i;
> static_assert(sizeof(i) == sizeof(f), "...");
> memcpy(&i, &f, sizeof(i));
> use(i); // Unspecified value, but implementations often define it.
>
> So:
>
> Foo f;
> memcpy(&f, content, sizeof(Foo));
> use(f); // Similar caveat.
>
> We can certainly transform this to:
> char tmp[sizeof(Foo)];
> memcpy(tmp, content, sizeof(Foo));
> memcpy(&f, tmp, sizeof(Foo));
> use(f);
>
> But in between the memcpy()s there, we're no longer using 'content',
> so to save memory let's transform that to:
>
> char tmp[sizeof(Foo)];
> memcpy(tmp, content, sizeof(Foo));
> Foo* foo = reinterpret_cast<Foo*>(content);
> memcpy(foo, tmp, sizeof(Foo));
> use(*foo);

The reinterpret_cast is suspicious to me because there is no aliasing
permitted between int[] and Foo.

>
> But memmove() is defined as "The memmove function copies n characters
> from the object pointed to by s2 into the object pointed to by s1.
> Copying takes place as if the n characters from the object pointed to
> by s2 are first copied into a temporary array of n characters that does
> not overlap the objects pointed to by s1 and s2, and then the n
> characters from the temporary array are copied into the object pointed
> to by s1." in C99, and C++14 delegates to C99 for memmove()'s
> definition.

I think that phrasing was intended to deal with overlap, not aliasing.
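
For illustration, here is the overlap case that the temporary-array wording
covers. This is a minimal sketch; shift_right_by_one is a hypothetical helper
name, not something from the thread. The source and destination ranges share
bytes, which memmove permits and memcpy does not:

```cpp
#include <cstring>

// Shifts the first `count` bytes of `buf` one position to the right.
// Source [buf, buf + count) and destination [buf + 1, buf + 1 + count)
// overlap, so memmove (not memcpy) is the right call.
// Precondition: `buf` has room for at least count + 1 bytes.
void shift_right_by_one(char* buf, std::size_t count) {
    std::memmove(buf + 1, buf, count);
}
```

The result is as if the bytes had first been copied to a scratch buffer. The
wording says nothing about which types may later be used to read the
destination.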

> So we can optimize our implementation to:
>
> Foo* foo = reinterpret_cast<Foo*>(content);
> memmove(foo, content, sizeof(Foo));
> use(*foo);
>
> The type of a variable isn't generally thought to help with aliasing
> violations, so we should be fine transforming this to:
>
> memmove(content, content, sizeof(Foo));
> use(*reinterpret_cast<Foo*>(content));
>
> Leading to the conclusion that memmove() is the explicit way to reset
> the type of any block of memory. Clang and gcc successfully optimize
> this self-copy away, so it's even a free reinterpretation.

But by optimizing it away, any boundary in lifetimes is probably lost.

>
> Now, this is a little crazy, and Clang's TBAA annotations on the
> program below appear to indicate an aliasing violation, so I'd like to
> invite this list to point out where my logic's broken.

I think there may be interpretations of [basic.life] that affect your
logic.

Perhaps your last step is an issue in the compiler, but frankly I think
the standard is weak in mechanisms for programmers to indicate when an
object lifetime ends.
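
One mechanism the language does offer is reusing storage through placement
new. The following is a minimal sketch under stated assumptions: Foo is a
hypothetical trivially copyable type, retype_as_foo is an illustrative name
that does not appear in the thread, and the caller guarantees that the storage
is large enough and suitably aligned for Foo:

```cpp
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical stand-in for the thread's Foo.
struct Foo {
    unsigned a;
    unsigned b;
};

static_assert(std::is_trivially_copyable<Foo>::value,
              "byte-wise copying requires a trivially copyable type");

// Precondition (assumed, not checked): `storage` points to at least
// sizeof(Foo) bytes that are suitably aligned for Foo.
Foo* retype_as_foo(void* storage) {
    unsigned char tmp[sizeof(Foo)];
    std::memcpy(tmp, storage, sizeof(Foo));  // save the existing bytes
    Foo* foo = new (storage) Foo;            // reuse the storage: a Foo now lives here
    std::memcpy(foo, tmp, sizeof(Foo));      // restore the bytes into the Foo
    return foo;
}
```

Here the placement new marks the point at which the old objects stop existing
and the Foo begins to exist, rather than leaving that boundary implicit in a
self-copy. The value the Foo ends up with carries the same caveat as the
memcpy examples in the quoted message.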

> Jeffrey
>
> --------
>
> // Compiled with `clang++ -O1 test.cc -o - -S -emit-llvm`
>
> #include <stdio.h>
> #include <string.h>
>
> void foo(void* p, size_t len) {
>   memmove(p, p, len);
> }
>
> __attribute__((noinline)) int convert(float* f) {
>   memmove(f, f, sizeof(*f));
>   return *reinterpret_cast<int*>(f);
> }
>
> int main() {
>   float f = 3.14;
>   printf("%d\n", convert(&f));
> }

-- 
Lawrence Crowl

SG12 list run by sg12-owner@lists.isocpp.org