std-discussion

Re: Is it valid use reinterpret_cast to form pointer to object?

From: Lénárd Szolnoki <cpp_at_[hidden]>
Date: Tue, 01 Aug 2023 07:57:53 +0200
On 1 August 2023 03:42:46 CEST, Thiago Macieira via Std-Discussion <std-discussion_at_[hidden]> wrote:
>On Monday, 31 July 2023 17:21:06 PDT Ville Voutilainen wrote:
>> I don't understand why start_lifetime_as is brought up in this
>> discussion, and where that paper shows
>> the example in the beginning of the thread being UB.
>
>I think it was incorrectly brought up. The answer to the OP is that
>std::launder() is required because a std::byte array and T aren't
>pointer-interconvertible, presuming a new or std::construct_at has been done
>to actually start the lifetime of a T.
>
>However, it was brought up and I read the paper, then I pointed out my
>objections.
>
>> start_lifetime_as
>> convinces the language that a bag of representation bits
>> are a valid object of trivial type.
>
>The paper specifically says that it convinces the language that the lifetime
>starts now, not that it had already started and you're just forming a pointer
>to it.
>
>> But the example in the
>> thread-starting message is not like that at all, it does a placement
>> new of a T object into a storage location, and then separately obtains
>> a pointer to T into that same storage location.
>> If the status quo wording somewhere says that's UB, then we should
>> have an almost trivial issue submission saying "surely not".
>
>I agree with you, but the paper implies it is UB. The language for
>[basic.compound]/4 seems to confirm that.
>
>"If two objects are pointer-interconvertible, then they have the same address,
>and it is possible to obtain a pointer to one from a pointer to the other via
>a reinterpret_cast ([expr.reinterpret.cast])."
>
>Strictly speaking, this is an "if" (sufficient condition), not "if and only if"
>(sufficient and necessary). That means there could be other conditions under
>which reinterpret_cast can be used on pointers without causing UB, but the
>language here strongly implies it is a requirement. Though nothing in
>[expr.reinterpret.cast] corroborates.

That's incorrect. The wording says that the pointer value remains unchanged; that is, it still points to the storage. Nothing says that it would point to the object created with new. This is why you need launder here.

The need for launder is much clearer with regard to compiler optimisations when the same storage is used to hold different objects during its lifetime. But technically you need it even when the storage only ever holds a single object.
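
A minimal sketch of the placement-new case, under the reading above. The type `Widget` and the function `read_widget` are hypothetical names used only for illustration; the buffer is a std::byte array as in the original question.

```cpp
#include <cstddef>
#include <new>

// Hypothetical type, for illustration only.
struct Widget {
    int value;
};

int read_widget() {
    // Raw storage, suitably sized and aligned for a Widget.
    alignas(Widget) std::byte storage[sizeof(Widget)];

    // Placement new starts the lifetime of a Widget inside the storage.
    ::new (static_cast<void*>(storage)) Widget{42};

    // Under the reading above, reinterpret_cast alone leaves the pointer
    // value unchanged (it still points into the byte array), so
    // std::launder is used to obtain a pointer to the Widget object.
    Widget* p = std::launder(reinterpret_cast<Widget*>(storage));
    return p->value;
}
```

Keeping the pointer returned by the placement-new expression itself would avoid the question entirely; the cast plus launder is only needed when the pointer has to be re-derived from the storage.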

>
>> That should also have nothing to do with start_lifetime_as, because
>> that placement-new plus the later address-obtaining and use
>> should work even for non trivial types, for any T, regardless of
>> whether trivial or complex, so start_lifetime_as doesn't solve
>> any problems here anyway.
>
>Agreed but in this case start_lifetime_as wouldn't be the solution either. If
>the object's lifetime has already started, then it should be std::launder
>instead or (as you and I seem to agree) reinterpret_cast.
>
>> > But the standard also says that memcpy() is allowed to implicitly create
>> > objects, something this paper even reminds us of. Since any read() implies
>> > a memcpy() from some other storage, I argue that the object's lifetime
>> > was already started.
>>
>> Well, there's a limited bunch of special functions that the standard
>> recognizes as implicit-lifetime-starting.
>> malloc is one of them, memcpy is another, but not all operations that
>> copy memory are in that bunch.
>
>True, but I argue that this is the very case where it is implicitly starting a
>lifetime, because it is copying a full object whose lifetime was previously
>started elsewhere onto this buffer.
>
>> That's why we need start_lifetime_as, to be able to provide an escape
>> hatch to communicate to the language
>> that such a special operation is in play, even though the standard
>> doesn't explicitly recognize it as such,
>> and that also makes that bunch extensible.
>
>I understand what you're saying, but I only agree in very limited
>circumstances, for automatic or static storage buffers. For dynamic storage,
>you obtained such using malloc() or an equivalent platform-specific blessed
>function (mmap, brk) and/or copied an extant object there with memcpy().
>
>Even for automatic and static storage buffers, I am arguing that read(),
>recv(), recvfrom(), recvmsg() perform a memcpy() from an unseen storage of an
>extant object that was previously write()ten. I also argue that mmap() does
>such a memcpy() too, like a lot of other functions do. Therefore, this should
>not be UB:
>
> struct sockaddr_storage buf;
> socklen_t len = sizeof(buf);
> getsockname(sockfd, reinterpret_cast<sockaddr *>(&buf), &len);
> if (buf.ss_family == AF_INET)
>     do_inet(reinterpret_cast<sockaddr_in *>(&buf));
> else if (buf.ss_family == AF_INET6)
>     do_inet6(reinterpret_cast<sockaddr_in6 *>(&buf));
>
>I advise people to just use a union here and benefit from the common
>initial sequence rule, but the language of pointer-interconvertibility and the
>fact that getsockname() performed a memcpy() from such a union type means the
>code above must not be UB.
>
>The actual implementation inside the Linux kernel does not have the union. It
>does however start the lifetime of a sockaddr_in or sockaddr_in6 object (using
>the C language rules for that) then memcpy()s onto your buffer, so
>std::launder() would be correct. The FreeBSD implementation is even more
>explicit by using malloc()
>https://github.com/freebsd/freebsd-src/blob/main/sys/netinet/in_pcb.c#L1832
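
For comparison, a sketch of a variant of the quoted getsockname() pattern that sidesteps the pointer question by copying the bytes into a separately declared object with memcpy(), which is well-defined for trivially copyable types. The structs `Storage` and `AddrV4` and the helper `copy_out` are hypothetical stand-ins for sockaddr_storage and sockaddr_in, not the real POSIX definitions.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for sockaddr_storage: a family tag plus padding.
struct Storage {
    unsigned short family;
    unsigned char  pad[126];
};

// Hypothetical stand-in for sockaddr_in.
struct AddrV4 {
    unsigned short family;
    unsigned short port;
    unsigned char  addr[4];
};

// Copy the leading bytes of the generic buffer into a fresh T.
// No pointer to T is ever formed into the buffer, so neither
// reinterpret_cast nor std::launder is involved.
template <class T>
T copy_out(const Storage& buf) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "memcpy requires a trivially copyable type");
    static_assert(sizeof(T) <= sizeof(Storage),
                  "T must fit inside the generic buffer");
    T out;
    std::memcpy(&out, &buf, sizeof(T));
    return out;
}
```

The cost is one extra copy of a small struct; whether that is acceptable depends on the call site.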

Received on 2023-08-01 05:58:00