ISOCPP std-proposals List: Re: [std-proposals] [[packed]] std::unaligned

From: Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
Date: Wed, 13 Dec 2023 09:51:31 -0500

On Wed, Dec 13, 2023 at 5:21 AM Frederick Virchanza Gotham via
Std-Proposals <std-proposals_at_[hidden]> wrote:

> On Tuesday, December 12, 2023, Thiago Macieira wrote:
>
>> On Tuesday, 12 December 2023 15:11:00 -03 Frederick Virchanza Gotham via
>> Std-Proposals wrote:
>> > I don't know if it would ever make sense to have an
>> > 'unaligned<std::string>'
>>
>> It doesn't. Unaligned only makes sense if you're trying to match the
>> layout as specified by some external ABI, which means it is not
>> std::string or any complex type.
>>
>
> In a previous post, I gave two use cases for 'unaligned':
> (1) [...]
> (2) Conserving RAM / EEPROM
>

Right. I think I agree with "conserving RAM" as a potential use case
(although I don't personally think that has enough benefit to justify the
cost of standardizing `unaligned`; for most working programmers the benefit
will be zero).
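
To make the RAM-conservation case concrete, here is a minimal sketch. It
assumes a hypothetical `unaligned_pod<T>` (my own name, not anything from
the proposal) restricted to trivially copyable T, so the memcpys are
well-defined. The sizes in the comments assume a typical ABI where
`std::uint32_t` has 4-byte alignment.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the proposed std::unaligned<T>,
// restricted to trivially copyable T so that memcpy is well-defined.
template<class T>
struct unaligned_pod {
  static_assert(std::is_trivially_copyable_v<T>);
  unsigned char data_[sizeof(T)];  // alignment 1, same size as T

  void store(const T& t) { std::memcpy(data_, &t, sizeof(T)); }
  T load() const {
    T t;
    std::memcpy(&t, data_, sizeof(T));
    return t;
  }
};

struct Padded { char tag; std::uint32_t n; };                 // usually 8 bytes
struct Packed { char tag; unaligned_pod<std::uint32_t> n; };  // usually 5 bytes

static_assert(alignof(unaligned_pod<std::uint32_t>) == 1);
static_assert(sizeof(Packed) <= sizeof(Padded));

// Round-trip a value through the byte-aligned field.
inline void demo() {
  Packed p{'x', {}};
  p.n.store(123456789u);
  assert(p.n.load() == 123456789u);
}
```

On such an ABI an array of `Packed` avoids three padding bytes per
element, at the cost of a memcpy on every access.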

In that model, basically, you're not thinking of a `std::unaligned<T>` as
being a T object at all. You're thinking of it as being a "ticket for a T"
— a bag-of-bits temporarily holding the dehydrated object representation of
a T, from which you can reconstitute a T at any later time. ("Just add
alignment!")
With that mental model, you *might* require T to be merely "trivially
relocatable" rather than trivially copyable: not because you use memcpy to
dehydrate/rehydrate the T, but rather because you're using the `unaligned`
as a *temporary holding-place in between* constructing an instance of T at
one address and destroying it at a different address.
Here's how that might look:

T construct();
int main() {
  std::unaligned<T> ut = construct(); // 1
  // ut.~unaligned() is called implicitly // 2
}

Line 1 constructs a prvalue T1, move-constructs T2 from that, then
*dehydrates* (begins-to-relocate) from T2 by copying its bits into the
`unaligned`'s storage. Meanwhile T1 is destroyed at the end of the
full-expression.
Line 2 *rehydrates* (finishes-relocating) into a new object T3 by copying
T3's bits from the `unaligned`'s storage, and then destroys T3.
Here's the implementation corresponding to the sample code and description
above:

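#include <string.h> // memcpy
#include <new>      // placement new
#include <utility>  // std::move

template<class T>
struct unaligned {
  char data_[sizeof(T)];
  unaligned(T&& t1) {
    alignas(T) char t2[sizeof(T)];
    ::new (t2) T(std::move(t1)); // move-construct T2 from T1
    memcpy(data_, t2, sizeof(T)); // dehydrate T2 into storage
  }
  ~unaligned() {
    alignas(T) char t3[sizeof(T)];
    memcpy(t3, data_, sizeof(T)); // rehydrate from storage into T3
    ((T*)t3)->~T(); // destroy T3
  }
};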
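
To check the T1/T2/T3 bookkeeping described above, here is a small
instrumented sketch. `Tracker` and `demo` are hypothetical names
introduced only for this illustration. The rehydration step is still
technically UB, but it behaves as described on mainstream compilers for a
type like this one that holds only an int.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <utility>

// Hypothetical instrumented type: counts constructions and destructions.
struct Tracker {
  static inline int constructed = 0;
  static inline int destroyed = 0;
  int id;
  explicit Tracker(int i) : id(i) { ++constructed; }
  Tracker(Tracker&& o) noexcept : id(o.id) { ++constructed; }
  ~Tracker() { ++destroyed; }
};

template<class T>
struct unaligned {
  char data_[sizeof(T)];
  unaligned(T&& t1) {
    alignas(T) char t2[sizeof(T)];
    ::new (t2) T(std::move(t1));        // T2: move-constructed from T1
    std::memcpy(data_, t2, sizeof(T));  // dehydrate; T2 is never destroyed
  }
  ~unaligned() {
    alignas(T) char t3[sizeof(T)];
    std::memcpy(t3, data_, sizeof(T));  // rehydrate as T3
    reinterpret_cast<T*>(t3)->~T();     // destroy T3 (technically UB)
  }
};

Tracker construct() { return Tracker(42); }

inline void demo() {
  {
    unaligned<Tracker> ut = construct();
    // T1 was built by construct(), T2 by the move; T1 is already gone.
    assert(Tracker::constructed == 2 && Tracker::destroyed == 1);
  }
  // ~unaligned() destroyed T3, so constructions and destructions balance.
  assert(Tracker::constructed == 2 && Tracker::destroyed == 2);
}
```

The counts balance because T2's destructor is never run: its lifetime is
carried through the `unaligned`'s bytes and ends as T3.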

But. In this model, the `unaligned` *always* holds a ticket for a T. You
can't cheaply extract the T from the `unaligned`, because there's no way
for the `unaligned` ever to become "empty" (and adding a boolean so that
you could store an "empty state" — basically turning this into
`unaligned_optional` — defeats the purpose of making this type only
sizeof(T) bytes big).
So if you want a .value() method, at the high level it has to look
basically like this:

  T value() const {
    alignas(T) char t3[sizeof(T)];
    memcpy(t3, data_, sizeof(T)); // rehydrate from storage into t3
    T result = *(T*)t3; // copy the value
    memcpy(data_, t3, sizeof(T)); // dehydrate from t3 back into storage
    return result;
  }

But notice that the second `memcpy` is pointless — the bytes in storage
haven't changed! So we can make this simply:

  T value() const {
    alignas(T) char t3[sizeof(T)];
    memcpy(t3, data_, sizeof(T)); // rehydrate from storage into t3
    return *(T*)t3; // copy the value; storage still holds the ticket
  }

Likewise, the copy constructor would have to look like this:

  unaligned(const unaligned& ut) {
    alignas(T) char t4[sizeof(T)];
    alignas(T) char t5[sizeof(T)];
    memcpy(t4, ut.data_, sizeof(T)); // rehydrate from storage
    ::new (t5) T(*(const T*)t4); // copy-construct the T; it's OK if this throws
    memcpy(data_, t5, sizeof(T)); // dehydrate into new storage
  }
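
Putting the pieces together, here is a sketch that exercises .value() and
the copy constructor with `std::shared_ptr<int>`, which is not trivially
copyable but is trivially relocatable in practice on the mainstream
standard libraries (that is an assumption, not a guarantee). `demo` is a
hypothetical helper, and the memcpy-based rehydration remains technically
UB.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <new>
#include <utility>

template<class T>
struct unaligned {
  char data_[sizeof(T)];

  unaligned(T&& t1) {
    alignas(T) char t2[sizeof(T)];
    ::new (t2) T(std::move(t1));
    std::memcpy(data_, t2, sizeof(T));              // dehydrate
  }
  unaligned(const unaligned& ut) {
    alignas(T) char t4[sizeof(T)];
    alignas(T) char t5[sizeof(T)];
    std::memcpy(t4, ut.data_, sizeof(T));           // rehydrate from storage
    ::new (t5) T(*reinterpret_cast<const T*>(t4));  // copy-construct the T
    std::memcpy(data_, t5, sizeof(T));              // dehydrate into new storage
  }
  ~unaligned() {
    alignas(T) char t3[sizeof(T)];
    std::memcpy(t3, data_, sizeof(T));
    reinterpret_cast<T*>(t3)->~T();
  }
  T value() const {
    alignas(T) char t3[sizeof(T)];
    std::memcpy(t3, data_, sizeof(T));              // rehydrate
    return *reinterpret_cast<const T*>(t3);         // copy, never move
  }
};

inline void demo() {
  std::weak_ptr<int> watch;
  {
    auto sp = std::make_shared<int>(7);
    watch = sp;
    unaligned<std::shared_ptr<int>> ut(std::move(sp));  // ticket owns the int
    assert(watch.use_count() == 1);
    std::shared_ptr<int> v = ut.value();                // independent copy
    assert(*v == 7 && watch.use_count() == 2);
    unaligned<std::shared_ptr<int>> ut2(ut);            // second ticket
    assert(watch.use_count() == 3);
  }
  assert(watch.expired());  // each owner was released exactly once
}
```

If .value() moved out of the rehydrated object instead of copying it, the
bytes left in storage would describe a shared_ptr that no longer owns
anything it claims to, and the destructor would release the int a second
time.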

So, this is definitely doable with trivial-relocatability instead of
trivial-copyability. (As long as you don't mind that all those memcpys are
*technically* UB.) But it does involve much more careful bookkeeping than
if you can just assume trivial-copyability.

This is actually very close to an application of trivial relocatability
that I've been discussing with Daniel Anderson within the past month. I've
referred to it as "fractional reserve banking for object representations."
The idea is that a concurrent/parallel algorithm can lend out "copies" of a
trivially relocatable type (such as a unique_ptr) to multiple threads, by
handing out copies of its object representation, as long as those threads
promise not to modify the object they're given and promise to pay back the
loan when they're done (by trivially relocating it back into the
originating bank — as in .value() above, this is a no-op because the bank's
object representation hasn't changed) and before the bank destroys its
original object. This can be accounted as a legal series of *sequential*
loans (relocate out, relocate back), but in fact it doesn't break anything
(and so we make a bigger profit faster!) if those loans all happen *in
parallel*. I'm vaguely hoping that we will see a WG21 paper on this
specific application sometime in the next year. I doubt we could ever make
it *technically* legal, but it shows that there's a userland appetite for
the type-trait even aside from its "traditional" use in STL containers and
algorithms.

Speaking of its traditional use, I've got a small new paper in the December
mailing:
D3055R0 "Relax wording to permit relocation optimizations in the STL"
<https://quuxplusone.github.io/draft/d3055-relocation.html>

–Arthur

Received on 2023-12-13 14:51:46