As a note, the non-trivial versions have consistently been non-`constexpr`. I think that it would be beneficial to support constant initialization of `unaligned<T>` globals. That requires `std::bit_cast` rather than `std::memcpy` (which is not usable in constant expressions), and `std::bit_cast` in turn requires `std::is_trivially_copyable_v<T>`.

On Wed, 13 Dec 2023 at 09:51, Arthur O'Dwyer via Std-Proposals <std-proposals@lists.isocpp.org> wrote:
On Wed, Dec 13, 2023 at 5:21 AM Frederick Virchanza Gotham via Std-Proposals <std-proposals@lists.isocpp.org> wrote:
On Tuesday, December 12, 2023, Thiago Macieira wrote:
On Tuesday, 12 December 2023 15:11:00 -03 Frederick Virchanza Gotham via Std-Proposals wrote:
> I don't know if it would ever make sense to have an
> 'unaligned<std::string>'

It doesn't. Unaligned only makes sense if you're trying to match the layout as
specified by some external ABI, which means it is not std::string or any
complex type.

In a previous post, I gave two use cases for 'unaligned':
(1) [...]
(2) Conserving RAM / EEPROM

Right. I think I agree with "conserving RAM" as a potential use case (although I don't personally think that has enough benefit to justify the cost of standardizing `unaligned`; for most working programmers the benefit will be zero).

In that model, basically, you're not thinking of a `std::unaligned<T>` as being a T object at all. You're thinking of it as being a "ticket for a T" — a bag-of-bits temporarily holding the dehydrated object representation of a T, from which you can reconstitute a T at any later time. ("Just add alignment!")
With that mental model, you might want merely "trivially relocatable" — not because you use memcpy to dehydrate/rehydrate the T, but rather because you're using it as a temporary holding-place in between constructing an instance of T at one address and destroying it at a different address.
Here's how that might look:

T construct();
int main() {
  std::unaligned<T> ut = construct();  // 1
  // ut.~unaligned() is called implicitly // 2
}

Line 1 constructs a prvalue T1, move-constructs T2 from that, then dehydrates (begins-to-relocate) from T2 by copying its bits into the `unaligned`'s storage. Meanwhile T1 is destroyed at the end of the full-expression.
Line 2 rehydrates (finishes-relocating) into a new object T3 by copying T3's bits from the `unaligned`'s storage, and then destroys T3.
Here's the implementation corresponding to the sample code and description above:

template<class T>
struct unaligned {
  char data_[sizeof(T)];
  unaligned(T&& t1) {
    alignas(T) char t2[sizeof(T)];
    ::new (t2) T(std::move(t1));
    memcpy(data_, t2, sizeof(T));
  }
  ~unaligned() {
    alignas(T) char t3[sizeof(T)];
    memcpy(t3, data_, sizeof(T));
    ((T*)t3)->~T();
  }
};

But. In this model, the `unaligned` always holds a ticket for a T. You can't cheaply extract the T from the `unaligned`, because there's no way for the `unaligned` ever to become "empty" (and adding a boolean so that you could store an "empty state" — basically turning this into `unaligned_optional` — defeats the purpose of making this type only sizeof(T) bytes big).
So if you want a .value() method, at the high level it has to look basically like this:

  T value() const {
    alignas(T) char t3[sizeof(T)];
    memcpy(t3, data_, sizeof(T)); // rehydrate from storage into t3
    T result = *(T*)t3; // copy the value
    memcpy(data_, t3, sizeof(T)); // dehydrate from t3 back into storage
    return result;
  }

But notice that the second `memcpy` is pointless — the bytes in storage haven't changed! So we can make this simply:

  T value() const {
    alignas(T) char t3[sizeof(T)];
    memcpy(t3, data_, sizeof(T)); // rehydrate from storage into t3
    return *(T*)t3; // copy the value; storage still holds the original bits
  }

Likewise, the copy constructor would have to look like this:

  unaligned(const unaligned& ut) {
    alignas(T) char t4[sizeof(T)];
    alignas(T) char t5[sizeof(T)];
    memcpy(t4, ut.data_, sizeof(T)); // rehydrate from storage
    ::new (t5) T(*(const T*)t4); // copy-construct the T; it's OK if this throws
    memcpy(data_, t5, sizeof(T)); // dehydrate into new storage
  }

So, this is definitely doable with trivial-relocatability instead of trivial-copyability. (As long as you don't mind that all those memcpys are technically UB.) But it does involve much more careful bookkeeping than if you can just assume trivial-copyability.
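
To check that the bookkeeping balances, the pieces above can be assembled into one class and run against an instrumented type. This is a sketch only: `Tracked`, its `live` counter, and `demo_lifetimes()` are hypothetical names introduced for illustration, and the memcpy-based rehydration is technically UB as already noted, although it behaves as described on mainstream compilers.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <utility>

// Hypothetical test type: not trivially copyable, but safe to relocate bytewise.
struct Tracked {
  static inline int live = 0;  // objects constructed but not yet destroyed
  int payload;
  explicit Tracked(int p) : payload(p) { ++live; }
  Tracked(const Tracked& o) : payload(o.payload) { ++live; }
  Tracked(Tracked&& o) noexcept : payload(o.payload) { ++live; }
  ~Tracked() { --live; }
};

template<class T>
struct unaligned {
  char data_[sizeof(T)];
  unaligned(T&& t1) {
    alignas(T) char t2[sizeof(T)];
    ::new (t2) T(std::move(t1));
    std::memcpy(data_, t2, sizeof(T));     // dehydrate t2
  }
  unaligned(const unaligned& ut) {
    alignas(T) char t4[sizeof(T)];
    alignas(T) char t5[sizeof(T)];
    std::memcpy(t4, ut.data_, sizeof(T));  // rehydrate the source
    ::new (t5) T(*reinterpret_cast<const T*>(t4));
    std::memcpy(data_, t5, sizeof(T));     // dehydrate the copy
  }
  unaligned& operator=(const unaligned&) = delete;  // omitted for brevity
  ~unaligned() {
    alignas(T) char t3[sizeof(T)];
    std::memcpy(t3, data_, sizeof(T));     // rehydrate, then destroy
    reinterpret_cast<T*>(t3)->~T();
  }
  T value() const {
    alignas(T) char t3[sizeof(T)];
    std::memcpy(t3, data_, sizeof(T));     // rehydrate; storage is unchanged
    return *reinterpret_cast<const T*>(t3);
  }
};

// Returns the number of Tracked objects still alive after everything is destroyed.
int demo_lifetimes() {
  {
    unaligned<Tracked> a = Tracked(7);  // the temporary dies; one ticket remains
    assert(Tracked::live == 1);
    unaligned<Tracked> b = a;           // a second, independent ticket
    assert(Tracked::live == 2);
    Tracked v = b.value();              // an ordinary copy of the held value
    assert(v.payload == 7 && Tracked::live == 3);
  }
  return Tracked::live;
}
```

Every construction is matched by exactly one destruction, which is the invariant the ticket model has to maintain by hand.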

This is actually very close to an application of trivial relocatability that I've been discussing with Daniel Anderson within the past month. I've referred to it as "fractional reserve banking for object representations." The idea is that a concurrent/parallel algorithm can lend out "copies" of a trivially relocatable type (such as a unique_ptr) to multiple threads, by handing out copies of its object representation, as long as those threads promise not to modify the object they're given and promise to pay back the loan when they're done (by trivially relocating it back into the originating bank — as in .value() above, this is a no-op because the bank's object representation hasn't changed) and before the bank destroys its original object. This can be accounted as a legal series of sequential loans (relocate out, relocate back), but in fact it doesn't break anything (and so we make a bigger profit faster!) if those loans all happen in parallel. I'm vaguely hoping that we will see a WG21 paper on this specific application sometime in the next year. I doubt we could ever make it technically legal, but it shows that there's a userland appetite for the type-trait even aside from its "traditional" use in STL containers and algorithms.
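
A rough single-threaded sketch of the loan accounting, under my own assumptions: the `loan` class and `sum_through_loans()` are hypothetical names, a plain loop stands in for the parallel borrowers, and the bytewise copy of a `unique_ptr` is exactly the technically-not-legal part described above. It is not meant to represent anyone's actual design.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical "loan": a bytewise copy of a T's object representation.
template<class T>
class loan {
  alignas(T) unsigned char bits_[sizeof(T)];
public:
  explicit loan(const T& banked) { std::memcpy(bits_, &banked, sizeof(T)); }
  // Borrowers may only read. The loan never runs ~T(), so "paying back"
  // is a no-op: the bank's own bytes were never changed.
  const T& get() const { return *reinterpret_cast<const T*>(bits_); }
};

int sum_through_loans(int borrowers) {
  std::unique_ptr<int> bank = std::make_unique<int>(21);
  int total = 0;
  {
    // Several loans are outstanding at once, not one after another.
    std::vector<loan<std::unique_ptr<int>>> loans;
    for (int i = 0; i < borrowers; ++i) loans.emplace_back(bank);
    for (const auto& l : loans) total += *l.get();
  }  // all loans are repaid here, before the bank is destroyed
  return total;  // bank's destructor frees the int exactly once
}
```

The only rule the borrowers must follow is the one stated above: no modification, and repayment before the bank destroys its original object.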

Speaking of its traditional use, I've got a small new paper in the December mailing:
D3055R0 "Relax wording to permit relocation optimizations in the STL"

–Arthur