On Mon, 30 May 2022 at 05:38, Sébastien Bini <sebastien.bini@gmail.com> wrote:
About the synthesized relocation:

> Yes, that's exactly what I'm suggesting. Composition is necessary - here this means being able to combine a relocate-only and a move-destroy type as data members of a class, and have that aggregate remain usable.
>
> What problems do you foresee?

I like that users don't have to explicitly call the destructor on the moved part, should they mean to write the initializer explicitly. But I feel mixed about this.

Let's consider the following scenario:

struct DB
{
    DB(DB&&) = default;
    void start();
    void end();
};

struct Transaction
{
    Transaction(Transaction&&) = default;
    explicit Transaction(DB& db); // calls DB::start
    ~Transaction() noexcept; // calls DB::end unless *this is in moved-from state
};

struct T
{
    DB _db;
    Transaction _trans;

    operator reloc(T&& src) = default;
};


DB and Transaction are not relocatable as their operator reloc() is implicitly deleted. As such T's operator reloc will use synthesized relocation (move+destroy) from DB and Transaction. But it is important that, in operator reloc(), members are constructed in the order they are declared, and source members destructed in that reversed order.

This means that operator reloc() will:
  1. Initialize each base class and data member in declaration order (_db and then _trans).
  2. Initialization is done either with trivial relocation, operator reloc(), move constructor, or is user-provided.
  3. Right after the initialization is done, and just before the operator reloc() function body is entered, the necessary destructors are called. The destructor of all base classes and data-members that got initialized by move or user code is called on the source object in the reversed declaration order (src._trans and then src._db).
Now we pray that src._trans destructor will have no side effect on src._db (normally it shouldn't as it knows it has been moved-from). But we are walking on thin ice here. If DB had an operator reloc(), then src._db would be considered destructed before src._trans destructor is called (which is unusual in C++). If it had any effect on src._db, then it would end up modifying a destructed object.

The defaulted operator reloc can't work here anyway, since Transaction needs to be informed that DB has relocated. 

So Transaction must already have a ctor Transaction(Transaction&&, DB&) and the relocator of T must invoke that ctor:

T::operator reloc(T&& src) : _trans(std::move(src._trans), _db) {}

Or, if Transaction has its own operator reloc, then it must in addition have a rebind(DB&) member function, and the relocator of T must invoke that function:

T::operator reloc(T&& src) { _trans.rebind(_db); }

T is committed to do one of these, since it is responsible for maintaining the relationship between DB and Transaction.

An alternative is that DB keeps a list of pending (bound) transactions and automatically causes those to rebind when DB is moved. This is significantly safer, and not excessively difficult to implement. Or, possibly, DB can use pImpl idiom and Transaction can store a reference to that heap-allocated DB::Impl object. In either of these cases, the defaulted operator reloc will work fine.

About the exception safety:

I admit there are no strong arguments against proper exception handling. Here is what I can come up with:

Operator reloc() acts in three stages: (a) the base classes and data members initialisation, (b) destruction in reversed declaration order on any subobject of the source object that did not get true relocation (synthesized relocation or user provided initialization), (c) the function body.

If an exception leaks through in stage (a) then:
  1. in reversed declaration order, call the destructor of all initialized subobjects.
  2. in reversed declaration order, call the destructor of the following source subobjects:
    • all subobjects that did not get relocated yet
    • the subobjects that would get destructed in stage (b)
    • if the initialization that threw did not happen through true relocation, then the matching subobject. (If the initialization happened by operator reloc() then we know that the source subobject is considered destructed.)
We call the destructor on the new instance subobjects first as they were constructed more recently.
If any extra exception leaks through during those destructor calls, then std::terminate is called (as it is done through stack-unwinding).

If an exception leaks through in stage (b) then:
  • All subobjects of the new instance are destructed in reversed declaration order.
  • All the remaining destructors of the source subobject are called in the same order.
  • If any extra exception leaks through during those destructor calls, then std::terminate is called (as it is done through stack-unwinding).
If an exception leaks through in stage (c) then:
  • All subobjects of the new instance are destructed in reversed declaration order.
  • If any extra exception leaks through during those destructor calls, then std::terminate is called (as it is done through stack-unwinding).
Last, delegating constructors with operator reloc() should also be allowed: `operator reloc(T&& src) : T{std::move(src)} {}`.
  • In that case `src` is destructed right before the function body is entered.
  • If an exception leaks through the delegating constructor, then `src` destructor is called normally and the exception is propagated:
    • The new instance does not need to be destructed as the delegating constructor already took care of that.
    • If an extra exception leaks through the destructor of `src` then std::terminate is called.
All this looks nice, but I can't help to think of what would happen when a destructor is called on an unmoved subobject of the source that has a side-effect on an already relocated subobject. (see below)

Exceptions and synthesized relocation:

What happens now if someone adds an "operator reloc() = default;" to DB (or worse, if this is done automatically should DB be trivial)? Let's rewrite T:

struct Throw
{
    operator reloc(Throw&&) { throw std::runtime_error{"oops"}; }
};

struct T
{
    DB _db;
    Throw _thrower;
    Transaction _trans;

    operator reloc(T&& src) = default;
};

In T's operator reloc():
  1. _db is initialized using its own operator reloc(). src._db is then considered destructed.
  2. _thrower will thow. Using the rules described above, operator reloc() needs then to destruct this->_db and then src._trans, before propagating the exception.
  3. this->_db.~DB() is called without any issue.
  4. src._trans.~Transaction() is called while src._trans is not in its moved-from state, so it will act on an object that theoretically reached its end of life (call src._db.end() while src._db is destructed). Ouch.
This is even more painful when we know that T default move constructor does the right thing: when the exception is thrown, this->_db is destructed. src continues to live as normal, even though src._db is in a moved-from state. Given that DB can handle a call to DB::end() in a moved-from state, no problem will arise.

We could argue that T should not be relocatable and that it's all bad user code. But still I wonder how much synthesized relocation and exceptions may break things...

On the other hand, we could state that synthesized relocation does not happen, i.e. defaulting operator reloc() does not fallback on move+destruct on unrelocatable subobjects. operator reloc() would then be implicitly deleted if any base class or data-member has a (potentially implicitly) deleted operator reloc().

Then users will be forced to write their own operator reloc() which has at least the benefit of questioning them on whether their class should be relocatable to begin with, and if yes, how to do it properly.

This is the same issue as before: T::T(T&&) = default is incorrect, since Transaction must be informed that DB has relocated, unless (as above) DB is aware of Transaction or is pImpl. It is unreasonable to expect the default operator reloc to work in a circumstance where the default move constructor does not.

Assuming that either of the above safety improvements is made, such that T::T(T&&) works, we must conclude that the exception cleanup destroys src._trans before dest._db.

If neither of the above safety improvements obtains (perhaps Transaction is a third-party bolt-on) then the move constructor T::T(T&&) must unbind Transaction before invoking DB::DB(DB&&), probably via constructor deferral. If all the subobject types are movable, then T::operator reloc(T&&) can also defer to a constructor, as you define above. However, if some data member is immovable, we will need some other mechanism to unbind Transaction before relocating DB, which is why I suggest adding an init-statement to occur before the ctor-initializer in an operator reloc function-try-block.