ISOCPP std-proposals List: Re: [std-proposals] Relocation in C++

From: Sébastien Bini <sebastien.bini_at_[hidden]>
Date: Mon, 30 May 2022 13:37:58 +0200

Hello all,

Sorry for the late reply. I'll try to respond thoroughly.

> POD is a deprecated term; presumably you mean trivial? It is not
observable if a trivial class is relocated by move+destroy, so there is no
ABI break for these.

Yes, trivial. And agreed.

*About the ABI break:*

To summarize what has been said:
- Non-trivial types need to opt-in to relocation by providing their own
operator reloc(). If I understand correctly, this would involve an ABI
change on functions where such types are passed by value, so that function
parameters can be relocated (or more precisely, their destructor call can
be alleviated).
- Types can also opt-out of the ABI break by adding a potentially
non-standardized attribute on their class definition. In which cases such
function parameters can only be relocated by move constructor, the
destructor being called at the caller site.
- The proposal can then force STL types (for those where it makes sense) to
implement their own operator reloc(), while library vendors are free to
opt-out of the ABI break.
- And yet we provide full support for relocatable-only types, even if they
have STL types as data members.

I am not an expert on ABI preservation, but that seems like a nice way to
go.

*About the synthesized relocation:*

> Yes, that's exactly what I'm suggesting. Composition is necessary - here
this means being able to combine a relocate-only and a move-destroy type as
data members of a class, and have that aggregate remain usable.
>
> What problems do you foresee?

I like that users don't have to explicitly call the destructor on the moved
part, should they mean to write the initializer explicitly. But I feel
mixed about this.

Let's consider the following scenario:

struct DB
{
    DB(DB&&) = default;
    void start();
    void end();
};

struct Transaction
{
    Transaction(Transaction&&) = default;
    explicit Transaction(DB& db); // calls DB::start
    ~Transaction() noexcept; // calls DB::end unless *this is in moved-from
state
};

struct T
{
    DB _db;
    Transaction _trans;

    operator reloc(T&& src) = default;
};

DB and Transaction are not relocatable as their operator reloc() is
implicitly deleted. As such T's operator reloc will use synthesized
relocation (move+destroy) from DB and Transaction. But it is important
that, in operator reloc(), members are constructed in the order they are
declared, and source members destructed in that reversed order.

This means that operator reloc() will:

   1. Initialize each base class and data member in declaration order (_db
   and then _trans).
   2. Initialization is done either with trivial relocation, operator
   reloc(), move constructor, or is user-provided.
   3. Right after the initialization is done, and just before the operator
   reloc() function body is entered, the necessary destructors are called. The
   destructor of all base classes and data-members that got initialized by
   move or user code is called on the source object in the reversed
   declaration order (src._trans and then src._db).

Now we pray that src._trans destructor will have no side effect on src._db
(normally it shouldn't as it knows it has been moved-from). But we are
walking on thin ice here. If DB had an operator reloc(), then src._db would
be considered destructed before src._trans destructor is called (which is
unusual in C++). If it had any effect on src._db, then it would end up
modifying a destructed object.

*About the exception safety:*

I admit there are no strong arguments against proper exception handling.
Here is what I can come up with:

Operator reloc() acts in three stages: (a) the base classes and data
members initialisation, (b) destruction in reversed declaration order on
any subobject of the source object that did not get true relocation
(synthesized relocation or user provided initialization), (c) the function
body.

If an exception leaks through in stage (a) then:

   1. in reversed declaration order, call the destructor of all initialized
   subobjects.
   2. in reversed declaration order, call the destructor of the following
   source subobjects:

   - all subobjects that did not get relocated yet
      - the subobjects that would get destructed in stage (b)
      - if the initialization that threw did not happen through true
      relocation, then the matching subobject. (If the initialization
happened by
      operator reloc() then we know that the source subobject is considered
      destructed.)

We call the destructor on the new instance subobjects first as they were
constructed more recently.
If any extra exception leaks through during those destructor calls, then
std::terminate is called (as it is done through stack-unwinding).

If an exception leaks through in stage (b) then:

   - All subobjects of the new instance are destructed in reversed
   declaration order.
   - All the remaining destructors of the source subobject are called in
   the same order.
   - If any extra exception leaks through during those destructor calls,
   then std::terminate is called (as it is done through stack-unwinding).

If an exception leaks through in stage (c) then:

   - All subobjects of the new instance are destructed in reversed
   declaration order.
   - If any extra exception leaks through during those destructor calls,
   then std::terminate is called (as it is done through stack-unwinding).

Last, delegating constructors with operator reloc() should also be allowed:
`operator reloc(T&& src) : T{std::move(src)} {}`.

   - In that case `src` is destructed right before the function body is
   entered.
   - If an exception leaks through the delegating constructor, then `src`
   destructor is called normally and the exception is propagated:
      - The new instance does not need to be destructed as the delegating
      constructor already took care of that.
      - If an extra exception leaks through the destructor of `src` then
      std::terminate is called.

All this looks nice, but I can't help to think of what would happen when a
destructor is called on an unmoved subobject of the source that has a
side-effect on an already relocated subobject. (see below)

*Exceptions and synthesized relocation:*

What happens now if someone adds an "operator reloc() = default;" to DB (or
worse, if this is done automatically should DB be trivial)? Let's rewrite T:

struct Throw
{
    operator reloc(Throw&&) { throw std::runtime_error{"oops"}; }
};

struct T
{
    DB _db;
    Throw _thrower;
    Transaction _trans;

    operator reloc(T&& src) = default;
};

In T's operator reloc():

   1. _db is initialized using its own operator reloc(). src._db is then
   considered destructed.
   2. _thrower will thow. Using the rules described above, operator reloc()
   needs then to destruct this->_db and then src._trans, before propagating
   the exception.
   3. this->_db.~DB() is called without any issue.
   4. src._trans.~Transaction() is called while src._trans is not in its
   moved-from state, so it will act on an object that theoretically reached
   its end of life (call src._db.end() while src._db is destructed). Ouch.

This is even more painful when we know that T default move constructor does
the right thing: when the exception is thrown, this->_db is destructed. src
continues to live as normal, even though src._db is in a moved-from state.
Given that DB can handle a call to DB::end() in a moved-from state, no
problem will arise.

We could argue that T should not be relocatable and that it's all bad user
code. But still I wonder how much synthesized relocation and exceptions may
break things...

On the other hand, we could state that synthesized relocation does not
happen, i.e. defaulting operator reloc() does not fallback on move+destruct
on unrelocatable subobjects. operator reloc() would then be implicitly
deleted if any base class or data-member has a (potentially implicitly)
deleted operator reloc().

Then users will be forced to write their own operator reloc() which has at
least the benefit of questioning them on whether their class should be
relocatable to begin with, and if yes, how to do it properly.

Best regards,
Sébastien

Received on 2022-05-30 11:38:11