ISOCPP std-proposals List: Re: [std-proposals] Relocation in C++

From: Sébastien Bini <sebastien.bini_at_[hidden]>
Date: Tue, 16 Aug 2022 15:25:34 +0200

On Mon, Aug 15, 2022 at 1:06 AM Edward Catmur <ecatmur_at_[hidden]>
wrote:

> On Sun, 14 Aug 2022 at 20:35, Edward Catmur <ecatmur_at_[hidden]>
> wrote:
>
>> On Tue, 2 Aug 2022 at 09:37, Sébastien Bini <sebastien.bini_at_[hidden]>
>> wrote:
>>
>>> What about a separate proposal that would allow you to inject custom
>>> code on any constructor, before the initializer list? I would have
>>> personally found it useful on some occasions, and could prove its full
>>> usefulness with relocation:
>>>
>>> class T
>>> {
>>> public:
>>> T(T&& rhs) : std::unique_ptr<U> guard{MakeGuard(rhs)},
>>> T{std::move(rhs), 0} { guard.release(); }
>>> // or with relocation ctor:
>>> T(T rhs) : auto guard{MakeGuard(rhs)},
>>> auto guard2{MakeSecondGuard(rhs)}
>>> /* default memberwise relocation of subobjects is done here automatically */
>>> { guard2.release(); guard.release(); }
>>> private:
>>> T(T&& rhs, int); // guarded ctor
>>> };
>>>
>>> The idea is to declare and initialize new names before the nominal list
>>> of initializers (delegating constructors or subobjects). The introduced
>>> objects would be recognized as their type and names appear clearly. Their
>>> destructor would be called at scope exit (even if due to an exception), in
>>> reverse declaration order as usual.
>>>
>>
>> Yes, this is what I'm suggesting as try-with-init. The issue is finding a
>> way to add the initialization into the syntax; the position between the `try` and
>> `{` of a function-try-block is currently free real estate. Adding it
>> directly before the base-and-member-initialization list would be ambiguous
>> both to the compiler and to the reader.
>>
>
> Sorry, this should be the `try` and the `:` (introducing the
> *ctor-initializer*), of course.
>

To contextualize, you are suggesting something of the sort:

class T : B
{
  D _d;
public:
  T(T rhs) try (auto guard = MakeGuard(rhs))
    : B{rhs}, _d{rhs._d} /* subobject initialization is optional */
  {
    /* reloc ctor body */
  }
  catch (std::exception const&)
  {
    /* catch body */
  }
};

Generally speaking, a try-with-init ctor sounds dangerous. What happens if
a subobject ctor throws and the catch block absorbs the exception (i.e.
does not rethrow it with `throw;`)? Would you end up with a
semi-constructed object, with some of its subobjects left uninitialized?
It's even more hazardous if the constructor is a relocation constructor, as
some of the source subobjects will not be destructed.
The exception must be propagated somehow: the caller must know that its
object was not properly constructed.
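
For reference, here is how today's function-try-blocks behave on a
constructor: when control reaches the end of the handler, the exception is
rethrown implicitly, so the handler cannot absorb it. Whether a try-with-init
ctor would keep that rule is an assumption on my side. A minimal sketch (Part,
Whole and handler_runs are made-up names for illustration):

```cpp
#include <exception>
#include <stdexcept>

int handler_runs = 0;  // hypothetical counter, only here to observe the handler

struct Part {
    explicit Part(bool fail) {
        if (fail) throw std::runtime_error("Part ctor failed");
    }
};

struct Whole {
    Part part;

    // Function-try-block: the handler also covers the mem-initializers.
    explicit Whole(bool fail) try : part{fail} {
        // constructor body
    } catch (const std::exception&) {
        ++handler_runs;
        // No explicit `throw;` here: the exception is still rethrown
        // implicitly when control reaches the end of this handler.
    }
};

// Returns true if the caller still observes the exception.
bool caller_sees_exception() {
    try {
        Whole w{true};
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

So with the current rules the caller is always notified; the open question is
only what a new try-with-init form would specify.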

But I don't get why you would need a try-with-init approach: the newly
introduced object (e.g. `guard`) should clean up its resources in its
destructor, which will be called automatically or by stack unwinding if a
subobject ctor throws. What you need is, IMO, the ability to
initialize an extra object before the subobject initialization.
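
To show the behaviour I have in mind with today's C++, the guard can be
emulated by passing a temporary to a delegated-to constructor: the temporary
outlives the subobject initialization and is destroyed by stack unwinding if a
subobject ctor throws. This is only a sketch of the idea, not the proposed
syntax; Guard, Member and the `cleaned` flag are hypothetical names.

```cpp
#include <stdexcept>

// Hypothetical guard: records in `cleaned` whether its cleanup ran.
struct Guard {
    bool& cleaned;
    bool armed;
    explicit Guard(bool& c) : cleaned(c), armed(true) {}
    Guard(const Guard&) = delete;
    ~Guard() { if (armed) cleaned = true; }
    void release() { armed = false; }
};

struct Member {
    explicit Member(bool fail) {
        if (fail) throw std::runtime_error("Member ctor failed");
    }
};

class T {
    Member m_;

    // Target ctor: `guard` is alive during subobject init and the body.
    T(bool fail, Guard&& guard) : m_{fail} { guard.release(); }

public:
    // The Guard temporary is created before any subobject is initialized.
    T(bool fail, bool& cleaned) : T(fail, Guard{cleaned}) {}
};

// The guard's destructor runs during unwinding when Member throws.
bool cleanup_ran_when_member_throws() {
    bool cleaned = false;
    try {
        T t{true, cleaned};
    } catch (const std::runtime_error&) {
    }
    return cleaned;
}

// On success the guard is released, so no cleanup happens.
bool cleanup_ran_on_success() {
    bool cleaned = false;
    T t{false, cleaned};
    return cleaned;
}
```

No try block is needed for the cleanup itself, which is the point I am trying
to make.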

The syntax I proposed is not ambiguous to the compiler as the newly
introduced names must declare their type (so they cannot be data members)
and name (so it cannot be a delegating/main class ctor call):

T(T rhs) : auto guard{MakeGuard(rhs)}, B{rhs}, _d{rhs._d} {}

I admit it may be confusing to the reader. But as long as we agree on the
solution, we are free to come up with any clearer syntax we want, for example:
T(T rhs) : [guard = MakeGuard(rhs)] B{rhs}, _d{rhs._d} {}

>>> As you pointed out we cannot be satisfied with the default implementation,
>>> as doing the memberwise reloc-assign is wrong. I can only think of two ways
>>> of making reloc-assignments (x = reloc y;): either by using (a)
>>> std::swap, or by relying on the uncanny (b) destruct + placement new on
>>> this. Doing reloc-assignment using (a) comes at the cost of three
>>> std::relocate_at (for the swap) and one destructor call (y is destructed at
>>> the end of the swap):
>>>
>>> x = reloc y; // if done by std::swap, would cost 3 std::relocate_at
>>>              // + destructor call on y.
>>>
>>> Doing reloc-assign using (b) is cheaper as it only costs one
>>> std::relocate_at call and one destructor call (on x):
>>>
>>> x = reloc y; // could be the same as std::destroy_at(&x);
>>>              // std::relocate_at(&y, &x);
>>>
>>> For (b) to work though, the destructor of y must not be called, as if it
>>> were used in reloc initialization:
>>>
>>> auto x = reloc y; // in this context, should x have a reloc ctor,
>>>                   // the destructor of y is not called.
>>>
>>> I suggest the following way to declare a reloc-assignment operator, that
>>> would use (b) implementation:
>>>
>>> class T
>>> {
>>> public:
>>> // [...]
>>> T& operator=(T rhs) = reloc; // same as { std::destroy_at(this);
>>>                              // return *std::relocate_at(&rhs, this); }
>>> };
>>>
>>> This declares a prvalue assignment operator, and defines it with a
>>> special implementation. We do not use = default; as users may expect
>>> memberwise relocation which is not what's happening. This = reloc; also
>>> allows us to make it a special function so it can share similarities with
>>> the reloc constructor: the destructor of rhs is not called at caller site,
>>> and rhs is not actually passed by value to the operator=() despite its
>>> signature, but by reference. This allows us to make the same set of
>>> optimizations as with the relocation constructor.
>>>
>>> The destroy + placement new approach is considered safe. Since rhs is a
>>> prvalue (or behind the scene, a reference to a prvalue), its address cannot
>>> be (by construction) this or any subobject of *this. As such
>>> std::destroy_at(this); cannot destruct rhs as a side effect.
>>>
>>> If an exception leaks through this reloc-assignment operator then *this
>>> will likely be in a destructed state (the exception was either thrown by
>>> std::destroy_at(this), leaving the object in a somewhat destructed state,
>>> or by std::relocate_at which will have failed to reconstruct *this
>>> properly). That destructed state may have disastrous consequences if *this
>>> is an object with automatic storage whose destructor will inevitably be
>>> called at some point (probably even during the stack unwinding incurred by
>>> the exception leak), which will result in calling a destructor on a
>>> destructed object. For this reason, I suggest that declaring this
>>> reloc-assign operator yields a compilation error if the destructor and the
>>> relocation operations are not noexcept. Note that this forcefully renders
>>> the reloc-assign operator noexcept. Also, std::relocate_at needs to be
>>> valid so the type must have a copy, move or relocation ctor.
>>>
>>> I further suggest adding an extra declaration: T& operator=(T rhs) =
>>> reloc(auto); which will silently ignore the reloc-assignment operator
>>> declaration if the above requirements are not met (useful for template
>>> code).
>>>
>>> Note that if a class does not provide a reloc-assignment operator,
>>> instructions such as `x = reloc y;` are still possible. reloc merely
>>> transforms y into a prvalue. Overload resolution will simply pick the
>>> move-assignment operator or copy-assignment operator instead if provided.
>>>
>>
>> I'm not too sure. The thing is, this isn't too difficult to write by
>> hand; even the exception safety stuff can be done with static_assert and
>> requires (for reloc(auto)).
>>
>
You can write it by hand, but you cannot write by hand that the source
object is not copied, nor that its destructor is not called. The = reloc
provides that magic bit for us, as it does for the relocation ctor.
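
To make that concrete, here is a hand-written destroy-and-reconstruct
assignment in today's C++, with move construction standing in for relocation.
It is only a sketch: Tracked and the two counters are hypothetical, and the
numbers describe this emulation, not the proposed = reloc.

```cpp
#include <new>
#include <type_traits>
#include <utility>

int moves = 0;  // hypothetical counters, only here to observe the cost
int dtors = 0;

struct Tracked {
    int value;

    explicit Tracked(int v) noexcept : value(v) {}
    Tracked(Tracked&& other) noexcept : value(other.value) {
        other.value = -1;
        ++moves;
    }
    ~Tracked() { ++dtors; }

    // Hand-written destroy-and-reconstruct assignment.
    Tracked& operator=(Tracked rhs) noexcept {
        static_assert(std::is_nothrow_move_constructible<Tracked>::value &&
                      std::is_nothrow_destructible<Tracked>::value,
                      "destroy-and-reconstruct must not throw");
        this->~Tracked();  // what std::destroy_at(this) would do
        return *::new (static_cast<void*>(this)) Tracked(std::move(rhs));
    }
};

struct Counts { int moves; int dtors; int value; };

Counts hand_written_assign() {
    Tracked x{1};
    Tracked y{2};
    moves = 0;
    dtors = 0;
    x = std::move(y);  // rhs is materialized from y, then destructed
    Counts c{moves, dtors, x.value};
    return c;
}
```

The parameter costs one extra move and one extra destructor call, and y is
still destructed at the end of its scope. That is the part that cannot be
removed by hand.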

>> But also, there's a safer way to implement relocating assignment, which is
>> (as always) copy-and-swap:
>>
>> T& operator=(T rhs) { rhs.swap(*this); return *this; }
>>
>> We should be encouraging the use of copy-and-swap idiom (well, without
>> the copy) rather than the unsafe destroy-and-relocate. The only problem is
>> that there isn't a way to automatically generate a memberwise swap...
>>
>
I agree that having = reloc use destroy-and-relocate may encourage users
to use that idiom in places where it is not necessarily safe. Copy-and-swap
is always safe, although not as optimized in this precise use case...
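
For comparison, a minimal swap-based assignment in today's C++ could look like
the sketch below (Buffer is a made-up class, and a move stands in for the
relocation). Note that the source object stays alive in a moved-from state, so
its destructor still runs later.

```cpp
#include <memory>
#include <utility>

class Buffer {
    std::unique_ptr<int> data_;

public:
    explicit Buffer(int v) : data_(new int(v)) {}
    Buffer(Buffer&&) noexcept = default;

    void swap(Buffer& other) noexcept { data_.swap(other.data_); }

    // Swap-based assignment: the old contents of *this end up in rhs and
    // are released when rhs is destroyed.
    Buffer& operator=(Buffer rhs) noexcept {
        rhs.swap(*this);
        return *this;
    }

    bool empty() const noexcept { return !data_; }
    int value() const { return *data_; }
};

struct SwapResult { int dst_value; bool src_empty; };

SwapResult swap_assign_demo() {
    Buffer x{1};
    Buffer y{2};
    x = std::move(y);  // y is left empty but is still destructed at scope exit
    return SwapResult{x.value(), y.empty()};
}
```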

I believe it is of prime importance that the reloc-assignment operator
does exactly what is intended. In `y = reloc x;` that means: we want the
resources of `x` to be (mem)moved to `y`, `y` being cleaned beforehand, and
the destructor call on `x` to be skipped. Copy-and-swap does not quite fit in
here in my opinion.

What feels unsafe about destroy-and-relocate is that we destruct an object
within a member function. We can tweak things around and turn the
reloc-assignment operator into a static function behind the scenes:
T& operator=(T src) = reloc;
// same as: static T& assign_reloc(T& dst, T& src)
// { std::destroy_at(&dst); return *std::relocate_at(&src, &dst); }

Here we no longer have the scary std::destroy_at(this), and we avoid
self-destruction in a member function (which may be UB?).
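
A rough emulation of that static form in today's C++ could look as follows.
std::relocate_at does not exist yet, so the relocate_at below is a
hypothetical stand-in (move-construct, then destroy the source), and the
source lives in raw storage so that no destructor runs on it afterwards.

```cpp
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the proposed std::relocate_at:
// move-construct into dst, then end the lifetime of src.
template <class T>
T* relocate_at(T* src, T* dst) noexcept {
    static_assert(std::is_nothrow_move_constructible<T>::value &&
                  std::is_nothrow_destructible<T>::value,
                  "relocation must not throw");
    T* result = ::new (static_cast<void*>(dst)) T(std::move(*src));
    src->~T();
    return result;
}

// Free-function form of destroy-and-relocate: no destruction of *this
// from inside a member function.
template <class T>
T& assign_reloc(T& dst, T& src) noexcept {
    dst.~T();  // what std::destroy_at(&dst) would do
    return *relocate_at(&src, &dst);
}

std::string assign_reloc_demo() {
    std::string x = "old";
    // y lives in raw storage, so nothing destructs it automatically.
    alignas(std::string) unsigned char storage[sizeof(std::string)];
    std::string* y = ::new (static_cast<void*>(storage)) std::string("new value");
    assign_reloc(x, *y);  // y's lifetime ends inside relocate_at
    return x;
}
```

The static_assert plays the role of the noexcept requirement discussed
earlier in the thread.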

I fear that if we fall back to copy-and-swap then determined users will rely
on dirty hacks to use some unsafe, home-made destroy-and-relocate instead.

Received on 2022-08-16 13:25:47