C++ Logo

std-proposals

Advanced search

Re: [std-proposals] Relocation in C++

From: Julien Villemure-Fréchette <julien.villemure_at_[hidden]>
Date: Tue, 16 Aug 2022 13:50:18 -0400
> Generally speaking, a try-with-init ctor sounds dangerous. What happens if a subobject ctor throws, and the catch block absorbs the exception (e.g. does not call `throw` to propagate it). This means you would end up with a semi-constructed object, with some of its subobjects left uninitialized? It's even more hazardous if the constructor is a relocation constructor as some of the source subobjects will not be destructed.
The exception must be propagated somehow, the caller must know that its object was not properly constructed.

That can never happen: a constructor catch block cannot return normally (it has an implicit rethrow); and only the constructed member subobjects are automatically destroyed before entry to ctor catch block (hence you shouldn't access any non-static data members from the catch block). Furthermore, only if the exception was thrown from the body of a delegating ctor, then the dtor of this object is called before entry to the ctor catch block (hence you can't refer to 'this' from within the catch block).

> The issue is finding a way to add the initialization into the; the position between the `try` and `{` of a function-try-block is currently free real estate.

There is a builtin solutions for this:
Compose your class with private inheritance and/or use base from member idiom:
https://www.boost.org/doc/libs/1_80_0/libs/utility/doc/html/utility/utilities/base_from_member.

Note: The implementation of allocator aware containers (STL containers, ex std::vector) make extensive use of composition inheritance as a strategy to ensure memory allocation/deallocation is done before/after the container elements are constructed/destroyed.

Julien V.


-------- Original Message --------
From: "Sébastien Bini via Std-Proposals" <std-proposals_at_[hidden]>
Sent: August 16, 2022 9:25:34 a.m. EDT
To: Edward Catmur <ecatmur_at_[hidden]>
Cc: "Sébastien Bini" <sebastien.bini_at_[hidden]>, Std-Proposals <std-proposals_at_[hidden]>
Subject: Re: [std-proposals] Relocation in C++

On Mon, Aug 15, 2022 at 1:06 AM Edward Catmur <ecatmur_at_[hidden]>
wrote:

> On Sun, 14 Aug 2022 at 20:35, Edward Catmur <ecatmur_at_[hidden]>
> wrote:
>
>> On Tue, 2 Aug 2022 at 09:37, Sébastien Bini <sebastien.bini_at_[hidden]>
>> wrote:
>>
>>> What about a separate proposal that would allow you to inject custom
>>> code on any constructor, before the initializer list? I would have
>>> personally found it useful on some occasions, and could prove its full
>>> usefulness with relocation:
>>>
>>> class T
>>> {
>>> public:
>>> T(T&& rhs) : std::unique_ptr<U> guard{MakeGuard(rhs)},
>>> T{std::move(rhs), 0} { guard.release(); }
>>> // or with relocation ctor:
>>> T(T rhs) : auto guard{MakeGuard(rhs)}, auto
>>> guard2{MakeSecondGuard(rhs)} /* default memberwise relocation of subojects
>>> is done here automatically */ { guard2.release(); guard.release(); }
>>> private:
>>> T(T&& rhs, int); // guarded ctor
>>> };
>>>
>>> The idea is to declare and initialize new names before the nominal list
>>> of initializers (delegating constructors or subobjects). The introduced
>>> objects would be recognized as their type and names appear clearly. Their
>>> destructor would be called at scope exit (even if due to an exception), in
>>> reverse declaration order as usual.
>>>
>>
>> Yes, this is what I'm suggesting as try-with-init. The issue is finding a
>> way to add the initialization into the; the position between the `try` and
>> `{` of a function-try-block is currently free real estate. Adding it
>> directly before the base-and-member-initialization list would be ambiguous
>> both to the compiler and to the reader.
>>
>
> Sorry, this should be the `try` and the `:` (introducing the
> *ctor-initializer*), of course.
>

To contextualize, you are suggesting something of the sort:

class T : B
{
  D _d;
public:
  T(T rhs) : try (auto guard = MakeGuard(rhs)) /* optionally subobject
initialization: */ B{rhs}, _d{rhs._d}
  {
    /* reloc ctor body */
  }
  catch (std::exception const&)
  {
    /* catch body */
  }
};

Generally speaking, a try-with-init ctor sounds dangerous. What happens if
a subobject ctor throws, and the catch block absorbs the exception (e.g.
does not call `throw` to propagate it). This means you would end up with a
semi-constructed object, with some of its subobjects left uninitialized?
It's even more hazardous if the constructor is a relocation constructor as
some of the source subobjects will not be destructed.
The exception must be propagated somehow, the caller must know that its
object was not properly constructed.

But I don't get why you would need a try-with-init approach: the newly
introduced object (e.g. `guard`) should clean-up its resources in its
destructor, which will be called automatically or by stack-unwinding if a
subobject ctor throws. What you need is, IMO, to have the ability to
initialize an extra object before the subobjects initialization.

The syntax I proposed is not ambiguous to the compiler as the newly
introduced names must declare their type (so they cannot be data members)
and name (so it cannot be a delegating/main class ctor call):

T(T rhs) : auto guard{MakeGuard(rhs)}, B{rhs}, _d{rhs._d} {}

I admit it may be confusing to the reader. But as long as we agree on the
solution, we are free to come up with any, clearer syntax we want:
T(T rhs) : [guard = MakeGuard(rhs)] B{rhs}, _d{rhs._d} {}

As you pointed out we cannot be satisfied with the default implementation,
>>> as doing the memberwise reloc-assign is wrong. I can only think of two ways
>>> of making reloc-assignments (x = reloc y;): either by using (a)
>>> std::swap, or by relying on the uncanny (b) destruct + placement new on
>>> this. Doing reloc-assignment using (a) comes at the cost of three
>>> std::relocate_at (for the swap) and one destructor call (y is destructed at
>>> the end of the swap):
>>>
>>> x = reloc y; // if done by std::swap, would cost 3 std::relocate_at,
>>> + destructor call on y.
>>>
>>> Doing reloc-assign using (b) is cheaper as it only costs one
>>> std::relocate_at call and one destructor call (on x):
>>>
>>> x = reloc y; // could be the same as std::destroy_at(&x);
>>> std::relocate_at(&y, &x);
>>>
>>> For (b) to work though, the destructor of y must not be called, as if it
>>> were used in reloc initialization:
>>>
>>> auto x = reloc y; // in this context, should x have a reloc ctor,
>>> the destructor of y is not called.
>>>
>>> I suggest the following way to declare a reloc-assignment operator, that
>>> would use (b) implementation:
>>>
>>> class T
>>> {
>>> public:
>>> // [...]
>>> T& operator=(T rhs) = reloc; // same as { std::destroy_at(this);
>>> return *std::relocate_at(&rhs, this); }
>>> };
>>>
>>> This declares a prvalue assignment operator, and defines it with a
>>> special implementation. We do not use = default; as users may expect
>>> memberwise relocation which is not what's happening. This = reloc; also
>>> allows us to make it a special function so it can share similarities with
>>> the reloc constructor: the destructor of rhs is not called at caller site,
>>> and rhs is not actually passed by value to the operator=() despite its
>>> signature, but by reference. This allows us to make the same set of
>>> optimizations as with the relocation constructor.
>>>
>>> The destroy + placement new approach is considered safe. Since rhs is a
>>> prvalue (or behind the scene, a reference to a prvalue), its address cannot
>>> be (by construction) this or any subobject of *this. As such
>>> std::destroy_at(this); cannot destruct rhs as a side effect.
>>>
>>> If an exception leaks through this reloc-assignment operator then *this
>>> will likely be in a destructed state (the exception was either thrown by
>>> std::destroy_at(this), leaving the object in a somewhat destructed state,
>>> or by std::relocate_at which will have failed to reconstruct *this
>>> properly). That destructed state may have disastrous consequences if *this
>>> is an object with automatic storage whose destructor will inevitably be
>>> called at some point (probably even during the stack unwinding incurred by
>>> the exception leak), which will result in calling a destructor on a
>>> destructed object. For this reason, I suggest that declaring this
>>> reloc-assign operator yields a compilation error if the destructor and the
>>> relocation operations are not noexcept. Note that this forcefully renders
>>> the reloc-assign operator noexcept. Also, std::relocate_at needs to be
>>> valid so the type must have a copy, move or relocation ctor.
>>>
>>> I further suggest adding an extra declaration: T& operator=(T rhs) =
>>> reloc(auto); which will silently ignore the reloc-assignment operator
>>> declaration if the above requirements are not met (useful for template
>>> code).
>>>
>>> Note that if a class does not provide a reloc-assignment operator,
>>> instructions such as `x = reloc y;` are still possible. reloc merely
>>> transforms y into a prvalue. Overload resolution will simply pick the
>>> move-assignment operator or copy-assignment operator instead if provided.
>>>
>>
>> I'm not too sure. The thing is, this isn't too difficult to write by
>> hand; even the exception safety stuff can be done with static_assert and
>> requires (for reloc(auto)).
>>
>
You can write it by hand, but you cannot write by hand that the source
object is not copied, nor that its destructor is not called. The = reloc
provides that magic bit for us, as it does for the relocation ctor.


> But also, there's a safer way to implement relocating assignment, which is
>> (as always) copy-and-swap:
>>
>> T& operator=(T rhs) { rhs.swap(*this); return *this; }
>>
>> We should be encouraging the use of copy-and-swap idiom (well, without
>> the copy) rather than the unsafe destroy-and-relocate. The only problem is
>> that there isn't a way to automatically generate a memberwise swap...
>>
>
I agree that having =reloc using destroy-and-relocate may encourage users
to use that idiom in places where it is not necessarily safe. Copy-and-swap
is always safe although not as optimized in this precise use case...

I believe it is of prime importance to have the reloc-assignment operator
to do exactly as intended. That means in `y = reloc x;`: we want the
resources of `x` to be (mem)moved to `y`, `y` being cleaned beforehand, and
to skip the destructor call on `x`. copy-and-swap does not quite fit in
here in my opinion.

What feels unsafe about destroy-and-relocate is that we destruct an object
within a member function. We can tweak things around and turn the
reloc-assignment operator into a static function behind the scene:
T& operator=(T src) = reloc; // same as: static T& assign_reloc(T& dst, T&
src) { std::destroy_at(&dst); return *std::relocate_at(&src, &dst); }

Here we no longer have the scary std::destroy_at(this), and we avoid
self-destruction in a member function (which may be an UB?).

I fear that if we fallback to copy-and-swap then determined users will rely
on dirty hacks to use some unsafe, home-made destroy-and-relocate instead.

-- Julien Villemure
Sent from my Android device with K-9 Mail.

Received on 2022-08-16 17:51:18