On Mon, Aug 15, 2022 at 1:06 AM Edward Catmur <ecatmur@googlemail.com> wrote:

On Sun, 14 Aug 2022 at 20:35, Edward Catmur <ecatmur@googlemail.com> wrote:
On Tue, 2 Aug 2022 at 09:37, Sébastien Bini <sebastien.bini@gmail.com> wrote:
What about a separate proposal that would allow you to inject custom code on any constructor, before the initializer list? I would have personally found it useful on some occasions, and could prove its full usefulness with relocation:

class T
{
public:
T(T&& rhs) : std::unique_ptr<U> guard{MakeGuard(rhs)}, T{std::move(rhs), 0} { guard.release(); }
// or with relocation ctor:
T(T rhs) : auto guard{MakeGuard(rhs)}, auto guard2{MakeSecondGuard(rhs)} /* default memberwise relocation of subojects is done here automatically */ { guard2.release(); guard.release(); }
private:
T(T&& rhs, int); // guarded ctor
};

The idea is to declare and initialize new names before the nominal list of initializers (delegating constructors or subobjects). The introduced objects would be recognized as their type and names appear clearly. Their destructor would be called at scope exit (even if due to an exception), in reverse declaration order as usual.

Yes, this is what I'm suggesting as try-with-init. The issue is finding a way to add the initialization into the; the position between the `try` and `{` of a function-try-block is currently free real estate. Adding it directly before the base-and-member-initialization list would be ambiguous both to the compiler and to the reader.

Sorry, this should be the `try` and the `:` (introducing the *ctor-initializer*), of course.

To contextualize, you are suggesting something of the sort:

class T : B

{

D _d;

public:

T(T rhs) : try (auto guard = MakeGuard(rhs)) /* optionally subobject initialization: */ B{rhs}, _d{rhs._d}

{

/* reloc ctor body */

}

catch (std::exception const&)

{

/* catch body */

}

};

Generally speaking, a try-with-init ctor sounds dangerous. What happens if a subobject ctor throws, and the catch block absorbs the exception (e.g. does not call `throw` to propagate it). This means you would end up with a semi-constructed object, with some of its subobjects left uninitialized? It's even more hazardous if the constructor is a relocation constructor as some of the source subobjects will not be destructed.

The exception must be propagated somehow, the caller must know that its object was not properly constructed.

But I don't get why you would need a try-with-init approach: the newly introduced object (e.g. `guard`) should clean-up its resources in its destructor, which will be called automatically or by stack-unwinding if a subobject ctor throws. What you need is, IMO, to have the ability to initialize an extra object before the subobjects initialization.

The syntax I proposed is not ambiguous to the compiler as the newly introduced names must declare their type (so they cannot be data members) and name (so it cannot be a delegating/main class ctor call):

T(T rhs) : auto guard{MakeGuard(rhs)}, B{rhs}, _d{rhs._d} {}

I admit it may be confusing to the reader. But as long as we agree on the solution, we are free to come up with any, clearer syntax we want:

T(T rhs) : [guard = MakeGuard(rhs)] B{rhs}, _d{rhs._d} {}

As you pointed out we cannot be satisfied with the default implementation, as doing the memberwise reloc-assign is wrong. I can only think of two ways of making reloc-assignments (x = reloc y;): either by using (a) std::swap, or by relying on the uncanny (b) destruct + placement new on this. Doing reloc-assignment using (a) comes at the cost of three std::relocate_at (for the swap) and one destructor call (y is destructed at the end of the swap):

    x = reloc y; // if done by std::swap, would cost 3 std::relocate_at, + destructor call on y.

Doing reloc-assign using (b) is cheaper as it only costs one std::relocate_at call and one destructor call (on x):

    x = reloc y; // could be the same as std::destroy_at(&x); std::relocate_at(&y, &x);

For (b) to work though, the destructor of y must not be called, as if it were used in reloc initialization:

    auto x = reloc y; // in this context, should x have a reloc ctor, the destructor of y is not called.

I suggest the following way to declare a reloc-assignment operator, that would use (b) implementation:

class T
{
public:
// [...]
T& operator=(T rhs) = reloc; // same as { std::destroy_at(this); return *std::relocate_at(&rhs, this); }
};

This declares a prvalue assignment operator, and defines it with a special implementation. We do not use = default; as users may expect memberwise relocation which is not what's happening. This = reloc; also allows us to make it a special function so it can share similarities with the reloc constructor: the destructor of rhs is not called at caller site, and rhs is not actually passed by value to the operator=() despite its signature, but by reference. This allows us to make the same set of optimizations as with the relocation constructor.

The destroy + placement new approach is considered safe. Since rhs is a prvalue (or behind the scene, a reference to a prvalue), its address cannot be (by construction) this or any subobject of *this. As such std::destroy_at(this); cannot destruct rhs as a side effect.

If an exception leaks through this reloc-assignment operator then *this will likely be in a destructed state (the exception was either thrown by std::destroy_at(this), leaving the object in a somewhat destructed state, or by std::relocate_at which will have failed to reconstruct *this properly). That destructed state may have disastrous consequences if *this is an object with automatic storage whose destructor will inevitably be called at some point (probably even during the stack unwinding incurred by the exception leak), which will result in calling a destructor on a destructed object. For this reason, I suggest that declaring this reloc-assign operator yields a compilation error if the destructor and the relocation operations are not noexcept. Note that this forcefully renders the reloc-assign operator noexcept. Also, std::relocate_at needs to be valid so the type must have a copy, move or relocation ctor.

I further suggest adding an extra declaration: T& operator=(T rhs) = reloc(auto); which will silently ignore the reloc-assignment operator declaration if the above requirements are not met (useful for template code).

Note that if a class does not provide a reloc-assignment operator, instructions such as `x = reloc y;` are still possible. reloc merely transforms y into a prvalue. Overload resolution will simply pick the move-assignment operator or copy-assignment operator instead if provided.

I'm not too sure. The thing is, this isn't too difficult to write by hand; even the exception safety stuff can be done with static_assert and requires (for reloc(auto)).

You can write it by hand, but you cannot write by hand that the source object is not copied, nor that its destructor is not called. The = reloc provides that magic bit for us, as it does for the relocation ctor.

But also, there's a safer way to implement relocating assignment, which is (as always) copy-and-swap:

T& operator=(T rhs) { rhs.swap(*this); return *this; }

We should be encouraging the use of copy-and-swap idiom (well, without the copy) rather than the unsafe destroy-and-relocate. The only problem is that there isn't a way to automatically generate a memberwise swap...

I agree that having =reloc using destroy-and-relocate may encourage users to use that idiom in places where it is not necessarily safe. Copy-and-swap is always safe although not as optimized in this precise use case...

I believe it is of prime importance to have the reloc-assignment operator to do exactly as intended. That means in `y = reloc x;`: we want the resources of `x` to be (mem)moved to `y`, `y` being cleaned beforehand, and to skip the destructor call on `x`. copy-and-swap does not quite fit in here in my opinion.

What feels unsafe about destroy-and-relocate is that we destruct an object within a member function. We can tweak things around and turn the reloc-assignment operator into a static function behind the scene:

T& operator=(T src) = reloc; // same as: static T& assign_reloc(T& dst, T& src) { std::destroy_at(&dst); return *std::relocate_at(&src, &dst); }

Here we no longer have the scary std::destroy_at(this), and we avoid self-destruction in a member function (which may be an UB?).

I fear that if we fallback to copy-and-swap then determined users will rely on dirty hacks to use some unsafe, home-made destroy-and-relocate instead.