ISOCPP std-proposals List: Re: [std-proposals] Relocation in C++

From: Giuseppe D'Angelo <giuseppe.dangelo_at_[hidden]>
Date: Mon, 2 May 2022 18:23:07 +0200

On 02/05/2022 17:22, Sébastien Bini wrote:
> Hi again Giuseppe,
>
> > I think this is what I meant by "move semantics plus the p1144 bits",
> > that is, you need to tag types for which this can happen.
>
> Yes and no, the relocation constructor can be implicitly declared and
> defined.
>
> > What do you mean precisely by "must remain usable"?
> >
> > A moved-from instance must be able to:
> >
> > 1) safely be destroyed
> > 2) safely be reassigned to
> > 3) (possibly) safely be self-swapped
> >
> > In 100% of the classes I've written, implementing move semantics in a
> > way that satisfies 1) will also make it satisfy 2) and 3) with no extra
> > work.
>
> I get your point. But I argue that these rules aren't well understood
> nor have reached a consensus.

But, ultimately, the paper has to weigh relocation vs. partially-formed
moves. The community's stance may be split, but that means that there's
a significant portion of users for which move implies partially formed,
and thus for which you need strong motivations in order to propose
further changes to the status quo.

> The first thing that comes to mind is a P-impl class. How do you deal
> with the P-impl pointer of the moved-from instance? Either you leave it
> to nullptr, or you reallocate it directly.
> - In the first case you will need to make a null pointer check in each
> public function, or else using the class on a moved-from instance will
> lead to a crash. I agree that reusing the moved-from instance without
> reassigning it may lead to unexpected results, but I wouldn't expect a
> crash.
> - In the second case you will make a memory allocation that is most
> likely unneeded as most developers discard moved objects right away.
>
> Those problems do not appear if relocation is used instead of move. The
> P-impl pointer can simply be left to null, and the destructor will be a
> no-op. We don't introduce live objects with a null P-impl pointer.
>
> In a more general case, the moved-from state may break class invariants
> (like P-impl != nullptr). The relocation constructor solves this design
> "flaw": a class instance can still be moved and the moved instance can
> be left in a dirty state.

In Qt we've been using pimpl since... forever. Our moved-from state for
pimpl'd value classes is partially formed (pimpl is nullptr); see the
relevant entries in QUIP-0019.

There are *no* checks done anywhere, except (if needed) in the
destructor and the assignment operators (if not implemented as
copy/move-and-swap or pure swap). Usually such checks are not even
there, because stuff like `delete pimpl` (or similarly using smart
pointers) just works™ even if pimpl == nullptr.
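
To make the above concrete, here is a minimal sketch of such a class. `Widget`, `Private` and `pimpl_demo` are made-up names for illustration, not Qt API; the only assumption is a raw owning pimpl pointer that is null in the moved-from state.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical pimpl'd value class (illustrative names, not Qt API).
class Widget {
private:
    struct Private { std::string name; };
    Private *d;  // null only in the moved-from (partially formed) state

public:
    explicit Widget(std::string name) : d(new Private{std::move(name)}) {}
    Widget(const Widget &other) : d(new Private(*other.d)) {}
    // Move constructor: steal the pimpl, leave the source with d == nullptr.
    Widget(Widget &&other) noexcept : d(std::exchange(other.d, nullptr)) {}
    // Copy/move-and-swap assignment: works on a moved-from target, no check.
    Widget &operator=(Widget other) noexcept { std::swap(d, other.d); return *this; }
    // Deleting a null pointer is a no-op, so no check here either.
    ~Widget() { delete d; }

    // No null check: the precondition is that *this is not moved-from.
    const std::string &name() const { return d->name; }
};

std::string pimpl_demo() {
    Widget a("first");
    Widget b(std::move(a));  // a is now partially formed
    a = Widget("second");    // reassigning a moved-from object is safe
    Widget c("third");
    Widget e(std::move(c));  // c stays moved-from and is only destroyed
    assert(e.name() == "third");
    return b.name() + "," + a.name();
}
```

Calling `pimpl_demo()` exercises only the operations a moved-from object has to support (destruction and reassignment); calling `name()` on `c` after the move would be a precondition violation.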

===

Aside: this allowed for a very straightforward upgrade from C++98 code,
which has also been the case for Qt (coming from 1994). Say you've got a
value class that fully honours the Rule Of Three, you want to add the
move operations and make it honour the Rule Of Five. In the partially
formed design, the only things you need to do are:

1) add a move constructor (that leaves the source object in a
partially-formed state)
2) add a move assignment operator (possibly idiomatically, via
move-and-swap or pure-swap, so no real implementation work to do here)
3) check that the copy assignment works on moved-from objects (if needed,
copy-and-swap does the job)
4) check that the destructor works on moved-from objects

and that's it.
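
A sketch of those four steps on a small C++98-style class follows. `Label`, `dup` and `upgrade_demo` are hypothetical names; the assumed class invariant is `str_ != nullptr` for every object that has not been moved from.

```cpp
#include <cassert>
#include <cstring>
#include <utility>

// Hypothetical value class. Invariant outside the moved-from state:
// str_ != nullptr.
class Label {
public:
    explicit Label(const char *s) : str_(dup(s)) {}

    // Rule Of Three, as already present in the C++98 code.
    Label(const Label &o) : str_(dup(o.str_)) {}
    // Step 3: copy-and-swap works even when *this is moved-from.
    Label &operator=(const Label &o) { Label tmp(o); swap(tmp); return *this; }
    // Step 4: delete[] on a null pointer is a no-op.
    ~Label() { delete[] str_; }

    // Step 1: move constructor, leaves the source partially formed.
    Label(Label &&o) noexcept : str_(std::exchange(o.str_, nullptr)) {}
    // Step 2: move assignment via pure swap.
    Label &operator=(Label &&o) noexcept { swap(o); return *this; }

    void swap(Label &o) noexcept { std::swap(str_, o.str_); }
    const char *c_str() const { return str_; }

private:
    static char *dup(const char *s) {
        char *p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
    char *str_;
};

bool upgrade_demo() {
    Label a("one");
    Label b(std::move(a));  // a is now moved-from
    Label c("three");
    a = c;                  // copy-assign into a moved-from object
    Label d(std::move(c));  // c is now moved-from
    c = Label("four");      // move-assign into a moved-from object
    Label e(std::move(d));  // d stays moved-from and is only destroyed
    assert(std::strcmp(e.c_str(), "three") == 0);
    return std::strcmp(a.c_str(), "three") == 0
        && std::strcmp(b.c_str(), "one") == 0
        && std::strcmp(c.c_str(), "four") == 0;
}
```

Note that none of the pre-existing member functions had to change; only the two move operations were added.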

This approach is a massive win over the alternatives:

* adding a move constructor that leaves the source in a well-defined
state would end up being much more expensive than the partially-formed
approach. Amongst other things, it would impede noexcept(true) if one has
to reallocate the pimpl pointer;

* making pimpl == nullptr well-formed is a change in the class
invariants (cf. G. Romer's `IndirectInt` example), which would require
reviewing and possibly adapting *the entire implementation of the class* in
order to accommodate the new invariant. This is a complete non-starter
for codebases as big as Qt.
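
The noexcept point in the first alternative can be sketched as follows. `Impl`, `Cheap` and `Costly` are hypothetical names; `Costly` assumes that "well-defined moved-from state" means the source gets a freshly allocated pimpl.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

struct Impl { int value = 0; };

// Partially formed design: the move just steals the pointer and cannot throw.
class Cheap {
public:
    Cheap() : d(std::make_unique<Impl>()) {}
    Cheap(Cheap &&other) noexcept : d(std::move(other.d)) {}
private:
    std::unique_ptr<Impl> d;
};

// "Well-defined source" design: the move allocates a fresh Impl for the
// source, so it may throw std::bad_alloc and cannot honestly be noexcept.
class Costly {
public:
    Costly() : d(std::make_unique<Impl>()) {}
    Costly(Costly &&other) : d(std::make_unique<Impl>()) { d.swap(other.d); }
    int value() const { return d->value; }
private:
    std::unique_ptr<Impl> d;
};

static_assert(std::is_nothrow_move_constructible<Cheap>::value,
              "stealing the pimpl is noexcept");
static_assert(!std::is_nothrow_move_constructible<Costly>::value,
              "reallocating the pimpl gives up noexcept");

int costly_source_after_move() {
    Costly a;
    Costly b(std::move(a));  // pays for one extra allocation
    (void)b;
    return a.value();        // a is still fully usable
}
```

Losing `noexcept` has knock-on effects: for instance, `std::vector` reallocation falls back to copying (when a copy constructor is available) for types whose move constructor may throw.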

Add to the mix the classes that in C++98 honour the rule of three by
means of the rule of zero, so in C++11 they also honour the rule of
five, and for which the compiler-generated move operations leave the
source object in a partially formed state (again, `IndirectInt`,
`flat_map`, ...). No-one is going to audit ALL of Qt's value classes to
make them not crash if used when moved-from.
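
For illustration, a rule-of-zero class in that spirit might look like the sketch below. It is only loosely modelled on the `IndirectInt` idea and is not a reproduction of the original example; `holds_value()` and `rule_of_zero_demo` are hypothetical and exist purely so that the moved-from state can be observed.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Rule of zero: no user-declared copy/move operations or destructor.
class IndirectInt {
public:
    explicit IndirectInt(int v) : p_(std::make_shared<int>(v)) {}

    // No null check: the precondition is that *this is not moved-from.
    int get() const { return *p_; }

    // Demonstration only; a real class would not need to expose this.
    bool holds_value() const { return p_ != nullptr; }

private:
    std::shared_ptr<const int> p_;  // non-null unless moved-from
};

bool rule_of_zero_demo() {
    IndirectInt a(42);
    IndirectInt b(a);             // compiler-generated copy
    IndirectInt c(std::move(a));  // compiler-generated move: a.p_ is now empty
    bool source_is_null = !a.holds_value();
    a = b;                        // compiler-generated copy assignment
    assert(a.holds_value());
    return source_is_null && a.get() == 42 && c.get() == 42;
}
```

The moved-from `a` is partially formed without the class author having written a single move operation; calling `get()` on it before the reassignment would dereference a null pointer.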

===

Anyways, in Qt, we have received so far precisely 0 bug reports about
crashes caused by a use-after-move on a null pimpl pointer. The
empirical evidence is that people simply don't touch moved-from objects
(and clang-tidy complains if you do so). Therefore, I no longer see any
problem with leaving live objects around with a null pimpl pointer.

>
> > This is surely my biggest overall criticism: given that the destructor
> > is still called after a reloc operation, how is that different from the
> > existing move semantics?
>
> Mainly, see above.
> But you can also have the opportunity to write an improved move ctor
> (relocation ctor) that can be better optimized (std::list for instance).
> Also, you can move const objects with relocation, which you cannot do
> with move semantics (not a big deal I admit, my main point is on class
> design).
>
[snip]
> > But do you have concrete examples where the benefits would be
> > significant? std::list is a case of the stdlib shooting itself in the
> > foot.
>
> The first one that comes to mind is std::list. But that doesn't mean
> it's the only one.
>
> If stdlib shot itself in the foot, then so do average developers when
> writing move constructors.

But only if they're willing to follow stdlib's footsteps. Specifically,
only if they

* need to ensure a valid state after move, and
* are not able to pay the price of an ABI break in order to have a
std::list which does NOT need an externally allocated sentinel node.

Is there any evidence that this set of users is big enough to justify
such a language change?

> > Are you saying that if instead I use `reloc`, that code should compile?
> > If so, does the benefit of "I can reloc const object" truly overcome the
> > con of "the compiler lets me modify const objects"?
>
> This code would not compile. You can only call reloc on:
> - local objects or function parameters
> - on objects that are not reused until its end of scope
>
> You cannot pass references or pointer dereferences to reloc or it yields a
> compilation error.

OK, thanks for clarifying this point.

Regards,

-- 
Giuseppe D'Angelo

Received on 2022-05-02 16:23:11