ISOCPP std-proposals List: Re: [std-proposals] Relocation in C++

From: Sébastien Bini <sebastien.bini_at_[hidden]>
Date: Mon, 2 May 2022 17:22:05 +0200

Hi again Giuseppe,

> I think this is what I meant by "move semantics plus the p1144 bits",
> that is, you need to tag types for which this can happen.

Yes and no, the relocation constructor can be implicitly declared and
defined.

> What do you mean precisely by "must remain usable"?
>
> A moved-from instance must be able to:
>
> 1) safely be destroyed
> 2) safely be reassigned to
> 3) (possibly) safely be self-swapped
>
> In 100% of the classes I've written, implementing move semantics in a
> way that satisfies 1) will also make it satisfy 2) and 3) with no extra
> work.

I get your point. But I argue that these rules are neither well understood
nor a matter of consensus.

The first thing that comes to mind is a P-impl class. How do you deal with
the P-impl pointer of the moved-from instance? Either you leave it as
nullptr, or you reallocate it right away.
- In the first case you need a null pointer check in each public function,
or else using a moved-from instance will lead to a crash. I agree that
reusing the moved-from instance without reassigning it may lead to
unexpected results, but I wouldn't expect a crash.
- In the second case you make a memory allocation that is most likely
unneeded, as most developers discard moved-from objects right away.

Those problems do not appear if relocation is used instead of move. The
P-impl pointer can simply be left null, and the destructor will be a
no-op. We don't introduce live objects with a null P-impl pointer.
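
As a minimal sketch of the two move-constructor options in today's C++
(Widget, EagerWidget and their members are made-up names for illustration,
not something from the paper):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Option 1: the moved-from object keeps a null P-impl pointer.
class Widget {
    struct Impl { std::string name; };
    std::unique_ptr<Impl> impl_;

public:
    explicit Widget(std::string name)
        : impl_(std::make_unique<Impl>(Impl{std::move(name)})) {}

    // Defaulted move: steals the pointer and leaves the source's impl_ null.
    Widget(Widget&&) noexcept = default;

    bool has_impl() const noexcept { return impl_ != nullptr; }

    // Every public function needs this check, or a moved-from Widget crashes.
    std::string name() const { return impl_ ? impl_->name : std::string{}; }
};

// Option 2: the move constructor reallocates so that impl_ is never null.
class EagerWidget {
    struct Impl { std::string name; };
    std::unique_ptr<Impl> impl_;

public:
    explicit EagerWidget(std::string name)
        : impl_(std::make_unique<Impl>(Impl{std::move(name)})) {}

    // Allocates a fresh Impl for the source; may throw, so it is not noexcept.
    EagerWidget(EagerWidget&& other)
        : impl_(std::exchange(other.impl_, std::make_unique<Impl>())) {}

    // No null check needed: the invariant impl_ != nullptr always holds.
    const std::string& name() const { return impl_->name; }
};
```

Option 1 pays with a check in every member function, option 2 pays with an
allocation on every move. Neither cost is needed if the source is known to
be touched only by its destructor.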

In a more general case, the moved-from state may break class invariants
(like P-impl != nullptr). The relocation constructor solves this design
"flaw": a class instance can still be moved, and the source instance can be
left in a dirty state because only its destructor will touch it again.

> This is surely my biggest overall criticism: given that the destructor
> is still called after a reloc operation, how is that different from the
> existing move semantics?

Mainly, see above.
But you also get the opportunity to write a relocation ctor that can be
better optimized than the move ctor (std::list for instance).
Also, you can move const objects with relocation, which you cannot do with
move semantics (not a big deal I admit, my main point is on class design).

Last, in the first revision of the proposal the destructor was not called
and the relocation constructor had more benefits. Most of the time, it
could simply behave as a memcpy over the class memory. But this ABI break
issue forces me to still call the destructor on relocated instances, which
changes things a bit. I guess I still have to think this through...
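
For illustration, here is roughly what a destructor-free relocation amounts
to when spelled out in today's C++: move-construct into the destination,
then destroy the source immediately. relocate_at and relocate_demo are
hypothetical helpers, not standard functions and not the paper's mechanism;
a trivial relocation would collapse both steps into a single memcpy.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical helper: moves *src into the raw storage at dst,
// then ends the lifetime of *src.
template <typename T>
T* relocate_at(T* src, void* dst) {
    T* out = ::new (dst) T(std::move(*src));  // move-construct the destination
    std::destroy_at(src);                      // destroy the source right away
    return out;
}

// Usage sketch on manually managed storage.
inline std::string relocate_demo() {
    alignas(std::string) unsigned char from[sizeof(std::string)];
    alignas(std::string) unsigned char to[sizeof(std::string)];

    std::string* src = ::new (static_cast<void*>(from)) std::string("hello relocation");
    std::string* dst = relocate_at(src, to);  // *src must not be used after this line

    std::string result = *dst;
    std::destroy_at(dst);  // the relocated object still needs its own destruction
    return result;
}
```

This only works on storage whose lifetime is managed by hand. For an
automatic variable the compiler still emits the destructor call at end of
scope, which is exactly the constraint described above.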

> There's still a question pending here, however. In
>
> auto ptr = &obj;
> other = reloc obj;
> delete ptr;
>
> Does the reloc operation invalidate ptr? What's the rule in general for
> pointers and references to a relocated object, do they become invalid?

The proposal does not try to solve the dangling pointer issue. The rule is
not to reuse pointers or references to relocated objects, but the language
does not enforce it; we will have to rely on tools for that. reloc obj
only forbids (i.e. compilation error) further mentions of the name obj (if
the name resolves to the same object as the one that was relocated).

For instance:

    auto obj = foo();
    auto& robj = obj;
    auto other = reloc obj;
    {
        auto obj = bar(); // ok
        do_smth(obj); // ok
    }
    do_smth(obj); // compilation error
    do_smth(robj); // compiles, but robj refers to an object in an invalid state

> Can I use `*this` in the object's destructor?

I admit the paper could be more precise on that. Short answer: yes.

reloc only calls the relocation constructor (given that your class provides
one). It's exactly the same as calling: T{const_cast<T~>(obj)}. It does not
reclaim any memory on its own, the actual end of scope of the object is not
changed.

So as long as you don't invalidate `*this` in the relocation ctor (which I
don't know how you would), you are safe.

> It does not need to remain valid. A moved-from instance can merely be
> partially-formed.

See above.

> But do you have concrete examples where the benefits would be
> significant? std::list is a case of the stdlib shooting itself in the
> foot.

The first one that comes to mind is std::list. But that doesn't mean it's
the only one.

If stdlib shot itself in the foot, then so do average developers when
writing move constructors.

Anytime dynamic memory allocation is involved, developers ask themselves
what to do in the move constructor. See for instance the second P-impl
option that I mentioned earlier (reallocate memory for the moved-from
instance), which I have already seen in production code.

> I (and many others) profoundly disagree with that blog post's
> conclusion. Please review P2027, P2345 and the excellent talk by Marc
> Mutz at MeetingC++. The bottom line is that moved-from objects do not
> need to be valid, but only partially-formed.

I will.

> Clearly you are free to disagree with the partially-moved stance, but
> then there needs to be profound motivation in the paper regarding why
> relocation is more beneficial than designing classes according to that
> criterion.

I understand your point, yet I am also convinced of the advantages of what
I propose (especially the reloc operator). I can identify two weak points
though:
- relocation should make the destructor call unnecessary, as I first
proposed. But this leads to an ABI break that is not that easy to solve.
- introduction of a new kind of reference. I would very much like to reuse
rvalue references, but I still cannot think of a way to do it without
developers misusing it.

> To be honest this could actually also be seen as a _bad_ thing, and
> therefore I'd like more motivation besides "let's just make it possible".
>
> If I have something like
>
> template <typename T>
> void f(T &obj) {
> T temp = std::move(obj);
> }
>
> and I pass in e.g. a `const unique_ptr`, this code fails to compile.
> It's actually good that it fails to compile -- it's warning me that it'd
> be attempting a modification on a const object.
>
> Are you saying that if instead I use `reloc`, that code should compile?
> If so, does the benefit of "I can reloc const object" truly overcome the
> con of "the compiler lets me modify const objects"?

This code would not compile, since obj is a reference. You can only call
reloc on:
- local objects or function parameters
- objects that are not used again before the end of their scope

You cannot pass references or pointer dereferences to reloc; doing so
yields a compilation error.

Unless I am missing something, cv-qualifiers on objects passed to reloc can
be safely discarded, just as they are discarded when passed to their
destructor.

Also, the following code will compile and I see no problem with it:

    const auto ptr = std::make_unique<T>();
    auto other = reloc ptr;

ptr was relocated although it was const. This is fine as it reached its end
of life anyway and is to be destructed (or is destructed by the reloc call
if we manage to solve this ABI break issue :/).
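
To make the const point concrete, the status quo can be checked in standard
C++ with a few static_asserts. Ptr and read_through_const are names
introduced only for this sketch:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

using Ptr = std::unique_ptr<int>;

// std::move on a const object yields a const rvalue...
static_assert(std::is_same_v<decltype(std::move(std::declval<const Ptr&>())),
                             const Ptr&&>);

// ...which cannot bind to the move constructor, and unique_ptr is not copyable.
static_assert(!std::is_constructible_v<Ptr, const Ptr&&>);
static_assert(std::is_constructible_v<Ptr, Ptr&&>);

// Destruction, by contrast, does not care about const.
static_assert(std::is_destructible_v<const Ptr>);

// Hypothetical helper: a const local is destroyed normally at end of scope.
inline int read_through_const() {
    const Ptr p = std::make_unique<int>(42);
    return *p;  // p's destructor runs on return even though p is const
}
```

So a const object can already be modified by its destructor at the end of
its life; reloc would extend that same allowance to the relocation
constructor.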

On Mon, May 2, 2022 at 1:30 PM Giuseppe D'Angelo <giuseppe.dangelo_at_[hidden]>
wrote:

> Hello!
>
> On 02/05/2022 12:23, Sébastien Bini wrote:
> > Hello Giuseppe,
> >
> > Thank you for your feedback.
> >
> > > I'm probably missing something, but from what you say above and from the
> > > TempDir example on page 7, all of this sounds like existing move
> > > semantics with the additional enforcement that a moved-from object isn't
> > > ever touched again (not even reassigned / reset).
> >
> > You could view it like that. It also allows for (a) better optimization
> > (relocation constructor can be better optimized in some cases, and
> > trivial relocation (move+destruct optimized into a memcpy) can happen),
>
> I think this is what I meant by "move semantics plus the p1144 bits",
> that is, you need to tag types for which this can happen.
>
>
> > (b) better semantics (it better conveys the intent), and (c) the move of
> > const volatile objects.
> >
> > The point is also that, from a class design point of view, the move
> > constructor may break the class purpose as the moved-from instance must
> > remain usable.
>
> What do you mean precisely by "must remain usable"?
>
> A moved-from instance must be able to:
>
> 1) safely be destroyed
> 2) safely be reassigned to
> 3) (possibly) safely be self-swapped
>
> In 100% of the classes I've written, implementing move semantics in a
> way that satisfies 1) will also make it satisfy 2) and 3) with no extra
> work.
>
> This is surely my biggest overall criticism: given that the destructor
> is still called after a reloc operation, how is that different from the
> existing move semantics?
>
>
> >
> > > (From a syntax point of view, I'm not sure how that is desirable, as one
> > > could no longer do something like `other = reloc obj; delete &obj;`, but I
> > > don't think it's a particularly compelling use case...)
> >
> > Indeed, I don't see a case where you would need to reuse the relocated
> > object. If you do need to, then move semantics are what you need.
>
> There's still a question pending here, however. In
>
> auto ptr = &obj;
> other = reloc obj;
> delete ptr;
>
> Does the reloc operation invalidate ptr? What's the rule in general for
> pointers and references to a relocated object, do they become invalid?
> Can I use `*this` in the object's destructor?
>
>
> >
> > > Anyways, a justification for relocation over move semantics on page 3
> > > says that "the move constructor performs extra operations to ensure that
> > > the moved-from object remains at a valid state." This is not universally
> > > true. It is true for the Standard Library (and makes certain std::list
> > > implementations of move operations expensive / noexcept(false)); that's
> > > just stdlib's stance.
> >
> > Indeed it is not universally true. The phrasing may be too general, but
> > there are indeed cases where the move constructor is:
> > - either not trivial to implement, because of that moved-from instance
> > that must remain valid
>
> It does not need to remain valid. A moved-from instance can merely be
> partially-formed.
>
>
> > - either performs extra costly operations that could be avoided, had we
> > known that the moved-object will not be reused.
>
> But do you have concrete examples where the benefits would be
> significant? std::list is a case of the stdlib shooting itself in the foot.
>
>
> >
> > > A class author can always state that a moved-from
> > > object of their class is only partially formed. In this last case, I am
> > > not sure the paper convincingly justifies operator reloc over the status
> > > quo (which includes clang-tidy "use after move" checks and so on).
> >
> > Stating that the moved-from object is invalid and should not be touched
> > again is (in my opinion) a bad class design. You may have a look at
> > https://herbsutter.com/2020/02/17/move-simply/, which further
> > motivated me to write this paper. Quoting Herb Sutter: "In C++, an
> > object is valid (meets its invariants) for its entire lifetime, which is
> > from the end of its construction to the start of its destruction (see
> > [basic.life]/4). Moving from an object does not end its lifetime, only
> > destruction does, so moving from an object does not make it invalid or
> > not obey its invariants." Relocation allows an object to be made invalid
> > before it is destructed, as long as it is never touched again except by
> > its destructor.
>
> I (and many others) profoundly disagree with that blog post's
> conclusion. Please review P2027, P2345 and the excellent talk by Marc
> Mutz at MeetingC++. The bottom line is that moved-from objects do not
> need to be valid, but only partially-formed.
>
> Clearly you are free to disagree with the partially-moved stance, but
> then there needs to be profound motivation in the paper regarding why
> relocation is more beneficial than designing classes according to that
> criterion.
>
>
>
> > The classes you speak about would be the first one to benefit from the
> > relocation constructor (which ensures the relocated-instance is not
> > reused) and could also mark their move constructor as deleted. This
> > would respect the language philosophy.
>
> Sorry, what classes are you referring to?
>
> The paper and I were referring to std::list as example of class that
> needs to do more than needed in its move constructor (specifically: in
> some implementations it has to reallocate a sentinel node, making its
> move operations noexcept(false)). But the reasons for this are twofold;
> one is stdlib's position of "valid-but-unspecified" state, which
> requires such a reset, and the other one is that vendors didn't want to
> pay for an ABI break (because you _can_ redesign std::list in a way that
> doesn't need an externally allocated sentinel). Either way, std::list
> clearly wants a usable (non-deleted) move constructor...?
>
> For the TempDir example, I'm a proponent of 2) (make it partially
> formed), which is also precisely what you need to do to make it
> relocatable.
>
>
>
> > The reloc operator introduces a language-level guarantee (no use after
> > relocation) that will always be better than an external tool that
> > developers may simply not run.
> >
>
> I'm a very strong supporter of tooling, so I'm not really buying the
> argument of "there is tooling, it solves the problem, but developers may
> not run it". C++ without tooling is a doomed language. But I understand
> it's a personal opinion, yet I'd like this criticism to be properly
> addressed.
>
> On a more language level, I'd say that then the question "why all the
> relocation machinery, and not just an operator that moves + makes the
> name unusable" should be discussed in the paper.
>
>
> > > The above point wasn't my main criticism. My main criticism was that the
> > > paper seems to settle on a convoluted syntax for the already existing
> > > move semantics (plus the p1144 bits). Did I misunderstand something?
> >
> > Relocation and move semantics cover different needs. Move semantics are
> > still needed (for instance to move a vector or unique_ptr and reuse them
> > later). Relocation is here:
> > - for classes that don't mix well with move-semantics and would benefit
> > from being relocation-only (no move constructor) ;
>
> Do you have examples of such classes?
>
>
> > - for performance reasons: relocating a variable that will not be touched
> > again may allow for better optimization than using move-semantics ;
>
> Ditto. As I said, I don't think I've ever encountered a class where
> "safe to destroy" was less expensive than "safe to destroy and to
> reassign to".
>
>
> > - for a safer code: we sometimes use std::move on variables that we
> > intend to never touch again. reloc enforces that.
>
> See above.
>
>
>
> > Besides, we can mark
> > objects as const and still relocate them, which is not possible with
> > move-semantics.
>
> To be honest this could actually also be seen as a _bad_ thing, and
> therefore I'd like more motivation besides "let's just make it possible".
>
>
> If I have something like
>
> template <typename T>
> void f(T &obj) {
> T temp = std::move(obj);
> }
>
> and I pass in e.g. a `const unique_ptr`, this code fails to compile.
> It's actually good that it fails to compile -- it's warning me that it'd
> be attempting a modification on a const object.
>
> Are you saying that if instead I use `reloc`, that code should compile?
> If so, does the benefit of "I can reloc const object" truly overcome the
> con of "the compiler lets me modify const objects"?
>
>
> My 2 c,
>
> --
> Giuseppe D'Angelo
>

Received on 2022-05-02 15:22:16