ISOCPP std-proposals List: Re: [std-proposals] Relocation in C++

From: Edward Catmur <ecatmur_at_[hidden]>
Date: Fri, 4 Mar 2022 18:07:50 +0100

On Fri, 4 Mar 2022 at 15:03, Maciej Cencora <m.cencora_at_[hidden]> wrote:

> Wed, 2 Mar 2022 at 22:57 Edward Catmur <ecatmur_at_[hidden]> wrote:
> >
> > On Wed, 2 Mar 2022 at 12:33, Maciej Cencora via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
> >>
> >> Lol. I am not a proponent of Rust, I have never written a single line
> >> of code in that language, nor did I read any book about it.
> >>
> >> If you think I am wrong, then please show me how this mentioned issue
> >> can be solved without a language level construct?
> >
> >
> > It can (and should) be solved in the library, as we solve many problems
> in the library for which other languages would need a language level
> construct.
>
> Perhaps the performance part could be solved in a library, but
> certainly not the usability part which is all I am arguing for.
>

Not in "a" library, in "the" library, i.e. the Standard library. And the
usability part would be solved by increased clarity to readers and to
static analyzers.

>
> > The problem with a `relocate` operator per se is that in real-world
> usage it can be arbitrarily complicated to determine whether a variable has
> been relocated from and that following uses should be rejected; we can't
> use linear/affine types or Rust's lifetime analysis. Instead, we should
> solve this in the library: use `std::optional<T>` and give it a
> (hypothetical; bikeshed) `T std::optional<T>::pop()` method that returns a
> prvalue relocated from the contained object (or moved/copied followed by
> destroy, for non-relocatable types), and sets the optional to disengaged.
> Then, identifying programs where the optional is used after it has been
> disengaged is a matter of QOI, and can be solved by static analysis and
> testing with sanitizers or wide-contract methods. And in many cases the
> engaged flag of the optional would be optimized out of the program binary
> (again, we are historically and presently happy to rely on the optimizer).
>
> I don't really see how using std::optional<T> fixes any problem I have
> mentioned. People still need to move into, so they call std::move, and
> the source variable is still in a scope, with a possibility of being
> misused, resulting in UB.
>

A static analyzer (or a reviewer) can easily analyze whether an optional
may be indirected after it has been disengaged. So the possibility of
misuse is much reduced.
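
To make the shape of this concrete, here is a minimal sketch of the fallback
path only (move followed by destroy), written as a free function because
std::optional has no pop() member today. The name `pop` and the assert-based
precondition are assumptions for illustration; a real relocating version would
need the language support discussed further down.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical helper (not part of the Standard library): emulates the
// proposed optional<T>::pop() using move-then-destroy, i.e. the fallback
// described for non-relocatable types.
template <class T>
T pop(std::optional<T>& opt) {
    // Narrow contract here; a wide-contract variant could throw instead.
    assert(opt.has_value() && "pop() called on a disengaged optional");
    T value = std::move(*opt);  // move out of the contained object
    opt.reset();                // destroy the moved-from object and disengage
    return value;               // returned by (implicit) move or NRVO
}

// Example use: the source is visibly disengaged after the call, so a later
// `*name` is something an analyzer or sanitizer can flag.
inline std::string take_name(std::optional<std::string>& name) {
    return pop(name);
}
```

After `pop(name)` the optional reports `has_value() == false`, which is the
property a reviewer or tool would check for.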

> Why do you say that whether a variable has been relocated from is hard
> to determine?
> If the user wrote: "move/relocate x" then after this statement x is
> relocated, period.
>

Only in very simple cases. The "relocate" expression may occur within a
branch, or in a ternary conditional, or after a jump label (switch or
goto), or after a possibly throwing statement-expression, or within an
expression with indeterminate order of evaluation, or... And even in
straight-line code its effects may be felt within the destructors of
automatic variables declared before the "relocate" expression... or may
not, if a return or possibly throwing expression is encountered first.
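
As a small illustration of the branch case, the sketch below emulates a
conditional "relocate" with std::optional (move then reset), since no
`relocate` expression exists in the language. The function name
`maybe_consume` and its parameters are made up for the example. The point is
only that whether the source is still usable after the `if` depends on a
run-time value, so a compiler cannot in general reject later uses.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical example: `name` is consumed only on one branch.
// Precondition (assumed): `name` is engaged when `consume` is true.
inline std::size_t maybe_consume(std::optional<std::string>& name,
                                 bool consume,
                                 std::vector<std::string>& sink) {
    if (consume) {
        assert(name.has_value());
        sink.push_back(std::move(*name));  // stands in for "relocate name"
        name.reset();                      // source is now disengaged
    }
    // Here the state of `name` is known only at run time. With a language
    // level "relocate", this use would be valid on one path and invalid on
    // the other; with optional it is an ordinary, checkable query.
    return name ? name->size() : 0;
}
```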

Indeed, it is common (especially in performance-sensitive contexts) to
write programs that are partial, i.e. have UB on some inputs, since they are
part of a system that ensures the UB-provoking input can never occur.
Rejecting such programs outright would not be acceptable.

>
> > For efficiency (trivial std::unique_ptr, std::list with allocating
> default ctor) and for never-empty types (gsl::not_null) we will need a way
> to write or (preferably) default the relocation operation on class types;
> this could be accomplished via (suitably) qualified (or perhaps specified)
> member functions, writing the relocation operation as a qualified
> conversion function to the same (class) type and allowing other
> destructive operations on prvalues (e.g. std::unique_ptr::release) to
> obviate calling the destructor. Coming up with a syntax (other than `=
> default`, on the relocation operation itself) that is guaranteed leak-free
> is a bit tricky, but I'm working on it (from time to time).
> >
> > The relocation operation would automatically be called on return of an
> id-expression or similar (considered equivalent to and preferred over
> move-and-destroy), but there would also need to be a way for library code
> (such as `optional`) to invoke the relocate operation on objects whose
> lifetime it manages, where currently it might call the move constructor
> followed by the destructor. A (magic) library function `T
> relocate_or_move_and_destroy_at(T*)` would be sufficient for (Standard and
> third-party) library authors, while sufficiently off-putting to end users
> that they would be steered to use `std::optional`; the overconfident would
> be free to use `alignas(T) std::byte buf[sizeof(T)]`.
>
> A magic library function is a language level solution (just hidden
> behind a function), otherwise people would have already implemented it
> in their libraries.

The language level part is just the ability to invoke the relocator, which
- yes - does need to be a language feature, for safety, triviality and
composability, and to obviate the destructor on prvalues. Otherwise this
could indeed be implemented in pure library form.

> Also again with such an API, if source object is
> e.g. an automatic variable it is alive, and using it after a call to
> the proposed magic function will still lead to UB. So it doesn't fix
> the problems I raised.
>

Correct, it would need to be an object with non-automatic lifetime, e.g. a
union member or an object constructed into a suitably aligned/sized buffer.
That's why this API would be designed for the use of library authors, not
for end-users directly; invoking it on an automatic variable would be
incorrect, just as e.g. invoking std::destroy_at (or, indeed, the
destructor) on an automatic variable is incorrect.
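
For reference, the pure-library fallback (without any relocator) might look
roughly like the sketch below. The name `move_and_destroy_at` is a
placeholder, not a proposed spelling, and the code assumes C++17 for
std::destroy_at and std::byte. Note that the source object lives in a
suitably aligned buffer, not in an automatic variable, so nothing else will
try to destroy it afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical library-only fallback: move out of *p, then end its lifetime.
// A real relocate operation would replace both steps with one.
// If T's move constructor throws, *p is left alive (not destroyed).
template <class T>
T move_and_destroy_at(T* p) {
    assert(p != nullptr);
    T value = std::move(*p);  // move-construct the result
    std::destroy_at(p);       // the source's lifetime ends here
    return value;
}

// Example use on an object with non-automatic lifetime, constructed into a
// suitably aligned and sized buffer.
inline std::string round_trip(std::string s) {
    alignas(std::string) std::byte buf[sizeof(std::string)];
    std::string* p = ::new (static_cast<void*>(buf)) std::string(std::move(s));
    return move_and_destroy_at(p);  // buf holds no live object afterwards
}
```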

>
> > The library should similarly offer a type trait
> `is_trivially_relocatable` (*not* `is_relocatable`, since there would be no
> way for user code to invoke the relocation operation directly without
> fallback to move-and-destroy), allowing (user and library) code to apply
> all 3 of the memcpy/memmove optimizations mentioned by Arthur; `any` and
> `swap` would use the trait directly, and `vector` via
> `uninitialized_relocate_or_move_and_destroy_n`. Clearly, scalars (and
> arrays thereof, etc) would be trivially relocatable, as would aggregates
> with all trivially relocatable members; (other) user-defined special member
> functions would disable the relocation operation on class types, but it
> could be reenabled (trivially, if appropriate) with `= default` syntax.
> >
> > Finally, note that there is no need for a relocating assignment
> operator; any class for which it would be trivial will be itself trivial
> (assignment can't be trivial for `std::unique_ptr` because any existing
> resource has to be deleted); std::list doesn't care since both sides
> already have their sentinel node allocated; and gsl::not_null can write its
> assignment operator to take its argument by value and swap (the issue of
> delayed destruction of by-value arguments being sufficiently abstruse not
> to be worth worrying about). Considering Arthur's criteria, `any` and
> `swap` are already covered by trivial relocatability, and `vector` can't
> benefit since destructors would need to be called on either the source or
> target range anyway.
>
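
To illustrate how library code might dispatch on such a trait, here is a rough
sketch of an uninitialized-relocate loop. Since `is_trivially_relocatable`
does not exist in the Standard library as of this thread, the sketch uses
std::is_trivially_copyable as a conservative stand-in; that substitution, the
name `uninitialized_relocate_n`, and its signature are all assumptions for
illustration. The ranges are assumed not to overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

// Stand-in trait (assumption): trivially copyable types are certainly safe
// to relocate bytewise; a real trait would also admit e.g. unique_ptr.
template <class T>
inline constexpr bool is_trivially_relocatable_v =
    std::is_trivially_copyable_v<T>;

// Relocate n objects from [first, first + n) into uninitialized storage at
// dest. Afterwards the source range holds no live objects.
template <class T>
T* uninitialized_relocate_n(T* first, std::size_t n, T* dest) {
    assert(n == 0 || (first != nullptr && dest != nullptr));
    if constexpr (is_trivially_relocatable_v<T>) {
        // Bytewise path: one memcpy, no per-element constructor/destructor.
        if (n != 0) {
            std::memcpy(static_cast<void*>(dest),
                        static_cast<const void*>(first), n * sizeof(T));
        }
    } else {
        // Fallback path: move-construct, then destroy each source element.
        // Not exception-safe if T's move constructor can throw.
        for (std::size_t i = 0; i != n; ++i) {
            ::new (static_cast<void*>(dest + i)) T(std::move(first[i]));
            std::destroy_at(first + i);
        }
    }
    return dest + n;
}
```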

Received on 2022-03-04 17:08:02