ISOCPP std-proposals List: Re: [std-proposals] Relocation in C++

From: Sébastien Bini <sebastien.bini_at_[hidden]>
Date: Tue, 17 May 2022 18:24:46 +0200

Hello all,

To summarize the changes we have discussed from the second revision (
https://github.com/SebastienBini/cpp-relocation-proposal/blob/main/relocation.pdf
):

*A. operator reloc*

The reloc operator performs relocation by either:

   1. a memcpy if the object is trivially relocatable (P1144).
   2. a call to the "operator reloc" member function if it is accessible.
   3. a regular move operation (using move constructor) if it is accessible.

In the first two cases the language guarantees that the destructor of the
relocated object is not called, although its lifetime has ended. References
and pointers to the relocated object become invalid after the reloc
operator completes.
In the last case, the destructor of the moved-from object is called
normally, when the object reaches the end of its scope, as if std::move had
been used in place of reloc.
In all cases, the reloc operator does not allow further mentions of the
name of the relocated object (compilation error); its identifier is removed
from the lexical scope.
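
For comparison, case (3) matches what std::move already gives us today. A
minimal C++17 sketch follows; 'probe' and 'move_fallback_demo' are
hypothetical names used only for illustration, not part of the proposal. It
shows that the moved-from object still has its destructor run when it
reaches the end of its scope:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical instrumented type: counts destructor runs, split by whether
// the destroyed object had been moved from.
struct probe {
    static inline int live_dtor = 0;        // destructors of live objects
    static inline int moved_from_dtor = 0;  // destructors of moved-from objects
    bool moved_from = false;

    probe() = default;
    probe(probe&& other) noexcept { other.moved_from = true; }
    ~probe() { moved_from ? ++moved_from_dtor : ++live_dtor; }
};

// Today's C++: 'p' is moved into 'sink', but 'p' stays in scope and its
// destructor still runs when the function returns.
inline void move_fallback_demo(std::vector<probe>& sink) {
    probe p;
    sink.push_back(std::move(p));
}
```

With a true relocation (cases (1) and (2)) the count of moved-from
destructor calls would stay at zero.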

If reloc is called on a local object (i.e. declared in the same function
body where reloc is being used) then relocation must be performed using (1)
or (2), falling back to (3) only if the object is not trivially relocatable
and its operator reloc function is not accessible (deleted or private).
If reloc is called on a function parameter, then relocation can be
performed using (1), (2) or (3), in that order of preference. (1) or (2)
are no longer enforced, as ABI restrictions may prevent the callee from
prematurely ending the lifetime of relocated parameters. Which one is
picked is left to the discretion of compiler vendors.
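
The order of preference above can be approximated in today's C++17 as a
library function. This is only a sketch under several assumptions:
'is_trivially_relocatable_sketch' is a hypothetical trait that conservatively
defaults to std::is_trivially_copyable (it is not P1144's trait), the static
hook 'relocate_from' is a made-up stand-in for the "operator reloc" member
function, and 'relocate_sketch' is a hypothetical name. Unlike the language
feature, a library emulation cannot defer the destructor to the end of scope
in case (3), so it destroys the source right after the move:

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical trait; conservative stand-in for "trivially relocatable".
template <class T>
struct is_trivially_relocatable_sketch : std::is_trivially_copyable<T> {};

// Detects a hypothetical static hook T::relocate_from(void* dst, T* src).
template <class T, class = void>
struct has_reloc_hook : std::false_type {};
template <class T>
struct has_reloc_hook<
    T, std::void_t<decltype(T::relocate_from(std::declval<void*>(),
                                             std::declval<T*>()))>>
    : std::true_type {};

enum class strategy { memcpy_bits, reloc_hook, move_then_destroy };

// Relocates *src into the raw storage at dst and reports which case was used.
template <class T>
strategy relocate_sketch(void* dst, T* src) {
    if constexpr (is_trivially_relocatable_sketch<T>::value) {
        std::memcpy(dst, src, sizeof(T));  // case (1)
        return strategy::memcpy_bits;
    } else if constexpr (has_reloc_hook<T>::value) {
        T::relocate_from(dst, src);        // case (2): no destructor on src
        return strategy::reloc_hook;
    } else {
        ::new (dst) T(std::move(*src));    // case (3): move construct...
        std::destroy_at(src);              // ...then destroy the source
        return strategy::move_then_destroy;
    }
}

// Hypothetical self-referential type that opts in through the hook.
struct self_ref {
    self_ref* self;
    self_ref() : self(this) {}
    self_ref(self_ref&&) = delete;
    ~self_ref() {}  // user-provided, so the type is not trivially copyable
    static void relocate_from(void* dst, self_ref*) {
        ::new (dst) self_ref();  // rebuild at dst with a fixed-up 'self'
    }
};
```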

*B. [[relocate_parameters]] constructor attribute*

We introduce a new attribute that only applies to constructors. A
constructor with this attribute has the guarantee that it can relocate its
parameters as if they were local objects (guaranteed to use (1) or (2),
falling back to (3) only if the object is not trivially relocatable and its
operator reloc function is not accessible (deleted or private)).

Constructors with this attribute may be called with a different ABI
(notably using callee-destruct convention). However, this should not break
anything in the language, as it is not possible to take the address of a
constructor, and as such this potential ABI change need not be reflected in
its function signature.

This special attribute allows us to write a std::reloc_wrapper, which we
can use to relocate objects into a container:
template <class T>
struct reloc_wrapper
{
public:
  [[relocate_parameters]] reloc_wrapper(T val);
  bool has_value() const;
  T pilfer();
};

// And in containers:
std::vector<T, Alloc>::push_back(std::reloc_wrapper<T>&& val);

void foo(std::vector<T>& vec)
{
  T data;
  vec.push_back(std::reloc_wrapper{reloc data});
  /* Not a nice syntax, but we can add more syntactic sugar later...
   * 'data' destructor is not called at the end of foo.
   * 'data' is effectively relocated into 'vec' without ever calling its
   *   move ctor, possibly only with memcpy calls if trivially relocatable.
   * We now must see how we could optimize out this extra relocation (once
   *   inside the wrapper, once inside the container).
   */
}
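
To make the interface of the wrapper concrete, here is a rough emulation
that compiles with today's C++17. It is only a sketch: without the proposed
attribute and reloc operator it has to use move construction (case (3)), and
the names 'reloc_wrapper_sketch' and 'push_back_relocated' are hypothetical
stand-ins for std::reloc_wrapper and the proposed push_back overload:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Emulation only: holds the value in an optional and hands it out once.
template <class T>
class reloc_wrapper_sketch {
public:
    explicit reloc_wrapper_sketch(T val) : slot_(std::move(val)) {}

    bool has_value() const { return slot_.has_value(); }

    // Extracts the value and leaves the wrapper empty.
    T pilfer() {
        assert(slot_.has_value());
        T out = std::move(*slot_);
        slot_.reset();
        return out;
    }

private:
    std::optional<T> slot_;
};

// Hypothetical free function standing in for the proposed container overload.
template <class T>
void push_back_relocated(std::vector<T>& vec, reloc_wrapper_sketch<T>&& w) {
    vec.push_back(w.pilfer());
}
```

The emulation still pays for the two extra moves (into the wrapper, then
into the container) that the comment above hopes to optimize out.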

*C. Relocator member function*

The relocator member function is the special function that is invoked by
the reloc operator in case (2). It behaves like a constructor: it
constructs a new instance and may allow for base or member
initialisation.
This special function is only needed if the object is not trivially
relocatable. The suggested syntaxes are:

struct T: B
{
// data members
D data;
T* self;

// syntax A
operator reloc() { self = this; }
// syntax B
operator reloc(T&& rhs) : self{this} {}
// syntax C (P0023R0)
>>T(T& rhs) : >>B{rhs}, >>data{rhs.data}, self{this} {}
};

In syntax A, 'this' is constructed by calling operator reloc on all base
classes and non-static data-members, constructed from the object to
relocate. If some base classes or data-members are trivially relocatable
then they are built using memcpy instead. This syntax does not allow for
custom base or member initialisation. Likewise, the relocated object does
not appear as a parameter since it will have reached its end of life by the
time the function body is reached.

Syntax B allows for custom base or member initialisation, and it defaults
to that of syntax A for the initialisers that are not provided. On one hand,
this makes it possible to properly relocate objects that have a const
data-member that needs manual fixup (as if 'self' was of type 'T* const').
On the other hand, rhs becomes available in the function body for users to
mess with, even though it has reached its end of life.
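
The const fixup case that syntax B targets can be imitated in today's C++17
with a private tagged constructor. This is a sketch under assumptions, not
the proposed semantics: 'pinned', 'relocate' and 'relocate_tag' are
hypothetical names, the member is relocated by move plus explicit member
destruction, and the whole-object destructor of the source is deliberately
never run:

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <utility>

struct pinned {
    static inline int dtor_calls = 0;  // how many times ~pinned() has run

    std::string data;
    pinned* const self;  // const member that must always point at *this

    explicit pinned(std::string d) : data(std::move(d)), self(this) {}
    pinned(const pinned&) = delete;
    pinned& operator=(const pinned&) = delete;
    ~pinned() { ++dtor_calls; }

    // Hypothetical stand-in for 'operator reloc(T&& rhs) : self{this} {}'.
    // Builds the new object in raw storage at dst, then ends the lifetime of
    // the source's member without calling ~pinned() on the source.
    static pinned* relocate(void* dst, pinned* src) {
        pinned* out = ::new (dst) pinned(relocate_tag{}, *src);
        std::destroy_at(&src->data);
        return out;
    }

private:
    struct relocate_tag {};
    // 'self' is initialised in the member initialiser list, so it can be const.
    pinned(relocate_tag, pinned& rhs)
        : data(std::move(rhs.data)), self(this) {}
};
```

Because 'self' is const, it can only be set in an initialiser list, which is
exactly what syntax A cannot express.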

Syntax C is directly taken from P0023R0. Although P0023R0's relocator
can be defaulted, all the base and member initialisers must be written out
in order to provide a function body.

I'd weigh in favor of syntax B, and compilers can still emit warnings if
'rhs' is used in the function body.

You could enforce that 'rhs' cannot be used in the function body, as if its
identifier got removed from the lexical scope by operator reloc. It would
prevent the user from handling an object that reached its end of life, but
would also prevent any form of debugging (like printing the address of
'rhs' from the function body). This is not my preferred solution.

All those changes from the second revision of the proposal should allow us
to:

   - support more efficient relocation (objects relocated with their
   "operator reloc" member function are no longer passed to their destructor).
   - support relocation-only objects (albeit they may occasionally need to
   be wrapped into a std::reloc_wrapper).
   - remain compatible with the existing ABI.

Thank you for reading,

Best regards,
Sébastien

On Wed, May 11, 2022 at 5:01 PM Avi Kivity <avi_at_[hidden]> wrote:

>
> On 05/05/2022 23.48, Arthur O'Dwyer wrote:
>
> On Thu, May 5, 2022 at 10:33 AM Sébastien Bini via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
>
>> [Avi Kivity wrote:]
>>
>> > That is, the destructor is called immediately, since the reloc operator
>> > ended the scope of the name. I think that's a good thing.
>>
>> It cannot be called [...] as it could then lead to objects being
>> destructed twice
>>
>
> Right, I think there's a mismatch between the way Avi was
> using/understanding the terminology and the way you (and I) use it.
> - Immediately after a call to std::relocate_at (or the core-language
> `operator reloc` or whatever), the source object *has become destroyed
> and its lifetime is over*.
> - But *the destructor* is not called.
> This terminology is confusing in the context of today's C++, because
> today's C++ does not (admit|permit) any difference between the ideas of
> "the object's lifetime ends" and "the object's destructor is called." We
> are able to use the same English phrase — "the object is destroyed" — to
> mean both notions, interchangeably, without any ambiguity, because they are
> literally synonymous in C++ today.
> In C++-with-relocation (whether P1144 or otherwise), it is possible for an
> object's lifetime to end in either of two different ways: *either* its
> destructor is called, *or* it is relocated-from. In the former case, its
> destructor is called; in the latter case, its destructor is never called.
> The phrase "the object is destroyed" should now be avoided, because it is
> ambiguous: it could be taken to mean *either* "the object's destructor is
> called," *or* "the object's lifetime ends," and these notions are now no
> longer synonymous.
>
>
> The context was when the object is not trivially relocatable and a
> relocate operator is not defined, and then the language falls back to
> std::move:
>
>
> > Simply use a regular move (std::move). The destructor of the moved
> instance is called normally (i.e. when it goes out of scope).
>
>
> That is, the destructor is called immediately, since the reloc operator
> ended the scope of the name. I think that's a good thing.
>
>
> I absolutely agree that if the object is trivially relocatable, or if a
> relocation operator is defined, the source object just ceases existing. But
> if relocation is not possible, the move fallback requires the destructor to
> be called (and my comment was that it's called immediately after the move
> constructor, the point where its scope ends).
>
>

Received on 2022-05-17 16:24:58