Re: [std-proposals] Relocation in C++

From: DBJ <dbj_at_[hidden]>
Date: Tue, 1 Feb 2022 18:28:33 +0100
Sorry to rain on the parade. This almost certainly leads to an ABI break.

Kind regards

Dusan Jovanovic, MSc Arch, TOGAF(R)


On Tue, 1 Feb 2022 at 18:12, Gašper Ažman via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> To add to Arthur's survey - I am the current maintainer of move=bitcopies
> while Niall is otherwise engaged - I was busy with other things, but I'm
> still planning to bring that paper for 26.
>
> On Tue, Feb 1, 2022, 17:02 Sébastien Bini <sebastien.bini_at_[hidden]>
> wrote:
>
>> Hi all,
>>
>> @Marcin Jaczewski
>> > This has one big flaw, `T x` destructor is called by the caller
>> > function of `maybe_destroy` as the lifetime of `T x` needs to be
>> > longer than the call to this function.
>> > To be able to skip this destructor there should be ABI break to
>> > transfer info about destruction of variables in function.
>>
>> Damn it, I didn't know that :/
>>
>> @Maciej Cencora <m.cencora_at_[hidden]>
>> > Skipping destructor call for moved-from object could be only an
>> > optimization and not a requirement from standard PoV.
>> > This way implementers that do not wish to break ABI would keep
>> > caller-destroy convention, and others would switch to/use
>> > callee-destroy convention and have better code-gen.
>>
>> This goes against the relocation constructor philosophy. But your point
>> is valid.
>> Considering that, that's +1 for T(T&&)=bitcopies;
>>
>> @Giuseppe D'Angelo
>> > class IntBuffer {
>> > int *ptr;
>> >
>> > public:
>> > IntBuffer() : ptr(new int[42]) {}
>> > ~IntBuffer() { delete[] ptr; }
>> > };
>> >
>> >
>> > Then its correct move constructor is
>> >
>> > IntBuffer(IntBuffer&& other) noexcept
>> > : ptr(std::exchange(other.ptr, nullptr)) {}
>> >
>> >
>> > If you have a destroying move constructor / relocation constructor, and
>> > know that `other` is considered destroyed, then you can skip an
>> > operation:
>> >
>> > IntBuffer(IntBuffer~ other) noexcept
>> > : ptr(other.ptr) {} // correct?
>> Yes.
>>
>> > 1) you can't do that in your relocation constructor if `other` may still
>> > be explicitly destroyed after that constructor call (or you'll delete
>> > the buffer). Your relocation constructor must reset other.ptr, which
>> > means, it just falls back to being the move constructor. So what's the
>> > point of this whole approach?
>> other must not go through its destructor after being passed to the
>> relocation constructor. Otherwise, indeed, there would be trouble.
>> The reloc operator takes care of that, it calls the relocation
>> constructor and prevents the call to the destructor. Think of an explicit
>> call to the relocation constructor as an explicit call to the destructor.
>> If you explicitly call either, we assume you know what you are doing and
>> will not call the destructor (again) afterwards.
>>
>> > 2) under the same assumption that destruction is a QoI, according to the
>> > paper (but I may have misread something) `other.ptr` in the relocation
>> > constructor is a relocation reference. This implies that
>> >
>> > a) you can't even implement the relocation as the move constructor;
>> > you're not allowed to write something into `other.ptr` after building
>> > `ptr`, as `other.ptr` is destroyed (?)
>> The relocation constructor and move constructor are separate things and
>> do different jobs: one likely does a single memcpy, the other needs
>> to transfer resource ownership.
>> You can still cast a relocation reference to an rvalue reference with a
>> static_cast or std::move.
>> You can write anything you want in other.ptr but it will have little
>> value since other is considered destructed at the end of the constructor
>> call.
>>
>> > b) if the compiler that built the relocation constructor decided to
>> > destroy `other.ptr`, and then `other` is then destroyed *again*, that's
>> > a double destruction of the `other.ptr` subobject.
>> Not sure I understand. Why would the compiler decide to destroy other.ptr
>> in the relocation constructor? And having other destructed again is a
>> programming error (like doing `IntBuffer x; x.~IntBuffer();`, where x is
>> destructed twice) and should not happen. Using the reloc operator
>> guarantees things will be safe.
>>
>> Regards,
>> Sébastien
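
Editorial note: the IntBuffer exchange above can be approximated in standard C++ today. The following is a minimal sketch, not the proposal's mechanism: it assumes relocation is emulated as "move-construct into the destination, then explicitly destroy the source", and it uses raw storage so that no second destructor call is emitted automatically. The names `relocate_at` and `relocation_demo` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <new>
#include <utility>

class IntBuffer {
public:
    IntBuffer() : ptr(new int[42]{}) {}
    // Move constructor: steals the buffer and resets the source so that
    // the source destructor becomes a harmless delete[] nullptr.
    IntBuffer(IntBuffer&& other) noexcept
        : ptr(std::exchange(other.ptr, nullptr)) {}
    IntBuffer(const IntBuffer&) = delete;
    IntBuffer& operator=(const IntBuffer&) = delete;
    ~IntBuffer() { delete[] ptr; }

    const int* data() const noexcept { return ptr; }

private:
    int* ptr;
};

// Hypothetical helper: emulates relocation with today's tools.
// After the call, *src is destroyed and must not be destroyed again.
IntBuffer* relocate_at(void* dst, IntBuffer* src) noexcept {
    IntBuffer* out = ::new (dst) IntBuffer(std::move(*src));
    src->~IntBuffer();
    return out;
}

// Returns true if the relocated object owns the original buffer.
bool relocation_demo() {
    // Raw storage: the compiler emits no automatic destructor calls here,
    // so the caller is responsible for exactly one destruction per object.
    alignas(IntBuffer) unsigned char src_storage[sizeof(IntBuffer)];
    alignas(IntBuffer) unsigned char dst_storage[sizeof(IntBuffer)];

    IntBuffer* src = ::new (src_storage) IntBuffer();
    const int* raw = src->data();

    IntBuffer* dst = relocate_at(dst_storage, src);
    const bool same_buffer = (dst->data() == raw);

    dst->~IntBuffer();  // the only remaining live object
    return same_buffer;
}
```

The reset of `other.ptr` inside the move constructor is exactly the operation a relocation constructor, as described in the thread, would be allowed to skip.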
>>
>> On Tue, Feb 1, 2022 at 5:10 PM Sébastien Bini <sebastien.bini_at_[hidden]>
>> wrote:
>>
>>> Hello again,
>>>
>>> Thank you both for your comments.
>>>
>>> @Gašper Ažman <gasper.azman_at_[hidden]> :
>>> > I'd very much like to see sections on the comparisons with the various
>>> > papers that came in the past. It'll both show you're aware of all the
>>> > historical context as an author, and save the readers the trouble of
>>> > having to think through all of it when reviewing.
>>>
>>> I will work on it. Thank you for pointing out the proposals on the same
>>> topic, I find it hard to search for them. Is there some central place I'm
>>> missing? I only found
>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/ but it's not
>>> convenient to browse through all of them. Likewise, you mentioned this
>>> section should also "answer to the previously surfaced objections"; but do
>>> you know how I find them?
>>>
>>> I will also get in touch with Arthur O'Dwyer and Howard Hinnant. I see
>>> Arthur O'Dwyer is the author of one of the linked proposals but has Howard
>>> Hinnant worked on something similar?
>>>
>>> @Maciej Cencora:
>>> I take it your proposition is to add the reloc keyword to P1029R3. I
>>> honestly did not know of P1029R3, sorry about that. I think it's definitely
>>> something to consider.
>>>
>>> On one hand, I like the new T(T&&) = relocate/bitcopies; move
>>> constructor. I get your point, and it involves fewer changes in the core
>>> language, which is a good thing.
>>>
>>> However, having a dedicated constructor for relocation allows for more
>>> freedom in the relocation implementation. In what I propose, the relocation
>>> constructor can be more than a memcpy, if need be. I don't know if there
>>> are cases where relocation involves more than a memcpy, but it might
>>> (instance tracking?). It's something I need to ponder. But it's definitely
>>> a good point, thanks!
>>>
>>> Regards,
>>> Sébastien
>>>
>>>
>>> On Tue, Feb 1, 2022 at 12:41 PM Maciej Cencora <m.cencora_at_[hidden]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I don't think adding yet another reference type and yet another
>>>> special member function is a good way to solve this, as the complexity
>>>> in this area is already big. I think the solution should reuse as much
>>>> as possible of existing syntax.
>>>> What I think would be better, is that we kept move-constructors, but
>>>> we add a new syntax to mark if a class is relocatable:
>>>> MyClass(MyClass&& other) = relocate;
>>>>
>>>> Marking such a constructor as relocate (no user-provided definition
>>>> allowed) would indicate two things to the compiler: 1) moving is just a
>>>> trivial memcpy, 2) the moved-from object is left in a state (e.g. default
>>>> constructed) where the destructor call has no side effects.
>>>> So current code:
>>>> MyClass newObject(std::move(other));
>>>>
>>>> becomes (pseudo-code):
>>>> MyClass newObject = uninitialized;
>>>> new (&newObject) MyClass(other); // or memcpy
>>>> new (&other) MyClass();
>>>> // when 'other' goes out of scope, its destruction can be skipped
>>>> // because it is a no-op.
>>>>
>>>> So far it does not change anything w.r.t. what we have now.
>>>> But if we add Sébastien's proposed reloc operator, with semantics such
>>>> that it calls the move constructor and marks the source object as
>>>> already destroyed, we get semantic checking that the moved-from
>>>> object cannot be used anymore.
>>>> The reloc operator can be applied to any type that is move-constructible;
>>>> it is just that for types marked as relocatable such an operation can
>>>> be better optimized.
>>>>
>>>> MyClass newObject = reloc other;
>>>> // now other cannot be referenced any more
>>>>
>>>> Since we do not introduce new types of references, or new types of
>>>> member functions, we can gradually migrate code from calling std::move
>>>> to using reloc operator (while preserving ABI and API compatibility).
>>>> This will also finally allow optimal construction for types
>>>> with user-defined constructors:
>>>>
>>>> struct Person
>>>> {
>>>>     Person(std::string firstName, std::string lastName)
>>>>         : firstName(reloc firstName)
>>>>         , lastName(reloc lastName)
>>>>     {}
>>>>
>>>>     std::string firstName, lastName;
>>>> };
>>>>
>>>> Person p1("John", "Doe"); // no temporaries, no move constructors
>>>> Person p2(p1.firstName, p1.lastName); // one copy per argument, no
>>>>                                       // temporaries, no move constructors
>>>>
>>>> Regards,
>>>> Maciej
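
Editorial note: for comparison with the Person example above, here is a sketch of what the same by-value constructor costs in standard C++ with std::move in place of the proposed reloc. It assumes C++17 (guaranteed copy elision for temporaries). `Tracked`, `Counters`, `from_lvalues` and `from_temporaries` are hypothetical names, and `Tracked` stands in for std::string only so that copies and moves can be counted.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Counters {
    int copies = 0;
    int moves = 0;
};

// Stand-in for std::string that records how it was constructed.
class Tracked {
public:
    Tracked(const char* s, Counters& c) : value_(s), counters_(&c) {}
    Tracked(const Tracked& o) : value_(o.value_), counters_(o.counters_) {
        ++counters_->copies;
    }
    Tracked(Tracked&& o) noexcept
        : value_(std::move(o.value_)), counters_(o.counters_) {
        ++counters_->moves;
    }
    const std::string& value() const { return value_; }

private:
    std::string value_;
    Counters* counters_;
};

struct Person {
    // Today's idiom: take by value, then move into the members.
    Person(Tracked first, Tracked last)
        : firstName(std::move(first)), lastName(std::move(last)) {}

    Tracked firstName, lastName;
};

// Arguments are named objects: each is copied into its parameter,
// then moved into the member.
Counters from_lvalues() {
    Counters c;
    Tracked first("John", c), last("Doe", c);
    Person p(first, last);
    (void)p.firstName.value();
    return c;
}

// Arguments are temporaries: each parameter is initialized directly,
// then moved into the member.
Counters from_temporaries() {
    Counters c;
    Person p(Tracked("John", c), Tracked("Doe", c));
    (void)p.lastName.value();
    return c;
}
```

In this sketch the move constructor still runs once per member in both cases, and each moved-from parameter is still destroyed when the constructor returns; those are the steps the message above hopes reloc could remove.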
>>>>
>>>> wt., 1 lut 2022 o 11:04 Gašper Ažman via Std-Proposals
>>>> <std-proposals_at_[hidden]> napisał(a):
>>>> >
>>>> > Hi Sebastien,
>>>> >
>>>> > you sure made a pretty long write-up! What I'm missing on the first
>>>> > skim-through is a thorough review of the currently published papers in
>>>> > the space and answers to the previously surfaced objections.
>>>> >
>>>> > Some of the papers in this space:
>>>> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1144r5.html
>>>> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4158.pdf
>>>> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1029r3.pdf
>>>> > http://open-std.org/JTC1/SC22/WG21/docs/papers/2016/p0023r0.pdf
>>>> >
>>>> > I also couldn't find any way that your proposal handles maybe-destroy
>>>> > references. For instance, how do you handle
>>>> >
>>>> > bool maybe_destroy(T& x) {
>>>> > if (rand() % 2) { T(static_cast<T~>(x)); return true; }
>>>> > return false;
>>>> > }
>>>> >
>>>> > The compiler will still emit the destructor call for x in the caller
>>>> > of maybe_destroy - there's nothing in the interface that communicates
>>>> > whether an object has been destroyed or not, *to the compiler*. Being
>>>> > able to handle this case has been the major stumbling block for pretty
>>>> > much every proposal in this space.
>>>> >
>>>> > G
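
Editorial note: the maybe_destroy problem above can be made concrete in standard C++17. With a plain `T&` parameter the caller has no way to learn whether the object was destroyed, so it always runs the destructor. The sketch below sidesteps that by carrying the "still alive?" flag at run time inside std::optional, which is one possible workaround and not something the thread proposes. `Probe` and `destructor_runs` are hypothetical names, and the random choice from the original snippet is replaced by a `destroy` parameter so the behaviour is deterministic.

```cpp
#include <cassert>
#include <optional>

// Counts how many times its destructor runs.
struct Probe {
    explicit Probe(int& counter) : dtor_calls(&counter) {}
    Probe(const Probe&) = delete;
    Probe& operator=(const Probe&) = delete;
    ~Probe() { ++*dtor_calls; }

    int* dtor_calls;
};

// The optional's engaged flag tells the caller, at run time, whether
// the object has already been destroyed.
bool maybe_destroy(std::optional<Probe>& x, bool destroy) {
    if (destroy) {
        x.reset();  // destroys the contained Probe now
        return true;
    }
    return false;
}

// Returns the number of destructor calls observed for one object.
int destructor_runs(bool destroy) {
    int dtor_calls = 0;
    {
        std::optional<Probe> x;
        x.emplace(dtor_calls);
        maybe_destroy(x, destroy);
        // At scope exit, optional destroys the Probe only if still engaged.
    }
    return dtor_calls;
}
```

Either way the destructor runs exactly once, but only because the flag travels with the object; a bare `T&` interface carries no such information, which is the point raised in the message above.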
>>>> >
>>>> > On Tue, Feb 1, 2022 at 9:26 AM Sébastien Bini via Std-Proposals <
>>>> std-proposals_at_[hidden]> wrote:
>>>> >>
>>>> >> Hello everyone,
>>>> >>
>>>> >> I've worked on a proposal over the last few months to introduce
>>>> >> relocation (a destructive move) in C++. I wrote the proposal (you will
>>>> >> find it enclosed with this email) but it's quite large, so I'll write a
>>>> >> small overview here.
>>>> >>
>>>> >> Relocation allows one object to be "moved" into another. It is similar
>>>> >> to move construction, except that relocation guarantees that moved
>>>> >> objects are destructed right after and cannot be reused.
>>>> >>
>>>> >> This proposal is motivated by:
>>>> >> - the confusing and often improperly implemented "moved-from" state
>>>> >> introduced by the move constructor.
>>>> >> - relocation is simpler to implement than move semantics, as the
>>>> >> relocated object can be left in a dirty, invalid state.
>>>> >> - const variables cannot be moved with C++ move semantics, but they
>>>> >> can be "relocated" with this proposal.
>>>> >>
>>>> >> This introduces three main additions to the language:
>>>> >>
>>>> >> 1. Relocation reference
>>>> >>
>>>> >> A new type of reference, called "relocation reference". A relocation
>>>> >> reference on T is denoted by T~. This reference is mainly introduced
>>>> >> because of the new constructor.
>>>> >>
>>>> >> 2. Relocation constructor
>>>> >>
>>>> >> We introduce a new constructor, the relocation constructor:
>>>> >> class T {
>>>> >> T(T~ other) noexcept;
>>>> >> };
>>>> >>
>>>> >> This constructor has a feature no other C++ constructor has: it acts
>>>> >> as a destructor with regard to its parameter ("other" in the code
>>>> >> sample). Hence when this constructor is called, a new instance is built
>>>> >> as always, but the instance it was built from is destructed. This means
>>>> >> that the destructor of the moved instance must not be called (otherwise
>>>> >> the instance would be destructed twice).
>>>> >>
>>>> >> This constructor is usually quite straightforward to implement: you
>>>> >> simply need to copy all data members from other into the new instance.
>>>> >>
>>>> >> For instance, implementing the relocation constructor for
>>>> >> std::unique_ptr is simple. You simply need to copy the internal pointer
>>>> >> to the new instance. The moved instance can be left untouched, and the
>>>> >> memory it still owns won't be deleted, as the moved instance's
>>>> >> destructor will not be called.
>>>> >> In fact there is so little to do that the default implementation does
>>>> >> the job:
>>>> >> unique_ptr(unique_ptr~) noexcept = default;
>>>> >>
>>>> >> 3. The reloc operator
>>>> >>
>>>> >> Lastly, we introduce a new unary operator: reloc. It will usually be
>>>> >> used like this: auto y = reloc x;
>>>> >> This relocates x into y, leaving x in a destructed state.
>>>> >> This new operator (a) handles the construction of the new instance
>>>> >> (it will use the relocation constructor, the move constructor or the
>>>> >> copy constructor, picked in that order), (b) ensures the destruction of
>>>> >> the relocated instance and (c) prevents any further use in the code of
>>>> >> the relocated instance.
>>>> >>
>>>> >> Consider the following scenario:
>>>> >> const T x;
>>>> >> auto y = reloc x;
>>>> >> // std::cout << x << std::endl;
>>>> >>
>>>> >> The second line builds y from x:
>>>> >> - Case 1: the type of x provides a relocation constructor: The
>>>> >> relocation constructor is called. At the end of the expression x is
>>>> >> considered destructed because of the relocation constructor. The
>>>> >> destructor of x will not be called when its end of scope is reached.
>>>> >> - Case 2: the type of x does not provide a relocation constructor, but
>>>> >> a move or copy constructor: The move constructor is called if it exists
>>>> >> (if not, the copy constructor is called) to construct y from x. reloc
>>>> >> must ensure the destruction of x, so the destructor of x is called at
>>>> >> the end of the evaluation of the second line.
>>>> >>
>>>> >> The third line attempts to reuse a variable that was relocated.
>>>> >> Uncommenting this line will raise a compile error. reloc forbids
>>>> >> further mention of a name that resolves to a relocated object.
>>>> >>
>>>> >> This is it, thank you for reading :)
>>>> >>
>>>> >> Best regards,
>>>> >> Sébastien Bini
>>>> >> --
>>>> >> Std-Proposals mailing list
>>>> >> Std-Proposals_at_[hidden]
>>>> >> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
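
Editorial note: the motivation in the proposal above that const variables cannot be moved can be checked in standard C++. std::move on a const object yields a `const T&&`, which cannot bind to the `T&&` parameter of the move constructor, so overload resolution falls back to the copy constructor. The sketch below illustrates this; `Counted` and the two helper functions are hypothetical names used only for this demonstration.

```cpp
#include <cassert>
#include <utility>

// Each new object records whether it was built by copy or by move.
struct Counted {
    Counted() = default;
    Counted(const Counted& o) : copies(o.copies + 1), moves(o.moves) {}
    Counted(Counted&& o) noexcept : copies(o.copies), moves(o.moves + 1) {}

    int copies = 0;
    int moves = 0;
};

// std::move(x) on a const object selects the copy constructor.
bool move_from_const_copies() {
    const Counted x{};
    Counted y = std::move(x);
    return y.copies == 1 && y.moves == 0;
}

// The same expression on a non-const object selects the move constructor.
bool move_from_non_const_moves() {
    Counted a{};
    Counted b = std::move(a);
    return b.copies == 0 && b.moves == 1;
}
```

Under the proposal as described above, `reloc x` on a const object is meant to avoid this silent fallback to a copy when the type provides a relocation constructor.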

Received on 2022-02-01 17:28:56