Re: [std-proposals] Relocation in C++

From: Edward Catmur <ecatmur_at_[hidden]>
Date: Mon, 19 Sep 2022 12:42:44 +0100
On Mon, 19 Sept 2022 at 05:55, Nikl Kelbon <kelbonage_at_[hidden]> wrote:

> looks like move semantics

Yes. The difference between C++11 move semantics and relocation is that
move semantics leaves the source object in a constructed (usually
unspecified, but always destructible) state, whereas relocation ends the
lifetime of the source object (the destructor is not called). This permits
various improvements to performance and program correctness.
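
To make the distinction concrete, here is a minimal sketch in today's C++. True relocation cannot be expressed yet, so the hypothetical helper `relocate_at_sketch` emulates it as "move-construct the destination, then destroy the source" (a real relocation would skip that destructor call). `Counted` and the two `live_after_*` functions are illustrative names only, not part of any proposal.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Counts how many objects are currently within their lifetime.
struct Counted {
    static inline int live = 0;
    std::string payload;
    explicit Counted(std::string s) : payload(std::move(s)) { ++live; }
    Counted(Counted&& other) noexcept : payload(std::move(other.payload)) { ++live; }
    ~Counted() { --live; }
};

// Hypothetical helper, NOT a standard function: emulates relocation as
// move-construct at `dst` followed by ending the lifetime of `*src`.
template <class T>
T* relocate_at_sketch(T* src, void* dst) {
    T* out = ::new (dst) T(std::move(*src));
    std::destroy_at(src);  // after this, `*src` must not be touched again
    return out;
}

// Move semantics: the source stays alive (and destructible) after the move.
int live_after_move() {
    Counted a("hello");
    Counted b(std::move(a));
    return Counted::live;  // both `a` and `b` are still alive here
}

// Emulated relocation: only the destination is alive afterwards.
int live_after_relocate() {
    alignas(Counted) unsigned char src_buf[sizeof(Counted)];
    alignas(Counted) unsigned char dst_buf[sizeof(Counted)];
    Counted* src = ::new (src_buf) Counted("hello");
    Counted* dst = relocate_at_sketch(src, dst_buf);
    int live = Counted::live;  // the source's lifetime has already ended
    std::destroy_at(dst);
    return live;
}
```

With move semantics two objects are alive after the transfer and two destructors eventually run; with the emulated relocation only the destination remains.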

> On Mon, 19 Sept 2022 at 00:14, Edward Catmur <ecatmur_at_[hidden]> wrote:
>> There are other proposals for trivial relocation. We're interested in
>> solving the problem in full generality, meaning non-trivial relocation.
>> This is required to support the composition of self-referential and other
>> address-aware types with immovable (e.g. non-semiregular) or
>> expensively-movable types, e.g. into aggregate record types, and is
>> valuable for logging and instrumentation (e.g. debugging).
>> On Sun, 18 Sept 2022 at 17:50, Nikl Kelbon <kelbonage_at_[hidden]> wrote:
>>> Sorry for not reading the whole big discussion (it would be nice to have
>>> another platform for this), but has no one yet suggested using a
>>> const rvalue reference (const T&&) as the relocate constructor tag?
>>> Then specify the situations in which it is used (NRVO/RVO or similar);
>>> when this relocate constructor is used, the compiler does not generate a
>>> destructor call for that variable.
>>> Then add
>>> template<typename T>
>>> constexpr decltype(auto) std::die(T&& v) noexcept {
>>> return std::move(std::as_const(v));
>>> }
>>> And it is undefined behavior to call the destructor on a variable after
>>> the relocate ctor/operator has been used (for example, on a value
>>> relocated from a vector).
>>> (Another way: we can explicitly make it ill-formed to use this ctor
>>> for non-local variables, or at least for constexpr / global variables...)
>>> And this constructor is not created implicitly, but can ONLY be
>>> = default:
>>> Type(const Type&&) = default;
>>> behaves as if:
>>> constexpr Type(const Type&& v) noexcept {
>>> memcpy(this, std::addressof(v), sizeof(*this)); // pseudocode
>>> }
>>> Type& operator=( const Type&&) = default;
>>> behaves as if:
>>> constexpr Type& operator=(const Type&& v) noexcept {
>>> std::destroy_at(this); // pseudocode
>>> std::construct_at(this, static_cast<const Type&&>(v)); // pseudocode
>>> return *this;
>>> }
>>> This ctor and operator must also be ill-formed if the type has virtual
>>> inheritance, if one of its fields has an initializer referring to another
>>> field of this class, or if a base type is self-referential.
>>> And it is undefined behavior to call this operator and ctor on
>>> self-referential types (when that cannot be deduced at compile time).
>>> (Implementations can also check this at runtime in debug builds; it is
>>> really possible to generate a = default operator= that checks all
>>> pointers/references in fields or bases.)
>>> Even if base classes do not have relocate constructors or operators,
>>> = default will generate them by those rules.
>>> Then add support for std::is_relocatable_v<T> and STL optimizations with it.
>>> Nice?
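
As a small check of the overload-resolution part of this suggestion, the sketch below uses only today's C++. `die_sketch` mirrors the body of the proposed `std::die` (which does not exist in the standard library), and `Probe` is a hypothetical type that records which constructor was selected. It shows only that a `const T&&` argument picks a `const T&&` constructor; it does not suppress any destructor call.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Same body as the `std::die` suggested above; the name is hypothetical.
template <class T>
constexpr decltype(auto) die_sketch(T&& v) noexcept {
    return std::move(std::as_const(v));
}

// Records which constructor overload resolution picked.
struct Probe {
    int which = 0;
    Probe() = default;
    Probe(const Probe&) : which(1) {}            // copy
    Probe(Probe&&) noexcept : which(2) {}        // move
    Probe(const Probe&&) noexcept : which(3) {}  // the suggested "relocate" tag
};

// For an lvalue argument the helper yields a const rvalue reference.
static_assert(std::is_same_v<decltype(die_sketch(std::declval<Probe&>())),
                             const Probe&&>);

int selected_by_die() {
    Probe p;
    Probe q(die_sketch(p));
    return q.which;  // the const Probe&& overload
}

int selected_by_move() {
    Probe p;
    Probe q(std::move(p));
    return q.which;  // the ordinary move constructor
}
```

Note that `p` is still destroyed at the end of each function here; skipping that destructor is exactly the part that would need a language change.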
>>> On Sun, 18 Sept 2022 at 20:27, Edward Catmur via Std-Proposals <
>>> std-proposals_at_[hidden]> wrote:
>>>> On Fri, 16 Sept 2022 at 11:05, Sébastien Bini <sebastien.bini_at_[hidden]>
>>>> wrote:
>>>>> Hi Edward,
>>>>> Thank you for those explanations. I'll get in touch with people from
>>>>> AFNOR when we have a draft ready (ongoing).
>>>>> There is still one technical aspect that we haven't talked about:
>>>>> structured binding relocation.
>>>>> I believe that, in order to support relocate-only types, we need to
>>>>> enable relocation from a structured binding: `auto [x, y] = foo();
>>>>> sink(reloc y);` must work. This is motivated by:
>>>>> - The need to make APIs that support relocate-only types. How do
>>>>> you write an API to extract an item at an arbitrary position from a vector?
>>>>> I suggest the following: `std::pair<iterator, T>
>>>>> vector<T>::pilfer(const_iterator);` (returns next iterator and relocated
>>>>> vector element) as it is consistent with other vector APIs and complies
>>>>> with the core guidelines
>>>>> <https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-out>.
>>>>> Then, what can users do with the returned object as it lies in a pair, and
>>>>> that it is forbidden to relocate only parts of an object? The return value
>>>>> is unusable for relocate-only types, unless we support structured binding
>>>>> relocation.
>>>>> - Almost all C++ developers that I know believe that a structured
>>>>> binding is a complete, separate object, and not a name alias to some
>>>>> subobject. As such they would find it weird that they cannot relocate a
>>>>> structured binding.
>>>> Yes, all good points, and I agree this would be very useful.
>>>>> This is not a blocking point IMO, but will be very nice to have.
>>>>> vector's pilfer can be worked around by rotating the element to the end of
>>>>> the vector and calling some new vector::pilfer_back method that removes and
>>>>> returns the last element. There is no performance overhead, it is just
>>>>> cumbersome to write. Likewise we can just state that we cannot relocate a
>>>>> structured binding and move on with the proposal. But I'd like to give it
>>>>> more thought: structured binding relocation sounds like a very convenient
>>>>> facility, and after all, how hard could it be to just relocate one item of
>>>>> a pair? Well it's not simple :/
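
The rotate-then-remove workaround described above can be sketched in today's C++ with a move standing in for relocation. `pilfer_sketch` is a hypothetical free function, not a proposed API, and unlike the suggested `pilfer` it returns only the element rather than a `std::pair<iterator, T>`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the pilfer idea. Precondition: pos < v.size().
template <class T>
T pilfer_sketch(std::vector<T>& v, std::size_t pos) {
    assert(pos < v.size());
    auto it = v.begin() + static_cast<std::ptrdiff_t>(pos);
    // Rotate the chosen element to the back; the others keep their order.
    std::rotate(it, it + 1, v.end());
    // A move stands in for relocation; pop_back then destroys the
    // moved-from element, which a real relocation would not need to do.
    T taken = std::move(v.back());
    v.pop_back();
    return taken;
}

// Returns the extracted value followed by the remaining values.
std::vector<int> demo_pilfer() {
    std::vector<std::unique_ptr<int>> v;
    for (int i : {10, 20, 30, 40}) v.push_back(std::make_unique<int>(i));
    std::unique_ptr<int> taken = pilfer_sketch(v, 1);
    std::vector<int> out{*taken};
    for (const auto& p : v) out.push_back(*p);
    return out;
}
```

This works for move-only types such as `std::unique_ptr`; a relocate-only type would need the hypothetical `pilfer_back` mentioned above instead of `std::move` plus `pop_back`.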
>>>>> Before we dive in, let's agree to reuse the terms used in
>>>>> https://en.cppreference.com/w/cpp/language/structured_binding . As
>>>>> such, when we write: `auto [x, y] = foo();` it creates a hidden object `e`
>>>>> of type `E` (which is the return type of `foo`), `x` and `y` are newly
>>>>> introduced identifiers that are just aliases to some parts of `e`.
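
The aliasing behaviour described in this paragraph can be observed in C++17 as it stands. In the sketch below, `Record` is a hypothetical aggregate; the first function shows reference bindings naming the members of the original object, and the second shows that by-value bindings name members of a single hidden copy rather than independent objects.

```cpp
#include <cassert>
#include <cstddef>

struct Record {  // hypothetical example type
    int id;
    long weight;
};

// With `auto&`, the hidden entity `e` is a reference to `r`,
// so the bindings name r's own members.
bool reference_bindings_alias_members() {
    Record r{1, 2};
    auto& [x, y] = r;
    return &x == &r.id && &y == &r.weight;
}

// With plain `auto`, the hidden object `e` is a copy of `r`. The bindings
// are still members of that one object: they sit at the same relative
// offsets as the members of Record, and writing to `x` leaves `r` untouched.
bool value_bindings_share_one_hidden_object() {
    Record r{1, 2};
    auto [x, y] = r;
    x = 42;
    std::ptrdiff_t gap = reinterpret_cast<const char*>(&y) -
                         reinterpret_cast<const char*>(&x);
    std::ptrdiff_t expected = static_cast<std::ptrdiff_t>(
        offsetof(Record, weight) - offsetof(Record, id));
    return r.id == 1 && gap == expected;
}
```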
>>>>> The best approach I can think of is that under some conditions, `x`
>>>>> and `y` are not aliases but individual complete objects, and no hidden
>>>>> object `e` is created. Hence we can write `reloc x` without violating the
>>>>> rules we added (which forbid relocation of subobjects). Here are the
>>>>> conditions:
>>>>> - there must be no ref qualifiers in the auto declaration (auto
>>>>> [x, y] = foo(); passes, auto const& [x, y] = foo(); doesn't)
>>>>> - `E` must not have a user-defined destructor, a user-defined
>>>>> relocation, move or copy constructor. This should give us the guarantee
>>>>> that we can safely split an object into individual objects. Thankfully for
>>>>> us, std::pair and std::tuple comply.
>>>>> That part is fine by me. The question is then how do we construct each
>>>>> individual object that corresponds to the new identifiers we introduced? We
>>>>> need to distinguish between the three binding protocols supported in C++17:
>>>>> - C array: the easiest of the three, each object is constructed
>>>>> element-wise using the best-suited constructor. That approach remains
>>>>> compatible with relocation.
>>>> Yes. Interestingly, you don't see array prvalues much, since they
>>>> can't be returned from functions, but they do exist as temporaries
>>>> (auto&& r = std::type_identity_t<int[2]>{0, 1} constructs a temporary array
>>>> and binds it to a reference).
>>>>> - Binding to data members: again, no real difficulty here, each
>>>>> object is constructed using the best-suited constructor from their
>>>>> corresponding data member. Again that approach remains compatible with
>>>>> relocation.
>>>> Yes, this is fine.
>>>>> - Tuple-like binding: this is the hardest of all, and probably the
>>>>> one we need to support the most. The main problem with tuple-like binding
>>>>> is that the language doesn't treat std::array, pair and tuples as special.
>>>>> They are just types like any other, hidden behind the tuple_size,
>>>>> tuple_element and std::get APIs. We can construct each object using the
>>>>> member get (or std::get) functions, and that will work fine for copy
>>>>> constructor and move constructor, but not for relocation. First, the get
>>>>> functions will return references and not prvalues, so the relocation
>>>>> constructor will not be selected. Second, the get functions, had they
>>>>> returned by value, will not be allowed to relocate the subobject they
>>>>> return. Last, we have no guarantee that tuple_size and all the get functions
>>>>> return a partition of `E`. It is mandatory to enable construction of the
>>>>> new objects (`x` and `y`) by relocation, as again, the whole point of this
>>>>> is to support relocate-only types (so `x` or `y` may be relocate-only).
>>>>> The only solution that I can think of is an alternative API for the
>>>>> tuple-like binding protocol. If that new API is not provided then we
>>>>> fall back to C++17 structured bindings that cannot be relocated. The API I
>>>>> think of is:
>>>>> - A template static member function get_member that returns the
>>>>> pointer to I-th data-member: `template <std::size_t I> auto
>>>>> E::get_member<I>()`
>>>>> - Or a get_member free function that does the same: `template
>>>>> <class E, std::size_t I> auto get_member()`
>>>>> The return type of the get_member function must be:
>>>>> `std::tuple_element_t<I, E> (E::*)`. The number of supported get_member
>>>>> must match that of tuple_size<E>::value. For std::pair, that would be:
>>>>> template <class First, class Second>
>>>>> constexpr auto get_member<std::pair<First, Second>, 0> { return
>>>>> &std::pair<First, Second>::first; }
>>>>> template <class First, class Second>
>>>>> constexpr auto get_member<std::pair<First, Second>, 1> { return
>>>>> &std::pair<First, Second>::second; }
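
Function templates cannot be partially specialized, so the snippet above does not compile as written. One way to get the same effect in C++17 is to route through a class template. In this sketch `member_pointer` is a hypothetical helper name, and `get_member` is the proposed (not existing) customization point.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical trait: maps (E, I) to a pointer to the I-th data member of E.
template <class E, std::size_t I>
struct member_pointer;

template <class First, class Second>
struct member_pointer<std::pair<First, Second>, 0> {
    static constexpr auto value = &std::pair<First, Second>::first;
};

template <class First, class Second>
struct member_pointer<std::pair<First, Second>, 1> {
    static constexpr auto value = &std::pair<First, Second>::second;
};

// Free-function form of the proposed get_member (an assumption, not real API).
template <class E, std::size_t I>
constexpr auto get_member() {
    return member_pointer<E, I>::value;
}

using P = std::pair<int, std::string>;

// The return type has the shape required above.
static_assert(std::is_same_v<decltype(get_member<P, 0>()),
                             std::tuple_element_t<0, P> P::*>);
static_assert(std::is_same_v<decltype(get_member<P, 1>()),
                             std::tuple_element_t<1, P> P::*>);

// Applying the member pointers reaches the actual subobjects of `p`.
int first_of(P& p) { return p.*get_member<P, 0>(); }
const std::string& second_of(P& p) { return p.*get_member<P, 1>(); }
```

Because the result is a pointer to data member rather than a reference, a compiler could in principle tell which subobject of `E` each binding corresponds to, which is the property the tracking described below relies on.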
>>>>> Let's consider the expression: `E e = foo(); auto [x, y] = reloc e;`.
>>>>> If E provides the new API then the new objects (`x` and `y`) can be
>>>>> constructed by relocation. The language needs to track, if any relocation
>>>>> constructor is used, which subobjects of the source (`e`) got destructively
>>>>> relocated into a new object of the "structured binding". Then the
>>>>> destructor of the source object is not called directly. Instead the
>>>>> language will call the destructor on all the subobjects of the source that
>>>>> were not relocated (we need to keep in mind that relocation may not happen
>>>>> for every data-member, and that some data-member may be hidden from the
>>>>> tuple-like binding API). Hopefully, thanks to the pointer to data-member
>>>>> returned by the get_member functions, the language is able to track down
>>>>> which parts are relocated and which aren't.
>>>>> I thought of just taking the address of whatever std::get returns,
>>>>> instead of introducing a new API. However I don't believe we have any
>>>>> guarantee that std::get returns a reference, and even a reference of some
>>>>> subobject that lives within the tuple-like object (it could very well be a
>>>>> reference to some static variable...). get_member solves both those
>>>>> problems.
>>>>> I am not sure this is the best solution there is, I am just sharing my
>>>>> thoughts about the subject. If get_member seems viable, we can easily
>>>>> provide an implementation for pair and tuples. I wonder what can be done
>>>>> with std::array (we cannot take the pointer to data-members that are array
>>>>> elements, although the language considers array elements as subobjects...).
>>>> Yes, I agree this would be nice to have, but as well as the problems
>>>> you have noticed with `get_member`, there is also the problem that pointer
>>>> to data member of a nested (not base class) subobject cannot be formed, and
>>>> also that there could be tuple-like classes that form some elements "on
>>>> demand".
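
A tuple-like class that forms an element "on demand" is already legal in C++17, which illustrates why neither `std::get` nor a pointer-to-member API can be assumed to describe a partition of the object. In this sketch `Temperature` is a hypothetical type whose second element is computed and has no backing data member.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical tuple-like type: element 0 is stored, element 1 is computed.
struct Temperature {
    double celsius;

    template <std::size_t I>
    double get() const {
        if constexpr (I == 0) {
            return celsius;
        } else {
            return celsius * 9.0 / 5.0 + 32.0;  // Fahrenheit, formed on demand
        }
    }
};

// Opt in to the tuple-like structured binding protocol.
namespace std {
template <>
struct tuple_size<Temperature> : integral_constant<size_t, 2> {};
template <size_t I>
struct tuple_element<I, Temperature> {
    using type = double;
};
}  // namespace std

// For comparison, std::get on a std::pair rvalue returns a reference,
// not a prvalue.
static_assert(std::is_same_v<
              decltype(std::get<0>(std::declval<std::pair<int, int>&&>())),
              int&&>);

double fahrenheit_via_binding() {
    auto [c, f] = Temperature{100.0};  // `f` has no data member behind it
    return f;
}
```

There is no `double Temperature::*` that could be returned for index 1, so a `get_member`-style API cannot cover this type.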
>>>> I have another idea, which is to use recursion in the manner of
>>>> operator->(). For example, let's say that structured binding for
>>>> tuple-like class types first looks for
>>>> `get<std::make_index_sequence<std::tuple_size<E>::value>>(reloc e)` (the
>>>> exact syntax isn't important, but it should be something clearly novel and
>>>> `e` should be prvalue if the original object was). Then, whatever this
>>>> call returns (which must have the same `tuple_size` as `E`) is in turn
>>>> submitted for structured binding (as a prvalue).
>>>> Of course, library authors will then need to create a struct type with
>>>> the appropriate number of fields, but it won't be difficult for compiler
>>>> authors to write an intrinsic to do that, and once it's available for
>>>> std::tuple and std::array then everyone else can just recurse to those
>>>> types. Anyone who can't or doesn't want to recurse to the Library can use
>>>> preprocessor hackery, other code generation, or possibly metaclasses (once
>>>> those are available), to support any sensible number of elements.
>>>> It'd be ugly to convert `std::array<T, N>` to essentially `struct { T
>>>> t0, t1, ...tn-1; };` but N should be small (at least until we get variadic
>>>> structured binding) and compiler authors can always hack in a better
>>>> solution for privileged Library classes as long as the general case can be
>>>> made to work for non-Standard library authors. A more ambitious solution
>>>> would be to allow returning arrays from functions.
>>>> For example, one could write:
>>>> template<std::size_t... I, class... T> requires
>>>> std::same_as<std::index_sequence<I...>, std::index_sequence_for<T...>>
>>>> auto get<std::index_sequence<I...>>(my_tuple<T...> t) {
>>>> union { my_tuple<T...> tt; } = {.tt = reloc t}; // prevent destructor
>>>> return std::tuple<T...>(std::relocate(&get<I>(tt))...);
>>>> }
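
Since `reloc` and `std::relocate` are not available today, the same shape can only be approximated. The sketch below assumes a hypothetical two-element aggregate `MyPair`, uses a union to suppress the whole-object destructor as in the example above, and emulates per-element relocation with the hypothetical helper `take` (move-construct, then destroy the source member). `Tracked` exists only to count live objects. It is not exception-safe: a throwing move would leak the remaining members.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <tuple>
#include <utility>

struct Tracked {  // counts objects currently within their lifetime
    static inline int live = 0;
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n)) { ++live; }
    Tracked(Tracked&& o) noexcept : name(std::move(o.name)) { ++live; }
    ~Tracked() { --live; }
};

template <class A, class B>
struct MyPair {  // hypothetical tuple-like aggregate
    A first;
    B second;
};

// Emulated per-member relocation: move out, then end the member's lifetime.
template <class T>
T take(T& src) {
    T out(std::move(src));
    std::destroy_at(std::addressof(src));
    return out;
}

template <class A, class B>
std::tuple<A, B> split_sketch(MyPair<A, B> source) {
    // The union's empty destructor means `value` is never destroyed as a
    // whole; each member is destroyed individually inside take().
    union Holder {
        MyPair<A, B> value;
        explicit Holder(MyPair<A, B>&& v) : value(std::move(v)) {}
        ~Holder() {}
    } holder(std::move(source));
    // Braced initialization evaluates the two take() calls left to right.
    return std::tuple<A, B>{take(holder.value.first),
                            take(holder.value.second)};
}
```

After the call, the only live `Tracked` objects are the two inside the returned tuple, so every member was destroyed exactly once even though the `MyPair` destructor never ran on the held copy.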
>>>>> --
>>>> Std-Proposals mailing list
>>>> Std-Proposals_at_[hidden]
>>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2022-09-19 11:42:57