std-proposals

Re: [std-proposals] Relocation in C++

From: Edward Catmur <ecatmur_at_[hidden]>
Date: Sun, 18 Sep 2022 20:14:33 +0100
There are other proposals for trivial relocation. We're interested in
solving the problem in full generality, meaning non-trivial relocation.
This is required to support the composition of self-referential and other
address-aware types with immovable (e.g. non-semiregular) or
expensively-movable types, e.g. into aggregate record types, and is
valuable for logging and instrumentation (e.g. debugging).
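A minimal sketch of why bytewise relocation is not enough for address-aware
types, using only today's C++. RawBuffer, FixedBuffer and the helper functions
are hypothetical names chosen for illustration, and the move constructor merely
stands in for a user-written relocation operation; none of this is proposed
syntax.

```cpp
#include <cassert>
#include <cstring>
#include <utility>

// Hypothetical address-aware type: `cursor` is meant to point into this
// object's own `buf`, so the object's address is part of its invariant.
struct RawBuffer {
    char buf[16];
    char* cursor;
};

// Bytewise ("trivial") relocation copies the pointer value unchanged, so the
// copy's cursor still refers to the source object's buffer.
bool bytewise_copy_keeps_invariant() {
    RawBuffer a{};
    a.cursor = a.buf + 3;
    RawBuffer b;
    std::memcpy(&b, &a, sizeof a);
    return b.cursor == b.buf + 3;
}

// Same layout, but with a user-written fix-up step. The move constructor
// stands in for a non-trivial relocation operation.
struct FixedBuffer {
    char buf[16] = {};
    char* cursor = buf;

    FixedBuffer() = default;
    FixedBuffer(FixedBuffer&& other) noexcept
        : cursor(buf + (other.cursor - other.buf)) {  // re-point into *this
        std::memcpy(buf, other.buf, sizeof buf);
    }
};

bool fixup_keeps_invariant() {
    FixedBuffer a;
    a.cursor = a.buf + 3;
    FixedBuffer b(std::move(a));
    return b.cursor == b.buf + 3;
}

// Self-check: the bytewise copy breaks the invariant, the fix-up preserves it.
void check_address_aware() {
    assert(!bytewise_copy_keeps_invariant());
    assert(fixup_keeps_invariant());
}
```

Composing such a type into an aggregate is where the general case matters: the
aggregate can only be relocated if each member's fix-up is run.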


On Sun, 18 Sept 2022 at 17:50, Nikl Kelbon <kelbonage_at_[hidden]> wrote:

> Sorry for not reading the whole big discussion (it would be nice to have
> another platform for this), but has no one yet suggested using a
> const rvalue reference (const T&&) as the relocate constructor tag?
> Then specify the situations in which it is used (NRVO/RVO or similar); when
> this relocate constructor is used, the compiler does not generate a
> destructor call for that variable.
> Then add
> template<typename T>
> constexpr decltype(auto) std::die(T&& v) noexcept {
> return std::move(std::as_const(v));
> }
> And it is undefined behavior to call the destructor on a variable after the
> relocate ctor/operator has been used (for example, a value relocated from a vector).
> (Another way: we can explicitly make it ill-formed to use this ctor
> for non-local variables, or at least for constexpr / global variables...)
> And this constructor is not created implicitly, but can ONLY be = default:
> Type(const Type&&) = default;
> behaves as if:
> constexpr Type(const Type&& v) noexcept {
> memcpy(this, std::addressof(v), sizeof(*this)); // pseudocode
> }
> Type& operator=(const Type&&) = default;
> behaves as if:
> constexpr Type& operator=(const Type&& v) noexcept {
> std::destroy_at(this); // pseudocode
> std::construct_at(this, static_cast<const Type&&>(v));
> return *this;
> }
> This ctor and operator must also be ill-formed if the type has virtual
> inheritance, or if one of its fields has an initializer referring to another
> field of this class, or if a base type is self-referential.
> And it is undefined behavior to call this operator and ctor on
> self-referential types (when that cannot be detected at compile time). (Also,
> implementations can check this at run time in debug builds; it is quite
> possible to generate such an = default operator=, which would check all
> pointers/references in fields or bases.)
> Even if base classes do not have relocate constructors or operators,
> = default will generate them by these rules.
> Then add support for std::is_relocatable_v<T> and STL optimizations based on it.
> Nice?
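For reference, a small sketch of how overload resolution treats a const T&&
constructor in current C++. Probe and its `selected` field are hypothetical,
and `die` here is an ordinary function template at global scope (not in
namespace std) with the body quoted above. It shows only which constructor is
chosen, not any destructor elision, which would need language support.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type that records which constructor was selected.
struct Probe {
    int selected = 0;
    Probe() = default;
    Probe(const Probe&) : selected(1) {}            // copy
    Probe(Probe&&) noexcept : selected(2) {}        // move
    Probe(const Probe&&) noexcept : selected(3) {}  // the suggested "relocate" tag
};

// Same body as the helper suggested in the quoted message.
template <class T>
constexpr decltype(auto) die(T&& v) noexcept {
    return std::move(std::as_const(v));
}

void check_const_rvalue_overload() {
    Probe p;
    Probe copied(p);
    Probe moved(std::move(p));
    Probe tagged(die(p));
    assert(copied.selected == 1);
    assert(moved.selected == 2);   // Probe&& is preferred for non-const rvalues
    assert(tagged.selected == 3);  // only const rvalues reach const Probe&&
}
```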
>
> On Sun, 18 Sept 2022 at 20:27, Edward Catmur via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
>
>> On Fri, 16 Sept 2022 at 11:05, Sébastien Bini <sebastien.bini_at_[hidden]>
>> wrote:
>>
>>> Hi Edward,
>>>
>>> Thank you for those explanations. I'll get in touch with people from
>>> AFNOR when we have a draft ready (ongoing).
>>>
>>> There is still one technical aspect that we haven't talked about:
>>> structured binding relocation.
>>>
>>> I believe that, in order to support relocate-only types, we need to
>>> enable relocation from a structured binding: `auto [x, y] = foo();
>>> sink(reloc y);` must work. This is motivated by:
>>>
>>> - The need to make APIs that support relocate-only types. How do you
>>> write an API to extract an item at an arbitrary position from a vector? I
>>> suggest the following: `std::pair<iterator, T>
>>> vector<T>::pilfer(const_iterator);` (returns next iterator and relocated
>>> vector element) as it is consistent with other vector APIs and complies
>>> with the core guidelines
>>> <https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-out>.
>>> Then, what can users do with the returned object as it lies in a pair, and
>>> that it is forbidden to relocate only parts of an object? The return value
>>> is unusable for relocate-only types, unless we support structured binding
>>> relocation.
>>> - Almost all C++ developers that I know believe that a structured
>>> binding is a complete, separate object, and not a name alias to some
>>> subobject. As such they would find it weird that they cannot relocate a
>>> structured binding.
>>>
>> Yes, all good points, and I agree this would be very useful.
>>
>>
>>> This is not a blocking point IMO, but will be very nice to have.
>>> vector's pilfer can be worked around by rotating the element to the end of
>>> the vector and calling some new vector::pilfer_back method that removes and
>>> returns the last element. There is no performance overhead, it is just
>>> cumbersome to write. Likewise we can just state that we cannot relocate a
>>> structured binding and move on with the proposal. But I'd like to give it
>>> more thought: structured binding relocation sounds like a very convenient
>>> facility, and after all, how hard could it be to just relocate one item of
>>> a pair? Well it's not simple :/
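The rotate-to-the-end workaround can be sketched with today's library. Since
`pilfer_back` does not exist, this version approximates it with a move from
`back()` followed by `pop_back()`; `extract_at` is a hypothetical helper name,
and the caller is assumed to pass a valid index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical helper: remove and return the element at `index`,
// preserving the relative order of the remaining elements.
// Precondition (assumed, not checked): index < v.size().
template <class T>
T extract_at(std::vector<T>& v, std::size_t index) {
    auto pos = v.begin() + static_cast<std::ptrdiff_t>(index);
    std::rotate(pos, pos + 1, v.end());  // shifts the tail left, target ends up last
    T out = std::move(v.back());         // stand-in for the proposed pilfer_back
    v.pop_back();
    return out;
}

void check_extract_at() {
    std::vector<std::unique_ptr<int>> v;
    for (int i = 1; i <= 3; ++i) v.push_back(std::make_unique<int>(i));
    std::unique_ptr<int> first = extract_at(v, 0);
    assert(*first == 1);
    assert(v.size() == 2 && *v[0] == 2 && *v[1] == 3);
}
```

With a move-only element type this already works; for a relocate-only type the
move from `back()` is exactly the step that would need the new facility.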
>>>
>>> Before we dive in, let's agree to reuse the terms used in
>>> https://en.cppreference.com/w/cpp/language/structured_binding . As
>>> such, when we write: `auto [x, y] = foo();` it creates a hidden object `e`
>>> of type `E` (which is the return type of `foo`), `x` and `y` are newly
>>> introduced identifiers that are just aliases to some parts of `e`.
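A short illustration of the hidden object `e`, in current C++17. The function
names are hypothetical. With no ref-qualifier, `e` is a copy of the
initializer and the bindings name parts of that copy; with `auto&`, `e` is a
reference to the source.

```cpp
#include <cassert>
#include <utility>

std::pair<int, int> foo() { return {1, 2}; }

// No ref-qualifier: the bindings alias a hidden copy, so the source is untouched.
bool by_value_binds_to_hidden_copy() {
    std::pair<int, int> p = foo();
    auto [x, y] = p;  // hidden e is a copy of p; x names e.first
    x = 10;
    return p.first == 1 && x == 10 && y == 2;
}

// With auto&: the hidden e refers to p, so writes through x reach p.
bool by_reference_aliases_source() {
    std::pair<int, int> p = foo();
    auto& [x, y] = p;
    x = 10;
    return p.first == 10 && y == 2;
}

void check_hidden_object() {
    assert(by_value_binds_to_hidden_copy());
    assert(by_reference_aliases_source());
}
```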
>>>
>>> The best approach I can think of is that under some conditions, `x` and
>>> `y` are not aliases but individual complete objects, and no hidden object
>>> `e` is created. Hence we can write `reloc x` without violating the rules we
>>> added (which forbids relocation of subobjects). Here are the conditions:
>>>
>>> - there must be no ref qualifiers in the auto declaration (auto [x,
>>> y] = foo(); passes, auto const& [x, y] = foo(); doesn't)
>>> - `E` must not have a user-defined destructor, a user-defined
>>> relocation, move or copy constructor. This should give us the guarantee
>>> that we can safely split an object into individual objects. Thankfully for
>>> us, std::pair and std::tuple comply.
>>>
>>> That part is fine by me. The question is then how do we construct each
>>> individual object that corresponds to the new identifiers we introduced? We
>>> need to distinguish between the three binding protocols supported in C++17:
>>>
>>> - C array: the easiest of the three, each object is constructed
>>> element-wise using the best-suited constructor. That approach remains
>>> compatible with relocation.
>>>
>> Yes. Interestingly, you don't see array prvalues much, since they can't
>> be returned from functions, but they do exist as temporaries (auto&& r =
>> std::type_identity_t<int[2]>{0, 1} constructs a temporary array and binds
>> it to a reference).
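A small sketch of the array temporary mentioned above. It uses a plain alias,
IntPair (a hypothetical name), instead of std::type_identity_t so that it does
not need C++20; the two spellings should denote the same type.

```cpp
#include <cassert>
#include <type_traits>

using IntPair = int[2];  // stands in for std::type_identity_t<int[2]>

int sum_of_temporary_array() {
    // IntPair{0, 1} is an array temporary; binding it to a reference
    // extends its lifetime to that of `r`.
    auto&& r = IntPair{0, 1};
    static_assert(std::is_same<decltype(r), int(&&)[2]>::value,
                  "r is an rvalue reference to an array of two ints");
    return r[0] + r[1];
}

void check_array_temporary() {
    assert(sum_of_temporary_array() == 1);
}
```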
>>
>>>
>>> - Binding to data members: again, no real difficulty here, each
>>> object is constructed using the best-suited constructor from their
>>> corresponding data member. Again that approach remains compatible with
>>> relocation.
>>>
>> Yes, this is fine.
>>
>>>
>>> - Tuple-like binding: this is the hardest of all, and probably the
>>> one we need to support the most. The main problem with tuple-like binding
>>> is that the language doesn't treat std::array, pair and tuples as special.
>>> They are just types like any other, hidden behind the tuple_size,
>>> tuple_element and std::get APIs. We can construct each object using the
>>> member get (or std::get) functions, and that will work fine for copy
>>> constructor and move constructor, but not for relocation. First, the get
>>> functions will return references and not prvalues, so the relocation
>>> constructor will not be selected. Second, the get functions, had they
>>> returned by value, will not be allowed to relocate the subobject they
>>> return. Last, we have no guarantee that tuple_size and all the get functions
>>> return a partition of `E`. It is mandatory to enable construction of the
>>> new objects (`x` and `y`) by relocation, as again, the whole point of this
>>> is to support relocate-only types (so `x` or `y` may be relocate-only).
>>>
>>> The only solution that I can think of is an alternative API for the
>>> tuple-like binding protocol. If that new API is not provided then we
>>> fall back to C++17 structured bindings that cannot be relocated. The API I
>>> think of is:
>>>
>>> - A template static member function get_member that returns the
>>> pointer to I-th data-member: `template <std::size_t I> auto
>>> E::get_member<I>()`
>>> - Or a get_member free function that does the same: `template <class
>>> E, std::size_t I> auto get_member()`
>>>
>>> The return type of the get_member function must be:
>>> `std::tuple_element_t<E, I> (E::*)`. The number of supported get_member
>>> must match that of tuple_size<E>::value. For std::pair, that would be:
>>>
>>> template <class First, class Second>
>>> constexpr auto get_member<std::pair<First, Second>, 0>() { return
>>> &std::pair<First, Second>::first; }
>>> template <class First, class Second>
>>> constexpr auto get_member<std::pair<First, Second>, 1>() { return
>>> &std::pair<First, Second>::second; }
>>>
>>> Let's consider the expression: `E e = foo(); auto [x, y] = reloc e;`. If
>>> E provides the new API then the new objects (`x` and `y`) can be
>>> constructed by relocation. The language needs to track, if any relocation
>>> constructor is used, which subobjects of the source (`e`) got destructively
>>> relocated into a new object of the "structured binding". Then the
>>> destructor of the source object is not called directly. Instead the
>>> language will call the destructor on all the subobjects of the source that
>>> were not relocated (we need to keep in mind that relocation may not happen
>>> for every data member, and that some data members may be hidden from the
>>> tuple-like binding API). Hopefully, thanks to the pointer to data-member
>>> returned by the get_member functions, the language is able to track down
>>> which parts are relocated and which aren't.
>>>
>>> I thought of just taking the address of whatever std::get returns,
>>> instead of introducing a new API. However I don't believe we have any
>>> guarantee that std::get returns a reference, and even a reference of some
>>> subobject that lives within the tuple-like object (it could very well be a
>>> reference to some static variable...). get_member solves both those
>>> problems.
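A sketch of a tuple-like type whose `get` does not return a reference to any
subobject, valid in C++17 today. Division is a hypothetical type made up for
illustration: its elements are computed on demand and returned by value, so
there is no address inside the object that a relocation could take them from.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>

// Hypothetical tuple-like type: the "elements" are the quotient and the
// remainder, neither of which is stored as a data member.
struct Division {
    int num;
    int den;

    template <std::size_t I>
    auto get() const {
        if constexpr (I == 0) return num / den;  // computed on demand
        else return num % den;
    }
};

namespace std {
template <>
struct tuple_size<Division> : integral_constant<size_t, 2> {};
template <size_t I>
struct tuple_element<I, Division> { using type = int; };
}  // namespace std

int bind_on_demand() {
    auto [q, r] = Division{7, 2};  // q and r bind to by-value results of get
    return q * 10 + r;
}

void check_on_demand_get() {
    assert(bind_on_demand() == 31);  // 7 / 2 == 3, 7 % 2 == 1
}
```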
>>>
>>> I am not sure this is the best solution there is, I am just sharing my
>>> thoughts about the subject. If get_member seems viable, we can easily
>>> provide an implementation for pair and tuples. I wonder what can be done
>>> with std::array (we cannot take the pointer to data-members that are array
>>> elements, although the language considers array elements as subobjects...).
>>>
>>
>> Yes, I agree this would be nice to have, but as well as the problems you
>> have noticed with `get_member`, there is also the problem that pointer to
>> data member of a nested (not base class) subobject cannot be formed, and
>> also that there could be tuple-like classes that form some elements "on
>> demand".
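To illustrate the nested-subobject limitation in current C++: Inner, Outer and
read_nested are hypothetical names. A member of a nested member can only be
reached by applying two pointers to members in sequence; no single
`int Outer::*` value designates `in.a`.

```cpp
#include <cassert>

struct Inner { int a; int b; };
struct Outer { Inner in; int c; };

int read_nested(const Outer& o) {
    Inner Outer::* to_inner = &Outer::in;  // pointer to a direct member
    int Inner::* to_a = &Inner::a;         // pointer to a member of Inner
    // The two cannot be composed into one `int Outer::*`, so a get_member
    // returning a single pointer to member cannot name `in.a`.
    return (o.*to_inner).*to_a;
}

void check_nested_member() {
    assert(read_nested(Outer{{4, 5}, 6}) == 4);
}
```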
>>
>> I have another idea, which is to use recursion in the manner of
>> operator->(). For example, let's say that structured binding for
>> tuple-like class types first looks for
>> `get<std::make_index_sequence<std::tuple_size<E>::value>>(reloc e)` (the
>> exact syntax isn't important, but it should be something clearly novel and
>> `e` should be prvalue if the original object was). Then, whatever this
>> call returns (which must have the same `tuple_size` as `E`) is in turn
>> submitted for structured binding (as a prvalue).
>>
>> Of course, library authors will then need to create a struct type with
>> the appropriate number of fields, but it won't be difficult for compiler
>> authors to write an intrinsic to do that, and once it's available for
>> std::tuple and std::array then everyone else can just recurse to those
>> types. Anyone who can't or doesn't want to recurse to the Library can use
>> preprocessor hackery, other code generation, or possibly metaclasses (once
>> those are available), to support any sensible number of elements.
>>
>> It'd be ugly to convert `std::array<T, N>` to essentially `struct { T t0,
>> t1, ...tn-1; };` but N should be small (at least until we get variadic
>> structured binding) and compiler authors can always hack in a better
>> solution for privileged Library classes as long as the general case can be
>> made to work for non-Standard library authors. A more ambitious solution
>> would be to allow returning arrays from functions.
>>
>> For example, one could write:
>>
>> template<std::size_t... I, class... T> requires
>> std::same_as<std::index_sequence<I...>, std::index_sequence_for<T...>>
>> auto get<std::index_sequence<I...>>(my_tuple<T...> t) {
>> union { my_tuple<T...> tt; } = {.tt = reloc t}; // prevent destructor
>> return std::tuple<T...>(std::relocate(&get<I>(tt))...);
>> }
>>
>>> --
>> Std-Proposals mailing list
>> Std-Proposals_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>
>

Received on 2022-09-18 19:14:46