Re: [std-proposals] Relocation in C++

From: Nikl Kelbon <kelbonage_at_[hidden]>
Date: Mon, 19 Sep 2022 09:55:10 +0500
looks like move semantics

On Mon, 19 Sep 2022 at 00:14, Edward Catmur <ecatmur_at_[hidden]> wrote:

> There are other proposals for trivial relocation. We're interested in
> solving the problem in full generality, meaning non-trivial relocation.
> This is required to support the composition of self-referential and other
> address-aware types with immovable (e.g. non-semiregular) or
> expensively-movable types, e.g. into aggregate record types, and is
> valuable for logging and instrumentation (e.g. debugging).
>
>
> On Sun, 18 Sept 2022 at 17:50, Nikl Kelbon <kelbonage_at_[hidden]> wrote:
>
>> Sorry for not reading the whole long discussion (it would be nice to have
>> another platform for this), but has no one yet suggested using a
>> const rvalue reference (const T&&) as the relocation constructor tag?
>> Then specify the situations in which it is used (NRVO/RVO or similar);
>> when this relocation constructor is used, the compiler does not generate a
>> destructor call for that variable.
>> Then add
>> template<typename T>
>> constexpr decltype(auto) std::die(T&& v) noexcept {
>> return std::move(std::as_const(v));
>> }
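[Editorial sketch] The helper above cannot be added to namespace std by user code, so the following places it in a hypothetical namespace `sketch` to show what it does in today's C++: it turns its argument into a `const T&&`, which overload resolution prefers to bind to a `const T&&` constructor. The `Tracker` type and `selected_overload` function are assumptions made up for illustration; nothing here suppresses a destructor call, which would need the language change being proposed.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {

// Same body as the proposed std::die, placed outside namespace std.
template <typename T>
constexpr decltype(auto) die(T&& v) noexcept {
    return std::move(std::as_const(v));  // yields const remove_reference_t<T>&&
}

// Hypothetical type that records which constructor was selected.
struct Tracker {
    int which = 0;
    Tracker() = default;
    Tracker(const Tracker&) : which(1) {}            // copy
    Tracker(Tracker&&) noexcept : which(2) {}        // move
    Tracker(const Tracker&&) noexcept : which(3) {}  // the proposed "relocate" tag
};

static_assert(std::is_same_v<decltype(die(std::declval<Tracker&>())),
                             const Tracker&&>);

// Constructs a Tracker from die(a) and reports the chosen overload.
inline int selected_overload() {
    Tracker a;
    Tracker b(die(a));
    return b.which;
}

}  // namespace sketch
```

In this sketch `a` is still destroyed normally at the end of `selected_overload`; only the overload selection is demonstrated.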
>> And it is undefined behavior to call the destructor on a variable after
>> the relocation ctor/operator has been used (for example, on a value
>> relocated from a vector).
>> (Another way: we can explicitly make it ill-formed to use this ctor
>> on non-local variables, or at least on constexpr/global variables...)
>> And this constructor is not generated implicitly; it can ONLY be defaulted:
>> Type(const Type&&) = default;
>> behaves as if:
>> constexpr Type(const Type&& v) noexcept {
>> memcpy(this, std::addressof(v), sizeof(*this)); // pseudocode
>> }
>> Type& operator=( const Type&&) = default;
>> behaves as if:
>> constexpr Type& operator=(const Type&& v) noexcept {
>> std::destroy_at(this); // pseudocode
>> std::construct_at(this, static_cast<const Type&&>(v));
>> return *this;
>> }
>> This ctor and operator must also be ill-formed if the type has virtual
>> inheritance, or if one of its fields has an initializer referring to
>> another field of this class, or if a base type is self-referential.
>> And it is undefined behavior to call this operator and ctor on
>> self-referential types (when that cannot be detected at compile time).
>> (Also, implementations can check it at runtime in debug builds; it is
>> entirely possible to generate such a = default operator=, which will check
>> all pointers/references in fields or bases.)
>> Even if base classes do not have relocation constructors or operators,
>> = default will generate them by those rules.
>> Then add support for std::is_relocatable_v<T> and STL optimizations with it.
>> Nice?
>>
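[Editorial sketch] To make the self-reference concern concrete, here is a minimal example of why a bytewise (memcpy-style) relocation is not enough for address-aware types. `SelfRef` and `bytewise_copy_keeps_old_address` are hypothetical names chosen for illustration. The type is deliberately trivially copyable so that the `memcpy` itself is well-defined; the point is only that the copied pointer still refers to the source object.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

namespace sketch {

// Hypothetical self-referential type: cursor points into the object's own buf.
struct SelfRef {
    char buf[8];
    char* cursor;
    SelfRef() : buf{}, cursor(buf) {}
};

// Trivially copyable, so copying its bytes with memcpy is permitted.
static_assert(std::is_trivially_copyable_v<SelfRef>);

// Returns true if, after a bytewise copy, dst.cursor still points at src.buf
// rather than at dst.buf.
inline bool bytewise_copy_keeps_old_address() {
    SelfRef src;
    SelfRef dst;
    assert(src.cursor == src.buf);  // invariant holds before the copy
    std::memcpy(&dst, &src, sizeof(SelfRef));
    return dst.cursor == src.buf && dst.cursor != dst.buf;
}

}  // namespace sketch
```

If the source were destroyed after such a copy, `dst.cursor` would dangle, which is the kind of case a non-trivial relocation operation would have to fix up.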
>> On Sun, 18 Sep 2022 at 20:27, Edward Catmur via Std-Proposals <
>> std-proposals_at_[hidden]> wrote:
>>
>>> On Fri, 16 Sept 2022 at 11:05, Sébastien Bini <sebastien.bini_at_[hidden]>
>>> wrote:
>>>
>>>> Hi Edward,
>>>>
>>>> Thank you for those explanations. I'll get in touch with people from
>>>> AFNOR when we have a draft ready (ongoing).
>>>>
>>>> There is still one technical aspect that we haven't talked about:
>>>> structured binding relocation.
>>>>
>>>> I believe that, in order to support relocate-only types, we need to
>>>> enable relocation from a structured binding: `auto [x, y] = foo();
>>>> sink(reloc y);` must work. This is motivated by:
>>>>
>>>> - The need to make APIs that support relocate-only types. How do
>>>> you write an API to extract an item at an arbitrary position from a vector?
>>>> I suggest the following: `std::pair<iterator, T>
>>>> vector<T>::pilfer(const_iterator);` (returns next iterator and relocated
>>>> vector element) as it is consistent with other vector APIs and complies
>>>> with the core guidelines
>>>> <https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-out>.
>>>> Then, what can users do with the returned object as it lies in a pair, and
>>>> that it is forbidden to relocate only parts of an object? The return value
>>>> is unusable for relocate-only types, unless we support structured binding
>>>> relocation.
>>>> - Almost all C++ developers that I know believe that a structured
>>>> binding is a complete, separate object, and not a name alias to some
>>>> subobject. As such they would find it weird that they cannot relocate a
>>>> structured binding.
>>>>
>>> Yes, all good points, and I agree this would be very useful.
>>>
>>>
>>>> This is not a blocking point IMO, but will be very nice to have.
>>>> vector's pilfer can be worked around by rotating the element to the end of
>>>> the vector and calling some new vector::pilfer_back method that removes and
>>>> returns the last element. There is no performance overhead, it is just
>>>> cumbersome to write. Likewise we can just state that we cannot relocate a
>>>> structured binding and move on with the proposal. But I'd like to give it
>>>> more thought: structured binding relocation sounds like a very convenient
>>>> facility, and after all, how hard could it be to just relocate one item of
>>>> a pair? Well it's not simple :/
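[Editorial sketch] The rotate-then-remove-last workaround described above can be approximated in current C++ as follows. Since `reloc` and `vector::pilfer_back` do not exist, this sketch uses `std::move` plus `pop_back` as a stand-in, so it only works for movable types; the function name `pilfer_at` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch {

// Extracts the element at index pos, preserving the order of the others.
// Move + pop_back stands in for the proposed relocating pilfer_back.
template <class T>
T pilfer_at(std::vector<T>& v, std::size_t pos) {
    assert(pos < v.size());
    auto first = v.begin() + static_cast<std::ptrdiff_t>(pos);
    // Rotate left by one over [first, end): the element at pos ends up last.
    std::rotate(first, first + 1, v.end());
    T out = std::move(v.back());
    v.pop_back();
    return out;
}

}  // namespace sketch
```

With a real relocation facility, the moved-from element would not need to be destroyed separately by `pop_back`; that difference is what the proposal is after.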
>>>>
>>>> Before we dive in, let's agree to reuse the terms used in
>>>> https://en.cppreference.com/w/cpp/language/structured_binding . As
>>>> such, when we write: `auto [x, y] = foo();` it creates a hidden object `e`
>>>> of type `E` (which is the return type of `foo`), `x` and `y` are newly
>>>> introduced identifiers that are just aliases to some parts of `e`.
>>>>
>>>> The best approach I can think of is that under some conditions, `x` and
>>>> `y` are not aliases but individual complete objects, and no hidden object
>>>> `e` is created. Hence we can write `reloc x` without violating the rules we
>>>> added (which forbids relocation of subobjects). Here are the conditions:
>>>>
>>>> - there must be no ref qualifiers in the auto declaration (auto [x,
>>>> y] = foo(); passes, auto const& [x, y] = foo(); doesn't)
>>>> - `E` must not have a user-defined destructor, a user-defined
>>>> relocation, move or copy constructor. This should give us the guarantee
>>>> that we can safely split an object into individual objects. Thankfully for
>>>> us, std::pair and std::tuple comply.
>>>>
>>>> That part is fine by me. The question is then how do we construct each
>>>> individual object that corresponds to the new identifiers we introduced? We
>>>> need to distinguish between the three binding protocols supported in C++17:
>>>>
>>>> - C array: the easiest of the three, each object is constructed
>>>> element-wise using the best-suited constructor. That approach remains
>>>> compatible with relocation.
>>>>
>>> Yes. Interestingly, you don't see array prvalues much, since they can't
>>> be returned from functions, but they do exist as temporaries (auto&& r =
>>> std::type_identity_t<int[2]>{0, 1} constructs a temporary array and binds
>>> it to a reference).
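[Editorial sketch] The array-prvalue remark can be tried out as below. To avoid depending on C++20's `std::type_identity_t`, the sketch uses a hypothetical alias `identity_t` that plays the same role; `sum_via_array_prvalue` is likewise an invented name. It also shows the C++17 array binding protocol, which copies the elements one by one.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for std::type_identity_t (C++20).
template <class T>
using identity_t = T;

inline int sum_via_array_prvalue() {
    // The braced functional cast creates a temporary int[2]; binding it to
    // auto&& extends its lifetime to that of r.
    auto&& r = identity_t<int[2]>{0, 1};
    static_assert(std::is_same_v<decltype(r), int(&&)[2]>);

    // Array binding: a hidden array is copy-initialized element-wise from r,
    // and a and b name its elements.
    auto [a, b] = r;
    return a + b;
}

}  // namespace sketch
```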
>>>
>>>>
>>>> - Binding to data members: again, no real difficulty here, each
>>>> object is constructed using the best-suited constructor from their
>>>> corresponding data member. Again that approach remains compatible with
>>>> relocation.
>>>>
>>> Yes, this is fine.
>>>
>>>>
>>>> - Tuple-like binding: this is the hardest of all, and probably the
>>>> one we need to support the most. The main problem with tuple-like binding
>>>> is that the language doesn't treat std::array, pair and tuples as special.
>>>> They are just types like any other, hidden behind the tuple_size,
>>>> tuple_element and std::get APIs. We can construct each object using the
>>>> member get (or std::get) functions, and that will work fine for copy
>>>> constructor and move constructor, but not for relocation. First, the get
>>>> functions will return references and not prvalues, so the relocation
>>>> constructor will not be selected. Second, the get functions, had they
>>>> returned by value, will not be allowed to relocate the subobject they
>>>> return. Last, we have no guarantee that tuple_size and all the get functions
>>>> return a partition of `E`. It is mandatory to enable construction of the
>>>> new objects (`x` and `y`) by relocation, as again, the whole point of this
>>>> is to support relocate-only types (so `x` or `y` may be relocate-only).
>>>>
>>>> The only solution that I can think of is an alternative API for the
>>>> tuple-like binding protocol. If that new API is not provided then we
>>>> fall back to C++17 structured bindings that cannot be relocated. The API I
>>>> think of is:
>>>>
>>>> - A template static member function get_member that returns the
>>>> pointer to I-th data-member: `template <std::size_t I> auto
>>>> E::get_member<I>()`
>>>> - Or a get_member free function that does the same: `template
>>>> <class E, std::size_t I> auto get_member()`
>>>>
>>>> The return type of the get_member function must be:
>>>> `std::tuple_element_t<I, E> (E::*)`. The number of supported get_member
>>>> overloads must match tuple_size<E>::value. For std::pair, that would be:
>>>>
>>>> // pseudocode: function templates cannot be partially specialized today
>>>> template <class First, class Second>
>>>> constexpr auto get_member<std::pair<First, Second>, 0>() {
>>>>     return &std::pair<First, Second>::first;
>>>> }
>>>> template <class First, class Second>
>>>> constexpr auto get_member<std::pair<First, Second>, 1>() {
>>>>     return &std::pair<First, Second>::second;
>>>> }
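[Editorial sketch] Because function templates cannot be partially specialized, a compilable approximation of the `get_member` free function for `std::pair` has to dispatch through a class template. The helper `get_member_impl` and the namespace `sketch` are assumptions introduced here; only the idea of returning a pointer to the I-th data member comes from the message above.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

namespace sketch {

// Hypothetical dispatch helper; the primary template is left undefined.
template <class E, std::size_t I>
struct get_member_impl;

template <class First, class Second, std::size_t I>
struct get_member_impl<std::pair<First, Second>, I> {
    static constexpr auto get() noexcept {
        static_assert(I < 2, "std::pair has exactly two members");
        if constexpr (I == 0) {
            return &std::pair<First, Second>::first;
        } else {
            return &std::pair<First, Second>::second;
        }
    }
};

// Returns a pointer to the I-th data member of E.
template <class E, std::size_t I>
constexpr auto get_member() noexcept {
    constexpr auto ptr = get_member_impl<E, I>::get();
    using expected = std::tuple_element_t<I, E> E::*;
    static_assert(std::is_same_v<std::remove_const_t<decltype(ptr)>, expected>);
    return ptr;
}

}  // namespace sketch
```

Given `std::pair<int, double> p`, the expression `p.*sketch::get_member<decltype(p), 0>()` names `p.first`. How a compiler would use such pointers to track which subobjects were relocated is not something this sketch attempts.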
>>>>
>>>> Let's consider the expression: `E e = foo(); auto [x, y] = reloc e;`.
>>>> If E provides the new API then the new objects (`x` and `y`) can be
>>>> constructed by relocation. The language needs to track, if any relocation
>>>> constructor is used, which subobjects of the source (`e`) got destructively
>>>> relocated into a new object of the "structured binding". Then the
>>>> destructor of the source object is not called directly. Instead the
>>>> language will call the destructor on all the subobjects of the source that
>>>> were not relocated (we need to keep in mind that relocation may not happen
>>>> for every data-member, and that some data-member may be hidden from the
>>>> tuple-like binding API). Hopefully, thanks to the pointer to data-member
>>>> returned by the get_member functions, the language is able to track down
>>>> which parts are relocated and which aren't.
>>>>
>>>> I thought of just taking the address of whatever std::get returns,
>>>> instead of introducing a new API. However I don't believe we have any
>>>> guarantee that std::get returns a reference, and even a reference of some
>>>> subobject that lives within the tuple-like object (it could very well be a
>>>> reference to some static variable...). get_member solves both those
>>>> problems.
>>>>
>>>> I am not sure this is the best solution there is, I am just sharing my
>>>> thoughts about the subject. If get_member seems viable, we can easily
>>>> provide an implementation for pair and tuples. I wonder what can be done
>>>> with std::array (we cannot take the pointer to data-members that are array
>>>> elements, although the language considers array elements as subobjects...).
>>>>
>>>
>>> Yes, I agree this would be nice to have, but as well as the problems you
>>> have noticed with `get_member`, there is also the problem that pointer to
>>> data member of a nested (not base class) subobject cannot be formed, and
>>> also that there could be tuple-like classes that form some elements "on
>>> demand".
>>>
>>> I have another idea, which is to use recursion in the manner of
>>> operator->(). For example, let's say that structured binding for
>>> tuple-like class types first looks for
>>> `get<std::make_index_sequence<std::tuple_size<E>::value>>(reloc e)` (the
>>> exact syntax isn't important, but it should be something clearly novel and
>>> `e` should be prvalue if the original object was). Then, whatever this
>>> call returns (which must have the same `tuple_size` as `E`) is in turn
>>> submitted for structured binding (as a prvalue).
>>>
>>> Of course, library authors will then need to create a struct type with
>>> the appropriate number of fields, but it won't be difficult for compiler
>>> authors to write an intrinsic to do that, and once it's available for
>>> std::tuple and std::array then everyone else can just recurse to those
>>> types. Anyone who can't/doesn't want to recurse to the Library can use
>>> preprocessor hackery, other code generation, or possibly metaclasses (once
>>> those are available), to support any sensible number of elements.
>>>
>>> It'd be ugly to convert `std::array<T, N>` to essentially `struct { T
>>> t0, t1, ...tn-1; };` but N should be small (at least until we get variadic
>>> structured binding) and compiler authors can always hack in a better
>>> solution for privileged Library classes as long as the general case can be
>>> made to work for non-Standard library authors. A more ambitious solution
>>> would be to allow returning arrays from functions.
>>>
>>> For example, one could write:
>>>
>>> template<std::size_t... I, class... T> requires
>>> std::same_as<std::index_sequence<I...>, std::index_sequence_for<T...>>
>>> auto get<std::index_sequence<I...>>(my_tuple<T...> t) {
>>> union { my_tuple<T...> tt; } = {.tt = reloc t}; // prevent destructor
>>> return std::tuple<T...>(std::relocate(&get<I>(tt))...);
>>> }
>>>
>>>> --
>>> Std-Proposals mailing list
>>> Std-Proposals_at_[hidden]
>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>>
>>

Received on 2022-09-19 04:55:20