std-proposals

Re: [std-proposals] Relocation in C++

From: Sébastien Bini <sebastien.bini_at_[hidden]>
Date: Fri, 16 Sep 2022 12:05:18 +0200
Hi Edward,

Thank you for those explanations. I'll get in touch with people from AFNOR
when we have a draft ready (ongoing).

There is still one technical aspect that we haven't talked about:
structured binding relocation.

I believe that, in order to support relocate-only types, we need to enable
relocation from a structured binding: `auto [x, y] = foo(); sink(reloc y);`
must work. This is motivated by:

   - The need to make APIs that support relocate-only types. How do you
   write an API to extract an item at an arbitrary position from a vector? I
   suggest the following: `std::pair<iterator, T>
   vector<T>::pilfer(const_iterator);` (returns next iterator and relocated
   vector element) as it is consistent with other vector APIs and complies
   with the core guidelines
   <https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-out>.
   Then, what can users do with the returned object, given that it lies in a
   pair and that relocating only part of an object is forbidden? The return
   value is unusable for relocate-only types unless we support structured
   binding relocation.
   - Almost all C++ developers that I know believe that a structured
   binding is a complete, separate object, and not an alias for some
   subobject. As such, they would find it odd that they cannot relocate a
   structured binding.
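
To make the first point concrete, here is a minimal sketch of a `pilfer`-shaped function. It is an illustration only: `pilfer` is the hypothetical name from the text, written as a free function rather than a vector member, and a move stands in for relocation, so it only works for movable types such as `std::unique_ptr`. `pilfer_demo` is likewise a made-up name.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical free-function stand-in for the proposed vector<T>::pilfer.
// A move replaces the relocation that the proposal would perform.
template <class T>
std::pair<typename std::vector<T>::iterator, T>
pilfer(std::vector<T>& v, typename std::vector<T>::const_iterator pos) {
    auto it = v.begin() + (pos - v.cbegin());  // same position, mutable iterator
    T taken = std::move(*it);
    auto next = v.erase(it);                   // iterator following the removed element
    return {next, std::move(taken)};
}

// Usage: the extracted element is reachable only through the binding.
inline int pilfer_demo() {
    std::vector<std::unique_ptr<int>> v;
    for (int i = 1; i <= 3; ++i) v.push_back(std::make_unique<int>(i));
    auto [next, item] = pilfer(v, v.cbegin() + 1);
    std::unique_ptr<int> sunk = std::move(item);  // today: move; proposed: `reloc item`
    return *sunk * 10 + **next;                   // element 2 extracted, element 3 follows
}
```

With a relocate-only `T`, the `std::move(item)` line has no equivalent today, which is the gap structured binding relocation is meant to close.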

This is not a blocking point IMO, but it would be very nice to have.
vector's pilfer can be worked around by rotating the element to the end of
the vector and calling some new vector::pilfer_back method that removes and
returns the last element. There is no performance overhead; it is just
cumbersome to write. Likewise, we can just state that we cannot relocate a
structured binding and move on with the proposal. But I'd like to give it
more thought: structured binding relocation sounds like a very convenient
facility, and after all, how hard could it be to just relocate one item of
a pair? Well, it's not simple :/
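
The rotate-then-pop workaround can be sketched as follows. This is only an approximation under stated assumptions: `extract_at` and `extract_demo` are hypothetical names, and since `pilfer_back` does not exist, a move from `back()` followed by `pop_back()` stands in for it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical helper: removes and returns the element at index i (i < v.size()).
template <class T>
T extract_at(std::vector<T>& v, std::size_t i) {
    auto pos = v.begin() + static_cast<std::ptrdiff_t>(i);
    std::rotate(pos, pos + 1, v.end());  // element i becomes the last element
    T last = std::move(v.back());        // stand-in for the proposed pilfer_back
    v.pop_back();
    return last;
}

inline bool extract_demo() {
    std::vector<std::unique_ptr<int>> v;
    for (int x : {10, 20, 30, 40}) v.push_back(std::make_unique<int>(x));
    std::unique_ptr<int> taken = extract_at(v, 1);
    // The relative order of the remaining elements is preserved.
    return *taken == 20 && v.size() == 3 &&
           *v[0] == 10 && *v[1] == 30 && *v[2] == 40;
}
```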

Before we dive in, let's agree to reuse the terms used in
<https://en.cppreference.com/w/cpp/language/structured_binding>. As such,
when we write `auto [x, y] = foo();`, it creates a hidden object `e` of
type `E` (the return type of `foo`); `x` and `y` are newly introduced
identifiers that merely alias parts of `e`.
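
A small check of that aliasing model, as a sketch: `Counted`, `E`, `foo` and the two helper functions are illustrative names, not part of the proposal. The first helper counts how many member objects are created by the declaration; the second uses a reference binding, where the hidden entity is a named object, to compare addresses.

```cpp
#include <cassert>
#include <utility>

// Counts every construction, whether default, copy or move.
struct Counted {
    static inline int constructed = 0;
    Counted() { ++constructed; }
    Counted(const Counted&) { ++constructed; }
    Counted(Counted&&) noexcept { ++constructed; }
};

struct E { Counted a; Counted b; };

inline E foo() { return E{}; }

// Number of Counted objects created by `auto [x, y] = foo();`.
inline int constructions_for_binding() {
    Counted::constructed = 0;
    auto [x, y] = foo();
    (void)x; (void)y;
    return Counted::constructed;  // only the two members of the hidden `e`
}

// With a reference binding the names designate the members of `p` itself.
inline bool names_alias_members() {
    std::pair<int, int> p{1, 2};
    auto& [x, y] = p;
    return &x == &p.first && &y == &p.second;
}
```

No extra `Counted` objects are created for `x` and `y`, which is why relocating `x` would amount to relocating a subobject of `e`.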

The best approach I can think of is that under some conditions, `x` and `y`
are not aliases but individual complete objects, and no hidden object `e`
is created. Hence we can write `reloc x` without violating the rules we
added (which forbid relocation of subobjects). Here are the conditions:

   - there must be no ref-qualifier in the auto declaration (`auto [x, y] =
   foo();` passes, `auto const& [x, y] = foo();` doesn't)
   - `E` must not have a user-defined destructor or a user-defined
   relocation, move or copy constructor. This should give us the guarantee
   that we can safely split an object into individual objects. Thankfully for
   us, std::pair and std::tuple comply.

That part is fine by me. The question is then how do we construct each
individual object that corresponds to the new identifiers we introduced? We
need to distinguish between the three binding protocols supported in C++17:

   - C array: the easiest of the three, each object is constructed
   element-wise using the best-suited constructor. That approach remains
   compatible with relocation.
   - Binding to data members: again, no real difficulty here, each object
   is constructed using the best-suited constructor from their corresponding
   data member. Again that approach remains compatible with relocation.
   - Tuple-like binding: this is the hardest of all, and probably the one
   we need to support the most. The main problem with tuple-like binding is
   that the language doesn't treat std::array, pair and tuple specially.
   They are just types like any other, hidden behind the tuple_size,
   tuple_element and std::get APIs. We can construct each object using the
   member get (or std::get) functions, and that will work fine for the copy
   constructor and move constructor, but not for relocation. First, the get
   functions return references and not prvalues, so the relocation
   constructor will not be selected. Second, even if the get functions
   returned by value, they would not be allowed to relocate the subobject
   they return. Last, we have no guarantee that tuple_size and the get
   functions together describe a partition of `E`. It is mandatory to enable
   construction of the new objects (`x` and `y`) by relocation, as again, the
   whole point of this is to support relocate-only types (so `x` or `y` may
   be relocate-only).
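
The first obstacle in the tuple-like case can be checked directly. The sketch below only uses the existing `std::get` overloads for `std::pair` and `std::tuple`; the aliases `P` and `Tup` and the helper `source_member_survives` are illustrative names.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

using P = std::pair<std::string, int>;
using Tup = std::tuple<std::string, int>;

// std::get yields a reference for every value category of its argument.
static_assert(std::is_same_v<decltype(std::get<0>(std::declval<P&>())), std::string&>);
static_assert(std::is_same_v<decltype(std::get<0>(std::declval<const P&>())), const std::string&>);
static_assert(std::is_same_v<decltype(std::get<0>(std::declval<P&&>())), std::string&&>);
static_assert(std::is_same_v<decltype(std::get<0>(std::declval<Tup&&>())), std::string&&>);

// Consequence: initializing a new object from std::get on an rvalue selects
// the move constructor, and the source member stays alive afterwards.
inline bool source_member_survives() {
    P p{"payload", 7};
    std::string x = std::get<0>(std::move(p));  // move construction
    p.first = "still assignable";               // p.first is still a live object
    return x == "payload" && p.first == "still assignable";
}
```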

The only solution that I can think of is an alternative API for the
tuple-like binding protocol. If that new API is not provided, then we
fall back to C++17 structured bindings that cannot be relocated. The API I
have in mind is:

   - A static member function template get_member that returns a pointer
   to the I-th data member: `template <std::size_t I> auto E::get_member<I>()`
   - Or a get_member free function that does the same: `template <class E,
   std::size_t I> auto get_member()`

The return type of the get_member function must be:
`std::tuple_element_t<I, E> (E::*)`. The number of supported get_member
indices must match tuple_size<E>::value. For std::pair, that would be:

// Function templates cannot be partially specialized, so the pair case is
// written as a single template that branches on I (E is a std::pair).
template <class E, std::size_t I>
constexpr auto get_member() {
    if constexpr (I == 0) return &E::first;
    else return &E::second;
}

Let's consider the expression: `E e = foo(); auto [x, y] = reloc e;`. If E
provides the new API, then the new objects (`x` and `y`) can be constructed
by relocation. If any relocation constructor is used, the language needs to
track which subobjects of the source (`e`) were destructively relocated
into a new object of the structured binding. The destructor of the source
object is then not called directly. Instead, the language calls the
destructor of every subobject of the source that was not relocated (we
need to keep in mind that relocation may not happen for every data member,
and that some data members may be hidden from the tuple-like binding API).
Thanks to the pointers to data members returned by the get_member
functions, the language should be able to determine which parts are
relocated and which aren't.
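
The bookkeeping described above can be imitated by hand in today's C++, which may help to picture it. This is a simulation under loud assumptions, not the proposed mechanism: relocation is approximated by a move followed by an explicit destructor call, the source pair lives in raw storage so that its own destructor never runs, and `Tracker`, `relocate_out` and `split_first` are made-up names. Plain pointers to data members play the role of what get_member would return.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>

// Counts live instances so the destruction bookkeeping can be checked.
struct Tracker {
    static inline int alive = 0;
    int id;
    explicit Tracker(int i) : id(i) { ++alive; }
    Tracker(Tracker&& other) noexcept : id(other.id) { ++alive; }
    ~Tracker() { --alive; }
};

using E = std::pair<Tracker, Tracker>;

// Stand-in for relocation: move-construct the destination, then end the
// lifetime of the source object.
template <class T>
T relocate_out(T& source) {
    T destination(std::move(source));
    std::destroy_at(&source);
    return destination;
}

// Relocates only `first` out of a pair, then destroys what is left member by
// member instead of running the pair's destructor. Returns the id that moved.
inline int split_first() {
    alignas(E) unsigned char storage[sizeof(E)];
    E* e = ::new (static_cast<void*>(storage)) E(Tracker(1), Tracker(2));

    auto relocated = &E::first;   // pointer to data member, as get_member would return
    auto untouched = &E::second;

    Tracker x = relocate_out(e->*relocated);
    assert(Tracker::alive == 2);        // x and e->second
    std::destroy_at(&(e->*untouched));  // ~E() is never called
    assert(Tracker::alive == 1);        // only x remains
    return x.id;
}
```

Each `Tracker` is destroyed exactly once, which is the invariant the language would have to maintain when only some subobjects are relocated.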

I thought of just taking the address of whatever std::get returns, instead
of introducing a new API. However, I don't believe we have any guarantee
that the get function returns a reference, let alone a reference to some
subobject that lives within the tuple-like object (it could very well be a
reference to some static variable...). get_member solves both those problems.
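
A tuple-like type that exhibits both problems is easy to write. In this sketch `Odd` and `sum_of_bindings` are illustrative names; the member `get` returns by value, and its second element does not correspond to any data member, yet the type satisfies the C++17 binding protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>

struct Odd {
    int stored = 1;

    // Returns by value; index 1 is computed and backed by no data member.
    template <std::size_t I>
    int get() const {
        if constexpr (I == 0) return stored;
        else return 42;
    }
};

// Opt Odd into the tuple-like binding protocol.
namespace std {
template <> struct tuple_size<Odd> : integral_constant<size_t, 2> {};
template <size_t I> struct tuple_element<I, Odd> { using type = int; };
}

inline int sum_of_bindings() {
    auto [a, b] = Odd{};  // a and b bind to temporaries returned by get
    return a + b;
}
```

Taking the address of what `get` returns here would yield the address of a temporary, not of a part of the hidden object.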

I am not sure this is the best solution there is; I am just sharing my
thoughts on the subject. If get_member seems viable, we can easily
provide an implementation for pair and tuple. I wonder what can be done
with std::array (we cannot form a pointer to data member that designates an
array element, although the language considers array elements subobjects...).

Regards,
Sébastien

On Sun, Sep 11, 2022 at 6:50 PM Edward Catmur <ecatmur_at_[hidden]>
wrote:

> Well, you should start by reading the list of articles under
> https://isocpp.org/std/ - if you haven't already.
>
> I'm assuming that you and Stormshield are domiciled in France? If so your
> national standards body is AFNOR and the good news is that AFNOR has a C++
> subgroup[1] and its members are active in C++ standardization - some names
> you may be familiar with include Jens Gustedt, Joel Falcou, Gabriel Dos
> Reis, Corentin Jabot etc. I gather that AFNOR has regular meetings to
> discuss C++ standardization so a useful step would be to attend those
> meetings either in person or by video to introduce yourself and perhaps
> present your proposal in front of a smaller audience in an informal
> context. At some point you might want to seek affiliation to AFNOR/CN CPP
> on behalf of Stormshield.
>
> 1.
> https://norminfo.afnor.org/structure/afnorcn-cpp/langage-de-programmation-cpp/119670
>
> On Mon, 5 Sept 2022 at 13:29, Sébastien Bini <sebastien.bini_at_[hidden]>
> wrote:
>
>> Hi Edward,
>>
>> Thank you for the thorough explanation. That should keep me busy for
>> quite some time :)
>>
>> Unfortunately, I and the company I work for (Stormshield) are not
>> affiliated with any standards organization. Would you or anyone on this
>> list have some documents describing that standardization process?
>>
>> This being my first proposal, and quite a large one, it would be a gargantuan
>> task to carry it on my own. I hope that, as it gets more light, more
>> people (hopefully standardization champions) will bring their support (just
>> like you have been doing).
>>
>> Thank you for your help with this proposal. I'll gladly contact you when
>> I have a complete draft. Thank you again!
>>
>> Best regards,
>> Sébastien
>>
>> On Mon, Aug 29, 2022 at 6:42 PM Edward Catmur <ecatmur_at_[hidden]>
>> wrote:
>>
>>> On Wed, 24 Aug 2022 at 09:59, Sébastien Bini <sebastien.bini_at_[hidden]>
>>> wrote:
>>>
>>>> Well, unless I am missing something, it looks like we clarified all the
>>>> blocking points so far? Thanks for your help in moving this forward!
>>>>
>>>> What are the next steps? I guess another revision of the proposal needs
>>>> to be written?
>>>>
>>>
>>> Yes, another revision of the proposal, ensuring that it covers:
>>>
>>> - acknowledgement of and references to previous and current
>>> proposals in the space
>>> - motivations (performance, safety/correctness, composability) with
>>> explanation why other proposals fall short
>>> - justification for each of the subfeatures: relocating constructor,
>>> relocating assignment operator, reloc keyword, try-with-init
>>> - specification of special member functions when (a) unspecified,
>>> (b) declared as defaulted, (c) defined as defaulted, (d) deleted, (e)
>>> user-defined; their interactions with the presence and noexcept
>>> qualification of the other special member functions; and the behavior of
>>> the user-defined relocating constructor for bases-and-members not specified
>>> - specification of reloc operator with regard to: loops; gotos;
>>> branches; sequencing within expressions; logical and ternary conditional
>>> expressions; hiding; error detection (making subsequent use ill-formed, not
>>> just UB); discarded-value expressions; function parameters vs. local
>>> variables
>>> - discussion of ABI implications and steps taken to ensure that ABI
>>> does not break unexpectedly and that Standard Library (and third-party
>>> library) implementations can attain as much benefit as possible consistent
>>> with preserving ABI
>>> - discussion of implementation techniques for relocating assignment
>>> operator: destroy-and-rebuild, copy-and-swap (and Just Swap), union
>>> technique (for the overconfident)
>>> - suggestions for type traits and concepts
>>> - suggestions for additional API for Library containers to make use
>>> of this feature (e.g. optional::pop, T vector::pop_back,
>>> make_from_tuple(prvalue))
>>> - suggestions for internal changes to Library (e.g. vector making
>>> use of trivial relocation to reallocate) possibly affecting Library API in
>>> visible ways (e.g. noexcept)
>>> - identification of further direction: defaultable swap (with
>>> Library interaction), perfect forwarding of prvalues
>>> - example code, written in terse, non-motivating style
>>> (single-letter or meaningless class, variable and function names) suitable
>>> for insertion into the Standard to clarify points and serve as
>>> implementer test cases
>>>
>>> Ideally we would also have a sample/reference implementation, compiler
>>> and standard library, along with test cases; I may be able to devote some
>>> time to this.
>>>
>>> Hopefully it goes without saying that the proposal should have its
>>> source text version-controlled and be typeset in a recognized style; many
>>> people use Bikeshed (https://tabatkins.github.io/bikeshed/) which takes
>>> Markdown and outputs HTML, while others use LaTeX outputting to PDF (I'm
>>> not sure exactly what styles they use).
>>>
>>> With regard to motivation, I would suggest starting with the existence
>>> of types that are (trivially) relocatable but not movable, e.g.
>>> gsl::not_null<std::unique_ptr<T>>, noting that these types are currently
>>> practically unusable despite being important to correctness, then noting
>>> that while the other outstanding proposals for trivial relocation would
>>> make some uses possible, they would no longer be composable with other (non
>>> trivially-relocatable) types, the situation requiring a relocation
>>> operation with appropriate (memberwise) behavior for aggregate (Rule of
>>> Zero) class types and the possibility of user-defined behaviors (e.g. for
>>> std::string). Then noting the additional requirement of a relocating
>>> assignment operation (since the copy and move assignment operations require
>>> the source object to be left in a destroyable state), and arguing for
>>> relocating constructor and assignment operator with respective signatures
>>> of T(T) and T& operator=(T) by parallel with the tripartite classification
>>> of value categories, and apologetics for the parameter aliasing its
>>> argument. Then demonstration of how this syntax can be used to accomplish
>>> the (mostly performance) goals of trivial relocatability (by defaulting the
>>> relocating constructor at declaration). Then discussion of why the reloc
>>> keyword is necessary to work with relocatable non-movable values, and
>>> demonstration of how it adds value by preventing bugs (use of moved-from
>>> variables, when used in place of std::move). Justification of
>>> try-with-init can probably wait until specification of the relocating
>>> assignment operator.
>>>
>>> Finally, are you affiliated with ISO/IEC JTC1 SC22 WG21, possibly via
>>> your National Body (NB, your national standards organization)? While
>>> supposedly not an absolute necessity, in practice it's essential for a
>>> proposal (especially one as wide-ranging as this one) to have (multiple)
>>> champions with formal affiliation and a record of participation, ideally
>>> both au fait with the standardization process and having prior experience
>>> successfully getting features into the Standard.
>>>
>>> Feel free to contact me here or off-list to write or review parts of the
>>> proposal; for assistance getting it published; or if you have thoughts
>>> about a sample/reference implementation. Good luck!
>>>
>>> Ed Catmur
>>>
>>

Received on 2022-09-16 10:05:31