std-proposals

Re: [std-proposals] Relocation in C++

From: Edward Catmur <ecatmur_at_[hidden]>
Date: Fri, 7 Oct 2022 22:26:07 +0100
On Fri, 7 Oct 2022 at 15:35, Sébastien Bini <sebastien.bini_at_[hidden]>
wrote:

> On Sun, Oct 2, 2022 at 11:09 PM Edward Catmur <ecatmur_at_[hidden]>
> wrote:
>
>> On Fri, 30 Sept 2022 at 12:06, Edward Catmur <ecatmur_at_[hidden]>
>> wrote:
>>
>>> OK, let's run down the cases. Bear in mind that only objects of value
>>> category prvalue can be decomposed; that is, relocating structured binding
>>> would not apply if there are any ref-qualifiers between `auto` and `[`.
>>>
>>> For an array type: yes, there could be dependencies between the members,
>>> but an array prvalue can only be created in-place (it can't be returned
>>> from a function) and so the responsibility would be on the function author.
>>> So decomposition into individual, independent elements is fine.
>>>
>>> For a tuple-like type: the trigger for relocating/decomposing structured
>>> binding would be that the hidden object `e` is prvalue and that the (new,
>>> extension) customization point is found by lookup. So the responsibility is
>>> on the class author to ensure that decomposition is safe; if they can't
>>> guarantee that, they just don't provide the new customization point and the
>>> structured binding yields xvalue references to the results of calling
>>> get<I>.
>>>
>>> For binding to data members: yes, this should only proceed as
>>> decomposition if the class E (and every base class between E and the base
>>> class the data members are found in) does not have a user-defined
>>> destructor. Again, if this does not apply we fall back to structured
>>> bindings as references; lvalues in this case.
>>>
>>> For case 1 (array) and case 3 (binding to data members) this is
>>> completely invisible because of relocation elision; the effect is exactly
>>> as if the compiler were to track which array elements or data members are
>>> subsequently relocated and call the destructor on those which are not.
>>>
>>
>> Also, any issues caused by out-of-order destruction exist today for
>> moving from those objects, for the most part.
>>
>
> Yes, I agree. I guess what worries me the most is incorrect uses of
> std::decompose.
>
>
>> For case 2 (tuple protocol), there is a very visible effect in which
>>> customization point is called; it's the responsibility of the class author
>>> to ensure that they are equivalent.
>>>
>>> Another way would be to have a new statement: `auto reloc [x, y] =
>>>> foo();` (or `auto &< [x, y] = foo();`...) to clearly mark that we want to
>>>> split into individual objects. This best expresses the intent, and would
>>>> not break existing code.
>>>>
>>>
>>> As above, I don't think there's any real danger of breaking existing
>>> code; class authors could write a bad destructuring get_all in the
>>> tuple-like case, but that's their choice and responsibility.
>>>
>>> What I would worry about is the fallback being invisible and degrading
>>> performance, but I think that we're fine if we just make `reloc x`
>>> ill-formed in that case.
>>>
>>
>> On second thoughts, I don't think it should be ill-formed; it should just
>> fall back to move, just as it would for a class that is movable but not
>> relocatable. If the class is truly relocate-only, then it would be
>> ill-formed.
>>
>
> It could act normally, simply changing the value-category, and ruling out
> the relocation ctor during overload resolution. (Unless `x` is
> ref-qualified, in which case it's equivalent to
> `static_cast<decltype(x)>(x)`).
> This is what we are doing when relocating a function parameter captured by
> value, if the function ABI is caller-destroy.
> In both cases it would be ill-formed for relocate-only types.
>
> Note that `auto reloc [x, y] = foo()` could also give the guarantee that
> `x` and `y` are individual objects (would be ill-formed otherwise). Hence
> C++ would not silently rule out the relocation ctor.
> Not that I am in favor of `auto reloc`; I think proper compiler error
> messages can do the job of explaining why the relocation ctor was ruled out.
>
>
>> Instead, consider: `std::decompose` is accessing (on behalf of its
>>>>>>> caller) each direct subobject (base and data member) that is returned. But
>>>>>>> it is also accessing the *other* direct subobjects that it does not return,
>>>>>>> in order to destroy them. So let's say that to call std::decompose, you
>>>>>>> must have access to each direct subobject, including those that you don't
>>>>>>> request. (Plus their relocators or destructors, respectively.)
>>>>>>>
>>>>>>> Then `auto [p] = std::decompose<&PainterWithGuard::_p>(reloc
>>>>>>> painterWithGuard);` would be ill-formed because in that context,
>>>>>>> `painterWithGuard._guard` is ill-formed.
>>>>>>>
>>>>>>> Do you think this would work?
>>>>>>>
>>>>>>
>>>>>> But if _guard were declared as public by mistake then all those
>>>>>> safeties are bypassed and std::decompose will cause trouble.
>>>>>>
>>>>>
>>>>> In that case, aggregate-style structured binding would likely also
>>>>> work. And that's definitely the class author's fault.
>>>>>
>>>>
>>>> Yes but it would not break things the way std::decompose does. (type
>>>> may be movable but not decomposable).
>>>>
>>>
>>> True. I guess you have to accept some small risk of breakage; even an
>>> aggregate struct S { X x; Y y; }; could have invariants established
>>> (post-construction) between x and y that break if x is destroyed before y.
>>> But that's fragile code already; the author should have made those data
>>> members private. At least they can forestall this by adding a user-declared
>>> destructor.
>>>
>>
> Yes. I agree that we need this, but I would have liked it to be safer.
>
> When I first proposed `get_bindings`, it took all subobjects of the class
> passed by prvalue. The language itself would split the source object, and
> pass all parts to get_bindings. This approach has several problems: it is
> quite inconvenient to use when there is a large number of subobjects, and
> it will not detect manual reordering of same-type data members in the class
> declaration. And it does not even support arrays. But at least, it was
> *opt-in*. You could only decompose an object if the type had implemented
> that weird get_bindings function, which would give the guarantee that this
> was a safe operation.
>
> This is what std::decompose lacks, IMO. The class has almost no say
> in this; it opts in by default. It can provide a user-defined destructor,
> but that feels like opting out as a side effect.
> I'd prefer if things were reversed: a class type opts out by default
> (std::decompose is ill-formed) but can opt in.
>

As I'm proposing it, if a class has *any* private immediate subobjects
(direct base or member), then `std::decompose` can only be called by code
within the class access boundary: that is, the class and its friends.

So "opt-in by default" really only applies to aggregate types, and by
making all their immediate subobjects public they're implicitly opting in
to anything the Standard adds in future. Plus quite a lot of classes like
that are likely subject to structured binding aggregate decomposition
anyway, so it'd be odd to say that `auto [x, y] = S(...)` is allowed but
`auto [x, y] = std::decompose<&S::x, &S::y>(S(...))` is not.
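
For reference, here is a minimal sketch of what `auto [x, y] = S(...)` already does in C++17, assuming a hypothetical member type `Tracked` that logs its own destruction. The names `Tracked`, `destruction_log` and `bind_aggregate_prvalue` are illustrative only and appear nowhere in the proposal. The bindings name the members of one hidden object, so both members are destroyed together at the end of the scope, in reverse declaration order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which Tracked objects are destroyed.
inline std::vector<std::string>& destruction_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical member type: logs its name when destroyed.
struct Tracked {
    std::string name;
    ~Tracked() { destruction_log().push_back(name); }
};

// Plain aggregate: all members public, no user-declared destructor.
struct S {
    Tracked x;
    Tracked y;
};

// Binds a prvalue of type S with a structured binding and returns the
// destruction order observed when the enclosing scope ends.
inline std::vector<std::string> bind_aggregate_prvalue() {
    destruction_log().clear();
    {
        // The prvalue initializes a hidden object e of type S;
        // x and y are names for e.x and e.y, not separate objects.
        auto [x, y] = S{{"x"}, {"y"}};
        (void)x;
        (void)y;
    }  // e is destroyed here: e.y first, then e.x
    return destruction_log();
}
```

Under the relocating behaviour discussed in this thread, `x` and `y` would instead be independent objects, which is why the destruction order could become observable.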

Another angle on the issue of user-defined destructors: how about if any
class with a user-defined destructor (or, perhaps, any SMF) can only be
`std::decompose`d by code within that class's access boundary, even if all
its immediate subobjects are public? It'd be as if every class with a
user-defined destructor has an implicit anonymous sizeless private member
that must be accessible for `std::decompose` to be valid. This would
protect classes that use SMFs to maintain invariants between public data
members while allowing those classes to decompose themselves. It'd also
mean that `std::unique_ptr::release(this unique_ptr)` can use
std::decompose.
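
For comparison, the "union hack" that `std::decompose` is meant to replace (it is mentioned in the quoted text further down) can be sketched in today's C++ roughly as follows. This is only an illustration under stated assumptions: `IntOwner`, `Shield`, `owner_deletes` and `release_keeps_pointee_alive` are hypothetical names, the members are public just to keep the sketch short, and a move stands in for relocation because `reloc` does not exist yet.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Counts how many times ~IntOwner actually freed a pointer.
inline int& owner_deletes() {
    static int n = 0;
    return n;
}

// Hypothetical owning type with a user-defined destructor.
struct IntOwner {
    int* ptr_;
    std::string label_;

    IntOwner(int value, std::string label)
        : ptr_(new int(value)), label_(std::move(label)) {}
    IntOwner(IntOwner&& other) noexcept
        : ptr_(std::exchange(other.ptr_, nullptr)),
          label_(std::move(other.label_)) {}
    ~IntOwner() {
        if (ptr_) {
            ++owner_deletes();
            delete ptr_;
        }
    }
};

// "Union hack": move the object into a union member so that ~IntOwner never
// runs on it, then take ptr_ and destroy the remaining subobjects by hand.
inline int* release(IntOwner self) {
    union Shield {
        IntOwner owner;
        explicit Shield(IntOwner&& o) : owner(std::move(o)) {}
        ~Shield() {}  // deliberately does not destroy `owner`
    } shield(std::move(self));

    int* released = shield.owner.ptr_;
    std::destroy_at(&shield.owner.label_);  // easy to forget; leaks otherwise
    return released;
}  // `self` is moved-from (ptr_ is null), so its destructor frees nothing

// Helper for the sketch: releases a pointer and checks that the pointee
// survived and that no destructor freed it.
inline bool release_keeps_pointee_alive() {
    owner_deletes() = 0;
    int* p = release(IntOwner(42, "answer"));
    bool ok = (*p == 42) && (owner_deletes() == 0);
    delete p;
    return ok;
}
```

The manual `std::destroy_at` call is the part that is easy to get wrong: every subobject that is not extracted has to be destroyed by hand.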

This also limits the use of std::decompose to the class implementation and
>>>>>> friends. It is not necessarily a bad thing as it requires the owner to know
>>>>>> of the class internals (or at least to be in the position of knowing...).
>>>>>> Besides I expect the main use of std::decompose to be from the new special
>>>>>> get_bindings function.
>>>>>>
>>>>>
>>>>> Yes, and that customization point (or some `get_all`,
>>>>> `get<index_sequence<I...>>`, etc.) would be expected to be a member or
>>>>> friend.
>>>>>
>>>>
>>>> That could work. BTW I am more inclined to give it another name than
>>>> `get` as we could have a tuple that contains an index_sequence as
>>>> parameter, and then we don't know what std::get would refer to.
>>>>
>>>
>>> Yes, good point.
>>>
>>> Actually, from a brutally practical perspective, handling tuple-like
>>> types could be omitted from a MVP. You'd still be able to destructure
>>> std::pair by wrapping it in another (derived) type to suppress the tuple
>>> protocol (since `first` and `second` are direct data members of the same
>>> class), and authors of tuple-like types outside the Standard Library would
>>> be able to write their own function to return the relocated elements as
>>> data members of an aggregate.
>>>
>>
> I don't see how it could work for a custom tuple class. You cannot declare
> a structure with a variable amount of heterogeneous types. `template
> <class... Args> struct Tpl { Args... args; };` does not work AFAIK. If the
> compiler does not provide one for you as it does for std::tuple, then I
> don't see how boost::tuple and the like could support this.
>

Boost.Tuple only supports up to 10 elements (it predates variadic
templates). You can support up to any configurable fixed number using
(non-template) code generation, e.g. Boost.Preprocessor.
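
A rough sketch of that idea, using hand-written macros rather than Boost.Preprocessor to keep it dependency-free: each `AggN` is a plain aggregate with a fixed number of members, generated up to a chosen maximum (3 here). The names `AggN`, `PARAMS_N`, `MEMBERS_N`, `DEFINE_AGG` and `describe` are all hypothetical, and a real library would generate the repetition with something like Boost.Preprocessor instead of spelling it out.

```cpp
#include <cassert>
#include <string>

// Template parameter lists for each supported arity.
#define PARAMS_1 class T0
#define PARAMS_2 PARAMS_1, class T1
#define PARAMS_3 PARAMS_2, class T2

// Data member declarations for each supported arity.
#define MEMBERS_1 T0 m0;
#define MEMBERS_2 MEMBERS_1 T1 m1;
#define MEMBERS_3 MEMBERS_2 T2 m2;

// Generates: template <class T0, ...> struct AggN { T0 m0; ... };
#define DEFINE_AGG(N) \
    template <PARAMS_##N> struct Agg##N { MEMBERS_##N };

DEFINE_AGG(1)
DEFINE_AGG(2)
DEFINE_AGG(3)

// A tuple-like library could return its elements as such an aggregate,
// which callers can then decompose with an ordinary structured binding.
inline std::string describe() {
    auto [id, name, weight] = Agg3<int, std::string, double>{7, "seven", 0.5};
    (void)weight;
    return name + ":" + std::to_string(id);
}
```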

In addition, now that I think of it, the no user-provided special ctors and
>>>>>> dtor clause is required to allow structured binding relocation from a
>>>>>> class-type. In auto [x, y] = reloc data, data must be a C array (no problem
>>>>>> there), a class-type with all data-members coming from the class-type or
>>>>>> the same base, or provide a get_binding function that applies recursively.
>>>>>> The end result of get_binding must fall into the C array case or the
>>>>>> class-type case. For the class-type case, we must be allowed to split the
>>>>>> type into individual subobjects, and have again that sort of guarantee that
>>>>>> subobjects can be considered individually. Hence this no user-provided
>>>>>> special ctors and dtor clause must apply to the class-type. This is the
>>>>>> case for whatever std::decompose returns, but it is worth mentioning. And
>>>>>> that bit makes me think that this clause should also apply to
>>>>>> std::decompose.
>>>>>>
>>>>>
>>>>> Remember, the motivation is for libraries (not necessarily the
>>>>> Standard Library) to be able to write decompositions for their types; in
>>>>> particular, for tuple types. For example, Boost.Tuple should be able to
>>>>> support relocating decomposition without any change to its current ABI or
>>>>> API. So user-provided special member functions must be allowed.
>>>>>
>>>>
>>>> Fair point.
>>>>
>>>>
>>>>> I can't feel satisfied with this solution. We end up with two ways of
>>>>>> decomposing: std::decompose and the get_binding function. get_binding is
>>>>>> still necessary as you can't access nested subobjects from std::decompose,
>>>>>> or even generate objects on demand when using structured bindings. And
>>>>>> other times get_bindings can be bypassed if the user performs structured
>>>>>> binding by calling std::decompose directly. I find this dual approach a bit
>>>>>> confusing.
>>>>>>
>>>>>
>>>>> Well, the `get` protocol is for public use, and `std::decompose` for
>>>>> privileged access - although it can be used on aggregate types. Also it'd
>>>>> be useful for prvalue qualified member functions, etc. where previously the
>>>>> "union hack" would have been necessary, e.g.:
>>>>>
>>>>> template<class T>
>>>>> T* unique_ptr<T>::release(this unique_ptr self) {
>>>>>     auto const [ptr] = std::decompose<&unique_ptr::ptr_>(reloc self);
>>>>>     return ptr;
>>>>> }
>>>>>
>>>>
>>>> Yes, I agree with that. I worry about mistaken uses of std::decompose
>>>> for classes where access control is badly designed.
>>>>
>>>
>>> Yes. But if we don't provide it, the union hack will be used, and then
>>> there's the same risk plus the risk of forgetting subobject destructors.
>>> Maybe this is OK for an initial proposal.
>>>
>>
> I think I'll propose std::decompose, and we will see the reception it gets
> :) Although I am still thinking of a way to make it safer.
>

Received on 2022-10-07 21:26:20