ISOCPP std-proposals List: Re: [std-proposals] Relocation in C++

From: Edward Catmur <ecatmur_at_[hidden]>
Date: Sun, 9 Oct 2022 17:34:46 +0100

On Fri, 7 Oct 2022 at 22:26, Edward Catmur <ecatmur_at_[hidden]> wrote:

> On Fri, 7 Oct 2022 at 15:35, Sébastien Bini <sebastien.bini_at_[hidden]>
> wrote:
>
>> On Sun, Oct 2, 2022 at 11:09 PM Edward Catmur <ecatmur_at_[hidden]>
>> wrote:
>>
>>> On Fri, 30 Sept 2022 at 12:06, Edward Catmur <ecatmur_at_[hidden]>
>>> wrote:
>>>
>>>> OK, let's run down the cases. Bear in mind that only objects of value
>>>> category prvalue can be decomposed; that is, relocating structured binding
>>>> would not apply if there are any ref-qualifiers between `auto` and `[`.
>>>>
>>>> For an array type: yes, there could be dependencies between the
>>>> members, but an array prvalue can only be created in-place (it can't be
>>>> returned from a function) and so the responsibility would be on the
>>>> function author. So decomposition into individual, independent elements is
>>>> fine.
>>>>
>>>> For a tuple-like type: the trigger for relocating/decomposing
>>>> structured binding would be that the hidden object `e` is prvalue and that
>>>> the (new, extension) customization point is found by lookup. So the
>>>> responsibility is on the class author to ensure that decomposition is safe;
>>>> if they can't guarantee that, they just don't provide the new customization
>>>> point and the structured binding yields xvalue references to the results of
>>>> calling get<I>.
>>>>
>>>> For binding to data members: yes, this should only proceed as
>>>> decomposition if the class E (and every base class between E and the base
>>>> class the data members are found in) does not have a user-defined
>>>> destructor. Again, if this does not apply we fall back to structured
>>>> bindings as references; lvalues in this case.
>>>>
>>>> For case 1 (array) and case 3 (binding to data members) this is
>>>> completely invisible because of relocation elision; the effect is exactly
>>>> as if the compiler were to track which array elements or data members are
>>>> subsequently relocated and call the destructor on those which are not.
>>>>
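For reference, the three cases can be written in today's C++17, where the names always refer into the hidden object `e` and the best we can do is move from them. This is only a sketch of current behaviour, not of the proposed relocating form; `Named`, `make_named`, `make_tuple_like` and the three helper functions are hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// Data-member case: all members public, no user-declared destructor.
struct Named {
    int id;
    std::string name;
};

Named make_named() { return {7, "seven"}; }
std::tuple<int, std::string> make_tuple_like() { return {8, "eight"}; }

// No ref-qualifier between `auto` and `[`: the hidden object e is initialised
// from the prvalue and the names denote its members. Today we can only move
// out of them; e.name is still destroyed at the end of the scope.
std::string take_name_from_members() {
    auto [id, name] = make_named();
    (void)id;
    return std::move(name);
}

// Tuple protocol: the names refer to the results of get<I> on e.
std::string take_name_from_tuple() {
    auto [id, name] = make_tuple_like();
    (void)id;
    return std::move(name);
}

// With a ref-qualifier, e is a reference bound to the (lifetime-extended)
// temporary, so decomposition into independent objects would not apply.
int read_id_through_reference() {
    const auto& [id, name] = make_named();
    (void)name;
    return id;
}
```

Calling `take_name_from_members()` yields "seven" and `take_name_from_tuple()` yields "eight"; in both cases a moved-from string is left behind in `e` and destroyed later, which is the extra work relocation would avoid.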
>>>
>>> Also, any issues caused by out-of-order destruction exist today for
>>> moving from those objects, for the most part.
>>>
>>
>> Yes, I agree. I guess what worries me the most is incorrect uses of
>> std::decompose.
>>
>>
>>>> For case 2 (tuple protocol), there is a very visible effect in which
>>>> customization point is called; it's the responsibility of the class author
>>>> to ensure that they are equivalent.
>>>>
>>>>> Another way would be to have a new statement: `auto reloc [x, y] =
>>>>> foo();` (or `auto &< [x, y] = foo();`...) to clearly mark that we want to
>>>>> split into individual objects. This best expresses the intent, and would
>>>>> not break existing code.
>>>>>
>>>>
>>>> As above, I don't think there's any real danger of breaking existing
>>>> code; class authors could write a bad destructuring get_all in the
>>>> tuple-like case, but that's their choice and responsibility.
>>>>
>>>> What I would worry about is the fallback being invisible and degrading
>>>> performance, but I think that we're fine if we just make `reloc x`
>>>> ill-formed in that case.
>>>>
>>>
>>> On second thoughts, I don't think it should be ill-formed; it should
>>> just fall back to move, just as it would for a class that is movable but
>>> not relocatable. If the class is truly relocate-only, then it would be
>>> ill-formed.
>>>
>>
>> It could act normally, simply changing the value-category, and ruling out
>> the relocation ctor during overload resolution. (Unless `x` is
>> ref-qualified, in which case it's equivalent to
>> `static_cast<decltype(x)>(x)`).
>> This is what we are doing when relocating a function parameter captured
>> by value, if the function ABI is caller-destroy.
>> In both cases it would be ill-formed for relocate-only types.
>>
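To make the `static_cast<decltype(x)>(x)` remark concrete, here is what that cast does in current C++ when `x` is declared as a reference. This only illustrates the existing cast, not `reloc` itself; the two function names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

// x is declared as an lvalue reference: the cast yields an lvalue, so the
// return value is copy-constructed and the source is left untouched.
inline std::string pass_on_lvalue_ref(std::string& x) {
    return static_cast<decltype(x)>(x);
}

// x is declared as an rvalue reference: the cast yields an xvalue, so the
// return value is move-constructed from the source.
inline std::string pass_on_rvalue_ref(std::string&& x) {
    return static_cast<decltype(x)>(x);
}
```

In other words the cast just restores the value category the reference was declared with, which is the same thing `std::forward` does for a forwarding reference.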
>> Note that `auto reloc [x, y] = foo()` could also give the guarantee that
>> `x` and `y` are individual objects (would be ill-formed otherwise). Hence
>> C++ would not silently rule out the relocation ctor.
>> Not that I am in favor of `auto reloc`; I think proper compiler error
>> messages can do the job of explaining why the relocation ctor was ruled out.
>>
>
Another thought - if `std::decompose` checks for private access to classes
with their own SMFs (to guarantee that they have the right to bypass
invariants), then writing instead `auto [x, y] = std::decompose<&S::x,
&S::y>(foo());` will guarantee relocation at the cost of some extra
verbosity.

>>>>>>>> Instead, consider: `std::decompose` is accessing (on behalf of its caller)
>>>>>>>> each direct subobject (base and data member) that is returned. But it is
>>>>>>>> also accessing the *other* direct subobjects that it does not return, in
>>>>>>>> order to destroy them. So let's say that to call std::decompose, you must
>>>>>>>> have access to each direct subobject, including those that you don't
>>>>>>>> request. (Plus their relocators or destructors, respectively.)
>>>>>>>>
>>>>>>>> Then `auto [p] = std::decompose<&PainterWithGuard::_p>(reloc
>>>>>>>> painterWithGuard);` would be ill-formed because in that context,
>>>>>>>> `painterWithGuard._guard` is ill-formed.
>>>>>>>>
>>>>>>>> Do you think this would work?
>>>>>>>>
>>>>>>>
>>>>>>> But if _guard were declared as public by mistake then all those
>>>>>>> safeties are bypassed and std::decompose will cause trouble.
>>>>>>>
>>>>>>
>>>>>> In that case, aggregate-style structured binding would likely also
>>>>>> work. And that's definitely the class author's fault.
>>>>>>
>>>>>
>>>>> Yes but it would not break things the way std::decompose does. (type
>>>>> may be movable but not decomposable).
>>>>>
>>>>
>>>> True. I guess you have to accept some small risk of breakage; even an
>>>> aggregate struct S { X x; Y y; }; could have invariants established
>>>> (post-construction) between x and y that break if x is destroyed before y.
>>>> But that's fragile code already; the author should have made those data
>>>> members private. At least they can forestall this by adding a user-declared
>>>> destructor.
>>>>
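A small sketch of the fragile aggregate described above, in today's C++. `X`, `Y` and `S` follow the names in the text; the logging helper and `destroy_one_s` are hypothetical additions for illustration. Members are destroyed in reverse declaration order, so `y` normally goes first and may still look at `x`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which destructors run.
inline std::vector<std::string>& destruction_log() {
    static std::vector<std::string> log;
    return log;
}

struct X {
    int value = 42;
    ~X() { destruction_log().push_back("X"); }
};

struct Y {
    const X* observed = nullptr;  // invariant is set up after construction
    ~Y() {
        // Relies on *observed still being alive at this point.
        if (observed != nullptr) {
            destruction_log().push_back("Y saw " + std::to_string(observed->value));
        } else {
            destruction_log().push_back("Y");
        }
    }
};

struct S { X x; Y y; };  // aggregate, every member public

inline std::vector<std::string> destroy_one_s() {
    destruction_log().clear();
    {
        S s{};
        s.y.observed = &s.x;  // post-construction invariant between x and y
    }                         // y is destroyed first, then x
    return destruction_log();
}
```

With ordinary destruction the log reads "Y saw 42" and then "X". If `x` were relocated out and `y` destroyed afterwards, `~Y` would read through a pointer to an object that is no longer there, which is the breakage being discussed.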
>>>
>> Yes. I agree that we need this, but I would have liked it to be safer.
>>
>> When I first proposed `get_bindings`, it took all subobjects of the class
>> passed by prvalue. The language itself would split the source object, and
>> pass all parts to get_bindings. This approach has several problems: it is
>> quite inconvenient to use when there is a large number of subobjects, and it
>> will not detect manual reordering of same-type data members in the class
>> declaration. And it does not even support arrays. But at least it was
>> *opt-in*. You could only decompose an object if the type had implemented
>> that weird get_bindings function, which would give the guarantee that this
>> was a safe operation.
>>
>> This is what std::decompose is lacking IMO. The class has almost no
>> say in this; it opts in by default. It can provide a user-defined
>> destructor, but that feels like opting out through a side effect.
>> I'd prefer things to be reversed: a class type opts out by default
>> (std::decompose is ill-formed) but can opt in.
>>
>
> As I'm proposing it, if a class has *any* private immediate subobjects
> (direct base or member), then `std::decompose` can only be called by code
> within the class access boundary: that is, the class and its friends.
>
> So "opt-in by default" really only applies to aggregate types, and by
> making all their immediate subobjects public they're implicitly opting in
> to anything the Standard adds in future. Plus quite a lot of classes like
> that are likely subject to structured binding aggregate decomposition
> anyway, so it'd be odd to say that `auto [x, y] = S(...)` is allowed but
> `auto [x, y] = std::decompose<&S::x, &S::y>(S(...))` is not.
>
> Another angle on the issue of user-defined destructors: how about if any
> class with a user-defined destructor (or, perhaps, any SMF) can only be
> `std::decompose`d by code within that class's access boundary, even if all
> its immediate subobjects are public? It'd be as if every class with a
> user-defined destructor has an implicit anonymous sizeless private member
> that must be accessible for `std::decompose` to be valid. This would
> protect classes that use SMFs to maintain invariants between public data
> members while allowing those classes to decompose themselves. It'd also
> mean that `std::unique_ptr::release(this unique_ptr)` can use
> std::decompose.
>
>>>>>>> This also limits the use of std::decompose to the class implementation and
>>>>>>> friends. It is not necessarily a bad thing as it requires the owner to know
>>>>>>> of the class internals (or at least to be in the position of knowing...).
>>>>>>> Besides I expect the main use of std::decompose to be from the new special
>>>>>>> get_bindings function.
>>>>>>>
>>>>>>
>>>>>> Yes, and that customization point (or some `get_all`,
>>>>>> `get<index_sequence<I...>>`, etc.) would be expected to be a member or
>>>>>> friend.
>>>>>>
>>>>>
>>>>> That could work. BTW I am more inclined to give it another name than
>>>>> `get` as we could have a tuple that contains an index_sequence as
>>>>> parameter, and then we don't know what std::get would refer to.
>>>>>
>>>>
>>>> Yes, good point.
>>>>
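The naming clash is easy to reproduce with the existing library: `std::get<T>` on a tuple already means "the element of type T", so a tuple that stores an `index_sequence` gives `get<index_sequence<I...>>` a meaning today. The aliases `Indices` and `Awkward` and the helper `first_element` are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

using Indices = std::index_sequence<0, 1>;

// A tuple that happens to store an index_sequence as one of its elements.
using Awkward = std::tuple<int, Indices, char>;

// std::get<Indices> is get-by-type: on an rvalue tuple it returns an rvalue
// reference to the stored index_sequence element, not "elements 0 and 1".
constexpr bool get_by_type_returns_the_element =
    std::is_same_v<decltype(std::get<Indices>(std::declval<Awkward>())), Indices&&>;

static_assert(get_by_type_returns_the_element,
              "get<index_sequence<...>> already selects an element by type");

// get-by-index is unaffected.
inline int first_element(const Awkward& t) { return std::get<0>(t); }
```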
>>>> Actually, from a brutally practical perspective, handling tuple-like
>>>> types could be omitted from an MVP. You'd still be able to destructure
>>>> std::pair by wrapping it in another (derived) type to suppress the tuple
>>>> protocol (since `first` and `second` are direct data members of the same
>>>> class), and authors of tuple-like types outside the Standard Library would
>>>> be able to write their own function to return the relocated elements as
>>>> data members of an aggregate.
>>>>
>>>
>> I don't see how it could work for a custom tuple class. You cannot
>> declare a structure with a variable number of heterogeneous types.
>> `template <class... Args> struct Tpl { Args... args; };` does not work
>> AFAIK. If the compiler does not provide one for you as it does for
>> std::tuple, then I don't see how boost::tuple and the like could support
>> this.
>>
>
> Boost.Tuple only supports up to 10 elements (it precedes variadic
> templates). You can support up to any configurable fixed number using
> (non-template) code generation, e.g. Boost.Preprocessor.
>
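Roughly what the generated code could look like, written out by hand for small arities in today's C++. Everything here is hypothetical (`Elements1`..`Elements3`, `MyTuple2`, `into_elements`); a generator such as a preprocessor macro would stamp out one aggregate per supported arity. The sketch uses move where the proposal would relocate.

```cpp
#include <cassert>
#include <string>
#include <utility>

// One plain aggregate per supported arity, in place of the impossible
// `struct Tpl { Args... args; }`.
template <class A0>
struct Elements1 { A0 e0; };

template <class A0, class A1>
struct Elements2 { A0 e0; A1 e1; };

template <class A0, class A1, class A2>
struct Elements3 { A0 e0; A1 e1; A2 e2; };

// Stand-in for a library's own fixed-arity tuple type.
template <class A0, class A1>
struct MyTuple2 {
    A0 first_;
    A1 second_;

    // Rvalue-qualified: hands every element out as a data member of an
    // aggregate, which structured bindings can then pick apart.
    Elements2<A0, A1> into_elements() && {
        return {std::move(first_), std::move(second_)};
    }
};

inline std::string decompose_demo() {
    MyTuple2<int, std::string> t{1, "one"};
    auto [n, s] = std::move(t).into_elements();
    return s + std::to_string(n);
}
```

`decompose_demo()` returns "one1". The cost is one aggregate template per arity up to whatever limit the library picks, which is the same trade-off Boost.Tuple already makes for its element count.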
>>>>>>> In addition, now that I think of it, the no user-provided special ctors
>>>>>>> and dtor clause is required to allow structured binding relocation from a
>>>>>>> class-type. In `auto [x, y] = reloc data`, data must be a C array (no problem
>>>>>>> there), a class-type with all data-members coming from the class-type or
>>>>>>> the same base, or provide a get_binding function that applies recursively.
>>>>>>> The end result of get_binding must fall into the C array case or the
>>>>>>> class-type case. For the class-type case, we must be allowed to split the
>>>>>>> type into individual subobjects, and have again that sort of guarantee that
>>>>>>> subobjects can be considered individually. Hence this no user-provided
>>>>>>> special ctors and dtor clause must apply to the class-type. This is the
>>>>>>> case for whatever std::decompose returns, but it is worth mentioning. And
>>>>>>> that bit makes me think that this clause should also apply to
>>>>>>> std::decompose.
>>>>>>>
>>>>>>
>>>>>> Remember, the motivation is for libraries (not necessarily the
>>>>>> Standard Library) to be able to write decompositions for their types; in
>>>>>> particular, for tuple types. For example, Boost.Tuple should be able to
>>>>>> support relocating decomposition without any change to its current ABI or
>>>>>> API. So user-provided special member functions must be allowed.
>>>>>>
>>>>>
>>>>> Fair point.
>>>>>
>>>>>
>>>>>>> I can't feel satisfied with this solution. We end up with two ways of
>>>>>>> decomposing: std::decompose and the get_binding function. get_binding is
>>>>>>> still necessary as you can't access nested subobjects from std::decompose,
>>>>>>> or even generate objects on demand when using structured bindings. And
>>>>>>> other times get_bindings can be bypassed if the user performs structured
>>>>>>> binding by calling std::decompose directly. I find this dual approach a bit
>>>>>>> confusing.
>>>>>>>
>>>>>>
>>>>>> Well, the `get` protocol is for public use, and `std::decompose` for
>>>>>> privileged access - although it can be used on aggregate types. Also it'd
>>>>>> be useful for prvalue qualified member functions, etc. where previously the
>>>>>> "union hack" would have been necessary, e.g.:
>>>>>>
>>>>>> template<class T>
>>>>>> T* unique_ptr<T>::release(this unique_ptr self) {
>>>>>>     // Split self into its subobjects; only ptr_ is requested here.
>>>>>>     auto const [ptr] = std::decompose<&unique_ptr::ptr_>(reloc self);
>>>>>>     return ptr;
>>>>>> }
>>>>>>
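For comparison, one common shape of the "union hack" in current C++: the object is moved into a union whose destructor does nothing, so its own destructor never runs and the caller becomes responsible for every subobject. `NoDestroy`, `OwningPtr`, `Counted` and `demo_release` are hypothetical names; `OwningPtr` is a heavily simplified stand-in, not the real `unique_ptr`.

```cpp
#include <cassert>
#include <utility>

// Wrapper whose destructor deliberately does nothing, so the wrapped object
// is never destroyed automatically.
template <class T>
union NoDestroy {
    T value;
    explicit NoDestroy(T&& v) : value(std::move(v)) {}
    ~NoDestroy() {}  // value's destructor is NOT run
};

template <class T>
class OwningPtr {
public:
    explicit OwningPtr(T* p) : ptr_(p) {}
    OwningPtr(OwningPtr&& other) noexcept
        : ptr_(std::exchange(other.ptr_, nullptr)) {}
    OwningPtr(const OwningPtr&) = delete;
    OwningPtr& operator=(const OwningPtr&) = delete;
    ~OwningPtr() { delete ptr_; }

    T* release() && {
        NoDestroy<OwningPtr> hold(std::move(*this));
        // From here on every subobject of hold.value is our responsibility.
        // There is only a raw pointer, so nothing else needs destroying; with
        // a non-trivial member (say a deleter) we would have to remember to
        // call its destructor by hand.
        return hold.value.ptr_;
    }

private:
    T* ptr_;
};

// Counts live objects so the effect of release() can be observed.
struct Counted {
    static inline int alive = 0;
    Counted() { ++alive; }
    ~Counted() { --alive; }
};

inline std::pair<int, int> demo_release() {
    OwningPtr<Counted> p(new Counted);
    Counted* raw = std::move(p).release();
    const int alive_after_release = Counted::alive;  // release() did not delete
    delete raw;
    return {alive_after_release, Counted::alive};
}
```

`demo_release()` returns {1, 0}: the pointee survives `release()` and is destroyed only by the explicit `delete`. Note that the moved-from `p` still has its destructor run, and nothing checks that all subobjects of `hold.value` were dealt with.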
>>>>>
>>>>> Yes, I agree with that. I worry about mistaken uses of std::decompose
>>>>> for classes where access control is badly designed.
>>>>>
>>>>
>>>> Yes. But if we don't provide it, the union hack will be used, and then
>>>> there's the same risk plus the risk of forgetting subobject destructors.
>>>> Maybe this is OK for an initial proposal.
>>>>
>>>
>> I think I'll propose std::decompose, and we will see the reception it
>> gets :) Although I am still thinking of a way to make it safer.
>>
>

Received on 2022-10-09 16:35:00