ISOCPP std-proposals List: Re: [std-proposals] Relocation in C++

From: Edward Catmur <ecatmur_at_[hidden]>
Date: Sun, 2 Oct 2022 22:09:13 +0100

On Fri, 30 Sept 2022 at 12:06, Edward Catmur <ecatmur_at_[hidden]> wrote:

> On Fri, 30 Sept 2022 at 09:33, Sébastien Bini <sebastien.bini_at_[hidden]>
> wrote:
>
>> On Thu, Sep 29, 2022 at 6:06 PM Edward Catmur <ecatmur_at_[hidden]>
>> wrote:
>>
>>> On Thu, 29 Sept 2022 at 14:12, Sébastien Bini <sebastien.bini_at_[hidden]>
>>> wrote:
>>>
>>>> On Mon, Sep 26, 2022 at 7:36 PM Edward Catmur <ecatmur_at_[hidden]>
>>>> wrote:
>>>>
>>>>> On Mon, 26 Sept 2022 at 15:20, Sébastien Bini <
>>>>> sebastien.bini_at_[hidden]> wrote:
>>>>>
>>>>>> I agree with the class invariant part.
>>>>>> But the no user-provided reloc, move and copy ctor clause was not
>>>>>> motivated because of the invariant, but because we wanted to be sure that
>>>>>> subobjects of a class could be independently relocated or destroyed. So at
>>>>>> minimum we must have a default destructor. And that's not enough.
>>>>>>
>>>>>> struct PainterGuard
>>>>>> {
>>>>>> Painter* _p;
>>>>>> StateGuard(Painter& p) : _p{&p} { _p->save(); }
>>>>>> ~StateGuard() { _p->restore(); }
>>>>>> };
>>>>>>
>>>>>> class PainterWithGuard
>>>>>> {
>>>>>> public:
>>>>>> PainterWithGuard(Painter p) : _p{reloc p}, _guard{_p} {}
>>>>>> PainterWithGuard() : _guard{_p} {}
>>>>>> PainterWithGuard(PainterWithGuard) /* reloc ctor */ { _guard._p =
>>>>>> &_p; }
>>>>>> public:
>>>>>> Painter _p;
>>>>>> private:
>>>>>> PainterGuard _guard;
>>>>>> };
>>>>>>
>>>>>> This is a perfectly valid class. Its default destructor works fine.
>>>>>> Unfortunately, from outside the class (with no access to private
>>>>>> data-members), doing:
>>>>>> `auto [p] = std::decompose<&PainterWithGuard::_p>(reloc
>>>>>> painterWithGuard);`
>>>>>> Will call `restore()` on a destructed/relocated object.
>>>>>>
>>>>>> Likewise doing (in the class implementation this time):
>>>>>> `auto [p, g] = std::decompose<&PainterWithGuard::_p,
>>>>>> &PainterWithGuard::_guard>(reloc painterWithGuard);`
>>>>>> Will construct a PainterGuard that points to invalid data. This one
>>>>>> can be incriminated to the class writer as it needs access to private data.
>>>>>>
>>>>>> I agree this class has a bad design, but its safe to use otherwise.
>>>>>> std::decompose shouldn't cause crashes on badly designed classes.
>>>>>>
>>>>>> IMO the no user-provided reloc, move and copy ctor clause gives us
>>>>>> that extra guarantee, that there are no relationships between subobjects
>>>>>> and as such it is safe to independently relocate or destroy them.
>>>>>>
>>>>>
>>>>> Great example, and that's the sort of thing I was trying to describe
>>>>> with my talk of private inheritance. (PainterGuard could be a private base
>>>>> of PainterWithGuard, perhaps.)
>>>>>
>>>>> But I can still write a similar class that is currently safe, has no
>>>>> user-defined special member functions but would be unsafe to decompose:
>>>>> instead of writing or deleting the relocating constructor, have
>>>>> PainterGuard inherit from boost::noncopyable. So the existence of
>>>>> user-defined special member functions is insufficient to determine whether
>>>>> a class is safe to decompose.
>>>>>
>>>>
>>>> Yes. But this example could also be viewed as to give us a hint that
>>>> there must be no user-provided default constructor as well.
>>>>
>>>
>>> Still problematic:
>>>
>>> class Greeter : private boost::noncopyable
>>> {
>>> public:
>>> std::atomic<std::string> name;
>>> void run() { thrd = std::jthrd([&] { std::println("Hello {}",
>>> name.load()); }); }
>>> private:
>>> std::jthread thrd;
>>> };
>>>
>>> Modifying `name` from user code is fine, so it's OK (if slightly odd)
>>> for it to be public, but `decompose`ing it out is not, since then the
>>> jthread is referring to a destroyed object. Yet `Greeter` has no special
>>> member functions! So the real issue is that user code is affecting an
>>> object (Greeter::thrd) to which it has no access.
>>>
>>
>> Hmm. It feels off to use access control as a guarantee of class safety.
>> IMO the real issue is that we are trying to decompose something that isn't
>> decomposable.
>>
>
> Access control *is* how we guarantee class safety. Everything within an
> access boundary is required to work together to maintain class invariants
> during and up until the end of class lifetime.
>
> Unfortunately there is no is_decomposable trait, and implementing one
>> seems impossible. Even a C array could not be decomposable because of some
>> dependencies between its elements: `std::any arr[2];` good luck trying to
>> find out what's safe to do.
>>
>> My main worry is that structured binding relocation is built on top of
>> structured binding. In `auto [x, y] = foo();`, what is to say that `x` and
>> `y` are structured bindings and not complete individual objects? For the
>> moment we stated the rules on how it would work but not what triggers it.
>>
>> We must be sure that statements that work fine today because they use
>> structured bindings will not break because they silently got turned into
>> individual complete objects, but their original type was not decomposable.
>> For instance, in C++17 `auto [x, y] = foo();` the hidden object of type E
>> is introduced, but E may not be decomposable, albeit safe to create from
>> `foo`. We must be sure that this statement does not break silently with
>> this proposal.
>>
>> We could simply state that if reloc is used on one of the identifiers
>> introduced in the structured binding syntax, and that said identifier is
>> not ref-qualified, then we use the new APIs to construct individual
>> complete objects instead of structured bindings. I don't know if that's
>> reasonable as it implies searching in the rest of the function body to know
>> how to interpret that first statement.
>>
>
> OK, let's run down the cases. Bear in mind that only objects of value
> category prvalue can be decomposed; that is, relocating structured binding
> would not apply if there are any ref-qualifiers between `auto` and `[`.
>
> For an array type: yes, there could be dependencies between the members,
> but an array prvalue can only be created in-place (it can't be returned
> from a function) and so the responsibility would be on the function author.
> So decomposition into individual, independent elements is fine.
>
> For a tuple-like type: the trigger for relocating/decomposing structured
> binding would be that the hidden object `e` is prvalue and that the (new,
> extension) customization point is found by lookup. So the responsibility is
> on the class author to ensure that decomposition is safe; if they can't
> guarantee that, they just don't provide the new customization point and the
> structured binding yields xvalue references to the results of calling
> get<I>.
>
> For binding to data members: yes, this should only proceed as
> decomposition if the class E (and every base class between E and the base
> class the data members are found in) does not have a user-defined
> destructor. Again, if this does not apply we fall back to structured
> bindings as references; lvalues in this case.
>
> For case 1 (array) and case 3 (binding to data members) this is completely
> invisible because of relocation elision; the effect is exactly as if the
> compiler were to track which array elements or data members are
> subsequently relocated and call the destructor on those which are not.
>

Also, any issues caused by out-of-order destruction exist today for moving
from those objects, for the most part.

For case 2 (tuple protocol), there is a very visible effect in which
> customization point is called; it's the responsibility of the class author
> to ensure that they are equivalent.
>
> Another way would be to have a new statement: `auto reloc [x, y] = foo();`
>> (or `auto &< [x, y] = foo();`...) to clearly mark that we want to split
>> into individual objects. This best expresses the intent, and would not
>> break existing code.
>>
>
> As above, I don't think there's any real danger of breaking existing code;
> class authors could write a bad destructuring get_all in the tuple-like
> case, but that's their choice and responsibility.
>
> What I would worry about is the fallback being invisible and degrading
> performance, but I think that we're fine if we just make `reloc x`
> ill-formed in that case.
>

On second thoughts, I don't think it should be ill-formed; it should just
fall back to move, just as it would for a class that is movable but not
relocatable. If the class is truly relocate-only, then it would be
ill-formed.

Instead, consider: `std::decompose` is accessing (on behalf of its caller)
>>>>> each direct subobject (base and data member) that is returned. But it is
>>>>> also accessing the *other* direct subobjects that it does not return, in
>>>>> order to destroy them. So let's say that to call std::decompose, you must
>>>>> have access to each direct subobject, including those that you don't
>>>>> request. (Plus their relocators or destructors, respectively.)
>>>>>
>>>>> Then `auto [p] = std::decompose<&PainterWithGuard::_p>(reloc
>>>>> painterWithGuard);` would be ill-formed because in that context,
>>>>> `painterWithGuard._guard` is ill-formed.
>>>>>
>>>>> Do you think this would work?
>>>>>
>>>>
>>>> But if _guard were declared as public by mistake then all those
>>>> safeties are bypassed and std::decompose will cause trouble.
>>>>
>>>
>>> In that case, aggregate-style structured binding would likely also work.
>>> And that's definitely the class author's fault.
>>>
>>
>> Yes but it would not break things the way std::decompose does. (type may
>> be movable but not decomposable).
>>
>
> True. I guess you have to accept some small risk of breakage; even an
> aggregate struct S { X x; Y y; }; could have invariants established
> (post-construction) between x and y that break if x is destroyed before y.
> But that's fragile code already; the author should have made those data
> members private. At least they can forestall this by adding a user-declared
> destructor.
>
> This also limits the use of std::decompose to the class implementation and
>>>> friends. It is not necessarily a bad thing as it requires the owner to know
>>>> of the class internals (or at least to be in the position of knowing...).
>>>> Besides I expect the main use of std::decompose to be from the new special
>>>> get_bindings function.
>>>>
>>>
>>> Yes, and that customization point (or some `get_all`,
>>> `get<index_sequence<I...>>`, etc.) would be expected to be a member or
>>> friend.
>>>
>>
>> That could work. BTW I am more inclined to give it another name than
>> `get` as we could have a tuple that contains an index_sequence as
>> parameter, and then we don't know what std::get would refer to.
>>
>
> Yes, good point.
>
> Actually, from a brutally practical perspective, handling tuple-like types
> could be omitted from a MVP. You'd still be able to destructure std::pair
> by wrapping it in another (derived) type to suppress the tuple protocol
> (since `first` and `second` are direct data members of the same class), and
> authors of tuple-like types outside the Standard Library would be able to
> write their own function to return the relocated elements as data members
> of an aggregate.
>
> In addition, now that I think of it, the no user-provided special ctors
>>>> and dtor clause is required to allow structure binding relocation from a
>>>> class-type. In auto [x, y] = reloc data, data must be a C array (no problem
>>>> there), a class-type with all data-members coming from the class-type or
>>>> the same base, or provide a get_binding function that applies recursively.
>>>> The end result of get_binding must fall into the C array case or the
>>>> class-type case. For the class-type case, we must be allowed to split the
>>>> type into individual subobjects, and have again that sort of guarantee that
>>>> subobjects can be considered individually. Hence this no user-provided
>>>> special ctors and dtor clause must apply to the class-type. This is the
>>>> case for whatever std::decompose returns, but it is worth mentioning. And
>>>> that bit makes me think that this clause should also apply to
>>>> std::decompose.
>>>>
>>>
>>> Remember, the motivation is for libraries (not necessarily the Standard
>>> Library) to be able to write decompositions for their types; in particular,
>>> for tuple types. For example, Boost.Tuple should be able to support
>>> relocating decomposition without any change to its current ABI or API. So
>>> user-provided special member functions must be allowed.
>>>
>>
>> Fair point.
>>
>>
>>> I can't feel satisfied with this solution. We end up with two ways of
>>>> decomposing: std::decompose and the get_binding function. get_binding is
>>>> still necessary as you can't access nested subobjects from std::decompose,
>>>> or even generate objects on demand when using structured bindings. And
>>>> other times get_bindings can be bypassed if the user performs structured
>>>> binding by calling std::decompose directly. I find this dual approach a bit
>>>> confusing.
>>>>
>>>
>>> Well, the `get` protocol is for public use, and `std::decompose` for
>>> privileged access - although it can be used on aggregate types. Also it'd
>>> be useful for prvalue qualified member functions, etc. where previously the
>>> "union hack" would have been necessary, e.g.:
>>>
>>> template<class T>
>>> T* unique_ptr<T>::release(this unique_ptr self) {
>>> auto const [ptr] = std::decompose<&unique_ptr::ptr_>(reloc self);
>>> return ptr;
>>> }
>>>
>>
>> Yes, I agree with that. I worry about mistaken uses of std::decompose for
>> classes where access control is badly designed.
>>
>
> Yes. But if we don't provide it, the union hack will be used, and then
> there's the same risk plus the risk of forgetting subobject destructors.
> Maybe this is OK for an initial proposal.
>
>

Received on 2022-10-02 21:09:26