ISOCPP std-proposals List: Re: [std-proposals] Relocation in C++

From: Edward Catmur <ecatmur_at_[hidden]>
Date: Fri, 30 Sep 2022 12:06:39 +0100

On Fri, 30 Sept 2022 at 09:33, Sébastien Bini <sebastien.bini_at_[hidden]>
wrote:

> On Thu, Sep 29, 2022 at 6:06 PM Edward Catmur <ecatmur_at_[hidden]>
> wrote:
>
>> On Thu, 29 Sept 2022 at 14:12, Sébastien Bini <sebastien.bini_at_[hidden]>
>> wrote:
>>
>>> On Mon, Sep 26, 2022 at 7:36 PM Edward Catmur <ecatmur_at_[hidden]>
>>> wrote:
>>>
>>>> On Mon, 26 Sept 2022 at 15:20, Sébastien Bini <sebastien.bini_at_[hidden]>
>>>> wrote:
>>>>
>>>>> I agree with the class invariant part.
>>>>> But the no user-provided reloc, move and copy ctor clause was not
>>>>> motivated because of the invariant, but because we wanted to be sure that
>>>>> subobjects of a class could be independently relocated or destroyed. So at
>>>>> minimum we must have a default destructor. And that's not enough.
>>>>>
>>>>> struct PainterGuard
>>>>> {
>>>>> Painter* _p;
>>>>> PainterGuard(Painter& p) : _p{&p} { _p->save(); }
>>>>> ~PainterGuard() { _p->restore(); }
>>>>> };
>>>>>
>>>>> class PainterWithGuard
>>>>> {
>>>>> public:
>>>>> PainterWithGuard(Painter p) : _p{reloc p}, _guard{_p} {}
>>>>> PainterWithGuard() : _guard{_p} {}
>>>>> PainterWithGuard(PainterWithGuard) /* reloc ctor */ { _guard._p =
>>>>> &_p; }
>>>>> public:
>>>>> Painter _p;
>>>>> private:
>>>>> PainterGuard _guard;
>>>>> };
>>>>>
>>>>> This is a perfectly valid class. Its default destructor works fine.
>>>>> Unfortunately, from outside the class (with no access to private
>>>>> data-members), doing:
>>>>> `auto [p] = std::decompose<&PainterWithGuard::_p>(reloc
>>>>> painterWithGuard);`
>>>>> Will call `restore()` on a destructed/relocated object.
>>>>>
>>>>> Likewise doing (in the class implementation this time):
>>>>> `auto [p, g] = std::decompose<&PainterWithGuard::_p,
>>>>> &PainterWithGuard::_guard>(reloc painterWithGuard);`
>>>>> Will construct a PainterGuard that points to invalid data. This one
>>>>> can be blamed on the class writer as it needs access to private data.
>>>>>
>>>>> I agree this class has a bad design, but it's safe to use otherwise.
>>>>> std::decompose shouldn't cause crashes on badly designed classes.
>>>>>
>>>>> IMO the no user-provided reloc, move and copy ctor clause gives us
>>>>> that extra guarantee, that there are no relationships between subobjects
>>>>> and as such it is safe to independently relocate or destroy them.
>>>>>
>>>>
>>>> Great example, and that's the sort of thing I was trying to describe
>>>> with my talk of private inheritance. (PainterGuard could be a private base
>>>> of PainterWithGuard, perhaps.)
>>>>
>>>> But I can still write a similar class that is currently safe, has no
>>>> user-defined special member functions but would be unsafe to decompose:
>>>> instead of writing or deleting the relocating constructor, have
>>>> PainterGuard inherit from boost::noncopyable. So the existence of
>>>> user-defined special member functions is insufficient to determine whether
>>>> a class is safe to decompose.
>>>>
>>>
>>> Yes. But this example could also be viewed as a hint that there must
>>> be no user-provided default constructor as well.
>>>
>>
>> Still problematic:
>>
>> class Greeter : private boost::noncopyable
>> {
>> public:
>> std::atomic<std::string> name;
>> void run() { thrd = std::jthread([&] { std::println("Hello {}",
>> name.load()); }); }
>> private:
>> std::jthread thrd;
>> };
>>
>> Modifying `name` from user code is fine, so it's OK (if slightly odd) for
>> it to be public, but `decompose`ing it out is not, since then the jthread
>> is referring to a destroyed object. Yet `Greeter` has no special member
>> functions! So the real issue is that user code is affecting an object
>> (Greeter::thrd) to which it has no access.
>>
>
> Hmm. It feels off to use access control as a guarantee of class safety.
> IMO the real issue is that we are trying to decompose something that isn't
> decomposable.
>

Access control *is* how we guarantee class safety. Everything within an
access boundary is required to work together to maintain class invariants
during and up until the end of class lifetime.

> Unfortunately there is no is_decomposable trait, and implementing one seems
> impossible. Even a C array could not be decomposable because of some
> dependencies between its elements: `std::any arr[2];` good luck trying to
> find out what's safe to do.
>
> My main worry is that structured binding relocation is built on top of
> structured binding. In `auto [x, y] = foo();`, what is to say that `x` and
> `y` are structured bindings and not complete individual objects? For the
> moment we stated the rules on how it would work but not what triggers it.
>
> We must be sure that statements that work fine today because they use
> structured bindings will not break because they silently got turned into
> individual complete objects, but their original type was not decomposable.
> For instance, in C++17 `auto [x, y] = foo();` the hidden object of type E
> is introduced, but E may not be decomposable, albeit safe to create from
> `foo`. We must be sure that this statement does not break silently with
> this proposal.
>
> We could simply state that if reloc is used on one of the identifiers
> introduced in the structured binding syntax, and that said identifier is
> not ref-qualified, then we use the new APIs to construct individual
> complete objects instead of structured bindings. I don't know if that's
> reasonable as it implies searching in the rest of the function body to know
> how to interpret that first statement.
>

OK, let's run down the cases. Bear in mind that only objects of value
category prvalue can be decomposed; that is, relocating structured binding
would not apply if there are any ref-qualifiers between `auto` and `[`.

For an array type: yes, there could be dependencies between the members,
but an array prvalue can only be created in-place (it can't be returned
from a function) and so the responsibility would be on the function author.
So decomposition into individual, independent elements is fine.

For a tuple-like type: the trigger for relocating/decomposing structured
binding would be that the hidden object `e` is prvalue and that the (new,
extension) customization point is found by lookup. So the responsibility is
on the class author to ensure that decomposition is safe; if they can't
guarantee that, they just don't provide the new customization point and the
structured binding yields xvalue references to the results of calling
get<I>.

For binding to data members: yes, this should only proceed as decomposition
if the class E (and every base class between E and the base class the data
members are found in) does not have a user-defined destructor. Again, if
this does not apply we fall back to structured bindings as references;
lvalues in this case.

For case 1 (array) and case 3 (binding to data members) this is completely
invisible because of relocation elision; the effect is exactly as if the
compiler were to track which array elements or data members are
subsequently relocated and call the destructor on those which are not.

For case 2 (tuple protocol), there is a very visible effect in which
customization point is called; it's the responsibility of the class author
to ensure that they are equivalent.
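
For reference, here is a minimal sketch of what case 2 looks like in C++17
today, before any new customization point exists. `Labelled`, `get_calls`
and `bind_labelled` are hypothetical names used only for illustration; the
counter is there just to make the calls to get<I> observable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical tuple-like class with private data members.
class Labelled {
public:
    Labelled(int id, std::string label) : id_(id), label_(std::move(label)) {}

    // Member get<I>: found by structured binding because tuple_size is specialized.
    // Returns copies, so each binding refers to a lifetime-extended temporary.
    template<std::size_t I>
    auto get() const {
        ++get_calls;
        if constexpr (I == 0) return id_;
        else return label_;
    }

    static inline int get_calls = 0;  // illustration only

private:
    int id_;
    std::string label_;
};

namespace std {
template<> struct tuple_size<Labelled> : integral_constant<size_t, 2> {};
template<> struct tuple_element<0, Labelled> { using type = int; };
template<> struct tuple_element<1, Labelled> { using type = std::string; };
}

// Binds a prvalue Labelled and reports how many times get<I> was called.
inline int bind_labelled() {
    Labelled::get_calls = 0;
    auto [id, label] = Labelled{7, "seven"};
    assert(id == 7 && label == "seven");
    return Labelled::get_calls;
}
```

Under the scheme discussed above, a class like this would opt in to
decomposition only by also providing the new customization point; absent
that, the behaviour would stay as sketched here.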

> Another way would be to have a new statement: `auto reloc [x, y] = foo();`
> (or `auto &< [x, y] = foo();`...) to clearly mark that we want to split
> into individual objects. This best expresses the intent, and would not
> break existing code.
>

As above, I don't think there's any real danger of breaking existing code;
class authors could write a bad destructuring get_all in the tuple-like
case, but that's their choice and responsibility.

What I would worry about is the fallback being invisible and degrading
performance, but I think that we're fine if we just make `reloc x`
ill-formed in that case.

>>>> Instead, consider: `std::decompose` is accessing (on behalf of its caller)
>>>> each direct subobject (base and data member) that is returned. But it is
>>>> also accessing the *other* direct subobjects that it does not return, in
>>>> order to destroy them. So let's say that to call std::decompose, you must
>>>> have access to each direct subobject, including those that you don't
>>>> request. (Plus their relocators or destructors, respectively.)
>>>>
>>>> Then `auto [p] = std::decompose<&PainterWithGuard::_p>(reloc
>>>> painterWithGuard);` would be ill-formed because in that context,
>>>> `painterWithGuard._guard` is ill-formed.
>>>>
>>>> Do you think this would work?
>>>>
>>>
>>> But if _guard were declared as public by mistake then all those safeties
>>> are bypassed and std::decompose will cause trouble.
>>>
>>
>> In that case, aggregate-style structured binding would likely also work.
>> And that's definitely the class author's fault.
>>
>
> Yes but it would not break things the way std::decompose does. (type may
> be movable but not decomposable).
>

True. I guess you have to accept some small risk of breakage; even an
aggregate struct S { X x; Y y; }; could have invariants established
(post-construction) between x and y that break if x is destroyed before y.
But that's fragile code already; the author should have made those data
members private. At least they can forestall this by adding a user-declared
destructor.
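
To make the ordering concern concrete, here is a small sketch in current
C++ (no relocation involved). `X`, `Y`, `S`, `destruction_log` and
`destroy_one_s` are hypothetical illustration names. It only shows the order
that code may be silently relying on today: data members are destroyed in
reverse declaration order, so `y` always dies before `x`.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Records destructor calls so the order can be inspected.
inline std::vector<std::string>& destruction_log() {
    static std::vector<std::string> log;
    return log;
}

struct X {
    ~X() { destruction_log().push_back("X"); }
};

struct Y {
    ~Y() { destruction_log().push_back("Y"); }
};

// Hypothetical aggregate with no user-declared destructor.
struct S {
    X x;
    Y y;
};

static_assert(std::is_aggregate_v<S>, "S is an aggregate");

// Creates and destroys one S, returning the order in which its members died.
inline std::vector<std::string> destroy_one_s() {
    destruction_log().clear();
    {
        S s{};
        (void)s;
    }  // s destroyed here: y first, then x
    return destruction_log();
}
```

If `x` were relocated out of an `S` while `y` was still alive, `x`'s
lifetime would end first, which is the reverse of the order shown here.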

>>> This also limits the use of std::decompose to the class implementation and
>>> friends. It is not necessarily a bad thing as it requires the owner to know
>>> of the class internals (or at least to be in the position of knowing...).
>>> Besides I expect the main use of std::decompose to be from the new special
>>> get_bindings function.
>>>
>>
>> Yes, and that customization point (or some `get_all`,
>> `get<index_sequence<I...>>`, etc.) would be expected to be a member or
>> friend.
>>
>
> That could work. BTW I am more inclined to give it another name than `get`
> as we could have a tuple that contains an index_sequence as parameter, and
> then we don't know what std::get would refer to.
>

Yes, good point.

Actually, from a brutally practical perspective, handling tuple-like types
could be omitted from an MVP. You'd still be able to destructure std::pair
by wrapping it in another (derived) type to suppress the tuple protocol
(since `first` and `second` are direct data members of the same class), and
authors of tuple-like types outside the Standard Library would be able to
write their own function to return the relocated elements as data members
of an aggregate.
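
A rough sketch of the wrapping trick, written in C++17 as it stands today.
`PairMembers` and `bind_pair_members` are hypothetical names. Because
std::tuple_size is not specialized for the derived type, the structured
binding falls through to the data-member case and binds `first` and
`second` directly; in current C++ these are still names for members of the
hidden object rather than independent objects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical wrapper: deriving from std::pair hides the tuple protocol,
// since std::tuple_size<PairMembers<A, B>> is an incomplete type.
template<class A, class B>
struct PairMembers : std::pair<A, B> {
    explicit PairMembers(std::pair<A, B> p) : std::pair<A, B>(std::move(p)) {}
};

// Binds to the data members `first` and `second` of the pair base class.
inline std::size_t bind_pair_members() {
    auto [first, second] =
        PairMembers<int, std::string>(std::pair<int, std::string>(3, "abc"));
    return static_cast<std::size_t>(first) + second.size();
}
```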

>>> In addition, now that I think of it, the no user-provided special ctors and
>>> dtor clause is required to allow structured binding relocation from a
>>> class-type. In auto [x, y] = reloc data, data must be a C array (no problem
>>> there), a class-type with all data-members coming from the class-type or
>>> the same base, or provide a get_binding function that applies recursively.
>>> The end result of get_binding must fall into the C array case or the
>>> class-type case. For the class-type case, we must be allowed to split the
>>> type into individual subobjects, and have again that sort of guarantee that
>>> subobjects can be considered individually. Hence this no user-provided
>>> special ctors and dtor clause must apply to the class-type. This is the
>>> case for whatever std::decompose returns, but it is worth mentioning. And
>>> that bit makes me think that this clause should also apply to
>>> std::decompose.
>>>
>>
>> Remember, the motivation is for libraries (not necessarily the Standard
>> Library) to be able to write decompositions for their types; in particular,
>> for tuple types. For example, Boost.Tuple should be able to support
>> relocating decomposition without any change to its current ABI or API. So
>> user-provided special member functions must be allowed.
>>
>
> Fair point.
>
>
>>> I can't feel satisfied with this solution. We end up with two ways of
>>> decomposing: std::decompose and the get_binding function. get_binding is
>>> still necessary as you can't access nested subobjects from std::decompose,
>>> or even generate objects on demand when using structured bindings. And
>>> other times get_bindings can be bypassed if the user performs structured
>>> binding by calling std::decompose directly. I find this dual approach a bit
>>> confusing.
>>>
>>
>> Well, the `get` protocol is for public use, and `std::decompose` for
>> privileged access - although it can be used on aggregate types. Also it'd
>> be useful for prvalue qualified member functions, etc. where previously the
>> "union hack" would have been necessary, e.g.:
>>
>> template<class T>
>> T* unique_ptr<T>::release(this unique_ptr self) {
>> auto const [ptr] = std::decompose<&unique_ptr::ptr_>(reloc self);
>> return ptr;
>> }
>>
>
> Yes, I agree with that. I worry about mistaken uses of std::decompose for
> classes where access control is badly designed.
>

Yes. But if we don't provide it, the union hack will be used, and then
there's the same risk plus the risk of forgetting subobject destructors.
Maybe this is OK for an initial proposal.
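
For readers who have not seen it, here is a minimal sketch of the kind of
union hack meant here, in C++17. `Named`, `NoDestroy` and `take_name` are
hypothetical names, and this is only one possible shape of the idiom: the
object is parked in a union so that its destructor never runs implicitly,
one member is moved out, and every subobject must then be destroyed by
hand.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class with two non-trivial members.
struct Named {
    std::string name;
    std::string tag;
};

// A union never destroys its members implicitly.
template<class T>
union NoDestroy {
    T value;
    explicit NoDestroy(T&& v) : value(std::move(v)) {}
    ~NoDestroy() {}  // intentionally leaves `value` alone
};

// Extracts `name`, then destroys each subobject manually.
inline std::string take_name(Named n) {
    NoDestroy<Named> hold(std::move(n));
    std::string out = std::move(hold.value.name);
    std::destroy_at(&hold.value.name);
    std::destroy_at(&hold.value.tag);  // easy to forget: omitting this line leaks
    return out;
}
```

Nothing in the language checks that the list of destroy_at calls is
complete, which is the risk of forgetting subobject destructors mentioned
above.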

Received on 2022-09-30 11:06:52