On Fri, 30 Sept 2022 at 12:06, Edward Catmur <ecatmur@googlemail.com> wrote:
On Fri, 30 Sept 2022 at 09:33, Sébastien Bini <sebastien.bini@gmail.com> wrote:
On Thu, Sep 29, 2022 at 6:06 PM Edward Catmur <ecatmur@googlemail.com> wrote:
On Thu, 29 Sept 2022 at 14:12, Sébastien Bini <sebastien.bini@gmail.com> wrote:
On Mon, Sep 26, 2022 at 7:36 PM Edward Catmur <ecatmur@googlemail.com> wrote:
On Mon, 26 Sept 2022 at 15:20, Sébastien Bini <sebastien.bini@gmail.com> wrote:
I agree with the class invariant part.
But the "no user-provided reloc, move and copy ctor" clause was not motivated by the invariant; it was there because we wanted to be sure that subobjects of a class could be independently relocated or destroyed. So at minimum we must have a default destructor. And that's not enough.

struct PainterGuard
{
    Painter* _p;
    PainterGuard(Painter& p) : _p{&p} { _p->save(); }
    ~PainterGuard() { _p->restore(); }
};

class PainterWithGuard
{
public:
    PainterWithGuard(Painter p) : _p{reloc p}, _guard{_p} {}
    PainterWithGuard() : _guard{_p} {}
    PainterWithGuard(PainterWithGuard) /* reloc ctor */ { _guard._p = &_p; }
public:
    Painter _p;
private:
    PainterGuard _guard;
};

This is a perfectly valid class. Its default destructor works fine. Unfortunately, from outside the class (with no access to private data-members), doing:
`auto [p] = std::decompose<&PainterWithGuard::_p>(reloc painterWithGuard);`
will call `restore()` on a destroyed (relocated) object.

Likewise doing (in the class implementation this time):
`auto [p, g] = std::decompose<&PainterWithGuard::_p, &PainterWithGuard::_guard>(reloc painterWithGuard);`
will construct a PainterGuard that points to invalid data. This one can be blamed on the class writer, as it needs access to private data.

I agree this class has a bad design, but it's safe to use otherwise. std::decompose shouldn't cause crashes on badly designed classes.

IMO the no user-provided reloc, move and copy ctor clause gives us that extra guarantee, that there are no relationships between subobjects and as such it is safe to independently relocate or destroy them.

Great example, and that's the sort of thing I was trying to describe with my talk of private inheritance. (PainterGuard could be a private base of PainterWithGuard, perhaps.)

But I can still write a similar class that is currently safe, has no user-defined special member functions but would be unsafe to decompose: instead of writing or deleting the relocating constructor, have PainterGuard inherit from boost::noncopyable.  So the existence of user-defined special member functions is insufficient to determine whether a class is safe to decompose.

Yes. But this example could also be viewed as a hint that there must be no user-provided default constructor either.

Still problematic:

class Greeter : private boost::noncopyable
{
public:
    std::atomic<std::string> name;
    void run() { thrd = std::jthread([&] { std::println("Hello {}", name.load()); }); }
private:
    std::jthread thrd;
};

Modifying `name` from user code is fine, so it's OK (if slightly odd) for it to be public, but `decompose`ing it out is not, since then the jthread is referring to a destroyed object. Yet `Greeter` has no special member functions! So the real issue is that user code is affecting an object (Greeter::thrd) to which it has no access.

Hmm. It feels off to use access control as a guarantee of class safety. IMO the real issue is that we are trying to decompose something that isn't decomposable.

Access control *is* how we guarantee class safety. Everything within an access boundary is required to work together to maintain class invariants during and up until the end of class lifetime.

Unfortunately there is no is_decomposable trait, and implementing one seems impossible. Even a C array could not be decomposable because of some dependencies between its elements: `std::any arr[2];` good luck trying to find out what's safe to do.

My main worry is that structured binding relocation is built on top of structured binding. In `auto [x, y] = foo();`, what is to say that `x` and `y` are structured bindings and not complete individual objects? For the moment we stated the rules on how it would work but not what triggers it.

We must be sure that statements that work fine today because they use structured bindings will not break because they silently got turned into individual complete objects, but their original type was not decomposable. For instance, in C++17 `auto [x, y] = foo();` the hidden object of type E is introduced, but E may not be decomposable, albeit safe to create from `foo`. We must be sure that this statement does not break silently with this proposal.
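To make the C++17 baseline concrete, here is a minimal sketch of the hidden object in today's language. `Pair`, `foo` and `count_pair_destructions` are names invented for this illustration; nothing here uses the proposed `reloc` or `std::decompose`.

```cpp
#include <string>

// Hypothetical aggregate standing in for the type E of the hidden object.
// It stays an aggregate even with a user-provided destructor.
struct Pair {
    std::string x;
    std::string y;
    static inline int dtor_calls = 0;   // counts runs of ~Pair()
    ~Pair() { ++dtor_calls; }
};

// Stand-in for `foo()` in the discussion; returns a prvalue.
inline Pair foo() { return Pair{"left", "right"}; }

// Returns how many times ~Pair() ran for one structured binding declaration.
inline int count_pair_destructions() {
    Pair::dtor_calls = 0;
    {
        auto [x, y] = foo();  // one hidden Pair e; x and y name e.x and e.y
        x += "!";             // modifies e.x in place, no separate object exists
    }                         // e is destroyed here as a whole
    return Pair::dtor_calls;
}
```

The destructor of `Pair` runs exactly once, for the hidden object as a whole. That is the behaviour that must not change silently for types that are not safe to split.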

We could simply state that if reloc is used on one of the identifiers introduced in the structured binding syntax, and that said identifier is not ref-qualified, then we use the new APIs to construct individual complete objects instead of structured bindings. I don't know if that's reasonable as it implies searching in the rest of the function body to know how to interpret that first statement.

OK, let's run down the cases. Bear in mind that only objects of value category prvalue can be decomposed; that is, relocating structured binding would not apply if there are any ref-qualifiers between `auto` and `[`.

For an array type: yes, there could be dependencies between the members, but an array prvalue can only be created in-place (it can't be returned from a function) and so the responsibility would be on the function author. So decomposition into individual, independent elements is fine.

For a tuple-like type: the trigger for relocating/decomposing structured binding would be that the hidden object `e` is prvalue and that the (new, extension) customization point is found by lookup. So the responsibility is on the class author to ensure that decomposition is safe; if they can't guarantee that, they just don't provide the new customization point and the structured binding yields xvalue references to the results of calling get<I>.
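For reference, the fallback described here is just today's tuple protocol. A minimal sketch in C++17, where `Person`, `make_person` and `name_via_tuple_protocol` are hypothetical names and the new customization point is deliberately absent because it does not exist yet:

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical tuple-like class, invented for this sketch.
class Person {
public:
    Person(std::string name, int age) : name_(std::move(name)), age_(age) {}

    // Chosen for `auto& [n, a] = person;` (hidden entity is an lvalue).
    template <std::size_t I>
    auto& get() & {
        if constexpr (I == 0) return name_;
        else return age_;
    }

    // Chosen for `auto [n, a] = make_person();` (hidden object is treated as an xvalue).
    template <std::size_t I>
    auto&& get() && {
        if constexpr (I == 0) return std::move(name_);
        else return std::move(age_);
    }

private:
    std::string name_;
    int age_;
};

namespace std {
template <> struct tuple_size<Person> : integral_constant<size_t, 2> {};
template <> struct tuple_element<0, Person> { using type = string; };
template <> struct tuple_element<1, Person> { using type = int; };
}

inline Person make_person() { return Person{"Ada", 36}; }

inline std::string name_via_tuple_protocol() {
    auto [name, age] = make_person();  // name and age are references into the hidden Person
    (void)age;
    return std::move(name);            // moves out of the hidden object's name_
}
```

The hidden `Person` still exists and is destroyed as a whole at the end of the scope; the bindings only refer into it, which is why this path stays safe for types that cannot be decomposed.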

For binding to data members: yes, this should only proceed as decomposition if the class E (and every base class between E and the base class the data members are found in) does not have a user-defined destructor. Again, if this does not apply we fall back to structured bindings as references; lvalues in this case.

For case 1 (array) and case 3 (binding to data members) this is completely invisible because of relocation elision; the effect is exactly as if the compiler were to track which array elements or data members are subsequently relocated and call the destructor on those which are not.

Also, any issues caused by out-of-order destruction exist today for moving from those objects, for the most part.

For case 2 (tuple protocol), there is a very visible effect in which customization point is called; it's the responsibility of the class author to ensure that they are equivalent.

Another way would be to have a new statement: `auto reloc [x, y] = foo();` (or `auto &< [x, y] = foo();`...) to clearly mark that we want to split into individual objects. This best expresses the intent, and would not break existing code.

As above, I don't think there's any real danger of breaking existing code; class authors could write a bad destructuring get_all in the tuple-like case, but that's their choice and responsibility.

What I would worry about is the fallback being invisible and degrading performance, but I think that we're fine if we just make `reloc x` ill-formed in that case.

On second thoughts, I don't think it should be ill-formed; it should just fall back to move, just as it would for a class that is movable but not relocatable. If the class is truly relocate-only, then it would be ill-formed.

Instead, consider: `std::decompose` is accessing (on behalf of its caller) each direct subobject (base and data member) that is returned. But it is also accessing the *other* direct subobjects that it does not return, in order to destroy them.  So let's say that to call std::decompose, you must have access to each direct subobject, including those that you don't request. (Plus their relocators or destructors, respectively.)

Then `auto [p] = std::decompose<&PainterWithGuard::_p>(reloc painterWithGuard);` would be ill-formed because in that context, `painterWithGuard._guard` is ill-formed.
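As a rough analogue in today's C++, access checking already takes part in template substitution, so "can this context name `_guard`?" is something the language can answer. The sketch below uses simplified stand-ins for the classes above, and `AllPublic` and `can_name_guard` are names invented here; it is not an implementation of the proposed rule.

```cpp
#include <type_traits>
#include <utility>

// Simplified stand-ins for the classes in the thread.
struct Painter { int depth = 0; };

class PainterWithGuard {
public:
    Painter _p;
    bool guarded() const { return _guard == &_p; }
private:
    Painter* _guard = &_p;   // stand-in for the private PainterGuard member
};

// Hypothetical counterpart where every subobject is public.
struct AllPublic {
    Painter _p;
    Painter* _guard;
};

// True when `t._guard` is well-formed as named from namespace scope,
// i.e. from a context with no special access to T.
template <class T, class = void>
struct can_name_guard : std::false_type {};

template <class T>
struct can_name_guard<T, std::void_t<decltype(std::declval<T&>()._guard)>>
    : std::true_type {};

static_assert(!can_name_guard<PainterWithGuard>::value);
static_assert(can_name_guard<AllPublic>::value);
```

Under the suggested rule, a call to `std::decompose` from such a context would be rejected for `PainterWithGuard` and accepted for `AllPublic`.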

Do you think this would work?

But if _guard were declared as public by mistake then all those safeties are bypassed and std::decompose will cause trouble.

In that case, aggregate-style structured binding would likely also work. And that's definitely the class author's fault.

Yes, but it would not break things the way std::decompose does (a type may be movable but not decomposable).

True. I guess you have to accept some small risk of breakage; even an aggregate struct S { X x; Y y; }; could have invariants established (post-construction) between x and y that break if x is destroyed before y. But that's fragile code already; the author should have made those data members private. At least they can forestall this by adding a user-declared destructor.

This also limits the use of std::decompose to the class implementation and friends. It is not necessarily a bad thing as it requires the owner to know of the class internals (or at least to be in the position of knowing...). Besides I expect the main use of std::decompose to be from the new special get_bindings function.

Yes, and that customization point (or some `get_all`, `get<index_sequence<I...>>`, etc.) would be expected to be a member or friend.

That could work. BTW I am more inclined to give it another name than `get` as we could have a tuple that contains an index_sequence as parameter, and then we don't know what std::get would refer to.

Yes, good point.

Actually, from a brutally practical perspective, handling tuple-like types could be omitted from an MVP. You'd still be able to destructure std::pair by wrapping it in another (derived) type to suppress the tuple protocol (since `first` and `second` are direct data members of the same class), and authors of tuple-like types outside the Standard Library would be able to write their own function to return the relocated elements as data members of an aggregate.

In addition, now that I think of it, the "no user-provided special ctors and dtor" clause is required to allow structured binding relocation from a class type. In `auto [x, y] = reloc data`, data must be a C array (no problem there), a class type with all data members coming from that class type or from the same base, or a type that provides a get_binding function that applies recursively. The end result of get_binding must fall into the C array case or the class-type case. For the class-type case, we must be allowed to split the type into individual subobjects, and again have that sort of guarantee that subobjects can be considered individually. Hence this "no user-provided special ctors and dtor" clause must apply to the class type. This is the case for whatever std::decompose returns, but it is worth mentioning. And that bit makes me think that this clause should also apply to std::decompose.

Remember, the motivation is for libraries (not necessarily the Standard Library) to be able to write decompositions for their types; in particular, for tuple types. For example, Boost.Tuple should be able to support relocating decomposition without any change to its current ABI or API. So user-provided special member functions must be allowed.

Fair point.
 
I can't feel satisfied with this solution. We end up with two ways of decomposing: std::decompose and the get_binding function. get_binding is still necessary as you can't access nested subobjects from std::decompose, or even generate objects on demand when using structured bindings. And other times get_bindings can be bypassed if the user performs structured binding by calling std::decompose directly. I find this dual approach a bit confusing.

Well, the `get` protocol is for public use, and `std::decompose` for privileged access - although it can be used on aggregate types. Also it'd be useful for prvalue qualified member functions, etc. where previously the "union hack" would have been necessary, e.g.:

template<class T>
T* unique_ptr<T>::release(this unique_ptr self) {
    auto const [ptr] = std::decompose<&unique_ptr::ptr_>(reloc self);
    return ptr;
}

Yes, I agree with that. I worry about mistaken uses of std::decompose for classes where access control is badly designed.

Yes. But if we don't provide it, the union hack will be used, and then there's the same risk plus the risk of forgetting subobject destructors. Maybe this is OK for an initial proposal.
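For readers who have not met the "union hack", here is a minimal sketch in today's C++. `Handle` and `NoDtor` are hypothetical names, and a move stands in for relocation since `reloc` is not available; the point is only to show where the risk of forgetting subobject destructors comes from.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical resource type: owns a raw int* plus a label.
class Handle {
public:
    Handle(int* p, std::string label) : ptr_(p), label_(std::move(label)) {}
    Handle(Handle&& o) noexcept
        : ptr_(std::exchange(o.ptr_, nullptr)), label_(std::move(o.label_)) {}
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
    ~Handle() { delete ptr_; }

    // Consumes `self` and returns the raw pointer without ever running
    // ~Handle() on the object that holds it.
    static int* release(Handle self) {
        // A union never destroys its members implicitly.
        union NoDtor {
            Handle value;
            explicit NoDtor(Handle&& h) : value(std::move(h)) {}
            ~NoDtor() {}   // deliberately leaves `value` alive
        };
        NoDtor hack{std::move(self)};
        int* p = hack.value.ptr_;
        // ~Handle() is skipped, so every other subobject must be destroyed
        // by hand; omitting this line would leak the string's storage.
        std::destroy_at(&hack.value.label_);
        return p;
    }

private:
    int* ptr_;
    std::string label_;
};
```

Usage would look like `int* raw = Handle::release(Handle(new int(7), "seven"));`, after which the caller owns `raw`. Each member added to `Handle` later needs its own manual destruction in `release`, which is exactly the bookkeeping that `std::decompose` would take over.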