std-proposals: Re: Add a specialized "Extension" concept to the inheritance syntax

From: Mark A. Gibbs <indi.in.the.wired_at_[hidden]>
Date: Sun, 19 May 2019 19:50:25 -0400

On 2019-05-19 7:26 a.m., Ofri Sadowsky via Std-Proposals wrote:
>
> The killer problem with this idea, I think, is this:
>
> struct Base1
> {
> virtual void setup();
> virtual void cleanup();
>
> // ...
> };
>
> struct Base2
> {
> virtual void setup();
> virtual void cleanup();
>
> // ...
> };
>
> struct Base3
> {
> virtual void setup();
> virtual void cleanup();
>
> // ...
> };
>
> struct Derived : Base1, Base2, Base3
> {
> void setup() override
> {
> do_base_calls_here;
>
> /* The above is expanded by the compiler to:
> Base1::setup();
> Base2::setup();
> Base3::setup();
> */
>
> // any other set up
> }
>
> // ...
> };
>
> What happens if Base2::setup() throws? Base1's setup is already
> complete, so we probably need to call Base1::cleanup() or we're
> going to get a resource leak of some kind. How can the
> "do_base_calls_here" keyword know to do that?
>
> And what happens of the constructor of Base2 throws during the
> construction of Derived and after Base1 is already constructed? We
> should follow the same strategy.

Which is what?

This idea works with constructors because there is a clearly-defined
cleanup function - the destructor - paired with every constructor. How
would this work with an arbitrary function? How would the compiler know
that cleanup() is the correct function to call when cleanup is needed
after setup()?

For this to work what the compiler needs to generate is effectively this:

void Derived::setup() {
    /*
    Base1::setup();
    try
    {
        Base2::setup();
        try
       {
           Base3::setup();
       }
        catch (...)
        {
           // cleanup Base2::setup();
            throw;
        }
    }
    catch (...)
    {
        // cleanup Base1::setup();
        throw;
    }
    */
}

The compiler needs to know what to do in each "cleanup
Base/N/::setup()". For constuctors this is no problem; the cleanup for a
constructor is always the destructor. But how would it work for an
arbitrary function? How does the compiler know that the right thing to
do is call Base/N/::cleanup() and not some other function Base/N/::end()
or Base/N/::_do_cleanup() (or even possibly nothing at all)?

If you insist on defining the rules for extensibility in the base class,
then as you noted, the base class would have to define what the cleanup
function is. Which means your syntax would have to be something like this:

struct Base {

    tail_extensible(do_the_cleanup) void setup();

    /* virtual? or *_extensible? */ void do_the_cleanup();

};

... or some other way of telling the compiler that if setup() needs to
be undone, do_the_cleanup() is the way to do it.

> You can get a similar but even more nefarious problem with only
> two bases and noexcept functions like this:
>
> struct Base1
> {
> virtual void setup(std::string) noexcept;
> virtual void cleanup();
>
> // ...
> };
>
> struct Base2
> {
> virtual void setup(std::string) noexcept;
> virtual void cleanup();
>
> // ...
> };
>
> struct Derived : Base1, Base2
> {
> void setup(std::string s) noexcept override
> {
> do_base_calls_here;
>
> /* Expanded to:
> Base1::setup(s);
> Base2::setup(s);
> */
>
> // etc.
> }
> };
>
> s is copied twice, and the second copy might throw even though the
> actual function calls are all noexcept.
>
>
> I'm not sure what your question is in this case. Exceptions can
> happen regardless and independently of the choice of using method
> extension or the way we implement the extension.

The question is how do we deal with them.

> As for the copy, it's the designer of the virtual method who decided
> to use pass by value and bear its cost.

Sure, okay... but what is the cost? That's the question. You can't
decide if or how to deal with the cost of something until you know what
that cost is.

If an extensible function takes an argument by value, is there no way to
use this mechanism safely? For example, if I'm writing the calls out
manually, I can do this (assume it's all wrapped in a try block and
provides some way to signal failure):

void Derived::setup(std::string s) noexcept
{
    auto s1 = s;
    auto s2 = s;

    // At this point I can call the base class setup functions safely.
    Base1::setup(std::move(s1));
    Base2::setup(std::move(s2));

    // and I can still have s to use for this function if I need it.
}

Does the extension mechanism provide no help or guarantees here? Even
though all the base class setup() functions are noexcept, could I still
be surprised by an exception coming out of invisible code, and there's
nothing I can do about it?

> Also, the second copy might not even be necessary. If the
> programmer knows s won't be needed any more, they could manually
> write the second call as "Base2::setup(std::move(s));". So this
> construct isn't only dangerous, it's inefficient.
>
> And yet, one can do that even today! On the contrary, if we formally
> declare tail/head extension and pass an rvalue reference, we can
> define a rule that forbids any derived class to use it -- it's the
> base class designer who's supposed to do something with it, probably
> emptying or so, and it's no longer available to the derived class.
> What would be the point of defining such a case with "virtual" and
> then letting the derived class access an rvalue-reference parameter
> which the base class should have emptied by now?
>
> Of all the points raised here, actually the issue of rvalue reference
> with multiple inheritance may be hardest to solve, even with existing
> virtual methods. Actually, again, in the virtual diamond pattern we
> are good, because the top level class receives the rvalue reference
> once, and nobody else should.

So if I write:

struct base {
    tail_extension void foo(std::string&&);
};

struct derived : base {
    void foo(std::string&& s) override {
        // I can't use s here?
    }
};

Even though there's no indication in derived or derived::foo() that s is
moved-from in derived - I'd have to go all the way back to base to see
that it's a tail_extension function - that's what's happened? In fact,
if base is in one file and derived in another, there's no indication /at
all/ that s is silently moved-from before the opening brace in foo().

That doesn't seem particularly safe, or easy to teach.

> Consider what you'd have to write today(-ish, using P0052's
> <scope>) for the above:
>
> struct Base1
> {
> virtual void setup(std::string);
> virtual void cleanup();
> };
>
> struct Base2
> {
> virtual void setup(std::string);
> virtual void cleanup();
> };
>
> struct Derived : Base1, Base2
> {
> void setup(std::string s) override
> {
> Base1::setup(s);
> auto f1 = std::make_scope_fail([&]{ Base1::cleanup(); });
> Base2::setup(std::move(s)); // with or without move
> auto f2 = std::make_scope_fail([&]{ Base2::cleanup(); });
>
> // etc.
> }
> };
>
> Granted, it's a bit verbose and boilerplate-y, so how would you
> express all that with all its attendant complexity in a simpler way?
>
> First, if this is what today-ish requires, then it's real hell. How
> on Earth am I suppose to figure it out? Wouldn't any of us rather
> have this automated, one way or another?
> Second, more generally speaking, the association of failure-handling
> with any call is not the subject of this discussion. You just
> demonstrated that the problem exists regardless of the extension
> concept. That you came up with some form of boilerplate-y solution
> means that it should be possible to come up with a parallel, less
> boilerplate-y solution for extensions. The syntax is likely to be
> ugly (maybe less ugly than now) because, assuming that a cleanup
> method exists (which may not be the case at all), it would somehow
> have to be specified in the declaration of the extensible method. But
> it only emphasizes the need for extension -- this will ensure that it
> really will be called on failure.

I agree that less boilerplate would be nice, but I don't believe it
follows that we need an extension. An extension only makes sense if:

1. The pattern it's replacing is frequent enough to justify yet more
    complication in the language.
2. The pattern it's replacing is complex enough to justify yet more
    complication in the language.
3. The extension can actually replace the pattern.
4. The extension doesn't add even more:
     1. complexity
     2. inefficiency
     3. confusion; or
     4. danger.

I don't see this proposal satisfying even one of those requirements.

Yes, I provided a boilerplate-y solution... to /one specific form of the
problem/. The boilerplate I gave won't work generically. And even if it
could, it would require specifying so much detail (such as how to clean
up after each function) that you'd almost end up writing the same amount
of code.

Just consider this simple problem: You have a base class with a setup
function and paired cleanup function, and you want to specify that it's
only tail-extensible. (I've already pointed out that restricting it to
/only/ tail extensibility doesn't make sense, but let's put that aside
for now.) You want to extend that class and its setup function, which
means this:

struct base {
    tail_extensible void setup();
    head_extensible void cleanup();
    // etc.
};

struct derived : base {
    void setup() override;

    // etc.
}

This is basically just your motivating example restated.

Okay, so let's actually try writing derived::setup():

void derived::setup()
{
    // Allocate some resources here.
}

Now, at the closing "}" in derived::setup(), this extension promises to
automatically call base::setup(). Cool. But then... base::setup() fails.

What happens to the resources allocated in derived::setup()? What
happens to derived's invariants?

If I have to manually call the base class function, this problem is trivial:

void derived::setup()
{
    // Allocate some resources with a RAII type like unique_ptr.
    auto resources = allocate_resources();

    // Do the base class setup.
    base::setup();
    // If this fails, resources are automatically released.

    // Everything worked, so store the allocated resources
    // in the object.
    this->_resources = std::move(resources);
}

But since the call is automatic and invisible, and happens at the
closing brace, how do I cleanup the work done in derived::setup() in the
event of base::setup() failing? A function try block? That might work
for a single point of failure (the call to base::setup()), but how would
you make that work generally?

> Even if you try to avoid most of the complications by restricting
> this to single inheritance and using /all/ noexcept functions...:
>
> struct base
> {
> virtual void setup(std::string) noexcept;
> virtual void cleanup() noexcept;
> };
>
> struct derived : base
> {
> void setup(std::string s) noexcept override
> {
> extension; // or whatever keyword or syntax
>
>
> None whatsoever -- extension is declared and determined in the base class.
>
>
>
> /* Expands to:
> base::setup(s); // s is copied, might throw
> */
>
> // etc., but if there are any errors, is base cleanup done?
> }
> };
>
> ... you can still end up with a surprise potential throw. Plus you
> still need to worry about doing the cleanup.
>
> I think that one of us misunderstood the concept of noexcept. At
> least, I don't understand yours.

Just because a function is noexcept, that doesn't mean that calling it
won't generate an exception:

auto foo(std::string) noexcept;

auto s = std::string{"some long enough string..."};
foo(s); // this line can throw

Because foo() takes a string by value, there's an invisible copy of s
into the function's argument stack. That copy can throw.

You can avoid/control that potential throw by doing the copy elsewhere:

auto foo(std::string) noexcept;

auto s = std::string{"some long enough string..."};

auto s2 = s; // this line can throw
foo(std::move(s2)); // this line CANNOT throw

So to restate the problem code:

struct base {
    tail_extensible void setup(std::string) noexcept {}
};

struct derived : base {
    void setup(std::string) noexcept override {}
};

Even though every function here is noexcept, and there are no visible
copies being made (the function argument isn't even named!), and there's
not a single statement in the body of derived::setup() (or
base::setup(), for that matter)... it could still throw, and then
presumably terminate.

An invisible, surprise termination would be bad enough, but there
doesn't seem to be any way to stop it. The problematic copies are all
invisible and inaccessible here. And a function try block won't work in
this case, for example.

Saying "it can be solved with the manual/boilerplate code, therefore
it's a solved problem" doesn't fly. With manual code, the pitfalls can
be avoided a multitude of ways, depending on what the code's needs are.
But there's no single solution that would work in all cases that you
could use the rule for what the extension should do. At least not that I
can see.

Received on 2019-05-19 18:52:22