std-proposals: Re: Add a specialized "Extension" concept to the inheritance syntax

From: Mark A. Gibbs <indi.in.the.wired_at_[hidden]>
Date: Wed, 22 May 2019 23:43:38 -0400

This is a long message, so there's a handy summary at the end if you're
skimming.

The exception issues I was concerned about are not the ones that come up
in third-party functions like f(). I assume that it should be possible
to make exceptions work properly within extensible functions... and if
it isn't then the extension just isn't workable. Once exceptions work
properly within extensible functions, then handling them in a
third-party function is just business-as-usual C++; you'd use try-catch
blocks or RAII scope guards or some other standard technique. The
problem is whether exceptions work properly within the extensible functions.

But since you brought third-party function exception handling....

On 2019-05-20 2:40 a.m., Ofri Sadowsky wrote:
> class Base
> {
> protected:
> tail_extensible void setup(); // may throw
> head_extensible void cleanup(); // should not throw
> };
>
> class Derived : public Base
> {
> protected:
> void setup() extension; // may throw
> void cleanup() extension; // should not throw
> };
>
> void f()
> {
> std::unique_ptr<Base> myObject = std::make_unique<Derived>;
> try
> {
> myObject->setup();
> }
> catch(...)
> {
> myObject->cleanup();
> return;
> }
>
> // do something with myObject
>
> myObject->cleanup();
> }

This is simply unacceptable, for a number of reasons.

First, try-catch blocks are a code smell. They should only be necessary
in very low-level routines, wrapped up and hidden from any real
business-logic code. If I have to wrap every call to setup() in a
try-catch block just to make it not leak resources, then I'm going to
have to wrap calling setup() in a function at the very least... which,
of course, would be a non-trivial operation which would require unit
testing. That means this extension is just creating more work; it would
probably be less work and less error-prone to not use it at all and just
use the boilerplate code I showed before.

But the real problem is the requirement that cleanup() doesn't just
release resources, but that it also has logic to detect when resources
haven't been allocated, have been only partially allocated, or have
already been released. That's not how release functions work anywhere
else in C++ or any other language that I'm aware of. So to write a
cleanup function to work with this extension, programmers would have to
know that everything they've ever learned about resource management
functions doesn't apply here - this is a special case.

And worse, it's not just the people writing the class that need to be
aware of this special case... /everyone who uses the class needs to
know, too/. Nowhere else in C++ (or any other language) do you call
release on a resource after failing to acquire it. Even the most basic
language utilities - like delete, for example - don't work that way. If
new successfully allocated memory but then the constructor failed, new
doesn't simply leak the memory and require the programmer to manually
call delete to clean up the partially-completed task.

It's also worth mentioning that you only need the try-catch block in
this situation because it's so far out of the way everything else works
in C++ that there is no existing mechanism that does the job you need.
You can't do this, for example:

void f()
{
    std::unique_ptr<Base> myObject = std::make_unique<Derived>;

    myObject->setup();
    auto onexit = std::make_scope_exit([&]{ myObject->cleanup(); });

    // do something with myObject

    // automatic cleanup
}

I suppose you could make this work by doing this:

void f()
{
    std::unique_ptr<Base> myObject = std::make_unique<Derived>;

    auto onexit = std::make_scope_exit([&]{ myObject->cleanup(); });
    myObject->setup();

    // do something with myObject

    // automatic cleanup
}

but that's bizarrely different from normal practice - programmers would
have to know that setup() is a "special" function, and that it needs
special handling to cleanup safely.

All of the problems here stem from the fact that the extensible function
isn't exception safe by itself; it doesn't even offer the basic
guarantee. If extensible functions can't even be made to offer the basic
exception guarantee, then I'd say the whole idea is a non-starter.

So we'd need to find a way to have extensible functions offer - at least
- the basic exception guarantee. That's what I was focusing on before,
and it's what I'm going to focus on here.

Let's start with the simple case: a tail-extensible function.

struct base {
    tail_extensible void foo()
    {
        // 1
        _r_base = acquire("base");
        // 2
    }

    resource _r_base;

    // etc.
};

struct derived {
    void foo() extension
    {
        // 3
        _r_derived = acquire("derived");
        // 4
    }

    resource _r_derived;

    // etc.
}:

Assume that acquire() offers at least the basic exception guarantee
(which it should; every function should). Let's ignore release functions
for now.

The extension promises to make derived:foo() "as-if" it is this:

void derived::foo()
{
    { // base::foo() START
        // 1
        _r_base = acquire("base");
        // 2
    } // base::foo() END

    // 3
    _r_derived = acquire("derived");
    // 4
}

So what happens if an exception is thrown at points 1 through 4?

  * 1. No problem. Nothing has been acquired yet, so there's nothing to
    release.
  * 2. _r_base must be released.
  * 3. _r_base must be released.
  * 4. _r_base and _r_derived must be released.

How can the extension promise this?

Let's start by writing each function properly, so that each is
exception-safe on its own:

struct base {
    tail_extensible void foo()
    {
        // 1
        _r_base = acquire("base");
        auto onfail = std::make_scope_fail([&]{ release(_r_base); });
        // 2

        /*
        // alternately, you could write it old-school style like this:
        // 1
        _r_base = acquire("base");
        try
        {
            // 2
        }
        catch (...)
        {
            release(_r_base);
        }
        // ... but the scope guard is shorter and safer.
        */
    }

    resource _r_base;

    // etc.
};

struct derived {
    void foo() extension
    {
        // 3
        _r_derived = acquire("derived");
        auto onfail = std::make_scope_fail([&]{ release(_r_derived); });
        // 4
    }

    resource _r_derived;

    // etc.
};

Which means that derived::foo() now expands like this:

void derived::foo()
{
    { // base::foo() START
        // 1
        _r_base = acquire("base");
        auto onfail = std::make_scope_fail([&]{ release(_r_base); });
        // 2
    } // base::foo() END

    // 3
    _r_derived = acquire("derived");
    auto onfail = std::make_scope_fail([&]{ release(_r_derived); });
    // 4
}

So what happens if an exception is thrown at points 1 through 4?

  * 1. No problem. Nothing has been acquired yet, so there's nothing to
    release.
  * 2. _r_base must be released... and it is, by the scope guard.
  * 3. _r_base must be released... but it's not (the scope guard is no
    help here).
  * 4. _r_base and _r_derived must be released. _r_derived is, but
    _r_base is not.

So we're half-way there. What's missing?

Well, after the "} // base::foo() END", everything base::foo() did needs
to be cleaned up in the event of a failure.

Let's assume there is a matching cleanup function bar(). Logically, to
be the complement to foo(), bar() should be head-extensible. In order
for the compiler to know how to actually call it, it should probably
also have no parameters (if there were any parameters, how would the
compiler know what arguments to use?). These things could probably be
enforced by the compiler, so that when you write:

tail_extensible(bar) void foo();

the compiler will verify that:

  * bar(void) exists (the return type doesn't matter)
  * bar(void) is declared head_extensible
  * bar(void) does not declare a cleanup function of its own; and
  * every derived class that defines an extension to foo() also defines
    an extension to bar() (even if it's empty).

Let's also assume that bar() is no-fail and noexcept. (If you write a
cleanup function that isn't no-fail, well, you're just asking for grief.)

So now we have:

struct base {
    tail_extensible(bar) void foo()
    {
        // 1
        _r_base = acquire("base");
        auto onfail = std::make_scope_fail([&]{ release(_r_base); });
        // 2
    }

    head_extensible void bar() noexcept
    {
        release(_r_base);
    }

    resource _r_base;

    // etc.
};

struct derived {
    void foo() extension
    {
        // 3
        _r_derived = acquire("derived");
        auto onfail = std::make_scope_fail([&]{ release(_r_derived); });
        // 4
    }

    void bar() noexcept extension
    {
        release(_r_derived);
    }

    resource _r_derived;

    // etc.
};

With the cleanup function, derived::foo() can now expand like this:

void derived::foo()
{
    { // base::foo() START
        // 1
        _r_base = acquire("base");
        auto onfail = std::make_scope_fail([&]{ release(_r_base); });
        // 2
    } // base::foo() END

    try
    {
        // 3
        _r_derived = acquire("derived");
        auto onfail = std::make_scope_fail([&]{ release(_r_derived); });
        // 4
    }
    catch (...)
    {
        base::bar();
        throw;
    }
}

So what happens if an exception is thrown at points 1 through 4?

  * 1. No problem. Nothing has been acquired yet, so there's nothing to
    release.
  * 2. _r_base must be released... and it is, by the scope guard.
  * 3. _r_base must be released... and it is, by base::bar().
  * 4. _r_base and _r_derived must be released. _r_derived is released
    by the scope guard, and _r_base by base::bar().

No leaks! Everything works!

But does it extend?

struct derived_more : derived {
    void foo() extension
    {
        // 5
        _r_derived_more = acquire("derived more");
        auto onfail = std::make_scope_fail([&]{ release(_r_derived_more); });
        // 6
    }

    void bar() noexcept extension
    {
        release(_r_derived_more);
    }

    resource _r_derived_more;

    // etc.
};

derived_more::foo() expands first as:

void derived_more::foo()
{
    { // derived::foo() START
        // 3
        _r_derived = acquire("derived");
        auto onfail = std::make_scope_fail([&]{ release(_r_derived); });
        // 4
    } // derived::foo() END

    try
    {
        // 5
        _r_derived_more = acquire("derived more");
        auto onfail = std::make_scope_fail([&]{ release(_r_derived_more); });
        // 6
    }
    catch (...)
    {
        derived::bar();
        throw;
    }
}

and the internal call to derived::foo() would further expand as:

void derived_more::foo()
{
    { // derived::foo() START
        { // base::foo() START
            // 1
            _r_base = acquire("base");
            auto onfail = std::make_scope_fail([&]{ release(_r_base); });
            // 2
        } // base::foo() END

        try
        {
            // 3
            _r_derived = acquire("derived");
            auto onfail = std::make_scope_fail([&]{ release(_r_derived); });
            // 4
        }
        catch (...)
        {
            base::bar();
        }
    } // derived::foo() END

    try
    {
        // 5
        _r_derived_more = acquire("derived more");
        auto onfail = std::make_scope_fail([&]{ release(_r_derived_more); });
        // 6
    }
    catch (...)
    {
        derived::bar();
        throw;
    }
}

And if you take the time to work through it... everything just works!
And the pattern can extend for as many levels of inheritance as you
like. Given a base class Base that defines:

  * tail_extensible(clean_x) RET x(ARGS...); and
  * head_extensible RET2 clean_x()

and a series of derived classes Derived{1, 2, 3, ...} where Derived/N/
inherits directly from Derived/N-1/, and Derived1 inherits directly from
Base, and Derived/N/::x() written as:

RET Derived/N/::x(ARGS...) extension
{
    // content
}

this will be expanded by the compiler as-if:

RET Derived/N/::x(ARGS...)
{
    Derived/N-1/::x(ARGS...);

    try
    {
        // content
    }
    catch (...)
    {
        Derived/N-1/::clean_x();
        throw;
    }
}

and that solves the exception-safety problem.

So far so good... but does it work with head-extensible functions?

Let's try it:

struct base {
    head_extensible(bar) void foo()
    {
        // 1
        _r_base = acquire("base");
        auto onfail = std::make_scope_fail([&]{ release(_r_base); });
        // 2
    }

    tail_extensible void bar() noexcept
    {
        release(_r_base);
    }

    // etc.
};

struct derived : base {
    void foo() extension
    {
        // 3
        _r_derived = acquire("derived");
        auto onfail = std::make_scope_fail([&]{ release(_r_derived); });
        // 4
    }

    void bar() noexcept extension
    {
        release(_r_derived);
    }

    // etc.
};

Now there's a design decision to make: how should the compiler expand
derived::foo()? There are two options: separate scope and combined
scope. The choice didn't matter for tail-extended functions, but it
matters for head-extended functions.

Separate scope looks like this:

void derived::foo()
{
    {
        // body of derived::foo() goes here
    }

    // implicit call to base::foo()
    base::foo();
}

Combined scope looks like this:

void derived::foo()
{
    // body of derived::foo() goes here

    // implicit call to base::foo()
    base::foo();
}

The difference is whether the body of derived::foo() gets its own scope
separate from the implicit call to base::foo(), or if they both share a
combined scope. There are pros and cons either way.

Let's start with the separate scope model. Using separate scopes,
derived::foo() expands like this:

void derived::foo()
{
    { // derived::foo()'s private scope START
        // 3
        _r_derived = acquire("derived");
        auto onfail = std::make_scope_fail([&]{ release(_r_derived); });
        // 4
    } // derived::foo()'s private scope END

    { // base::foo() START
        // 1
        _r_base = acquire("base");
        auto onfail = std::make_scope_fail([&]{ release(_r_base); });
        // 2
    } // base::foo() END
}

What happens if an exception is thrown at points 1 through 4?

  * 3. No problem. Nothing has been acquired yet, so there's nothing to
    release.
  * 4. _r_derived must be released... and it is, by the scope guard.
  * 1. _r_derived must be released... but it is NOT.
  * 2. _r_base and _r_derived must be released. _r_base is, by the scope
    guard. _r_derived is NOT.

So there's a leak here.

Before trying to fix this, let's see how the combined scope model works.
Using a combined scope, derived::foo() expands like this:

void derived::foo()
{
    // 3
    _r_derived = acquire("derived");
    auto onfail = std::make_scope_fail([&]{ release(_r_derived); });
    // 4

    { // base::foo() START
        // 1
        _r_base = acquire("base");
        auto onfail = std::make_scope_fail([&]{ release(_r_base); });
        // 2
    } // base::foo() END
}

What happens if an exception is thrown at points 1 through 4?

  * 3. No problem. Nothing has been acquired yet, so there's nothing to
    release.
  * 4. _r_derived must be released... and it is, by the scope guard.
  * 1. _r_derived must be released... and it is, by the scope guard.
  * 2. _r_base and _r_derived must be released. _r_base is, by the scope
    guard. _r_derived is, by the scope guard.

So no leaks. It all works perfectly.

Does it extend to further inheritance?

struct more_derived : derived {
    void foo() extension
    {
        _r_more_derived = acquire("more derived");
        auto onfail = std::make_scope_fail([&]{ release(_r_more_derived); });
    }

    void bar() noexcept extension
    {
        release(_r_more_derived);
    }

    // etc.
};

more_derived::foo() expands to:

void more_derived::foo()
{
    _r_more_derived = acquire("more derived");
    auto onfail = std::make_scope_fail([&]{ release(_r_more_derived); });

    // implicit:
    derived::foo();
}

which also works properly.

So the combined scope model seems to work fine... except, not quite.
There's a crucial problem with it.

To illustrate that problem, imagine we have to use a resource that
requires a lock:

struct base {
    head_extensible(bar) void foo()
    {
        auto lock = std::scoped_lock(resource_manager);

        _r_base = acquire("base");
        auto onfail = std::make_scope_fail([&]{ release(_r_base); });
    }
};
struct derived : base {
    void foo() extension
    {
        auto lock = std::scoped_lock(resource_manager);

        _r_derived = acquire("derived");
        auto onfail = std::make_scope_fail([&]{ release(_r_derived); });
    }
};

When that expands:

void derived::foo()
{
    auto lock = std::scoped_lock(resource_manager);

    _r_derived = acquire("derived");
    auto onfail = std::make_scope_fail([&]{ release(_r_derived); });

    { // base::foo() START
        auto lock = std::scoped_lock(resource_manager); // DEADLOCK!

        _r_base = acquire("base");
        auto onfail = std::make_scope_fail([&]{ release(_r_base); });
    } // base::foo() END
}

you get a surprise, invisible deadlock, because base::foo() is trying to
acquire a lock that derived::foo() is holding.

This problem doesn't come up with the separate scopes model:

void derived::foo()
{
    { // derived::foo()'s private scope START
        auto lock = std::scoped_lock(resource_manager);

        _r_derived = acquire("derived");
        auto onfail = std::make_scope_fail([&]{ release(_r_derived); });
    } // derived::foo()'s private scope END

    { // base::foo() START
        auto lock = std::scoped_lock(resource_manager);

        _r_base = acquire("base");
        auto onfail = std::make_scope_fail([&]{ release(_r_base); });
    } // base::foo() END
}

because the lock is released at the end of derived::foo()'s private
scope. But of course, the separate scopes model isn't exception-safe.

So both models fail.

The only solution that pops into my head to fix this is to say that the
compiler expands head-extensible functions like this:

struct derived : base {
    void foo() extension
    {
        // body of derived::foo()
    }

    void bar() noexcept extension
    {
        // body of derived::bar()
    }
};

becomes:

struct derived : base {
    void foo() extension
    {
        // implicit scope
        {
            // body of derived::foo()
        }

        try
        {
            base::foo();
        }
        catch (...)
        {
            __bar_impl();
            throw;
        }
    }

    void bar() noexcept extension
    {
        __bar_impl();
    }

private:
    void __bar_impl() noexcept
    {
        // body of derived::bar()
    }
};

That seems to fix all the exception safety problems, but an expert
should really check my work.

    So here's the summary:

To make these extensible functions exception safe, these models seem to
work:

In general, if an extensible function is declared with an optional
cleanup function, the cleanup function:

  * must be defined in every class that defines the extensible function
  * must take no arguments
  * must be the opposite type of the extensible function (that is, if
    the extensible function is tail-extensible, the cleanup function
    must be head-extensible, and vice versa); and
  * must not declare its own cleanup function.

So:

struct foo {
    tail_extensible(cleanup) void setup();

    // this must be defined:
    head_extensible void cleanup();
};

or:

struct bar {
    head_extensible(fini) void init();

    // this must be defined:
    tail_extensible void fini();
};

When the compiler expands tail-extensible functions, given:

struct base {
    tail_extensible(bar) void foo();
    head_extensible void bar();
};
stuct derived {
    void foo() extension
    {
        // body of derived::foo()
    }

    void bar() extension
    {
        // body of derived::bar()
    }
};

the expansions should be as-if:

void derived::foo()
{
    base::foo();

    try
    {
        // body of derived::foo()
    }
    catch (...)
    {
        base::bar();
    }
}
void derived::bar()
{
    // body of derived::bar()

    base::bar();
}

When the compiler expands head-extensible functions, given:

struct base {
    head_extensible(bar) void foo();
    tail_extensible void bar();
};
stuct derived {
    void foo() extension
    {
        // body of derived::foo()
    }

    void bar() extension
    {
        // body of derived::bar()
    }
};

the expansions should be as-if:

void derived::foo()
{
    { // implicit scope
        // body of derived::foo()
    }

    try
    {
        base::foo();
    }
    catch (...)
    {
        __bar_impl();
    }
}
void derived::bar()
{
    base::bar();

    __bar_impl();
}
void derived::__bar_impl() // implicitly-defined
{
    // body of derived::bar()
}

This setup seems to solve all the exception safety problems, though it
should be reviewed.

But do bear in mind that this all "works" only for functions with no
arguments. If either functions take any arguments, everything starts to
break. But that's a separate problem.

Received on 2019-05-22 22:45:25