std-proposals: Re: Add a specialized "Extension" concept to the inheritance syntax

From: Ofri Sadowsky <sadowsky.o.phd_at_[hidden]>
Date: Sun, 19 May 2019 14:26:06 +0300

On Sat, May 18, 2019 at 2:16 AM Mark A. Gibbs via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> It seems a little unnecessarily restrictive to specify in the base class
> whether derived classes must "head" extend or "tail" extend. I might want
> to do some extra work *before* the base class's set up. For example, the
> base class might have some core functionality, and I might want a class
> that does some checking before using it... which means I want to do the
> checks *before* Base::setup(). Having that decision forced by the base
> class seems to defy the whole point of extensibility.
>
I am afraid that I was misunderstood. I don't ask to change the notion of
a virtual method that can be overridden. I ask to add a new concept of an
"extensible" method, to which an extension can be attached at the end (tail)
or at the beginning (head). Of course, it will not cover all the cases in which
a virtual method has to call its base class method. However, it will force
those methods that require, by their semantics, a tail extension or a head
extension, to be defined in such a way. And the qualifier of the extension
is defined from the base class down; that is, the base class decides if
it's head or tail extension.
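
For comparison, a base-decided tail extension can be approximated in current C++ with the non-virtual interface idiom. This is only a minimal sketch under that assumption; the hook name setup_extension and the log_ member are hypothetical and are not part of the proposal.

```cpp
#include <string>

// Hypothetical emulation of a base-decided "tail" extension.
class Base {
public:
    virtual ~Base() = default;

    // Non-virtual entry point: the base class fixes the order once.
    void setup() {
        log_ += "Base;";      // the base part always runs first
        setup_extension();    // the derived part runs at the tail
    }

    const std::string& log() const { return log_; }

protected:
    std::string log_;

private:
    // Derived classes supply only their own part and cannot reorder it.
    virtual void setup_extension() {}
};

class Derived : public Base {
private:
    void setup_extension() override { log_ += "Derived;"; }
};
```

The idiom covers one level only: a class derived from Derived that overrides setup_extension() replaces Derived's part instead of appending to it, so a deeper chain would still need explicit calls.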

> So perhaps it might be better if it looked something like this:
>
> class Derived : public Base
> {
> public:
> void setup() extension(head)
> {
> // Base::setup() is called automatically
> // do my own setup
> }
>
> void cleanup() extension(tail)
> {
> // do my own cleanup
> // Base::cleanup is called automatically
> }
> };
>
So the idea really was to enforce head or tail extension at the base class
level, and this example does not meet the point.

> But in that case, there's no need for a new "extensible" keyword. That's
> just "virtual".
>
>
> class Derived : public Base
> {
> public:
> void setup() extension(head)
> {
> // Base::setup() is called automatically
> // do my own setup
> }
>
> void cleanup() extension(tail)
> {
> // do my own cleanup
> // Base::cleanup is called automatically
> }
> };
>

Actually, this suggestion can create all sorts of logical errors. What
happens if there is an inheritance of A -> B -> C, and B decides to extend
head while C decides to extend tail? Where should C's extension go: BAC?
BCA?

The idea really was to let A decide which side all the extensions take. So
if it's tail extension, it's always ABC and if it's head extension it's
always CBA.
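
The two fixed orders correspond to what the language already guarantees for constructors (base first) and destructors (derived first). A small sketch that records that order; the trace() helper is an illustrative assumption:

```cpp
#include <string>

// Records the order in which each level's code runs.
inline std::string& trace() {
    static std::string t;
    return t;
}

struct A     { A() { trace() += 'A'; } ~A() { trace() += 'a'; } };
struct B : A { B() { trace() += 'B'; } ~B() { trace() += 'b'; } };
struct C : B { C() { trace() += 'C'; } ~C() { trace() += 'c'; } };

// Builds and destroys a C, then returns the recorded order.
inline std::string construct_and_destroy() {
    trace().clear();
    { C c; }
    return trace();   // "ABCcba": construction A, B, C; destruction C, B, A
}
```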

> But do I really want the "extension" declarations as part of the
> *interface*? That doesn't seem kosher. Whether the base class function is
> called first, last, in the middle, or at all seems to be an implementation
> detail. Thus, the right place to put it seems to be within the function
> body:
>
I beg to differ on the claim that it's an "implementation detail". It's a
policy, or (carefully speaking) a contract. Specifically, in case of
setup/cleanup pairs, where setup acquires resources and cleanup releases
them, setup should be tail-extended and cleanup should be head-extended.
And if an implementer forgets to include a call to the base method, the
derived class will be broken since it did not acquire the resources that
its base requires.
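
With plain virtual methods the ordering is a convention that each override has to repeat by hand. The sketch below (class names Good and Forgetful are hypothetical) shows the intended pairing and an override that silently skips the base call:

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual void setup()   { log += "Base::setup;"; }
    virtual void cleanup() { log += "Base::cleanup;"; }
    std::string log;
};

struct Good : Base {
    // Tail extension written by hand: base part first.
    void setup() override   { Base::setup(); log += "Good::setup;"; }
    // Head extension written by hand: base part last.
    void cleanup() override { log += "Good::cleanup;"; Base::cleanup(); }
};

struct Forgetful : Base {
    // Compiles without any diagnostic although Base::setup() never runs.
    void setup() override { log += "Forgetful::setup;"; }
};
```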

> There *might* be some additional benefits, like automatically forwarding
> function arguments,
>

I consider this a secondary question to the subject.

> and maybe you could say all base class functions with the same signature
> would be called in forward or reverse standard order (that is, the order
> used during construction or destruction, respectively)...
>

This only applies to cases where you need it. You can continue using
classical virtual methods, but in cases where tail or head extension is
needed, the language would support you.

There's a specific example that I presented in another email, of multiple
inheritance with the virtual diamond pattern, which, I suspect, cannot even
be answered by explicit calls to the base class method and only a compiler
which is aware of the diamond structure will resolve it correctly. Just
like with constructors, except that you could apply it to any method, and
you would not have to write the call explicitly.
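
To illustrate the diamond case with today's virtual methods, here is a minimal sketch (the class names Top, Left, Right and Bottom are assumptions for illustration). When every override calls its direct base explicitly, the shared virtual base's method runs once per path:

```cpp
#include <string>

struct Top {
    virtual ~Top() = default;
    virtual void setup() { log += "Top;"; }
    std::string log;
};

struct Left : virtual Top {
    void setup() override { Top::setup(); log += "Left;"; }
};

struct Right : virtual Top {
    void setup() override { Top::setup(); log += "Right;"; }
};

struct Bottom : Left, Right {
    void setup() override {
        Left::setup();    // runs Top::setup()
        Right::setup();   // runs Top::setup() a second time
        log += "Bottom;";
    }
};
```

Constructors do not have this problem, because the most derived class initializes the virtual base exactly once.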

> but even those benefits come with a whole new slew of caveats and
> difficulties. (For example, if there are multiple functions to be called,
> you can't simply use perfect forwarding, because the first call would move
> away all the arguments. You could end up with a slew of invisible,
> expensive copies.
>

Again, I consider this a secondary issue where you could even use simple
forwarding rules like "keep the signature as is" and let the method
designer decide on it. With today's virtual method feature, you cannot
achieve much better anyhow.

> And what happens if something throws between two base calls?
>

> And if there are multiple base classes what happens if not all the base
> classes have the required override signature?
>

You only invoke the ones that do have the same signature.

How would a virtual method answer for the following:

class Base1
{
protected: // for purists' sake
    virtual void setup(int) { }
};

class Base2
{
protected:
    virtual void setup(int) { }
};

class Derived : public Base1, public Base2
{
protected:
    virtual void setup(int) override { } // which of the bases is overridden?
};

The same answer to the case above should answer for the extension case.
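
For reference, current C++ gives a definite answer here: the single function in Derived is the final overrider for both bases. A small sketch that makes this observable; the std::string return type and the via_base1/via_base2 helpers are added only for illustration:

```cpp
#include <string>

struct Base1 {
    virtual ~Base1() = default;
    virtual std::string setup(int) { return "Base1"; }
};

struct Base2 {
    virtual ~Base2() = default;
    virtual std::string setup(int) { return "Base2"; }
};

struct Derived : Base1, Base2 {
    // One function overrides Base1::setup(int) and Base2::setup(int).
    std::string setup(int) override { return "Derived"; }
};

// Dispatch through either base reaches the same override.
inline std::string via_base1(Base1& b) { return b.setup(0); }
inline std::string via_base2(Base2& b) { return b.setup(0); }
```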

> What if they have variants that "work" - like Base1::setup(int) and
> Base2::setup(long)? And so on and so forth.)
>

Only extend the one that has equal signature, just like virtual methods.
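
This is how virtual methods behave today when the signatures differ. In the sketch below (return values are illustrative), Derived::setup(int) overrides only Base1::setup(int), and a call through Base2 still reaches Base2::setup(long):

```cpp
#include <string>

struct Base1 {
    virtual ~Base1() = default;
    virtual std::string setup(int) { return "Base1(int)"; }
};

struct Base2 {
    virtual ~Base2() = default;
    virtual std::string setup(long) { return "Base2(long)"; }
};

struct Derived : Base1, Base2 {
    // Matches Base1::setup(int) only; Base2::setup(long) is not overridden.
    std::string setup(int) override { return "Derived(int)"; }
};

inline std::string via_base1(Base1& b) { return b.setup(0); }
inline std::string via_base2(Base2& b) { return b.setup(0L); }
```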

> The killer problem with this idea, I think, is this:
> struct Base1
> {
> virtual void setup();
> virtual void cleanup();
>
> // ...
> };
>
> struct Base2
> {
> virtual void setup();
> virtual void cleanup();
>
> // ...
> };
>
> struct Base3
> {
> virtual void setup();
> virtual void cleanup();
>
> // ...
> };
>
> struct Derived : Base1, Base2, Base3
> {
> void setup() override
> {
> do_base_calls_here;
>
> /* The above is expanded by the compiler to:
> Base1::setup();
> Base2::setup();
> Base3::setup();
> */
>
> // any other set up
> }
>
> // ...
> };
>
> What happens if Base2::setup() throws? Base1's setup is already complete,
> so we probably need to call Base1::cleanup() or we're going to get a
> resource leak of some kind. How can the "do_base_calls_here" keyword know
> to do that?
>
And what happens if the constructor of Base2 throws during the construction
of Derived and after Base1 is already constructed? We should follow the
same strategy.
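
The constructor strategy referred to here is that subobjects which were already fully constructed are destroyed, in reverse order, before the exception reaches the caller. A minimal sketch; the trace() helper and try_construct() are illustrative assumptions:

```cpp
#include <stdexcept>
#include <string>

inline std::string& trace() {
    static std::string t;
    return t;
}

struct Base1 {
    Base1()  { trace() += "Base1();"; }
    ~Base1() { trace() += "~Base1();"; }
};

struct Base2 {
    Base2() {
        trace() += "Base2() throws;";
        throw std::runtime_error("Base2 failed");
    }
    ~Base2() { trace() += "~Base2();"; }
};

struct Derived : Base1, Base2 {
    Derived()  { trace() += "Derived();"; }
    ~Derived() { trace() += "~Derived();"; }
};

// Attempts the construction and returns what actually ran.
inline std::string try_construct() {
    trace().clear();
    try {
        Derived d;
    } catch (const std::runtime_error&) {
        trace() += "caught;";
    }
    return trace();   // "Base1();Base2() throws;~Base1();caught;"
}
```

Only Base1 is destroyed: Base2 and Derived never finished constructing, so their destructors do not run.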

> You can get a similar but even more nefarious problem with only two bases
> and noexcept functions like this:
> struct Base1
> {
> virtual void setup(std::string) noexcept;
> virtual void cleanup();
>
> // ...
> };
>
> struct Base2
> {
> virtual void setup(std::string) noexcept;
> virtual void cleanup();
>
> // ...
> };
>
> struct Derived : Base1, Base2
> {
> void setup(std::string s) noexcept override
> {
> do_base_calls_here;
>
> /* Expanded to:
> Base1::setup(s);
> Base2::setup(s);
> */
>
> // etc.
> }
> };
>
> s is copied twice, and the second copy might throw even though the actual
> function calls are all noexcept.
>

I'm not sure what your question is in this case. Exceptions can happen
regardless and independently of the choice of using method extension or the
way we implement the extension.

As for the copy, it's the designer of the virtual method who decided to use
pass by value and bear its cost.
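
The cost in question can be made visible with a copy-counting argument type. This is a sketch under assumed names (Counted, setup_copy_both, setup_move_last); the setup functions are non-virtual here only to keep it short:

```cpp
#include <utility>

// Hypothetical argument type that counts how often it is copied.
struct Counted {
    static int& copies() { static int n = 0; return n; }
    Counted() = default;
    Counted(const Counted&) { ++copies(); }
    Counted(Counted&&) noexcept {}
};

struct Base1 { void setup(Counted) {} };
struct Base2 { void setup(Counted) {} };

struct Derived : Base1, Base2 {
    // Forwards the same lvalue to both bases: two copies.
    void setup_copy_both(Counted s) {
        Base1::setup(s);
        Base2::setup(s);
    }
    // Moves into the last call: one copy.
    void setup_move_last(Counted s) {
        Base1::setup(s);
        Base2::setup(std::move(s));
    }
};

// Returns the number of copies made by one call on Derived.
inline int copies_made(bool move_last) {
    Counted c;
    Derived d;
    Counted::copies() = 0;
    if (move_last) d.setup_move_last(std::move(c));
    else           d.setup_copy_both(std::move(c));
    return Counted::copies();
}
```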

> Also, the second copy might not even be necessary. If the programmer knows
> s won't be needed any more, they could manually write the second call as "
> Base2::setup(std::move(s));". So this construct isn't only dangerous,
> it's inefficient.
>
And yet, one can do that even today! On the contrary, if we formally
declare tail/head extension and pass an rvalue reference, we can define a
rule that forbids any derived class from using it: it's the base class
designer who's supposed to do something with it, probably emptying it,
after which it's no longer available to the derived class. What would be
the point of defining such a case with "virtual" and then letting the
derived class access an rvalue-reference parameter which the base class
should have emptied by now?

Of all the points raised here, the issue of rvalue references with
multiple inheritance may be the hardest to solve, even with existing virtual
methods. Then again, in the virtual diamond pattern we are good,
because the top-level class receives the rvalue reference once, and nobody
else should.

> Consider what you'd have to write today(-ish, using P0052's <scope>) for
> the above:
> struct Base1
> {
> virtual void setup(std::string);
> virtual void cleanup();
> };
>
> struct Base2
> {
> virtual void setup(std::string);
> virtual void cleanup();
> };
>
> struct Derived : Base1, Base2
> {
> void setup(std::string s) override
> {
> Base1::setup(s);
> auto f1 = std::make_scope_fail([&]{ Base1::cleanup(); });
> Base2::setup(std::move(s)); // with or without move
> auto f2 = std::make_scope_fail([&]{ Base2::cleanup(); });
>
> // etc.
> }
> };
>
> Granted, it's a bit verbose and boilerplate-y, so how would you express
> all that with all its attendant complexity in a simpler way?
>
First, if this is what today-ish requires, then it's real hell. How on
Earth am I supposed to figure it out? Wouldn't any of us rather have this
automated, one way or another?
Second, more generally speaking, the association of failure-handling with
any call is not the subject of this discussion. You just demonstrated that
the problem exists regardless of the extension concept. That you came up
with some form of boilerplate-y solution means that it should be possible
to come up with a parallel, less boilerplate-y solution for extensions.
The syntax is likely to be ugly (maybe less ugly than now) because,
assuming that a cleanup method exists (which may not be the case at all),
it would somehow have to be specified in the declaration of the extensible
method. But it only emphasizes the need for extension -- this will ensure
that it really will be called on failure.
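
As a rough illustration that the pairing of a setup with its cleanup can be stated once rather than repeated in every override, here is a library-level sketch in current C++. It is not the proposed language feature; Step, run_with_rollback and demo are hypothetical names, and cleanups are assumed not to throw.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical pairing of a setup step with the cleanup that undoes it.
struct Step {
    std::function<void()> setup;
    std::function<void()> cleanup;
};

// Runs every setup in order. If one throws, the cleanups of the steps that
// already completed run in reverse order and the exception is rethrown.
inline void run_with_rollback(const std::vector<Step>& steps) {
    std::size_t done = 0;
    try {
        for (; done < steps.size(); ++done) steps[done].setup();
    } catch (...) {
        while (done > 0) steps[--done].cleanup();
        throw;
    }
}

// Two steps, the second of which fails during setup.
inline std::string demo() {
    std::string log;
    std::vector<Step> steps = {
        {[&] { log += "setup1;"; }, [&] { log += "cleanup1;"; }},
        {[&] { log += "setup2 throws;"; throw std::runtime_error("fail"); },
         [&] { log += "cleanup2;"; }},
    };
    try {
        run_with_rollback(steps);
    } catch (const std::runtime_error&) {
        log += "caught;";
    }
    return log;   // "setup1;setup2 throws;cleanup1;caught;"
}
```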

> Even if you try to avoid most of the complications by restricting this to
> single inheritance and using *all* noexcept functions...:
> struct base
> {
> virtual void setup(std::string) noexcept;
> virtual void cleanup() noexcept;
> };
>
> struct derived : base
> {
> void setup(std::string s) noexcept override
> {
> extension; // or whatever keyword or syntax
>

None whatsoever -- extension is declared and determined in the base class.

>
> /* Expands to:
> base::setup(s); // s is copied, might throw
> */
>
> // etc., but if there are any errors, is base cleanup done?
> }
> };
>
> ... you can still end up with a surprise potential throw. Plus you still
> need to worry about doing the cleanup.
>
I think that one of us misunderstood the concept of noexcept. At least, I
don't understand yours.
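
One possible reading of the quoted concern is that a by-value parameter is constructed as part of the call expression, so a call to a noexcept function can still be potentially throwing when the argument has to be copied. The sketch below shows this with the noexcept operator; Payload and the free function setup are hypothetical stand-ins for the types in the quoted example.

```cpp
#include <utility>

// Hypothetical argument type whose copy may throw but whose move may not.
struct Payload {
    Payload() = default;
    Payload(const Payload&) {}         // potentially throwing copy
    Payload(Payload&&) noexcept {}     // non-throwing move
};

// The function body itself promises not to throw.
inline void setup(Payload) noexcept {}

// The noexcept operator inspects the whole call expression, including the
// construction of the by-value parameter from the argument.
constexpr bool call_copying_is_noexcept =
    noexcept(setup(std::declval<Payload&>()));   // lvalue: copy constructor
constexpr bool call_moving_is_noexcept =
    noexcept(setup(std::declval<Payload>()));    // rvalue: move constructor

static_assert(!call_copying_is_noexcept, "the copy may throw");
static_assert(call_moving_is_noexcept, "the move cannot throw");
```

Under this reading, in the quoted example the copy would be made inside derived::setup, which is itself noexcept, so an exception from the copy would end in std::terminate rather than propagate.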

-- 
Ofri Sadowsky, PhD
Computer Science Consulting and Training
7 Carmel St., #37
Rehovot  76305
Israel
Tel: +972-77-3436003
Mob: +972-54-3113572

Received on 2019-05-19 06:28:01