ISOCPP Std Discussion List: Re: Some feedback on scope guards

From: Edward Catmur <ecatmur_at_[hidden]>
Date: Sat, 8 Apr 2023 22:18:19 -0500

On Sat, 8 Apr 2023 at 19:47, Andrey Semashev via Std-Discussion <
std-discussion_at_[hidden]> wrote:

> On 4/9/23 02:23, Edward Catmur wrote:
> >
> >
> > On Sat, 8 Apr 2023 at 05:21, Andrey Semashev via Std-Discussion
> > <std-discussion_at_[hidden]> wrote:
> >
> > The added cost may not be as high as it may seem.
> >
> > 1. The unhandled exception counter needs to be captured on coroutine
> > creation. This overhead is added to the existing overhead of
> > dynamically allocating the coroutine state and copying all the
> > coroutine arguments to it. Copying one more integer doesn't seem much
> > in comparison.
> >
> > 2. Updating the coroutine-local counter needs to happen when an
> > exception is thrown/re-thrown/caught. This overhead is added to
> > allocating and copying the exception object and the associated stack
> > unwinding process. Depending on the exception type and the cost of
> > unwinding, this overhead may or may not be noticeable. However, this
> > overhead only exists on the exceptional code path.
> >
> > 3. When the exception passes the boundary of the coroutine and the
> > caller is also a coroutine, the caller's counter needs to be updated.
> > The compiler already generates an implicit try/catch block around the
> > coroutine body to catch any exceptions leaving it to call
> > promise_type::unhandled_exception(), which may capture the exception
> > so that it can be later rethrown in the caller. So this counter update
> > is actually implemented as part of #2 described above, no additional
> > overhead here.
> >
> > Hm, there's something I'm failing to understand here. Let's say a
> > coroutine is constructed and its frame constructs a scope_failure object
> > while some number - say 3 - exceptions are in flight. The coroutine is
> > then suspended, and one of the exceptions is cleaned up, leaving 2
> > exceptions in flight. The coroutine is then resumed, and throws an
> > exception out, so that again there are 3 exceptions in flight. Wouldn't
> > that result in the uncaught exception count comparison incorrectly
> > returning true?
>
> That's the point of saving the unhandled exception counter to the
> coroutine state. When the coroutine is resumed,
> co_unhandled_exceptions() would still return 3, so long as no exception
> is thrown from it. When the exception is thrown, it would increment the
> coroutine-specific counter to 4, which would allow scope_fail to detect
> the exception. When the exception propagates to the caller and is then
> rethrown, the caller's counter is incremented from 2 to 3 before
> the stack unwinding proceeds.
>

Thanks, I think I understand. But how would throwing an exception increment
that counter? Are we talking extra per-thread state, possibly a
thread-local linked list of coroutines (bear in mind that there can be
multiple active coroutine frames on a thread, if they call each other
rather than awaiting) or is the unwinder detecting coroutine frames to find
the corresponding counter? Can this be implemented without breaking
backwards compatibility (linking new object files against old runtime)?

> > In any case, I'm not insisting on this co_unhandled_exceptions()
> > implementation, or even on the particular co_unhandled_exceptions()
> > interface. If there is a more efficient way to discover whether an
> > exception is in flight, including in coroutines, that's great and
> > let's standardize that.
> >
> > "In flight" is the tricky thing. There may be any number of exceptions
> > in flight over the life of a stack frame, and all the more so a
> > coroutine frame, with the function (or coroutine) blissfully unaware of
> > this. Really, the question we are asking is whether the destruction of
> > the object was *caused* by an exception being thrown. But this is
> > impossible to answer; causality has no physical correlate, so it's
> > definitely impossible to observe from within the abstract machine.
> > However, we can determine whether an object whose complete object is an
> > automatic variable was destroyed through exceptional stack unwinding or
> > through normal control flow.
>
> Storage type or membership are irrelevant. For example, destroying e.g.
> a stack-allocated unique_ptr pointing to a dynamically allocated scope
> guard should work the same as destroying the scope guard allocated
> directly on the stack. Really, the definitive factor that decides
> whether scope_fail should invoke the action is whether the destructor is
> ultimately called because of the stack unwinding.

"Because of" is problematic. That implies causality, which is a huge and
unsolved area of philosophy.

For example:

void f(bool b) {
  unique_ptr<scope_fail> psf1;
  {
    unique_ptr<scope_fail> psf2;
    try {
      unique_ptr<scope_fail> psf = ...;
      scope_success ss([&] { psf1 = std::move(psf); });
      scope_fail sf([&] { psf2 = std::move(psf); });
      if (b)
        throw 1;
    } catch (...) {
    }
  }
  throw 2;
}

If `b` is true, the dynamic lifetime scope_fail is transferred to psf2, so
its destruction is immediately caused by the nonexceptional destruction of
psf2, but that was only possible because of the exceptional destruction of
sf, so which counts as "causing" the destruction? And I'm sure people will
be able to come up with trickier examples; we need to talk in terms of
observable attributes of the abstract machine (notwithstanding that we may
augment the language to do so).

> The answer to that is
> pretty clear, as the abstract machine is either in the process of
> unwinding the stack, or it isn't; there is no third option, AFAIK.
>

But you can do anything you like while stack unwinding is occurring; you
can call whatever code you like from a destructor, as long as you don't
leak another exception out. So stack unwinding is a necessary but not
sufficient condition.
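
A minimal sketch of that point, assuming a guard that compares
std::uncaught_exceptions() at construction and destruction. The names
scope_fail_sketch, Probe and run_probe_during_unwinding are hypothetical and
exist only for this illustration; this is not the proposed std::scope_fail.
The guard lives in a scope that exits normally, but that scope sits inside a
destructor which itself runs during stack unwinding:

```cpp
#include <cassert>
#include <exception>
#include <utility>

// Hypothetical guard: runs its action only if more exceptions are in flight
// at destruction than there were at construction.
template <class F>
class scope_fail_sketch {
  F action_;
  int count_at_entry_ = std::uncaught_exceptions();

public:
  explicit scope_fail_sketch(F f) : action_(std::move(f)) {}
  scope_fail_sketch(const scope_fail_sketch&) = delete;
  scope_fail_sketch& operator=(const scope_fail_sketch&) = delete;
  ~scope_fail_sketch() {
    if (std::uncaught_exceptions() > count_at_entry_) action_();
  }
};

struct Probe {
  bool* unwinding_seen;  // set if an exception was in flight in ~Probe
  bool* guard_fired;     // set if the nested guard ran its action

  ~Probe() {
    *unwinding_seen = std::uncaught_exceptions() > 0;
    // This guard's scope exits normally, even though the enclosing
    // destructor is being run as part of stack unwinding.
    scope_fail_sketch guard([this] { *guard_fired = true; });
  }
};

struct Result {
  bool unwinding_seen;
  bool guard_fired;
};

Result run_probe_during_unwinding() {
  Result r{false, false};
  try {
    Probe p{&r.unwinding_seen, &r.guard_fired};
    throw 1;  // unwinding destroys p
  } catch (...) {
  }
  return r;
}
```

Calling run_probe_during_unwinding() reports that unwinding was in progress
while the guard existed, yet the guard did not fire, because the count did not
change across the guard's own lifetime. A guard that only asked "is the stack
being unwound right now?" would have fired here.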

> > Indeed, coroutines add a third option to this; a coroutine stack frame
> > automatic variable can also be destroyed by calling destroy() on its
> > coroutine_handle while suspended. The problem is that the language can't
> > know whether the call to destroy() was caused by unwinding or by normal
> > control flow. Should such destruction be considered success, failure,
> > both, or neither?
>
> This is an interesting question, but I tend to think we should keep
> things straightforward. That is, if the call to
> coroutine_handle::destroy() is done during the stack unwinding, that
> should be considered as such (i.e. all destructors called as a result of
> calling destroy() should be able to tell they are being called due to
> stack unwinding). This would be consistent with the current semantics of
> coroutine_handle being essentially a pointer to the coroutine state and
> coroutine_handle::destroy() essentially calling `operator delete` on it.
>

It's difficult to know whether destroying a suspended coroutine should be
considered success or failure, and I think that's orthogonal to the
immediate cause of the destroy(). Say you have two coroutines, one
performing a long running task and the other a watchdog (a timeout, etc.)
on the first. If the first is destroyed while suspended that's probably a
bad thing (the task failed) but if the second is destroyed while suspended
that's usually a good thing (the task succeeded, so the watchdog is no
longer needed)!

> I have to admit this doesn't work well with my co_unhandled_exceptions()
> idea because, assuming the coroutine state is destroyed in the context
> of the caller, it would return the exception counter of the caller, not
> the one cached in the coroutine state. I suppose, we could work around
> this by requiring co_unhandled_exceptions() to still return the cached
> value in this case, but that would mean coroutine_handle::destroy()
> would have to do something more than just deleting the state. I'm not
> sure if there's a better way to fix this.
>
> > > I think it's a lot more likely that compilers will just create
> > > types whose destructors are marked as being called only either from
> > > scope exit or from unwind cleanup, and omitted in the other case. (At
> > > least, that's how I'd implement it.) So Boost.Scope etc. will need to
> > > either use those magic library types, or the attributes/intrinsics
> > > those library types use (if they aren't implemented with True Name
> > > magic).
> >
> > Not calling the destructor on normal return (or on exception - we'd
> > need both for scope_success and scope_fail) will not work because even
> > if you don't invoke the scope guard action, you still want to destroy
> > it.
> >
> > True. You could have multiple cooperating objects, but that gets a bit
> > fiddly. Perhaps extra destructors might be necessary; as today we have
> > subobject and complete object destructors, the compiler might add
> > automatic and unwind destructors.
>
> Syntactically and semantically, there are just destructors currently.
> They don't discriminate between complete objects or subobjects. Or have
> I missed some new core language feature?
>

It's not exposed to syntax, but it is there in ABI, so arguably in
semantics.

> > The problem is then that this would only work for scope guard objects
> > that are complete objects with automatic lifetime. Do we want standard
> > scope guards to "work" (in whatever sense) as subobjects (even nested
> > objects) and/or in dynamic lifetime?
>
> Again, complete object/subobject, as well as storage types are a red
> herring.
>
> > Also, if you're thinking about marking the objects on construction to
> > skip their destructors upon different kinds of scope exiting, that
> > sounds like a more expensive solution compared to
> > co_unhandled_exceptions(). In this case, the overhead would be
> > attached to every non-trivially-destructible object created on the
> > stack, coroutines or not, compared to just a few key points in
> > coroutine creation and stack unwinding.
> >
> > It wouldn't need to be a dynamic property. Exception cleanups are
> > (usually) separate blocks of code to normal scope exits, so can call
> > different (sets of) destructors.
>
> I don't think so. Destructors may not be inline, meaning that the
> unwinding code may need to pass a runtime piece of information to the
> destructor on whether it is called due to the stack unwinding or not. Or
> there has to be a way for the destructor to discover this on its own,
> and we're back to square one.
>

Not if we add more destructors to ABI.

> I should note that adding this runtime piece of information would be an
> ABI breaking change. As well as adding a new set of destructors for
> unwinding/normal destruction (which would be detrimental to code sizes
> on its own). In practice, I don't think this would be acceptable by
> compiler writers.
>

It would only be these specific classes that would have the extra
destructors, and they would likely be inline. Obviously, this would only
work for scope guards that are complete automatic objects, which is why I'm
interested to know whether they are expected to work as subobjects or
dynamically.
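
To make the dynamic-lifetime question concrete, here is a hedged sketch of how
a guard based on comparing std::uncaught_exceptions() can behave when it
outlives the scope it was created in. All names (counting_scope_fail,
MakesGuardOnDestruction, guard_fires_when_destroyed_by_later_unwinding) are
hypothetical, and the guard is an assumption about a typical count-based
implementation, not any particular library's. The guard is created on the heap
while one exception is in flight, then destroyed later by the unwinding of a
different exception:

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical count-based guard, type-erased so it can be heap-allocated.
class counting_scope_fail {
  std::function<void()> action_;
  int baseline_ = std::uncaught_exceptions();

public:
  explicit counting_scope_fail(std::function<void()> a)
      : action_(std::move(a)) {}
  counting_scope_fail(const counting_scope_fail&) = delete;
  counting_scope_fail& operator=(const counting_scope_fail&) = delete;
  ~counting_scope_fail() {
    if (std::uncaught_exceptions() > baseline_) action_();
  }
};

// Creates the guard from inside a destructor, so that the guard's baseline
// is captured while an exception is in flight.
struct MakesGuardOnDestruction {
  std::unique_ptr<counting_scope_fail>* slot;
  bool* fired;

  ~MakesGuardOnDestruction() {
    bool* f = fired;
    *slot = std::make_unique<counting_scope_fail>([f] { *f = true; });
  }
};

bool guard_fires_when_destroyed_by_later_unwinding() {
  bool fired = false;
  std::unique_ptr<counting_scope_fail> stored;
  try {
    MakesGuardOnDestruction m{&stored, &fired};
    throw 1;  // m is destroyed during unwinding: the guard's baseline is 1
  } catch (...) {
  }  // first exception handled: the count is back to 0
  try {
    std::unique_ptr<counting_scope_fail> owner = std::move(stored);
    throw 2;  // owner is destroyed during unwinding: the count is 1 again
  } catch (...) {
  }
  return fired;
}
```

Here the guard is destroyed by stack unwinding, but the count at destruction
equals the stale baseline, so the action is not run and the function returns
false. A suspended coroutine frame holding such a guard has the same problem,
as discussed earlier in the thread.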

Received on 2023-04-09 03:18:33