Date: Wed, 21 Oct 2020 16:16:04 -0400
On Wed, Oct 21, 2020 at 1:20 PM Michael L. Scott <mlscott_at_[hidden]> wrote:
> On Oct 21, 2020, at 9:54 AM, Michael Spear via SG5 <sg5_at_[hidden]>
> wrote:
>
> Hans, thanks for the great summary of our discussion. In the spirit of
> "keeping the conversation going", I want to point out the following about
> implementation.
>
> If an implementation protects all transactions with a single, global,
> reentrant lock, then there are three pieces to the implementation.
> - The first is modifying the parser to support the new syntax. This is
> common to any implementation, so whatever effort it entails is really a
> fixed cost for the TS.
> - The second is some kind of run-time library that maps the beginning and
> end of a lexically scoped transaction to operations for lock acquire and
> lock release. This is quite simple.
> - The third is whatever static analysis is needed to warn programmers when
> their atomic blocks do something forbidden. This may be difficult,
> depending on how we specify things. For example, if the compiler *must*
> warn when an exception escapes a transaction, then if the compiler does not
> have access to definition of a function that might throw, how could it give
> such a warning?
>
> My guess is that the third piece is going to be "best effort" by the
> compiler writer.
>
>
> Or maybe it’s careful choice on our part between “not allowed” and
> “undefined behavior.”
>
> Surely some things should produce warnings, but with an understanding that
> there might be false negatives (e.g., we miss a warning for a transaction
> that has undefined behavior, because the transaction spans TUs and in some
> other TU, it uses std::cout or a std::atomic variable, or opens a socket,
> or does something else that can legally constitute inter-thread
> communication).
>
> Turning to the second popular implementation, which uses Intel/ARM/IBM
> HTM, note that pieces 1 and 3 are unchanged. As for piece 2, it is only
> slightly more complicated, still quite simple.
>
> Now on to the third implementation, with STM. Task 2 is much harder,
> because we have to instrument the reads and writes in the bodies of
> functions called from the transaction. The key simplification here is that
> we do not support explicit self-abort. While the user might assume that
> the implementation uses speculation to get better concurrency, there is no
> speculation or rollback in the proposed TS. This means that even a
> sophisticated STM implementation can fall back to serializing on a single
> global lock. For multi-TU transactions, such fallback is expected in the
> proposed TS, because we do not want viral annotations on functions, and we
> did not specify "transaction-safe" annotations or hints.
>
> From a pragmatic perspective, this means any function in the C++ standard
> library can simply be declared "transaction safe", as long as there is a
> reasonable argument for doing so. Since HTM and STM both are allowed to
> fall back to a lock at any time, the only thing we really need to do is
> ensure that a single lock implementation that allowed standard library
> function X would still be correct, and then it should be valid to say that
> standard library function X is safe to call from within transactions.
>
> In the case of memory allocation, the argument is straightforward: making
> assumptions about the bit representation of the return value of operator
> new is not legal. Correct C++ programs cannot use return values from
> operator new to infer the behavior of other threads, or to communicate with
> each other, because doing so is not portable, and the implementation of the
> allocator is not defined by the specification. Therefore, as long as an
> STM transaction's call to operator new or operator delete can be "logically
> undone" if the run-time system aborts and restarts the transaction, there
> is no semantic issue. At the implementation level, operator new might call
> sbrk() or mmap(), which could cause an HTM transaction to fall back to
> serializing, but that's not a correctness issue.
>
> In the case of throw, the expression of exceptions in the intermediate
> representation is via function calls or intrinsics (cxa_throw, cxa_catch,
> LLVM landing pads, etc). The implementation of these is dependent on the
> OS and architecture. For example, IIRC on Linux targets they traverse
> DWARF tables within the binary itself. So, again, making assumptions about
> these is not portable, and correct programs cannot assume anything more
> than the high-level specified behavior. That behavior is little more than
> "nonlocal jump, with destruction of everything on the popped part of the
> run-time stack". Note: if I am wrong, and the behavior of stack unwinding
> is more carefully defined, and that definition includes the possibility of
> stack unwinding being a legal form of inter-thread communication, then my
> whole argument is invalid. Hopefully someone more knowledgeable about
> exceptions in C++ can confirm or refute my claim.
>
>
> I am not a language lawyer, but: I strongly suspect that inter-thread
> communication _is_ a supported part of exception handling. After all,
> thinks like std::lock_guard count on order execution of destructors with
> inter-thread impacts.
>
>
But is that more than just program order? If a destructor does not touch
shared memory, then can the fact that the destructor ran be legally used as
a communication channel with another thread?
> In the original TMTS, exceptions are tricky because there need to be
> writes to the exception object, and these writes need to be instrumented in
> a special way, because these writes can escape the transaction even if the
> transaction gets undone. In the proposed TMTS, exceptions cannot escape a
> transaction (and the transaction doesn't ever have to get undone), which
> means that STM can fall back to a single lock whenever it encounters a
> throw. This is easy to do, because the STM instrumentation pass simply
> says "here's a function for which I don't have a definition... I guess I'll
> just serialize the transaction here". cxa_throw is no different than a
> multi-TU transaction.
>
> If we *wanted* to specify commit-on-exception-escape, it would be trivial
> to support in a reasonable TM implementation. I don't think we want to,
> and I'm in favor of forbidding exceptions from escaping transactions.
> However, the use of exceptions within a transaction does not appear to pose
> an implementation challenge, nor does it appear to pose a correctness
> challenge.
>
> So, in summary, I do not see implementation issues with supporting
> throw+catch inside a transaction, or allocation inside a transaction. As
> the minutes state, this will open up a huge portion of the standard library
> to transactions (perhaps only threads, coroutines, mutexes, atomics, and
> iostreams would still be forbidden?).
>
> - Mike
>
> On Wed, Oct 21, 2020 at 12:28 AM Hans Boehm via SG5 <sg5_at_[hidden]>
> wrote:
>
>> Following up on our resolution to discuss this between meetings:
>>
>> In the HTM case, since the hardware actually implements strong atomicity,
>> for atomically accessible data, is it correct to implement
>>
>> atomic do { r = data; }
>>
>> as essentially
>>
>> r.load(data, memory_order_seq_cst);
>>
>> ?
>>
>> That would very partially address the concern we had during the
>> meeting, and maybe make Michael Scott happier. With some implementation
>> effort simple transactional loads could be made cheap on x86 and maybe
>> cheap enough on ARM. It would unfortunately require a less trivial
>> implementation to do so.
>>
>> Hans
>>
>> On Tue, Oct 20, 2020 at 6:16 PM Hans Boehm <boehm_at_[hidden]> wrote:
>>
>>> Attendees: Hans Boehm, Victor Luchangco, Jens Maurer (at the beginning),
>>> Michael Scott, Mike Spear, and Michael Wong
>>>
>>> Initially discussed p2066r4 and vector::size(), and the question of
>>> whether it is and should be included in the set of guaranteed
>>> transaction-safe functions.
>>>
>>> We agreed that the current definition of "accessors" is not clear
>>> enough. At this point, there also seemed to be close to consensus that we
>>> should try to initially keep things as simple as possible, and avoid
>>> anything that might look scary to implementors.
>>>
>>> We then tried to switch back to the discussion of including new.
>>>
>>> Mike Spear argued that adding new and supporting non-escaping exceptions
>>> doesn't really complicate simple implementations. Both single-global-lock
>>> and HTM implementations still work.
>>>
>>> Mike Spear: What can you actually do with the p2066 proposal?
>>>
>>> Michael Scott: implement lock-free data structures like linked lists
>>> that read without a lock, and then transactionally update it at the right
>>> place.
>>>
>>> Mike Spear: No, because the initial reads can't be on atomic data.
>>>
>>> Can't really use these transactions as an NCAS replacement. All read
>>> accesses preceding NCAS would also have to be transactional, which is too
>>> expensive for something like Intel HTM. Transactions cost as much as locks,
>>> so surrounding every read by one is not practical.
>>>
>>> Victor L.: Transactional reads could work for tree traversal followed by
>>> update, which is probably more common.
>>>
>>> Clarification that this avoids contention in an HTM context, but still
>>> has latency issues even in the uncontended case.
>>>
>>> Discussion of double-checked locking, with a similar conclusion. It's
>>> not practical to put the initial read in a transaction.
>>>
>>> We unfortunately concluded that we either need more of a usability
>>> argument for what we have, or probably a revision of both the design and
>>> wording documents to allow new + caught exceptions. If we had the latter,
>>> most of the STL data structures could become usable in transactions. Given
>>> that roll-your-own data structures don't seem terribly viable, that would
>>> give us a much better usability argument.
>>>
>>> Closed with a strong recommendation to continue this discussion online,
>>> and not wait until the next meeting.
>>>
>>> Next meeting:
>>>
>>> Nov. 17, 8:00 PST, 11:00 EST
>>> Join with Google Meet
>>> meet.google.com/sbj-cvgh-vnd
>>> Join by phone
>>> (US) +1 208-925-0196 PIN: 255 542#
>>>
>>>
>>>
>>> --
>> SG5 mailing list
>> SG5_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg5
>>
> --
> SG5 mailing list
> SG5_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg5
>
>
>
> On Oct 21, 2020, at 9:54 AM, Michael Spear via SG5 <sg5_at_[hidden]>
> wrote:
>
> Hans, thanks for the great summary of our discussion. In the spirit of
> "keeping the conversation going", I want to point out the following about
> implementation.
>
> If an implementation protects all transactions with a single, global,
> reentrant lock, then there are three pieces to the implementation.
> - The first is modifying the parser to support the new syntax. This is
> common to any implementation, so whatever effort it entails is really a
> fixed cost for the TS.
> - The second is some kind of run-time library that maps the beginning and
> end of a lexically scoped transaction to operations for lock acquire and
> lock release. This is quite simple.
> - The third is whatever static analysis is needed to warn programmers when
> their atomic blocks do something forbidden. This may be difficult,
> depending on how we specify things. For example, if the compiler *must*
> warn when an exception escapes a transaction, then if the compiler does not
> have access to definition of a function that might throw, how could it give
> such a warning?
>
> My guess is that the third piece is going to be "best effort" by the
> compiler writer.
>
>
> Or maybe it’s careful choice on our part between “not allowed” and
> “undefined behavior.”
>
> Surely some things should produce warnings, but with an understanding that
> there might be false negatives (e.g., we miss a warning for a transaction
> that has undefined behavior, because the transaction spans TUs and in some
> other TU, it uses std::cout or a std::atomic variable, or opens a socket,
> or does something else that can legally constitute inter-thread
> communication).
>
> Turning to the second popular implementation, which uses Intel/ARM/IBM
> HTM, note that pieces 1 and 3 are unchanged. As for piece 2, it is only
> slightly more complicated, still quite simple.
>
> Now on to the third implementation, with STM. Task 2 is much harder,
> because we have to instrument the reads and writes in the bodies of
> functions called from the transaction. The key simplification here is that
> we do not support explicit self-abort. While the user might assume that
> the implementation uses speculation to get better concurrency, there is no
> speculation or rollback in the proposed TS. This means that even a
> sophisticated STM implementation can fall back to serializing on a single
> global lock. For multi-TU transactions, such fallback is expected in the
> proposed TS, because we do not want viral annotations on functions, and we
> did not specify "transaction-safe" annotations or hints.
>
> From a pragmatic perspective, this means any function in the C++ standard
> library can simply be declared "transaction safe", as long as there is a
> reasonable argument for doing so. Since HTM and STM both are allowed to
> fall back to a lock at any time, the only thing we really need to do is
> ensure that a single lock implementation that allowed standard library
> function X would still be correct, and then it should be valid to say that
> standard library function X is safe to call from within transactions.
>
> In the case of memory allocation, the argument is straightforward: making
> assumptions about the bit representation of the return value of operator
> new is not legal. Correct C++ programs cannot use return values from
> operator new to infer the behavior of other threads, or to communicate with
> each other, because doing so is not portable, and the implementation of the
> allocator is not defined by the specification. Therefore, as long as an
> STM transaction's call to operator new or operator delete can be "logically
> undone" if the run-time system aborts and restarts the transaction, there
> is no semantic issue. At the implementation level, operator new might call
> sbrk() or mmap(), which could cause an HTM transaction to fall back to
> serializing, but that's not a correctness issue.
>
> In the case of throw, the expression of exceptions in the intermediate
> representation is via function calls or intrinsics (cxa_throw, cxa_catch,
> LLVM landing pads, etc). The implementation of these is dependent on the
> OS and architecture. For example, IIRC on Linux targets they traverse
> DWARF tables within the binary itself. So, again, making assumptions about
> these is not portable, and correct programs cannot assume anything more
> than the high-level specified behavior. That behavior is little more than
> "nonlocal jump, with destruction of everything on the popped part of the
> run-time stack". Note: if I am wrong, and the behavior of stack unwinding
> is more carefully defined, and that definition includes the possibility of
> stack unwinding being a legal form of inter-thread communication, then my
> whole argument is invalid. Hopefully someone more knowledgeable about
> exceptions in C++ can confirm or refute my claim.
>
>
> I am not a language lawyer, but: I strongly suspect that inter-thread
> communication _is_ a supported part of exception handling. After all,
> thinks like std::lock_guard count on order execution of destructors with
> inter-thread impacts.
>
>
But is that more than just program order? If a destructor does not touch
shared memory, then can the fact that the destructor ran be legally used as
a communication channel with another thread?
> In the original TMTS, exceptions are tricky because there need to be
> writes to the exception object, and these writes need to be instrumented in
> a special way, because these writes can escape the transaction even if the
> transaction gets undone. In the proposed TMTS, exceptions cannot escape a
> transaction (and the transaction doesn't ever have to get undone), which
> means that STM can fall back to a single lock whenever it encounters a
> throw. This is easy to do, because the STM instrumentation pass simply
> says "here's a function for which I don't have a definition... I guess I'll
> just serialize the transaction here". cxa_throw is no different than a
> multi-TU transaction.
>
> If we *wanted* to specify commit-on-exception-escape, it would be trivial
> to support in a reasonable TM implementation. I don't think we want to,
> and I'm in favor of forbidding exceptions from escaping transactions.
> However, the use of exceptions within a transaction does not appear to pose
> an implementation challenge, nor does it appear to pose a correctness
> challenge.
>
> So, in summary, I do not see implementation issues with supporting
> throw+catch inside a transaction, or allocation inside a transaction. As
> the minutes state, this will open up a huge portion of the standard library
> to transactions (perhaps only threads, coroutines, mutexes, atomics, and
> iostreams would still be forbidden?).
>
> - Mike
>
> On Wed, Oct 21, 2020 at 12:28 AM Hans Boehm via SG5 <sg5_at_[hidden]>
> wrote:
>
>> Following up on our resolution to discuss this between meetings:
>>
>> In the HTM case, since the hardware actually implements strong atomicity,
>> for atomically accessible data, is it correct to implement
>>
>> atomic do { r = data; }
>>
>> as essentially
>>
>> r.load(data, memory_order_seq_cst);
>>
>> ?
>>
>> That would very partially address the concern we had during the
>> meeting, and maybe make Michael Scott happier. With some implementation
>> effort simple transactional loads could be made cheap on x86 and maybe
>> cheap enough on ARM. It would unfortunately require a less trivial
>> implementation to do so.
>>
>> Hans
>>
>> On Tue, Oct 20, 2020 at 6:16 PM Hans Boehm <boehm_at_[hidden]> wrote:
>>
>>> Attendees: Hans Boehm, Victor Luchangco, Jens Maurer (at the beginning),
>>> Michael Scott, Mike Spear, and Michael Wong
>>>
>>> Initially discussed p2066r4 and vector::size(), and the question of
>>> whether it is and should be included in the set of guaranteed
>>> transaction-safe functions.
>>>
>>> We agreed that the current definition of "accessors" is not clear
>>> enough. At this point, there also seemed to be close to consensus that we
>>> should try to initially keep things as simple as possible, and avoid
>>> anything that might look scary to implementors.
>>>
>>> We then tried to switch back to the discussion of including new.
>>>
>>> Mike Spear argued that adding new and supporting non-escaping exceptions
>>> doesn't really complicate simple implementations. Both single-global-lock
>>> and HTM implementations still work.
>>>
>>> Mike Spear: What can you actually do with the p2066 proposal?
>>>
>>> Michael Scott: implement lock-free data structures like linked lists
>>> that read without a lock, and then transactionally update it at the right
>>> place.
>>>
>>> Mike Spear: No, because the initial reads can't be on atomic data.
>>>
>>> Can't really use these transactions as an NCAS replacement. All read
>>> accesses preceding NCAS would also have to be transactional, which is too
>>> expensive for something like Intel HTM. Transactions cost as much as locks,
>>> so surrounding every read by one is not practical.
>>>
>>> Victor L.: Transactional reads could work for tree traversal followed by
>>> update, which is probably more common.
>>>
>>> Clarification that this avoids contention in an HTM context, but still
>>> has latency issues even in the uncontended case.
>>>
>>> Discussion of double-checked locking, with a similar conclusion. It's
>>> not practical to put the initial read in a transaction.
>>>
>>> We unfortunately concluded that we either need more of a usability
>>> argument for what we have, or probably a revision of both the design and
>>> wording documents to allow new + caught exceptions. If we had the latter,
>>> most of the STL data structures could become usable in transactions. Given
>>> that roll-your-own data structures don't seem terribly viable, that would
>>> give us a much better usability argument.
>>>
>>> Closed with a strong recommendation to continue this discussion online,
>>> and not wait until the next meeting.
>>>
>>> Next meeting:
>>>
>>> Nov. 17, 8:00 PST, 11:00 EST
>>> Join with Google Meet
>>> meet.google.com/sbj-cvgh-vnd
>>> Join by phone
>>> (US) +1 208-925-0196 PIN: 255 542#
>>>
>>>
>>>
>>> --
>> SG5 mailing list
>> SG5_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg5
>>
> --
> SG5 mailing list
> SG5_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg5
>
>
>
Received on 2020-10-21 15:16:17