It also occurred to me later that the single lock fallback was going to be a problem. Unfortunate.

I think that if we add a version counter:

1) To the HTM fallback only, we can get cheaper scalable single reads.
2) Always, we can get cheap scalable read-only transactions.

What's the experience with those implementation strategies? I actually kind of like those implementation directions here, in that they give transactions a large performance advantage for some kinds of code, without requiring HTM, and they require ambitious compiler analysis only for performance, not correctness. Performance is scalable with compiler effort.

Hans

On Wed, Oct 21, 2020 at 10:26 AM Michael L. Scott via SG5 <sg5@lists.isocpp.org> wrote:
On Oct 21, 2020, at 10:01 AM, Michael Spear <mfs409@gmail.com> wrote:

I like this idea.  But note that it's going to require a lot of whole program reasoning.

I agree that it trivially applies to HTM.  But it does not apply to a single lock fallback, unless the compiler can ensure that no transaction performs two writes to /data/.  Otherwise, the internal state of the two-write transaction will be visible.  I haven't thought about the STM (especially redo case), but I suspect that it would break too.

I think both single-lock and STM are broken :-(

Consider
  T1: atomic { x++; y++; }
and
  T2: atomic {s = x;}  atomic {t = y;}

I see no way to preclude the case in which T2 sees the new x and the old y, with no strictly serializable execution order.

The use case is when the larger transaction does 
atomic do { data += 1; if (foo()) data += 1; else data += 3; }
If data is initially even, then we cannot allow the single read to see an odd value.

Also, if a transaction wrote {data1++; data2++}, data 1 always equals data 2, and another thread performed 
atomic do { r1 = data1;} atomic do {r2 = data2;}
We could also see intermediate state.  It's not really a different problem than making sure that STM writeback is atomic before privatization, but it applies to the single lock fallback as well as STM.

- Mike

On Wed, Oct 21, 2020 at 9:49 AM Michael L. Scott via SG5 <sg5@lists.isocpp.org> wrote:
That’s a good point.  It does indeed make me happier.  One could imagine the compiler similarly recognizing single-word store and RMW idioms.  Still no way to specify relaxed ordering when you want it, though.

On Oct 21, 2020, at 12:27 AM, Hans Boehm via SG5 <sg5@lists.isocpp.org> wrote:

Following up on our resolution to discuss this between meetings:

In the HTM case, since the hardware actually implements strong atomicity, for atomically accessible data, is it correct to implement

atomic do { r = data; }

as essentially 

r.load(data, memory_order_seq_cst);

?

That would very partially address the concern we had during the meeting, and maybe make Michael Scott happier. With some implementation effort simple transactional loads could be made cheap on x86 and maybe cheap enough on ARM. It would unfortunately require a less trivial implementation to do so.

Hans

On Tue, Oct 20, 2020 at 6:16 PM Hans Boehm <boehm@acm.org> wrote:
Attendees: Hans Boehm, Victor Luchangco, Jens Maurer (at the beginning), Michael Scott, Mike Spear, and Michael Wong

Initially discussed p2066r4 and vector::size(), and the question of whether it is and should be included in the set of guaranteed transaction-safe functions.

We agreed that the current definition of "accessors" is not clear enough. At this point, there also seemed to be close to consensus that we should try to initially keep things as simple as possible, and avoid anything that might look scary to implementors.

We then tried to switch back to the discussion of including new.

Mike Spear argued that adding new and supporting non-escaping exceptions doesn't really complicate simple implementations. Both single-global-lock and HTM implementations still work.

Mike Spear: What can you actually do with the p2066 proposal?

Michael Scott: implement lock-free data structures like linked lists that read without a lock, and then transactionally update it at the right place.

Mike Spear: No, because the initial reads can't be on atomic data.

Can't really use these transactions as an NCAS replacement. All read accesses preceding NCAS would also have to be transactional, which is too expensive for something like Intel HTM. Transactions cost as much as locks, so surrounding every read by one is not practical.

Victor L.: Transactional reads could work for tree traversal followed by update, which is probably more common.

Clarification that this avoids contention in an HTM context, but still has latency issues even in the uncontended case.

Discussion of double-checked locking, with a similar conclusion. It's not practical to put the initial read in a transaction.

We unfortunately concluded that we either need more of a usability argument for what we have, or probably a revision of both the design and wording documents to allow new + caught exceptions. If we had the latter, most of the STL data structures could become usable in transactions. Given that roll-your-own data structures don't seem terribly viable, that would give us a much better usability argument.

Closed with a strong recommendation to continue this discussion online, and not wait until the next meeting.

Next meeting:

Nov. 17, 8:00 PST, 11:00 EST
Join with Google Meet
meet.google.com/sbj-cvgh-vnd
Join by phone
‪(US) +1 208-925-0196‬ PIN: ‪255 542‬#



--
SG5 mailing list
SG5@lists.isocpp.org
https://lists.isocpp.org/mailman/listinfo.cgi/sg5

--
SG5 mailing list
SG5@lists.isocpp.org
https://lists.isocpp.org/mailman/listinfo.cgi/sg5

--
SG5 mailing list
SG5@lists.isocpp.org
https://lists.isocpp.org/mailman/listinfo.cgi/sg5