ISOCPP SG14 List: [isocpp-sg14] Subject: Canonical parallel reduction

From: andrew drakeford <andreedrakeford_at_[hidden]>
Date: Sun, 15 Feb 2026 20:14:57 +0000

Hi all,
Following the recent SG14 discussion on deterministic parallel reduction, I’ve now written up the proposal as a formal WG21 draft.
The paper focuses on expression-structure determinism for parallel reductions: it defines a canonical pairwise reduction tree parameterised by a lane count L. Implementations (threads, SIMD, GPU) must produce results as-if evaluating that fixed tree.
This extends the determinism guarantee of std::accumulate (fixed evaluation structure) into parallel and vectorised contexts, without imposing associativity or commutativity requirements.
Draft paper (HTML):
https://andyd123.github.io/canonical-reduce/generated/DxxxxR0.html
As discussed, the reduction kernel follows the “iterated pairwise summation” approach described in:
Dalton, Wang, Blainey — “Fast, Accurate Summation of Floating-Point Numbers” (IBM Research)
https://research.ibm.com/publications/fast-accurate-summation-of-floating-point-numbers
I would particularly welcome feedback from SG14 on:

  *
Whether the semantic contract is clear and appropriately scoped
  *
Any concerns regarding interaction with existing parallel algorithms
  *
Whether the lane-parameterised formulation is the right abstraction boundary

If there are no major objections, I intend to move this toward LEWG for initial direction.
Best regards,
Andrew

Received on 2026-02-15 20:15:02