ISOCPP std-proposals List: Re: [std-proposals] Modular integers

From: Jan Schultke <janschultke_at_[hidden]>
Date: Tue, 13 Jan 2026 21:26:34 +0100

>
> > If recursive template-based implementations can generate better code,
> > then the compiler could just skip the recursive template nonsense and
> > just generate the better code. That recursive templates can be used to
> > trick the compiler into that better code generation doesn't stop this
> > from being a QoI issue.
>
> The recursive templates can't easily be omitted because the iteration is
> too complex to be written by hand, and part of the optimizations happens in
> the pipelining, which is on a lower level.

There is still no reason why the target-specific instruction selection for
iN operations in LLVM necessarily selects worse instructions for say,
128-bit division than division between mod_int. CPU pipelining is below
assembly anyway, so somewhat beyond what compilers can directly control.

> > If you want this to move forward, you need to show:
> >
> > 1. That better code generation exists. This is not about the
> > *mechanism* of that code generation (ie: recursive templates); I'm
> > talking about the actual resulting assembly.
> > 2. That a compiler is incapable of generating that code using the
> > existing `_BitInt` mechanism. Not that compilers don't currently do
> > it; that a compiler *fundamentally* cannot do so through `_BitInt`.
> > That there is some information which cannot exist that prevents it
> > from generating the optimal assembly.
>
> It relies on a mathematical analysis which can't be done in the general
> multiprecision division algorithm.
>

But LLVM does not perform multiprecison until instruction selection in the
backend. It does mathematical analysis in the middle-end for iN as if that
integer width was supported. That is, dividing by 2**100 should get
strength-reduced to a 100-bit right-shift instead of first being broken
down into countless instructions that comprise a 64-bit operation,
preventing this strength reduction.

The fact that mathematical analysis can't be done when you break it down
into multiprecision is actually why it's a terrible idea to use pure
library implementations instead of fundamental types for N-bit integers:
emitting 100 64-bit IR instructions instead of a single 128-bit IR
instruction obfuscates the mathematics of the operation to the compiler.
This is why we need _BitInt.

Received on 2026-01-13 20:26:49