Date: Sat, 9 Aug 2025 12:18:22 +0200
On 09.08.2025 09:23, Sebastian Wittmeier via Std-Proposals wrote:
> The very long multiplication with O(n*log n) may need a higher number of
> bits for each individual multiplication than the trivial O(n²) solution.
> Leading to a higher constant factor, if it exceeds the native capability
> of the ALU or the size of registers.
>
Algorithms more advanced than the trivial one /always/ need at least
some extra bits. And while they save on multiplies, they require extra
additions and other manipulations. So they are always less efficient
with sizes up to several times bigger than processors can handle
natively. I would not expect anything to be faster than the trivial
algorithm until about 512 bits at least, maybe a fair bit more than
that. It can make sense to inline the trivial algorithm, but other
algorithms would be library code.
After that, Karatsuba (which splits the n-bit numbers into two, then
uses three n/2 bit multiplications instead of the four needed by the
trivial algorithm) is best for a good while.
I can't see it making sense to have anything in the C++ standard that
requires more than that - fancier algorithms are in the realms of
advanced arbitrary precision calculations and big libraries like the GMP.
And the only known O(n . log n) multiplication algorithm is a so-called
"galactic algorithm" - it is only worth using when you have numbers the
size of a galaxy!
A C++ standards proposal should of course only specify the interface -
the algorithms used are a matter of the implementation.
> The very long multiplication with O(n*log n) may need a higher number of
> bits for each individual multiplication than the trivial O(n²) solution.
> Leading to a higher constant factor, if it exceeds the native capability
> of the ALU or the size of registers.
>
Algorithms more advanced than the trivial one /always/ need at least
some extra bits. And while they save on multiplies, they require extra
additions and other manipulations. So they are always less efficient
with sizes up to several times bigger than processors can handle
natively. I would not expect anything to be faster than the trivial
algorithm until about 512 bits at least, maybe a fair bit more than
that. It can make sense to inline the trivial algorithm, but other
algorithms would be library code.
After that, Karatsuba (which splits the n-bit numbers into two, then
uses three n/2 bit multiplications instead of the four needed by the
trivial algorithm) is best for a good while.
I can't see it making sense to have anything in the C++ standard that
requires more than that - fancier algorithms are in the realms of
advanced arbitrary precision calculations and big libraries like the GMP.
And the only known O(n . log n) multiplication algorithm is a so-called
"galactic algorithm" - it is only worth using when you have numbers the
size of a galaxy!
A C++ standards proposal should of course only specify the interface -
the algorithms used are a matter of the implementation.
Received on 2025-08-09 10:18:27