Date: Sat, 9 Aug 2025 09:23:19 +0200
The very long multiplication with O(n*log n) may need a higher number of bits for each individual multiplication than the trivial O(n²) solution. Leading to a higher constant factor, if it exceeds the native capability of the ALU or the size of registers.
-----Ursprüngliche Nachricht-----
Von:Hans Åberg via Std-Proposals <std-proposals_at_[hidden]>
Gesendet:Fr 08.08.2025 23:16
Betreff:Re: [std-proposals] Multiprecision division
An:std-proposals_at_[hidden];
CC:Hans Åberg <haberg_1_at_[hidden]>;
> On 8 Aug 2025, at 20:38, Thiago Macieira via Std-Proposals <std-proposals_at_[hidden]> wrote:
>
> _BitInt is in C23. It's not yet officially part of C++, but compilers that
> support it in C will support it in C++ at some point (Clang currently does,
> GCC does not). There's also someone writing a paper to add std::bitint<N>.
>
> The upper limit of N is implementation dependent. GCC 15 has an upper limit of
> 65535 bits and Clang's is 8388608 bits.
The standard “long” multiplication has quadratic time complexity in the number of words, which sets a practical limit. The Wikipedia indicates algorithms that are better in this respect, but they have high overhead for smaller numbers, with a best theoretical limit of O(n log n).
Therefore, one wants to use the largest possible word, like 64-bit instead of a 64/32-bit hybrid. I admit the pure 64-bit, relying on a 64/32-bit div_wide in Tiago Freire's proposal if divq is not available (like the LLVM implementation mentioned below).
> As you can see in https://gcc.godbolt.org/z/f5zTqrYK4,
> * there already is a runtime library function for performing generic
> multiprecision divisions, called __divmodbitint4[1][2]
From what I can see on the link below, this is similar to what I suggested, only that I admit one to choose the word size, and do not have any allocation requirements other than what is necessary. With “const” on u, it must be copied, probably internally, as they do not seem to use r, like alloca or malloc.
https://gcc.gnu.org/onlinedocs/gccint/Integer-library-routines.html
> * Clang inlined the division instead of calling the runtime library function
> * the runtime library function is not public; only the compiler is allowed to
> call it
All my functions are inlined, and written to use functions as in Tiago Freire's proposal, which can be optimized with assembly code where available. The division function does not require words of higher sizes, but uses these functions instead. There are default implementations for all available word sizes.
> * users are limited to sizes known at compile time
> Your proposal right now is giving direct access to the runtime functions, by
> giving them a public name.
>
> Do we need sizes only known at runtime?
Both compile-time and runtime sizes work. I gave an example before in this thread: the function div32, using static arrays.
> If you do, wouldn't it be better to just use some other MP library?
The article below shows the problem with other implementations, leading up to a variation of div_wide<uint64_t> in Tiago Freire's proposal. External libraries must be maintained, and there are license issues.
https://danlark.org/2020/06/14/128-bit-division/
https://raw.githubusercontent.com/llvm/llvm-project/refs/heads/main/compiler-rt/lib/builtins/udivmodti4.c
--
Std-Proposals mailing list
Std-Proposals_at_[hidden]
https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
Received on 2025-08-09 07:33:43