Date: Wed, 14 Jan 2026 10:43:46 +0100
>
> Yes, that is how I discovered it was much faster. It is a template,
> essentially an implementation of div_wide in this proposal:
>
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3161r4.html#functions.div_wide
>
> LLVM implements it with a loop. I made recursive templates, removing the
> loop and all multiplications except one per halfword.
>
> Tiago Freire has a copy which might be used for a reference
> implementation, in case you would like to add requirements that compilers
> have it, not merely being optional.
>
If you have an optimization opportunity that LLVM does not take, why don't
you make an LLVM PR or bug report instead of a C++ proposal?

> > And if so, why can't the compiler do
> > the same? What is the reason it cannot do the same?
>
> Why don't you ask the compiler or compiler writer? :-)
>
Why don't we ask you? You're the one arguing there needs to be a mod_int
C++ feature, so you're the one who needs to motivate it.

> In order to remove the loop, one has to restructure the condition. For a
> full-word implementation, one has to use two's complement features. In
> addition, one uses the mathematical facts that the preliminary division
> overshoots by at most 2 and that the add-back step is not needed, which can
> further be exploited to eliminate a final multiplication.
>
And why is it innately impossible for LLVM to perform this
two's-complement-based and mathematical optimization for
unsigned _BitInt(128), unlike for mod_int<128>?
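
For readers following along, the "overshoots by at most 2" remark can be checked by brute force. The sketch below is not the recursive-template implementation discussed in the thread and is not LLVM's code. It only illustrates the classical bound for schoolbook long division: when the divisor's leading digit is at least half the base, dividing the two leading dividend digits by that leading digit gives an estimate between the true quotient digit and the true digit plus 2. The base of 16 and the names `EstimateBounds` and `check_quotient_estimate` are assumptions chosen to keep the exhaustive search small.

```cpp
#include <algorithm>
#include <cstdint>

// Smallest and largest observed value of (estimate - true quotient digit).
struct EstimateBounds {
    int min_diff;
    int max_diff;
};

// Exhaustively checks every normalized two-digit divisor in base 16 against
// every dividend whose quotient fits in a single digit.
// Hypothetical helper for illustration only.
inline EstimateBounds check_quotient_estimate() {
    constexpr std::uint32_t B = 16;  // illustrative digit base (assumption)
    EstimateBounds r{0, 0};          // u = 0 already yields a difference of 0
    for (std::uint32_t v1 = B / 2; v1 < B; ++v1) {       // normalized: leading digit >= B/2
        for (std::uint32_t v0 = 0; v0 < B; ++v0) {
            const std::uint32_t v = v1 * B + v0;         // two-digit divisor
            for (std::uint32_t u = 0; u < v * B; ++u) {  // u < v*B, so u / v < B
                const std::uint32_t top = u / B;         // two leading digits of u
                const std::uint32_t qhat = std::min(top / v1, B - 1);  // preliminary estimate
                const std::uint32_t q = u / v;           // true quotient digit
                const int diff = static_cast<int>(qhat) - static_cast<int>(q);
                r.min_diff = std::min(r.min_diff, diff);
                r.max_diff = std::max(r.max_diff, diff);
            }
        }
    }
    return r;
}
```

Calling `check_quotient_estimate()` should report a minimum difference of 0 and a maximum of no more than 2, meaning the estimate never undershoots and needs at most two corrections. Whether a compiler can derive this on its own for unsigned _BitInt(128) is exactly the question raised above.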
Received on 2026-01-14 09:44:02
