ISOCPP std-proposals List: Re: [std-proposals] Integer overflow arithmetic

From: Jan Schultke <janschultke_at_[hidden]>
Date: Sun, 18 Feb 2024 17:35:15 +0100

> In this case you are doing modular arithmetic, in that situation your inputs can never be larger than what you can divide.

I think you're right; this didn't occur to me.

> That is where I disagree, and what I see is the exact opposite, people think they want 128bit arithmetic, but what they really want is an easy way to do multi-word arithmetic.

We've gone over this a number of times and honestly, it doesn't really
matter whether one thinks of 128-bit operations as 128-bit or
2x64-bit. It's a philosophical question.

What's much more important is what a language/library feature lets you
express: the "bang for your buck", meaning what it costs to add to the
language versus what benefit you get from it. I'm starting to oppose
multi-word arithmetic in principle because 128-bit integers obsolete
almost all of its use cases in practice.

- You can implement mul_wide in terms of 128-bit multiplication
between two 64-bit operands.
- You can implement add_wide in terms of 128-bit addition between two
64-bit operands.
- You can implement div_wide in terms of 128-bit division with a 64-bit divisor.
- You can add [[assume]] to match the semantics of your function exactly.
- You can implement rem_wide in terms of 128-bit division.
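
As a rough illustration of the bullets above, here is a minimal sketch
of such wrappers. It assumes the GCC/Clang extension type
unsigned __int128 as a stand-in for a standard 128-bit integer. The
struct wide_result and the exact signatures are hypothetical and are
not taken from any proposal.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: unsigned __int128 is a GCC/Clang extension, not standard C++.
__extension__ typedef unsigned __int128 u128;

// Hypothetical result type: a 128-bit value split into two 64-bit words.
struct wide_result {
    std::uint64_t low;
    std::uint64_t high;
};

// 64 x 64 -> 128-bit multiplication.
constexpr wide_result mul_wide(std::uint64_t x, std::uint64_t y) {
    const u128 p = static_cast<u128>(x) * y;
    return { static_cast<std::uint64_t>(p), static_cast<std::uint64_t>(p >> 64) };
}

// 64 + 64 -> 128-bit addition; the high word is the carry (0 or 1).
constexpr wide_result add_wide(std::uint64_t x, std::uint64_t y) {
    const u128 s = static_cast<u128>(x) + y;
    return { static_cast<std::uint64_t>(s), static_cast<std::uint64_t>(s >> 64) };
}

// 128 / 64 -> 64-bit quotient.
// Precondition: high < divisor, so that the quotient fits into 64 bits.
// In C++23, [[assume(high < divisor)]]; could state this to the optimizer.
inline std::uint64_t div_wide(std::uint64_t high, std::uint64_t low,
                              std::uint64_t divisor) {
    assert(high < divisor);
    const u128 n = (static_cast<u128>(high) << 64) | low;
    return static_cast<std::uint64_t>(n / divisor);
}

// 128 % 64 -> 64-bit remainder; always fits, only divisor != 0 is required.
inline std::uint64_t rem_wide(std::uint64_t high, std::uint64_t low,
                              std::uint64_t divisor) {
    assert(divisor != 0);
    const u128 n = (static_cast<u128>(high) << 64) | low;
    return static_cast<std::uint64_t>(n % divisor);
}
```

Each wrapper is a widening cast, one built-in operator, and a split
back into words.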

What's the point of any of these functions then, if the user can
implement them trivially themselves, using a much more ergonomic, more
powerful, and more portable feature?

The codegen for some 128-bit operations is not optimal yet, but this
is 100% a quality-of-implementation (QoI) issue. A proposal can't
succeed on QoI issues alone. For example, you could easily contribute
a patch to __umodti3 in GCC and Clang which implements 128-bit
remainder in terms of two DIV instructions, like in the trick you've
shown.
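
The trick itself is not quoted in this message. One well-known
two-step reduction for a 64-bit divisor, which may or may not be
exactly what was shown, can be sketched as follows. It again assumes
the GCC/Clang extension type unsigned __int128; the function name is
hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: unsigned __int128 is a GCC/Clang extension, not standard C++.
__extension__ typedef unsigned __int128 u128;

// Remainder of the 128-bit value (high:low) by a 64-bit divisor d, in two steps.
// Uses the identity (high * 2^64 + low) mod d == ((high mod d) * 2^64 + low) mod d.
inline std::uint64_t rem_two_step(std::uint64_t high, std::uint64_t low,
                                  std::uint64_t d) {
    assert(d != 0);
    // Step 1: reduce the high word with an ordinary 64-bit remainder.
    const std::uint64_t reduced_high = high % d;
    // Step 2: reduced_high < d, so the quotient of (reduced_high:low) / d fits
    // into 64 bits. This is the case a single 128-by-64-bit hardware divide
    // can handle; whether a compiler emits one is a QoI matter.
    const u128 n = (static_cast<u128>(reduced_high) << 64) | low;
    return static_cast<std::uint64_t>(n % d);
}
```

The result always agrees with a direct 128-bit remainder; the only
difference is that the second step can never overflow a 64-bit
quotient.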

> So does every single compiler that provide 128bit arithmetic,
> Look here:
> https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/builtins/udivmodti4.c#L84

If 128-bit is the use case, then just give me 128-bit integers. It's
completely pointless to go through these weird middle-men like
div_wide.

I'm not saying that this function isn't useful in principle, but it's
not useful in a way that warrants standardization. In the case of
implementing 128-bit arithmetic, just give me 128-bit integers.
128-bit numbers express 128-bit operations a thousand times better
than two 64-bit numbers.

However, more generally, your proposal is about abstracting hardware
operations. The problem here is that there is only a single
architecture which supports this operation in the first place. Every
architecture except x86_64 will have to software-emulate div_wide,
making standardization pointless.

Just write inline assembly. The people who are using _udiv128 and
other flavors of this are already doing so in projects that make heavy
use of x86 intrinsics anyway. You won't help these people much by
replacing one out of thousands of intrinsics with a standard function.

> In my opinion this is an extremely basal function. I shouldn't have to write assembly or re-invent the wheel to get it.

You can make the same case for every AVX-512 instruction as well. Why
don't we have a standard library wrapper for all AVX-512 instructions?
All other architectures can just software-emulate the functionality.

Furthermore, I think it's quite a stretch to call div_wide "extremely
basal". The fact that it's literally UB for a large number of inputs
and can only be used in specific cases makes it not so basal in my
opinion.

Received on 2024-02-18 16:35:27