Re: [std-proposals] Signed Overflow

From: Hans Åberg <haberg_1_at_[hidden]>
Date: Fri, 5 Sep 2025 11:13:56 +0200
> On 5 Sep 2025, at 10:35, David Brown <david.brown_at_[hidden]> wrote:
>
> On 05/09/2025 10:01, Hans Åberg wrote:
>>> On 4 Sep 2025, at 16:48, David Brown via Std-Proposals <std-proposals_at_[hidden]> wrote:
>>>
>>> Let's pretend we have 4-bit ints - because it makes the numbers much smaller and easier.
>>>
>>> If you write "a + b" with wrapping, and get a result of -2, how do you know the result is correct? How do you know the correct mathematical sum of "a" and "b" is -2, and not 14? You don't - you have lost information when the overflow occurred. You do not get the right answer for integer summation.
>> If you need the overflow values, it is best to use the functions in Tiago Freire's proposal, as they can use assembler if available. It would be more in the spirit of C++ if one could set a trap when needed and call a function that, for example, throws an exception. I think this may be possible for floating-point types (on x86_64 and arm64).
>> https://isocpp.org/files/papers/P3161R4.html
>
> C++26 already has ckd_add, ckd_sub and ckd_mul (from C23). These would surely be good enough to get the overflow result if you want it? You could argue that the syntax and naming are not as nice as they could be - returning a tuple of (carry out, wrapped sum) might be better.
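For reference, usage looks roughly like this (my own sketch; as I understand it, ckd_add is a type-generic facility from C23's <stdckdint.h>, returning true on overflow and storing the wrapped value):

#include <stdckdint.h>  // C23 checked-arithmetic header, adopted for C++26

// Wrapped sum plus an explicit overflow report, built on ckd_add.
inline int add_report(int a, int b, bool& overflowed)
{
    int sum;
    overflowed = ckd_add(&sum, a, b);  // true iff a + b overflowed; sum wraps
    return sum;
}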

It is good to have a variety of such functions, stripping away unneeded features. For example, ARM64 has a single-word madd (multiply-add) instruction, and a separate instruction to get the high word. But madd does not set a carry bit, so in multiprecision multiplication one must first form the two-word multiplication result and then add the carry to that two-word value. If one does not need the carry, one should not be forced to use a carry-producing function.
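As a minimal sketch (mine, not from any proposal), here is that step in portable C++, using 32-bit words so the two-word product fits in a 64-bit integer:

#include <cstdint>

// One step of schoolbook multiprecision multiplication: form the
// two-word product a*b, then add the incoming carry. The maximum value,
// (2^32 - 1)^2 + (2^32 - 1), still fits in 64 bits, so this cannot overflow.
inline void mul_add_step(std::uint32_t a, std::uint32_t b,
                         std::uint32_t carry_in,
                         std::uint32_t& low, std::uint32_t& carry_out)
{
    std::uint64_t t = std::uint64_t(a) * b + carry_in;
    low = static_cast<std::uint32_t>(t);              // low word of the result
    carry_out = static_cast<std::uint32_t>(t >> 32);  // carry into the next word
}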

>> If assembler is not available, for c = a + b to wrap around, a and b must be of the same sign, as otherwise the result lies between a and b. If both a and b are positive, the overflow occurs when c is less than either of a and b, since if 𝛽 is the base, the returned value is a+b-𝛽, and both a-𝛽 and b-𝛽 are negative. Similarly, but reversed, when both a and b are negative.
>> For unsigned types, there is only one overflow test to do. It turns out that it is sufficient for doing full-word multiprecision division. The reason the original division algorithm is written for 64/32-bit words may be that the modular unsigned types were not available back then, in the 1990s or earlier, when it was first written.
>> When going down to a half-word implementation, the division algorithm slows down by a factor of 4–5 at least, I think. Since I wrote it as a template, I could check by comparing it with 16-bit words. (It is also useful with 8-bit words, as that may be used in the literature.)
>> The reason for ditching the undefined behavior is probably that all current hardware architectures are modular, so it is not needed anymore from that point of view. The UNIVAC 1100/2200 series uses ones' complement and is still supported, but one could check whether it uses C++.
>> https://en.wikipedia.org/wiki/UNIVAC_1100/2200_series
>> https://en.wikipedia.org/wiki/Ones%27_complement
>
> I don't see that as a reason for removing UB - but it is certainly a good reason for picking modulo wrapping as the behaviour if you first say it should be well-defined.

To me, it sufficed to have the modular unsigned types. Without them, it would have been necessary to go down to half-words in the absence of assembly.
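To illustrate (a sketch of mine, not code from any proposal), the single test for unsigned addition described above:

#include <cstdint>

// With modular unsigned arithmetic, an overflowing a + b wraps to
// a + b - 2^64, which is smaller than both a and b, so one comparison
// recovers the carry.
inline bool add_carry(std::uint64_t a, std::uint64_t b, std::uint64_t& c)
{
    c = a + b;     // well-defined wraparound for unsigned types
    return c < a;  // true exactly when a + b >= 2^64
}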

>> The optimizations are probably a later development, as compilers became really good at them only in the 1990s or even later.
>> Dropping undefined behavior in favor of modularity would not break any user code, and compilers would only need to turn any optimizations that rely on it into options.
>
> When something is currently UB, then later giving it a defined behaviour cannot break correctness of previously correct code. But it /can/ lead to changes in efficiency, which can be an issue for some code. (Most code does not need to be maximally efficient, but some code does.)

The optimization can still be made available for such code; the code just has to avoid relying on the overflow.

> But if you first give something a defined behaviour, you can't (easily) change it or get back to UB. If, say, C++29 were to mandate wrapping overflow behaviour for signed integer arithmetic, then a compiler can't have a "make overflow UB" flag without risking perfectly good, correct C++29 becoming incorrect. It is fine for compilers to have option flags that add new semantics for existing UB (like "-fwrapv", or "-fno-strict-aliasing"), but a lot more problematic to /remove/ semantics. It is not unheard-of - gcc's "-ffast-math" could be viewed that way.

That is right: this is pretty much a one-way road. I think, though, that the undefined behavior is an anachronism, and one will never see a hardware architecture requiring it.

You could do it for C++26: it would not matter. :-)

> I am also wary of semantic differences in the language, or differences in the definitions of behaviours, being determined by compiler flags. It can encourage forks in the language - some people writing code that depends on a particular behaviour for correctness, others writing code that depends on other behaviour for efficiency. Would it not be better to have the choice explicitly in code in some manner? There has always been a way to get signed modular arithmetic - convert to unsigned types, do your calculations, and convert back to signed types. (Let's pretend no one has to divide...) But could there be a better syntax with clearer expression of intention, and more flexibility?

I know that the signed types are popular, but when I need overflow, I use only unsigned, so I cannot really tell what the public opinion is here. But if it were changed, there might be an optimization flag. The question is how important this type of optimization is in actual code. That can only be tested hands-on.
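For reference, the round-trip you describe can be packaged like this (my sketch; note that the conversion back to the signed type is only fully defined since C++20, where it reduces modulo 2^N):

// Signed wrapping addition via the unsigned detour: unsigned arithmetic
// wraps by definition, and since C++20 converting the result back to int
// is defined as reduction modulo 2^N (two's complement is mandated).
inline int wrapping_add(int a, int b)
{
    return static_cast<int>(static_cast<unsigned>(a) +
                            static_cast<unsigned>(b));
}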

> If the language (or implementation) defined overflow to wrap, then optimisation efficiency could be restored by use of contracts, contract assertions, and [[assume]] attributes. I am in favour of encouraging the use of these features in code, but I don't know how well they would work with current tools.

If it can simplify code verification, that would be the way to go, I think. It would be a future alternative to letting the compiler detect overflows, or perhaps to throwing exceptions via a trap (if that is possible).
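To sketch what that might look like (my own example, assuming C++23 [[assume]] and wrapping semantics; the function name is made up):

#include <climits>

// If overflow were defined to wrap, an assumption like this would hand
// the optimizer back the no-overflow guarantee it used to infer from
// undefined behaviour. Violating the assumption is still UB, but the
// choice becomes explicit and local to the code.
inline int fast_add(int a, int b)
{
    [[assume(b <= 0 || a <= INT_MAX - b)]];  // no wrap upwards
    [[assume(b >= 0 || a >= INT_MIN - b)]];  // no wrap downwards
    return a + b;
}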
 

Received on 2025-09-05 09:14:15