Date: Fri, 5 Sep 2025 10:35:41 +0200
On 05/09/2025 10:01, Hans Åberg wrote:
>
>> On 4 Sep 2025, at 16:48, David Brown via Std-Proposals <std-proposals_at_[hidden]> wrote:
>>
>> Let's pretend we have 4 bit ints - because it makes the numbers much smaller and easier.
>>
>> If you write "a + b" with wrapping, and get a result of -2, how do you know the result is correct? How do you know the correct mathematical sum of "a" and "b" is -2, and not 14 ? You don't - you have lost information when overflow occurred. You do not get the right answer for integer summation.
>
> If you need the overflow values, it is best to use the functions in Tiago Freire's proposal, as they can use assembler if available. It would be more C++-like if one could set a trap if needed, and call a function that, for example, throws an exception. I think this may be possible for floating-point types (on x86_64v and arm64).
> https://isocpp.org/files/papers/P3161R4.html
>
C++26 already has ckd_add, ckd_sub and ckd_mul (from C23). These would
surely be good enough to get the overflow result if you want it? You
could argue that the syntax and naming are not as nice as they could be -
returning a tuple of (carry out, wrapped sum) might be better.
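For illustration, a minimal sketch of how they could be used (treat it as a
sketch rather than a drop-in - the header and macro names are the C23 ones):

    #include <stdckdint.h>   // ckd_add / ckd_sub / ckd_mul, from C23
    #include <cstdio>

    int main() {
        int a = 2'000'000'000, b = 2'000'000'000;
        int sum;
        // ckd_add returns true if the mathematical result did not fit;
        // 'sum' then holds the wrapped two's-complement value.
        if (ckd_add(&sum, a, b))
            std::printf("overflow, wrapped result = %d\n", sum);
        else
            std::printf("sum = %d\n", sum);
    }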
> If assembler is not available, then for c = a + b to wrap around, a and b must have the same sign, as otherwise the result is in the interval [a, b]. If both a and b are positive, overflow has occurred exactly when c is less than a (or equivalently, less than b): with B as the base, the wrapped value is a+b-B, and both a-B and b-B are negative. Similarly, but reversed, when both a and b are negative.
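A minimal sketch of that test (my own names; it assumes, for the sake of the
example, that signed overflow is defined to wrap, e.g. gcc/clang with -fwrapv):

    // Wrapping can only happen when a and b have the same sign.  For two
    // non-negative operands the wrapped sum is smaller than either operand;
    // for two negative operands it is larger.
    bool wrapped_add(int a, int b, int* c) {
        *c = a + b;                        // assumed well-defined (wrapping)
        if (a >= 0 && b >= 0) return *c < a;
        if (a < 0 && b < 0)   return *c > a;
        return false;                      // mixed signs can never wrap
    }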
>
> For unsigned types, there is only one overflow test to do. It turns out that it is sufficient for doing full-word multiprecision division. The reason the original division algorithm is written for 64/32-bit words may be that the modular unsigned types were not available back then, in the 1990s or earlier, when it was first written.
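As a sketch (my own naming), that single test is the usual building block for
multiword arithmetic:

    // Unsigned addition is modular, so the only check needed is c < a:
    // the sum wrapped exactly when it came out smaller than an operand.
    unsigned add_with_carry(unsigned a, unsigned b, unsigned* carry_out) {
        unsigned c = a + b;                // well-defined modulo 2^N
        *carry_out = (c < a) ? 1u : 0u;
        return c;
    }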
>
> When going down to a half-word implementation, the division algorithm slows down by a factor of 4–5 at least, I think. Since I wrote it as a template, I could check by comparing it with 16-bit words. (It is also useful with 8-bit words, as that may be used in the literature.)
>
> The reason for ditching the undefined behavior is probably that all current hardware architectures are modular, so it is not needed anymore from that point of view. The UNIVAC 1100/2200 series uses ones' complement and is still supported, but I could see if it uses C++.
> https://en.wikipedia.org/wiki/UNIVAC_1100/2200_series
> https://en.wikipedia.org/wiki/Ones%27_complement
>
I don't see that as a reason for removing UB - but it is certainly a
good reason for picking modulo wrapping as the behaviour if you first
say it should be well-defined.
> The optimizations are probably a later development, as compilers only became really good at them in the 1990s or even later.
>
> Dropping undefined behavior in favor of modularity would not break any user code, and compilers would only need to turn any optimizations that rely on it into options.
>
When something is currently UB, then later giving it a defined behaviour
cannot break correctness of previously correct code. But it /can/ lead
to changes in efficiency, which can be an issue for some code. (Most
code does not need to be maximally efficient, but some code does.)

But if you first give something a defined behaviour, you can't (easily)
change it or get back to UB. If, say, C++29 were to mandate wrapping
overflow behaviour for signed integer arithmetic, then a compiler can't
have a "make overflow UB" flag without risking perfectly good, correct
C++29 becoming incorrect. It is fine for compilers to have option flags
that add new semantics for existing UB (like "-fwrapv", or
"-fno-strict-aliasing"), but a lot more problematic to /remove/
semantics. It is not unheard-of - gcc's "-ffast-math" could be viewed
that way.
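As a small illustration of the efficiency point (a made-up function, not
anything from real code):

    // With signed overflow as UB, a compiler may fold this whole function to
    // 'return true'; with wrapping defined (e.g. -fwrapv) it must keep the
    // comparison, because x + 1 wraps to INT_MIN when x == INT_MAX.
    bool plus_one_is_greater(int x) {
        return x + 1 > x;
    }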
I am also wary of semantic differences in the language, or differences
in the definitions of behaviours, being determined by compiler flags.
It can encourage forks in the language - some people writing code that
depends on a particular behaviour for correctness, others writing code
that depends on other behaviour for efficiency. Would it not be better
to have the choice explicitly in code in some manner? There has always
been a way to get signed modular arithmetic - convert to unsigned
types, do your calculations, and convert back to signed types. (Let's
pretend no one has to divide...) But could there be a better syntax
with clearer expression of intention, and more flexibility?
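As a sketch of the convert-compute-convert-back idiom (the conversion back to
a signed type has itself been defined as modular since C++20):

    // Well-defined signed wrapping addition via unsigned arithmetic.
    // Unsigned addition is modular, and converting the result back to int
    // is value-preserving modulo 2^N now that two's complement is mandated.
    int wrapping_add(int a, int b) {
        return static_cast<int>(static_cast<unsigned>(a)
                                + static_cast<unsigned>(b));
    }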
If the language (or implementation) defined overflow to wrap, then
optimisation efficiency could be restored by use of contracts, contract
assertions, and [[assume]] attributes. I am in favour of encouraging
the use of these features in code, but I don't know how well they would
work with current tools.
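A rough sketch of the kind of thing I mean with [[assume]] (C++23), if
wrapping were the default behaviour:

    #include <climits>

    int next(int i) {
        // The assumption promises the optimiser that the addition below
        // cannot reach the wrap-around case, restoring the freedom it
        // would otherwise only have had under UB.
        [[assume(i < INT_MAX)]];
        return i + 1;
    }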
Received on 2025-09-05 08:35:51