Re: [std-proposals] D3666R0 Bit-precise integers

From: Oliver Hunt <oliver_at_[hidden]>
Date: Wed, 03 Sep 2025 12:09:48 -0700
> On Sep 3, 2025, at 4:53 AM, David Brown <david.brown_at_[hidden]> wrote:
>>>
>>> Do we /need/ UB on signed arithmetic overflow? No. Do we /want/ UB on signed arithmetic overflow? Yes, IMHO. I am of the opinion that it makes no sense to add two negative numbers and end up with a positive number. There are very, very few situations where wrapping behaviour on signed integer arithmetic is helpful - making it defined as wrapping is simply saying that the language will pick a nonsensical result that can lead to bugs and confusion, limit optimisations and debugging, and cannot possibly give you a mathematically correct answer, all in the name of claiming to avoid undefined behaviour.
>> This as argument for unspecified or implementation defined behavior, not introducing a brand new type, with *all* of the known security issues of `int` (that we decided were not necessary for the performance of unsigned integers).
>
> "int" does not, in itself, have "security issues". Incorrect programming can have security implications. Overflow - whether wrapping or UB, or anything else - is usually a bug in the code. Giving a fixed definition to the incorrect value does not magically fix the error in the code.

No. Any new feature that introduces UB for no reason is a security issue, for the same reason that new types that introduce lifetime issues when used in a trivially obvious way (think: returning a string_view) are security issues.

Saying “it’s not a security issue because we’ve simply stated that completely defined and deterministic behavior is not defined and deterministic” is a safety issue. It further perpetuates the current state of adversarial C++ compilers.

>
> IME giving signed integer arithmetic overflow a defined behaviour never improves code quality - it does not fix bugs, it does not "improve security", it does not allow any new coding techniques of merit. It /does/ have an impact on code efficiency, and it /does/ have an impact on code debugging and static error checking - and those last two far outweigh any imaginary "security" improvements.

Integer overflow does have defined behavior elsewhere: languages without backwards-compatibility constraints define overflow as trapping, and provide language-specified mechanisms to perform wrapping arithmetic. Yay for them.

In C++ we have refused to address this in any standardized way: there is *no* mechanism to say “I want wrapping” or “I do not want wrapping”, and no direct way to perform wrapping signed arithmetic - the closest you can get is round-tripping through unsigned by hand.
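
What that hand-rolled round trip looks like is sketched below; this is my own illustration, not a proposed API, and the conversion back to the signed type is only guaranteed to be modular since C++20:

    #include <cstdint>
    #include <limits>

    // Hand-rolled wrapping addition for signed 32-bit integers: do the work in
    // unsigned arithmetic (which the standard defines as wrapping), then
    // convert back. The conversion is two's-complement modular in C++20;
    // before that it was implementation-defined.
    constexpr std::int32_t wrapping_add(std::int32_t a, std::int32_t b) noexcept {
        return static_cast<std::int32_t>(
            static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b));
    }

    static_assert(wrapping_add(std::numeric_limits<std::int32_t>::max(), 1)
                  == std::numeric_limits<std::int32_t>::min());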

There is one *potentially* unsafe overflow behavior: wrapping.

By insisting on “well, we have one potentially unsafe outcome, why not make it even worse by having the compiler actively pretend that well-defined, deterministic behavior is not defined”, we ensure that even intentional overflow - by a developer who recognizes the possibility of overflow and knows its exact semantics - now means fighting an actively hostile compiler.

Our options are: “overflow may be an error” or “overflow may be an error *and* the compiler may change the semantics of code that is well defined on the target platform”. One of these options has one potential safety problem, the other has more than one; i.e. one of these choices is definitionally less safe than the other.
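
A minimal illustration of that second option (my example, assuming a typical optimizing compiler): the check below is meaningful on every two's-complement machine, yet because signed overflow is UB the compiler is entitled to assume it cannot happen and fold the function to "return false;".

    // A post-hoc overflow check that is well defined on every two's-complement
    // machine. Because x + 1 is UB when x == INT_MAX, the compiler may assume
    // the overflow never occurs and optimize the whole body to "return false;".
    bool add_one_wrapped(int x) {
        int y = x + 1;
        return y < x;
    }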

Intentionally adding new foot guns and UB to the language further reinforces the external perception that the C++ WG is not taking language safety seriously.

>
> Unsigned arithmetic is wrapping in C (and therefore C++), because very occasionally you /do/ want wrapping behaviour. So C provides that. The speed impact is not usually an issue because you don't usually do the same kind of arithmetic on unsigned types - and when you use them for things like loop indices you are typically using size_t of the same size as pointers. (And the compiler can assume your arithmetic there doesn't overflow, because memory does not wrap.)

Signed arithmetic overflow is well defined on all hardware, and developers know the exact semantics. We insist that signed overflow is UB solely to optimize specific code styles. My view is that wrapping got through for unsigned arithmetic, but not for signed, because most benchmarks use signed ints for their induction variables: leaving signed overflow UB permits optimizations on machines that only perform 64-bit arithmetic, and defining it would add logic that undermines those specific optimizations.
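
The induction-variable case usually looks something like the loop below (my illustration, not taken from any particular benchmark): with signed overflow as UB the compiler may assume the loop runs exactly n + 1 times and keep `i` widened in a 64-bit register; with defined wrapping, n == INT_MAX would make the loop infinite, so that assumption is invalid.

    // With signed overflow as UB, the compiler may assume 'i' never wraps, so
    // the loop executes exactly n + 1 times; that enables widening 'i' to a
    // 64-bit induction variable and vectorizing. With wrapping semantics,
    // n == INT_MAX makes this loop infinite, and the assumption no longer holds.
    void zero_through(int* a, int n) {
        for (int i = 0; i <= n; ++i)
            a[i] = 0;
    }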

I have certainly _never_ heard a justification for signed overflow being UB that was not “it allows optimizations” - I have never heard any argument that developers expect UB, or that there are machines that produce undefined results on this overflow.

If this was _really_ an argument about (quoting from prior to this paragraph) "Overflow - whether wrapping or UB, or anything else - is usually a bug in the code”, then we would not be saying “therefore we should make it worse”, we would be saying “let’s make it possible for a developer to be explicit about whether they expect or do not expect overflow”. The fact that we are still even considering adding new examples of this kind of problem does not make sense.
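
For what “be explicit about overflow” could mean, here is a sketch in portable C++ as it exists today (the name and shape are purely illustrative, not a proposal):

    #include <limits>
    #include <optional>

    // Explicit, checked addition: report overflow instead of wrapping or
    // invoking UB. The caller states in the code that overflow is possible
    // and must be handled.
    std::optional<int> checked_add(int a, int b) {
        if (b > 0 && a > std::numeric_limits<int>::max() - b) return std::nullopt;
        if (b < 0 && a < std::numeric_limits<int>::min() - b) return std::nullopt;
        return a + b;
    }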

> Some languages - like Zig - make unsigned integer arithmetic overflow UB as well, using the default operators. And then they provide specific operators for wrapping integer arithmetic for signed and unsigned types, which I think is a good solution. After all, it is the /operations/ that overflow, not the types.

Zig makes many choices that ignore decades of security problems, and is fundamentally a niche language.

Look at the major new languages like Rust or Swift: overflow has defined behavior.

Older languages that are still newer than C/C++ - Java, the .NET-targeting languages, Ada, Delphi, etc. - all give overflow defined behavior. Their age seems to dictate wrapping vs. trapping, but the behavior is guaranteed, deterministic, and does not force a developer to constantly remind themselves that a _fundamental_ operation is UB.

[I was tempted to include JS in this list, just because lol it is technically well defined :D]

>> “It might be possible to optimize this” needs to stop being a justification for UB.
>
> If you were paying attention, I gave multiple reasons for the benefits of UB.

I must have missed them (and I’ve somehow lost part of this thread, as I’m fairly sure that despite what my mail client said, this thread does not *start* with me replying to someone :D) - I saw performance, “this code is already probably wrong”, and I saw “despite developers knowing that integer arithmetic is not arbitrary precision, they expect us to be performing arbitrary precision arithmetic”.

> And if you are not interested in optimising and generating efficient code, pick a language that supports big integers that can't overflow.

Rust generates code with performance comparable to C and C++, and does not permit overflow. For pure numerics Swift also produces comparable performance (Swift’s object model impacts perf in different ways, but it can still be used as a systems language without performance problems in practice).

But we are not even arguing about “can overflow occur”, we’re saying “C and C++ have overflowing arithmetic, but the compiler can pretend it does not”.

In that case, you are again just saying “by pretending a computer does not work as it objectively does, I can make code faster” - which *again* demonstrates that the arguments, other than “by pretending the computer operates other than it actually does, we can go faster”, are specious.
>
> The argument that should be thrown out, IMHO, is the "undefined bad, defined good" mantra and the mentality that goes with it. It is just another "zero tolerance" rule used to justify zero thinking. It has all the validity of someone claiming their PC is "secure" because they update the OS every week.

Right, except that path simply reinforces “Despite claiming to care about safety and security, the C++ committee objectively does not as they are continuing to add more of the constructs that are responsible for much of the existing safety and security problems in the language. Stop using C++, migrate to newer languages, and don’t start new projects or products in C++”.

UB should not be used just because it permits optimizations - many of the optimizations it currently enables can be recovered with minor code changes, and the language should add affordances for the cases where they cannot be.

It is very clear from this comment that you do not actually care about platform security: you’re literally saying “let’s continue to add features to C++ that replicate existing concepts that we know lead to security vulnerabilities”, and then saying that software is not secure.

> Write your arithmetic code /correctly/, and it will not overflow.

Ah, the classic “write 100% perfect software, 100% of the time, while being aided by a compiler that introduces security vulnerabilities because the obvious behavior of the code has been *defined* as erroneous. Why not have one possible error path, when we could have more?"

> Then it doesn't matter a bugger what the language says about overflows - defined, undefined, saturating, triggering emails to your boss, or whatever - because the overflow never happens.

Overflow does happen. Errors happen. A programming language has a number of choices: it can make errors not exploitable, it can leave the errors as written, or it can pretend that the behavior cannot happen, even when it is expected or the developer knows the machine behavior, and change the plain semantics of the source code - not because the code is wrong on any machine at all, but because the language has just declared in spite of all available evidence that the overflow cannot happen.

> The /only/ situation in which overflow behaviour matters is if your code is buggy and the calculations actually overflow - in which case, defined wrapping is just another flavour of "wrong" that is harder to catch than UB.

The code is not buggy; you have defined it as erroneous because you don’t like it, and the majority of languages have defined overflow semantics. You are not arguing for “overflow is an error”, you are arguing for *the compiler can assume it is an error, irrespective of developer intent*.

>
>> The fact that we were willing to adopt 2s complement for unsigned arithmetic implies that the real world benefit of leaving this UB for signed arithmetic is extremely limited.
>> This also ignores the increasing prevalence of bounds checks which by definition don’t get to pretend that overflow is not well defined on all hardware.
>
> You are not just clutching at straws now, you are making these straws out of thin air.
>
> Bounds checks can definitely be a good thing - especially during development and fault-finding, but also at run-time for a lot of types of program.

This comment seems to imply a lack of awareness of the current state of software development. In new languages bounds checking is the default. In the C++ standard library we have added hardened preconditions - the only reason they are not the default is that many committee members opposed making them so. In addition to the hardened runtime proposal, we have also published `-fbounds-safety` in Clang to make bounds checking automatic in C as well, because there is C code that cannot use the C++ standard library; projects that define their own basic datatypes perform bounds checks too.

Of course, this is separate from the new systems programming languages where avoiding a bounds check requires significant, and clearly visible and auditable, steps by the developer.
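
As a small concrete example of the distinction (plain standard C++, nothing hardened required): `at()` is specified to bounds-check and throw, while `operator[]` with a bad index is UB unless a hardened library mode turns the precondition violation into a guaranteed trap.

    #include <cstddef>
    #include <vector>

    // at() is required to check the index and throw std::out_of_range;
    // operator[] on an out-of-range index is UB (a hardened standard library
    // configuration may instead turn it into a deterministic trap).
    int read_checked(const std::vector<int>& v, std::size_t i) {
        return v.at(i);
    }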

> These checks are mostly entirely independent of overflow behaviour and any particular hardware implementation - a check that a container index is valid is in no sense reliant on wrapping overflow!

Yeah, in this case I am a muppet - I recall there being some weird case where codegen is bad, but I _think_ that was actually an optimizer bug that I conflated with modern reality.

>
> Except...
>
> If signed integer arithmetic overflow is UB, then a compiler can easily add run-time checks such as with gcc and clang's "-fsanitize=signed-integer-overflow" flag. The developer has a quick, simple and efficient way to check for overflow problems - as long as some ignorant developer has not been using "-fwrapv" to sweep the bugs under the rug to wait for after customer deployment.

No, we’re talking about C++, not a different language - compiler flags that turn behavior the spec says is UB into defined behavior make the language no longer C++. Because of that, code that is correct with that flag is not correct C++, specifically because C++ says the impacted code is erroneous.

When discussing the properties of a language feature or operation, you need to confine the discussion to the language, not the behavior of non-standard dialects. When people are discussing the safety profile of C++ it is not acceptable to say “well, that problem you raised is solved if you use this compiler flag to change the language”. That is not a selling point; it is another example of “even the compiler writers know that this part of the specification is bad, but it was still defined that way”.

If the language adds a new feature, and the compiler authors feel the need from day 1 to add modes that change its behavior, that language feature has been defined incorrectly.

> I would hope you and others of your mindset have heard the phrase "garbage in, garbage out". The way to avoid it is to stop putting garbage in. It does not help to say "garbage in, modulo reduced garbage out”.

A developer relying on an operation that has defined behavior on every system that has ever existed is not putting garbage in. A specification that declares operations with well-defined, deterministic behavior to be undefined is the source of the initial garbage.

—Oliver


