Reduce undefined behavior of signed integer literal arithmetic operations

 

Abstract

 

Apply integral promotion to arithmetic operations on signed integer literals to reduce undefined behavior.

Background

According to:

basic.fundamental/1 : The range of representable values for a signed integer type is -2<sup>N-1</sup> to 2<sup>N-1</sup> - 1 (inclusive), where N is the width of the type.

basic.fundamental/2 : Overflow for signed arithmetic yields undefined behavior.

expr.pre/4 : If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.

In the following code, every line has undefined behavior:

 

  auto a = INT_MAX + 1;

  auto b = -INT_MIN;

  long long c = INT_MAX + 1;

  long long d = -INT_MIN;

 

GCC and Clang can diagnose that both `INT_MAX + 1` and `-INT_MIN` have undefined behavior, while MSVC diagnoses only `INT_MAX + 1`.
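
For comparison, the following is a minimal sketch of how the same mathematical values can be obtained today without undefined behavior, by widening one operand before the operation. The function names are hypothetical, and the sketch assumes that `long long` is wider than `int`, which holds on common platforms but is not guaranteed.

```cpp
#include <climits>

// Signed overflow is not allowed in a constant expression, so uncommenting
// the next line is expected to make the program ill-formed.
// constexpr int bad = INT_MAX + 1;

// Hypothetical helpers: widen first, then operate.
// Assumption: long long is wider than int on the target platform.
constexpr long long next_after_int_max() {
    return static_cast<long long>(INT_MAX) + 1;
}

constexpr long long negated_int_min() {
    return -static_cast<long long>(INT_MIN);
}

// Both results lie just outside the range of int.
static_assert(next_after_int_max() > INT_MAX);
static_assert(negated_int_min() == next_after_int_max());
```

The proposal aims to make this widening happen implicitly when all operands are literals.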

Solution

Add a rule that, when all operands of an operator are literals, integral promotion is applied so that the type of the result is wide enough to hold the resulting value. If none of the standard signed integer types is large enough to hold the value, the implementation may use an implementation-defined extended signed integer type. The program is ill-formed if no integer type is large enough to hold the value (as with the rules for integer literals in lex.icon/4).

 

    auto a = INT_MAX + 1; // type of a is long, long long, or an extended signed integer type

    auto b = 1;

    auto c = INT_MAX + b; // type of c is still int

    auto d = int{1} + INT_MAX; // still int, the compiler may give a warning

    auto e = LONG_MAX + 1; // may be equivalent to 2147483647L + 1; type of e is long long or an extended signed integer type
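
The type selection described above can be approximated in current C++ with a small metafunction. This is only a sketch: `first_fit_t`, `int_max_plus_one`, `sum_type`, and `sum_is_widened` are hypothetical names that are not part of the proposal, extended integer types are not considered, and `long long` is assumed to be wider than `int`.

```cpp
#include <climits>
#include <type_traits>

// Hypothetical helper: the first of int, long, long long that can represent
// V, following the same order used for unsuffixed decimal integer literals.
template <long long V>
using first_fit_t = std::conditional_t<
    (V >= INT_MIN && V <= INT_MAX), int,
    std::conditional_t<(V >= LONG_MIN && V <= LONG_MAX), long, long long>>;

// The mathematical value of INT_MAX + 1, computed in long long so that the
// addition itself does not overflow (assumes long long is wider than int).
constexpr long long int_max_plus_one = static_cast<long long>(INT_MAX) + 1;

// The type the sketch selects for that value.
using sum_type = first_fit_t<int_max_plus_one>;

// Small values stay int; the overflowing sum moves to a wider type.
static_assert(std::is_same_v<first_fit_t<1>, int>);
constexpr bool sum_is_widened =
    !std::is_same_v<sum_type, int> && sizeof(sum_type) > sizeof(int);
static_assert(sum_is_widened);
```

Which wider type is selected depends on the platform: where `long` is 64 bits the sketch yields `long`, and where `long` is 32 bits it yields `long long`.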

 

Furthermore, extend this rule to unsigned integer literals.

 

MSVC actually refuses to compile `-UINT_MAX`, but GCC and Clang allow it.

 

For unsigned overflow, choose an unsigned integer type that is large enough to hold the result. For unsigned underflow, choose a signed integer type that is large enough. The program is ill-formed if no integer type can hold the value.
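
The difference between the current wrapping results and the mathematical values the rule targets can be sketched as follows. The variable names are hypothetical, `unsigned long long` and `long long` merely stand in for "a large enough type", and both are assumed to be wider than `unsigned int`.

```cpp
#include <climits>

// Today: unsigned arithmetic is modular, so both results wrap around.
// The negation is written as 0u - UINT_MAX so that the sketch does not
// depend on whether a compiler accepts -UINT_MAX.
constexpr unsigned wrapped_sum = UINT_MAX + 1u;  // wraps to 0
constexpr unsigned wrapped_neg = 0u - UINT_MAX;  // wraps to 1

static_assert(wrapped_sum == 0u);
static_assert(wrapped_neg == 1u);

// Mathematical values, obtained by widening first.
// Assumption: the 64-bit types below are wider than unsigned int.
constexpr unsigned long long exact_sum =
    static_cast<unsigned long long>(UINT_MAX) + 1;  // overflow: unsigned type
constexpr long long exact_neg =
    -static_cast<long long>(UINT_MAX);              // underflow: signed type

static_assert(exact_sum > UINT_MAX);
static_assert(exact_neg < 0);
```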

Compatibility

Even if existing code relies on the undefined behavior, implementing this change will not change its result.

 

Users may see warnings about possible data loss when the wider result is converted to a narrower integer type.

 

Using such expressions with templates, or in other contexts that deduce a type, will yield a different type than before, but this is probably rare in practice.
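
A small sketch of where the type change would become visible. The overload set `pick` and the other names are hypothetical. Since `decltype` does not evaluate its operand, no overflow actually occurs here, although a compiler may still warn about the expression.

```cpp
#include <climits>
#include <type_traits>

// Today the usual arithmetic conversions give int + int the type int,
// regardless of the value.
using sum_type_today = decltype(INT_MAX + 1);
constexpr bool sum_is_int_today = std::is_same_v<sum_type_today, int>;
static_assert(sum_is_int_today);

// Hypothetical overload set: the selected overload depends on the type.
constexpr int pick(int) { return 1; }
constexpr int pick(long) { return 2; }
constexpr int pick(long long) { return 3; }

// Today the int overload is selected for a value of this type.
static_assert(pick(sum_type_today{}) == 1);
```

Under the proposed rule the expression would have a wider type, so code like this would select a different overload or deduce a different template argument.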

 

Since the operation produces a constant result at compile time, it does not affect optimization and does not depend on the platform. The reasons given on Stack Overflow for why signed overflow remains undefined behavior therefore do not apply here.

 

For unsigned integer types, this may change `0 - ULLONG_MAX` from well-defined to ill-formed, or give it the largest signed integer type that the implementation provides, in which case the result remains well-defined.
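
For reference, a minimal sketch of the current, well-defined behavior of that expression (the name `diff` is hypothetical):

```cpp
#include <climits>
#include <type_traits>

// 0 is converted to unsigned long long, and the subtraction wraps
// modulo 2^N, where N is the width of unsigned long long.
constexpr auto diff = 0 - ULLONG_MAX;

static_assert(std::is_same_v<decltype(diff), const unsigned long long>);
static_assert(diff == 1);
```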

Wording

Add new rules in expr.arith.conv or conv.prom to match these cases.