Is signed integer overflow undefined behaviour or implementation defined?

870 views Asked by At
#include <limits.h>

int main(){
 int a = UINT_MAX; 
 return 0;
}

I this UB or implementation defined?

Links saying its UB

https://www.gnu.org/software/autoconf/manual/autoconf-2.63/html_node/Integer-Overflow-Basics

Allowing signed integer overflows in C/C++

Links saying its Implementation defined

http://www.enseignement.polytechnique.fr/informatique/INF478/docs/Cpp/en/c/language/signed_and_unsigned_integers.html

Conversion rule says:

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

Aren't we converting a max unsigned value into a signed value?

The way I have seen it, gcc just truncates the result.

4

There are 4 answers

3
chqrlie On BEST ANSWER

Both references are correct, but they do not address the same issue.

int a = UINT_MAX; is not an instance of signed integer overflow, this definition involves a conversion from unsigned int to int with a value that exceeds the range of type int. As quoted from the École polytechnique's site, the C Standard defines the behavior as implementation-defined.

#include <limits.h>

int main(){
    int a = UINT_MAX;    // implementation defined behavior
    int b = INT_MAX + 1; // undefined behavior
    return 0;
}

Here is the text from the C Standard:

6.3.1.3 Signed and unsigned integers

  1. When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

  2. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

  3. Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

Some compilers have a command line option to change the behavior of signed arithmetic overflow from undefined behavior to implementation-defined: gcc and clang support -fwrapv to force integer computations to be performed modulo the 232 or 264 depending on the signed type. This prevents some useful optimisations, but also prevents some counterintuitive optimisations that may break innocent looking code. See this question for some examples: What does -fwrapv do?

0
Eric Postpischil On

int a = UINT_MAX; does not overflow because no exceptional condition occurs while evaluating this declaration or the expression within it. This code is defined to convert UINT_MAX to the type int for the initialization of a, and the conversion is defined by the rules in C 2018 6.3.1.3.

Briefly, the rules that apply are:

  • 6.7.9 11 says initialization behaves similarly to simple assignment: “… The initial value of the object is that of the expression (after conversion); the same type constraints and conversions as for simple assignment apply,…”
  • 6.5.16.1 2 says simple assignment performs a conversion: “In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.”
  • 6.3.1.3 3, which covers conversion to a signed integer type when the operand value cannot be represented in the type, says: “either the result is implementation-defined or an implementation-defined signal is raised.”

So, the behavior is defined.

There is a general rule in 2018 6.5 5 about exceptional conditions that occur while evaluating expressions:

If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.

However, this rule never applies in the chain above. While doing the evaluations, including the implied assignment of the initialization, we never get a result out of range of its type. The input to the conversion is out of range of the destination type, int, but the result of the conversion is in range, so there is no out-of-range result to trigger an exceptional condition.

(A possible exception to this is that the C implementation could, I suppose, define the result of the conversion to be out of range of int. I am not aware of any that do, and this is likely not what was intended by 6.3.1.3 3.)

0
dbush On

This in not signed integer overflow:

int a = UINT_MAX; 

It is a conversion from an unsigned to a signed integer type and is implementation defined. This is covered in section 6.3.1.3 of the C standard regarding conversion of signed and unsigned integer types:

1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.6

3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

An example of signed integer overflow would be:

int x = INT_MAX;
x = x + 1;

And this is undefined. In fact section 3.4.3 of the C standard which defines undefined behavior states in paragraph 4:

An example of undefined behavior is the behavior on integer overflow

And integer overflow only applies to signed types as per 6.2.5p9:

The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type

0
supercat On

In the pre-existing "language" (family of dialects) the C Standard was written to describe, implementations would generally either process signed integer overflow by doing whatever the underlying platform did, truncating values to the length of the underlying type (which is what most platforms did) even on platforms which would otherwise do something else, or triggering some form of signal or diagnostic. In K&R's book "The C Programming Language", the behavior is described as "machine-dependent".

Although the authors of the Standard have said in the published Rationale document identified some cases where they expected that implementations for commonplace platforms would behave in commonplace fashion, they didn't want to say that certain actions would have defined behavior on some platforms but not others. Further, characterizing the behavior as "implementation-defined" would have created a problem. Consider something like:

int f1(void);
int f2(int a, int b, int c);

int test(int x, int y)
{
  int test = x*y;
  if (f1())
    f2(test, x, y);
}

If the behavior of integer overflow were "Implementation Defined", then any implementation where it could raise a signal or have other observable side effects would be required to perform the multiplication before calling f1(), even though the result of the multiply would be ignored unless f1() returns a non-zero value. Classifying it as "Undefined Behavior" avoids such issues.

Unfortunately, gcc interprets the classification as "Undefined Behavior" as an invitation to treat integer overflow in ways that aren't bound by ordinary laws of causality. Given a function like:

unsigned mul_mod_32768(unsigned short x, unsigned short y)
{
  return (x*y) & 0x7FFFu;
}

an attempt to call it with x greater than INT_MAX/y may arbitrarily disrupt the behavior of surrounding code, even if the result of the function would not otherwise have been used in any observable fashion.