Why do we have unsigned and signed int types in C?


I am a beginner in C. I have recently learned about 2's complement and the other ways to represent negative numbers, and why 2's complement is the most appropriate one.

What I want to ask about is, for example:

int a = -3;
unsigned int b = -3; //This is the interesting Part.

Now, for the conversion of int to unsigned int:

The standard says:

6.3.1.3 Signed and unsigned integers

When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

The first paragraph can't be used as -3 can't be represented by unsigned int.

Therefore paragraph 2 comes into play, and we need to know the maximum value for unsigned int. It can be found as UINT_MAX in limits.h. The maximum value in this case (a 32-bit unsigned int) is 4294967295, so the calculation is:

-3 + UINT_MAX + 1 = -3 + 4294967295 + 1 = 4294967293  

Now 4294967293 in binary is 11111111 11111111 11111111 11111101, and -3 in 2's complement form is also 11111111 11111111 11111111 11111101, so they have essentially the same bit representation. It will always be the same no matter which negative integer I assign to an unsigned int. So isn't the unsigned type redundant?

Now I know that printf("%d", b) is undefined behavior according to the standard, but isn't that a reasonable and more intuitive way to do things? The representation will be the same if negative numbers are represented in 2's complement, which is what we have now; the other representations are rare and most probably will not be used in future developments.

So suppose we had only one type, say int. If int x = -1, then %d checks the sign bit and prints a negative number if the sign bit is 1, while %u always interprets the plain binary digits (bits) as they are. Addition and subtraction are already dealt with because of 2's complement. So isn't this a more intuitive and less complex way to do things?

4

There are 4 answers

0
user3528438 On

I think a major reason is that operators and operations depend on the signed-ness.

You've observed that add/subtract behave the same for signed and unsigned types if signed types use 2's complement (and you've been ignoring the fact that this "if" is sometimes not the case).

There are numerous cases where the compiler needs the signed-ness information to understand the purpose of the program.

1. Integer promotion.

When a narrower type is converted to a wider type, the compiler generates code that depends on the operands' types.

E.g. if you convert signed short to signed int and int is wider than short, the compiler would generate code that does the conversion, and that conversion is different from "unsigned short" to "signed int" (sign extension or not).

2. Arithmetic right shift

-1 >> 1 can still be -1 if the implementation chooses so (right-shifting a negative value is implementation-defined), but 0xffffffffu >> 1 must be 0x7fffffffu.

3. Integer division

Similarly, -1/2 is 0, while 0xffffffffu/2 is 0x7fffffffu.

4. 32bit multiply by 32bit, with 64 bit result:

This is a little hard to explain, so let me use code instead.

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void) {
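    // -1 and 0xffffffff share the same 32-bit pattern, but widening them
    // to 64 bits before multiplying gives different products.
    int32_t a=-1;
    int32_t b=-1;
    int64_t c = (int64_t)a * b;                      // (-1) * (-1) = 1
    printf("signed: 0x%016"PRIx64"\n", (uint64_t)c); // 0x0000000000000001

    uint32_t d=(uint32_t)-1;
    uint32_t e=(uint32_t)-1;
    uint64_t f = (uint64_t)d * e;                    // 0xffffffff * 0xffffffff
    printf("unsigned: 0x%016"PRIx64"\n", f);         // 0xfffffffe00000001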

    return 0;
}

Demo: http://ideone.com/k30nZ9

5. And of course, comparison.


One can design a signed-ness-less language, but then a lot of operators need to be split into two or more versions so that the programmer can express the purpose of the program, e.g. operator / needs to be split into udiv and sdiv, operator * needs to be split into umul and smul, integer promotion needs to be explicit, operator > needs to be split into scmpgt/ucmpgt, and so on.

That would be a horrible language to use, wouldn't it?


Bonus: all pointers usually have the same bit representation, but the operators [], ->, *, ++, --, + and - behave differently depending on the pointed-to type.

0
Muhammad Salman On

Well, the easiest and most general answer is memory management. Every variable in C reserves some space in main memory (RAM) when we declare it. For example, unsigned int var; will typically reserve 2 or 4 bytes, depending on the platform, and will range from 0 to 65,535 or from 0 to 4,294,967,295.

A signed int, on the other hand, will range from -32,768 to 32,767 or from -2,147,483,648 to 2,147,483,647.

The point is that sometimes you just need positive numbers that can never be negative, for example your age, so you would use unsigned int. Similarly, when dealing with values that can be negative and fit in the range of signed int, we use that instead. In short, it is good programming practice to use the appropriate data type for our needs, so that we use computer memory effectively and our programs are more compact.

As far as I know, 2's complement is tied to the specific data type, or more specifically to its width. From the bits alone we simply cannot determine whether a pattern is the 2's complement of some number or a plain unsigned value. The number of bytes also matters: for example, the 2's complement of 7 in 8 bits is different from the one in 32 or 64 bits.

17
Alexey Frunze On

It's handy to have both for input, output and computation. For example, comparison and division come in signed and unsigned varieties (btw, at the bit level multiplication is the same for unsigned and 2's complement signed types, just like addition and subtraction and both may compile into the same multiplication instruction of the CPU). Further, unsigned operations do not cause undefined behavior in case of overflow (except for division by zero), while signed operations do. Overall, unsigned arithmetic is well defined and unsigned types have a single representation (unlike three different ones for signed types, although, these days in practice there's just one).

There's an interesting twist. Modern C/C++ compilers exploit the fact that signed overflows result in undefined behavior. The logic is that it never happens and therefore some additional optimizations can be done. If it actually happens, the standard says it's undefined behavior, and your buggy program is legally screwed. What this means is that you should avoid signed overflows and all other forms of UB. However, sometimes you can carefully write code that never results in UB, but is a bit more efficient with signed arithmetic than with unsigned.

Please study the undefined, the unspecified and the implementation-defined behaviors. They are all listed at the end of the standard, in Annex J.

1
Stargateur On

My answer is more abstract. In my opinion, in C you should not care about the representation of integers in memory. C abstracts this away for you, and that is very good.

Declaring an integer as unsigned is very useful. It states that the value will never be negative. Just as floating-point numbers handle real numbers, signed integers handle integers and unsigned integers handle natural numbers.

When you write an algorithm in which a negative integer would lead to undefined behavior, you can be sure that an unsigned integer value will never be negative. For example, when you iterate over the indices of an array, a negative index would lead to undefined behavior.

Another case is when you create a public API and one of your functions requires a size, a length, a weight or anything else that makes no sense as a negative value. The unsigned type helps the user understand the purpose of this value.


On the other hand, some people disagree, because unsigned arithmetic doesn't work the way people first expect. When an unsigned value equal to zero is decremented, it wraps around to a very big value, while some people expect it to become -1. For example:

// wrong
for (size_t i = n - 1; i >= 0; i--) {
  // important stuff
}

This produces an infinite loop, because i >= 0 is always true for an unsigned type, and it is even worse if n equals zero. The compiler will probably detect it, but not every time:

// wrong
size_t min = 0;
for (size_t i = n - 1; i >= min; i--) {
  // important stuff
}

Doing this with an unsigned integer requires a little trick:

size_t i = n;
while (i-- > 0) {
  // important stuff
}

In my opinion, it's very important to have unsigned integers in a language, and C would not be complete without them.