How does assigning a negative number to an unsigned int work in assembly?


I learned about two's complement and about unsigned and signed int, so I decided to test my knowledge. As far as I know, a negative number is stored in two's complement form so that addition and subtraction do not need different algorithms and the circuitry stays simple.

Now if I write

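#include <stdio.h>

int main(void)
{
  int a = -1;
  unsigned int b = -1;

  /* %u with an int and %d with an unsigned int are deliberate mismatches */
  printf("%d %u \n %d %u", a, a, b, b);
  return 0;
}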

The output is -1 4294967295 -1 4294967295. I looked at the bit pattern and realized that -1 in two's complement is 11111111 11111111 11111111 11111111, so when I interpret it with %d it gives -1, but when I interpret it with %u it is treated as a positive number and gives 4294967295. I then checked the assembly generated for the code:

.LC0:
    .string "%d %u \n %d %u"
main:
    push    rbp
    mov     rbp, rsp
    sub     rsp, 16
    mov     DWORD PTR [rbp-4], -1
    mov     DWORD PTR [rbp-8], -1
    mov     esi, DWORD PTR [rbp-8]
    mov     ecx, DWORD PTR [rbp-8]
    mov     edx, DWORD PTR [rbp-4]
    mov     eax, DWORD PTR [rbp-4]
    mov     r8d, esi
    mov     esi, eax
    mov     edi, OFFSET FLAT:.LC0
    mov     eax, 0
    call    printf
    mov     eax, 0
    leave
    ret

Here the same value -1 is stored both times, for the signed and for the unsigned variable. What I want to know is: if reinterpretation is all that matters, why do we have two types, unsigned and signed? Is it only the printf conversion specifier, %d or %u, that matters?

Further, what really happens when I assign a negative number to an unsigned integer? (I learned that the initializer converts this value from int to unsigned int.) In the assembly code I did not see any such conversion. So what really happens?

And how does the machine know when it has to do two's complement and when not? Does it see the negative sign and perform two's complement?

I have read almost every question and answer that this could be considered a duplicate of, but I could not find a satisfactory explanation.


There are 5 answers

2
John Bode On

The choice of signed integer representation is left to the platform. The representation applies to both negative and non-negative values - for example, in 4 bits, if 1011₂ (-5) is the two's complement of 0101₂ (5), then 0101₂ (5) is also the two's complement of 1011₂ (-5).

The platform may or may not provide separate instructions for operations on signed and unsigned integers. For example, x86 provides different multiplication and division instructions for signed (idiv and imul) and unsigned (div and mul) integers, but uses the same addition (add) and subtraction (sub) instructions for both.

Similarly, x86 provides a single comparison (cmp) instruction for both signed and unsigned integers.

Arithmetic and comparison operations will set one or more status register flags (carry, overflow, zero, etc.). These can be used differently when dealing with words that are supposed to represent signed vs. unsigned values.

As far as printf is concerned, you're absolutely correct that the conversion specifier determines whether the bit pattern 0xFFFFFFFF is displayed as -1 or 4294967295, although remember that if the type of the argument does not match up with what the conversion specifier expects, then the behavior is undefined. Using %u to display a negative signed int may or may not give you the expected equivalent unsigned value.
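To stay clear of the undefined behavior mentioned above, one option is to convert explicitly so that every argument matches its conversion specifier. The following is a minimal sketch of that idea; the helper names as_unsigned and describe are made up for illustration and are not part of the question's code.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: int -> unsigned int conversion. The C standard
   defines the result as the value reduced modulo UINT_MAX + 1, whatever
   the signed representation is. */
static unsigned int as_unsigned(int value)
{
    return (unsigned int)value;
}

/* Hypothetical helper: formats one int both ways, with argument types
   that match %d and %u. Returns what snprintf returns. */
static int describe(char *buf, size_t size, int value)
{
    return snprintf(buf, size, "%d %u", value, as_unsigned(value));
}
```

With a 32-bit int, describe applied to -1 fills the buffer with the same two numbers the question observed, but without relying on a mismatched specifier.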

14
Zbynek Vyskovsky - kvr000 On

Signed and unsigned values are both just pieces of memory; what distinguishes them is how the operations applied to them behave.

It makes no difference for addition or subtraction, because in two's complement these operations are exactly the same.
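A small sketch of that point, assuming 32-bit two's complement integers: adding through a signed type and through an unsigned type yields the same bit pattern. The function names are illustrative only, and the signed inputs are chosen so that no signed overflow occurs.

```c
#include <stdint.h>

/* Adds two values as signed 32-bit integers. */
static int32_t add_signed(int32_t a, int32_t b)
{
    return a + b;
}

/* Adds the same bit patterns as unsigned 32-bit integers;
   unsigned arithmetic wraps modulo 2^32. */
static uint32_t add_unsigned(uint32_t a, uint32_t b)
{
    return a + b;
}
```

For example, add_signed(-1, 5) gives 4, and add_unsigned(UINT32_MAX, 5) wraps around to 4 as well, which is why a single add instruction can serve both types.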

It matters when we compare two numbers: -1 is lower than 0 while 4294967295 obviously isn't.

About conversion - for the same size, the content of one variable is simply copied to the other, so 4294967295 becomes -1. For a bigger destination size, a signed source is first sign-extended and then the content is moved.

How does the machine know? By the instructions we use. Machines either have different instructions for comparing signed and unsigned values or provide different flags for them (x86 has Carry for unsigned overflow and Overflow for signed overflow).

Additionally, note that C is relaxed about how signed numbers are stored; they don't have to be two's complement. But nowadays all common architectures store signed numbers this way.
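The comparison case can be sketched as follows, assuming 32-bit two's complement integers. The function names are hypothetical. The type in the source code is what tells the compiler which condition to test after the compare; on x86 that is typically a signed condition such as jl for the first function and an unsigned one such as jb for the second.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compares two bit patterns as signed values. */
static bool less_signed(int32_t a, int32_t b)
{
    return a < b;
}

/* Compares the same bit patterns as unsigned values. */
static bool less_unsigned(uint32_t a, uint32_t b)
{
    return a < b;
}
```

Passing the all-ones pattern and zero, less_signed(-1, 0) is true while less_unsigned(UINT32_MAX, 0) is false, even though the first argument has identical bits in both calls.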

2
supercat On

There are a few differences between signed and unsigned types:

  1. The behaviors of the operators <, <=, >, >=, /, %, and >> are all different when dealing with signed and unsigned numbers.

  2. Compilers are not required to behave predictably if any computation on a signed value exceeds the range of its type. Even when using operators which would behave identically with signed and unsigned values in all defined cases, some compilers will behave in "interesting" fashion. For example, a compiler given x+1 > y could replace it with x>=y if x is signed, but not if x is unsigned.
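The first point can be illustrated with a short sketch, assuming 32-bit integers; the function names are made up for this example. Right-shifting a negative signed value is implementation-defined in C, so only the unsigned shift is shown.

```c
#include <stdint.h>

/* Signed division truncates toward zero, so -7 / 2 is -3. */
static int32_t div_signed(int32_t a, int32_t b)
{
    return a / b;
}

/* Unsigned division treats the same bits as a large positive value. */
static uint32_t div_unsigned(uint32_t a, uint32_t b)
{
    return a / b;
}

/* Signed remainder takes the sign of the dividend, so -7 % 2 is -1. */
static int32_t rem_signed(int32_t a, int32_t b)
{
    return a % b;
}

/* Unsigned right shift always fills the vacated high bits with zeros. */
static uint32_t shr_unsigned(uint32_t a, unsigned n)
{
    return a >> n;
}
```

Here div_signed(-7, 2) is -3, whereas div_unsigned((uint32_t)-7, 2) is 2147483644, because the bit pattern of -7 reads as 4294967289 when taken as unsigned.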

As a more interesting example, on a system where "short" is 16 bits and "int" is 32 bits, a compiler given the function:

unsigned mul(unsigned short x, unsigned short y) { return x*y; }

might assume that no situation could ever arise where the product would exceed 2147483647. For example, if it saw the function invoked as unsigned x = mul(y,65535); and y was an unsigned short, it may omit code elsewhere that would only be relevant if y were greater than 32768.
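The hazard comes from integer promotion: where int is 32 bits, both unsigned short operands are promoted to signed int before the multiplication, so a large product overflows a signed type. One common way to sidestep this, sketched here under that assumption, is to convert one operand to unsigned first. The name mul_safe is hypothetical.

```c
/* Hypothetical variant of mul: converting one operand to unsigned forces
   the multiplication into unsigned arithmetic, which wraps modulo
   UINT_MAX + 1 instead of overflowing a signed int. */
unsigned mul_safe(unsigned short x, unsigned short y)
{
    return (unsigned)x * y;
}
```

With a 32-bit unsigned int, mul_safe(65535, 65535) is well defined and yields 4294836225.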

0
Tim.Gentle Access On

You seem to have missed two facts: firstly, 0101 = 5 in both signed and unsigned integer values, and secondly, you assigned a negative number to an unsigned int - something your compiler may be smart enough to realise and, therefore, correct to a signed int.

Setting an unsigned int to -5 should technically cause an error because unsigned ints can't store values under 0.

0
yagnesh On

You can understand it better if you try to assign a negative value to a larger unsigned integer type. The compiler generates assembly code that performs sign extension when transferring a smaller negative value to a larger unsigned integer.
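A minimal sketch of that widening case, assuming a 32-bit source and a 64-bit destination; the function names are illustrative. When the source type is signed, the compiler has to emit a sign-extending move; when it is unsigned, a zero-extending one is enough.

```c
#include <stdint.h>

/* Signed source: the value is converted modulo 2^64, which in two's
   complement amounts to copying the sign bit into the upper 32 bits. */
static uint64_t widen_from_signed(int32_t v)
{
    return (uint64_t)v;
}

/* Unsigned source: the upper 32 bits are simply filled with zeros. */
static uint64_t widen_from_unsigned(uint32_t v)
{
    return (uint64_t)v;
}
```

So widen_from_signed(-1) produces 0xFFFFFFFFFFFFFFFF, while widen_from_unsigned(UINT32_MAX) produces 0x00000000FFFFFFFF, although both inputs hold the same 32 bits.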

See this blog post for an assembly-level explanation.