Why does 2's complement sign extension work by adding copies of the sign bit?

3.2k views Asked by At

Let's take the example of sign-extending a 16-bit signed number into a 32-bit register, such as mov $+/-5, %ax movswl %ax, %ebx.

There are two possible cases:

  1. High bit is zero (number is positive). This is very easy to understand and intuitive. For example, if I have the number 5, left-padding with zeros is very easy to understand. For example:

                      00000000 00000101    # 5 (represented in 16 bits)
    00000000 00000000 00000000 00000101    # 5 (represented in 32 bits)
    
    
  2. However, the tricky thing for me to understand is when it's a negative number and we sign-extend. Example:

                      11111111 11111011    # -5 (represented in 16 bits)
    11111111 11111111 11111111 11111011    # -5 (represented in 32 bits)
    

Yes, I know that we just fill the upper bits with 1. But what makes that work? Perhaps sort of an explanation on what 'properties' of the binary number makes that possible would help me better understand this.

2

There are 2 answers

0
Peter Cordes On BEST ANSWER

For an n+1-bit 2's complement number:

  • The high bit (sign bit) has place-value -(2^n)
  • The next-highest bit has place value 2^(n-1), and so on (normal binary place value)

For example, in 8-bit 2's complement, the bit-pattern with just the MSB set represents a value of -128 = -(2^7). With the top two bits set, it represents -128 + 64 = -64.


When we extend by 1 bit, the original sign bit is now a "regular" bit with place value +(2^n) instead of -(2^n), so the value represented by the existing bits is now 2^n + 2^n = 2^(n+1) higher than the original value. (Or the same if that bit was zero).

The new sign bit has a place value of -(2^(n+1)), so copying the original sign bit is exactly what we need to balance that change in place-value. (Or leaves it unchanged if zero).

The procedure for one bit of course generalizes by repetition for any number of bits.

(Normally we'd use n = number of total bits, like n=8 instead of n=7 for 8-bit 2's complement. Then the MSB's place value is -2^(n-1), and changing its meaning by extending with a new sign bit adds 2^n if it was set.)


See wikipedia for more about how the bits represent values: https://en.wikipedia.org/wiki/Two%27s_complement#Converting_from_two's_complement_representation - the 2's complement article is quite good but doesn't go into detail about why copying the sign bit works.

You could also try it on paper for some small examples, like sign-extending from 4 to 5 bits. -1 (all-ones) would be a good value to start with, making the math simple. Or 0b1000 (-8) is another good choice.

Google found https://andybargh.com/binary-sign-extension/ which works through one 8-bit example.

0
Hari On

In addition to the excellent answer by Peter Cordes, we will see here about the general case of sign extending an n-bit number to an m-bit number, where m > n.

The n-bit number has the decimal value: n-bit number

While, the m-bit number has the value: m-bit number

The trailing "common terms" represents terms that are the same in both the numbers.

If you subtract the n-bit number from the m-bit number you will get: terms getting combined

As you can see the terms get combined in a pairwise manner, as you pass through them from right to left. At the end, we will have:

zero result

Thus, we can see that the sign extension preserves the value of the number in the general case.

Courtesy: https://kumom.io/articles/sign-extension