Confusion about MIPS I-type instruction sign extension


I am learning the MIPS instructions, and when I test the I-type instructions that need to sign-extend the immediate, I am confused about the following outcomes (all of them run in MARS):

  1. Say we have the source code line ori $s1, $s2, 0xfd10. MARS gives the basic assembler instruction ori $17, $18, 0x0000fd10. This is what I expect, since ori should zero-extend the 16-bit immediate. If we only change the mnemonic from ori to andi, that is, the source code line andi $s1, $s2, 0xfd10, MARS gives almost the same basic assembler instruction, andi $17, $18, 0x0000fd10. However, unlike ori, andi should use sign extending. So the basic assembler instruction is supposed to be andi $17, $18, 0xfffffd10.

andi should also zero-extend! Please ignore the first question.

  2. When I try to use slti rt, rs, imm, for example slti $s1, $s2, 0x8000, MARS refuses to assemble the line, and the error message is "0x8000": operand is out of range. I see no reason why the immediate is out of range. If I lower the immediate a bit, say slti $s1, $s2, 0x7fff, it works and the immediate is extended to 0x00007fff. My expectation is that 0x8000 should be extended to 0xffff8000. Is there anything wrong with my understanding?

Answer by Peter Cordes (best answer)

Values in asm source represent the actual number values you want to work with, not just the bit-patterns to be encoded into the instruction.

0x8000 is not the same number as 0xffff8000 so the assembler stops you from having your value munged by sign-extension. If you wanted the final instruction's machine code to encode the value 0xffff8000, you should write 0xffff8000 in the asm source for instructions that sign-extend their immediates.

In our place-value writing system for numbers, there are infinitely many implicit high 0 digits to the left of the explicit digits. So 0x8000 is the same number as 0x00008000 (+32768), and that's the number the assembler is trying to represent as a 16-bit sign-extended immediate. It doesn't fit: the largest positive value such an immediate can hold is 0x7fff.
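The range check described above can be sketched in Python. This is only a minimal model, not MARS's actual code: the helper names (as_signed32, fits_signed16, sign_extend16) are made up for illustration, and it assumes the assembler reads source literals as 32-bit values, so that 0xffff8000 means -32768.

```python
def as_signed32(value: int) -> int:
    """Interpret a literal as a 32-bit two's-complement number (assumed assembler behaviour)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def fits_signed16(value: int) -> bool:
    """True if the number is representable as a 16-bit sign-extended immediate."""
    return -0x8000 <= value <= 0x7FFF


def sign_extend16(field: int) -> int:
    """What the CPU does with the 16-bit field: copy bit 15 into bits 16..31."""
    field &= 0xFFFF
    return (field | 0xFFFF0000) if (field & 0x8000) else field


print(fits_signed16(as_signed32(0x7FFF)))      # True
print(fits_signed16(as_signed32(0x8000)))      # False: this is +32768
print(fits_signed16(as_signed32(0xFFFF8000)))  # True: this is -32768
print(hex(sign_extend16(0x8000)))              # 0xffff8000
```

The bit pattern 0x8000 does end up in the instruction when you write 0xffff8000, but only because the number you asked for really is the one the hardware will reconstruct by sign-extending.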


You're approaching this from the point of view of how I-type instructions are encoded. But assemblers are designed to handle the encoding details for you; that's part of the point of using one. For example, you write addiu $t0, $t1, -123 and the assembler encodes -123 as a 16-bit sign-extended immediate.

Or say you write ori $t0, $t0, -256, hoping to set all the bits above the low byte. The assembler rejects that, because -256 (0xffffff00) isn't encodable as a zero-extended 16-bit immediate for ori. It doesn't silently encode 0xff00, which would leave the upper 16 bits unset, as in 0x0000ff00. So you don't have to memorize how each instruction treats its immediate; the assembler checks that for you. This is an intentional feature and a good design.
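As a rough illustration of that per-instruction check, here is a simplified Python model. The function encode_imm16 and the two mnemonic sets are hypothetical, written for this sketch rather than taken from any real assembler, and the sketch ignores pseudo-instruction expansion entirely.

```python
# Which I-type instructions sign-extend and which zero-extend their immediate.
SIGN_EXTENDING = {"addi", "addiu", "slti", "sltiu"}
ZERO_EXTENDING = {"andi", "ori", "xori"}


def encode_imm16(mnemonic: str, value: int) -> int:
    """Return the 16-bit immediate field, or raise if the number can't be represented."""
    if mnemonic in SIGN_EXTENDING:
        lo, hi = -0x8000, 0x7FFF
    elif mnemonic in ZERO_EXTENDING:
        lo, hi = 0, 0xFFFF
    else:
        raise ValueError(f"unknown I-type mnemonic: {mnemonic}")
    if not lo <= value <= hi:
        # Refuse instead of silently truncating to a different number.
        raise ValueError(f"{value:#x}: operand is out of range for {mnemonic}")
    return value & 0xFFFF


print(hex(encode_imm16("addiu", -123)))   # 0xff85
print(hex(encode_imm16("ori", 0xFD10)))   # 0xfd10

for mnemonic, value in [("ori", -256), ("slti", 0x8000)]:
    try:
        encode_imm16(mnemonic, value)
    except ValueError as err:
        print("rejected:", err)
```

The same source number can be accepted by one instruction and rejected by another; the check depends only on whether the hardware's extension rule would reproduce the number you wrote.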

This matters especially if you have a large program that defines some assemble-time constants and then uses them in various ways: if tweaking one of those values resulted in an instruction not being encodable, you'd want to know about it instead of getting silently wrong results.

(I used decimal examples here, but writing numbers as hex literals doesn't change anything about how the assembler should treat them.)


Quoting the question: "However, unlike ori, andi should use sign extending."

No, in MIPS all 3 bitwise boolean instructions with immediates (ori/andi/xori) zero-extend them. (Sign extension would have been more useful in more cases for AND, allowing masks with only a few zeros in the low bits, but that's not how MIPS is designed. On the other hand, sign extension would make truncation to exactly 16 bits more expensive.)
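To make the trade-off concrete, here is a small Python sketch comparing the real zero-extending andi with a hypothetical sign-extending variant. The function andi_if_sign_extended does not correspond to any MIPS instruction; it exists only for comparison.

```python
def andi(rs: int, imm16: int) -> int:
    """Model of MIPS andi: AND rs with the zero-extended 16-bit immediate."""
    return rs & (imm16 & 0xFFFF)


def andi_if_sign_extended(rs: int, imm16: int) -> int:
    """Hypothetical variant that sign-extends the immediate (NOT real MIPS)."""
    imm16 &= 0xFFFF
    mask = (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16
    return rs & mask


rs = 0x12345678
print(hex(andi(rs, 0xFD10)))                   # 0x5410: upper 16 bits always cleared
print(hex(andi_if_sign_extended(rs, 0xFD10)))  # 0x12345410: upper bits kept
print(hex(andi(rs, 0xFFFF)))                   # 0x5678: truncation to 16 bits in one instruction
```

With zero extension, andi with 0xffff truncates to 16 bits directly, while a mask that keeps the upper half and clears a few low bits needs the constant built in a register first.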

Documentation like https://ablconnect.harvard.edu/files/ablconnect/files/mips_instruction_set.pdf confirms that andi zero-extends. I didn't check official MIPS docs, but this info is widespread on the Internet; you could also check that compilers use andi that way to implement uint16_t or similar.

See also: "andi vs. addi instruction in MIPS with negative immediate constant" (it covers MARS with extended pseudo-instructions enabled, in which case MARS will construct a full 32-bit value in another register if you use andi with a value that's not encodable as a 16-bit zero-extended immediate).