I am learning the MIPS instructions, and when I test the I-type instructions that need to sign-extend the immediate, I am confused about the following outcomes (all of them were run in MARS):
- Say we have the source code line `ori $s1, $s2, 0xfd10`. MARS gives the basic assembler instruction `ori $17, $18, 0x0000fd10`. This is expected, since `ori` should zero-extend the 16-bit immediate. If we only change the instruction from `ori` to `andi`, that is, the source code line `andi $s1, $s2, 0xfd10`, MARS gives almost the same basic assembler instruction, `andi $17, $18, 0x0000fd10`. However, unlike `ori`, `andi` should use sign extension, so the basic assembler instruction is supposed to be `andi $17, $18, 0xfffffd10`.

  (`andi` should also zero-extend! Please ignore the first question.)
- When I try to use `slti rt, rs, imm`, for example `slti $s1, $s2, 0x8000`, MARS refuses to accept the line, and the error message is `"0x8000": operand is out of range`. I see no reason why the immediate is out of range. If I lower the immediate a bit, say `slti $s1, $s2, 0x7fff`, it works and the immediate is extended to `0x00007fff`. My expectation is that `0x8000` should be extended to `0xffff8000`. Is there anything wrong with my understanding?

Values in asm source represent the actual numbers you want to work with, not just the bit patterns to be encoded into the instruction. `0x8000` is not the same number as `0xffff8000`, so the assembler stops you from having your value munged by sign-extension. If you want the final instruction's machine code to encode the value `0xffff8000`, you should write `0xffff8000` in the asm source for instructions that sign-extend their immediates.

In our place-value writing system for numbers, there are an infinite number of implicit high `0` digits to the left of the explicit digits. So `0x8000` is the same number as `0x00008000`, and that's the number the assembler is trying to represent as a 16-bit sign-extended immediate.

You're approaching this from the PoV of how I-type instructions are encoded. But assemblers are designed to handle the encoding details for you; that's part of the point of using one.

Say you write `addiu $t0, $t1, -123`: the assembler encodes `-123` as a 16-bit sign-extended immediate.

Now say you write `ori $t0, $t0, -256` to set all the bits above the low byte. The assembler rejects that because it's not encodeable as a zero-extended immediate for `ori`, instead of silently leaving the upper 16 bits unset, as in `0x0000ff00`. So you don't have to memorize how each instruction treats its immediate; the assembler checks that for you. This is an intentional feature and good design.

This matters especially in a large program that defines some assemble-time constants and then uses them in various ways: if tweaking one of those values made an instruction unencodeable, you'd want to know about it instead of getting silently wrong results.

(I used decimal examples, but writing numbers as hex numeric literals doesn't change anything about how the assembler should treat them.)

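To make the range check described above concrete, here is a minimal Python sketch of what an assembler might do with a 16-bit immediate. It is not MARS's actual code: the names `to_int32` and `encode_imm16` are made up for illustration, and it assumes source literals are treated as 32-bit two's-complement values (which is why writing `0xffff8000` means -32768).

```python
def to_int32(value):
    """Interpret a literal as a 32-bit two's-complement number (assumed behavior)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def encode_imm16(value, sign_extends):
    """Return the 16-bit immediate field for `value`.

    Raises ValueError when the hardware's extension of that field
    (sign- or zero-extension, depending on the instruction) could not
    reproduce the number written in the source.
    """
    value = to_int32(value)
    low, high = (-0x8000, 0x7FFF) if sign_extends else (0, 0xFFFF)
    if not low <= value <= high:
        raise ValueError(f"{value:#x}: operand is out of range")
    return value & 0xFFFF  # the bits actually stored in the instruction
```

Under these assumptions, `encode_imm16(0x8000, sign_extends=True)` raises (the `slti` case from the question), while `encode_imm16(0xffff8000, sign_extends=True)` and `encode_imm16(-123, sign_extends=True)` succeed. With `sign_extends=False`, as for `ori`, `0xfd10` is accepted and `-256` is rejected.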
Regarding `andi`: no, in MIPS all three bitwise Boolean instructions (`ori` / `andi` / `xori`) zero-extend their immediates. (Sign-extension would have been more useful in more cases for AND, allowing masks with only a few zeros in the low bits, but that's not how MIPS is designed. It would, however, make truncation to exactly 16 bits more expensive.)

Documentation like https://ablconnect.harvard.edu/files/ablconnect/files/mips_instruction_set.pdf confirms that `andi` zero-extends. I didn't check official MIPS docs, but this info is widespread on the Internet; you could also test to see that compilers use it that way to implement `uint16_t` or whatever.

See also "andi vs. addi instruction in MIPS with negative immediate constant" (it covers MARS with extended pseudo-instructions enabled, where MARS will construct a full 32-bit value in another register if you use `andi` with a value that's not encodeable as a 16-bit zero-extended immediate).
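
For reference, the two ways hardware can widen the 16-bit field can be modeled in a few lines of Python. This is only an illustrative sketch; the helper names `zero_extend16`, `sign_extend16`, and `andi` are invented here and model 32-bit registers as plain integers.

```python
def zero_extend16(imm16):
    """Widen a 16-bit field by filling the upper 16 bits with zeros (ori/andi/xori)."""
    return imm16 & 0xFFFF


def sign_extend16(imm16):
    """Widen a 16-bit field by copying bit 15 into the upper 16 bits (addiu/slti)."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if imm16 & 0x8000 else imm16


def andi(rs, imm16):
    """Model of andi: AND a 32-bit register value with the zero-extended immediate."""
    return rs & zero_extend16(imm16)
```

With these definitions, the field `0xfd10` becomes `0x0000fd10` under zero-extension and `0xfffffd10` under sign-extension, and `andi(0x12345678, 0xffff)` keeps only the low 16 bits, `0x5678`, which is the `uint16_t` truncation mentioned above.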