I have an assignment in which I have to explain some things about the following MIPS Assembly code:
.data
x: .word 4711
y: .word 10
z: .word 0x0A91
e: .word 0
.text
.globl main
main:
lw $2, x
lw $3, y
lw $4, z
add $2, $2, $3
sub $3, $2, $4
sw $3, e
li $2, 10
syscall
The first instruction, lw $2, x, is separated into two instructions when assembled: lui $1, 0x00001001 followed by lw $2, 0x00000000($1). I understand that lui moves the hex value 1001 into the upper half of the register, so the value stored in $1 at this point is 0x10010000, but I do not understand where the 1001 comes from, or what the second instruction means at all. I would really appreciate any help on the subject. I am using MARS to assemble and run this program.
MIPS instructions are 32 bits long, and so are the addresses used by a program.
This implies that the lw instruction cannot specify a full 32-bit address as an immediate. Simply put, the instruction lw $t, var is not valid (except for very few cases).
In fact, its encoding is
lw $t, offset($s)    1000 11ss ssst tttt iiii iiii iiii iiii
where the i bits show that only 16 bits are available to specify an address (and that a base register must always be specified; when no base is needed, the $zero register can be used).
So the assembler does this trick: whenever you use lw $t, var, it assembles that instruction into two instructions: a lui that loads the upper 16 bits of the address into $at (register $1), and a lw that uses $at as the base register with the lower 16 bits of the address as the offset. Calling those two halves ADDR_H and ADDR_L, the pair is lui $at, ADDR_H followed by lw $t, ADDR_L($at). In your program, x is the first item in .data, which MARS places at 0x10010000 by default, so ADDR_H is 0x1001 and ADDR_L is 0.
Note that since the lw reads from $at + ADDR_L, the final address used is (ADDR_H << 16) + ADDR_L = ADDR, as expected.
There is a subtlety here, pointed out by Mike Spivey (many thanks to him); see below.
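To make the split concrete, here is a small Python sketch (not something MARS itself runs) that computes the two halves for the four labels in the question. It assumes the MARS default .data base address of 0x10010000 and that the four .word values are laid out back to back, 4 bytes apart. The names DATA_BASE and split_naive are made up for this illustration. It deliberately ignores sign extension of the offset, which does not matter here because every offset is tiny.

```python
DATA_BASE = 0x10010000  # assumption: MARS default start of the .data segment


def split_naive(addr):
    """Split a 32-bit address into its upper and lower 16-bit halves."""
    return (addr >> 16) & 0xFFFF, addr & 0xFFFF


# The four .word directives from the question, assumed to be 4 bytes apart.
labels = {name: DATA_BASE + 4 * i for i, name in enumerate(["x", "y", "z", "e"])}

for name, addr in labels.items():
    hi, lo = split_naive(addr)
    # Print the lui immediate and the offset the following lw/sw would use.
    print(f"{name}: lui $at, 0x{hi:04X} ; offset 0x{lo:04X}($at)")
```

Under those assumptions every label shares the upper half 0x1001, which is where the 1001 in the lui comes from, and only the 16-bit offset changes from one variable to the next.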
Instructions of this kind, which don't map directly onto the ISA, are called pseudo-instructions. The $at register is reserved for the assembler exactly for implementing them.
In MARS you can disable pseudo-instructions by unchecking Settings > Permit extended (pseudo) instructions and formats.
While programming without pseudo-instructions will grow annoying pretty quickly, it is worth doing at least once, to fully understand the MIPS architecture.
Mike Spivey correctly noted that the 16-bit offset immediate is sign-extended before being added to the base register.
This calls for a correction of the value I called ADDR_H in case ADDR_L turns out to be negative when interpreted as a 16-bit two's complement number. If it does, ADDR_H must be incremented by one to compensate.
The general formula for ADDR_H can be corrected to ADDR_H = (ADDR >> 16) + ADDR[15], where ADDR[15] denotes the value of bit 15 of ADDR (which is the sign bit of ADDR_L).
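The corrected formula can be checked with a short Python sketch. It models lui followed by a lw whose 16-bit offset is sign-extended, and confirms that the adjusted ADDR_H rebuilds the original address. The function names are invented for this illustration, and 0x10018004 is just a made-up address whose lower half has bit 15 set. This is a model of the arithmetic only, not a description of how any particular assembler implements it.

```python
def split_address(addr):
    """Return (ADDR_H, ADDR_L) using ADDR_H = (ADDR >> 16) + ADDR[15]."""
    addr_l = addr & 0xFFFF
    bit15 = (addr >> 15) & 1                   # sign bit of ADDR_L
    addr_h = ((addr >> 16) + bit15) & 0xFFFF   # wrap to 16 bits
    return addr_h, addr_l


def sign_extend16(value):
    """Interpret a 16-bit value as a two's complement number."""
    return value - 0x10000 if value & 0x8000 else value


def effective_address(addr_h, addr_l):
    """Model lui $at, addr_h followed by lw $t, addr_l($at)."""
    at = (addr_h << 16) & 0xFFFFFFFF
    return (at + sign_extend16(addr_l)) & 0xFFFFFFFF


for addr in (0x10010000, 0x10018004):
    h, l = split_address(addr)
    assert effective_address(h, l) == addr
    print(f"0x{addr:08X} -> ADDR_H=0x{h:04X}, ADDR_L=0x{l:04X}")
```

For 0x10018004 the lower half 0x8004 is negative once sign-extended, so ADDR_H becomes 0x1002 rather than 0x1001. Using the uncorrected 0x1001 would land exactly 0x10000 bytes too low.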