So, I have learned that when we use pipelining in a CPU, we may have to tackle hazards such as a data dependency between two instructions. I do get, for example, this data dependency:
add $t0, $t1, $t2
lw $s1, 0($t0)
lw does need the correct result of $t0.
But I also saw that we consider these 2 instructions data-dependent. Why?
add $t0, $t1, $t2
lw $s1, 4($t0)
Since lw needs the correct value of mem[$t0] + 4, which is a different address, why is this considered a dependency? Maybe I do not get what lw does there?
I am thinking of it like this. Let's say:
addi $t0,$t0, 5
lw $s1, 0($t0) # now we loaded the value 5 to $s1
#BUT WHAT ABOUT THAT
li $t1, 9
addi $t0, $t0, 5
sw $t1, 4($t0) # now we go to &(mem[$t0]+4) and there we store the value of $t1, which is 9,
We needed the address of $t0, not its value (or that's at least what I understand).
Can anybody explain this to me?
Registers don't have addresses: they have names, they have a position (index) in the register file, and they hold values.
Only memory has addresses.
That lw accesses mem[$t0+4], and it needs $t0's value so it can do the +.

The lw and sw instructions compute an effective address, ea, by adding the value of register rs to the sign-extended immediate: ea = R[rs] + SignExtImm. Here, the hardware is indexing into the register file using the index rs, which is obtained from a 5-bit field called rs in the encoded instruction, the lw. The value stored in a register is 32 bits, so that 5-bit index is used to look up a 32-bit value held there. Thus, this is a read of register rs's value. The immediate is sign-extended from 16 bits in the instruction field to 32 bits before both the register's 32-bit value and the immediate are given to the ALU to be added together.

After computing the ea, the load reads memory at ea into register rt (R[rt] = M[ea]), and after computing the ea, the store writes register rt's value to memory at ea (M[ea] = R[rt]).

In order to compute the ea, we need the value of R[rs], i.e. the value held in the rs register.

This part of the operation is almost the same as if the sequence were the add followed by an addi that reads $t0, rather than a lw.
The add and addi pair have a read-after-write data dependency. This is an ALU/ALU dependency, or EX/EX, depending on whether we're talking functional units or pipeline stages. These are a hazard as they are back to back.

Since register reading normally happens in the ID (instruction decode) stage, if the last register update of rs via the WB (write back) stage was by an instruction started 3 cycles earlier, then this ID read of rs will pick up the proper value. Anything fewer means that the read in ID will not see the proper value, as it hasn't made it to the register yet.

Consider four consecutive instructions, i1 through i4, in a five-stage pipeline (IF, ID, EX, MEM, WB), with i1 fetched in cycle 1. Here i4 can read i1's register update without hazard, as those operations both occur in cycle 5: the WB of i1 happens at the beginning of clock 5, and the ID of i4 is able to see values being written in that same clock.
But if i2 or i3 reads the register targeted by i1, then there's a hazard, because their ID stages occur earlier than i1's WB stage (i2's ID is at cycle 3 and i3's ID is at cycle 4, both too early to see i1's WB at cycle 5).
So, that's how hazards happen. But let's note that the proper value needed by i2's EX stage (at cycle 4) is in the CPU and has already been computed in cycle 3 by i1's EX stage. A forward or bypass substitutes that proper value to override the stale value read in i2's or i3's ID stage.
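The cycle numbers used in this explanation can be reproduced with a small Python sketch. It is only an illustration, assuming an ideal five-stage pipeline (IF, ID, EX, MEM, WB) that fetches one instruction per cycle, never stalls, and has a register file that writes before it reads within the same cycle. The function names are made up for this sketch.

```python
# Stage order of the classic five-stage pipeline assumed in this sketch.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def stage_cycle(instr_index, stage):
    """Cycle (1-based) in which instruction i<instr_index> is in `stage`.

    Assumes i1 is fetched in cycle 1, one instruction is fetched per cycle,
    and nothing stalls.
    """
    return instr_index + STAGES.index(stage)


def id_read_sees_writeback(producer, consumer):
    """True if the consumer's ID read happens no earlier than the producer's WB.

    A same-cycle WB and ID is assumed to work (write first, then read).
    """
    return stage_cycle(consumer, "ID") >= stage_cycle(producer, "WB")


def timing_table(count=4):
    """Return a text diagram with one row per instruction, one column per cycle."""
    total_cycles = count + len(STAGES) - 1
    lines = []
    for i in range(1, count + 1):
        cells = [""] * total_cycles
        for stage in STAGES:
            cells[stage_cycle(i, stage) - 1] = stage
        row = " ".join(f"{cell:<3}" for cell in cells).rstrip()
        lines.append(f"i{i}: {row}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(timing_table())
    for consumer in (2, 3, 4):
        ok = id_read_sees_writeback(1, consumer)
        verdict = "no hazard" if ok else "hazard without forwarding"
        print(f"i{consumer} reading i1's result in ID: {verdict}")
```

Under these assumptions, i2 and i3 are flagged and i4 is not, which matches the reasoning above. A real pipeline with forwarding would resolve the i2 and i3 cases by bypassing the value from a later stage instead of waiting for WB.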
See more detail of instruction descriptions and encodings: https://inst.eecs.berkeley.edu/~cs61c/resources/MIPS_Green_Sheet.pdf
Look at the BASIC INSTRUCTION FORMATS to see the field encodings, e.g. lw and sw are both I-type instructions, so they have an rs, an rt, and an immediate field.

The lw instruction has one register source and one register target. (It also has a memory source, but we don't look at memory for data-dependency Read-After-Write hazards, we only look to registers.)

The sw instruction has two register sources and no register targets. (It also has a memory target, but we don't look at memory for data-dependency Read-After-Write hazards, we only look to registers.)
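To make the field layout and the register sources/targets concrete, here is a rough Python sketch that decodes an I-type word, lists which registers lw and sw read and write, and computes the effective address. It assumes the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate), the lw/sw opcodes 0x23 and 0x2B, and the usual register numbers ($t0 = 8, $s1 = 17). All helper names are hypothetical and exist only for this example.

```python
LW_OPCODE = 0x23  # lw
SW_OPCODE = 0x2B  # sw
T0, S1 = 8, 17    # conventional register numbers for $t0 and $s1


def sign_extend_16(imm16):
    """Interpret a 16-bit field as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def encode_i_type(opcode, rs, rt, imm):
    """Pack the fields into a 32-bit I-type instruction word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_i_type(word):
    """Split a 32-bit I-type word into opcode, rs, rt and signed immediate."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": sign_extend_16(word & 0xFFFF),
    }


def register_reads_and_writes(fields):
    """Return (registers read, registers written) for lw or sw."""
    if fields["opcode"] == LW_OPCODE:
        return {fields["rs"]}, {fields["rt"]}        # reads base, writes rt
    if fields["opcode"] == SW_OPCODE:
        return {fields["rs"], fields["rt"]}, set()   # reads base and data
    raise ValueError("this sketch only handles lw and sw")


def effective_address(fields, regs):
    """ea = R[rs] + sign-extended immediate, wrapped to 32 bits."""
    return (regs[fields["rs"]] + fields["imm"]) & 0xFFFFFFFF


if __name__ == "__main__":
    lw_word = encode_i_type(LW_OPCODE, T0, S1, 4)    # lw $s1, 4($t0)
    fields = decode_i_type(lw_word)
    reads, writes = register_reads_and_writes(fields)

    add_writes = {T0}                                # add $t0, $t1, $t2 writes $t0
    print("RAW dependency on:", sorted(add_writes & reads))

    regs = [0] * 32
    regs[T0] = 0x1000                                # pretend the add produced 0x1000
    print("effective address:", hex(effective_address(fields, regs)))
```

The point of the example is that the lw reads the value held in $t0 regardless of the offset, so a preceding add that writes $t0 creates the same read-after-write dependency whether the offset is 0 or 4.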