Assuming no data forwarding and no hazard detection, I've been trying to optimize this code, but since four of the five instructions are data-dependent in some way, I keep ending up with 5 stalls (NOPs). Am I missing something, or is there anything I can do to reduce the number of stalls? Code below:
- add x15, x12, x11
- ld x13, 8(x15)
- ld x12, 0(x2)
- or x13, x15, x13
- sd x13, 0(x15)
Optimize the code by reducing the number of stalls in the pipeline, without using data forwarding or adding more instructions.
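To sanity-check the NOP count for a given ordering, here is a rough Python sketch that counts the NOPs needed between dependent instructions. It rests on an assumption that is not stated in the question: a classic 5-stage pipeline whose register file is written in the first half of a cycle and read in the second half, so a consumer must issue at least 3 slots after its producer (2 NOPs between back-to-back dependent instructions). The names `count_nops`, `MIN_DISTANCE`, `ORIGINAL` and `REORDERED` are made up for this illustration, and `REORDERED` is just one candidate ordering (the independent `ld x12` moved up behind the `add`).

```python
# ASSUMPTION (not from the question): 5-stage pipeline, no forwarding,
# register file written in the first half of WB and read in the second
# half of ID. A consumer must then issue >= 3 slots after its producer.
MIN_DISTANCE = 3


def count_nops(program):
    """Count NOPs needed for an in-order program.

    program: list of (text, dest_register_or_None, [source_registers]).
    """
    last_write = {}  # register -> issue slot of its most recent writer
    slot = 0         # next free issue slot
    nops = 0
    for _text, dest, sources in program:
        # Earliest slot at which every source register can be read.
        earliest = slot
        for reg in sources:
            if reg in last_write:
                earliest = max(earliest, last_write[reg] + MIN_DISTANCE)
        nops += earliest - slot  # empty slots become NOPs
        slot = earliest
        if dest is not None:
            last_write[dest] = slot
        slot += 1
    return nops


# The code as posted. sd writes no register, so its dest is None.
ORIGINAL = [
    ("add x15, x12, x11", "x15", ["x12", "x11"]),
    ("ld x13, 8(x15)",    "x13", ["x15"]),
    ("ld x12, 0(x2)",     "x12", ["x2"]),
    ("or x13, x15, x13",  "x13", ["x15", "x13"]),
    ("sd x13, 0(x15)",    None,  ["x13", "x15"]),
]

# One candidate reordering: the independent load moved right after the add.
# It cannot move before the add, because the add still reads the old x12.
REORDERED = [ORIGINAL[0], ORIGINAL[2], ORIGINAL[1], ORIGINAL[3], ORIGINAL[4]]

print("original: ", count_nops(ORIGINAL))
print("reordered:", count_nops(REORDERED))
```

Under these assumptions both orderings come out at 5 NOPs: the chain add, ld x13, or, sd needs 2 + 2 + 2 = 6 empty slots, and the single independent instruction can fill only one of them. If your course assumes the register file cannot be written and read in the same cycle, set `MIN_DISTANCE = 4` and rerun.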