I asked another question a few minutes ago, but I'm finishing up a project. Part of the bonus is pipelining our processor design. I have a simple accumulator-based processor with a data bus and an address bus. It has the three basic stages [fetch, decode, execute] and most of the basic functional units found in simple processors, such as data memory, instruction register, ALU, MAR, MDR, controller (handles the states and control signals), etc.
I know what pipelining is but haven't figured out how to implement it at the functional level. I have searched around, but nothing I found simplifies it for what I need, and I haven't found any examples.
From Instruction Pipeline, the classic 5 stages of a RISC processor are: instruction fetch, instruction decode, execute, memory access, and register write back.
If everything worked in zero time there would be no need for pipeline stages, but as you may have seen with combinatorial logic, a change on the input takes time to ripple through. Add in the requirement to load and save data to memory, and you can see that dealing with everything in 1 clock cycle would be very hard.
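To put rough numbers on that, here is a small sketch. The stage delays are made up purely for illustration (they are not from any real design), and it ignores flip-flop setup and clock-to-output overhead. Without pipelining the clock has to be slow enough for a signal to ripple through every stage; with pipelining it only has to cover the slowest stage.

```python
# Hypothetical propagation delays in nanoseconds, one per stage (assumed values).
stage_delays_ns = [4, 6, 4]

# Single-cycle design: the clock period must cover the whole ripple path.
unpipelined_period = sum(stage_delays_ns)

# Pipelined design: the clock period only has to cover the slowest stage.
pipelined_period = max(stage_delays_ns)

def total_time(n_instructions, period, n_stages=1):
    """Time to finish n instructions; a pipeline needs n_stages - 1 extra cycles to fill."""
    return (n_instructions + n_stages - 1) * period

n = 100
unpipelined_total = total_time(n, unpipelined_period)
pipelined_total = total_time(n, pipelined_period, n_stages=len(stage_delays_ns))

print("unpipelined:", unpipelined_total, "ns")
print("pipelined:  ", pipelined_total, "ns")
```

Each individual instruction still takes three clock cycles to get through the pipeline; the gain is that a new one can start every cycle.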
To simplify it, think of 3 stages: load from memory, execute, and store to memory.
Example: 3 instructions (adds on values in memory); the processor has registers r1, r2, r3.
Therefore, when reading an instruction from the program, we can see that different parts of it are used at different times: the read memory addresses are used in cycle 1, the type of operation (add, subtract, multiply) is used in cycle 2, and the store memory address is used in cycle 3.
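A quick way to see the overlap is to print which instruction sits in which stage on each clock cycle. This is only a sketch of the 3-stage simplification above; the instruction names `i1`, `i2`, `i3` are placeholders, and `schedule` is a helper made up for this illustration.

```python
STAGES = ["load", "execute", "store"]  # the simplified 3 stages

def schedule(instructions):
    """Return one dict per clock cycle mapping stage name -> instruction (or None)."""
    n_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for cycle in range(n_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            # The instruction that entered the pipeline `depth` cycles ago is in this stage.
            idx = cycle - depth
            row[stage] = instructions[idx] if 0 <= idx < len(instructions) else None
        rows.append(row)
    return rows

program = ["i1", "i2", "i3"]  # placeholder instruction names
timeline = schedule(program)
for cycle, row in enumerate(timeline, start=1):
    print(f"cycle {cycle}: {row}")
```

In the third cycle all three stages are busy with different instructions, which is the whole point of the pipeline.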
The data path has flip-flops inserted to break it up into (pipeline) stages. You then need to delay the relevant parts of the decoded instruction word so that they hit the function block at the same time as the data they were intended to operate on.
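Here is a rough software model of that idea, not real hardware. The two variables `load_exec` and `exec_store` stand in for the banks of flip-flops between stages. The instruction format `(src_a, src_b, op, dst)`, the `run` function, and the memory contents are all assumptions made up for this sketch. It also assumes the instructions are independent, so no instruction reads a location that an earlier one has not finished storing.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def run(program, memory):
    """Clock a 3-stage pipeline; each instruction is (src_a, src_b, op, dst)."""
    memory = dict(memory)
    load_exec = None    # pipeline register between load and execute
    exec_store = None   # pipeline register between execute and store
    pending = list(program)
    cycles = 0
    while pending or load_exec is not None or exec_store is not None:
        # Stage 3 (store): only needs the destination address, delayed by two registers.
        if exec_store is not None:
            memory[exec_store["dst"]] = exec_store["result"]
        # Stage 2 (execute): needs the operation, delayed by one register,
        # and passes the destination address along with the result.
        next_exec_store = None
        if load_exec is not None:
            result = OPS[load_exec["op"]](load_exec["a"], load_exec["b"])
            next_exec_store = {"result": result, "dst": load_exec["dst"]}
        # Stage 1 (load): uses the source addresses now, and carries op and dst forward.
        next_load_exec = None
        if pending:
            src_a, src_b, op, dst = pending.pop(0)
            next_load_exec = {"a": memory[src_a], "b": memory[src_b], "op": op, "dst": dst}
        # Clock edge: all pipeline registers update together.
        load_exec, exec_store = next_load_exec, next_exec_store
        cycles += 1
    return memory, cycles

memory = {0: 2, 1: 3, 2: 10, 3: 4, 4: 6, 5: 7}
program = [(0, 1, "add", 10), (2, 3, "sub", 11), (4, 5, "mul", 12)]
final, cycles = run(program, memory)
print(final[10], final[11], final[12], "in", cycles, "cycles")
```

The thing to notice is that `op` and `dst` are not used in the stage where the instruction is read. They ride along in the pipeline registers and arrive at the execute and store stages in the same cycle as the data they belong to.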