The MIPS Pipeline
CSCI206 - Computer Organization & Programming
Pipeline Datapath and Control
zybook: 11.6
Developed and maintained by the Bucknell University Computer Science Department - 2017

Hazard Summary

data - An instruction depends on a data value produced or consumed by another instruction
-- Reorder
-- Forwarding (EX-EX, MEM-EX)

control - The execution of an instruction depends on a control decision made by an earlier instruction (e.g., branch)
-- Delay slot (nop)
-- Compute diff at the ID stage

structural - An instruction in the pipeline needs a resource being used by another instruction in the pipeline at the same moment
-- Reorder if possible
-- Delay

EXAMPLES

CYCLE 1
li v0, 100        F
add v1, v1, v2
beq v0, v1, loop

CYCLE 2
li v0, 100        F D
add v1, v1, v2      F
beq v0, v1, loop
CYCLE 3
li v0, 100        F D E
add v1, v1, v2      F D
beq v0, v1, loop      F

CYCLE 4
li v0, 100        F D E M
add v1, v1, v2      F D E
beq v0, v1, loop      F -
addi ...                -

In cycle 4 the branch wants to decode (where it is resolved), but it needs the new value of v1. That value is not available until the end of cycle 4, so we have to stall. This stalls everything before this stage in the pipeline, so we cannot fetch the addi.

CYCLE 5
li v0, 100        F D E M W
add v1, v1, v2      F D E M
beq v0, v1, loop      F - -
addi ...                - -

In cycle 5, the new value of v1 is at the output of EX. But MIPS only has EX-EX, MEM-EX, and MEM-MEM forwarding, not EX-ID. So we have to stall again (no fetch again).

CYCLE 6
li v0, 100        F D E M W
add v1, v1, v2      F D E M W
beq v0, v1, loop      F - - D
addi ...                - - F

Finally, in cycle 6 we can decode the new value of v1 (without forwarding). Since we were able to decode, we can also fetch the next instruction in cycle 6. Since the branch is resolved in Decode, we don't have to show its E, M, and W stages (they are NOPs).

CYCLE 7
li v0, 100        F D E M W
add v1, v1, v2      F D E M W
beq v0, v1, loop      F - - D
addi ...                - - F D
(instruction 5)               F

No hazards for addi.

CYCLE 11
li v0, 100        F D E M W
add v1, v1, v2      F D E M W
beq v0, v1, loop      F - - D
addi ...                - - F D E M W
(instruction 5)               F D E M W

Fast forward to cycle 11. Execution took 11 cycles.
IPC = 5 / 11 = 0.45
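The stall reasoning in the li / add / beq example can be checked with a small timing sketch. It is a simplified model, not a full MIPS simulator. It assumes the branch is resolved in ID, nothing is forwarded into ID, and a register written in WB can be read by a decode in the same cycle. The fourth and fifth instructions are placeholders assumed to have no dependences, since the slides do not give their operands. The function and variable names are illustrative.

```python
# Sketch of the branch stall rule: a branch resolved in ID must wait until
# the producers of its source registers reach WB (no forwarding into ID).

def schedule(program):
    """Return (name, fetch, decode, writeback) cycle numbers per instruction."""
    wb_cycle = {}     # register -> WB cycle of its most recent writer
    prev_decode = 0   # decode cycle of the previous instruction
    rows = []
    for name, dest, sources, is_branch in program:
        # IF is free again in the cycle the previous instruction decodes.
        fetch = max(prev_decode, 1)
        decode = fetch + 1
        if is_branch:
            for reg in sources:
                # Same-cycle WB/ID is allowed, so decode >= producer's WB.
                decode = max(decode, wb_cycle.get(reg, 0))
        writeback = decode + 3  # D -> E -> M -> W
        if dest is not None:
            wb_cycle[dest] = writeback
        rows.append((name, fetch, decode, writeback))
        prev_decode = decode
    return rows

program = [
    ("li v0, 100",       "v0", [],           False),
    ("add v1, v1, v2",   "v1", ["v1", "v2"], False),
    ("beq v0, v1, loop", None, ["v0", "v1"], True),
    ("addi (operands assumed independent)", None, [], False),
    ("instruction 5 (assumed independent)", None, [], False),
]

rows = schedule(program)
total_cycles = rows[-1][3]
stalls = sum(decode - fetch - 1 for _, fetch, decode, _ in rows)
ipc = len(program) / total_cycles

for name, fetch, decode, writeback in rows:
    print(f"{name:38s} F={fetch} D={decode} W={writeback}")
print(f"stalls={stalls} cycles={total_cycles} IPC={ipc:.2f}")
```

Under these assumptions the branch decodes in cycle 6 after two stall cycles, and the last instruction writes back in cycle 11.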
CYCLE 0
lw r1, 0(r4)
lw r2, 400(r4)
addi r3, r1, r2

First two instructions are hazard free.
addi depends on both r1 and r2.
sw depends on r3 (addi).

CYCLE 3
lw r1, 0(r4)      F D E
lw r2, 400(r4)      F D
addi r3, r1, r2       F

No issues until addi goes to decode.

CYCLE 4
lw r1, 0(r4)      F D E M
lw r2, 400(r4)      F D E
addi r3, r1, r2       F -
sw ...                  -

Decode in cycle 4 would get both old values (r1, r2). We could forward r1 from MEM to EX in cycle 5, but r2 is not yet available, so we must stall. Since D stalls, sw cannot be fetched.

CYCLE 5
lw r1, 0(r4)      F D E M W
lw r2, 400(r4)      F D E M
addi r3, r1, r2       F - D
sw ...                  - F

Decode in cycle 5. Load the new value for r1 (WB in the same cycle is OK). Need to forward MEM->EX for r2 in cycle 6.

CYCLE 6
lw r1, 0(r4)      F D E M W
lw r2, 400(r4)      F D E M W
addi r3, r1, r2       F - D E
sw ...                  - F D
(instruction 5)             F

Forward MEM->EX for r2. Draw an arrow from the previous cycle's M to the current cycle's E.
Decode sw in cycle 6, but we get the old value for r3. That's OK: sw doesn't need the new value until the start of MEM, so we can use a forwarding path.
CYCLE 7
lw r1, 0(r4)      F D E M W
lw r2, 400(r4)      F D E M W
addi r3, r1, r2       F - D E M
sw ...                  - F D E
(instruction 5)             F D

No issues.

CYCLE 8
lw r1, 0(r4)      F D E M W
lw r2, 400(r4)      F D E M W
addi r3, r1, r2       F - D E M W
sw ...                  - F D E M
(instruction 5)             F D E

sw decoded the old r3, but the new value for r3 is at the output of the MEM stage, so we need a MEM-MEM forward.

CYCLE 9
lw r1, 0(r4)      F D E M W
lw r2, 400(r4)      F D E M W
addi r3, r1, r2       F - D E M W
sw ...                  - F D E M W
(instruction 5)             F D E M

CYCLE 10
lw r1, 0(r4)      F D E M W
lw r2, 400(r4)      F D E M W
addi r3, r1, r2       F - D E M W
sw ...                  - F D E M W
(instruction 5)             F D E M W

IPC = 5 / 10 = 0.5

lw r1, 0(sp)
add r1, r1, r2
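The load-use stall in the lw / lw / addi / sw example (the one ending at IPC = 5 / 10) can be reproduced with a short timing sketch. It is a simplified model under stated assumptions: EX-EX, MEM-EX, and MEM-MEM forwarding exist, an ALU result is ready at the end of EX, and a loaded value is ready at the end of MEM. The sw operands `r3, 0(r4)` and the trailing `nop` are assumptions made only so the sketch runs; the slides do not show them.

```python
# Sketch of load-use timing with forwarding. An operand consumed in EX must
# be ready before EX starts; store data must be ready before MEM starts.

def schedule(program):
    """Return (text, fetch, decode, writeback) cycle numbers per instruction."""
    ready = {}        # register -> cycle at whose end its newest value exists
    prev_decode = 0
    rows = []
    for text, op, dest, ex_sources, mem_sources in program:
        fetch = max(prev_decode, 1)   # a stalled decode also blocks fetch
        decode = fetch + 1
        for reg in ex_sources:
            # EX happens in cycle decode + 1, so decode >= ready cycle.
            decode = max(decode, ready.get(reg, 0))
        for reg in mem_sources:
            # MEM happens in cycle decode + 2, so decode >= ready cycle - 1.
            decode = max(decode, ready.get(reg, 0) - 1)
        if dest is not None:
            # Loads produce their value in MEM, ALU instructions in EX.
            ready[dest] = decode + 2 if op == "lw" else decode + 1
        rows.append((text, fetch, decode, decode + 3))
        prev_decode = decode
    return rows

program = [
    ("lw r1, 0(r4)",    "lw",   "r1", ["r4"],       []),
    ("lw r2, 400(r4)",  "lw",   "r2", ["r4"],       []),
    ("addi r3, r1, r2", "addi", "r3", ["r1", "r2"], []),
    ("sw r3, 0(r4)",    "sw",   None, ["r4"],       ["r3"]),  # operands assumed
    ("nop",             "nop",  None, [],           []),      # placeholder
]

rows = schedule(program)
total_cycles = rows[-1][3]
stalls = sum(decode - fetch - 1 for _, fetch, decode, _ in rows)
ipc = len(program) / total_cycles

for text, fetch, decode, writeback in rows:
    print(f"{text:16s} F={fetch} D={decode} W={writeback}")
print(f"stalls={stalls} cycles={total_cycles} IPC={ipc:.2f}")
```

In this model only addi stalls (one cycle, waiting for r2), and sw needs no stall because its data is required only at the start of MEM.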
CYCLE 1
lw r1, 0(sp)      F
add r1, r1, r2

CYCLE 2
lw r1, 0(sp)      F D
add r1, r1, r2      F

CYCLE 3
lw r1, 0(sp)      F D E
add r1, r1, r2      F -
sw r1, 0(sp)          -

If we decode in cycle 3, add will need r1 to execute in cycle 4. The value isn't available until the end of cycle 4 (lw finishes MEM). So we need to stall in cycle 3.

CYCLE 4
lw r1, 0(sp)      F D E M
add r1, r1, r2      F - D
sw r1, 0(sp)          - F

We will forward from the output of MEM to the input of EX in the next cycle (r1 for add).

CYCLE 5
lw r1, 0(sp)      F D E M W
add r1, r1, r2      F - D E
sw r1, 0(sp)          - F D

CYCLE 6
lw r1, 0(sp)      F D E M W
add r1, r1, r2      F - D E M
sw r1, 0(sp)          - F D E

sw computes the memory address 0 + sp in EX, so no forward is needed there. sw needs the new r1 at the beginning of MEM, which will be in cycle 7; we can get it from the output of MEM.
CYCLE 7
lw r1, 0(sp)      F D E M W
add r1, r1, r2      F - D E M W
sw r1, 0(sp)          - F D E M

sw writes r1 to mem[0+sp] in cycle 7. The new value of r1 is at the output of MEM, so forward it to the input.

CYCLE 8
lw r1, 0(sp)      F D E M W
add r1, r1, r2      F - D E M W
sw r1, 0(sp)          - F D E M W

Done. IPC = 3 / 8 = 0.375
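The cycle counts of the three examples follow one pattern: the first instruction takes five cycles, each later instruction finishes one cycle after its predecessor, and every stall adds one cycle. A minimal sketch of that arithmetic, with the stall counts read off the diagrams above (the function name and dictionary labels are illustrative):

```python
# Sketch: total cycles and IPC for a 5-stage pipeline with a known stall count.

def total_cycles(instructions, stall_cycles, depth=5):
    # depth cycles for the first instruction, one more per later instruction,
    # plus one per stall cycle.
    return depth + (instructions - 1) + stall_cycles

# (instruction count, stall cycles) taken from the three worked examples.
examples = {
    "li / add / beq":      (5, 2),
    "lw / lw / addi / sw": (5, 1),
    "lw / add / sw":       (3, 1),
}

summary = {}
for name, (count, stalls) in examples.items():
    cycles = total_cycles(count, stalls)
    summary[name] = (cycles, count / cycles)
    print(f"{name:20s} cycles={cycles:2d} IPC={count / cycles:.3f}")
```

With no stalls the same formula gives 9 cycles for five instructions, which is the baseline the examples are compared against.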