Design of Digital Circuits, Lecture 14: Microprogramming. Prof. Onur Mutlu, ETH Zurich, Spring 2017, 7 April 2017


1 Design of Digital Circuits Lecture 14: Microprogramming Prof. Onur Mutlu ETH Zurich Spring 2017 7 April 2017

2 Agenda for Today & Next Few Lectures
- Single-cycle Microarchitectures
- Multi-cycle and Microprogrammed Microarchitectures
- Pipelining
- Issues in Pipelining: Control & Data Dependence Handling, State Maintenance and Recovery, ...
- Out-of-Order Execution
- Issues in OoO Execution: Load-Store Handling, ...

3 Readings for This Week
- P&P, Chapter 4
  - Microarchitecture
- P&P, Revised Appendix C
  - Microarchitecture of the LC-3b
  - Appendix A (LC-3b ISA) will be useful in following this
- H&H, Chapter 7.4 (keep reading)
- Optional
  - Maurice Wilkes, "The Best Way to Design an Automatic Calculating Machine," Manchester Univ. Computer Inaugural Conf., 1951.

4 Multi-Cycle Microarchitectures

5 Remember: Multi-Cycle Microarchitecture
AS = Architectural (programmer visible) state at the beginning of an instruction
Step 1: Process part of instruction in one clock cycle
Step 2: Process part of instruction in the next clock cycle
AS' = Architectural (programmer visible) state at the end of a clock cycle

6 One Example Multi-Cycle Microarchitecture

7 Carnegie Mellon Remember: Single-Cycle MIPS Processor [Figure: single-cycle MIPS datapath with Control Unit, Instruction Memory, Register File, ALU, Data Memory, Sign Extend, and the PCPlus4, PCBranch, and PCJump paths]

8 Carnegie Mellon Remember: Complete Multi-cycle Processor [Figure: multi-cycle MIPS datapath with Control Unit, a shared Instr / Data Memory, Register File, ALU, Sign Extend, and the non-architectural registers Instr, Data, A, B, and ALUOut]

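9 Carnegie Mellon Control Unit [Figure: the Control Unit consists of two blocks. The Main Controller (FSM) takes Opcode 5:0 and produces the multiplexer selects (MemtoReg, RegDst, IorD, PCSrc, ALUSrcB 1:0, ALUSrcA), the register enables (IRWrite, MemWrite, PCWrite, Branch, RegWrite), and ALUOp 1:0. The ALU Decoder takes ALUOp 1:0 and Funct 5:0 and produces ALUControl 2:0.]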

10 Carnegie Mellon Main Controller FSM: Fetch [Figure: Reset leads to state S0: Fetch, shown next to the multi-cycle datapath]

11 Carnegie Mellon Main Controller FSM: Fetch [Figure: state S0: Fetch (entered on Reset) asserts IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite; shown next to the multi-cycle datapath]

12 Carnegie Mellon Main Controller FSM: Decode [Figure: S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite) is followed by S1: Decode; shown next to the multi-cycle datapath]

13 Carnegie Mellon Main Controller FSM: Address Calculation [Figure: S0: Fetch, then S1: Decode, then S2: MemAdr when Op = LW or Op = SW; shown next to the multi-cycle datapath]

14 Carnegie Mellon Main Controller FSM: Address Calculation [Figure: as on the previous slide, with S2: MemAdr now asserting ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00; shown next to the multi-cycle datapath]

15 Carnegie Mellon Main Controller FSM: lw [Figure: FSM extended for lw. From S2: MemAdr, when Op = LW, go to S3: MemRead (IorD = 1), then to S4: Mem Writeback (RegDst = 0, MemtoReg = 1, RegWrite).]

16 Carnegie Mellon Main Controller FSM: sw [Figure: FSM extended for sw. From S2: MemAdr, when Op = SW, go to S5: MemWrite (IorD = 1, MemWrite).]

17 Carnegie Mellon Main Controller FSM: R-Type [Figure: FSM extended for R-type instructions. From S1: Decode, when Op = R-type, go to S6: Execute (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10), then to S7: ALU Writeback (RegDst = 1, MemtoReg = 0, RegWrite).]

18 Carnegie Mellon Main Controller FSM: beq [Figure: FSM extended for beq. S1: Decode now asserts ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00. From S1, when Op = BEQ, go to S8: Branch (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch).]

19 Carnegie Mellon Complete Multi-cycle Controller FSM
S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite. Next: S1.
S1: Decode: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00. Next: S2 if Op = LW or Op = SW; S6 if Op = R-type; S8 if Op = BEQ.
S2: MemAdr: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S3 if Op = LW; S5 if Op = SW.
S3: MemRead: IorD = 1. Next: S4.
S4: Mem Writeback: RegDst = 0, MemtoReg = 1, RegWrite. Next: S0.
S5: MemWrite: IorD = 1, MemWrite. Next: S0.
S6: Execute: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10. Next: S7.
S7: ALU Writeback: RegDst = 1, MemtoReg = 0, RegWrite. Next: S0.
S8: Branch: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch. Next: S0.

20 Carnegie Mellon Main Controller FSM: addi [Figure: FSM extended for addi. From S1: Decode, when Op = ADDI, go to S9: ADDI Execute, then to S10: ADDI Writeback; the control signals of the two new states are not yet filled in.]

21 Carnegie Mellon Main Controller FSM: addi [Figure: FSM extended for addi. From S1: Decode, when Op = ADDI, go to S9: ADDI Execute (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), then to S10: ADDI Writeback (RegDst = 0, MemtoReg = 0, RegWrite).]

22 Carnegie Mellon Extended Functionality: j [Figure: multi-cycle datapath extended for j. PCSrc becomes a 2-bit select (PCSrc 1:0); the jump target PCJump is formed from PC bits 31:28 and instruction bits 25:0 shifted left by 2.]

23 Carnegie Mellon Control FSM: j [Figure: FSM extended for j. From S1: Decode, when Op = J, go to S11: Jump; the control signals of the new state are not yet filled in.]

24 Carnegie Mellon Control FSM: j
S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 00, IRWrite, PCWrite. Next: S1.
S1: Decode: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00. Next: S2 if Op = LW or Op = SW; S6 if Op = R-type; S8 if Op = BEQ; S9 if Op = ADDI; S11 if Op = J.
S2: MemAdr: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S3 if Op = LW; S5 if Op = SW.
S3: MemRead: IorD = 1. Next: S4.
S4: Mem Writeback: RegDst = 0, MemtoReg = 1, RegWrite. Next: S0.
S5: MemWrite: IorD = 1, MemWrite. Next: S0.
S6: Execute: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10. Next: S7.
S7: ALU Writeback: RegDst = 1, MemtoReg = 0, RegWrite. Next: S0.
S8: Branch: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 01, Branch. Next: S0.
S9: ADDI Execute: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S10.
S10: ADDI Writeback: RegDst = 0, MemtoReg = 0, RegWrite. Next: S0.
S11: Jump: PCSrc = 10, PCWrite. Next: S0.
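The controller on this slide can be read as a lookup table from the current state and the opcode to the next state. The following is a minimal Python sketch of that idea, not the lecture's implementation: the names `NEXT_STATE`, `run_instruction`, and `cycles`, and the lowercase opcode keys, are illustrative assumptions. Only the state numbering S0 to S11 and the transitions come from the FSM.

```python
# Next-state table of the multi-cycle controller FSM (states S0..S11).
# A state with one successor maps to an int; S1 and S2 branch on the opcode.
NEXT_STATE = {
    0: 1,                                   # S0 Fetch -> S1 Decode
    1: {"lw": 2, "sw": 2, "rtype": 6,       # S1 Decode dispatches on Op
        "beq": 8, "addi": 9, "j": 11},
    2: {"lw": 3, "sw": 5},                  # S2 MemAdr
    3: 4,                                   # S3 MemRead -> S4 Mem Writeback
    4: 0,                                   # S4 -> Fetch
    5: 0,                                   # S5 MemWrite -> Fetch
    6: 7,                                   # S6 Execute -> S7 ALU Writeback
    7: 0,
    8: 0,                                   # S8 Branch -> Fetch
    9: 10,                                  # S9 ADDI Execute -> S10 ADDI Writeback
    10: 0,
    11: 0,                                  # S11 Jump -> Fetch
}


def run_instruction(op):
    """Return the list of states visited while processing one instruction."""
    state, visited = 0, []
    while True:
        visited.append(state)
        nxt = NEXT_STATE[state]
        state = nxt[op] if isinstance(nxt, dict) else nxt
        if state == 0:          # back at Fetch: the instruction is done
            return visited


def cycles(op):
    """One state per clock cycle, so the cycle count is the path length."""
    return len(run_instruction(op))


for op in ("lw", "sw", "rtype", "beq", "addi", "j"):
    print(op, run_instruction(op), cycles(op))
```

Each state takes one clock cycle, so the path length per opcode gives the cycle counts used in the CPI discussion that follows.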

25 Carnegie Mellon Multi-cycle Performance: CPI
- Instructions take different numbers of cycles:
  - 3 cycles: beq, j
  - 4 cycles: R-type, sw, addi
  - 5 cycles: lw
  - Realistic?
- CPI is a weighted average, e.g. SPECINT2000 benchmark:
  - 25% loads
  - 10% stores
  - 11% branches
  - 2% jumps
  - 52% R-type
- Average CPI = (0.11 + 0.02)(3) + (0.52 + 0.10)(4) + (0.25)(5) = 4.12
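The weighted average can be checked with a few lines of Python. This is a sketch under the assumption that the instruction mix is 25% loads, 10% stores, 11% branches, 2% jumps, and 52% R-type; the names `CYCLES`, `MIX`, and `average_cpi` are illustrative.

```python
# Cycles per instruction class, as listed on the slide.
CYCLES = {"load": 5, "store": 4, "branch": 3, "jump": 3, "rtype": 4}

# Instruction mix as fractions (assumed values for the benchmark example).
MIX = {"load": 0.25, "store": 0.10, "branch": 0.11, "jump": 0.02, "rtype": 0.52}


def average_cpi(cycles, mix):
    """CPI as the mix-weighted average of per-class cycle counts."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("instruction mix must sum to 1")
    return sum(mix[name] * cycles[name] for name in mix)


cpi = average_cpi(CYCLES, MIX)
print(round(cpi, 2))  # 4.12
```

Changing `MIX` shows how sensitive the average is to the share of 5-cycle loads.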

26 Carnegie Mellon Multi-cycle Performance: Cycle Time
- Multi-cycle critical path: T_c =
[Figure: multi-cycle MIPS datapath]

27 Carnegie Mellon Multi-cycle Performance: Cycle Time
- Multi-cycle critical path: T_c = t_pcq + t_mux + max(t_ALU + t_mux, t_mem) + t_setup
[Figure: multi-cycle MIPS datapath]

28 Carnegie Mellon Multicycle Performance Example
Element               Parameter   Delay (ps)
Register clock-to-Q   t_pcq_PC    30
Register setup        t_setup     20
Multiplexer           t_mux       25
ALU                   t_ALU       200
Memory read           t_mem       250
Register file read    t_RFread    150
Register file setup   t_RFsetup   20
T_c =

29 Carnegie Mellon Multicycle Performance Example
Element               Parameter   Delay (ps)
Register clock-to-Q   t_pcq_PC    30
Register setup        t_setup     20
Multiplexer           t_mux       25
ALU                   t_ALU       200
Memory read           t_mem       250
Register file read    t_RFread    150
Register file setup   t_RFsetup   20
T_c = t_pcq_PC + t_mux + max(t_ALU + t_mux, t_mem) + t_setup = [30 + 25 + 250 + 20] ps = 325 ps
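The critical-path formula can be evaluated directly. The sketch below assumes the delay values 30, 20, 25, 200, 250, 150, and 20 ps for the seven table rows; the dictionary keys and the function name `multicycle_tc` are illustrative.

```python
# Element delays in picoseconds (assumed values for the example table).
DELAYS_PS = {
    "t_pcq_PC": 30,
    "t_setup": 20,
    "t_mux": 25,
    "t_ALU": 200,
    "t_mem": 250,
    "t_RFread": 150,
    "t_RFsetup": 20,
}


def multicycle_tc(d):
    """T_c = t_pcq_PC + t_mux + max(t_ALU + t_mux, t_mem) + t_setup."""
    return d["t_pcq_PC"] + d["t_mux"] + max(d["t_ALU"] + d["t_mux"], d["t_mem"]) + d["t_setup"]


print(multicycle_tc(DELAYS_PS))  # 325

# With a faster memory the ALU path becomes the critical one.
faster_memory = dict(DELAYS_PS, t_mem=200)
print(multicycle_tc(faster_memory))  # 300
```

The second call illustrates that once t_mem drops below t_ALU + t_mux, further memory speedups no longer shorten the cycle.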

30 Carnegie Mellon Multi-cycle Performance Example
- For a program with 100 billion instructions executing on a multi-cycle MIPS processor
  - CPI = 4.12
  - T_c = 325 ps
- Execution Time = (# instructions) × CPI × T_c = (100 × 10^9)(4.12)(325 × 10^-12) = 133.9 seconds
- This is slower than the single-cycle processor (92.5 seconds). Why?
- Did we break the stages in a balanced manner?
- Overhead of register setup/hold paid many times
- How would the results change with different assumptions on memory latency and instruction mix?
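The comparison with the single-cycle processor is a single product per design. A minimal sketch, assuming 100 billion instructions, CPI = 4.12 at 325 ps for the multi-cycle design, and CPI = 1 at 925 ps for the single-cycle design; the function name `execution_time_s` is illustrative.

```python
def execution_time_s(n_instructions, cpi, tc_ps):
    """Execution time in seconds = instructions x CPI x cycle time."""
    return n_instructions * cpi * tc_ps * 1e-12


N = 100e9  # assumed program length: 100 billion instructions

multi_cycle = execution_time_s(N, cpi=4.12, tc_ps=325)
single_cycle = execution_time_s(N, cpi=1.0, tc_ps=925)

print(round(multi_cycle, 1), round(single_cycle, 1))
print(round(multi_cycle / single_cycle, 2))  # slowdown of the multi-cycle design
```

The cycle time shrinks by less than 3x while the CPI grows by more than 4x, which is why the multi-cycle design loses in this example.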

31 Carnegie Mellon Recall: Single-Cycle Performance Example
- Example: For a program with 100 billion instructions executing on a single-cycle MIPS processor:
  Execution Time = (# instructions) × CPI × T_c = (100 × 10^9)(1)(925 × 10^-12 s) = 92.5 seconds

32 Carnegie Mellon Review: Single-Cycle MIPS Processor [Figure: single-cycle MIPS datapath with Control Unit, Instruction Memory, Register File, ALU, Data Memory, Sign Extend, and the PCPlus4, PCBranch, and PCJump paths]

33 Carnegie Mellon Review: Multi-Cycle MIPS Processor [Figure: multi-cycle MIPS datapath with Control Unit, shared Instr / Data Memory, Register File, ALU, Sign Extend, and the PCJump path]

34 Carnegie Mellon Review: Multi-Cycle MIPS FSM [Figure: complete multi-cycle controller FSM with states S0: Fetch, S1: Decode, S2: MemAdr, S3: MemRead, S4: Mem Writeback, S5: MemWrite, S6: Execute, S7: ALU Writeback, S8: Branch, S9: ADDI Execute, S10: ADDI Writeback, S11: Jump]
- What is the shortcoming of this design?
- What does this design assume about memory?

35 Carnegie Mellon What If Memory Takes > One Cycle?
- Stay in the same "memory access" state until memory returns the data
- "Memory Ready?" bit is an input to the control logic that determines the next state
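A minimal Python sketch of this idea, assuming the state numbers S3: MemRead and S4: Mem Writeback from the MIPS FSM and a memory that raises its ready bit after a fixed number of cycles. The function names and the fixed-latency memory model are illustrative assumptions.

```python
S3_MEM_READ = 3
S4_MEM_WRITEBACK = 4


def next_state_mem_read(ready):
    """S3 loops on itself until the memory-ready bit is set."""
    return S4_MEM_WRITEBACK if ready else S3_MEM_READ


def cycles_in_mem_read(latency):
    """Count the cycles spent in S3 for a memory needing `latency` cycles."""
    state, cycle = S3_MEM_READ, 0
    while state == S3_MEM_READ:
        cycle += 1
        ready = cycle >= latency       # hypothetical memory model
        state = next_state_mem_read(ready)
    return cycle


print(cycles_in_mem_read(1), cycles_in_mem_read(4))  # 1 4
```

With a latency of one cycle the FSM behaves exactly as before; longer latencies simply repeat the memory access state.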

36 Another Example: Microprogrammed Multi-Cycle Microarchitecture

37 How Do We Implement This?
- Maurice Wilkes, "The Best Way to Design an Automatic Calculating Machine," Manchester Univ. Computer Inaugural Conf., 1951.
- An elegant implementation:
  - The concept of microcoded/microprogrammed machines

38 Recall: A Basic Multi-Cycle Microarchitecture
- Instruction processing cycle divided into "states"
- A stage in the instruction processing cycle can take multiple states
- A multi-cycle microarchitecture sequences from state to state to process an instruction
- The behavior of the machine in a state is completely determined by control signals in that state
- The behavior of the entire processor is specified fully by a finite state machine
- In a state (clock cycle), control signals control two things:
  - How the datapath should process the data
  - How to generate the control signals for the (next) clock cycle

39 Microprogrammed Control Terminology
- Control signals associated with the current state
  - Microinstruction
- Act of transitioning from one state to another
  - Determining the next state and the microinstruction for the next state
  - Microsequencing
- Control store stores control signals for every possible state
  - Store for microinstructions for the entire FSM
- Microsequencer determines which set of control signals will be used in the next clock cycle (i.e., next state)

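40 Example Control Structure [Figure: Simple Design of the Control Structure. The Microsequencer takes R, IR[15:11], BEN, and 9 bits (J, COND, IRD) of the current microinstruction and produces a 6-bit address into the Control Store (2^6 x 35). Each 35-bit Microinstruction splits into 26 bits that go to the datapath and 9 bits (J, COND, IRD) that return to the Microsequencer.]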

41 What Happens In A Clock Cycle?
- The control signals (microinstruction) for the current state control two things:
  - Processing in the data path
  - Generation of control signals (microinstruction) for the next cycle
  - See Supplemental Figure 1 (next-next slide)
- Datapath and microsequencer operate concurrently
- Question: why not generate control signals for the current cycle in the current cycle?
  - This could lengthen the clock cycle
  - Why could it lengthen the clock cycle?
  - See Supplemental Figure 2

42 Example Microprogrammed Control & Datapath [Figure] Read Appendix C, on website

43 A Clock Cycle [Figure]

44 A Bad Clock Cycle! [Figure]

45 A Simple LC-3b Control and Datapath [Figure] Read Appendix C, on website

46 What Determines Next-State Control Signals?
- What is happening in the current clock cycle
  - See the 9 control signals coming from "Control" block
  - What are these for?
- The instruction that is being executed
  - IR[15:11] coming from the Data Path
- Whether the condition of a branch is met, if the instruction being processed is a branch
  - BEN bit coming from the datapath
- Whether the memory operation is completing in the current cycle, if one is in progress
  - R bit coming from memory

47 A Simple LC-3b Control and Datapath [Figure]

48 The State Machine for Multi-Cycle Processing
- The behavior of the LC-3b uarch is completely determined by
  - the 35 control signals and
  - additional 7 bits that go into the control logic from the datapath
- 35 control signals completely describe the state of the control structure
- We can completely describe the behavior of the LC-3b as a state machine, i.e. a directed graph of
  - Nodes (one corresponding to each state)
  - Arcs (showing flow from each state to the next state(s))

49 An LC-3b State Machine
- Patt and Patel, Appendix C, Figure C.2
- Each state must be uniquely specified
  - Done by means of state variables
- 31 distinct states in this LC-3b state machine
  - Encoded with 6 state variables
- Examples
  - State 18, 19 correspond to the beginning of the instruction processing cycle
  - Fetch phase: state 18, 19 -> state 33 -> state 35
  - Decode phase: state 32

50 [Figure: LC-3b state machine (Patt and Patel, Appendix C, Figure C.2)]
Fetch: MAR <- PC, PC <- PC + 2 (state 18, 19); MDR <- M (state 33, repeated until R); IR <- MDR (state 35)
Decode (state 32): BEN <- IR[11] & N + IR[10] & Z + IR[9] & P, then branch on [IR[15:12]]
ADD: DR <- SR1 + OP2*, set CC
AND: DR <- SR1 & OP2*, set CC
XOR: DR <- SR1 XOR OP2*, set CC
BR: [BEN]; if set, PC <- PC + LSHF(off9, 1) (state 22)
JMP: PC <- BaseR
JSR: [IR[11]]; R7 <- PC, then PC <- BaseR or PC <- PC + LSHF(off11, 1)
TRAP: MAR <- LSHF(ZEXT[IR[7:0]], 1); MDR <- M[MAR], R7 <- PC (repeated until R); PC <- MDR
SHF: DR <- SHF(SR, A, D, amt4), set CC
LEA: DR <- PC + LSHF(off9, 1), set CC
LDB: MAR <- B + off6; MDR <- M[MAR[15:1]'0] (repeated until R); DR <- SEXT[BYTE.DATA], set CC
LDW: MAR <- B + LSHF(off6, 1) (state 6); MDR <- M[MAR] (state 25, repeated until R); DR <- MDR, set CC (state 27)
STW: MAR <- B + LSHF(off6, 1); MDR <- SR; M[MAR] <- MDR (repeated until R)
STB: MAR <- B + off6; MDR <- SR[7:0]; M[MAR] <- MDR** (repeated until R)
RTI: To 18
The last state of every instruction returns to state 18 (BR with BEN clear returns directly).
NOTES
B+off6 : Base + SEXT[offset6]
PC+off9 : PC + SEXT[offset9]
*OP2 may be SR2 or SEXT[imm5]
** [15:8] or [7:0] depending on MAR[0]

51 This FSM Implements the LC-3b ISA
- P&P Appendix A (revised):
  - content/dam/ethz/ special-interest/infk/instinfsec/system-securitygroup-dam/education/ Digitaltechnik_7/lecture/ pp-appendixa.pdf

52 LC-3b State Machine: Some Questions
- How many cycles does the fastest instruction take?
- How many cycles does the slowest instruction take?
- Why does the BR take as long as it takes in the FSM?
- What determines the clock cycle time?

53 LC-3b Datapath
- Patt and Patel, Appendix C, Figure C.3
- Single-bus datapath design
  - At any point only one value can be "gated" on the bus (i.e., can be driving the bus)
  - Advantage: Low hardware cost: one bus
  - Disadvantage: Reduced concurrency: if an instruction needs the bus twice for two different things, these need to happen in different states
- Control signals (26 of them) determine what happens in the datapath in one clock cycle
  - Patt and Patel, Appendix C, Table C.1

54 [Figure: LC-3b datapath (Patt and Patel, Appendix C, Figure C.3). A single 16-bit bus driven through GatePC, GateMARMUX, GateALU, GateSHF, and GateMDR; PC with PCMUX and LD.PC; IR with LD.IR; REG FILE with DR, SR1, SR2, LD.REG; SR2MUX, ALU with ALUK, SHF; ADDR1MUX, ADDR2MUX, MARMUX, LSHF1, ZEXT & LSHF1, SEXT blocks; condition codes N, Z, P with LD.CC; MAR, MDR, memory with MIO.EN, R.W, DATA.SIZE, R, and memory-mapped I/O (KBDR, KBSR, DDR, DSR).]

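55 [Figure: additional LC-3b logic: (a) DRMUX producing DR, with IR[11:9] as an input; (b) SR1MUX producing SR1 from IR[11:9] or IR[8:6]; (c) Logic producing BEN from IR[11:9] and N, Z, P] Remember the MIPS datapath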


57 LC-3b Datapath: Some Questions
- How does instruction fetch happen in this datapath according to the state machine?
- What is the difference between gating and loading?
  - Gating: Enable/disable an input to be connected to the bus
    - Combinational: during a clock cycle
  - Loading: Enable/disable an input to be written to a register
    - Sequential: e.g., at a clock edge (assume at the end of cycle)
- Is this the smallest hardware you can design?
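The gating versus loading distinction can be illustrated with a toy single-bus model. This is a sketch, not the LC-3b datapath: the `Register` class, the `clock_cycle` function, and the value 0x3000 are illustrative assumptions. It mimics what the first fetch state does with GatePC and LD.MAR.

```python
class Register:
    """A register holds a value that only changes when it is loaded."""

    def __init__(self, value=0):
        self.value = value


def clock_cycle(regs, gate, loads):
    """Simulate one clock cycle on a single-bus datapath.

    gate:  name of the one register gated onto the bus (combinational,
           visible during the cycle).
    loads: names of registers whose load enable is set; they capture the
           bus value at the end of the cycle (sequential).
    """
    bus = regs[gate].value              # only one driver may be gated
    for name in loads:
        regs[name].value = bus          # takes effect at the clock edge
    return bus


regs = {"PC": Register(0x3000), "MAR": Register(), "MDR": Register()}
clock_cycle(regs, gate="PC", loads=["MAR"])   # like GatePC with LD.MAR
print(hex(regs["MAR"].value))  # 0x3000
```

Because `gate` names exactly one source, a second transfer over the bus needs a second call, that is, a second state.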

58 LC-3b Microprogrammed Control Structure
- Patt and Patel, Appendix C, Figure C.4
- Three components:
  - Microinstruction, control store, microsequencer
- Microinstruction: control signals that control the datapath (26 of them) and help determine the next state (9 of them)
- Each microinstruction is stored in a unique location in the control store (a special memory structure)
- Unique location: address of the state corresponding to the microinstruction
  - Remember each state corresponds to one microinstruction
- Microsequencer determines the address of the next microinstruction (i.e., next state)
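The sizes mentioned on this slide fit together as simple arithmetic. The sketch below assumes that the 9 sequencing bits split into a 6-bit J field, a 2-bit COND field, and a 1-bit IRD field, and that the control store is addressed by the 6 state variables with 31 states in use; the constant names are illustrative.

```python
# Assumed split of the 9 bits that help determine the next state.
J_BITS, COND_BITS, IRD_BITS = 6, 2, 1
DATAPATH_BITS = 26      # bits that control the datapath
STATE_BITS = 6          # state variables = control store address width
USED_STATES = 31        # distinct states in the state machine

sequencing_bits = J_BITS + COND_BITS + IRD_BITS
microinstruction_bits = sequencing_bits + DATAPATH_BITS
entries = 2 ** STATE_BITS
total_bits = entries * microinstruction_bits
unused_entries = entries - USED_STATES

print(sequencing_bits, microinstruction_bits, entries, total_bits, unused_entries)
# prints: 9 35 64 2240 33
```

About half of the 64 control store locations stay unused, which is the price of letting the state number double as the control store address.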

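59 [Figure: Simple Design of the Control Structure. The Microsequencer takes R, IR[15:11], BEN, and 9 bits (J, COND, IRD) and produces a 6-bit address into the Control Store (2^6 x 35); each 35-bit Microinstruction splits into 26 datapath bits and 9 bits (J, COND, IRD).]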

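60 [Figure: LC-3b microsequencer. COND1 and COND0 select one of three conditions: Branch (BEN), Ready (R), or Addr. Mode (IR[11]). The selected condition is ORed into J[2], J[1], or J[0] respectively; J[5], J[4], J[3] pass through unchanged. A 6-bit multiplexer controlled by IRD chooses between this address and 0,0,IR[15:12] to form the Address of Next State.]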

61 [Figure: control store contents, one row per state (State 0 to State 63), with columns J, IRD, COND, LD.MDR, LD.IR, LD.BEN, LD.REG, LD.CC, LD.MAR, GatePC, GateMDR, GateALU, LD.PC, GateMARMUX, GateSHF, PCMUX, DRMUX, SR1MUX, ADDR1MUX, ADDR2MUX, MARMUX, ALUK, MIO.EN, R.W, DATA.SIZE, LSHF1]

62 LC-3b Microsequencer
- Patt and Patel, Appendix C, Figure C.5
- The purpose of the microsequencer is to determine the address of the next microinstruction (i.e., next state)
  - Next state could be conditional or unconditional
- Next state address depends on 9 control signals (plus 7 data signals)
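The next-address computation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the textbook's circuit: it assumes the COND encoding 00 = unconditional, 01 = memory ready, 10 = branch, 11 = addressing mode, with the selected condition ORed into J[1], J[2], or J[0] respectively, and a dispatch address of 0,0,IR[15:12] when IRD is set. The function name `next_state_address` is hypothetical.

```python
def next_state_address(j, cond, ird, ben, r, ir):
    """Compute the 6-bit address of the next microinstruction.

    j:    6-bit J field of the current microinstruction
    cond: 2-bit COND field (assumed encoding, see the lead-in)
    ird:  1 if the next state comes from the opcode (decode state)
    ben:  branch enable bit from the datapath
    r:    memory ready bit
    ir:   16-bit instruction register value
    """
    if ird:
        return (ir >> 12) & 0xF          # 0,0,IR[15:12]
    addr = j
    if cond == 0b10 and ben:             # Branch: OR into J[2]
        addr |= 0b000100
    if cond == 0b01 and r:               # Ready: OR into J[1]
        addr |= 0b000010
    if cond == 0b11 and (ir >> 11) & 1:  # Addr. Mode: OR into J[0]
        addr |= 0b000001
    return addr


# State 33 waits for memory: it stays at 33 until R is set, then goes to 35.
print(next_state_address(33, 0b01, 0, 0, 0, 0))  # 33
print(next_state_address(33, 0b01, 0, 0, 1, 0))  # 35
```

The 7 data signals are visible in the signature: R, BEN, and the five bits IR[15:11] that the function extracts from `ir`.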

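63 [Figure: LC-3b microsequencer (same figure as slide 60): COND1, COND0 select Branch (BEN), Ready (R), or Addr. Mode (IR[11]), which is ORed into J[2], J[1], or J[0]; IRD selects between this address and 0,0,IR[15:12] as the Address of Next State]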

64 The Microsequencer: Some Questions
- When is the IRD signal asserted?
- What happens if an illegal instruction is decoded?
- What are condition (COND) bits for?
- How is variable latency memory handled?
- How do you do the state encoding?
  - Minimize number of state variables (~ control store size)
  - Start with the 16-way branch
  - Then determine constraint tables and states dependent on COND

65 An Exercise in Microprogramming

66 Handouts
- 7 pages of Microprogrammed LC-3b design
- infk/inst-infsec/system-security-group-dam/education/ Digitaltechnik_7/lecture/lc3b-figures.pdf

67 A Simple LC-3b Control and Datapath [Figure]

68 [Figure: LC-3b state machine (Patt and Patel, Appendix C, Figure C.2), same figure as slide 50. Fetch: states 18, 19 -> 33 -> 35; decode: state 32 with a branch on IR[15:12]; one chain of states per instruction, each returning to state 18.]

69 A Simple Datapath Can Become Very Powerful [Figure: LC-3b datapath (Patt and Patel, Appendix C, Figure C.3), same figure as slide 54]

70 State Machine for LDW [Figure: LC-3b microsequencer, as on slide 60]
Fill in the microinstructions for the 7 states for LDW:
State 18 (010010)
State 33 (100001)
State 35 (100011)
State 32 (100000)
State 6 (000110)
State 25 (011001)
State 27 (011011)
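The path that LDW takes through these seven states can be traced with a small simulation of the sequencing fields only. This is a sketch under assumptions, not the solution to the exercise: the (J, COND, IRD) values in `MICROCODE` are derived here from the state diagram, COND = 01 is assumed to mean "wait for memory ready", the 26 datapath bits are left out, and the fixed-latency memory model is hypothetical.

```python
# (J, COND, IRD) per state; assumed values derived from the state diagram.
MICROCODE = {
    18: (33, 0b00, 0),   # MAR <- PC, PC <- PC + 2
    33: (33, 0b01, 0),   # MDR <- M, wait for R
    35: (32, 0b00, 0),   # IR <- MDR
    32: (0,  0b00, 1),   # BEN <- ..., dispatch on IR[15:12]
    6:  (25, 0b00, 0),   # MAR <- B + LSHF(off6, 1)
    25: (25, 0b01, 0),   # MDR <- M[MAR], wait for R
    27: (18, 0b00, 0),   # DR <- MDR, set CC
}


def next_address(state, r, ir):
    """Sequencing for the states above (branch and addr. mode not needed)."""
    j, cond, ird = MICROCODE[state]
    if ird:
        return (ir >> 12) & 0xF
    if cond == 0b01 and r:
        j |= 0b000010                    # Ready is ORed into J[1]
    return j


def trace_ldw(mem_latency=1):
    """Return the states visited by one LDW, one entry per clock cycle."""
    ir = 0x6000                          # LDW opcode 0110 in IR[15:12]
    state, waited, trace = 18, 0, []
    while True:
        trace.append(state)
        r = False
        if MICROCODE[state][1] == 0b01:  # a memory access state
            waited += 1
            r = waited >= mem_latency    # hypothetical memory model
            if r:
                waited = 0
        state = next_address(state, r, ir)
        if state == 18:                  # back at fetch: LDW is done
            return trace


print(trace_ldw(1))  # [18, 33, 35, 32, 6, 25, 27]
print(trace_ldw(2))  # [18, 33, 33, 35, 32, 6, 25, 25, 27]
```

Note how the encodings make the wait loops work: 33 with J[1] set is 35, and 25 with J[1] set is 27.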

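71 [Figure: additional LC-3b logic: (a) DRMUX producing DR, with IR[11:9] as an input; (b) SR1MUX producing SR1 from IR[11:9] or IR[8:6]; (c) Logic producing BEN from IR[11:9] and N, Z, P]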


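73 [Figure: Simple Design of the Control Structure (same figure as slide 59): Microsequencer with inputs R, IR[15:11], BEN and 9 bits (J, COND, IRD); 6-bit address into the Control Store (2^6 x 35); 35-bit Microinstruction split into 26 datapath bits and 9 sequencing bits]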

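74 [Figure: LC-3b microsequencer (same figure as slide 60): COND1, COND0 select Branch (BEN), Ready (R), or Addr. Mode (IR[11]), which is ORed into J[2], J[1], or J[0]; IRD selects between this address and 0,0,IR[15:12] as the Address of Next State]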

75 [Figure: control store contents (same figure as slide 61), one row per state (State 0 to State 63), with columns J, IRD, COND, LD.MDR, LD.IR, LD.BEN, LD.REG, LD.CC, LD.MAR, GatePC, GateMDR, GateALU, LD.PC, GateMARMUX, GateSHF, PCMUX, DRMUX, SR1MUX, ADDR1MUX, ADDR2MUX, MARMUX, ALUK, MIO.EN, R.W, DATA.SIZE, LSHF1]

76 End of the Exercise in Microprogramming


CMU Introduction to Computer Architecture, Spring 2015 HW 2: ISA Tradeoffs, Microprogramming and Pipelining

CMU Introduction to Computer Architecture, Spring 2015 HW 2: ISA Tradeoffs, Microprogramming and Pipelining CMU 18-447 Introduction to Computer Architecture, Spring 2015 HW 2: ISA Tradeoffs, Microprogramming and Pipelining Instructor: Prof Onur Mutlu TAs: Rachata Ausavarungnirun, Kevin Chang, Albert Cho, Jeremie

More information

Department of Electrical and Computer Engineering The University of Texas at Austin

Department of Electrical and Computer Engineering The University of Texas at Austin Department of Electrical and Computer Engineering The University of Texas at Austin EE 360N, Fall 2004 Yale Patt, Instructor Aater Suleman, Huzefa Sanjeliwala, Dam Sunwoo, TAs Exam 1, October 6, 2004 Name:

More information

Review: Single-Cycle Processor. Limits on cycle time

Review: Single-Cycle Processor. Limits on cycle time Review: Single-Cycle Processor Jump 3:26 5: MemtoReg Control Unit LUControl 2: Op Funct LUSrc RegDst PCJump PC 4 uction + PCPlus4 25:2 2:6 2:6 5: 5: 2 3 WriteReg 4: Src Src LU Zero LUResult Write + PC

More information

CSE Computer Architecture I

CSE Computer Architecture I Execution Sequence Summary CSE 30321 Computer Architecture I Lecture 17 - Multi Cycle Control Michael Niemier Department of Computer Science and Engineering Step name Instruction fetch Instruction decode/register

More information

CPU DESIGN The Single-Cycle Implementation

CPU DESIGN The Single-Cycle Implementation CSE 202 Computer Organization CPU DESIGN The Single-Cycle Implementation Shakil M. Khan (adapted from Prof. H. Roumani) Dept of CS & Eng, York University Sequential vs. Combinational Circuits Digital circuits

More information

Computer Architecture Lecture 8: Pipelining. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/4/2013

Computer Architecture Lecture 8: Pipelining. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/4/2013 18-447 Computer Architecture Lecture 8: Pipelining Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/4/2013 Reminder: Homework 2 Homework 2 out Due February 11 (next Monday) LC-3b microcode ISA

More information

Computer Engineering Department. CC 311- Computer Architecture. Chapter 4. The Processor: Datapath and Control. Single Cycle

Computer Engineering Department. CC 311- Computer Architecture. Chapter 4. The Processor: Datapath and Control. Single Cycle Computer Engineering Department CC 311- Computer Architecture Chapter 4 The Processor: Datapath and Control Single Cycle Introduction The 5 classic components of a computer Processor Input Control Memory

More information

CENG3420 Lab 3-1: LC-3b Datapath

CENG3420 Lab 3-1: LC-3b Datapath CENG3420 Lab 3-1: LC-3b Datapath Bei Yu Department of Computer Science and Engineering The Chinese University of Hong Kong byu@cse.cuhk.edu.hk Spring 2018 1 / 22 Overview Introduction Lab3-1 Assignment

More information

ECE290 Fall 2012 Lecture 22. Dr. Zbigniew Kalbarczyk

ECE290 Fall 2012 Lecture 22. Dr. Zbigniew Kalbarczyk ECE290 Fall 2012 Lecture 22 Dr. Zbigniew Kalbarczyk Today LC-3 Micro-sequencer (the control store) LC-3 Micro-programmed control memory LC-3 Micro-instruction format LC -3 Micro-sequencer (the circuitry)

More information

Combinational vs. Sequential. Summary of Combinational Logic. Combinational device/circuit: any circuit built using the basic gates Expressed as

Combinational vs. Sequential. Summary of Combinational Logic. Combinational device/circuit: any circuit built using the basic gates Expressed as Summary of Combinational Logic : Computer Architecture I Instructor: Prof. Bhagi Narahari Dept. of Computer Science Course URL: www.seas.gwu.edu/~bhagiweb/cs3/ Combinational device/circuit: any circuit

More information

Spiral 1 / Unit 3

Spiral 1 / Unit 3 -3. Spiral / Unit 3 Minterm and Maxterms Canonical Sums and Products 2- and 3-Variable Boolean Algebra Theorems DeMorgan's Theorem Function Synthesis use Canonical Sums/Products -3.2 Outcomes I know the

More information

L07-L09 recap: Fundamental lesson(s)!
