CSE Computer Architecture I

Similar documents
Computer Engineering Department. CC 311- Computer Architecture. Chapter 4. The Processor: Datapath and Control. Single Cycle

CPU DESIGN The Single-Cycle Implementation

L07-L09 recap: Fundamental lesson(s)!

Designing MIPS Processor

Design of Digital Circuits Lecture 14: Microprogramming. Prof. Onur Mutlu ETH Zurich Spring April 2017

CPSC 3300 Spring 2017 Exam 2

Spiral 1 / Unit 3

Project Two RISC Processor Implementation ECE 485

Instruction register. Data. Registers. Register # Memory data register

Topics: A multiple cycle implementation. Distributed Notes

Implementing the Controller. Harvard-Style Datapath for DLX

Processor Design & ALU Design

Simple Instruction-Pipelining (cont.) Pipelining Jumps

Formal Verification of Systems-on-Chip

[2] Predicting the direction of a branch is not enough. What else is necessary?

Review. Combined Datapath

Pipelining. Traditional Execution. CS 365 Lecture 12 Prof. Yih Huang. add ld beq CS CS 365 2

1. (2 )Clock rates have grown by a factor of 1000 while power consumed has only grown by a factor of 30. How was this accomplished?

[2] Predicting the direction of a branch is not enough. What else is necessary?

3. (2) What is the difference between fixed and hybrid instructions?

ECE290 Fall 2012 Lecture 22. Dr. Zbigniew Kalbarczyk

4. (3) What do we mean when we say something is an N-operand machine?

A Second Datapath Example YH16

Review: Single-Cycle Processor. Limits on cycle time

EC 413 Computer Organization

Formal Verification of Systems-on-Chip Industrial Practices

Outcomes. Spiral 1 / Unit 2. Boolean Algebra BOOLEAN ALGEBRA INTRO. Basic Boolean Algebra Logic Functions Decoders Multiplexers

TEST 1 REVIEW. Lectures 1-5

Design. Dr. A. Sahu. Indian Institute of Technology Guwahati

Simple Instruction-Pipelining. Pipelined Harvard Datapath

Enrico Nardelli Logic Circuits and Computer Architecture

Figure 4.9 MARIE s Datapath

61C In the News. Processor Design: 5 steps

Simple Instruction-Pipelining. Pipelined Harvard Datapath

CSE Computer Architecture I

COVER SHEET: Problem#: Points

CSCI-564 Advanced Computer Architecture

Computer Architecture

Lecture 13: Sequential Circuits, FSM

Building a Computer. Quiz #2 on 10/31, open book and notes. (This is the last lecture covered) I wonder where this goes? L16- Building a Computer 1

Clock T FF1 T CL1 T FF2 T T T FF T T FF T CL T FF T CL T FF T T FF T T FF T CL. T cyc T H. Clock T FF T T FF T CL T FF T T FF T CL.

Control. Control. the ALU. ALU control signals 11/4/14. Next: control. We built the instrument. Now we read music and play it...

Lecture 13: Sequential Circuits, FSM


Designing Single-Cycle MIPS Processor

CMP 338: Third Class

Department of Electrical and Computer Engineering The University of Texas at Austin

CMP 334: Seventh Class

UNIVERSITY OF WISCONSIN MADISON

中村宏 工学部計数工学科 情報理工学系研究科システム情報学専攻 計算システム論第 2 1

ECEN 651: Microprogrammed Control of Digital Systems Department of Electrical and Computer Engineering Texas A&M University

Arithmetic and Logic Unit First Part

Computer Architecture Lecture 8: Pipelining. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/4/2013

System Data Bus (8-bit) Data Buffer. Internal Data Bus (8-bit) 8-bit register (R) 3-bit address 16-bit register pair (P) 2-bit address

Microprocessor Power Analysis by Labeled Simulation

Logic and Computer Design Fundamentals. Chapter 8 Sequencing and Control

CSc 256 Midterm 2 Fall 2010

Computer Architecture. ECE 361 Lecture 5: The Design Process & ALU Design. 361 design.1

Computer Architecture

CprE 281: Digital Logic

Logic Design. CS 270: Mathematical Foundations of Computer Science Jeremy Johnson

EXAMPLES 4/12/2018. The MIPS Pipeline. Hazard Summary. Show the pipeline diagram. Show the pipeline diagram. Pipeline Datapath and Control

Computer Architecture ELEC2401 & ELEC3441

CMU Introduction to Computer Architecture, Spring 2015 HW 2: ISA Tradeoffs, Microprogramming and Pipelining

CHAPTER log 2 64 = 6 lines/mux or decoder 9-2.* C = C 8 V = C 8 C * 9-4.* (Errata: Delete 1 after problem number) 9-5.

Table of Content. Chapter 11 Dedicated Microprocessors Page 1 of 25

Computer Science. Questions for discussion Part II. Computer Science COMPUTER SCIENCE. Section 4.2.

CMPEN 411 VLSI Digital Circuits Spring Lecture 19: Adder Design

Combinational vs. Sequential. Summary of Combinational Logic. Combinational device/circuit: any circuit built using the basic gates Expressed as

Basic Computer Organization and Design Part 3/3

Verilog HDL:Digital Design and Modeling. Chapter 11. Additional Design Examples. Additional Figures

Lecture 3, Performance

Lecture 3, Performance

ECE 3401 Lecture 23. Pipeline Design. State Table for 2-Cycle Instructions. Control Unit. ISA: Instruction Specifications (for reference)

Implementing Absolute Addressing in a Motorola Processor (DRAFT)

Department of Electrical and Computer Engineering University of Wisconsin - Madison. ECE/CS 752 Advanced Computer Architecture I.

Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1>

From Sequential Circuits to Real Computers

CPS 104 Computer Organization and Programming Lecture 11: Gates, Buses, Latches. Robert Wagner

ALU A functional unit

CMPEN 411 VLSI Digital Circuits Spring Lecture 21: Shifters, Decoders, Muxes

CMP N 301 Computer Architecture. Appendix C

COE 328 Final Exam 2008

Name: ID# a) Complete the state transition table for the aforementioned circuit

Load. Load. Load 1 0 MUX B. MB select. Bus A. A B n H select S 2:0 C S. G select 4 V C N Z. unit (ALU) G. Zero Detect.

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)

From Sequential Circuits to Real Computers

Introduction to Computer Engineering. CS/ECE 252, Fall 2012 Prof. Guri Sohi Computer Sciences Department University of Wisconsin Madison

Digital Design. Register Transfer Specification And Design

Digital Logic. CS211 Computer Architecture. l Topics. l Transistors (Design & Types) l Logic Gates. l Combinational Circuits.

2

Section 3: Combinational Logic Design. Department of Electrical Engineering, University of Waterloo. Combinational Logic

Systems I: Computer Organization and Architecture

Digital Logic: Boolean Algebra and Gates. Textbook Chapter 3

CA Compiler Construction

Fundamentals of Computer Systems

On my honor, as an Aggie, I have neither given nor received unauthorized aid on this academic work

Binary addition example worked out

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Transcription:

Execution Sequence Summary CSE 30321 Computer Architecture I Lecture 17 - Multi Cycle Control Michael Niemier Department of Computer Science and Engineering Step name Instruction fetch Instruction decode/register fetch Action for R-type instructions Action for memory-reference instructions IR = Mem[PC], PC = PC + 4 A =RF [IR[25:21]], B = RF [IR[20:16]], Action for branches ALUOut = PC + (sign-extend (IR[1:-0]) << 2) Action for jumps Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A =B) then PC = PC [31:28] computation, branch/ (IR[15:0]) PC = ALUOut (IR[25:0]<<2) jump completion Memory access or R-type RF [IR[15:11]] = Load: MDR = Mem[ALUOut] completion ALUOut or Store: Mem[ALUOut]= B Memory read completion Load: RF[IR[20:16]] = MDR 5-1 5-2 A Multiple Cycle Datapath Multiple Cycle Design P C A d d r e s s M e m o r y D a t a I n s t r u c t i o n o r d a t a I n s t r u c t i o n r e g i s t e r M e m o r y d a t a r e g i s t e r D a t a R e g i s t e r A R e g i s t e r s R e g i s t e r B R e g i s t e r W!!Where do we need to insert mux s?!!any other functional units? A B A L U A L U O u t!! Break up the instructions into steps, each step takes a cycle "! balance the amount of work to be done "! restrict each cycle to use only one major functional unit!! At the end of a cycle "! store values for use in later cycles (easiest thing to do) "! introduce additional internal registers P!C! M! u! x! 1! A!d!d!r!e!s!s! W! r!i!t!e! d!a!t!a! M!e!m!o!r!y! M!e!m!D!a!t!a! I!n!s!t!r!u!c!t!i!o!n! [!2!5!!2!1!]! I!n!s!t!r!u!c!t!i!o!n! [!2!!1!6!]! I!n!s!t!r!u!c!t!i!o!n! M! [!1!5!! ]! I!n!s!t!r!u!c!t!i!o!n! u! I!n!s!t!r!u!c!t!i!o!n! [!1!5!!1!1! ]! x! r!e!g! i!s!t!e!r! 1! I!n!s!t!r!u!c!t!i!o!n! [!1!5!!]! M!e!m!o!r!y! d!a!t!a! r!e!g!i!s!t!e!r! M! u! x! 1! R!e!a!d! r!e!g! i!s!t!e!r!1! R!e!a!d! R!e!a!d! r!e!g! i!s!t!e!r!2! d!a!t!a!1! R!e!g! i!s!t!e!r!s! W! r!i!t!e! r!e!g! i!s!t!e!r! R!e!a!d! d!a!t!a!2! W! r!i!t!e! d!a!t!a! 1!6! 3!2! S! i!g!n! e!x!t!e!n!d! S!h! i!f!t! l!e!f!t!2! A! B! 4! M! u! x! 1! 1!M! u! 2! x! 3! Z!e!r!o! A!L!U! A!L!U! r!e!s!u! l!t! A!L!U!O!u!t! 5-3 5-4

Control Signals Implementing the Control!! PC: PCWrite, PCWriteCond, PCSource!! Memory: IorD, MemRead, MemWrite!! Instruction Register: IRWrite!! Register File: RegWrite, MemtoReg, RegDst!! ALU: ALUSrcA, ALUSrcB, ALUOp,!! Value of control signals is dependent upon: "!what instruction is being executed "!which step is being performed!! How to represent all the information? "!finite state diagram "!microprogramming!!realization of a control unit is independent of the representation used "! Control outputs: random logic, ROM, PLA "! Next-state function: same as above or an explicit sequencer 5-5 5-6 Finite State Diagram Microprogramming as an Alternative 2 3 M e m o r y a d r e s s c o m p u t a t i o n A L U S r c A = 1 A L U S r c B = 10 A L U O p = 0 4 ' ) ( O p = ' L W M e m o r y a c c e s s M e m R e a d I o r D = 1 R e g D s t = 0 R e g W r i t e M e m t o R e g = 1 5 W r t e - b a c k s t e p S t a r t M e m o r y a c c e s s M e m W r i t e I o r D = 1 7 I n s t r u c t i o n f e t c h 0 M e m R e a d A L U S r c A = 0 I o r D = 0 I R W r i t e A L U S r c B = 01 A L U O p = 0 P C W r t e P C S o u r c e = 0 6 E x e c u t o n A L U S r c A = 1 A L U S r c B = 0 A L U O p = 10 R e g D s t = 1 R e g W r i t e M e m t o R e g = 0 B r a n c h c o m p l e t i o n 8 A L U S r c A = 1 A L U S r c B = 0 A L U O p = 01 P C W r t e C o n d P C S o u r c e = 01 R - t y p e c o m p l e t o n I n s t r u c t i o n d e t c o d e / r e g i s t e r f e tc h 1 i 9 A L U S r c A = 0 A L U S r c B = 1 A L U O p = 0 ' ) ( O p = J J u m p c o m p l e t o n P C W r i t e P C S o u r c e = 10!! Control unit can easily reach thousands of states with hundreds of different sequences. "! A large set of instructions and/or instruction classes (x86) "! Different implementations with different cycles per instruction!! Flexibility may be needed in the early design phase!! How about borrowing the ideas from what we just learned? "! Treat the set of control signals to be asserted in a state as an instruction to be executed (referred to as microinstructions) "! Treat state transitions as an instruction sequence "! Define formats (mnemonics) "! Specify control signals symbolically using microinstructions 5-7 5-8

Microprogramming as an Alternative (cont d)!! Each state => one microinstruction!! State transitions => microinstruction sequencing!! Setting up control signals => executing microinstructions!! To specify control, we just need to write microprograms (or microcode) S0 S1 S2 S3 Microinst Microinst Microinst N>=0 Microinst N<0 Microinstruction Format (1)!! Group the control signals according to how they are used!! For the 5-cycle MIPS organization: "! Memory: IorD, MemRead, MemWrite "! Instruction Register: IRWrite "! PC: PCWrite, PCWriteCond, PCSource "! Register File: RegWrite, MemtoReg, RegDst "! ALU: ALUSrcA, ALUSrcB, ALUOp!! Group them as follows: "! Memory (for both Memory and Instruction Register) "! PC write control (for PC) "! Register control (for Register File) "! ALU control "! SRC1 "! SRC2 "! Sequencing (for ALU) "! 5-9 5-10 Microinstruction Format (2) Field name! Value! Signals active! Comment! Add! ALU control! Sub! ALUOp = 0 Cause the ALU to add.! ALUOp = 01! Cause the ALU to subtract; this implements the compare for branches.! Func code!aluop = 1 Use the instruction's func to determine ALU control.! SRC1! PC! ALUSrcA = Use the PC as the first ALU input.! A! ALUSrcA = 1! Register A is the first ALU input.! B! ALUSrcB= 0Register B is the second ALU input.! SRC2! 4! ALUSrc = 01! Use 4 as the second ALU input.! Extend! Extshft! Read! ALUSrcB= 1Use output of the sign ext unit as the 2nd ALU input.! ALUSrcB= 11! Use output of shift-by-two unit as the 2nd ALU input.! Read two registers using the rs and rt fields of the IR! and putting the data into registers A and B.! Write RegWrite,! Write a register using the rd field of the IR as the Register! ALU! RegDst = 1,! register number and the contents of the ALUOut as the data.! control! MemtoReg= Write MDR! RegWrite,! Write a register using the rt field of the IR as the RegDst = 0,! register number and the contents of the MDR as the data. MemtoReg=1! 5-11 Microinstruction Format (3) Field name! Value! Signals active! Comment! Read PC! MemRead,! Read memory using the PC as address; lord = write result into IR (and the MDR).! Memory! Read ALU! MemRead,! Read memory using the ALUOut as address; lord = 1! write result into MDR.! PC write control! Write ALU! MemWrite,! ALU! ALUOutcond! jump address! lord = 1! Write memory using the ALUOut as address, contents of B as the data.! PCSource 0 Write the output of the ALU into the PC.! PCWrite! PCSource=01,!If the Zero output of the ALU is active, write the PC with the contents of the register ALUOut.! PCWriteCond! PCSource=10,!Write the PC with the jump address from the instruction.! PCWrite! Seq! AddrCtl = 11! Choose the next microinstruction sequentially.! Sequencing!Fetch! AddrCtl = 0 Go to the first microinstruction to a new instruction.! Dispatch 1! AddrCtl = 01! Dispatch using the ROM 1.! Dispatch 2! AddrCtl = 1 Dispatch using the ROM 2.! 5-12

Sample Microinstruction (1)!!IFetch: IR = Mem[PC], PC = PC+4 PCWrite: 1 PCWriteCond: IorD: 0 MemRead: 1 IRWrite: 1 MemtoReg: PCSource: ALUOp: 00 ALUSrcB: 01 ALUSrcA: 0 RegWrite: RegDst: AddrCtrl: 11 Microinstruction: ALUctrl SRC1 SRC2 RegCtrl Memory PCWrite Sequen Add PC 4 -- ReadPC ALU Seq Sample Microinstruction (2)!! Decode: A= RF[IR[25:21]], B= RF[IR[20:16]], PCWrite: PCWriteCond: IorD: MemRead: IRWrite: MemtoReg: PCSource: ALUOp: 00 ALUSrcB: 11 ALUSrcA: 0 RegWrite: RegDst: AddrCtrl: 01 ALUOut = PC + Sign_Ext(IR[15:0]) << 2); Microinstruction: ALUctrl SRC1 SRC2 RegCtrl Memory PCWrite Sequen! Add PC ExtShf Read #$%&!'! 5-13 5-14 Sample Microinstruction (3)!! BEQ1: -> Ifetch, PCWrite: PCWriteCond: 1 IorD: MemRead: IRWrite: MemtoReg: PCSource: 01 ALUOp: 01 ALUSrcB: 00 ALUSrcA: 1 RegWrite: RegDst: AddrCtrl: 00 if (A=B) then PC= ALUout Microinstruction: ALUctrl SRC1 SRC2 RegCtrl Memory PCWrite Sequen! Sub A B ALUout-cond Fetch! Exercise: Compute Memory Address!! Mem1: -> MRead/RegWrite, PCWrite: PCWriteCond: IorD: MemRead: IRWrite: MemtoReg: PCSource: ALUOp: 00 ALUSrcB: 10 ALUSrcA: 1 RegWrite: RegDst: AddrCtrl: 10 ALUOut = A + Sign_Ext(IR[15:0]); Microinstruction: ALUctrl SRC1 SRC2 RegCtrl Memory PCWrite Sequen Add A Extend Disp 2! 5-15 5-16

Put It All Together Control Implementations Label ALU control SRC1 SRC2 Register control Memory PCWrite control Sequencing Fetch Add PC 4 Read PC ALU Seq Add PC Extshft Read Dispatch 1 Mem1 Add A Extend Dispatch 2 LW2 Read ALU Seq Write MDR Fetch SW2 Write ALU Fetch Rformat1 Func code A B Seq Write ALU Fetch BEQ1 Sub A B ALUOut-cond Fetch JUMP1 Jump address Fetch! What would a microassembler do?!! The big picture: O p 5 O p 4 ()*+,)-!.)/$ How to implement the control logic?! Random logic, memory-baed, mux-based,... O p 3 O p 2 3*&2+! O p 1 O p 0 3*%+,20+$)*!45/$%+5,! 1&0)65!7$5-6! S 3 12+&2+! S 2 S 1 S 0 S t a t e r e g i s t e r P C W r i t e P C W r i t e C o n d I o r D M e m R e a d M e m W r i t e I R W r i t e M e m t o R e g P C S o u r c e A L U O p A L U S r c B A L U S r c A R e g W r i t e R e g D s t N S 3 N S 2 N S 1 N S 0 5-17 5-18 Memory-Based Implementation!!Important factors to consider when using a memory: "! How many address lines? 6 bits for opcode, 4 bits for state = 10 address lines "! How many output bits? 16 datapath-control outputs, 4 state bits = 20 outputs "! So the ROM size is 2 10 x 20 = 20K bits!!how many entries (or addresses) contain distinct values? "! many outputs are the same or don t cares so can be rather wasteful Alternatives for Control Implementation!!Use hardwired random logic "! Efficient especially if you have a good CAD tool (not Xilins, ok) "! Not as flexible as memory-based!!can we do better? "! Use an explicit sequencer so avoid storing unused entries 5-19 5-20

Explicit Sequencer Address Select Logic!!How many input lines? "!4!!How many output lines? "!No. of control outs + No. of AddrCtrl '! ;665,! :.;!!),!41<! 3*&2+! 8+9+5! 12+&2+! ;66,5%%!85-50+! ;66,(+,-! :(=,$+5! :(=,$+5()*6! 3),#! <5>4596! <5>=,$+5! 34=,$+5! <5>+)45/! :(8)2,05! ;.?1&! ;.?8,0@! ;.?8,0;! 45/$%=,$+5! 45/$%#%+! 3*%+,20+$)*!45/$%+5,!1:!A$5-6! Dispatch ROM 1 Op Opcode name Value 000000 R-format 0110 000010 jmp 1001 000100 beq 1000 100011 lw 0010 101011 sw 0010 1 Adder Dispatch ROM 2 PLA or ROM State MUX 3! 2! 1! O!p! Instruction Reg OP field Dispatch ROM 1 Addr Ctrl Addr Select Logic Dispatch ROM 2 Op Opcode name Value 100011 lw 0011 101011 sw 0101 5-21 5-22 Address Control Action The Complete Design State No. Address-control action Value of AddrCtl 0 Use incremented state 3 1 Use dispatch ROM 1 1 2 Use dispatch ROM 2 2 3 Use incremented state 3 4 Replace state number by 0 0 5 Replace state number by 0 0 6 Use incremented state 3 7 Replace state number by 0 0 8 Replace state number by 0 0 9 Replace state number by 0 0 C o n t r o l u n i t 1 A d d e r ()*+,)-!8+),5! O u t p u t s I n p u t M i c r o p r o g r a m c o u n t e r A d d r e s s s e l e c t l o g i c O p [ 5 0 ] P C W r i t e P C W r i t e C o n d I o r D M e m R e a d M e m W r i t e I R W r i t e M e m t o R e g P C S o u r c e A L U O p A L U S r c B A L U S r c A R e g W r i t e R e g D s t A d d r C t l D a t a p a t h! I n s t r u c t i o n r e g i s t e r o p c o d e f i e l d How to implement the control store? PLA, ROM? 5-23 5-24

Microcode: Trade-offs!! Distinction between specification and implementation is sometimes blurred!! Specification Advantages: "! Easy to design and write "! Design architecture and microcode in parallel!! Implementation (off-chip ROM) Advantages "! Easy to change since values are in memory "! Can emulate other architectures "! Can make use of internal registers!! Implementation disadvantages, SLOWER now that: "! Control is implemented on same chip as processor "! ROM is no longer faster than RAM "! No need to go back and make changes Exceptions!! Exceptions: unexpected events from within the processor "! arithmetic overflow "! undefined instruction "! switching from user program to OS!! Interrupts: unexpected events from outside of the processor "! I/O request!! Consequence: alter the normal flow of instruction execution!! Key issues: "! detection "! action #!save the address of the offending instruction in the EPC #!transfer control to OS at some specified address!! Exception type indication: "! status register "! interrupt vector 5-25 5-26 Exception Handling Datapath with Exception Handling!! Types of exceptions considered: "! undefined instruction "! arithmetic overflow!! MIPS implementation: "! EPC: 32-bit register, EPCWrite "! Cause register: 32-bit register, CauseWrite #!undefined instruction: Cause register = 0 #!arithmetic overflow: Cause register = 1 "! IntCause: 1 bit control "! Exception Address: C0000000 (hex)!! Detection: "! undefined instruction: op value with no next state "! arithmetic overflow: overflow from ALU!! Action: "! set EPC and Cause register "! set PC to Exception Address 5-27 5-28

FSM with Exception Handling 5-29