Designing MIPS Processor

CSE 675.02: Introduction to Computer Architecture
Designing MIPS Processor (Multi-cycle)
Presentation H
Reading Assignment: 5.5, 5.6
G. Babic

Multi-cycle Design Principles

- Break up the execution of each instruction into steps. The number of steps and the tasks in each step are instruction dependent.
- Each step takes one clock cycle.
- Balance the amount of work to be done in each clock cycle.
- Restrict each cycle to use only one major functional unit in the datapath; if more than one major functional unit is used, they should be used in parallel.
- The major units are the memory, the register file and the ALU, since we assume that they introduce the most significant delays during execution of instructions. We assume that all other delays, such as those in the wiring, are negligible.

Multi-cycle Design Principles (cont.)

- During execution of any instruction, we may reuse functional units, but in different steps (clock cycles), e.g.:
  - A single memory can be used for both instructions and data.
  - The ALU computes not only what it computed in the single-cycle design (e.g. lw and sw addresses and R-type instruction calculations), but is also used to increment the PC (by 4) and to calculate the branch target address.
- Control signals are not determined solely by the instruction in execution (i.e. its op-code and/or function code) but also by the particular clock cycle the instruction is being executed in.
- At the end of each cycle during instruction execution, store intermediate values for use in later cycles. For that purpose, introduce additional internal registers.

Elaboration on Work Balance in Each Step

- During any given step, a serial combination of uses of the major functional units is not allowed. For example:
  - It is not allowed that in one step the contents of registers are read from the register file and then used as operands for the ALU in the same step.
  - It is not allowed that in one step the ALU performs a function on some operands and its result is used as an address for a memory read or write in the same step.
- This principle is introduced so that no step requires too much time, which would force every clock cycle to be unnecessarily long.
- Notice that two of the major functional units may be used in parallel, e.g. reading contents from the register file while the ALU performs a function on unrelated data at the same time.

Five Steps in Instruction Execution

The major steps in the execution of an instruction are:
1. Instruction Fetch
2. Instruction Decode and Register Fetch
3. Execution, Memory Address Computation, or Branch Completion
4. Memory Access or R-type Instruction Completion
5. Write-back Step

Not every instruction has all of these steps. Our instructions take 3-5 steps, i.e. 3-5 clock cycles. The first two steps are common to all instructions.

Multi-cycle Datapath: High Level View

[Figure 5.25: high-level view of the multi-cycle datapath]

The use of shared functional units requires new temporary registers that hold data between clock cycles of the same instruction. The additional registers are: Instruction register (IR), Memory data register (MDR), A and B registers, and the ALUOut register.
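Because instructions take between 3 and 5 clock cycles, the average number of cycles per instruction depends on the instruction mix. The sketch below is a minimal illustration: the cycle counts follow the step breakdown in these slides (load 5, store and R-type 4, branch and jump 3), while the function name `average_cpi` and the instruction mix are hypothetical and chosen only for the example.

```python
# Clock cycles per instruction class in the multi-cycle design.
CYCLES = {"lw": 5, "sw": 4, "r-type": 4, "beq": 3, "j": 3}


def average_cpi(mix):
    """Return the weighted average cycles per instruction.

    `mix` maps an instruction class to its fraction of executed instructions.
    """
    total = sum(mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("instruction mix fractions must sum to 1")
    return sum(CYCLES[name] * fraction for name, fraction in mix.items())


# Hypothetical mix, for illustration only (not taken from the slides).
example_mix = {"lw": 0.25, "sw": 0.10, "r-type": 0.50, "beq": 0.10, "j": 0.05}
print(f"average CPI = {average_cpi(example_mix):.2f}")
```

A mix with more loads pushes the average toward 5, while a branch-heavy mix pulls it toward 3.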

Multi-cycle Datapath: Detailed View

[Figure 5.27 with additions in red: multi-cycle datapath showing the PC, Memory, Instruction register, Memory data register, Registers, A and B registers, sign extend, shift left 2, ALU, ALU control and ALUOut, with the control signals IorD, MemRead, MemWrite, IRWrite, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALUOp and MemtoReg]

Multi-cycle Datapath and Control

[Figure 5.28: the same datapath with the control unit added. Control inputs: Op (instruction bits 31-26). Control outputs: PCWriteCond, PCWrite, PCSource, IorD, MemRead, MemWrite, MemtoReg, IRWrite, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst. The jump address is formed from PC[31-28] and the instruction bits [25-0] shifted left by 2.]

Step 1: Instruction Fetch

- Use the PC to get the instruction and put it in the Instruction Register:
  IR <- Memory[PC]; IorD=0, MemRead, IRWrite
- Increment the PC by 4 and put the result back in the PC:
  PC <- PC + 4; ALUSrcA=0, ALUSrcB=01, ALUOp=00, PCSource=00, PCWrite

Rules for signals that are omitted:
- If a signal for a mux is not stated, it is a don't care.
- If ALU signals are not stated, they are don't cares.
- If MemRead, MemWrite, RegWrite, IRWrite, PCWrite or PCWriteCond is not stated, it is unasserted, i.e. logical 0.

Step 2: Instruction Decode & Register Fetch

- We are not setting any control lines based on the instruction type, since we are busy "decoding" it in our control logic.
- Read registers rs and rt in case we need them (done automatically):
  A <- Reg[IR[25-21]]; B <- Reg[IR[20-16]]
- Compute the branch address in case the instruction is a branch:
  ALUOut <- PC + (sign-extend(IR[15-0]) << 2); ALUSrcA=0, ALUSrcB=11, ALUOp=00
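The register transfers of the first two steps can be sketched in Python. This is a minimal illustration, not the course's implementation: memory is modeled as a hypothetical dictionary from byte address to 32-bit word, and the function names `fetch`, `decode` and `sign_extend16` are made up for the example. The field positions (rs in bits 25-21, rt in bits 20-16, immediate in bits 15-0) are the standard MIPS instruction layout.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def fetch(memory, pc):
    """Step 1: IR <- Memory[PC]; PC <- PC + 4."""
    ir = memory[pc]
    return ir, (pc + 4) & 0xFFFFFFFF


def decode(ir, pc, regs):
    """Step 2: A <- Reg[rs]; B <- Reg[rt]; ALUOut <- PC + (sign-extend(imm) << 2).

    `pc` is the already incremented PC produced by the fetch step.
    """
    rs = (ir >> 21) & 0x1F
    rt = (ir >> 16) & 0x1F
    a = regs[rs]
    b = regs[rt]
    alu_out = (pc + (sign_extend16(ir) << 2)) & 0xFFFFFFFF
    return a, b, alu_out


# Example: beq $1, $2, -2 stored at address 0x100 (encoding 0x1022FFFE).
memory = {0x100: 0x1022FFFE}
regs = [0] * 32
regs[1], regs[2] = 7, 9

ir, pc = fetch(memory, 0x100)
a, b, alu_out = decode(ir, pc, regs)
print(hex(pc), a, b, hex(alu_out))
```

Note that the branch target is computed for every instruction, whether or not it turns out to be a branch; the ALU is otherwise idle in this step, so the speculative computation costs nothing.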

Step 3: Execute, Memory Address Computation, or Branch Completion

The ALU performs one of three functions, based on the instruction type:
- Memory reference (lw or sw):
  ALUOut <- A + sign-extend(IR[15-0]); ALUSrcA=1, ALUSrcB=10, ALUOp=00
- R-type:
  ALUOut <- A op B; ALUSrcA=1, ALUSrcB=00, ALUOp=10
- Branch on equal:
  if (A==B) PC <- ALUOut; ALUSrcA=1, ALUSrcB=00, ALUOp=01, PCSource=01, PCWriteCond

Note: the beq instruction is now done, thus this instruction requires 3 clock cycles to execute.

Steps 4 and 5: Instruction Dependent

Step 4: R-type Completion and Memory Access
- Loads and stores access memory:
  MDR <- Memory[ALUOut] (load); IorD=1, MemRead
  or
  Memory[ALUOut] <- B (store); IorD=1, MemWrite
- R-type instructions finish:
  Reg[IR[15-11]] <- ALUOut; RegDst=1, MemtoReg=0, RegWrite
- The register write actually takes place at the end of the cycle, on the falling edge.
- Store and R-type instructions are done in 4 clock cycles.

Step 5: Write-back (load only)
  Reg[IR[20-16]] <- MDR; RegDst=0, MemtoReg=1, RegWrite
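To see how the five steps fit together, here is a minimal Python sketch that executes one instruction step by step and reports how many clock cycles it used. It is an illustration under stated assumptions, not the design from the slides: the CPU state is a hypothetical dictionary, memory maps byte addresses to 32-bit words, only lw, sw, beq and the R-type add/sub are handled, and writes to register 0 are not suppressed. The opcode and funct values are the standard MIPS encodings.

```python
LW, SW, RTYPE, BEQ = 0x23, 0x2B, 0x00, 0x04
MASK = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_instruction(cpu):
    """Run one instruction through its steps; return the clock cycles used."""
    mem, regs = cpu["mem"], cpu["regs"]
    # Step 1: IR <- Memory[PC]; PC <- PC + 4.
    ir = mem[cpu["pc"]]
    cpu["pc"] = (cpu["pc"] + 4) & MASK
    # Step 2: read rs and rt, and compute the branch target in case it is needed.
    op = (ir >> 26) & 0x3F
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]
    alu_out = (cpu["pc"] + (sign_extend16(ir) << 2)) & MASK
    # Step 3: execution, memory address computation, or branch completion.
    if op == BEQ:
        if a == b:
            cpu["pc"] = alu_out
        return 3
    if op in (LW, SW):
        alu_out = (a + sign_extend16(ir)) & MASK
    elif op == RTYPE:
        funct = ir & 0x3F
        if funct == 0x20:      # add
            alu_out = (a + b) & MASK
        elif funct == 0x22:    # sub
            alu_out = (a - b) & MASK
        else:
            raise ValueError(f"unsupported funct {funct:#x}")
    else:
        raise ValueError(f"unsupported opcode {op:#x}")
    # Step 4: memory access or R-type completion.
    if op == SW:
        mem[alu_out] = b
        return 4
    if op == RTYPE:
        regs[(ir >> 11) & 0x1F] = alu_out
        return 4
    mdr = mem[alu_out]
    # Step 5: write-back (load only).
    regs[(ir >> 16) & 0x1F] = mdr
    return 5


# lw $1,0x100($0); add $3,$1,$1; sw $3,0x104($0); beq $1,$1,+1
cpu = {"pc": 0, "regs": [0] * 32,
       "mem": {0: 0x8C010100, 4: 0x00211820, 8: 0xAC030104,
               12: 0x10210001, 0x100: 21}}
cycles = [execute_instruction(cpu) for _ in range(4)]
print(cycles, cpu["regs"][3], cpu["mem"][0x104], cpu["pc"])
```

The early `return` statements mirror the way each instruction class leaves the step sequence as soon as it is complete, which is what makes the cycle count instruction dependent.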

Summary of Instruction Executions (Figure 5.30)

Instruction fetch (all instructions):
  IR <- Memory[PC]
  PC <- PC + 4

Instruction decode/register fetch (all instructions):
  A <- Reg[IR[25-21]]
  B <- Reg[IR[20-16]]
  ALUOut <- PC + (sign-extend(IR[15-0]) << 2)

Execution, address computation, branch/jump completion:
  R-type: ALUOut <- A op B
  Memory reference: ALUOut <- A + sign-extend(IR[15-0])
  Branch: if (A == B) then PC <- ALUOut
  Jump: PC <- PC[31-28] || (IR[25-0] << 2)

Memory access or R-type completion:
  R-type: Reg[IR[15-11]] <- ALUOut
  Load: MDR <- Memory[ALUOut]
  Store: Memory[ALUOut] <- B

Memory read completion:
  Load: Reg[IR[20-16]] <- MDR

Note: a jump instruction has been added: PCSource=10, PCWrite.

Implementing Control

- The values of the control signals depend on:
  - what instruction is being executed, and
  - which step (i.e. clock cycle) is being performed.
- Use the information we have accumulated to specify a finite state machine (FSM):
  - specify the finite state machine graphically, or
  - use microprogramming.
- An implementation can then be derived from the specification.
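One way to make the "instruction plus step" dependence concrete is to write the controller as a table-driven finite state machine. The Python sketch below is a hedged illustration: the state names and the helper names `next_state` and `trace` are invented for the example, and the signal values are the usual textbook encodings for this datapath, assumed here wherever the slide text does not spell them out. Signals missing from a state's entry are unasserted (write/read enables) or don't cares (mux selects).

```python
# Control outputs asserted in each state; outputs depend only on the state.
OUTPUTS = {
    "fetch": {"MemRead": 1, "IorD": 0, "IRWrite": 1, "ALUSrcA": 0,
              "ALUSrcB": 0b01, "ALUOp": 0b00, "PCSource": 0b00, "PCWrite": 1},
    "decode": {"ALUSrcA": 0, "ALUSrcB": 0b11, "ALUOp": 0b00},
    "mem_addr": {"ALUSrcA": 1, "ALUSrcB": 0b10, "ALUOp": 0b00},
    "mem_read": {"MemRead": 1, "IorD": 1},
    "mem_write": {"MemWrite": 1, "IorD": 1},
    "load_writeback": {"RegDst": 0, "MemtoReg": 1, "RegWrite": 1},
    "execute": {"ALUSrcA": 1, "ALUSrcB": 0b00, "ALUOp": 0b10},
    "rtype_complete": {"RegDst": 1, "MemtoReg": 0, "RegWrite": 1},
    "branch_complete": {"ALUSrcA": 1, "ALUSrcB": 0b00, "ALUOp": 0b01,
                        "PCSource": 0b01, "PCWriteCond": 1},
    "jump_complete": {"PCSource": 0b10, "PCWrite": 1},
}

# Where the decode state goes next, selected by the instruction class.
AFTER_DECODE = {"lw": "mem_addr", "sw": "mem_addr", "r-type": "execute",
                "beq": "branch_complete", "j": "jump_complete"}


def next_state(state, op):
    """Next-state function: depends on the current state and the op-code."""
    if state == "fetch":
        return "decode"
    if state == "decode":
        return AFTER_DECODE[op]
    if state == "mem_addr":
        return "mem_read" if op == "lw" else "mem_write"
    if state == "mem_read":
        return "load_writeback"
    if state == "execute":
        return "rtype_complete"
    return "fetch"  # every completion state returns to instruction fetch


def trace(op):
    """List the states one instruction of class `op` passes through."""
    states = ["fetch"]
    while True:
        following = next_state(states[-1], op)
        if following == "fetch":
            return states
        states.append(following)


for op in AFTER_DECODE:
    print(op, trace(op))
```

The length of each trace is the number of clock cycles the instruction takes, so the table also documents why loads are the slowest instruction class.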

Finite State Machines

[Figure B.10.1: a finite state machine consisting of a Current state register driven by the Clock, a Next-state function that takes the Current state and the Inputs, and an Output function that produces the Outputs]

- The current state is kept in the Current state register.
- The Next state function and the Output function are determined by the Current state and the Inputs.
- In our case, the Output function is based only on the Current state.

Finite State Machine Graph for Control Unit

[Figure 5.38: state graph of the control unit. From Start, the machine enters Instruction fetch and then Instruction decode/Register fetch. From there: (Op = 'LW') or (Op = 'SW') leads to Memory address computation, followed by a Memory access state for the load (then the Write-back step) or a Memory access state for the store; (Op = R-type) leads to Execution, followed by R-type completion; (Op = 'BEQ') leads to Branch completion; (Op = 'J') leads to Jump completion. Each state is annotated with the control signals asserted in it, and every final state returns to Instruction fetch.]

Implementation of FSM for Control Unit

[Figure C.3.2: the control unit built from a block of control logic and a state register. Inputs: the op-code field of the instruction register (Op5-Op0) and the current state bits (S3-S0). Outputs: PCWrite, PCWriteCond, IorD, MemRead, MemWrite, IRWrite, MemtoReg, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst, and the next state bits (NS3-NS0).]

MIPS Interrupt Processing

We are implementing processing of only two exceptions: illegal op-code and integer overflow. When either exception occurs, the MIPS processor processes the exception (as any other interrupt) in the following steps:

- Step 1: The EPC register gets a value equal to the address of the faulty instruction.
- Step 2: PC <- 80000180 (hexadecimal), and the Cause register gets a code of the exception, with one code for illegal op-code and another for integer overflow.
- Step 3: The processor is now running in Kernel mode.

Note: we are not implementing step 3.
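The exception steps amount to three register updates, sketched below in Python. This is a minimal illustration under stated assumptions: the CPU state is a hypothetical dictionary, the cause is recorded as a descriptive string rather than a numeric code, and the switch to Kernel mode is left out, as in the design itself. The sketch also assumes the exception is detected after the fetch step has already incremented the PC, so the address of the faulty instruction is PC - 4; the handler address 0x80000180 is the standard MIPS exception vector.

```python
EXCEPTION_VECTOR = 0x80000180  # address of the exception handler


def take_exception(cpu, cause):
    """Record the exception and redirect the PC to the handler.

    `cause` is a placeholder label such as "illegal_opcode" or
    "integer_overflow"; real hardware stores a numeric code instead.
    """
    # The fetch step already incremented PC by 4, so the faulty
    # instruction lives at PC - 4.
    cpu["epc"] = (cpu["pc"] - 4) & 0xFFFFFFFF
    cpu["cause"] = cause
    cpu["pc"] = EXCEPTION_VECTOR
    return cpu


# Hypothetical state: the instruction at 0x00400024 overflowed.
cpu = {"pc": 0x00400028, "epc": 0, "cause": None}
take_exception(cpu, "integer_overflow")
print(hex(cpu["epc"]), cpu["cause"], hex(cpu["pc"]))
```

Saving the faulty address in EPC is what lets the handler inspect the offending instruction or resume execution after it.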

Multi-cycle Datapath for Exception Handling

[Figure 5.39 with corrections in red: the multi-cycle datapath and control extended for exceptions. Additions: the EPC and Cause registers, the control signals IntCause, CauseWrite and EPCWrite, an Overflow output from the ALU to the control unit, and the exception handler address as an extra input to the PC source mux.]

FSM Graph with Exception Handling

[Figure 5.40 with additions in red: the control state graph extended with two exception states. (Op = other) in Instruction decode/Register fetch leads to a state for the illegal op-code exception, and Overflow in R-type completion leads to a state for the integer overflow exception. Each exception state sets IntCause, asserts CauseWrite, EPCWrite and PCWrite, and sets ALUSrcA, ALUSrcB, ALUOp and PCSource so that the EPC and PC are loaded as described in the exception steps.]