EE 361L Fall 2010 Pipelined MIPS L0 (PMIPS L0) and Pipelined MIPS L (PMIPS L)


Last updated: November 8, 2010

1. Introduction

PMIPS L0 and PMIPS L are pipelined versions of MIPSL (for MIPS Lite). Appendix A has a description of the MIPSL processor. PMIPS L0 and PMIPS L have fewer instructions: R-type, addi, lw, sw, and beq. The other instructions (j, jal, and jr) are not implemented.

PMIPS L0 and PMIPS L have five pipeline stages, just as the pipelined MIPS in Chapter 4 of the textbook, as shown in Figure 1. PMIPS L0 only allows one instruction to be processed at a time. This means that during the execution of an instruction, the instruction fetch (IF) stage must stall and the controller must insert bubbles into the pipeline. This is explained in Section 2. Thus, PMIPS L0 doesn't really pipeline its instructions. PMIPS L does some pipelining, as described in Section 3.

Parts of the datapath are the same as in the single cycle MIPS except the controller, the pipeline stage registers, and perhaps a modification to the PC logic.

Figure 1. Five Stage Pipeline MIPS (Fig. 4.51 in textbook). All the sequential components are synchronized with the positive clock edge except the register file, which is synchronized with the negative clock edge.

2. PMIPS L0

For this processor, only one instruction is being processed at a time, each instruction taking exactly four clock cycles. The following are the states:

State 0, Instruction Fetch: PC = PC+2, and insert Bubble into the ID/EX stage.
State 1, Instruction Decode: PC holds its value, and the controller sets control signals to the ID/EX stage that depend on the opcode in IF/ID.
State 2, Bubble: PC holds its value, and insert Bubble into the ID/EX stage.
State 3, Bubble: PC holds its value, and insert Bubble into the ID/EX stage.

Figure 3 shows an example of beq progressing through the computer. Note that at State 2, the beq instruction is at the EX/MEM pipeline register. At this point, PCSrc (= EXMEMBranch & EXMEMALUZero) has the outcome of the condition of beq, i.e., PCSrc = 1 means the computer should branch. If PCSrc = 1 then this will cause the PC to load in the target branch address of beq at State 3.

[Figure 3 diagram: four pipeline snapshots, Cycle 1 (State 0) through Cycle 4 (State 3). From Cycle 2 on the PC is stalled at N, and beq moves one stage further each cycle with bubbles (Bub) entering behind it.]

Figure 3. Progression of beq instruction in PMIPS L0. Here bub is a bubble, and N is the next instruction after beq.

PC Logic: Note that the PC logic must be able to load the target branch address into the PC while the PC is in the stalled state. Think about how you can modify the PC logic by adding a 2:1 multiplexer. A solution is provided in Appendix B.

Controller: This is a sequential circuit with four states. It repeatedly goes through states 0, 1, 2, 3, 0, ... The controller inputs the opcode of the instruction from the IF/ID register. When the controller inserts a bubble, it must disable all the control lines. The critical control lines to disable are the ones that can change state values in registers or memory. These registers and memory are located in the register file, program counter, and data memory. Thus, to insert a bubble means that the register write and memory write must be disabled. We must also prevent the PC from being loaded inadvertently. Recall that during a stall, the controller will set the PC to a stalled state. Then the only way the PC can be inadvertently set to the wrong value is due to an inadvertent branch. To prevent this, set Branch = 0.

Pipeline Registers: These registers are basically just a bank of flip flops. There should be enough flip flops to load all the critical signal information. As an example, the ID/EX register has the following fields:

WB: {RegWrite, MemtoReg}
M: {MemRead, MemWrite, Branch}
EX: {ALUSrc, ALUOp, RegDst}
16 bit output of the sign extension
Outputs of the read data from the register file
Two register fields (5 bits each) from the instruction

There are two memories: instruction memory, which has the program, and data memory, which has RAM.

For EE 361L lab 5, you will implement PMIPS L0 in an FPGA. Also, your computer will need Input/Output (IO). The IO will be connected to other circuits on the Digilent Basys boards. In particular, the IO of your computer will be attached to a seven segment display and sliding switches. Your computer will need two IO ports, one for the display and the other for the switches. The port for the seven segment display is 0xfffa, and the port for the sliding switches is 0xfff0. These addresses are in the address space of data memory, so you will access the IO using the lw and sw instructions. The seven segment display can be written to, and the sliding switches can be read from. The sliding switches are labeled SW1 and SW0 and are the last two bits of the data read from 0xfff0.

The RAM portion of the data memory has addresses 64 up to 126. Note that since the memory is byte addressable and each memory word is now 16 bits (= 2 bytes), each memory word has an address divisible by two. The reason why the RAM has addresses from 64 to 126 rather than 0 to 62 is that I'm reusing this memory module that was used in another project. Anyway, it's good practice to work with memory that's at an odd location, since in real systems memory does not always start from address 0.
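The four-state controller of Section 2 can be sketched as a small behavioral model. This is only an illustration in Python, not the Verilog you will write for the lab. The function names l0_controller and run_cycles are made up for this sketch, and the Stall output follows the naming used in Appendix B.

```python
def l0_controller(state):
    """Return (next_state, stall, insert_bubble) for one clock cycle.

    Behavioral sketch only; signal names other than Stall are hypothetical.
    """
    stall = 0 if state == 0 else 1          # PC = PC+2 only in State 0
    insert_bubble = 0 if state == 1 else 1  # only State 1 sends real control signals to ID/EX
    next_state = (state + 1) % 4            # cycles 0, 1, 2, 3, 0, ...
    return next_state, stall, insert_bubble


def run_cycles(n, state=0):
    """Collect (state, stall, insert_bubble) for n consecutive cycles."""
    trace = []
    for _ in range(n):
        next_state, stall, bubble = l0_controller(state)
        trace.append((state, stall, bubble))
        state = next_state
    return trace
```

Calling run_cycles(4) walks through one full instruction: the fetch state, the decode state, and the two bubble states.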

3. PMIPS L

We will pipeline the instructions further but avoid any data or control hazards. The processor will have simple branch prediction, and in particular branch untaken (which means it predicts that the branch will not be taken). Here are the two parts of the processor to modify.

Simple Branch Prediction, Branch Untaken: We introduce an additional bit into the IF/ID pipeline register called Valid, i.e., call it IFIDValid. It indicates if the instruction in the register is valid. If it is invalid then the controller assumes it is a nop. By default, IFIDValid = 1. When a branch is executed at the MEM stage, it can cause PCSrc to become 1. This means that the computer will jump to the target branch address, i.e., PC = target branch address. Then all the instructions in the pipeline should be flushed out as follows: Put bubbles or clear the pipeline registers ID/EX and EX/MEM, and invalidate the value in the IF/ID register by setting IFIDValid = 0. The following figure illustrates what happens when a branch is taken.

[Figure: two pipeline snapshots. Cycle 1 shows beq in the pipeline. Cycle 2 shows the PC holding the target branch address, IF/ID marked Invalid, Bubbles in ID/EX and EX/MEM, and beq one stage further along.]

Pipelining and Avoiding Data Hazards: There are two types of data hazards, those that deal with register file values and those that deal with data memory values. Due to the load/store architecture of the 5 stage MIPS, there are no data hazards through data memory. Thus, we will only concern ourselves with data hazards through the register file. An example is as follows:

    add  $1,$2,$3
    sw   $4,10($1)
    addi $5,$1,37

Note that addi and sw are dependent on add via register $1. Thus, the addi and sw should be prevented from entering the pipeline until $1 is properly updated.

Instructions that write to a register are labeled data hazard instructions. The register that they write to will be referred to as the destination register. Among the five instructions (addi, R-type, lw, sw, and beq), the data hazard instructions are addi, R-type, and lw.

One way to avoid data hazards is to prevent instructions in the IF/ID register from entering the pipeline if there is a data hazard due to a data hazard instruction already downstream in the pipeline. For example, consider the following two instructions.

    add $1,$2,$3
    sw  $4,10($1)

Note that sw is dependent on add via register $1. Now suppose sw is at the IF/ID pipeline register. If add is at the ID/EX or EX/MEM pipeline registers then we should stall sw from entering into the processor pipeline, otherwise there will be an error. However, if add is in MEM/WB then the new value of $1 will be written back into the register file on the negative clock edge, and will be available for sw. Thus, sw can be sent into the pipeline processor.

More generally, if there is an instruction A in the IF/ID pipeline register, it should be prevented from entering the pipeline processor if there is a data hazard instruction B at ID/EX or EX/MEM and the destination register for B is one of the source registers of A.

To keep track of all this, the pipeline registers ID/EX and EX/MEM must be expanded. In particular, ID/EX needs new registers
o IDEXDataHazard, which is a single bit indicating whether the instruction at this stage is a data hazard instruction
o IDEXDst, which is a 3 bit value that is the destination register of the instruction.
Similarly, EX/MEM needs new registers EXMEMDataHazard and EXMEMDst.

To prevent an instruction from entering the pipeline from IF/ID, the controller must keep the PC from incrementing by 2 (however, it should allow a branch from a downstream pipeline stage), and insert a bubble into the ID/EX pipeline register. Note that a bubble implies setting IDEXDataHazard = 0 because it is a nop and cannot cause a data error. Otherwise, the controller should increment the PC by 2 and send the control signals of the instruction in IF/ID into ID/EX.

Here is an algorithm for the controller to follow. It has two levels of processing. The first level is to identify all the registers that should be compared to detect a data hazard.

Let IDEXCompare be a 4 bit value where

    if (IDEXDataHazard == 1 && IDEXDst != 0) IDEXCompare = IDEXDst
    else IDEXCompare = 8

Note that registers that can lead to data hazards range from 1 to 7 (register $0 cannot lead to a data hazard), so a value of 8 implies there is no data hazard at this stage.

Similarly, let EXMEMCompare be a 4 bit value where

    if (EXMEMDataHazard == 1 && EXMEMDst != 0) EXMEMCompare = EXMEMDst
    else EXMEMCompare = 8

Let Src1 and Src2 be 4 bit values that identify source registers for the instruction in the IF/ID register. Recall that some instructions have one source register (addi and lw), while others have two source registers (R-type, sw, and beq). If there is only one source register, Src2 = 9, which indicates there is no second source register. Note that the register indices can be found in the machine instruction in IF/ID. We refer to the first, second, and third register fields in the machine instruction by IFIDField1, IFIDField2, and IFIDField3.

    if (IFIDValid == 0) { Src1 = 9; Src2 = 9; }
    else {
        if (IFIDOpcode == R-type or SW or BEQ) { Src1 = IFIDField1; Src2 = IFIDField2; }
        else if (IFIDOpcode == LW or ADDI) { Src1 = IFIDField1; Src2 = 9; }
        else { Src1 = 9; Src2 = 9; }
    }

The second level of processing is to compare these register indices:

    if ((Src1 == IDEXCompare) || (Src2 == IDEXCompare) || (Src1 == EXMEMCompare) || (Src2 == EXMEMCompare)) {
        Stall and insert bubble;
    }
    else {
        PC = PC+2; insert instruction into ID/EX;
    }

Appendix C shows the pipelined processor executing a program that multiplies 3 and 5.

Your task is to implement this pipelined MIPSL. To modify your design from Part 1, consider the following:

You must modify your pipeline registers so they now have IFIDValid, IDEXDataHazard, EXMEMDataHazard, IDEXDst, and EXMEMDst.

Your controller in Part 1 was a sequential circuit with four states, which can be stored in a 2 bit register. But now your controller in Part 2 will be a combinational circuit. This means it has no clock input. On the other hand, it must have additional inputs for the following registers: IFIDValid, IDEXDataHazard, EXMEMDataHazard, IDEXDst, EXMEMDst, and the opcode and register fields from the instruction stored in IF/ID.

Note that the controller and pipeline registers will respond to inputs Reset and PCSrc. Reset is the highest priority input, then PCSrc.
o Reset: When this is asserted, PC = 0, IFIDValid = 0, IDEXDataHazard = 0, EXMEMDataHazard = 0, IDEXBranch = 0, and EXMEMBranch = 0. You may have to clear other registers too.
o PCSrc: You have to modify your pipeline registers so you can clear ID/EX and EX/MEM, as well as set IFIDValid to 0.

For Subproject 4, redesign your pipeline processor to be able to execute the program in 1.V, which is the multiply program in Appendix C. I will have the testbench and output trace available soon. It'll probably look like the testbench for Subproject 1.
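The two-level hazard check of Section 3 can be tried out in software before committing it to hardware. The sketch below is a Python model, not the combinational circuit itself. The opcode numbers come from the instruction table in Appendix A, and the function names (compare_value, source_regs, must_stall) and the tuple layout of the arguments are assumptions made for this illustration.

```python
NO_HAZARD = 8   # Compare value meaning "no data hazard at this stage"
NO_SOURCE = 9   # Src value meaning "no source register"

# Opcodes taken from the instruction table in Appendix A.
R_TYPE, BEQ, ADDI, LW, SW = 0, 2, 3, 5, 6


def compare_value(data_hazard, dst):
    """First level: register to compare against, or NO_HAZARD."""
    if data_hazard == 1 and dst != 0:
        return dst
    return NO_HAZARD


def source_regs(valid, opcode, field1, field2):
    """First level: (Src1, Src2) of the instruction in IF/ID."""
    if not valid:
        return NO_SOURCE, NO_SOURCE
    if opcode in (R_TYPE, SW, BEQ):
        return field1, field2
    if opcode in (LW, ADDI):
        return field1, NO_SOURCE
    return NO_SOURCE, NO_SOURCE


def must_stall(ifid, idex, exmem):
    """Second level: True if the instruction in IF/ID must be held back.

    ifid  = (valid, opcode, field1, field2)
    idex  = (data_hazard, dst)
    exmem = (data_hazard, dst)
    """
    src1, src2 = source_regs(*ifid)
    targets = (compare_value(*idex), compare_value(*exmem))
    return src1 in targets or src2 in targets
```

For the add/sw example in the text, must_stall((1, SW, 1, 4), (1, 1), (0, 0)) reports a stall while add is in ID/EX. Because 8 and 9 never match each other, bubbles and invalid instructions fall through without a special case.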

Appendix A

MIPSL Description: Its data and address buses are 16 bits wide rather than 8. All instructions and operands (except ASCII characters) are 16 bits, i.e., words are 16 bits. Memory is byte addressable and is organized as Big Endian. Note that memory addresses of words are divisible by two.

General Purpose Registers

Name       Number   Usage                                          Preserved on call?
$zero      0        the constant 0                                 n.a.
$v0-$v1    1-2      values for results and expression evaluation   No
$t0-$t2    3-5      temporaries                                    No
$sp        6        stack pointer                                  Yes
$ra        7        return address                                 No

Instruction Formats

Name         Fields                                   Comments
Field Size   3 bits  3 bits  3 bits  3 bits  4 bits   All MIPS L instructions 16 bits
R format     op      rs      rt      rd      funct    Arithmetic instruction format
I format     op      rs      rt      Address/immediate   Transfer, branch, immediate format
J format     op      target address                   Jump instruction format

Instructions:

Name   Format   Example (3 bits, 3 bits, 3 bits, 3 bits, 4 bits)   Comments
add    R        0  2  3  1  3                                      add $1,$2,$3
sub    R        0  2  3  1  2                                      sub $1,$2,$3
and    R        0  2  3  1  6                                      and $1,$2,$3
or     R        0  2  3  1  7                                      or $1,$2,$3
slt    R        0  2  3  1  4                                      slt $1,$2,$3
jr     R        0  7  0  0  5                                      jr $7
lw     I        5  2  1  100                                       lw $1,100($2)
sw     I        6  2  1  100                                       sw $1,100($2)
beq    I        2  1  2  (offset to 100)/2                         beq $1,$2,100
addi   I        3  2  1  100                                       addi $1,$2,100
j      J        7  5000                                            j 10000
jal    J        1  5000                                            jal 10000. Loads return address in $7

Note that jal loads the return address into $7, and that $0 is always equal to zero.
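As a quick check on the formats above, the fields can be packed into a 16 bit word in software. This Python sketch assumes the fields are laid out left to right in the order listed, with op in the three most significant bits, and that the 7 bit immediate is stored in two's complement. Neither detail is spelled out in the tables, so treat both as assumptions. The helper names encode_r and encode_i are made up for this example.

```python
def encode_r(op, rs, rt, rd, funct):
    """Pack an R format instruction: op(3) rs(3) rt(3) rd(3) funct(4)."""
    return ((op & 7) << 13) | ((rs & 7) << 10) | ((rt & 7) << 7) | ((rd & 7) << 4) | (funct & 0xF)


def encode_i(op, rs, rt, imm):
    """Pack an I format instruction: op(3) rs(3) rt(3) immediate(7).

    The immediate is masked to 7 bits (assumed two's complement).
    """
    return ((op & 7) << 13) | ((rs & 7) << 10) | ((rt & 7) << 7) | (imm & 0x7F)


# add $1,$2,$3 uses the table row: 0 2 3 1 3
add_word = encode_r(0, 2, 3, 1, 3)

# addi $2,$0,3 (first instruction of the multiply program): op=3, rs=0, rt=2
addi_word = encode_i(3, 0, 2, 3)

print(f"{add_word:04x} {addi_word:04x}")
```

Under these assumptions the two example words come out as 0993 and 6103 in hexadecimal.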

Appendix B

Note that the PC logic has to be able to load the target branch address into the PC while the PC is in the stalled state. This can be done as follows. Take a look at Figure 1 and the circuits at the PC. Note that the PC input has a 2:1 multiplexer with select input PCSrc. For all instructions except beq, PCSrc = 0 so that the PC is incremented by 4. (Figure 1 shows the textbook increment of 4; in PMIPS L0 and PMIPS L the increment is 2, as in Sections 2 and 3.) However, when the beq is being executed, PCSrc may equal 1 if the branch test is true. Then the branch target address is loaded from input 1 of the multiplexer.

We can modify this by inserting a new 2:1 multiplexer. The output of this multiplexer is attached to input 0 of the old multiplexer. The new multiplexer has its input 0 connected to the adder circuit that is equal to PC+4. Selecting this input would increment the PC. The new multiplexer's input 1 is connected to the output of the PC. Selecting this input would keep the PC value the same. The select input of this new multiplexer should be called Stall. Stall is connected to the controller. When the controller wants to stall the instruction fetch, it sets Stall to 1. Then the PC value stays the same. If the controller doesn't want to stall the instruction fetch then it sets Stall to 0. Then the PC would normally be incremented by 4 unless there is a branch.
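The two cascaded multiplexers can be modeled in a few lines to confirm that a branch still gets through while the PC is stalled. This is a behavioral sketch in Python with made-up function names (mux2, next_pc). The default step of 2 is taken from Sections 2 and 3, and passing step=4 reproduces the textbook datapath of Figure 1.

```python
def mux2(select, input0, input1):
    """2:1 multiplexer: returns input1 when select is 1, else input0."""
    return input1 if select else input0


def next_pc(pc, stall, pcsrc, branch_target, step=2):
    """Value loaded into the PC at the next clock edge."""
    # New multiplexer: hold the PC when Stall = 1, otherwise increment it.
    held_or_incremented = mux2(stall, pc + step, pc)
    # Old multiplexer: PCSrc = 1 overrides with the branch target address.
    return mux2(pcsrc, held_or_incremented, branch_target)
```

Since the new multiplexer feeds input 0 of the old one, PCSrc has priority over Stall, which is the behavior the PC logic needs in State 2 of PMIPS L0.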

Appendix C

 0: L0  addi $2,$0,3        // addi1
 2:     add  $4,$0,$0
 4: L1  beq  $2,$0,L0       // beq1
 6:     addi $4,$4,5        // addi2
 8:     addi $2,$2,-1       // addi3
10:     beq  $0,$0,L1       // beq2
12:     Next instruction    // N1 some random instruction
14:     Next instruction 2  // N2 some random instruction
16:     Next instruction 3  // N3 some random instruction, etc.

PC   Memory   IF/ID          ID/EX          EX/MEM         MEM/WB   Comments: actions by controller and datapath
     Output   (src reg1,     [DataHaz,      [DataHaz,
              src reg2)      Dst]           Dst]
0    addi1
2    add      addi1 (0, )
4    beq1     add (0,0)      addi1 [1,2]
6    addi2    beq1 (2,0)     add [1,4]      addi1 [1,2]             Stall: hazard via $2
6    addi2    beq1 (2,0)     Bubble         add [1,4]      addi1
8    addi3    addi2 (4, )    beq1 [, ]      Bubble         add
10   beq2     addi3 (2, )    addi2 [1,4]    beq1 [, ]      Bubble   Not branching
12   N1       beq2 (0,0)     addi3 [1,2]    addi2 [1,4]    beq1
14   N2       N1             beq2 [0, ]     addi3 [1,2]    addi2
16   N3       N2             N1             beq2 [0, ]     addi3    Branching to address 4, PCSrc = 1
4    beq1     Invalid        Bubble         Bubble         beq2
6    addi2    beq1 (2,0)     Bubble         Bubble         Bubble
8    addi3    addi2 (4, )    beq1 [, ]      Bubble         Bubble
10   beq2     addi3 (2, )    addi2 [1,4]    beq1 [, ]      Bubble   Not branching
12   N1       beq2 (0,0)     addi3 [1,2]    addi2 [1,4]    beq1
14   N2       N1             beq2 [0, ]     addi3 [1,2]    addi2
16   N3       N2             N1             beq2 [0, ]     addi3    Branching to address 4, PCSrc = 1
4    beq1     Invalid        Bubble         Bubble         beq2
6    addi2    beq1 (2,0)     Bubble         Bubble         Bubble
8    addi3    addi2 (4, )    beq1 [, ]      Bubble         Bubble
10   beq2     addi3 (2, )    addi2 [1,4]    beq1 [, ]      Bubble   Not branching
12   N1       beq2 (0,0)     addi3 [1,2]    addi2 [1,4]    beq1
14   N2       N1             beq2 [0, ]     addi3 [1,2]    addi2
16   N3       N2             N1             beq2 [0, ]     addi3    Branching to address 4, PCSrc = 1
4    beq1     Invalid        Bubble         Bubble         beq2
6    addi2    beq1 (2,0)     Bubble         Bubble         Bubble
8    addi3    addi2 (4, )    beq1 [, ]      Bubble         Bubble
10   beq2     addi3 (2, )    addi2 [1,4]    beq1 [, ]      Bubble   Branching to address 0, PCSrc = 1
0    addi1    Invalid        Bubble         Bubble         beq1
2    add      addi1 (0, )    Bubble         Bubble         Bubble
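To see what the multiply program computes without following the pipeline trace, here is a functional (not cycle accurate) Python sketch that runs the six instructions one at a time. The tuple encoding of the program and the name run_until_restart are invented for this illustration, and the sketch stops when beq1 branches back to L0 instead of restarting as the real processor would.

```python
# (op, a, b, c): for addi/add, a is the destination; for beq, c is the
# absolute target address. This encoding is an assumption of the sketch.
PROGRAM = {
    0:  ("addi", 2, 0, 3),    # L0: addi $2,$0,3
    2:  ("add",  4, 0, 0),    #     add  $4,$0,$0
    4:  ("beq",  2, 0, 0),    # L1: beq  $2,$0,L0
    6:  ("addi", 4, 4, 5),    #     addi $4,$4,5
    8:  ("addi", 2, 2, -1),   #     addi $2,$2,-1
    10: ("beq",  0, 0, 4),    #     beq  $0,$0,L1
}


def run_until_restart(program):
    """Execute until the branch back to address 0 is taken."""
    regs = [0] * 8
    pc = 0
    taken = []                        # addresses of taken branches, in order
    while True:
        op, a, b, c = program[pc]
        if op == "addi":
            regs[a] = regs[b] + c
            pc += 2
        elif op == "add":
            regs[a] = regs[b] + regs[c]
            pc += 2
        elif op == "beq":
            if regs[a] == regs[b]:
                taken.append(pc)
                if c == 0:            # about to restart at L0: stop here
                    return regs, taken
                pc = c
            else:
                pc += 2
        regs[0] = 0                   # $0 is always equal to zero


regs, taken = run_until_restart(PROGRAM)
print(regs[4], taken)
```

The run leaves the product 15 in $4 after beq2 is taken three times and beq1 once, which matches the three "Branching to address 4" rows and the final "Branching to address 0" row in the trace.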