Pipeline Datapath. With some slides from: John Lazzaro and Dan Garcia
- Lenard Richard
- 5 years ago
Transcription
1 Pipeline Datapath With some slides from: John Lazzaro and Dan Garcia
2 The single cycle CPU [Figure: single-cycle datapath with control. Blocks: PC, instruction memory, register file, sign extend, ALU and ALU control, data memory, plus adders and shift-left-2 units for the branch and jump targets. Control signals: RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite.]
3 Multicycle implementation with Control [Figure: multicycle datapath. Blocks: PC, a single memory, instruction register, memory data register, register file, A and B registers, ALU, ALUOut, sign extend, shift-left-2 units. Control signals: PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst.]
4 Advantages of multicycle architecture. Save time: instructions take different numbers of clock cycles. Save components (thus cost, power, heat): can reuse components for different parts of the cycle.
5 Today: Pipelined architecture. Better throughput. Improves overall delay/latency. Better use of the hardware (thus power, heat).
6 Performance Equation. Seconds/Program = (Instructions/Program) x (Cycles/Instruction) x (Seconds/Cycle). Goal is to optimize execution time, not individual equation terms. Machines are optimized with respect to program workloads. Cycles/Instruction: the CPI of the program; reflects the program's instruction mix. Seconds/Cycle: the clock period; optimize jointly with machine CPI.
7 Performance Equation (same equation and notes as slide 6). How to save in overall runtime?
8 Performance Equation (same equation and notes as slide 6). How to save in overall runtime? New idea: Pipelining.
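As a rough illustration of the performance equation, the sketch below multiplies the three terms. The function name and the sample numbers (one million instructions, a CPI of 1.5, a 2 ns clock) are assumptions chosen for illustration, not values from the slides.

```python
def execution_time(instructions: int, cpi: float, clock_period_s: float) -> float:
    """Seconds/Program = Instructions/Program * Cycles/Instruction * Seconds/Cycle."""
    return instructions * cpi * clock_period_s


# Hypothetical workload: 1,000,000 instructions, CPI 1.5, 2 ns clock period.
baseline = execution_time(1_000_000, 1.5, 2e-9)

# Halving the clock period does not help if it doubles the CPI:
# only the product of the three terms matters.
faster_clock = execution_time(1_000_000, 3.0, 1e-9)

print(f"baseline:     {baseline * 1e3:.3f} ms")
print(f"faster clock: {faster_clock * 1e3:.3f} ms")
```

Both configurations give the same run time, which is the point of optimizing execution time rather than any single term.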
9 Structure of the datapath [Figure: PC, instruction memory, registers (rs, rt, rd), ALU, data memory, immediate.] Stages: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Write Back.
10 Gotta Do Laundry. Ann, Brian, Cathy, Dave (A, B, C, D) each have one load of clothes to wash, dry, fold, and put away. Washer takes 30 minutes. Dryer takes 30 minutes. Folder takes 30 minutes. Stasher takes 30 minutes to put clothes into drawers.
11 Sequential Laundry [Figure: task order A, B, C, D against time, starting at 6 PM.] Sequential laundry takes 8 hours for 4 loads.
12 Pipelined Laundry [Figure: task order A, B, C, D against time, starting at 6 PM, with the loads overlapped.] Pipelined laundry takes 3.5 hours for 4 loads!
13 General Definitions. Latency: time to completely execute a certain task; for example, time to read a sector from disk is disk access time or disk latency. Throughput: amount of work that can be done over a period of time.
14 Pipelining Lessons (1/2) [Figure: pipelined laundry, task order A, B, C, D against time.] Pipelining doesn't help latency of a single task; it helps throughput of the entire workload. Multiple tasks operate simultaneously using different resources. Potential speedup = number of pipe stages. Time to fill the pipeline and time to drain it reduce speedup: 2.3X v. 4X in this example.
15 Pipelining Lessons (2/2) [Figure: pipelined laundry, task order A, B, C, D against time.] Suppose the new Washer takes 20 minutes and the new Stasher takes 20 minutes. How much faster is the pipeline? Pipeline rate is limited by the slowest pipeline stage. Unbalanced lengths of pipe stages also reduce speedup.
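The laundry numbers can be checked with a small sketch. It assumes 30 minutes per stage (which is what the 8-hour and 3.5-hour totals imply) and models the pipeline as lock-step, i.e., every stage advances at the pace of the slowest one, as in a clocked CPU pipeline. The function names are made up for this illustration.

```python
def sequential_time(stage_minutes, loads):
    """Each load finishes every stage before the next load starts."""
    return sum(stage_minutes) * loads


def pipelined_time(stage_minutes, loads):
    """Lock-step pipeline: all stages advance together, paced by the slowest stage."""
    clock = max(stage_minutes)
    # The first load needs len(stage_minutes) steps; each later load adds one step.
    return (len(stage_minutes) + loads - 1) * clock


balanced = [30, 30, 30, 30]    # washer, dryer, folder, stasher
unbalanced = [20, 30, 30, 20]  # faster washer and stasher (assumed lock-step)

for name, stages in (("balanced", balanced), ("unbalanced", unbalanced)):
    seq = sequential_time(stages, 4)
    pipe = pipelined_time(stages, 4)
    print(f"{name}: sequential {seq} min, pipelined {pipe} min, speedup {seq / pipe:.2f}x")
```

Under this lock-step assumption the faster washer and stasher do not shorten the pipelined run at all, because the 30-minute stages still set the pace; the speedup over sequential execution actually drops.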
16 Inspiration: Automobile assembly line. Assembly line moves on a steady clock. Each station does the same task on each car. [Figure: the clock; car body shell; merge station; bolting station; car chassis.]
17 Inspiration: Automobile assembly line. Simpler station tasks -> more cars per hour. Simple tasks take less time, clock is faster.
18 Inspiration: Automobile assembly line. Line speed is limited by the slowest task. Most efficient if all tasks take the same time to do.
19 Inspiration: Automobile assembly line. Simpler tasks, complex car -> long line! These lines go 24 x 7, and rarely shut down. Why?
20 Lessons from car assembly lines. Faster line movement yields more cars per hour off the line. Faster line movement requires more stages, each doing simpler tasks. To maximize efficiency, all stages should take the same amount of time (if not, workers in fast stages are idle). Filling, flushing, and stalling the assembly line are all bad news.
21 Structure of the datapath [Figure: PC, instruction memory, registers (rs, rt, rd), ALU, data memory, immediate.] Stages: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Write Back.
22 Pipelined Execution Representation [Figure: six instructions along the time axis, each offset by one cycle and passing through IFtch, Dcd, Exec, Mem, WB.] Every instruction must take the same number of steps, also called pipeline stages, so some will go idle sometimes.
23 Key Analogy: The instruction is the car. [Figure: pipeline stages #1 to #5. Stage #1 is Instruction Fetch; the IR is then passed from stage to stage and controls the hardware in stage 2, stage 3, stage 4 and stage 5.] Data-stationary control.
24 Representation #1: Timeline. Stages: IF (Fetch), ID (Decode), EX (ALU), MEM, WB, with the IR passed along between stages. Good for visualizing pipeline fills. Sample program: I1: ADD R4,R3,R2; I2: AND R6,R5,R4; I3: SUB R1,R9,R8; I4: XOR R3,R2,R1; I5: OR R7,R6,R5.
Time: t1  t2  t3  t4  t5  t6  t7  t8
I1:   IF  ID  EX  MEM WB
I2:   -   IF  ID  EX  MEM WB
I3:   -   -   IF  ID  EX  MEM WB
I4:   -   -   -   IF  ID  EX  MEM WB
I5:   -   -   -   -   IF  ID  EX  MEM
I6:   -   -   -   -   -   IF  ID  EX
Pipeline is full from t5 on.
25 Representation #2: Resource Usage. Stages: IF (Fetch), ID (Decode), EX (ALU), MEM, WB, with the IR passed along between stages. Good for visualizing pipeline stalls. Sample program: I1: ADD R4,R3,R2; I2: AND R6,R5,R4; I3: SUB R1,R9,R8; I4: XOR R3,R2,R1; I5: OR R7,R6,R5.
Time: t1  t2  t3  t4  t5  t6  t7  t8
IF:   I1  I2  I3  I4  I5  I6  I7  I8
ID:   -   I1  I2  I3  I4  I5  I6  I7
EX:   -   -   I1  I2  I3  I4  I5  I6
MEM:  -   -   -   I1  I2  I3  I4  I5
WB:   -   -   -   -   I1  I2  I3  I4
Pipeline is full from t5 on.
26 Graphical Pipeline Representation (in Reg, right half highlighted means read, left half means write) [Figure: instruction order Load, Add, Store, Sub, Or against time in clock cycles; each instruction passes through I$, Reg, ALU, D$, Reg, offset by one cycle from the previous one.]
27 Example. Suppose 2 ns for memory access, 2 ns for ALU operation, and 1 ns for register file read or write. Nonpipelined execution: lw: IF + Read Reg + ALU + Memory + Write Reg = 2 + 1 + 2 + 2 + 1 = 8 ns; add: IF + Read Reg + ALU + Write Reg = 2 + 1 + 2 + 1 = 6 ns. Pipelined execution: Max(IF, Read Reg, ALU, Memory, Write Reg) = 2 ns.
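The arithmetic of this example can be written out as a short sketch. The stage latencies are the ones assumed on the slide (2 ns memory access, 2 ns ALU, 1 ns register file access); the dictionary and function names are illustrative only.

```python
# Stage latencies in ns, as assumed in the example.
STAGE_NS = {"IF": 2, "RegRead": 1, "ALU": 2, "MEM": 2, "RegWrite": 1}

# Stages each instruction actually needs in a nonpipelined design.
STAGES_USED = {
    "lw": ["IF", "RegRead", "ALU", "MEM", "RegWrite"],
    "add": ["IF", "RegRead", "ALU", "RegWrite"],
}


def nonpipelined_latency_ns(instr):
    """Sum of the stage latencies the instruction passes through."""
    return sum(STAGE_NS[stage] for stage in STAGES_USED[instr])


def pipelined_clock_ns():
    """The pipeline clock must fit the slowest stage."""
    return max(STAGE_NS.values())


print("lw :", nonpipelined_latency_ns("lw"), "ns")
print("add:", nonpipelined_latency_ns("add"), "ns")
print("pipelined clock:", pipelined_clock_ns(), "ns")
```

Note that in the pipelined design every instruction still takes 5 stages x 2 ns = 10 ns from start to finish; the gain is that one instruction can complete every 2 ns.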
28 Division into stages. IF: Instruction fetch. ID: Instruction decode/register file read. EX: Execute/address calculation. MEM: Memory access. WB: Write back. [Figure: the single-cycle datapath divided into these five stages.]
29 Adding the registers [Figure: the datapath with the pipeline registers IF/ID, ID/EX, EX/MEM and MEM/WB inserted between the stages.]
30 lw: Instruction fetch [Figure: pipelined datapath with the IF stage and the IF/ID register highlighted.]
31 lw: Instruction decode [Figure: pipelined datapath with the ID stage and the ID/EX register highlighted.]
32 lw: Execution [Figure: pipelined datapath with the EX stage and the EX/MEM register highlighted.]
33 lw: Memory [Figure: pipelined datapath with the MEM stage and the MEM/WB register highlighted.]
34 lw: Write back [Figure: pipelined datapath with the WB stage highlighted.]
35 Correction!!! [Figure: pipelined datapath in which the write-register number is carried through ID/EX, EX/MEM and MEM/WB back to the register file.] Keep the right Rd all the way!
36 So here is the updated CPU [Figure: corrected pipelined datapath with the pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB.]
37 What about the control wires? [Figure: pipelined datapath with the control signals PCSrc, RegWrite, Branch, ALUSrc, ALUOp, RegDst, MemWrite, MemRead, MemtoReg.]
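38 The control lines
Instruction | Execution/address calculation stage: RegDst ALUOp1 ALUOp0 ALUSrc | Memory access stage: Branch MemRead MemWrite | Write-back stage: RegWrite MemtoReg
R-format    | 1 1 0 0 | 0 0 0 | 1 0
lw          | 0 0 0 1 | 0 1 0 | 1 1
sw          | X 0 0 1 | 0 0 1 | 0 X
beq         | X 0 1 0 | 1 0 0 | 0 X
[Figure: the Control unit decodes the instruction and writes the EX, MEM and WB control groups into ID/EX; the MEM and WB groups move on to EX/MEM, and the WB group to MEM/WB.]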
39 Datapath with Control [Figure: pipelined datapath with the Control unit and the WB, MEM and EX control groups carried through ID/EX, EX/MEM and MEM/WB.]
40 Example. A demonstration of a sequence of instructions: lw $10, 20($1); sub $11, $2, $3; and $12, $4, $5; or $13, $6, $7; add $14, $8, $9
41 [Figure: two snapshots of the datapath with control.] Clock 1: IF: lw $10, 20($1); ID: before<1>; EX: before<2>; MEM: before<3>; WB: before<4>. Clock 2: IF: sub $11, $2, $3; ID: lw $10, 20($1); EX: before<1>; MEM: before<2>; WB: before<3>.
42 [Figure: two snapshots of the datapath with control.] Clock 3: IF: and $12, $4, $5; ID: sub $11, $2, $3; EX: lw $10, ...; MEM: before<1>; WB: before<2>. Clock 4: IF: or $13, $6, $7; ID: and $12, $4, $5; EX: sub $11, ...; MEM: lw $10, ...; WB: before<1>.
43 [Figure: two snapshots of the datapath with control.] Clock 5: IF: add $14, $8, $9; ID: or $13, $6, $7; EX: and $12, ...; MEM: sub $11, ...; WB: lw $10, .... Clock 6: IF: after<1>; ID: add $14, $8, $9; EX: or $13, ...; MEM: and $12, ...; WB: sub $11, ....
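44 The internal structure of the Register File [Figure: read reg 1 (= Rs) and read reg 2 (= Rt) are 5-bit selects for read data 1 and read data 2; write reg (= Rd) is a 5-bit select, with write data and a RegWrite enable.] The two outputs read the values of two different registers simultaneously; one of the other registers is written (at the next rising clock edge).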
45 [Figure: two snapshots of the datapath with control.] Clock 7: IF: after<2>; ID: after<1>; EX: add $14, ...; MEM: or $13, ...; WB: and $12, .... Clock 8: IF: after<3>; ID: after<2>; EX: after<1>; MEM: add $14, ...; WB: or $13, ....
46 [Figure: snapshot of the datapath with control.] Clock 9: IF: after<4>; ID: after<3>; EX: after<2>; MEM: after<1>; WB: add $14, ....
47 Pipeline Hazard: Matching socks in later load [Figure: task order A to F against time, starting at 6 PM, with a bubble.] A depends on D; stall since folder is tied up.
48 Problems. Limits to pipelining: hazards prevent the next instruction from executing during its designated clock cycle. Structural hazards: HW cannot support this combination of instructions (single person to fold and put clothes away). Data hazards: instruction depends on the result of a prior instruction still in the pipeline. Control hazards: pipelining of branches and other instructions stalls the pipeline until the hazard is resolved, leaving bubbles in the pipeline.
49 An example for hazards: sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2)
50 An example for hazards: sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2). Register $2 is updated only at the WB phase, i.e., the 5th clock cycle (actually at the end of the 5th clock cycle). However, we try to use it at the 3rd clock cycle, when we read $2 at the decode phase of the and instruction.
51 Graphic representation of hazards [Figure: program execution order (sub, and, or, add, sw) against time in clock cycles CC 1 to CC 9, each instruction passing through IM, Reg, ALU, DM, Reg; a row shows the value of register $2, which changes in CC 5.]
52 Solving hazards by adding nops: sub $2, $1, $3; nop; nop; nop; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2)
53 Solving hazards by adding nops [Figure: the same program with three nops after the sub, drawn against time in clock cycles, with the value of register $2 per cycle.]
54 We could earn 1 ck cycle if GPR is transparent [Figure: the same program with only two nops after the sub, drawn against time in clock cycles.] We could earn 1 clock cycle if the GPR is transparent, i.e., if we could see the data written to the GPR at the GPR outputs (when the write address equals the read address), i.e., during Ck #5.
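55 The internal structure of the Register File [Figure: read reg 1 (= Rs) and read reg 2 (= Rt) are 5-bit selects for read data 1 and read data 2; write reg (= Rd) is a 5-bit select, with write data and a RegWrite enable.] The two outputs read the values of two different registers simultaneously; one of the other registers is written (at the next rising clock edge).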
56 The internal structure of the modified Register File. We bypass the input (the write data) to the read data 1 output whenever Rs=Rd/Rt (i.e., whenever read reg 1 = write reg, but not zero). We bypass the input (the write data) to the read data 2 output whenever Rt=Rd/Rt (i.e., whenever read reg 2 = write reg, but not zero). [Figure: the register file with a bypass mux on each read data output, selected by comparing read reg 1 and read reg 2 with write reg.]
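The bypass described on this slide can be sketched behaviorally as follows. This is a minimal model, assuming 32 registers with register 0 hardwired to zero; the class and method names are hypothetical and do not correspond to any particular hardware description.

```python
class BypassRegisterFile:
    """Register file whose read ports see a write that is pending in the same cycle."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, rs, rt, reg_write=False, wr_reg=0, wr_data=0):
        """Return (read data 1, read data 2), bypassing the write data when
        the read register equals the write register and is not register 0."""
        def port(reg):
            if reg_write and wr_reg != 0 and reg == wr_reg:
                return wr_data          # bypass: value not yet stored
            return self.regs[reg]       # normal read
        return port(rs), port(rt)

    def clock_edge(self, reg_write, wr_reg, wr_data):
        """Commit the pending write at the next rising clock edge."""
        if reg_write and wr_reg != 0:
            self.regs[wr_reg] = wr_data


rf = BypassRegisterFile()
# A write to $2 is pending while a newer instruction reads $2 and $5.
print(rf.read(2, 5, reg_write=True, wr_reg=2, wr_data=-20))
rf.clock_edge(True, 2, -20)
print(rf.read(2, 5))
```

The value -20 is an arbitrary example; the point is that the reader sees it one cycle before it is actually stored.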
57 After doing that change we only need 2 nops: sub $2, $1, $3; nop; nop; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2). After the change, the WB of an early instruction can happen at the same time as the read reg (decode) phase of a newer instruction (the third one after it, with two other instructions in between). In case we have a hazard, we need to add only two nop instructions. Unfortunately, this happens too often. We need a better solution!
58 [Figure: program execution order (sub, and, or, add, sw) against time in clock cycles CC 1 to CC 9, with the value of register $2 per cycle.]
59 Forwarding (stealing the values) [Figure: the same program against time in clock cycles CC 1 to CC 9, with rows for the value of register $2, the value of EX/MEM and the value of MEM/WB; the new value of $2 appears in EX/MEM in CC 4 and in MEM/WB in CC 5 and is forwarded from there to the ALU inputs of the following instructions.]
60 Forwarding (done at the execute phase) [Figure: pipelined datapath with a forwarding unit. Its inputs are Rs and Rt from ID/EX, EX/MEM.RegisterRd and MEM/WB.RegisterRd; it controls muxes at the two ALU inputs.] If ID/EX.Rs = EX/MEM.Rd, i.e., the Rd of the previous instruction equals the Rs of the current instruction (which is in the execute phase), then we use the ALU out of the previous instruction instead of the output of the GPR. If ID/EX.Rs = MEM/WB.Rd, i.e., the Rd of the instruction two back equals the Rs of the current instruction, then we use the GPR write data of that instruction instead of the output of the GPR. Similarly, compare also ID/EX.Rt to EX/MEM.Rd and to MEM/WB.Rd.
61 Data hazard from previous instruction. ALU Src A: if (ID/EX.Rs == EX/MEM.Rd), use the ALU out instead of Rs, i.e., if Rs of the currently executing instruction == Rd of the previous instruction. The actual equations are: if ((EX/MEM.RegWrite == 1) && (EX/MEM.Rd <> 0) && (ID/EX.Rs == EX/MEM.Rd)) => ForwardA = 10. ALU Src B: if (ID/EX.Rt == EX/MEM.Rd), use the ALU out instead of Rt, i.e., if Rt of the currently executing instruction == Rd of the previous instruction. The actual equations are: if ((EX/MEM.RegWrite == 1) && (EX/MEM.Rd <> 0) && (ID/EX.Rt == EX/MEM.Rd)) => ForwardB = 10.
62 Data hazard from 2 instructions back. ALU Src A: if (ID/EX.Rs == MEM/WB.Rd), use the GPR write data instead of Rs, i.e., if Rs of the currently executing instruction == Rd of 2 instructions ago. The actual equations are: if ((MEM/WB.RegWrite == 1) && (MEM/WB.Rd <> 0) && (ID/EX.Rs == MEM/WB.Rd)) => ForwardA = 01. ALU Src B: if (ID/EX.Rt == MEM/WB.Rd), use the GPR write data instead of Rt, i.e., if Rt of the currently executing instruction == Rd of 2 instructions ago. The actual equations are: if ((MEM/WB.RegWrite == 1) && (MEM/WB.Rd <> 0) && (ID/EX.Rt == MEM/WB.Rd)) => ForwardB = 01. Double hazard: if there is a hazard from both the previous instruction and the instruction before that, we should choose the data from the previous instruction; it is up to date ("newer")!
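The forwarding conditions, including the double-hazard priority, can be sketched as a single selection function. This is a behavioral sketch only: the `PipeReg` type and the function name are hypothetical, and the 2-bit encodings (10 for EX/MEM, 01 for MEM/WB, 00 for no forwarding) follow the common textbook convention and should be treated as an assumption.

```python
from dataclasses import dataclass


@dataclass
class PipeReg:
    """The two fields of a pipeline register that the forwarding unit looks at."""
    reg_write: bool
    rd: int


NO_FORWARD = 0b00   # use the register file output
FROM_MEM_WB = 0b01  # use the GPR write data (two instructions back)
FROM_EX_MEM = 0b10  # use the ALU out (previous instruction)


def forward_select(src_reg: int, ex_mem: PipeReg, mem_wb: PipeReg) -> int:
    """Select the ALU input source for one operand (pass ID/EX.Rs or ID/EX.Rt)."""
    # The previous instruction is checked first: on a double hazard its
    # result is the newer one.
    if ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == src_reg:
        return FROM_EX_MEM
    if mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == src_reg:
        return FROM_MEM_WB
    return NO_FORWARD


# Double hazard on $4: both older instructions write $4; pick the newer one.
forward_a = forward_select(4, PipeReg(True, 4), PipeReg(True, 4))
print(format(forward_a, "02b"))
```

Calling the function once with ID/EX.Rs and once with ID/EX.Rt gives ForwardA and ForwardB respectively.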
63 Example. An example for forwarding: sub $2, $1, $3; and $4, $2, $5 needs forwarding from the previous instruction; or $4, $4, $2 needs forwarding from two instructions back; add $9, $4, $2 needs forwarding from 3 instructions back (through the transparent GPR). Here we discuss the $2 register only. (The first two cases are handled in the execute phase, the last one in the decode phase.)
64 Example. An example for forwarding: sub $2, $1, $3; and $4, $2, $5; or $4, $4, $2 needs forwarding from the previous instruction; add $9, $4, $2 needs forwarding from the previous instruction. Here we discuss the $4 register, and there are two cases (the 2nd one in purple).
65 [Figure: two snapshots of the datapath with the forwarding unit.] Clock 3: IF: or $4, $4, $2; ID: and $4, $2, $5; EX: sub $2, $1, $3; MEM: before<1>; WB: before<2>. Clock 4: IF: add $9, $4, $2; ID: or $4, $4, $2; EX: and $4, $2, $5; MEM: sub $2, ...; WB: before<1>. Since Rs=2 and Rd of the previous instruction was 2, we use the ALU out instead of Rs.
66 [Figure: two snapshots of the datapath with the forwarding unit.] Clock 5: IF: after<1>; ID: add $9, $4, $2; EX: or $4, $4, $2; MEM: and $4, ...; WB: sub $2, .... Clock 6: IF: after<2>; ID: after<1>; EX: add $9, $4, $2; MEM: or $4, ...; WB: and $4, .... In blue we see forwarding from two instructions back (Mem->Exec.), in red, from the previous instruction (ALU->Exec.), in purple, from 3 instructions back (WB->Decode).
67 The solution does not always work: lw. The solution does not work for lw (in lw we do not have the data in the pipe; it comes from the memory!). [Figure: lw $2, 20($1); and $4, $2, $5; or $8, $2, $6; add $9, $4, $2; slt $1, $6, $7 against time in clock cycles CC 1 to CC 9.] If the previous instruction was a lw to a register and we try to use that register in the current instruction, we have a problem, since we cannot go back in time! One solution is to avoid such cases by adding a nop (by the Assembler) whenever Rt of the lw is equal to Rs or Rt of the following instruction.
68 Another solution, in h/w, is to add bubbles, i.e., add a nop by hardware. [Figure: lw $2, 20($1); and $4, $2, $5; or $8, $2, $6; add $9, $4, $2; slt $1, $6, $7 against time in clock cycles CC 1 to CC 10, with a bubble inserted after the lw so that the and and the instructions after it are delayed by one cycle.] We need to hold IF/ID for one ck cycle and insert a nop into ID/EX. This has the same effect as adding a nop instruction by the Assembler.
69 [Figure: pipelined datapath with a hazard detection unit and a forwarding unit. The hazard detection unit receives Rs and Rt of the current instruction (IF/ID.RegisterRs, IF/ID.RegisterRt), Rt of the previous instruction (ID/EX.RegisterRt) and ID/EX.MemRead, which identifies lw; it drives PCWrite, IF/IDWrite and the mux that clears the control signals.] We need to hold the IF/ID and the PC for one ck cycle and insert a nop into ID/EX. This has the same effect as adding a nop instruction by the Assembler. If (ID/EX.MemRead) && ((ID/EX.Rt == IF/ID.Rs) || (ID/EX.Rt == IF/ID.Rt)) we must stall the pipeline. This means that the previous instruction was lw and that it loads into the current Rs or Rt (of course, if one of them is not used, don't stall). Holding means freezing the IF/ID and the PC for 1 clock cycle: hold the IF/ID by not giving an IF/IDWrite signal, and do not increment the PC (which already points at the next instruction) by not giving the PCWrite signal. Inserting a nop is done by clearing all control signals.
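The stall condition above can be sketched as a small function. The name `must_stall` and the argument order are assumptions made for this illustration; the sketch also ignores the refinement that an unused Rs or Rt should not cause a stall.

```python
def must_stall(id_ex_mem_read: bool, id_ex_rt: int, if_id_rs: int, if_id_rt: int) -> bool:
    """Load-use hazard check: the instruction in EX is a load (MemRead set)
    and its destination Rt matches a source of the instruction in ID."""
    return bool(id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt))


def control_outputs(stall: bool) -> dict:
    """What the hazard detection unit drives, per the description above."""
    return {
        "PCWrite": not stall,      # freeze the PC while stalling
        "IF/IDWrite": not stall,   # freeze IF/ID while stalling
        "clear_control": stall,    # nop (bubble) goes into ID/EX
    }


# lw $2, 20($1) is in EX; and $4, $2, $5 is in ID and reads $2.
stall = must_stall(True, 2, 2, 5)
print(stall, control_outputs(stall))
```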
70 Example. An example for lw hazard detection: lw $2, 20($1); and $4, $2, $5; or $4, $4, $2; add $9, $4, $2
71 [Figure: two snapshots of the datapath with the hazard detection and forwarding units.] Clock 2: IF: and $4, $2, $5; ID: lw $2, 20($1); EX: before<1>; MEM: before<2>; WB: before<3>. Clock 3: IF: or $4, $4, $2; ID: and $4, $2, $5; EX: lw $2, 20($1); MEM: before<1>; WB: before<2>.
72 [Figure: two snapshots of the datapath with the hazard detection and forwarding units.] Clock 4: IF: or $4, $4, $2; ID: and $4, $2, $5; EX: bubble; MEM: lw $2, ...; WB: before<1>. Clock 5: IF: add $9, $4, $2; ID: or $4, $4, $2; EX: and $4, $2, $5; MEM: bubble; WB: lw $2, .... The lw instruction is in the WB phase. $2 is being written. We can use $2 in the Execute phase of the and instruction, with the help of forwarding.
73 [Figure: two snapshots of the datapath with the hazard detection and forwarding units.] Clock 6: IF: after<1>; ID: add $9, $4, $2; EX: or $4, $4, $2; MEM: and $4, ...; WB: bubble. Clock 7: IF: after<2>; ID: after<1>; EX: add $9, $4, $2; MEM: or $4, ...; WB: and $4, ....
74 Branch Hazards
75 Just to remind us how a branch is handled, we show again the Datapath with Control [Figure: pipelined datapath with control; the Branch signal and the ALU Zero output form PCSrc in the MEM stage.]
76 Branch Hazards [Figure: program execution order against time in clock cycles CC 1 to CC 9: 40 beq $1, $3, 7; 44 and $12, $2, $5; 48 or $13, $6, $2; 52 add $14, $2, $2; 72 lw $4, 50($7).] In the EX stage of the beq we calculate Rs-Rt. In its MEM stage we decide to branch (switching the address to the PC and issuing PCWriteCond). The 3 instructions after the beq should be killed before they do harm, i.e., before they change any register. In CC 5 we already use the new PC calculated by the branch (PC=72).
77 Control Hazard: Branching (1/7) [Figure: instruction order beq, Instr 1, Instr 2, Instr 3, Instr 4 against time in clock cycles, each passing through I$, Reg, ALU, D$, Reg.] Where do we do the compare for the branch?
78 Control Hazard: Branching (2/7) We put branch decision-making hardware in the ALU stage; therefore two more instructions after the branch will always be fetched, whether or not the branch is taken. Desired functionality of a branch: if we do not take the branch, don't waste any time and continue executing normally; if we take the branch, don't execute any instructions after the branch, just go to the desired label.
79 Control Hazard: Branching (3/7) Initial solution: stall until the decision is made. Insert no-op instructions: those that accomplish nothing, just take time. Drawback: branches take 3 clock cycles each (assuming the comparator is put in the ALU stage).
80 Control Hazard: Branching (4/7) Optimization #1: move an asynchronous comparator up to Stage 2. As soon as the instruction is decoded (the opcode identifies it as a branch), immediately make a decision and set the value of the PC (if necessary). Benefit: since the branch is complete in Stage 2, only one unnecessary instruction is fetched, so only one no-op is needed. Side note: this means that branches are idle in Stages 3, 4 and 5.
81 [Figure: pipelined datapath with the branch comparator (=), sign extend and shift left 2 in the ID stage, an IF.Flush signal into IF/ID, the hazard detection unit and the forwarding unit.]
82 Control Hazard: Branching (5/7) Insert a single no-op (bubble). [Figure: instruction order add, beq, lw against time in clock cycles, with one bubble between the beq and the lw.] Impact: 2 clock cycles per branch instruction: slow.
83 Control Hazard: Branching (6/7) Optimization #2: redefine branches. Old definition: if we take the branch, none of the instructions after the branch get executed by accident. New definition: whether or not we take the branch, the single instruction immediately following the branch gets executed (called the branch-delay slot).
84 Control Hazard: Branching (7/7) Notes on the branch-delay slot. Worst-case scenario: can always put a no-op in the branch-delay slot. Better case: can find an instruction preceding the branch which can be placed in the branch-delay slot without affecting the flow of the program. Re-ordering instructions is a common method of speeding up programs. The compiler must be very smart in order to find instructions to do this. It usually can find such an instruction at least 50% of the time. Jumps also have a delay slot.
85 Example: Nondelayed vs. Delayed Branch
Nondelayed Branch: or $8, $9, $10; add $1, $2, $3; sub $4, $5, $6; beq $1, $4, Exit; xor $10, $1, $11; Exit:
Delayed Branch: add $1, $2, $3; sub $4, $5, $6; beq $1, $4, Exit; or $8, $9, $10; xor $10, $1, $11; Exit:
86 Question (1/2) Assume 1 instr/clock, delayed branch, 5 stage pipeline, forwarding, interlock on unresolved load hazards (after 10^3 loops, so pipeline full). Loop: lw $t0, 0($s1); add $t0, $t0, $s2; sw $t0, 0($s1); addi $s1, $s1, -4; bne $s1, $zero, Loop; nop. How many pipeline stages (clock cycles) per loop iteration to execute this code?
87 Answer (1/2) Assume 1 instr/clock, delayed branch, 5 stage pipeline, forwarding, interlock on unresolved load hazards. 10^3 iterations, so pipeline full. Loop: 1. lw $t0, 0($s1); 2. (data hazard so stall); 3. add $t0, $t0, $s2; 4. sw $t0, 0($s1); 5. addi $s1, $s1, -4; 6. bne $s1, $zero, Loop; 7. nop (delayed branch so exec. nop). How many pipeline stages (clock cycles) per loop iteration to execute this code? 7.
88 Question (2/2) Assume 1 instr/clock, delayed branch, 5 stage pipeline, forwarding, interlock on unresolved load hazards (after 10^3 loops, so pipeline full). Rewrite this code to reduce pipeline stages (clock cycles) per loop to as few as possible. Loop: lw $t0, 0($s1); add $t0, $t0, $s2; sw $t0, 0($s1); addi $s1, $s1, -4; bne $s1, $zero, Loop; nop. How many pipeline stages (clock cycles) per loop iteration to execute your revised code?
89 Answer (2/2) How long to execute? Rewrite this code to reduce clock cycles per loop to as few as possible. Loop: 1. lw $t0, 0($s1); 2. addi $s1, $s1, -4 (no hazard since extra cycle); 3. add $t0, $t0, $s2; 4. bne $s1, $zero, Loop; 5. sw $t0, +4($s1) (modified sw to put past addi). How many pipeline stages (clock cycles) per loop iteration to execute your revised code (assume pipeline is full)? 5.
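The two cycle counts can be reproduced with a deliberately simple model: one cycle per instruction, plus one stall whenever a lw is immediately followed by an instruction that reads the loaded register. Everything else (forwarding, the delayed branch) is assumed to cost nothing extra, as in the question. The tuple encoding `(opcode, destination, sources)` and the register names `t0`, `s1`, `s2` are assumptions made for this sketch.

```python
def cycles_per_iteration(loop):
    """Count cycles for one loop iteration in a full pipeline:
    one per instruction, plus one per load-use stall."""
    stalls = 0
    for (op, dest, _), (_, _, sources) in zip(loop, loop[1:]):
        if op == "lw" and dest in sources:
            stalls += 1  # interlock on the unresolved load hazard
    return len(loop) + stalls


original = [
    ("lw",   "t0", ["s1"]),
    ("add",  "t0", ["t0", "s2"]),    # uses t0 right after the load: stall
    ("sw",   None, ["t0", "s1"]),
    ("addi", "s1", ["s1"]),
    ("bne",  None, ["s1", "zero"]),
    ("nop",  None, []),              # branch-delay slot
]

revised = [
    ("lw",   "t0", ["s1"]),
    ("addi", "s1", ["s1"]),          # fills the load delay
    ("add",  "t0", ["t0", "s2"]),
    ("bne",  None, ["s1", "zero"]),
    ("sw",   None, ["t0", "s1"]),    # fills the branch-delay slot
]

print(cycles_per_iteration(original), cycles_per_iteration(revised))
```

The model only detects load-use stalls between adjacent instructions, so it is a counting aid for this exercise rather than a pipeline simulator.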
90 Peer Instruction. A. Thanks to pipelining, I have reduced the time it took me to wash my shirt. B. Longer pipelines are always a win (since less work per stage & a faster clock). C. We can rely on compilers to help us avoid data hazards by reordering instrs. ABC: 1: FFF, 2: FFT, 3: FTF, 4: FTT, 5: TFF, 6: TFT, 7: TTF, 8: TTT
91 Peer Instruction Answer. A. FALSE: throughput is better, not execution time. B. FALSE: longer pipelines do usually mean a faster clock, but branches cause problems! C. FALSE: data hazards happen too often and delay too long. Forwarding! (e.g., Mem -> ALU). So the answer is 1: FFF.
92 The situation would be better if we somehow moved the branch address calculation one ck earlier. This is easy to do since sign extension and shift are only wires. We just need to move the branch address register to the left. Everything happens 1 ck earlier, and so we'll have to kill only two instructions. Next, we'll add a fast comparator which will compare Rs and Rt in the same ck cycle as the decode phase. (Instead of using the ALU to calculate Rs-Rt, we'll build a simple and fast xor circuit.) This means extra h/w, but now we have earned one more ck cycle. So, we have to kill only a single instruction. Killing an instruction, also called flushing the pipeline, is easily done by clearing the IF/ID register of the instruction following the branch (if the branch is successful). [Figure: datapath with control, showing PCSrc, the branch target adder and the Branch and Zero signals.]
93 Flushing [Figure: pipelined datapath with the branch comparator (=), sign extend and shift left 2 in the ID stage, an IF.Flush signal into IF/ID, the hazard detection unit and the forwarding unit.]
94 Example: an example for flushing
sub $10, $4, $8
beq $1, $3, 7
and $12, $2, $5
lw $4, 50($7)
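A rough sketch of what the flush does to this instruction sequence is given below. The byte addresses (36, 40, 44, 72) are assumptions chosen so that the branch offset of 7 words lands on the lw; the slide itself shows no addresses. The model assumes the branch resolves in ID, so exactly one instruction has been fetched behind it.

```python
def branch_target(pc, word_offset):
    """MIPS PC-relative target: address after the branch plus offset in words."""
    return pc + 4 + 4 * word_offset


# Hypothetical placement of the slide's instructions in memory.
program = {
    36: "sub $10, $4, $8",
    40: "beq $1, $3, 7",
    44: "and $12, $2, $5",
    72: "lw $4, 50($7)",
}


def if_id_contents(program, start_pc, branch_taken, cycles):
    """List what the IF/ID register holds after each fetch cycle."""
    trace = []
    pc = start_pc
    flush_next = False
    redirect = None
    for _ in range(cycles):
        fetched = program.get(pc, "nop")
        if flush_next:
            # The branch was found taken in ID: clear IF/ID and redirect the PC.
            trace.append("nop (flushed " + fetched.split()[0] + ")")
            pc = redirect
            flush_next = False
            continue
        trace.append(fetched)
        if fetched.startswith("beq") and branch_taken:
            offset = int(fetched.split(",")[-1])
            redirect = branch_target(pc, offset)
            flush_next = True
        pc += 4
    return trace


print(branch_target(40, 7))  # 72
print(if_id_contents(program, 36, True, 4))
```

With the branch taken, the trace shows sub, beq, a bubble in place of and, and then lw from the target address.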
95 Summary of hazards
Data hazards:
* Forward from the previous instruction
* Forward from two instructions ago
* (Forward through transparent GPR = from 3 instructions ago)
* If we cannot forward (after lw), we stall the pipe by inserting a nop and freezing IF/ID and PC for 1 ck cycle
Control hazards:
* If the branch is successful, we flush the instruction following the branch (which is in the IF/ID register; we just clear the register)
Notes: In the real MIPS CPU, no flush was employed. This gives the compiler the opportunity to put useful instructions following the branch. This explains why the simulator always performs the instruction following the branch. This is called a delayed branch. Also, in the real MIPS CPU no lw stall was used. Again, this gives the compiler some freedom to choose whether to put a nop or some useful instruction following lw. This is called a delayed load.
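The forwarding and stall rules in this summary can be written as two small decision functions. This is a sketch using the usual textbook conditions; the dictionary field names (`reg_write`, `rd`, `mem_read`, `rs`, `rt`) are illustrative stand-ins for pipeline-register fields, not names taken from the slides.

```python
def forward_select(src_reg, ex_mem, mem_wb):
    """Choose where an ALU operand comes from.

    ex_mem holds the previous instruction, mem_wb the one two back.
    Register 0 is never forwarded because it always reads as zero.
    """
    if ex_mem["reg_write"] and ex_mem["rd"] != 0 and ex_mem["rd"] == src_reg:
        return "EX/MEM"   # forward from the previous instruction
    if mem_wb["reg_write"] and mem_wb["rd"] != 0 and mem_wb["rd"] == src_reg:
        return "MEM/WB"   # forward from two instructions ago
    return "REGFILE"      # no hazard: use the value read in ID


def load_use_stall(id_ex, if_id):
    """True when the instruction in ID needs the result of a load now in EX."""
    return id_ex["mem_read"] and id_ex["rt"] in (if_id["rs"], if_id["rt"])


prev = {"reg_write": True, "rd": 8}
two_back = {"reg_write": True, "rd": 9}
print(forward_select(8, prev, two_back))   # EX/MEM
print(forward_select(9, prev, two_back))   # MEM/WB
print(forward_select(10, prev, two_back))  # REGFILE
print(load_use_stall({"mem_read": True, "rt": 8}, {"rs": 8, "rt": 3}))  # True
```

When `load_use_stall` is true, the PC and IF/ID are frozen for one cycle and a nop is inserted, after which the loaded value can be forwarded from MEM/WB.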
96 More solutions
Superpipelining (more stages): more than 5 stages of pipelining.
Dynamic pipeline scheduling: change the order of executing instructions to fill gaps if possible (= instead of bubbles).
Superscalar (parallel stages): performing two instructions simultaneously. This means fetching two instructions together, decoding them at the same time (with more inputs and outputs in the GPR), and executing them together, i.e., almost double the hardware.
Multithreading
Multicores