CSE 675.02: Introduction to Computer Architecture
Designing MIPS Processor (Multi-cycle)
Presentation H
Reading Assignment: 5.5, 5.6

Multi-cycle Design Principles

- Break up execution of each instruction into steps.
- The number of steps and the tasks in each step are instruction dependent.
- Each step takes one clock cycle.
- Balance the amount of work to be done in each clock cycle.
- Restrict each cycle to use only one major functional unit in the datapath; if more than one major functional unit is used, they should be used in parallel.
- Major units are the memory, the register file and the ALU, since we assume that they introduce the most significant delays during execution of instructions. We assume that all other delays, e.g. in the wiring, are negligible.

G. Babic, Presentation H
Multi-cycle Design Principles (cont.)

- During execution of any instruction we may reuse functional units, but in different steps (clock cycles), e.g.:
  - A single memory can be used for both instructions and data.
  - The ALU computes not only the tasks it performed in the single-cycle design (e.g. lw and sw addresses and R-type instruction calculations); it is also used to increment the PC (by 4) and to calculate the branch target address.
- Control signals are not determined solely by the instruction in execution (i.e. its op-code and/or function code) but also by the particular clock cycle the instruction is being executed in.
- At the end of each cycle during instruction execution, store intermediate values for use in later cycles. For that purpose, introduce additional internal registers.

Elaboration on Work Balance in Each Step

- During any given step, a serial combination of major functional units is not allowed. For example:
  - It is not allowed that in one step the contents of registers are read from the register file and then used as ALU operands in the same step.
  - It is not allowed that in one step the ALU performs a function on some operands and its result is used as an address for a memory read or write in the same step.
- This principle keeps any single step from requiring too much time, which would force every clock cycle to be unnecessarily long.
- Two of the major functional units may still be used in parallel, e.g. reading contents from the register file while the ALU performs a function on unrelated data.
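The work-balance rule can be illustrated with a small calculation. The sketch below is only an illustration: the delay figures are hypothetical numbers chosen for the example, not values from the slides, and the function names are made up here. It shows that when each step uses at most one major unit (or several in parallel), the clock period is set by the slowest unit, whereas chaining two units in one step would exceed that period.

```python
# Hypothetical unit delays in picoseconds (illustrative values only).
DELAY_PS = {"memory": 200, "register_file": 100, "alu": 150}

def multicycle_clock_period(delays):
    """Clock period when every step uses one major unit, or several in parallel:
    the slowest single unit decides the period."""
    return max(delays.values())

def serial_step_time(delays, units):
    """Time a step would need if the listed units were used one after another."""
    return sum(delays[u] for u in units)

period = multicycle_clock_period(DELAY_PS)

# Reading registers and then feeding them to the ALU in the same step
# would take longer than one balanced clock period.
chained = serial_step_time(DELAY_PS, ["register_file", "alu"])

print("clock period:", period, "ps")
print("register read + ALU in series:", chained, "ps")
```

With these assumed delays, the serial combination needs more time than one clock period, which is why the slides forbid it.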
Five Steps In Instruction Execution

Major steps in execution of an instruction are:
1. Instruction Fetch
2. Instruction Decode and Register Fetch
3. Execution, Memory Address Computation, or Branch Completion
4. Memory Access or R-type Instruction Completion
5. Write-back Step

- Not every instruction will have all those steps.
- Our instructions will take 3-5 steps, i.e. 3-5 clock cycles.
- The first two steps are common to all instructions.

Multi-cycle Datapath High Level View

[Figure 5.25: high-level view of the multi-cycle datapath]

- The use of shared functional units requires new temporary registers that hold data between clock cycles of the same instruction.
- The additional registers are: Instruction register (IR), Memory data register (MDR), A and B registers, ALUOut register.
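Returning to the five-step breakdown, the cycle count per instruction class can be tabulated and averaged over an instruction mix. This is a minimal sketch: the per-class counts follow the step descriptions in these slides (branch and jump finish in step 3, store and R-type in step 4, load in step 5), while the instruction mix and the function name are hypothetical and chosen only for illustration.

```python
# Clock cycles (steps) used by each instruction class.
STEPS = {
    "R-type": 4,
    "lw": 5,
    "sw": 4,
    "beq": 3,
    "j": 3,
}

def average_cpi(mix):
    """Weighted average clock cycles per instruction.

    `mix` maps an instruction class to its fraction of executed instructions;
    the fractions are expected to sum to 1.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("instruction mix fractions must sum to 1")
    return sum(STEPS[kind] * fraction for kind, fraction in mix.items())

# Hypothetical instruction mix, for illustration only.
example_mix = {"R-type": 0.5, "lw": 0.2, "sw": 0.1, "beq": 0.15, "j": 0.05}
print("average CPI:", average_cpi(example_mix))
```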
Multi-cycle Datapath Detailed View

[Figure 5.27 with additions in red: detailed multi-cycle datapath. Blocks: PC, Memory, Instruction register, Memory data register, Registers, A, B, Sign extend, Shift left 2, ALU, ALU control, ALUOut. Control signals: IorD, MemRead, MemWrite, IRWrite, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALUOp, MemtoReg.]

Multi-cycle Datapath and Control

[Figure 5.28: multi-cycle datapath with the control unit. In addition to the signals above, the control unit drives PCWriteCond, PCWrite and PCSource; the jump address is formed from PC[31-28] and the shifted instruction field.]
Step 1: Instruction Fetch

- Use the PC to get the instruction and put it in the Instruction Register, i.e.
    IR <- Memory[PC];    IorD=0, MemRead, IRWrite
- Increment the PC by 4 and put the result back in the PC, i.e.
    PC <- [PC] + 4;    ALUSrcA=0, ALUSrcB=01, ALUOp=00, PCSource=00, PCWrite
- Rules for signals that are omitted:
  - If a signal for a mux is not stated, it is don't care.
  - If ALU signals are not stated, they are don't care.
  - If MemRead, MemWrite, RegWrite, IRWrite, PCWrite or PCWriteCond is not stated, it is unasserted, i.e. logical 0.

Step 2: Instruction Decode & Register Fetch

- We aren't setting any control lines based on the instruction type, since we are busy "decoding" it in our control logic.
- Read registers rs and rt in case we need them (done automatically):
    A <- Reg[IR[25-21]];    B <- Reg[IR[20-16]];
- Compute the branch address in case the instruction is a branch:
    ALUOut <- PC + (sign-extend(IR[15-0]) << 2);    ALUSrcA=0, ALUSrcB=11, ALUOp=00
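The register transfers of the first two steps can be sketched in Python. This is an illustrative model, not the hardware: the dictionary-based `state`, the word-addressed `memory` dictionary and the function names are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Interpret the low 16 bits of `value` as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def step1_fetch(state, memory):
    """IR <- Memory[PC]; PC <- PC + 4."""
    state["IR"] = memory[state["PC"]]
    state["PC"] = (state["PC"] + 4) & MASK32

def step2_decode(state, regs):
    """A <- Reg[IR[25-21]]; B <- Reg[IR[20-16]];
    ALUOut <- PC + (sign-extend(IR[15-0]) << 2)."""
    ir = state["IR"]
    state["A"] = regs[(ir >> 21) & 0x1F]
    state["B"] = regs[(ir >> 16) & 0x1F]
    state["ALUOut"] = (state["PC"] + (sign_extend16(ir) << 2)) & MASK32

# Example: beq $1, $2, -1 stored at address 0x100 (encoding 0x1022FFFF).
memory = {0x100: 0x1022FFFF}
regs = [0] * 32
regs[1], regs[2] = 7, 9
state = {"PC": 0x100, "IR": 0, "A": 0, "B": 0, "ALUOut": 0}

step1_fetch(state, memory)
step2_decode(state, regs)
print(hex(state["PC"]), state["A"], state["B"], hex(state["ALUOut"]))
```

Note that the branch target is computed from the already incremented PC, which is why step 2 can run before the instruction type is known.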
Step 3: Execute, Mem Addr, Branch

The ALU performs one of three functions, based on instruction type:
- Memory reference (lw or sw):
    ALUOut <- A + sign-extend(IR[15-0]);    ALUSrcA=1, ALUSrcB=10, ALUOp=00
- R-type:
    ALUOut <- A op B;    ALUSrcA=1, ALUSrcB=00, ALUOp=10
- Branch on equal:
    if (A==B) PC <- ALUOut;    ALUSrcA=1, ALUSrcB=00, ALUOp=01, PCSource=01, PCWriteCond
- Note: the beq instruction is done, thus this instruction requires 3 clock cycles to execute.

Steps 4 and 5: Instruction Dependent

Step 4: R-type and Memory Access
- Loads and stores access memory:
    MDR <- Memory[ALUOut] (load);    IorD=1, MemRead
    or
    Memory[ALUOut] <- B (store);    IorD=1, MemWrite
- R-type instructions finish:
    Reg[IR[15-11]] <- ALUOut;    RegDst=1, MemtoReg=0, RegWrite
- The register write actually takes place at the end of the cycle, on the falling edge.
- Store and R-type instructions are done in 4 clock cycles.

Step 5: Write back (load only)
    Reg[IR[20-16]] <- MDR;    RegDst=0, MemtoReg=1, RegWrite
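Steps 3 to 5 can be modelled the same way as register transfers. The sketch below assumes that steps 1 and 2 have already filled IR, A, B and ALUOut (the branch target); the `state` dictionary, the `kind` strings, the `alu_op` callback and the function names are illustrative assumptions, not part of the slides.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Interpret the low 16 bits of `value` as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def step3_execute(kind, state, alu_op=None):
    """Step 3: address computation, R-type operation, or branch completion."""
    a, b = state["A"], state["B"]
    if kind in ("lw", "sw"):
        state["ALUOut"] = (a + sign_extend16(state["IR"])) & MASK32
    elif kind == "R-type":
        state["ALUOut"] = alu_op(a, b) & MASK32
    elif kind == "beq":
        if a == b:                       # PCWriteCond: write PC only if Zero
            state["PC"] = state["ALUOut"]  # target computed in step 2
    else:
        raise ValueError("unsupported instruction class: " + kind)

def step4_memory_or_rtype(kind, state, regs, memory):
    """Step 4: memory access for lw/sw, register write for R-type."""
    ir = state["IR"]
    if kind == "lw":
        state["MDR"] = memory[state["ALUOut"]]
    elif kind == "sw":
        memory[state["ALUOut"]] = state["B"]
    elif kind == "R-type":
        regs[(ir >> 11) & 0x1F] = state["ALUOut"]   # rd = IR[15-11]

def step5_writeback(state, regs):
    """Step 5 (load only): Reg[IR[20-16]] <- MDR."""
    regs[(state["IR"] >> 16) & 0x1F] = state["MDR"]

# Example: lw $2, 8($1) with $1 = 0x1000 (encoding 0x8C220008).
regs = [0] * 32
memory = {0x1008: 42}
state = {"IR": 0x8C220008, "A": 0x1000, "B": 0, "ALUOut": 0, "MDR": 0, "PC": 4}
step3_execute("lw", state)
step4_memory_or_rtype("lw", state, regs, memory)
step5_writeback(state, regs)
print(regs[2])
```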
Summary of Instruction Executions (Figure 5.30)

Instruction fetch (all instructions):
    IR <- Memory[PC]
    PC <- PC + 4

Instruction decode/register fetch (all instructions):
    A <- Reg[IR[25-21]]
    B <- Reg[IR[20-16]]
    ALUOut <- PC + (sign-extend(IR[15-0]) << 2)

Execution, address computation, branch/jump completion:
    R-type:            ALUOut <- A op B
    Memory reference:  ALUOut <- A + sign-extend(IR[15-0])
    Branch:            if (A == B) then PC <- ALUOut
    Jump:              PC <- PC[31-28] || (IR[25-0] << 2)

Memory access or R-type completion:
    R-type:  Reg[IR[15-11]] <- ALUOut
    Load:    MDR <- Memory[ALUOut]
    Store:   Memory[ALUOut] <- B

Memory read completion:
    Load:    Reg[IR[20-16]] <- MDR

Note: Jump instruction added: PCSource=10, PCWrite

Implementing Control

- Values of control signals depend on:
  - what instruction is being executed, and
  - which step (i.e. clock cycle) is being performed.
- Use the information we've accumulated to specify a finite state machine (FSM):
  - specify the finite state machine graphically, or
  - use microprogramming.
- An implementation can then be derived from the specification.
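One way to write down the control specification is a table that maps each step to the signals stated for it. The sketch below uses the signal values given in the step descriptions of these slides and applies the stated defaults: unstated write/read-enable signals are unasserted (0), and unstated mux or ALU signals are don't care (modelled here as `None`). The step names used as keys and the function name are hypothetical labels chosen for the example.

```python
# Enable signals default to 0 (unasserted) when not stated.
ENABLE_SIGNALS = ("MemRead", "MemWrite", "RegWrite", "IRWrite", "PCWrite", "PCWriteCond")
# Mux and ALU signals are don't care (None) when not stated.
SELECT_SIGNALS = ("IorD", "ALUSrcA", "ALUSrcB", "ALUOp", "PCSource", "RegDst", "MemtoReg")

STATED = {
    "fetch": {"IorD": 0, "MemRead": 1, "IRWrite": 1, "ALUSrcA": 0,
              "ALUSrcB": 0b01, "ALUOp": 0b00, "PCSource": 0b00, "PCWrite": 1},
    "decode": {"ALUSrcA": 0, "ALUSrcB": 0b11, "ALUOp": 0b00},
    "mem_address": {"ALUSrcA": 1, "ALUSrcB": 0b10, "ALUOp": 0b00},
    "execution": {"ALUSrcA": 1, "ALUSrcB": 0b00, "ALUOp": 0b10},
    "branch_completion": {"ALUSrcA": 1, "ALUSrcB": 0b00, "ALUOp": 0b01,
                          "PCSource": 0b01, "PCWriteCond": 1},
    "jump_completion": {"PCSource": 0b10, "PCWrite": 1},
    "mem_read": {"IorD": 1, "MemRead": 1},
    "mem_write": {"IorD": 1, "MemWrite": 1},
    "rtype_completion": {"RegDst": 1, "MemtoReg": 0, "RegWrite": 1},
    "write_back": {"RegDst": 0, "MemtoReg": 1, "RegWrite": 1},
}

def control_outputs(step):
    """Return the full set of control signals for one step, with defaults filled in."""
    outputs = {name: 0 for name in ENABLE_SIGNALS}
    outputs.update({name: None for name in SELECT_SIGNALS})
    outputs.update(STATED[step])
    return outputs

print(control_outputs("fetch"))
```

Because the outputs depend only on the step, a table like this corresponds to the output function of the finite state machine; the instruction type only influences which step comes next.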
Finite State Machines

[Figure B.10.1: a finite state machine. The Current state register, driven by the Clock, feeds the Next-state function and the Output function; Inputs feed both functions; the Output function produces the Outputs.]

- The current state is kept in the Current state register.
- The Next-state function and the Output function are determined by the Current state and the Inputs.
- In our case, the Output function is based only on the Current state.

Finite State Machine Graph for Control Unit

[Figure 5.38: finite state machine graph for the control unit. From Start, the machine goes through instruction fetch and instruction decode/register fetch, then branches on the op-code: memory address computation (Op = 'LW' or Op = 'SW'), followed by memory access for the load and the write-back step, or by memory access for the store; execution (Op = R-type), followed by R-type completion; branch completion (Op = 'BEQ'); jump completion (Op = 'J'). Each state lists the control signals it asserts, and every final state returns to instruction fetch.]
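The next-state function of such a machine can be sketched as a small Python function. The state numbering 0 to 9 used here is an assumption that follows the usual textbook numbering (0 fetch, 1 decode, 2 memory address computation, 3 load access, 4 write-back, 5 store access, 6 execution, 7 R-type completion, 8 branch completion, 9 jump completion); the op-code values are the standard MIPS encodings, and the function names are made up for the example.

```python
# Standard MIPS op-code field values for the instructions covered here.
OP_RTYPE, OP_J, OP_BEQ, OP_LW, OP_SW = 0x00, 0x02, 0x04, 0x23, 0x2B

def next_state(state, op):
    """Next-state function: depends on the current state and the op-code input."""
    if state == 0:                      # instruction fetch
        return 1
    if state == 1:                      # decode: dispatch on the op-code
        if op in (OP_LW, OP_SW):
            return 2
        if op == OP_RTYPE:
            return 6
        if op == OP_BEQ:
            return 8
        if op == OP_J:
            return 9
        raise ValueError("op-code not handled by this control unit")
    if state == 2:                      # memory address computation
        return 3 if op == OP_LW else 5
    if state == 3:                      # load: memory access, then write-back
        return 4
    if state == 6:                      # execution, then R-type completion
        return 7
    if state in (4, 5, 7, 8, 9):        # final states return to fetch
        return 0
    raise ValueError("unknown state")

def trace(op):
    """List the states visited while executing one instruction."""
    states = [0]
    while True:
        following = next_state(states[-1], op)
        if following == 0:
            return states
        states.append(following)

print(trace(OP_LW), trace(OP_BEQ))
```

The length of each trace is the number of clock cycles the instruction takes, which matches the 3 to 5 steps described earlier.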
Implementation of FSM for Control Unit

[Figure C.3.2: the control unit as a block of control logic plus a state register. Inputs: the op-code field (Op5-Op0) and the current state (S3-S0). Outputs: the datapath control signals (PCWrite, PCWriteCond, IorD, MemRead, MemWrite, IRWrite, MemtoReg, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst) and the next state (NS3-NS0).]

MIPS Interrupt Processing

- We are implementing processing of only two exceptions: illegal op-code and integer overflow.
- When either exception occurs, the MIPS processor processes it (as any other interrupt) in the following steps:
  - Step 1: The EPC register gets a value equal to the address of the faulty instruction.
  - Step 2: PC <- 80000180 (hex); the Cause register <- a code of the exception, with distinct values for illegal op-code and integer overflow.
  - Step 3: The processor is now running in Kernel mode.
- Note: we are not implementing step 3.
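The two implemented exception steps can be sketched as follows. This is a minimal model under stated assumptions: the cause codes are symbolic placeholders rather than the numeric values used by the hardware, the handler address is the hexadecimal value 80000180, and EPC is taken as PC - 4 on the assumption that the PC was already incremented during instruction fetch. The overflow check and all function names are illustrative.

```python
MASK32 = 0xFFFFFFFF
EXCEPTION_VECTOR = 0x80000180

# Symbolic placeholders; the real numeric cause codes are not modelled here.
CAUSE_ILLEGAL_OPCODE = "illegal op-code"
CAUSE_INTEGER_OVERFLOW = "integer overflow"

def to_signed32(value):
    """Interpret a 32-bit pattern as a signed two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def add_overflows(a, b):
    """True when the signed 32-bit sum of a and b does not fit in 32 bits."""
    total = to_signed32(a) + to_signed32(b)
    return not (-(1 << 31) <= total < (1 << 31))

def take_exception(state, cause):
    """Step 1: EPC <- address of the faulty instruction.
    Step 2: PC <- exception vector; Cause <- code of the exception."""
    state["EPC"] = (state["PC"] - 4) & MASK32   # assumes PC was already incremented
    state["Cause"] = cause
    state["PC"] = EXCEPTION_VECTOR

# Example: an add that overflows while PC already points past the instruction.
state = {"PC": 0x104, "EPC": 0, "Cause": None}
if add_overflows(0x7FFFFFFF, 1):
    take_exception(state, CAUSE_INTEGER_OVERFLOW)
print(hex(state["EPC"]), hex(state["PC"]), state["Cause"])
```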
Multi-cycle Datapath for Exception Handling

[Figure 5.39 with corrections in red: the multi-cycle datapath extended for exception handling. Added elements: the EPC register, the Cause register, an Overflow output from the ALU, the control signals IntCause, CauseWrite and EPCWrite, and the exception handler address as an additional PC source.]

FSM Graph with Exception Handling

[Figure 5.40 with additions in red: the finite state machine graph for the control unit extended with two exception states. One is reached from instruction decode/register fetch when Op = other (illegal op-code); the other is reached from R-type completion when Overflow is detected. Both set IntCause, write the Cause register, write the EPC and load the PC with the exception handler address.]