A Second Datapath Example: YH16
Lecture 09
Prof. Yih Huang
CS365

A 16-Bit Architecture: YH16
- A word is 16 bits wide.
- 32 general-purpose registers, 16 bits each. Like MIPS, R0 is hardwired to zero.
- 16-bit PC
- 16-bit ALU
- Memory space: 2^16 words.

Instruction Format I

  bits 15-10   bits 9-5   bits 4-0
  opcode       rs         rt
  6 bits       5 bits     5 bits

  opcode   name   meaning
  000000   add    rs ← rs + rt
  000001   and    rs ← rs AND rt
  000010   or     rs ← rs OR rt
  000011   xor    rs ← rs XOR rt
  001000   sub    rs ← rs − rt
  001100   slt    rs ← 1 if rs < rt, or 0 otherwise

Instruction Format II

  opcode   rs       rt       Immediate
  6 bits   5 bits   5 bits   16 bits

  opcode   name   meaning
  010000   addi   rt ← rs + Immd16
  010001   andi   rt ← rs AND Immd16
  010010   ori    rt ← rs OR Immd16
  010011   xori   rt ← rs XOR Immd16
  011000   subi   rt ← rs − Immd16
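
The two formats above can be sketched as a small encoder. This is only an illustration: the helper name `encode` is hypothetical, and the placement of the 16-bit immediate in a second word right after the opcode/rs/rt word is an assumption based on the 16-bit word size.

```python
# Opcodes copied from the Format I and Format II tables.
FORMAT_I = {"add": 0b000000, "and": 0b000001, "or": 0b000010,
            "xor": 0b000011, "sub": 0b001000, "slt": 0b001100}
FORMAT_II = {"addi": 0b010000, "andi": 0b010001, "ori": 0b010010,
             "xori": 0b010011, "subi": 0b011000}


def encode(name, rs, rt, immd16=None):
    """Return an instruction as a list of 16-bit words (hypothetical helper)."""
    if not (0 <= rs < 32 and 0 <= rt < 32):
        raise ValueError("register numbers are 5 bits (0..31)")
    if name in FORMAT_I:
        opcode, extra = FORMAT_I[name], []
    elif name in FORMAT_II:
        if immd16 is None:
            raise ValueError("Format II needs a 16-bit immediate")
        # Assumption: the immediate is stored in the word after the first one.
        opcode, extra = FORMAT_II[name], [immd16 & 0xFFFF]
    else:
        raise ValueError("unknown instruction: " + name)
    # opcode in bits 15-10, rs in bits 9-5, rt in bits 4-0
    first = (opcode << 10) | (rs << 5) | rt
    return [first] + extra


print([hex(w) for w in encode("add", 1, 2)])
print([hex(w) for w in encode("addi", 1, 2, 0x00FF)])
```

A Format I instruction yields one word and a Format II instruction yields two, which is why the Format II instructions need an extra memory access during execution.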

Memory Reference Instructions

  opcode   name   format   meaning
  100000   ld     I        rt ← Mem[rs]
  100001   ldi    II       rt ← Mem[rs+Immd16]
  100010   st     I        Mem[rs] ← rt
  100011   sti    II       Mem[rs+Immd16] ← rt

Branch Instructions

  opcode   name   format   meaning
  110000   bt     I        PC ← rs, if rt ≠ 0
  110001   bf     I        PC ← rs, if rt = 0
  110100   beq    II       PC ← PC+Immd16, if rs = rt
  110101   bne    II       PC ← PC+Immd16, if rs ≠ rt

- bt: branch when true
- bf: branch when false
- Unconditional branches?

Exercise
Create a program that performs Mem[F0F2] = Mem[F0F0] + Mem[F0F1].

  Address   Machine code   Assembly code
  1000
  1001
  1002
  1003
  1004
  1005
  1006
  1007
  1008
  100A

Exercise
Create a program that jumps to location ABCD when Mem[2000] is zero.

  Address   Machine code   Assembly code
  1000
  1001
  1002
  1003
  1004
  1005
  1006
  1007
  1008
  100A

Processor Memory Interface
[Figure: processor-memory interface. Labels: CPU, A, C, PC, MAx (inputs 00, 01, 10), Addr bus, addr port, IR, MD, B, MDx (inputs 0, 1), Data bus, RW (0: Read, 1: Write), data port, Memory.]

Data Path
[Figure: datapath. Labels: op, UseR0, rs, rt, PC, 1x, ALUop, MD, RFx (inputs 00, 01, 10, 11), UseRt, Rd#1, Rd#2, Register File, Wr#, A, B, ALU, 2x (inputs 00, 01, 10, 11), Z.]

- Can't write to PC if PCw==0.
- When PCw==1:
  - write PC if UseZ==00
  - write PC if UseZ==01 and Z==1
  - write PC if UseZ==10 and Z==0
- Input to PC is determined by PCx.
[Figure: PC write logic. UseZ (00, 01, 10) selects among 1, Z, and the complement of Z; the selection is combined with PCw to form the PC write enable. PCx (0, 1) selects the input to PC.]

Control Signals

  ALUop (4 bits)
  0000   AND
  0001   OR
  0010   XOR
  0011   ADD
  1011   SUB
  1100   SLT

  opcode   IR[15:10]
  rs       IR[9:5]
  rt       IR[4:0]
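
The program-counter write rule and the ALU operation codes above can be sketched in a few lines. The function names are hypothetical. Two details are assumptions not stated on the slides: Z is taken to be 1 exactly when the ALU result is zero, and SLT compares its operands as unsigned values.

```python
MASK = 0xFFFF  # YH16 words are 16 bits wide


def alu(aluop, x, y):
    """Return (result, Z) for the 4-bit ALUop codes listed above."""
    if aluop == 0b0000:
        result = x & y
    elif aluop == 0b0001:
        result = x | y
    elif aluop == 0b0010:
        result = x ^ y
    elif aluop == 0b0011:
        result = (x + y) & MASK
    elif aluop == 0b1011:
        result = (x - y) & MASK
    elif aluop == 0b1100:
        result = 1 if x < y else 0  # assumption: unsigned comparison
    else:
        raise ValueError("ALUop not listed on the slide")
    return result, int(result == 0)  # assumption: Z flags a zero result


def pc_write_enabled(pcw, usez, z):
    """Decide whether the program counter is written in this cycle."""
    if pcw == 0:
        return False          # no write at all when the enable is off
    if usez == 0b00:
        return True           # unconditional write
    if usez == 0b01:
        return z == 1         # write only when Z is set
    if usez == 0b10:
        return z == 0         # write only when Z is clear
    raise ValueError("UseZ == 11 is not defined on the slide")


_, z = alu(0b1011, 5, 5)      # SUB of equal values gives a zero result
print(pc_write_enabled(1, 0b01, z))
```

Subtracting two equal values and then writing the program counter with UseZ == 01 is the pattern a conditional branch on equality would rely on.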

Multiplexer controls:
- 1x (mux for the 1st input of ALU)
- 2x (mux for the 2nd input of ALU)
- RFx (mux for register file write)
- PCx (mux for PC)
- MAx (mux for memory address)
- MDx (mux for memory data)

Write enable controls: Aw, Bw, Cw, PCw, RFw, IRw, MDw, RW
- UseZ: whether to use the Zero output from ALU to affect PC writes.
- UseR0: whether the first read register # is rs or 0.
- UseRt: whether the write register # is rs or rt.

Discussion
- Notice the lack of Zw (enable-write-to-Z): Z is written in all cycles.
- This creates constraints in timing. Example: to see whether A==B,
  - there is only one cycle where Z reflects the result of A − B;
  - that cycle follows the cycle of A − B.
- You must take these constraints into account when designing the cycle-by-cycle activities of instructions.

Exercise
Give the control signals to fetch the next instruction and increase PC at the same time.
  IR ← Mem[PC]
  C ← PC + 1

  ALU | Write Enables                | Multiplexer Controls | Use
      | A  B  C  PC  RF  IR  MD  RW  | 1  2  RF  PC  MA  MD | Z  R0  Rt

Exercise
Give the control signals to carry out the following tasks in one cycle:
  PC ← C
  A ← R0
  Rt ← PC

  ALU | Write Enables                | Multiplexer Controls | Use
      | A  B  C  PC  RF  IR  MD  RW  | 1  2  RF  PC  MA  MD | Z  R0  Rt

ADD
  Cycle 0: IR ← Mem[PC]; C ← PC + 1
  Cycle 1: PC ← C; A ← Reg[rs]; B ← Reg[rt]
  Cycle 2: C ← A + B
  Cycle 3: Reg[rs] ← C
Instructions and, or, xor, sub, slt are similar.
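
The four ADD cycles above can be traced with a small register-transfer sketch. It is not the control-signal answer to any exercise; it only replays the listed transfers. The dictionary layout and the name `run_add` are hypothetical. Python's tuple assignment evaluates every right-hand side before storing anything, which stands in for registers that all load on the same clock edge.

```python
MASK = 0xFFFF  # 16-bit words


def run_add(s):
    """Replay the four ADD cycles on a state dictionary (hypothetical layout)."""
    # Cycle 0: instruction register <- Mem[PC]; C <- PC + 1 (both use the old PC)
    s["IR"], s["C"] = s["mem"][s["PC"]], (s["PC"] + 1) & MASK
    rs = (s["IR"] >> 5) & 0x1F   # bits 9-5
    rt = s["IR"] & 0x1F          # bits 4-0
    # Cycle 1: PC <- C; A <- Reg[rs]; B <- Reg[rt]
    s["PC"], s["A"], s["B"] = s["C"], s["reg"][rs], s["reg"][rt]
    # Cycle 2: C <- A + B
    s["C"] = (s["A"] + s["B"]) & MASK
    # Cycle 3: Reg[rs] <- C (register 0 stays hardwired to zero)
    if rs != 0:
        s["reg"][rs] = s["C"]
    return s


state = {"PC": 0x1000, "IR": 0, "A": 0, "B": 0, "C": 0,
         "reg": [0] * 32, "mem": {0x1000: 0x0022}}  # 0x0022 encodes add r1, r2
state["reg"][1], state["reg"][2] = 5, 7
run_add(state)
print(state["reg"][1], hex(state["PC"]))
```

Note that register C is reused: it first carries PC + 1 and then the sum, which works only because the program counter has already been updated in cycle 1.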

Hints
- This is not a programming exercise.
- Activities in each cycle must be doable with the datapath, without conflicts in resources.
- Parallel activities are not mandatory; that is, one-cycle-one-action is not wrong.
- In practice, parallel activities are highly desired in order to reduce the number of cycles per instruction.

Exercise
Give the control signals to execute ADD.

  ALU | Write Enables                | Multiplexer Controls | Use
      | A  B  C  PC  RF  IR  MD  RW  | 1  2  RF  PC  MA  MD | Z  R0  Rt

ADDI
  Cycle 0: IR ← Mem[PC]; C ← PC + 1
  Cycle 1: PC ← C; A ← Reg[rs]; MD ← Mem[C]
  Cycle 2: C ← A + MD
  Cycle 3: Reg[rt] ← C; C ← PC + 1
  Cycle 4: PC ← C
andi, ori, xori, subi are similar.

Exercise
Give the control signals to execute ADDI.

  ALU | Write Enables                | Multiplexer Controls | Use
      | A  B  C  PC  RF  IR  MD  RW  | 1  2  RF  PC  MA  MD | Z  R0  Rt

LD (Load)
  Cycle 0: IR ← Mem[PC]; C ← PC + 1
  Cycle 1: A ← Reg[rs]; PC ← C
  Cycle 2: MD ← Mem[A]
  Cycle 3: Reg[rt] ← MD

Exercise: ST (Store)

LDI (Load with Immd)
  Cycle 0: IR ← Mem[PC]; C ← PC + 1
  Cycle 1: A ← Reg[rs]; MD ← Mem[C]; PC ← C
  Cycle 2: C ← A + MD
  Cycle 3: C ← PC + 1; MD ← Mem[C]
  Cycle 4: Reg[rt] ← MD; PC ← C

Exercise
Give the cycle-by-cycle actions of STI.

Exercise
Give the control signals to execute STI.

  ALU | Write Enables                | Multiplexer Controls | Use
      | A  B  C  PC  RF  IR  MD  RW  | 1  2  RF  PC  MA  MD | Z  R0  Rt

BT (branch when true)
  Cycle 0: IR ← Mem[PC]; C ← PC + 1
  Cycle 1: PC ← C; A ← Reg[0]; B ← Reg[rt]
  Cycle 2: Z ← (A==B); A ← Reg[rs]
  Cycle 3: PC ← A, if Z = 0 (that is, rt ≠ 0)

BEQ
  Cycle 0: IR ← Mem[PC]; C ← PC + 1
  Cycle 1: PC ← C; MD ← Mem[C]; A ← Reg[rs]; B ← Reg[rt]
  Cycle 2: C ← PC + 1
  Cycle 3: PC ← C
  Cycle 4: C ← PC + MD
  Cycle 5: Z ← (A==B)
  Cycle 6: PC ← C, if Z

Exercises
Give the control signals of BEQ cycle 6.

  ALU | Write Enables                | Multiplexer Controls | Use
      | A  B  C  PC  RF  IR  MD  RW  | 1  2  RF  PC  MA  MD | Z  R0  Rt

Give the control signals of BT cycle 3.

  ALU | Write Enables                | Multiplexer Controls | Use
      | A  B  C  PC  RF  IR  MD  RW  | 1  2  RF  PC  MA  MD | Z  R0  Rt
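
The seven BEQ cycles above can be replayed the same way, as a register-transfer sketch rather than as control signals. The state layout and the name `run_beq` are hypothetical. The sketch shows where the branch target comes from: by cycle 4 the program counter already points past the immediate word, so the target is that address plus the immediate.

```python
MASK = 0xFFFF  # 16-bit words


def run_beq(s):
    """Replay the seven BEQ cycles on a state dictionary (hypothetical layout)."""
    # Cycle 0: fetch the first word; C <- PC + 1
    s["IR"], s["C"] = s["mem"][s["PC"]], (s["PC"] + 1) & MASK
    rs, rt = (s["IR"] >> 5) & 0x1F, s["IR"] & 0x1F
    # Cycle 1: PC <- C; MD <- Mem[C] (the immediate); A <- Reg[rs]; B <- Reg[rt]
    s["PC"], s["MD"], s["A"], s["B"] = (
        s["C"], s["mem"][s["C"]], s["reg"][rs], s["reg"][rt])
    # Cycle 2: C <- PC + 1
    s["C"] = (s["PC"] + 1) & MASK
    # Cycle 3: PC <- C (PC now points past the immediate word)
    s["PC"] = s["C"]
    # Cycle 4: C <- PC + MD (branch target)
    s["C"] = (s["PC"] + s["MD"]) & MASK
    # Cycle 5: Z <- (A == B)
    s["Z"] = int(s["A"] == s["B"])
    # Cycle 6: PC <- C, only if Z is set
    if s["Z"] == 1:
        s["PC"] = s["C"]
    return s


def fresh_state(r1, r2):
    # 0xD022 encodes beq r1, r2 (opcode 110100); the immediate word is 0x0010.
    s = {"PC": 0x1000, "IR": 0, "A": 0, "B": 0, "C": 0, "MD": 0, "Z": 0,
         "reg": [0] * 32, "mem": {0x1000: 0xD022, 0x1001: 0x0010}}
    s["reg"][1], s["reg"][2] = r1, r2
    return s


print(hex(run_beq(fresh_state(9, 9))["PC"]))   # branch taken
print(hex(run_beq(fresh_state(9, 8))["PC"]))   # branch not taken
```

The comparison is placed in cycle 5, immediately before the conditional write in cycle 6, because Z has no write enable and would be overwritten by any later ALU activity.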

The Control Unit
- The control unit is responsible for generating control signals so that the datapath carries out the right actions at the right times.
- We use a 3-bit register S to keep track of the present cycle in executing the current instruction (2nd cycle of add, 5th of ldi, …).
- S is just three D flip-flops.

Generating Control Signals
The conditions when PCw is true:
- Cycle 1 of all instructions
- Cycle 3 of bt, bf
- Cycle 4 of addi, andi, ori, xori, subi
- Cycle 4 of ldi, sti
- Cycle 6 of beq, bne
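
The conditions listed above can be written as a function of the cycle number held in S and the 6-bit opcode. This is a behavioral sketch with a hypothetical function name, not a gate-level design; it encodes exactly the five listed conditions and nothing more.

```python
# Opcodes taken from the instruction tables earlier in the lecture.
BT, BF = 0b110000, 0b110001
IMMEDIATE_ALU = {0b010000, 0b010001, 0b010010, 0b010011, 0b011000}  # addi..subi
LDI, STI = 0b100001, 0b100011
BEQ, BNE = 0b110100, 0b110101


def pc_write_signal(cycle, opcode):
    """Return 1 when the program-counter write enable should be asserted."""
    if cycle == 1:
        return 1                                   # cycle 1 of all instructions
    if cycle == 3 and opcode in (BT, BF):
        return 1                                   # cycle 3 of bt, bf
    if cycle == 4 and (opcode in IMMEDIATE_ALU or opcode in (LDI, STI)):
        return 1                                   # cycle 4 of addi..subi, ldi, sti
    if cycle == 6 and opcode in (BEQ, BNE):
        return 1                                   # cycle 6 of beq, bne
    return 0


# Tabulate the signal for bt over all eight possible values of S.
print([pc_write_signal(cycle, BT) for cycle in range(8)])
```

Each `if` corresponds to one product term: the cycle test becomes a pattern on S2 S1 S0 and the opcode test becomes a sum of patterns on the six opcode bits.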
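
Boolean expression to generate PCw:

$$
\begin{aligned}
PCw ={}& \overline{S_2}\,\overline{S_1}\,S_0 \\
&+ \overline{S_2}\,S_1\,S_0\,\bigl(
   IR_{15}\,IR_{14}\,\overline{IR_{13}}\,\overline{IR_{12}}\,\overline{IR_{11}}\,\overline{IR_{10}}
 + IR_{15}\,IR_{14}\,\overline{IR_{13}}\,\overline{IR_{12}}\,\overline{IR_{11}}\,IR_{10} \bigr) \\
&+ S_2\,\overline{S_1}\,\overline{S_0}\,\bigl(
   \overline{IR_{15}}\,IR_{14}\,\overline{IR_{13}}\,\overline{IR_{12}}\,\overline{IR_{11}}\,\overline{IR_{10}}
 + \overline{IR_{15}}\,IR_{14}\,\overline{IR_{13}}\,\overline{IR_{12}}\,\overline{IR_{11}}\,IR_{10} \\
&\qquad + \overline{IR_{15}}\,IR_{14}\,\overline{IR_{13}}\,\overline{IR_{12}}\,IR_{11}\,\overline{IR_{10}}
 + \overline{IR_{15}}\,IR_{14}\,\overline{IR_{13}}\,\overline{IR_{12}}\,IR_{11}\,IR_{10}
 + \overline{IR_{15}}\,IR_{14}\,IR_{13}\,\overline{IR_{12}}\,\overline{IR_{11}}\,\overline{IR_{10}} \bigr) \\
&+ S_2\,\overline{S_1}\,\overline{S_0}\,\bigl(
   IR_{15}\,\overline{IR_{14}}\,\overline{IR_{13}}\,\overline{IR_{12}}\,\overline{IR_{11}}\,IR_{10}
 + IR_{15}\,\overline{IR_{14}}\,\overline{IR_{13}}\,\overline{IR_{12}}\,IR_{11}\,IR_{10} \bigr) \\
&+ S_2\,S_1\,\overline{S_0}\,\bigl(
   IR_{15}\,IR_{14}\,\overline{IR_{13}}\,IR_{12}\,\overline{IR_{11}}\,\overline{IR_{10}}
 + IR_{15}\,IR_{14}\,\overline{IR_{13}}\,IR_{12}\,\overline{IR_{11}}\,IR_{10} \bigr)
\end{aligned}
$$

Exercise
Give the conditions when ALU3 is true:

Boolean expression to generate ALU3: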

Exercise
Give the conditions when UseRt is true:

Boolean expression to generate UseRt:

Exercise
Give the conditions when MAx0 is true:

Give the conditions when MAx1 is true:

Abstract View of YH16 Control
[Figure: abstract view of the control unit. Labels: state bits S0, S1, S2; next-state bits NS0, NS1, NS2; Next State Function; opcode (6 bits); Output Function; Control Signals (25).]

Output Function
- Figure out the Boolean expressions for all 25 control signals, simplify them, and draw the digital circuit accordingly. The result is the output function.
- Next we work on the next state function, for which we need to figure out the state transition diagram first.

State Transition Diagram
[Figure: state transition diagram with states 000, 001, 010, 011, 110, 101, 100.]

The conditions when NS0 is true:

The Boolean expression for NS0:

Hardwired Implementations
- Write down Boolean expressions for control signals and next states.
- Simplify the expressions and draw digital circuits.
- Or one can use a more systematic approach.

A Truth Table for YH16 Controls

  6-bit opcode   3-bit state   25-bit control signals   3-bit next state
  oooooo         sss           ccccccc ccccc            sss

ROM Implementations
- Just burn the truth table to ROM.
- Address = opcode and state (cycle #).
- If we limit all instructions to 8 steps, then YH16 needs a ROM with a 9-bit address (3-bit state; 6-bit opcode).
- Each ROM word has 25 + 3 = 28 bits.

ROM Addresses and Contents
- 9-bit address: 6 bits for opcode, 3 bits for state
- 28-bit contents: 25 bits for control signals, 3 bits for next state
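
A minimal sketch of the lookup described above, assuming the opcode occupies the high six address bits and the next state occupies the low three bits of each word (the slides give the field widths but not their order). The control values stored here are placeholders, not real YH16 control vectors, and the wrap back to state 000 after the last step is also an assumption.

```python
CONTROL_BITS = 25
STATE_BITS = 3


def rom_address(opcode, state):
    """9-bit address: assumed layout is opcode (6 bits) then state (3 bits)."""
    return ((opcode & 0x3F) << STATE_BITS) | (state & 0b111)


def pack_word(control, next_state):
    """28-bit word: assumed layout is control (25 bits) then next state (3 bits)."""
    control &= (1 << CONTROL_BITS) - 1
    return (control << STATE_BITS) | (next_state & 0b111)


def unpack_word(word):
    return word >> STATE_BITS, word & 0b111


ROM = [0] * (1 << 9)  # 512 words, one per (opcode, state) pair

ADD = 0b000000
for step in range(4):                 # add takes cycles 0..3
    next_state = (step + 1) % 4       # assumption: return to 000 after the last cycle
    placeholder_control = step + 1    # stands in for the real 25 control bits
    ROM[rom_address(ADD, step)] = pack_word(placeholder_control, next_state)


def trace(opcode):
    """List the states visited while executing one instruction."""
    state, visited = 0, []
    while True:
        visited.append(state)
        _control, state = unpack_word(ROM[rom_address(opcode, state)])
        if state == 0:
            return visited


print(trace(ADD))
```

Changing an instruction's behavior then means rewriting a handful of ROM words rather than redesigning logic, at the cost of storing all 512 words whether or not they are used.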

ROM-based Control
[Figure: the Control ROM. IR[15:10] and the state bits S2, S1, S0 feed the address port; the data output provides the 25-bit control signals and the next-state bits NS0, NS1, NS2.]

Microprogramming
- It is as if every instruction (opcode) has a dedicated subroutine, limited to 8 steps.
- They are of course not real subroutines (at least not the kinds we learn in 112 to 310).
- Always keep in mind that every step corresponds to one cycle time's worth of hardware actions.
- What kinds of actions are allowed (or not allowed) depends entirely on the particular datapath.

Instruction Execution Flow
Consider the executions of
  Or r2, r3
  Add r1, r2
Can you see the flow of microinstructions?
  000000xxx (add)
  000001xxx (and)
  000010xxx (or)
  000011xxx (xor)

Microcode: Trade-offs
Pros
- Easy to design and write
- Design architecture and microcode in parallel
- Easy to make changes after chips are sold (to fix bugs, for example)
Cons
- Slow
- Uses more silicon area than simplified Boolean expressions

How is a PC powered up?
- The processor receives a RESET signal.
- It is also sent to the processor when you push the reset button.
- RESET is one of the inputs to the CU.
- Upon reset, a processor executes the instruction at a fixed memory location.
- The exact location depends on the architecture.

Bootstrapping in IA32
- A pin is dedicated to an external signal called RESET.
- When receiving RESET, an x86 processor is hardwired to jump to address 0xFFFFFFF0.
- The address has to be part of ROM.
- The set of programs in ROM is called the Basic Input and Output System, or BIOS.

Once We Enter BIOS
- Perform a set of hardware tests called Power-On Self Test, or POST.
- Initialize hardware and display installed devices.
- Look for booting devices.
- Load the OS from a booting device.

Booting/Resetting YH16
Upon receiving a reset signal:
- Abort the current instruction.
- Set PC to jump to location 0000. Precisely:
  - PC ← 0000 (hex)
  - Next state ← 000 (binary)
  - IR?
- How is this achieved in the control unit?

D Flip-Flop with Reset and Write-Enable
[Figure: D flip-flop built from two D latches, with inputs D, Write Enable, Reset, and Clock, and outputs Q and its complement.]
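
A behavioral sketch of such a flip-flop, and of a 3-bit state register built from three of them, may help connect the reset question to the control unit. The class and method names are hypothetical. Two details are assumptions, since the figure's wiring is not recoverable here: reset takes priority over write enable, and reset is sampled on the clock edge.

```python
class DFlipFlop:
    """One stored bit, updated once per call to tick (one clock edge)."""

    def __init__(self):
        self.q = 0

    def tick(self, d, write_enable, reset=0):
        if reset:                 # assumption: reset overrides write enable
            self.q = 0
        elif write_enable:        # load D only when writing is enabled
            self.q = d & 1
        return self.q             # otherwise the old value is held


class StateRegister:
    """A 3-bit register made of three flip-flops, like the control unit's S."""

    def __init__(self):
        self.bits = [DFlipFlop() for _ in range(3)]

    def tick(self, next_state, write_enable=1, reset=0):
        for i, flip_flop in enumerate(self.bits):
            flip_flop.tick((next_state >> i) & 1, write_enable, reset)
        return self.value()

    def value(self):
        return sum(flip_flop.q << i for i, flip_flop in enumerate(self.bits))


s = StateRegister()
print(s.tick(0b101))                    # normal load
print(s.tick(0b010, write_enable=0))    # hold: write enable is off
print(s.tick(0b011, reset=1))           # reset forces the state back to 000
```

With flip-flops like these, asserting reset on the state register and on the program counter gives "next state 000" and "PC 0000" without involving the next-state function at all.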