UNIVERSITY OF CALIFORNIA, BERKELEY College of Engineering Department of Electrical Engineering and Computer Sciences

Elad Alon
Homework #3, EE141
Due Thursday, September 13th, 5pm, box outside 125 Cory

PROBLEM 1: CMOS Logic

a) Implement the logic function shown below with a static CMOS gate.

Out = [sum of three product terms; the input literals are missing from this copy]

Note that we can first simplify this expression before implementing this logic gate.

[Simplification steps: the literals are missing from this copy; the three-term sum is expanded to four terms, reduced to two terms, and then factored.]

The expression is now in a simplified form which is easily translated into a static CMOS gate:

[Schematic: static CMOS gate with supply Vdd and output Out]

b) Someone claims they have implemented a static CMOS gate with the circuit shown below. In order to find the problem with this gate, using the switch model, fill in a table showing the voltage (i.e., Vdd, Gnd, Vth, etc.) at Out for all possible combinations of the inputs A, B, and C. In other words, you should fill out the truth table for the gate, but with voltages instead of ones and zeros.

[Schematic: proposed gate with supply Vdd and output Out]

First, we should write out a truth table (with voltages at the output) for this circuit.

A B C Out
0 0 0 Gnd
0 0 1 Gnd
0 1 0 Gnd
0 1 1 Vdd-Vth
1 0 0 Gnd
1 0 1 ???
1 1 0 Gnd
1 1 1 Vdd-Vth

By looking at the truth table, we see that there are several problematic input combinations. In particular, when the input combination A-B-C equals either 0-1-1 or 1-1-1, the output is not driven to Vdd via a low-resistance path but instead gets stuck at Vdd-Vth, because NMOS transistors are poor at passing a logical 1. The input combination A-B-C = 1-0-1 is an issue as well, but for a different reason. In this case, the output is not well defined, since the node connected to the PMOS driven by [?] and the NMOS driven by [?] isn't tied to either supply. Because of these issues, this is not a static CMOS gate.

c) By adding just two more transistors to the circuit shown above, fix the circuit so that it will indeed implement a static gate, with the function shown below. Note that you are free to use both the true and complement versions of the input signals (A, B, and C) to achieve this.

$Out = C(A + B)$

In the solution to part b), we grouped the three problematic input combinations into two different groups. In the case of the first group, which consists of input combinations 0-1-1 and 1-1-1 (in the form A-B-C), the problem stems from NMOS transistors doing a poor job at passing a logical 1. One way to fix this is to use a PMOS transistor in parallel, with the gate tied to [?]. This way, when [?] = 1, the input to the PMOS will be 0, and it will be able to cleanly pass Vdd to the output.

For the last problematic input combination, we note that the NMOS which is driven by [?] when [?] = 0 is essentially half of an inverter. The drain of the NMOS is floating (i.e., not connected to ground or supply via a low-resistance path) when [?] = 1. To fix this, we should add the other half of the inverter: a PMOS with its gate and drain connected to the same nets as the gate and drain of the NMOS, and with its source tied to Vdd. This revised schematic is shown below:

[Schematic: revised gate with supply Vdd and output Out]

To verify that this implements the desired function, let's write out the truth table:

A B C Out
0 0 0 Gnd
0 0 1 Gnd
0 1 0 Gnd
0 1 1 Vdd
1 0 0 Gnd
1 0 1 Vdd
1 1 0 Gnd
1 1 1 Vdd

We can write out the sum-of-products expression and simplify:

$Out = \overline{A}BC + A\overline{B}C + ABC$
$= \overline{A}BC + A\overline{B}C + ABC + ABC$
$= \overline{A}BC + ABC + A\overline{B}C + ABC$
$= BC + AC$
$= C(A + B)$

Now we've verified that this is a static CMOS gate that implements the desired function.

PROBLEM 2: Gate Sizing

Recall that we have defined β as the ratio between the width of the PMOS transistor and the NMOS transistor, i.e., β = W_p/W_n. In this problem we will explore how to optimize β based on different design metrics by using HSPICE. For the following AOI gate, we will use the same sizing for all PMOS transistors in the PUN. Similarly, all the NMOS transistors in the PDN are identically sized. You should make the NMOS transistor 1μm wide, and alter the width of the PMOS transistor to change the gate's β ratio. The channel lengths of both the NMOS and PMOS transistors should be fixed at 0.09μm. This is a good chance to explore HSPICE and use some of its built-in functionality to make this problem easier. (Hint: you'll want to sweep transistor parameters and use .measure statements. Examples will be shown in discussion session.)

a) Plot VIL and VIH of the AOI gate shown below versus the β ratio, for the input A. In order to measure VIL and VIH, you should assume that the B, C inputs are set to Vdd and GND, respectively, and then sweep the A input from 0 to Vdd to trace out the VTC.

[Schematic: AOI gate; all PMOS widths β, all NMOS widths 1μm, output Out]

The plot of VIL (yellow) and VIH (green) vs β is shown below:

[Plot: VIL and VIH vs β]

* HW3, Problem 2a
.lib '/home/ff/ee141/models/gpdk090_mos.sp' TT_S1V

** Parameters **
.param step = .001
.param vdd_val = 1.2
.param beta = 2

** AOI definition **
.subckt aoi out a b c gnd
** Power Supply **
vdd vdd gnd vdd_val
** Pull Up Network **
M0 pint a vdd vdd gpdk090_pmos1v L=90e-9 W='beta*1e-6'
M1 pint b vdd vdd gpdk090_pmos1v L=90e-9 W='beta*1e-6'
M2 out c pint vdd gpdk090_pmos1v L=90e-9 W='beta*1e-6'
** Pull Down Network **
M3 out a nint gnd gpdk090_nmos1v L=90e-9 W=1u
M4 nint b gnd gnd gpdk090_nmos1v L=90e-9 W=1u
M5 out c gnd gnd gpdk090_nmos1v L=90e-9 W=1u
.ends

** Power Supplies **
vdd vdd gnd vdd_val
vin in gnd vdd_val
vind ind in step

** AOI Gates for Vil, Vih **
X0 out in vdd gnd gnd aoi M=1
X0d outd ind vdd gnd gnd aoi M=1

** Options **
.options post=2 nomod
.op

** DC Sweep **
.dc vin 0 vdd_val step beta .1 10 0.05

** Vil and Vih measurements vs beta **
.measure dc vil find v(in) when par('(v(outd)-v(out))/step')=-1 cross=1
.measure dc vih find v(in) when par('(v(outd)-v(out))/step')=-1 cross=2
.end
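The two .measure statements in the Problem 2a netlist locate VIL and VIH as the first and second input voltages at which the finite-difference slope (v(outd)-v(out))/step crosses -1. The same search can be sketched on sampled VTC data outside HSPICE. This is only an illustration: the tanh-shaped curve `toy_vtc`, its `gain` and `vm` parameters, and the helper name `unity_gain_points` are assumptions invented for the example and do not describe the AOI gate's actual characteristic.

```python
import math

VDD = 1.2      # supply voltage, same value as vdd_val in the netlist
STEP = 0.001   # input offset between the two gate copies, same as step

def toy_vtc(vin, gain=8.0, vm=0.6):
    """Hypothetical inverting VTC (assumed shape, not simulation data)."""
    return 0.5 * VDD * (1.0 - math.tanh(gain * (vin - vm)))

def unity_gain_points(vtc, vdd=VDD, step=STEP):
    """Return the input voltages where the finite-difference slope crosses -1."""
    crossings = []
    prev_vin, prev_slope = None, None
    for i in range(int(round(vdd / step))):
        vin = i * step
        # same quantity as par('(v(outd)-v(out))/step') in the netlist
        slope = (vtc(vin + step) - vtc(vin)) / step
        if prev_slope is not None and (prev_slope + 1.0) * (slope + 1.0) < 0:
            # linearly interpolate between the two neighbouring samples
            frac = (-1.0 - prev_slope) / (slope - prev_slope)
            crossings.append(prev_vin + frac * (vin - prev_vin))
        prev_vin, prev_slope = vin, slope
    return crossings

# cross=1 corresponds to VIL, cross=2 to VIH
vil, vih = unity_gain_points(toy_vtc)[:2]
print(f"VIL ~ {vil:.3f} V, VIH ~ {vih:.3f} V")
```

Feeding the function an interpolated table of simulated (v(in), v(out)) points instead of `toy_vtc` would give the values for a particular β.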

b) Sweep β and plot the high-to-low transition delay and the low-to-high transition delay for the second AOI gate in the fanout-of-4 chain shown below.

[Schematic: chain of four AOI gates with M = 1, M = 4, M = 16, M = 64 and output Out; the delay t_p is marked on the chain]

The plot of high-to-low (green) and low-to-high (yellow) transition delays versus beta is shown below:

[Plot: high-to-low and low-to-high transition delays vs β]

* HW3, Problem 2b
.lib '/home/ff/ee141/models/gpdk090_mos.sp' TT_S1V

** Parameters **
.param step = .001
.param vdd_val = 1.2
.param beta = 2

** AOI definition **
.subckt aoi out a b c gnd
** Power Supply **
vdd vdd gnd vdd_val
** Pull Up Network **
M0 pint a vdd vdd gpdk090_pmos1v L=90e-9 W='beta*1e-6'
M1 pint b vdd vdd gpdk090_pmos1v L=90e-9 W='beta*1e-6'
M2 out c pint vdd gpdk090_pmos1v L=90e-9 W='beta*1e-6'
** Pull Down Network **
M3 out a nint gnd gpdk090_nmos1v L=90e-9 W=1u
M4 nint b gnd gnd gpdk090_nmos1v L=90e-9 W=1u
M5 out c gnd gnd gpdk090_nmos1v L=90e-9 W=1u
.ends

** Power Supplies **
vdd vdd gnd vdd_val
vin1 in1 gnd pulse 0 vdd_val 100ps 100ps 100ps .4ns 1ns

** AOI Gates for delay measurements **
X1 in2 in1 vdd gnd gnd aoi M=1
X2 in3 in2 vdd gnd gnd aoi M=4
X3 in4 in3 vdd gnd gnd aoi M=16
X4 out_delay in4 vdd gnd gnd aoi M=64

** Options **
.options post=2 nomod
.op

** Transient Simulation **
.tran 100f 5n sweep beta 0.1 10 0.05

** Delay Measurements vs beta **
.measure tran tplh trig v(in1) val='vdd_val/2' fall=2 targ v(in2) val='vdd_val/2'
+ rise=2
.measure tran tphl trig v(in1) val='vdd_val/2' rise=2 targ v(in2) val='vdd_val/2'
+ fall=2

** Worst case delay through 3 gates vs beta for part c **
.measure tran delay_case1 param='tplh+2*tphl'
.measure tran delay_case2 param='2*tplh+tphl'
.measure tran delay_wc param='max(delay_case1, delay_case2)'
.end
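The last three .measure lines of the Problem 2b netlist combine the two measured delays into a worst case over three inverting stages. The arithmetic, and the selection of the β with the smallest worst case, can be sketched as follows. The function names and the three sample rows are hypothetical; the numbers exist only to exercise the code and are not simulation results.

```python
def worst_case_delay(tplh, tphl):
    """Worst case through three inverting stages.

    Output edges alternate along the chain, so a path sees either one
    low-to-high and two high-to-low transitions, or the reverse.
    """
    delay_case1 = tplh + 2 * tphl
    delay_case2 = 2 * tplh + tphl
    return max(delay_case1, delay_case2)

def best_beta(samples):
    """samples: iterable of (beta, tplh, tphl); returns (beta, delay_wc) at the minimum."""
    scored = ((beta, worst_case_delay(tplh, tphl)) for beta, tplh, tphl in samples)
    return min(scored, key=lambda pair: pair[1])

# Hypothetical (beta, tplh, tphl) rows in ps, NOT taken from the plots
samples = [(1.0, 40.0, 20.0), (2.0, 30.0, 25.0), (4.0, 28.0, 40.0)]
print(best_beta(samples))
```

With the swept tplh and tphl values exported from HSPICE in place of `samples`, the same call would pick out the minimum of the delay_wc curve.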

c) What β would you use to minimize the worst case delay of 3 fanout-of-4 AOI gates? (Hint: The delay characteristics are the same as in part b, so you should be able to use the data extracted from there to answer this.)

The netlist in part b) includes several lines that measure the worst case delay of a cascade of three of these AOI gates, each with a fanout of 4. These lines are listed below:

.measure tran delay_case1 param='tplh+2*tphl'
.measure tran delay_case2 param='2*tplh+tphl'
.measure tran delay_wc param='max(delay_case1, delay_case2)'

As stated in the code, there are two delays for a cascade of three gates: one which involves 2 high-to-low and 1 low-to-high transitions, and one that involves 1 high-to-low and 2 low-to-high transitions. We take the max of these two to get the worst case. This generates a plot for the worst case delay of 3 fanout-of-4 AOI gates vs beta, as shown below:

[Plot: worst case delay of 3 fanout-of-4 AOI gates vs β]

From this plot, we see that there is a shallow minimum for worst case delay for beta ranging from about 1.4 to 1.8.

d) Sweep β and measure the energy and power of this same AOI gate. In this simulation, you should make the input voltage source a 1GHz clock with a 50% duty cycle and 100ps rise/fall time. (An example of how to measure average power and energy using HSPICE will be shown in the discussion session.)

Below are the plots of energy and power vs beta, respectively:

[Plots: energy vs β; average power vs β]

The plots look roughly linear with beta. This fits with our intuition: by increasing beta, we're linearly increasing the drain capacitance of the PMOS transistors in this gate and the gate capacitance of the PMOS transistors in the next stage. Energy is proportional to capacitance, so it should scale linearly as capacitance scales. Average power should have the same shape, since it is just the energy in a cycle divided by the period of the cycle. The netlist for this portion is shown below:

* HW3, Problem 2d
.lib '/home/ff/ee141/models/gpdk090_mos.sp' TT_S1V

** Parameters **
.param step = .001
.param vdd_val = 1.2
.param beta = 2

** AOI definition **
.subckt aoi out a b c gnd
** Power Supply **
vdd vdd gnd vdd_val
** Pull Up Network **
M0 pint a vdd vdd gpdk090_pmos1v L=90e-9 W='beta*1e-6'
M1 pint b vdd vdd gpdk090_pmos1v L=90e-9 W='beta*1e-6'
M2 out c pint vdd gpdk090_pmos1v L=90e-9 W='beta*1e-6'
** Pull Down Network **
M3 out a nint gnd gpdk090_nmos1v L=90e-9 W=1u
M4 nint b gnd gnd gpdk090_nmos1v L=90e-9 W=1u
M5 out c gnd gnd gpdk090_nmos1v L=90e-9 W=1u
.ends

** Power Supplies **
vdd vdd gnd vdd_val
vin1 in1 gnd pulse 0 vdd_val 100ps 100ps 100ps .4ns 1ns

** AOI Gates for delay measurements **
X1 in2 in1 vdd gnd gnd aoi M=1
X2 in3 in2 vdd gnd gnd aoi M=4
X3 in4 in3 vdd gnd gnd aoi M=16
X4 out_delay in4 vdd gnd gnd aoi M=64

** Options **
.options post=2 nomod
.op

** Transient Simulation **
.tran 100f 5n sweep beta 0.1 10 0.05

** Energy measurements vs beta **
* One full input cycle, measured between consecutive rising edges of in1
.measure tran tstart when v(in1)='vdd_val/2' rise=2
.measure tran tstop when v(in1)='vdd_val/2' rise=3
* Average current through the supply source inside the second gate (X2)
.measure tran i_avg avg i(x2.vdd) from='tstart' to='tstop'
* Energy per cycle = Vdd * average supply current * cycle time
.measure tran energy param='-1 * i_avg * vdd_val * (tstop - tstart)'
.measure tran p_avg param='-1 * i_avg * vdd_val'
.end

PROBLEM 3: Decoder Warm-up

In this problem, we will implement two decoders using NOR2 and NAND2 gates and inverters and then analyze them to see the effect of the number of inputs on the energy and number of gates. You can use both true and complement forms of the address signals as inputs.

a) Implement a 2 to 4 decoder by using only NOR2 gates and inverters. Draw the complete schematic and label the inputs and outputs.

The truth table for the 2:4 decoder is shown below:

A B Decode
0 0 y0
0 1 y1
1 0 y2
1 1 y3

where the Decode column lists the output which should be high (all other outputs should be low).

As covered in discussion, these decoders are implemented by taking the AND of every possible combination of the inputs. This means that if we have n inputs, we will have 2^n outputs. Then, we can see that $y0 = \overline{A}\,\overline{B}$, $y1 = \overline{A}B$, etc. Using De Morgan's Law, we know that $\overline{A + B} = \overline{A}\,\overline{B}$, which tells us that we can implement the AND2 function with a single NOR2 gate, provided we have access to inverted inputs, which is exactly the case we are presented with. A possible implementation is shown below:

[Schematic: 2:4 decoder built from NOR2 gates; outputs y0, y2, y1, y3]

b) Implement a 3 to 8 decoder by using only NOR2 and NAND2 gates. Draw the complete schematic and label the inputs and outputs.

The truth table for this 3:8 decoder is shown below:

A B C Decode
0 0 0 y0
0 0 1 y1
0 1 0 y2
0 1 1 y3
1 0 0 y4
1 0 1 y5
1 1 0 y6
1 1 1 y7

The decoder, like in part a), is implemented by taking the AND of various input combinations. Using De Morgan's Law, each output can be expressed in the form $Y = ABC = \overline{\overline{AB \cdot C}} = \overline{\overline{AB} + \overline{C}}$. Looking at this, as long as we're given both the true and complement version of each input, we can implement each output with a single NOR2 and a single NAND2 gate. Note that there are many ways to correctly implement this decoder, but two fairly straightforward implementations are shown below.

Implementation 1:

[Schematic: 3:8 decoder, one NAND2 and one NOR2 per output; outputs y0 through y7]
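The NAND2/NOR2 decomposition of a decoder output can be checked exhaustively with a few lines of code. The sketch below assumes that each output is formed as NOR2(NAND2(a, b), NOT c) over the true or complemented address bits, with A as the most significant bit; the function names are illustrative, and the exact wiring of the drawn schematic may differ.

```python
from itertools import product

def nand2(x, y):
    """2-input NAND on 0/1 values."""
    return 1 - (x & y)

def nor2(x, y):
    """2-input NOR on 0/1 values."""
    return 1 - (x | y)

def decode_3to8(a, b, c):
    """Return [y0, ..., y7]; each output uses one NAND2 and one NOR2."""
    outputs = []
    for ka, kb, kc in product((0, 1), repeat=3):
        # Choose the true or complemented input that is 1 for this output's code
        la = a if ka else 1 - a
        lb = b if kb else 1 - b
        lc = c if kc else 1 - c
        # la AND lb AND lc == NOR2(NAND2(la, lb), NOT lc) by De Morgan's Law
        outputs.append(nor2(nand2(la, lb), 1 - lc))
    return outputs

# Exactly one output is high, and its index equals the binary address
for a, b, c in product((0, 1), repeat=3):
    ys = decode_3to8(a, b, c)
    assert sum(ys) == 1 and ys.index(1) == 4 * a + 2 * b + c
```

The complemented literal passed to the NOR2 is available for free here because both true and complement address signals are allowed as inputs.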

Implementation 2: Another possible implementation notes that each output can reuse the results of a 2:4 decoder, which then acts as a precoder for the 3:8 decoder.

[Schematic: precoder-style 3:8 decoder; outputs y0 through y7]

c) For this part of the problem you should ignore all the junction capacitors from the transistors and assume that each NOR2 and NAND2 gate has 5fF and 4fF of input capacitance, respectively. How much energy is consumed by the above decoder from part b) every time one of the address inputs changes? Note that you shouldn't forget to include the energy consumed by the address inputs.

The energy pulled out of the supply depends on the implementation. Here, we'll analyze the energy consumption in the two possible implementations shown above. The main equation we will need for this part is $E = C_t V_{DD}^2$, where $C_t$ is the total capacitance charged by the supply.

Implementation 1: In this implementation, A or B transitioning will result in 4 NAND2 gates at the first level each having an input charged from 0 to 1. This will also cause the output of 2 NAND2 gates to go from 0 to 1. This results in 2 NOR2 gates at the second level of logic each having an input charged from 0 to 1. Therefore, the total energy consumed when A or B transitions is:

$E = (4 \cdot 4\,\mathrm{fF} + 2 \cdot 5\,\mathrm{fF})\, V_{DD}^2 = 26\,\mathrm{fF} \cdot V_{DD}^2$

When input C transitions, 4 NOR2 gates at the second level will each have an input charged from 0 to 1. Therefore, the total energy consumed when C transitions is:

$E = (4 \cdot 5\,\mathrm{fF})\, V_{DD}^2 = 20\,\mathrm{fF} \cdot V_{DD}^2$

Implementation 2: In this implementation, A or B transitioning will result in 2 NAND2 gates at the first level each having an input charged from 0 to 1. Now, only 1 NAND2 gate has its output go from 0 to 1. However, since each NAND2 drives 2 NOR2 gates, this transition results in 2 NOR2 gates each having an input charged from 0 to 1. Therefore, the total energy consumed is:

$E = (2 \cdot 4\,\mathrm{fF} + 2 \cdot 5\,\mathrm{fF})\, V_{DD}^2 = 18\,\mathrm{fF} \cdot V_{DD}^2$

When input C transitions, 4 NOR2 gates at the second level will each have an input charged from 0 to 1. Therefore, the total energy consumed when C transitions is:

$E = (4 \cdot 5\,\mathrm{fF})\, V_{DD}^2 = 20\,\mathrm{fF} \cdot V_{DD}^2$

d) Now let's compare the energy consumption of the 3:8 decoder from part b) to a design that uses only NAND2 gates and inverters. What is the maximum input capacitance of the inverter for which the NAND2 and inverter only implementation has lower energy consumption than your implementation in part b)? Answer this question for the worst case energy consumption in either design, and use the same input capacitance numbers for the NOR2 and NAND2 gates as in part c).

Again, there are many possible implementations, but two implementations in a similar style to those in part b) will be shown.
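Both the part c) totals and the part d) comparison rest on the same bookkeeping: count the gate inputs that are charged from 0 to 1, weight each by its input capacitance, and multiply the sum by the square of the supply voltage. A minimal sketch of that tally follows. The input counts are the ones stated in the part c) solution; the dictionary layout, the function names, and the 1.2 V supply in the usage line (borrowed from Problem 2) are assumptions for illustration, since Problem 3 leaves V_DD symbolic.

```python
# Input capacitance per gate input in fF, as given in part c)
C_IN_FF = {"NOR2": 5, "NAND2": 4}

def switched_cap_ff(charged_inputs):
    """charged_inputs maps gate type -> number of inputs charged from 0 to 1."""
    return sum(C_IN_FF[gate] * count for gate, count in charged_inputs.items())

def energy_fj(cap_ff, vdd):
    """E = C_t * VDD^2; capacitance in fF and voltage in V give energy in fJ."""
    return cap_ff * vdd ** 2

# Counts of charged inputs, taken from the part c) solution text
cases = {
    ("Implementation 1", "A or B"): {"NAND2": 4, "NOR2": 2},
    ("Implementation 1", "C"): {"NOR2": 4},
    ("Implementation 2", "A or B"): {"NAND2": 2, "NOR2": 2},
    ("Implementation 2", "C"): {"NOR2": 4},
}

totals = {case: switched_cap_ff(inputs) for case, inputs in cases.items()}
for (impl, signal), cap in totals.items():
    # 1.2 V is an assumed supply used only for this printout
    print(f"{impl}, {signal}: {cap} fF, {energy_fj(cap, 1.2):.2f} fJ at 1.2 V")
```

Keeping the counts separate from the capacitance table makes it easy to add an inverter entry when comparing against the NAND2-and-inverter designs.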

Implementation 1:

[Schematic: 3:8 decoder built from NAND2 gates and inverters]

This implementation is very much in the same style as implementation 1 in part b), so the comparison will be done here relative to that implementation. In this implementation, A or B transitioning will result in 4 NAND2 gates at the first level each having an input charged from 0 to 1. At the second level, this transition causes the output of 2 NAND2 gates to transition from 0 to 1, charging up the inputs of 2 inverters. At the third level, two NAND2 gates each have an input charged from 0 to 1, and at the fourth level, one inverter has its input charged from 0 to 1. The total energy consumed then is:

$E = (4 \cdot 4\,\mathrm{fF} + 2 C_{ginv} + 2 \cdot 4\,\mathrm{fF} + C_{ginv})\, V_{DD}^2 = (24\,\mathrm{fF} + 3 C_{ginv})\, V_{DD}^2$

C transitioning causes 4 NAND2 gates at the third logic level to each have an input charged from 0 to 1, and 1 inverter's input to be charged from 0 to 1 at the fourth logic level. The total energy consumed is then:

$E = (4 \cdot 4\,\mathrm{fF} + C_{ginv})\, V_{DD}^2 = (16\,\mathrm{fF} + C_{ginv})\, V_{DD}^2$

For all positive values of $C_{ginv}$, the energy consumption is dominated by the A, B switching. Thus, we can safely assume that this is the worst case for this design, and compare it to the worst case for the implementation with NOR2 and NAND2 gates. This happens to also be the case where A or B switches.

$(24\,\mathrm{fF} + 3 C_{ginv})\, V_{DD}^2 = (26\,\mathrm{fF})\, V_{DD}^2$
$3 C_{ginv} = 2\,\mathrm{fF}$
$C_{ginv} = \tfrac{2}{3}\,\mathrm{fF}$

Thus, for these two implementations, we can see that we require $C_{ginv} < \tfrac{2}{3}\,\mathrm{fF}$ for this design to have lower energy consumption than implementation 1 in part b).

Implementation 2:

[Schematic: precoder-style 3:8 decoder built from NAND2 gates and inverters; outputs y0 through y7]

Similarly to the last part, this implementation is in the same style as implementation 2 in part b), as it has a precoder structure. Because of this, this solution will compare implementation 2 in part b) to this implementation. In this implementation, A or B transitioning will result in 2 NAND2 gates at the first level each having an input charged from 0 to 1. At the second level, this transition causes the output of 1 NAND2 gate to transition from 0 to 1, charging up the input of 1 inverter. At the third level, two NAND2 gates each have an input charged from 0 to 1, and at the fourth level, one inverter has its input charged from 0 to 1. The total energy consumed then is:

$E = (2 \cdot 4\,\mathrm{fF} + C_{ginv} + 2 \cdot 4\,\mathrm{fF} + C_{ginv})\, V_{DD}^2 = (16\,\mathrm{fF} + 2 C_{ginv})\, V_{DD}^2$

C transitioning causes 4 NAND2 gates at the third logic level to each have an input charged from 0 to 1, and 1 inverter's input to be charged from 0 to 1 at the fourth logic level. The total energy consumed is then:

$E = (4 \cdot 4\,\mathrm{fF} + C_{ginv})\, V_{DD}^2 = (16\,\mathrm{fF} + C_{ginv})\, V_{DD}^2$

Again, we see that the worst case, for positive $C_{ginv}$, occurs for A or B transitioning. With implementation 2 in part b), the worst case actually occurs when C transitions. We can set the energies in these two cases equal to find the critical $C_{ginv}$ above which this design consumes more energy than implementation 2 in part b).

$(16\,\mathrm{fF} + 2 C_{ginv})\, V_{DD}^2 = (20\,\mathrm{fF})\, V_{DD}^2$
$2 C_{ginv} = 4\,\mathrm{fF}$
$C_{ginv} = 2\,\mathrm{fF}$

Thus, for these two implementations, we can see that we require $C_{ginv} < 2\,\mathrm{fF}$ for this design to have lower energy consumption than implementation 2 in part b).
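The two break-even calculations in part d) have the same shape: a fixed NAND2 capacitance plus some number of inverter inputs is set equal to the worst-case capacitance of the corresponding part b) design, and V_DD^2 cancels from both sides. A small sketch of that solve, using exact fractions, is given below. The function name and argument names are hypothetical; the numeric inputs are the totals quoted in the solution.

```python
from fractions import Fraction

def break_even_cinv(fixed_ff, n_inverter_inputs, reference_ff):
    """Solve fixed_ff + n_inverter_inputs * Cinv = reference_ff for Cinv (in fF).

    VDD^2 multiplies both sides of the energy equality, so only the
    switched capacitances need to be compared.
    """
    return Fraction(reference_ff - fixed_ff, n_inverter_inputs)

# Implementation 1: (24 fF + 3 Cinv) against the 26 fF worst case from part b)
cinv_impl1 = break_even_cinv(24, 3, 26)
# Implementation 2: (16 fF + 2 Cinv) against the 20 fF worst case from part b)
cinv_impl2 = break_even_cinv(16, 2, 20)

print(f"Implementation 1 break-even: {cinv_impl1} fF")
print(f"Implementation 2 break-even: {cinv_impl2} fF")
```

An inverter input capacitance below the returned value makes the NAND2-and-inverter design the lower-energy option in its worst case.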