UNIVERSITY OF CALIFORNIA, BERKELEY College of Engineering Department of Electrical Engineering and Computer Sciences

Elad Alon
Homework #7 - Solutions
EECS141
Due Thursday, October 22, 5pm, box in 240 Cory

PROBLEM 1: Complex CMOS Gates

For this problem you should use the following parameters for the transistors.

NMOS: $L = 100$ nm, $V_{Tn} = 0.25$ V, $\mu_n = 350$ cm²/(V·s), $C_{ox} = 0.95$ µF/cm², $v_{sat} = 1\mathrm{e}7$ cm/s, $\lambda = 0$

PMOS: $L = 100$ nm, $|V_{Tp}| = 0.25$ V, $\mu_p = 175$ cm²/(V·s), $C_{ox} = 0.65$ µF/cm², $v_{sat} = 1\mathrm{e}7$ cm/s, $\lambda = 0$

a) Implement the function $F = \bar{A}\,(\bar{B} + \bar{C}) + \bar{D}\,\bar{E}$. Assuming long-channel transistors, size the devices so that the worst-case drive resistance is the same as an inverter with $W_N/L = 2$ and $W_P/L = 4$.

b) Imagine that input "B" to the gate was always the last one to arrive, making the delay of the gate from B rising or falling to the output falling or rising critical. Please rearrange the implementation of your gate so that the delay of the gate from B transitioning is minimized.

In order to minimize the delay from input B to Out, the smallest possible amount of parasitic capacitance should be charged or discharged. This happens when the transistors connected to input B are the closest to the output (i.e., at the top of the stack). The revised gate is shown below:

c) Draw a stick diagram of the gate you designed for part b) - you should minimize the diffusion breaks and use a single piece of poly for each input.

In order to implement the gate without any diffusion breaks, we need to find a consistent Euler path for both the pull-up and the pull-down network. Based on the logic graph shown below, one such path is D E A B C (note that there are other consistent Euler paths as well; any correct solution will receive full credit).
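A candidate ordering can be sanity-checked by confirming that it forms an unbroken trail through both networks. The Python sketch below does this for D E A B C. The connectivity and node names (`out`, `inta`, `inte`, `intc`, `intd`) are assumptions borrowed from the `cplx_gate` subcircuit in the HSPICE deck for part e), and `is_euler_trail` and `is_consistent` are hypothetical helper names, not part of the original solution.

```python
# Each network maps an input name to the two nodes its transistor connects.
# Connectivity is assumed from the cplx_gate subcircuit in the part e) deck.
PULL_UP = {
    "A": ("inta", "vdd"),
    "B": ("out", "inta"),
    "C": ("out", "inta"),
    "D": ("out", "inte"),
    "E": ("inte", "vdd"),
}
PULL_DOWN = {
    "A": ("out", "intd"),
    "B": ("out", "intc"),
    "C": ("intc", "intd"),
    "D": ("intd", "gnd"),
    "E": ("intd", "gnd"),
}


def is_euler_trail(network, order):
    """Return True if visiting the edges in `order` forms one unbroken trail."""
    if sorted(order) != sorted(network):
        return False  # every transistor must appear exactly once
    first = network[order[0]]
    # Try both orientations of the first edge, then walk edge by edge.
    for start in first:
        here = first[1] if start == first[0] else first[0]
        ok = True
        for name in order[1:]:
            a, b = network[name]
            if here == a:
                here = b
            elif here == b:
                here = a
            else:
                ok = False  # this edge does not touch the current node
                break
        if ok:
            return True
    return False


def is_consistent(order):
    """Same gate ordering must be a trail in both pull-up and pull-down."""
    return is_euler_trail(PULL_UP, order) and is_euler_trail(PULL_DOWN, order)


print(is_consistent("DEABC"))
print(is_consistent("ABCDE"))
```

Under the assumed connectivity, D E A B C passes in both networks, while A B C D E is a trail in the pull-down network only and would therefore force a diffusion break.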

d) Now resize the gate to match the worst-case pull-up and pull-down resistances using the velocity saturated model. What is the LE from the B input?

Since we're interested in logical effort, the first thing we need to do is figure out how to size the reference inverter, i.e., what ratio between NMOS and PMOS widths provides equal pull-down and pull-up currents. Setting the two currents equal to each other:

$$I_{D,1xN} = W_N C_{ox,n} v_{sat} \frac{(V_{gs}-V_{th})^2}{(V_{gs}-V_{th}) + \varepsilon_{c,n}L} = W_P C_{ox,p} v_{sat} \frac{(V_{gs}-V_{th})^2}{(V_{gs}-V_{th}) + \varepsilon_{c,p}L} = I_{D,1xP}$$

we get that $W_P/W_N \approx 2$, meaning that we can continue using the reference inverter from part a) for this part of the problem as well.

Now, in order to size the gate itself we need to equate the current through a stack of three NMOS transistors to that of the single NMOS device inside of the reference inverter. In order to do this we need to realize that a stack of N transistors is equivalent to a single transistor with N times the length, as shown in the example below:

Therefore, for the pull-down side we get:

$$I_{D,1xN} = 0.2\,\mu\text{m} \cdot C_{ox,n} v_{sat} \frac{(V_{gs}-V_{th})^2}{(V_{gs}-V_{th}) + \varepsilon_{c,n}L} = W_N C_{ox,n} v_{sat} \frac{(V_{gs}-V_{th})^2}{(V_{gs}-V_{th}) + 3\,\varepsilon_{c,n}L} = I_{D,3xN},$$

where $\varepsilon_{c,n}L = 2 v_{sat} L/\mu_n = 0.57$ V and hence $W_N = 0.35\,\mu$m. Repeating the same process for the pull-up stack:

$$I_{D,1xP} = 0.4\,\mu\text{m} \cdot C_{ox,p} v_{sat} \frac{(V_{gs}-V_{th})^2}{(V_{gs}-V_{th}) + \varepsilon_{c,p}L} = W_P C_{ox,p} v_{sat} \frac{(V_{gs}-V_{th})^2}{(V_{gs}-V_{th}) + 2\,\varepsilon_{c,p}L} = I_{D,2xP},$$

where $\varepsilon_{c,p}L = 2 v_{sat} L/\mu_p = 1.14$ V and hence $W_P = 0.63\,\mu$m. The circuit with the new sizing is shown in the figure below (note that the sizes are shown in units of W/L, not absolute microns):

The LE of the gate from the B input is therefore:

$$LE = \frac{(3.5 + 6.3)\,C_g}{(2 + 4)\,C_g} = 1.63.$$

e) Use SPICE to extract the LE from the B input for the gate with the sizing from part d). How does this compare with the result predicted from part d)?

To measure $LE_B$ of this complex gate we should use the method from HW#4, i.e., we build a chain of these gates where the output of the previous gate is connected to the B input of the next gate. We also have to ensure that the other inputs do not influence the output. In order to do this, when B is high there must be exactly one pull-down branch (the one that contains the NMOS controlled by B) that connects the output to ground. Analogously, when B is low there must be exactly one pull-up branch that contains the B input and connects the output to the power supply. Examining the schematic of the gate, we can see that the other inputs need to have the values A=0, D=0, E=1, C=1 (note that this is not a unique combination). With these values, when B=1 the pull-down branch B-C-E is on (note that this is the worst pull-down case), and when B=0 the pull-up branch B-A is on, which represents the worst pull-up case. Simulating in HSPICE, we get LE = 1.87, which is relatively close to what we predicted by hand calculation.
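The hand sizing in part d) can be cross-checked numerically. The Python sketch below evaluates the velocity-saturated current ratio for a stack, using the transistor parameters from the problem statement. The supply voltage of 1.2 V is an assumption taken from `vddval` in the part e) deck, and the helper names `ec_l` and `stack_width` are hypothetical.

```python
VDD, VT = 1.2, 0.25          # VDD assumed from vddval in the part e) deck
L_CM = 100e-7                # 100 nm channel length expressed in cm
VSAT = 1e7                   # cm/s
MU_N, MU_P = 350.0, 175.0    # cm^2/(V s)
COX_N, COX_P = 0.95, 0.65    # uF/cm^2 (only the ratio matters here)


def ec_l(mu):
    """Critical-field voltage eps_c * L = 2 * vsat * L / mu, in volts."""
    return 2.0 * VSAT * L_CM / mu


def stack_width(w_ref_um, mu, n_stack):
    """Width (um) giving an n-high stack the current of one device of width w_ref_um."""
    vgt = VDD - VT
    return w_ref_um * (vgt + n_stack * ec_l(mu)) / (vgt + ec_l(mu))


vgt = VDD - VT
# Reference inverter: W_P / W_N for equal pull-up and pull-down current.
ratio = (COX_N / COX_P) * (vgt + ec_l(MU_P)) / (vgt + ec_l(MU_N))

wn = stack_width(0.2, MU_N, 3)   # three-high NMOS stack
wp = stack_width(0.4, MU_P, 2)   # two-high PMOS stack
le_b = (wn + wp) / (0.2 + 0.4)   # logical effort seen from input B

print(f"WP/WN = {ratio:.2f}, WN = {wn:.3f} um, WP = {wp:.3f} um, LE = {le_b:.2f}")
```

With these assumptions the script gives a width ratio of about 2, $W_N \approx 0.35\,\mu$m, $W_P \approx 0.62\,\mu$m and LE of roughly 1.6, close to the hand-calculated values quoted above.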

HSPICE deck for finding LE:

*** HW7 Problem 1e ***
.LIB '/home/ff/ee141/modes/gpdk090_mos.sp' TT_s1v
.PARAM vddval=1.2
.param fanout=1
.param length=100n
.param wn=0.35u
.param wp=0.63u

* Inverter SUBCKT Definition
.SUBCKT inv vdd gnd in out
Mp out in vdd vdd gpdk090_pmos1v W=0.4u L='length'
Mn out in gnd gnd gpdk090_nmos1v W=0.2u L='length'
.ends

.subckt cplx_gate vdd gnd ina inb inc ind ine out
MpA inta ina vdd vdd gpdk090_pmos1v W='wP' L='length'
MpB out inb inta vdd gpdk090_pmos1v W=wP L=length
MpC out inc inta vdd gpdk090_pmos1v W=wP L=length
MpE inte ine vdd vdd gpdk090_pmos1v W=wP L=length
MpD out ind inte vdd gpdk090_pmos1v W=wP L=length
MnD intd ind gnd gnd gpdk090_nmos1v W=wN L=length
MnE intd ine gnd gnd gpdk090_nmos1v W=wN L=length
MnC intc inc intd gnd gpdk090_nmos1v W=wN L=length
MnA out ina intd gnd gpdk090_nmos1v W=wN L=length
MnB out inb intc gnd gpdk090_nmos1v W=wN L=length
.ends

* Voltage Sources
V1 vdd 0 'vddval'
V2 vinb 0 PWL 0 0V 10p 0V 11p 'vddval' 1011p 'vddval' 1012p 0V
.connect vina 0
.connect vinc vdd
.connect vind 0
.connect vine vdd

* Gate chain
Xgate2_1 vdd 0 vina vinb vinc vind vine out1 cplx_gate M=1
Xgate2_2 vdd 0 vina out1 vinc vind vine out2 cplx_gate M=fanout
Xgate2_3 vdd 0 vina out2 vinc vind vine out3 cplx_gate M='fanout*fanout'
Xgate2_4 vdd 0 vina out3 vinc vind vine out4 cplx_gate M='fanout*fanout*fanout'

* INV chain
Xinv_1 vdd 0 vinb out12 inv M=1
Xinv_2 vdd 0 out12 out22 inv M=fanout
Xinv_3 vdd 0 out22 out32 inv M='fanout*fanout'
Xinv_4 vdd 0 out32 out42 inv M='fanout*fanout*fanout'

* options
.option post=2 nomod
.op

* analysis
.tran 0.1PS 1.5NS sweep fanout 0.001 5.001 1

* gate chain
.measure TRAN tpHL TRIG V(out1) VAL='vddval/2' RISE=1 TARG V(out2) VAL='vddval/2' FALL=1
.MEASURE TRAN tpLH TRIG V(out1) VAL='vddval/2' FALL=1 TARG V(out2) VAL='vddval/2' RISE=1
.MEASURE TRAN tpavg PARAM='(tpHL+tpLH)/2'

* inverter chain
.measure TRAN tpHL2 TRIG V(out12) VAL='vddval/2' RISE=1 TARG V(out22) VAL='vddval/2' FALL=1
.MEASURE TRAN tpLH2 TRIG V(out12) VAL='vddval/2' FALL=1 TARG V(out22) VAL='vddval/2' RISE=1
.MEASURE TRAN tpavg2 PARAM='(tpHL2+tpLH2)/2'
.END

PROBLEM 2: BL Wire for 256x256 SRAM

In this problem we will be looking at a bitline (BL) wire for a 256x256 SRAM array. You should assume that the BL wire is implemented in M2 (which has a minimum width of 0.14µm, $R_w = 0.075\,\Omega$/sq) and is running over active. You can also assume that there are no higher metal layers running above it. Right next to this BL, at the minimum allowed distance, there is another identical BL (from another cell), as shown on the next page. To calculate the capacitance of this wire, you should use tables 4-2 and 4-3 from the book. You should assume that the cell height is 1µm, that the access device is 120nm wide (which is the minimum width in this technology), and that $C_D = 1.6$ fF/µm.

a) What is the total capacitance of the BL due to the BL wires only (i.e., not including the $C_D$ of the access devices)?
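As a numerical sketch of the quantities this problem asks for, the Python snippet below evaluates the bitline capacitance, resistance, RC delay and read energy. The M2 constants ($C_{pp} = 15$ aF/µm², $C_{fringe} = 27$ aF/µm, $C_c = 85$ aF/µm) are the table values used in the solution, $V_{dd} = 1.2$ V is an assumption carried over from the other problems, and all variable names are illustrative.

```python
N_CELLS = 256                 # cells along one bitline (and columns per array)
CELL_H_UM = 1.0               # cell height, um
W_UM = 0.14                   # M2 minimum width, um
C_PP = 15.0                   # parallel-plate cap, aF/um^2
C_FRINGE = 27.0               # fringe cap per edge, aF/um
C_C = 85.0                    # coupling cap to the neighbouring BL, aF/um
R_SQ = 0.075                  # sheet resistance, ohm/sq
C_D = 1.6                     # access-device drain cap, fF/um
W_ACC_UM = 0.12               # access-device width, um
VDD = 1.2                     # assumed supply voltage, V

length_um = N_CELLS * CELL_H_UM

# a) wire-only capacitance (aF -> fF)
c_wire_fF = (C_PP * W_UM + 2 * C_FRINGE + C_C) * length_um / 1000.0

# b) end-to-end wire resistance
r_wire_ohm = R_SQ * length_um / W_UM

# c) add the access-device drains, then distributed-RC delay R*C/2 (fs -> ps)
c_bl_fF = c_wire_fF + N_CELLS * C_D * W_ACC_UM
t_wire_ps = r_wire_ohm * c_bl_fF / 2.0 / 1000.0

# d) one bitline per column is discharged and recharged per read (fJ -> pJ)
e_read_pJ = N_CELLS * c_bl_fF * VDD ** 2 / 1000.0

print(f"Cw = {c_wire_fF:.1f} fF, Rw = {r_wire_ohm:.0f} ohm")
print(f"CBL = {c_bl_fF:.1f} fF, tw = {t_wire_ps:.2f} ps, E = {e_read_pJ:.1f} pJ")
```

Keeping capacitances in fF and resistances in ohms means each product comes out in femtoseconds or femtojoules, so a single division by 1000 converts to ps or pJ.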

From the tables in the book we get the following values for the parameters of the M2 wire: $C_{pp} = 15$ aF/µm², $C_{fringe} = 27$ aF/µm, $C_c = 85$ aF/µm. Including the interwire capacitance, we get:

$$C_w = C_{pp} W L + 2 C_{fringe} L + C_c L = 15\,\tfrac{\text{aF}}{\mu\text{m}^2} \cdot 0.14\,\mu\text{m} \cdot 256\,\mu\text{m} + 2 \cdot 27\,\tfrac{\text{aF}}{\mu\text{m}} \cdot 256\,\mu\text{m} + 85\,\tfrac{\text{aF}}{\mu\text{m}} \cdot 256\,\mu\text{m} = 36.1\ \text{fF}.$$

b) What is the total resistance (from the top of the SRAM to the bottom) of the BL wire?

The resistance of the wire is:

$$R_w = R_{sq} \frac{L}{W} = 0.075\,\Omega/\text{sq} \cdot \frac{256\,\mu\text{m}}{0.14\,\mu\text{m}} = 137\,\Omega.$$

c) Now including the capacitance of the access devices, what is the total RC delay of the BL wire?

The access transistors add their drain capacitance to the existing wire capacitance. Thus, the total BL capacitance is:

$$C_{BL} = C_w + 256\, C_D W_{acc} = 36.1\ \text{fF} + 49.2\ \text{fF} = 85.3\ \text{fF}$$

The wire delay is therefore (assuming a ramp input):

$$t_w = \frac{R_w C_{BL}}{2} = 5.84\ \text{ps}$$

d) How much energy is pulled out of the power supply to charge/discharge the bitlines every time the 256x256 SRAM is read?

Each time the SRAM is read, in each column of cells either BL or BLB (depending on whether the stored value is a one or a zero) gets discharged to ground and then precharged back to $V_{dd}$. The total switching capacitance is therefore $C_{sw} = 256\, C_{BL}$, and the energy pulled out of the power supply by each read operation is:

$$E = C_{sw} V_{dd}^2 = 256\, C_{BL} V_{dd}^2 = 31.4\ \text{pJ}.$$

PROBLEM 3: Wire Delay

In this problem you will calculate the delay of a chain of gates driving a long wire, as shown below. Gate sizes are annotated on the schematic, as are the dimensions of the wire. For the wire parameters you should use: $R = 0.075\,\Omega$/sq, $W = 0.14\,\mu$m, $C_{pp} = 6.5$ aF/µm², and $C_{fringe} = 14$ aF/µm. For the logic gates you should assume that $V_{dd} = 1.2$ V, $C_g = 2$ fF/µm, $C_d = 1.6$ fF/µm, $R_{nmos} = 10$ kΩ/sq, and $R_{pmos} = 20$ kΩ/sq. Finally, the maximum input capacitance is $C_{in} = 8$ fF. Note that you should use a minimum channel length of 100nm for the transistors.

a) Draw the RC model for calculating the delay of the circuit shown above.

The model shown below is for In switching from low to high, but the overall delay will be the same in the other case as well, since the gates are sized for equal pull-up and pull-down resistances. The wire has been modeled with a single π section. This model also assumes that the critical input drives the NMOS transistor at the top of the stack.

b) Calculate the delay using the model from part a).

Given that $C_{in} = 2 C_g W_{N,in} = 8$ fF (remember that all devices in a NAND2 gate have the same width), we get $W_{N,in} = 2\,\mu$m and hence, with $\gamma = C_d/C_g = 0.8$:

$$R_{N,in} = 2 R_{N,sq} \frac{L}{W_{N,in}} = 1\ \text{k}\Omega, \qquad C_{par,in} = \gamma_{NAND2}\, C_{in} = \tfrac{3}{2}\gamma\, C_{in} = 9.6\ \text{fF}.$$

Similarly, for the other two gates:

$$C_1 = 3 C_{in} = 3 C_g W_{N,1} = 24\ \text{fF},$$

so $W_{N,1} = 4\,\mu$m and $W_{P,1} = 8\,\mu$m. Therefore:

$$R_{P,1} = R_{P,sq} \frac{L}{W_{P,1}} = 0.25\ \text{k}\Omega, \qquad C_{par,1} = \gamma C_1 = 0.8\, C_1 = 19.2\ \text{fF}.$$

$$C_2 = 40 C_{in} = 3 C_g W_{N,2} = 320\ \text{fF},$$

so $W_{N,2} = 53.33\,\mu$m and:

$$R_{N,2} = R_{N,sq} \frac{L}{W_{N,2}} = 18.75\,\Omega, \qquad C_{par,2} = \gamma C_2 = 0.8\, C_2 = 256\ \text{fF}.$$

The wire parameters are calculated as follows:

$$R_w = R_{sq,w} \frac{L_w}{W_w} = 535.7\,\Omega, \qquad C_w = C_{pp} W_w L_w + 2 C_{fringe} L_w = 28.9\ \text{fF}.$$

The total delay is just the sum of the delays of the three parts of the chain, $t_d = t_{d,1} + t_{d,2} + t_{d,3}$, where each of these delays can be calculated using Elmore time constants (assuming ramp inputs):

$$t_{d,1} = R_{N,in}\,(C_{par,in} + C_1) = 33.6\ \text{ps}$$
$$t_{d,2} = R_{P,1}\,(C_{par,1} + C_w/2) + (R_{P,1} + R_w)(C_w/2 + C_2) = 271.86\ \text{ps}$$
$$t_{d,3} = R_{N,2}\,(C_{par,2} + C_L) = 19.9\ \text{ps}.$$

Hence, the total delay is:

$$t_d = t_{d,1} + t_{d,2} + t_{d,3} = 325.36\ \text{ps}.$$

c) Without changing the length of the wire, how would you modify the circuit in order to reduce its delay? Draw your improved design and estimate the new delay. (Note that you are allowed to change anything about the circuit except for the length of the wire, $C_{in}$, and $C_L$.)

In order to improve the delay of the circuit, the first thing we should do is figure out which part of the circuit dominates the delay. From part b) it is clear that the delay is dominated by $t_{d,2}$, more specifically by the term:

$$t_{dom} = (R_{P,1} + R_w)\, C_2 = 251.4\ \text{ps}.$$

The best way to reduce the impact of this term is to reduce the value of each of its constituent components, even if that increases the delay coming from the other components. As both resistances are comparable in magnitude, we should increase the size of the inverter that drives the wire (in order to decrease its resistance, $R_{P,1}$) and increase the width of the wire (in order to decrease its resistance, $R_w$). In order to reduce the capacitance that these resistors drive ($C_2$), we should also decrease the size of the last inverter in the chain. There are many possible choices for changing the above parameters, and any conceptually correct approach that explains how the choices were made to reduce the delay will receive full credit.

As an example, let's assume that one chose to increase the size of gate 1 by a factor of 2 while increasing the width of the wire by a factor of 10 (the influence of the wire's width on its capacitance is small, since the parallel-plate portion of the capacitance is much smaller than the fringe portion). Let's also assume that we decrease the size of the last stage by a factor of 3. The new sizing is presented in the figure below:

The new parameters (marked with a prime) are:

$$C'_1 = 2 C_1 = 48\ \text{fF}; \quad W'_{P,1} = 2 W_{P,1} = 16\,\mu\text{m}; \quad R'_{P,1} = R_{P,sq} \frac{L}{W'_{P,1}} = 125\,\Omega; \quad C'_{par,1} = \gamma C'_1 = 38.4\ \text{fF}$$

$$C'_2 = \tfrac{1}{3} C_2 = 106.67\ \text{fF}; \quad W'_{N,2} = \tfrac{1}{3} W_{N,2} = 17.8\,\mu\text{m}; \quad R'_{N,2} = R_{N,sq} \frac{L}{W'_{N,2}} = 56.4\,\Omega; \quad C'_{par,2} = \gamma C'_2 = 85.3\ \text{fF}$$

$$W'_w = 10\, W_w = 1.4\,\mu\text{m}; \quad R'_w = R_{sq,w} \frac{L_w}{W'_w} = 53.6\,\Omega; \quad C'_w = C_{pp} W'_w L_w + 2 C_{fringe} L_w = 37\ \text{fF}.$$

And the new delay is:

$$t'_{d,1} = R_{N,in}\,(C_{par,in} + C'_1) = 57.6\ \text{ps}$$
$$t'_{d,2} = R'_{P,1}\,(C'_{par,1} + C'_w/2) + (R'_{P,1} + R'_w)(C'_w/2 + C'_2) = 29.5\ \text{ps}$$

$$t'_{d,3} = R'_{N,2}\,(C'_{par,2} + C_L) = 50\ \text{ps}$$
$$t'_d = t'_{d,1} + t'_{d,2} + t'_{d,3} = 137.1\ \text{ps}$$

Note that the optimal way to size the circuit would be to take the partial derivative of the total delay with respect to each of the sizing parameters ($W'_{P,1}$, $W'_w$, $W'_{N,2}$) and set them all equal to zero. This could be done by hand or using MATLAB, resulting in an optimal sizing that makes gate 1's size $4.5\,C_{in}$, gate 2's size $20\,C_{in}$, and the wire 3.44µm wide. The optimal delay is then:

$$t_{d,min} = 124.3\ \text{ps}$$
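The first two Elmore terms of this problem can be reproduced with a short Python sketch, which makes it easy to try other sizings. It assumes a wire length of 1000 µm, inferred from $R_w = 535.7\,\Omega$ at 0.14 µm width since the length itself appears only on the schematic. The third stage is left out because $C_L$ is also given only on the schematic. The function names `wire` and `stage_delays` are hypothetical.

```python
GAMMA = 0.8            # Cd / Cg = 1.6 / 2
R_SQ_W = 0.075         # wire sheet resistance, ohm/sq
C_PP = 6.5             # aF/um^2
C_FRINGE = 14.0        # aF/um
L_WIRE_UM = 1000.0     # assumed: inferred from Rw = 535.7 ohm at W = 0.14 um
R_N_IN = 1000.0        # NAND2 pull-down resistance, ohm
C_PAR_IN = 9.6         # NAND2 output parasitic, fF


def wire(width_um, length_um=L_WIRE_UM):
    """Return (resistance in ohm, capacitance in fF) for a wire of the given width."""
    r = R_SQ_W * length_um / width_um
    c = (C_PP * width_um * length_um + 2 * C_FRINGE * length_um) / 1000.0
    return r, c


def stage_delays(c1_fF, r_p1_ohm, c2_fF, w_wire_um):
    """Elmore delays (ps) of the NAND2 stage and the inverter-plus-wire stage."""
    r_w, c_w = wire(w_wire_um)
    c_par1 = GAMMA * c1_fF
    t1 = R_N_IN * (C_PAR_IN + c1_fF)
    t2 = r_p1_ohm * (c_par1 + c_w / 2) + (r_p1_ohm + r_w) * (c_w / 2 + c2_fF)
    return t1 / 1000.0, t2 / 1000.0   # ohm * fF = fs, so divide by 1000 for ps


base = stage_delays(c1_fF=24.0, r_p1_ohm=250.0, c2_fF=320.0, w_wire_um=0.14)
improved = stage_delays(c1_fF=48.0, r_p1_ohm=125.0, c2_fF=320.0 / 3, w_wire_um=1.4)

print("baseline  t_d1, t_d2 (ps):", [round(t, 1) for t in base])
print("improved  t_d1, t_d2 (ps):", [round(t, 1) for t in improved])
```

Under these assumptions the baseline terms come out near 33.6 ps and 271 ps, and the resized design near 57.6 ps and 29.5 ps, in line with the hand calculation: the first stage slows down, but the wire stage improves by almost an order of magnitude.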