Digital Integrated Circuits A Design Perspective

Digital Integrated Circuits Design Perspective Jan M. Rabaey nantha Chandrakasan orivoje Nikolić Designing Combinational Logic Circuits November 2002. 1

Combinational vs. Sequential Logic In Combinational Logic Circuit Out In Combinational Logic Circuit Out State Combinational Sequential Output = f(in) Output = f(in, Previous In) 2

Static CMOS Circuit t every point in time (except during the switching transients) each gate output is connected to either V DD or V ss via a low-resistive path. The outputs of the gates assume at all times the value of the oolean function, implemented by the circuit (ignoring, once again, the transient effects during switching periods). This is in contrast to the dynamic circuit class, which relies on temporary storage of signal values on the capacitance of high impedance circuit nodes. 3

CMOS Circuit Styles Static complementary CMOS - except during switching, output connected to either V DD or GND via a lowresistance path high noise margins - full rail to rail swing - V OH and V OL are at V DD and GND, respectively low output impedance, high input impedance no steady state path between V DD and GND (no static power consumption) delay a function of load capacitance and transistor resistance comparable rise and fall times (under the appropriate transistor sizing conditions) Dynamic CMOS - relies on temporary storage of signal values on the capacitance of high-impedance circuit nodes simpler, faster gates increased sensitivity to noise 4

Static Complementary CMOS Pull-up network (PUN) and pull-down network (PDN) In 1 In 2 In N In 1 In 2 In N V DD PUN PDN PMOS transistors only pull-up: make a connection from V DD to F when F(In 1,In 2, In N ) = 1 F(In 1,In 2, In N ) pull-down: make a connection from F to GND when F(In 1,In 2, In N ) = 0 NMOS transistors only PUN and PDN are dual logic networks CSE477 L06 Static CMOS Logic.5 Irwin&Vijay, PSU, 2003

NMOS Transistors in Series/Parallel Connection Transistors can be thought as a switch controlled by its gate signal NMOS switch closes when switch control input is high X Y Y = X if and X Y Y = X if OR NMOS Transistors pass a strong 0 but a weak 1 6

PMOS Transistors in Series/Parallel Connection PMOS switch closes when switch control input is low X Y Y = X if ND = + X Y Y = X if OR = PMOS Transistors pass a strong 1 but a weak 0 7

Threshold Drops PUN V DD S V DD D V DD D 0 V DD V GS S 0 V DD - V Tn C L C L PDN V DD 0 V DD V Tp V DD D C L V GS S C L S D 8

Construction of PDN NMOS devices in series implement a NND function NMOS devices in parallel implement a NOR function + CSE477 L06 Static CMOS Logic.9 Irwin&Vijay, PSU, 2003

Dual PUN and PDN PUN and PDN are dual networks DeMorgan s theorems + = [!( + ) =!! or!( ) =! &!] = + [!( ) =! +! or!( & ) =!!] a parallel connection of transistors in the PUN corresponds to a series connection of the PDN Complementary gate is naturally inverting (NND, NOR, OI, OI) Number of transistors for an N-input logic gate is 2N CSE477 L06 Static CMOS Logic.10 Irwin&Vijay, PSU, 2003

Complementary CMOS Logic Style 11

Example Gate: NND 12

Example Gate: NOR 13

Complex CMOS Gate C D D C OUT = D + ( + C) 14

Constructing a Complex Gate V DD V DD C D C F SN1 D F C SN4 SN2 SN3 D F (a) pull-down network (b) Deriving the pull-up network hierarchically by identifying sub-nets D C (c) complete gate 15

RC Delay Model Use equivalent circuits for MOS transistors Ideal switch + capacitance and ON resistance Unit nmos has resistance R, capacitance C Unit pmos has resistance 2R, capacitance C Capacitance proportional to width Resistance inversely proportional to width

Switch Delay Model R eq R p R p R p R p R n C L R n C L R p C int NND2 R n Cint INV R n R n C L NOR2 17

Input Pattern Effects on Delay R p R n R n R p C L C int Delay is dependent on the pattern of inputs Low to high transition both inputs go low delay is 0.69 R p /2 C L one input goes low delay is 0.69 R p C L High to low transition both inputs go high delay is 0.69 2R n C L 18

dding devices in series slows down the circuit, and devices must be made wider to avoid performance penalty. When sizing the transistors in a gate with multiple inputs, we should pick the combination of inputs that triggers the worst case conditions. For the NND gate to have the same pull-down delay (tphl) with an inverter, the NMOS devices in the PDN stack must be made twice as wide (2.5 times, if velocity saturation is effective). PMOS devices can remain unchanged. (Extra capacitance introduced by widening is ignored here, which is not a good assumption) 19

Voltage [V] Delay Dependence on Input Patterns 3 2.5 2 1.5 1 0.5 0-0.5 ==1 0 =1, =1 0 =1 0, =1 0 100 200 300 400 time [ps] Input Data Pattern Delay (psec) ==0 1 67 =1, =0 1 64 = 0 1, =1 61 ==1 0 45 =1, =1 0 80 = 1 0, =1 81 NMOS = 0.5 m/0.25 m PMOS = 0.75 m/0.25 m C L = 100 ff 20

Transistor Sizing R p R p 2 2 4 R p 2 R n C L 4 R p C int 2 R n Cint 1 R n R n 1 C L 21

Transistor Sizing a Complex CMOS Gate 4 3 C 8 8 6 6 D 4 6 OUT = D + ( + C) D 1 2 2 C 2 22

Fan-In Considerations C D C 3 C L Distributed RC model (Elmore delay) C D C 2 C 1 t phl = 0.69 R eqn (C 1 +2C 2 +3C 3 +4C L ) Propagation delay deteriorates rapidly as a function of fan-in quadratically in the worst case. 23

t p (psec) t p as a Function of Fan-In 1250 1000 750 500 250 0 fan-in t phl t p t t plh pl H 2 4 6 8 10 12 14 16 quadratic linear Gates with a fan-in greater than 4 should be avoided. 24

t p (psec) t p as a Function of Fan-Out t p NOR2 t p NND2 t p INV ll gates have the same drive current. 2 4 6 8 10 12 14 16 eff. fan-out Slope is a function of driving strength 25

t p as a Function of Fan-In and Fan-Out Fan-in: quadratic due to increasing resistance and capacitance Fan-out: each additional fan-out gate adds two gate capacitances to C L t p = a 1 FI + a 2 FI 2 + a 3 FO 26

Fast Complex Gates: Design Technique 1 Transistor sizing as long as fan-out capacitance dominates Progressive sizing In N MN C L Distributed RC line In 3 M3 C 3 M1 > M2 > M3 > > MN (the fet closest to the output is the smallest) In 2 In 1 M2 M1 C 2 C 1 Can reduce delay by more than 20%; decreasing gains as technology shrinks 27

Fast Complex Gates: Design Technique 2 Transistor ordering critical path critical path In 3 1 In 2 1 In 1 0 1 M3 C L M3 In 1 M2 C 2 charged 2 M2 C2 In M1 charged 3 1 M1 C 1 C 1 charged 0 1 In 1 C L charged discharged discharged delay determined by time to discharge C L, C 1 and C 2 delay determined by time to discharge C L 28

Fast Complex Gates: Design Technique 3 lternative logic structures F = CDEFGH 29

Fast Complex Gates: Design Technique 4 Isolating fan-in from fan-out using buffer insertion C L C L 30

Logical Effort t p = t p0 (1 + C ext / C g ) = t p0 (1 + f/ ) t p = t p0 (p + g f/ ) t p0 is the intrinsic delay of an inverter f is the effective fan-out (C ext /C g ) also called the electrical effort p is the ratio of the instrinsic (unloaded) delay of the complex gate and a simple inverter (a function of the gate topology and layout style) g is the logical effort of the complex gate - how much more input capacitance a gate presents to deliver the same output current as an inverter (how much worse it is at producing output current than an inverter) Normalize everything to an inverter: ginv=1, pinv= 1 P = n for n input NND and NOR gates ssume γ = 1.

Logical Effort Inverter has the smallest logical effort and intrinsic delay of all static CMOS gates Logical effort of a gate presents the ratio of its input capacitance to the inverter capacitance when sized to deliver the same current Logical effort increases with the gate complexity 32

Delay in a Logic Gate Gate delay: d = h + p effort delay intrinsic delay Effort delay: h = g f logical effort effective fanout = C out /C in Logical effort is a function of topology, independent of sizing Effective fanout (electrical effort) is a function of load/gate size 33

Logical Effort PMOS/NMOS ratio of 2, the input capacitance of a minimum sized symmetrical inverter equals 3 times the gate capacitance of a minimum sized NMOS (called C unit ) V DD V DD V DD 2 2 2 4 1 F 2 F 4 F 2 1 1 Inverter 2-input NND 2-input NOR g = 1 (3 C unit ) g = 4/3 (4 C unit ) g = 5/3 (5 C unit ) 34

Intrinsic Delay Term, p The more involved the structure of the complex gate, the higher the intrinsic delay compared to an inverter Gate Type p Inverter 1 n-input NND n n-input NOR n n-way mux 2n XOR, XNOR n 2 n-1 Ignoring second order effects such as internal node capacitances CSE477 L11 Fast Logic.35 Irwin&Vijay, PSU, 2003

Logical Effort From Sutherland, Sproull 36

Normalized delay (d) Logical Effort of Gates g = p = d = t pnnd g = p = d = tpinv F(Fan-in) 1 2 3 4 5 6 7 Fan-out (h) 37

Normalized delay (d) Logical Effort of Gates g = 4/3 p = 2 d = (4/3)f+2 t pnnd tpinv g = 1 p = 1 d = f+1 d = p + g f F(Fan-in) 1 2 3 4 5 6 7 Fan-out (h) 38

Normalized Delay Delay as a Function of Fan-Out 5 4 3 2 1 Inverter: g = 1; p = 1 2-input NND: g = 4/3; p = 2 Effort Delay Intrinsic Delay The slope of the line is the logical effort of the gate (d = p + fg) The y-axis intercept is the intrinsic delay Can adjust the delay by adjusting the effective fan-out (by sizing) or by choosing a gate with a different logical effort 1 2 3 4 5 Gate effort: h = fg Fanout f 39

Multistage Networks Total delay of a path: t p = N j=1 N t p,j = t p0 j=1 p j + f jg j γ Using a a similar procedure with the sizing of the inverter chain (finding N-1 partial derivatives and equating to zero), we find that each stage should bear the same gate effort: f 1 g 1 = f 2 g 2 = f N g N Here we have some definitions: Path effective fan-out (Path electrical effort): F = C out /C in Path logical effort: G = g 1 g 2 g N 40

ranching effort of a logical gate on a path: b = C on path + C off path C on path Path branching effort: = 1 N b i The path electrical effort can now be related to the electrical and branching efforts: F = N 1 fi b i = f i 41

Finally the total path effort H can be defined: H = N 1 h i = N 1 g, f i = GF From here on, the analysis can be carried out as in the case of inverter chain. The gate effort that minimizes the path delay is: h = N H nd the minimum delay through the path is: D = t p0 N j=1 p j + N N H γ 42

We assume that a unit-size gate has a driving capability equal to a minimum-size inverter. This means that its input capacitance is g times larger than that of the reference inverter (Cref). With s1the sizing factor of the first gate in the chain, the input capacitance of the chain can be written as: Cg1 = g 1 s 1 Cref Including the branching effort, the input capacitance of gate 2 will be f 1 /b 1 times larger: g 2 s 2 C ref = f 1 b 1 g 1 s 1 C ref For gate i in the chain: s i = g 1s 1 g i i 1 j=1 f j b j 43

Example: Optimize Path 1 a b c 5 g = 1 f = a g = 5/3 f = b/a g = 5/3 f = c/b g = 1 f = 5/c Effective fanout, F = G = H = h = a = b = 46

Example: Optimize Path 1 a b c 5 g = 1 f = a g = 5/3 f = b/a g = 5/3 f = c/b g = 1 f = 5/c Effective fanout, F = 5 G = 1x(5/3)x(5/3)x1 = 25/9 H = GF (no branching)=125/9 = 13.9 h = 4 H =1.93 since f 1 g 1 = f 2 g 2 = f 3 g 3 = f 4 g 4 = h: f 1 = 1.93, f 2 = 1.93 (3/5) = 1.16, f 3 = 1.16, f 4 = 1.93 a = s 2 = f 1 g 1 /g 2 = 1.16 b = s 3 = f 1f 2 g 1 g 3 = 1.34 c = s 4 = f 1 f 2 f 3 g 1 /g 4 = 2.60 47

Example 8-input ND G=3.33 G=3.33 G=2.96 49

Method of Logical Effort Compute the path effort: F = GH Find the best number of stages N ~ log 4 F Compute the stage effort f = F 1/N Sketch the path with this number of stages Work either from either end, find sizes: C in = C out *g/f Reference: Sutherland, Sproull, Harris, Logical Effort, Morgan-Kaufmann 1999. 50

Summary Sutherland, Sproull Harris 51