Digital EE141 Integrated Circuits 2nd Combinational Circuits

Digital Integrated Circuits Designing i Combinational Logic Circuits 1

Combinational vs. Sequential Logic 2

Static CMOS Circuit t every point in time (except during the switching transients) each gate output is connected to either V DD or V ss via a low-resistive path. The outputs of the gates assume at all times the value of the oolean function, implemented by the circuit (ignoring, once again, the transient effects during switching periods). This is in contrast to the dynamic circuit class, which relies on temporary storage of signal values on the capacitance of high impedance circuit nodes. 3

Static Complementary CMOS V DD In1 In2 InN In1 In2 InN PUN PDN PMOS only F(In1,In2, InN), NMOS only PUN (pull-up network) and PDN (pull-down network) are dual logic networks 4

NMOS Transistors in Series/Parallel l Connection Transistors can be thought as a switch controlled by its gate signal NMOS switch closes when switch control input is high X Y Y = X if and X Y Y = X if OR NMOS Transistors pass a strong 0buta weak 1 5

PMOS Transistors in Series/Parallel l Connection PMOS switch closes when switch control input is low X Y Y = X if ND = + X Y Y = X if OR = PMOS Transistors pass a strong 1 but a weak 0 6

Threshold Drops PUN V DD S V DD D V DD D 0 V S DD 0 V DD -V Tn V GS C L C L PDN V DD 0 V DD V Tp VDD D C L V GS S C L S D 7

Complementary CMOS Logic Style 8

Example Gate: NND 9

Example Gate: NOR 10

Complex CMOS Gate C D D C OUT = D + ( + C) 11

Constructing a Complex Gate 12

Cell Design Standard Cells General purpose logic Can be synthesized Same height, varying width Datapath Cells For regular, structured designs (arithmetic) Includes some wiring in the cell Fixed height and width 13

Standard Cell Layout Methodology 1980s Routing channel V DD signals GND 14

Standard Cell Layout Methodology 1990s Mirrored Cell No Routing channels V DD V DD M2 M3 GND Mirrored Cell GND 15

Standard Cells N Well V DD Cell height ht 12 metal tracks Metal track is approx. 3λ + 3λ Pitch = repetitive distance between objects Cell height is 12 pitch 2λ In Out Cell boundary GND Rails ~10λ 16

Standard Cells With minimal diffusion routing V DD With silicided diffusion V DD V DD In M 2 Out In Out In Out M 1 GND GND 17

Standard Cells V DD 2-input NND gate V DD Out GND 18

Stick Diagrams Contains no dimensions Represents relative positions of transistors Inverter V DD NND2 V DD Out Out GND In GND 19

Stick Diagrams j C Logic Graph X C PUN X = C (+ ) X i V DD C i j C GND PDN 20

Two Versions of C ( + ) C C V DD V DD X X GND GND 21

Consistent Euler Path X C X i V DD j GND C 22

OI22 Logic Graph C X PUN D D C X = (+) (C+D) X V DD C D C D GND PDN 23

Example: x = ab+cd x x b c b c x V DD x V DD a d a d GND GND (a) Logic graphs for (ab+cd) (b) Euler Paths {a b c d} V DD x GND a b c d (c) stick diagram for ordering {a b c d} 24

Multi-Fingered Transistors One finger Two fingers (folded) Less diffusion capacitance 25

Properties of Complementary CMOS Gates Snapshot High noise margins: V OH and V OL are at V DD and GND, respectively. No static power consumption: There never exists a direct path between V DD and V SS (GND)insteady-state state mode. Comparable rise and fall times: (under appropriate sizing conditions) 26

CMOS Properties Full rail-to-rail swing; high noise margins Logic levels not dependent upon the relative device sizes; ratioless lways a path to Vdd or Gnd in steady state; low output impedance Extremely high input resistance; nearly zero steady-state state input current No direct path steady state between power and ground; no static power dissipation Propagation delay function of load capacitance and resistance of transistors 27

Switch Delay Model R eq R p R n R p R p C L R n C L R p R p C int R R n R n C n L C NND2 R n Cint INV NOR2 28

Input Pattern Effects on Delay Delay is dependent on R R the pattern of inputs p p Low to high transition R n R n C L C int both inputs go low delay is 0.69 R p /2 C L one input goes low delay is 0.69 R p C L High to low transition both inputs go high h delay is 0.69 2R n C L 29

Delay Dependence on Input Patterns Vo oltage [V V] 3 2.5 2 15 1.5 1 0.5 0-0.5 ==1 0 =1, =1 0 =1 0, =1 0 100 200 300 400 time [ps] Input Data Pattern Delay (psec) ==0 1 67 =1 1, =0 1 64 = 0 1, =1 61 ==1 0 45 =1, =1 0 80 = 1 0, =1 81 NMOS = 0.5μm/0.25 μm PMOS = 0.75μm/0.25 μm C L = 100 ff 30

Transistor Sizing R p R p R p 2 2 4 2 R n C L 4 R p C int 2 R n Cint 1 R n R n 1 C L 31

Transistor Sizing a Complex CMOS Gate 8 6 4 3 C 8 6 D 4 6 OUT = D + ( + C) 2 D 1 2 C 2 32

Fan-In Considerations C D C L Distributed RC model (Elmore delay) C 3 C C 2 D t phl = 0.69 R eqn (C 1 +2C 2 +3C 3 +4C L ) C 1 Propagation delay deteriorates rapidly as a function of fan-in quadratically in the worst case. 33

t p as a Function of Fan-In t p (pse ec) 1250 1000 750 500 250 0 t ph t p L t pl H 2 4 6 8 10 12 14 16 fan-in quadratic linear Gates with a fan-in greater than 4 should be avoided. 34

t p as a Function of Fan-Out t p (pse ec) t p NOR2 t p NND2 t p INV 2 4 6 8 10 12 14 16 eff. fan-out ll gates have the same drive current. Slope is a function of driving strength 35

t p as a Function of Fan-In and Fan-Out Fan-in: quadratic due to increasing resistance and capacitance Fan-out: each additional fan-out gate adds two gate capacitances to C L t p = a 1 FI + a 2 FI 2 + a 3 FO 36

Fast Complex Gates: Design Technique 1 Transistor sizing i as long as fan-out capacitance dominates Progressive sizing In N MN C L Distributed RC line In 3 M3 C 3 M1 > M2 > M3 > > MN (the fet closest to the output is the smallest) In 2 In 1 M2 M1 C 2 C 1 Can reduce delay by more than 20%; decreasing gains as technology shrinks 37

Fast Complex Gates: Design Technique 2 Transistor ordering critical path critical path charged 0 1 In 1 3 M3 C In 1 M3 C charged L L 1 In 1 In 2 M2 C 2 charged 2 M2 C 2 discharged In In 1 1 M1 C charged 3 M1 C discharged 1 1 0 1 delay determined by time to discharge C L,C 1 and C 2 delay determined by time to discharge C L 38

Fast Complex Gates: Design Technique 3 lternative logic structures F = CDEFGH 39

Fast Complex Gates: Design Technique 4 Isolating fan-in from fan-out using buffer insertion C L C L 40

Fast Complex Gates: Design Technique 5 Reducing the voltage swing t phl =069(3/4(C 0.69 (C L V DD )/ I DSTn ) = 0.69 (3/4 (C L V swing )/ I DSTn ) linear reduction in delay also reduces power consumption ut the following gate is much slower! Or requires use of sense amplifiers on the receiving end to restore the signal level (memory design) 41

Sizing Logic Paths for Speed Frequently, input capacitance of a logic path is constrained Logic also has to drive some capacitance Example: LU load in an Intel s microprocessor is 0.5pF How do we size the LU datapath to achieve maximum speed? We have already solved this for the inverter chain can we generalize it for any type of logic? 42

uffer Example In Out 1 2 N C L Delay = N ( pi + gi fi ) i= 1 (in units of τ inv ) For given N: C i+1 /C i = C i /C i-1 To find N: C i+1 /C i ~ 4 How to generalize this to any logic path? 43

Logical Effort CL Delay = k R unitcunit 1 + C γ in = τ p + g f ( ) p intrinsic delay (3kR unit C unit γ) - gate parameter f(w) g logical effort (kr unitc unit) ) gate parameter f(w) f effective fanout Normalize everything to an inverter: g inv =1, p inv = 1 Divide everything by τ inv (everything is measured in unit delays τ inv) ) ssume γ = 1. 44

Delay in a Logic Gate Gate delay: d = h + p effort delay intrinsic delay Effort delay: h = g f logical effort effective fanout = C out /C in Logical effort is a function of topology, independent of sizing Effective fanout (electrical effort) is a function of load/gate size 45

Logical Effort Inverter has the smallest logical effort and intrinsic delay of all static CMOS gates Logical effort of a gate presents the ratio of its input capacitance to the inverter capacitance when sized to deliver the same current Logical effort increases with the gate complexity 46

Logical Effort 47

Logical Effort of Gates zed delay (d) Normali g = p = d = t pnnd g = p = d = t pinv F(Fan-in) 1 2 3 4 5 6 7 Fan-out (h) 48

Logical Effort of Gates zed delay (d) Normali g = 4/3 p = 2 d = (4/3)h+2 t pnnd t pinv g = 1 p = 1 d = h+1 F(Fan-in) 1 2 3 4 5 6 7 Fan-out (h) 49

Logical Effort of Gates 50

dd ranching Effort ranching effort: b = C on path + C C off on path path 51

Multistage Networks Delay = N ( pi + gi fi ) i= 1 Stage effort: h i = g i f i Path electrical effort: F = C out /C in Path logical effort: G = g 1 g 2 gg N ranching effort: = b 1 b 2 b N Path effort: H = GF Path delay D = Σd i = Σp i + Σh i 52

Optimum Effort per Stage When each stage bears the same effort: h N = H h = N H Stage efforts: g 1 f 1 = g 2 f 2 = = g N f N Effective fanout of each stage: f = h Minimum path delay Dˆ = i g i ( ) 1/ N g f + p = NH P i i i + 53

Optimal Number of Stages For a given load, and given input capacitance of the first gate Find optimal number of stages and optimal sizing 1/ N D = NH + Np inv N D 1/ N 1/ ln( 1/ ( N ) N = H H + H + p = 0 inv Substitute best stage effort h = H 1/ Nˆ 54

Logical Effort From Sutherland, Sproull 55

Example: Optimize Path 56

Example: Optimize Path 57

Example: Optimize Path 1 b c a 5 g 1 = 1 g 2 = 5/3 g 3 = 5/3 g 4 = 1 Effective fanout, H = 5 G = 25/9 F = 125/9 = 13.9 f = 1.93 a = 1.93 b = fa/g 2 = 2.23 c = fb/g 3 = 5g 4 /f = 259 2.59 58

Example 8-input ND 59

Method of Logical Effort Compute the path effort: F = GH Find the best number of stages N ~ log 4 F Compute the stage effort f = F 1/N Sketch the path with this number of stages Work either from either end, find sizes: C in = C out *g/f Reference: Sutherland, Sproull, Harris, Logical Effort, Morgan-Kaufmann 1999. 60

Summary Sutherland, Sproull Harris 61

Ratioed Logic 62

Ratioed Logic V DD V DD V DD Resistive s Depletion PMOS Load Load V T < 0 Load R L In 1 In 1 In 2 PDN In 2 In 3 In 3 F PDN F In 1 In 2 In 3 V SS PDN F V SS V SS V SS (a) resistive load (b) depletion load NMOS (c) pseudo-nmos Goal: to reduce the number of devices over complementary CMOS 63

Ratioed Logic Resistive V DD N transistors + Load V OH = V DD Load OH DD R L F V OL = R PN R PN + R L In 1 In 2 In 3 PDN ssymetrical response Static power consumption V SS t pl = 0.69 R L C L 64

ctive Loads V DD V DD Depletion Load V T < 0 T PMOS Load F V SS F In 1 In 2 In 3 PDN In 1 In 2 In 3 PDN V SS V SS depletion load NMOS pseudo-nmos 65

Pseudo-NMOS V DD C D F C L V OH = V DD (similar to complementary CMOS) V 2 k V V OL k n ( DD Tn )V ------------- OL = ------ p ( V V 2 2 DD Tp ) 2 V OL = ( V DD V T ) 1 1 k ------ p (assuming that V k T = V Tn = V Tp ) n SMLLER RE & LOD UT STTIC POWER DISSIPTION!!! 66

Pseudo-NMOS VTC 3.0 2.5 2.0 W/L p = 4 p t [V] V ou t 1.5 1.0 W/L p =2 0.5 W/L p = 0.5 W/L p = 0.25 W/L p = 1 0.0 0.0 0.5 1.0 1.5 2.0 2.5 V in [V] 67

Improved Loads V DD Enable M1 M2 M1 >> M2 F C D C L daptive Load 68

Improved Loads (2) V DD V DD M1 M2 Out Out PDN1 PDN2 V SS V SS Differential Cascode Voltage Switch Logic (DCVSL) 69

DCVSL Example Out Out XOR-NXOR gate 70

DCVSL Transient Response 2.5 ol ta ge [V] V 1.5 0.5,, -0.5 0 0.2 0.4 0.6 0.8 1.0 Time [ns] 71

Pass-Transistor Logic 72

Pass-Transistor Logic Inputs Switch Network Out Out N transistors No static consumption 73

Example: ND Gate 74

NMOS-Only Only Logic V DD In x 0.5μm/0.25μm 3.0 1.5μm/ 0.25μm 2.0 Out Voltage [V] 0.5μm/ 0.25μm 1.0 Out x In 0.0 0 0.5 1 1.5 2 Time [ns] 75

NMOS-only Switch C = 2.5V C = 2.5 V = 2.5 V = 25V 2.5 M 2 M n C L M 1 V does not pull up to 2.5V, but 2.5V - V TN Threshold h voltage loss causes static power consumption NMOS has higher threshold than PMOS (body effect) 76

NMOS Only Logic: Level Restoring Transistor V DD Level Restorer M r V DD M 2 M n X Out M 1 dvantage: Full Swing Restorer adds capacitance, takes away pull down current at X Ratio problem 77

Restorer Sizing [V] Voltage 3.0 2.0 1.0 W/L r =1.75/0.25 W/L r =1.50/0.25 W/L r =1.0/0.25 W/L r =1.25/0.25 Upper limit on restorer size Pass-transistor pull-down can have several transistors in stack 0.0 0 100 200 300 400 500 Time [ps] 78

Solution 2: Single Transistor Pass Gate with V T =0 V DD 0V 2.5V V DD V DD 0V Out 2.5V WTCH OUT FOR LEKGE CURRENTS 79

Complementary Pass Transistor Logic Pass-Transistor Network F (a) Inverse Pass-Transistor Network F F= F=+ F= ΒÝ (b) F= F=+ F= ΒÝ ND/NND OR/NOR EXOR/NEXOR 80

Solution 3: Transmission Gate C C C C C = 2.5 V = 2.5 V C L C = 0 V 81

Resistance of Transmission Gate 30 R n 2. 5 V Rn Resis stance, ohm ms 20 10 R p R n R p 25V 2.5 0V R p V ou t 0 0.0 1.0 2.0 V ou t, V 82

Pass-Transistor ased Multiplexer V DD S S S V DD M 2 S F M 1 S GND In 1 S S In 2 83

Transmission Gate XOR M2 M1 F M3/M4 84

Delay in Transmission Gate Networks 2.5 2.5 2.5 2.5 In V 1 V i-1 V i V i+1 V n-1 V n C 0 0 C 0 C C 0 C (a) In R eq R V eq R eq R 1 V i V i+1 V eq n-1 V n C C C C C m (b) In R eq R eq R eq R eq R eq R eq C C C C C C C C (c) 85

Delay Optimization 86

Transmission Gate Full dder P V DD V DD P C i C i P S Sum Generation V DD P P P V DD C o Carry Generation C i C i Setup C i P Similar delays for sum and carry 87

Dynamic Logic 88

Dynamic CMOS In static circuits at every point in time (except when switching) the output is connected to either GND or V DD via a low resistance path. fan-in of n requires 2n (n N-type + n P-type) devices Dynamic circuits i rely on the temporary storage of signal values on the capacitance of high impedance nodes. requires on n + 2 (n+1 N-type + 1 P-type) transistors 89

Dynamic Gate Clk M p Clk M p Out Out In 1 In 2 In 3 PDN C L C Clk M e Clk M e Two phase operation Precharge (CLK = 0) Evaluate (CLK = 1) 90

Dynamic Gate Clk In 1 In 2 In 3 Clk M p PDN M e C L Out Two phase operation Precharge (Clk = 0) Evaluate (Clk = 1) off Clk M p on 1 Out Clk M e off on (()+C) C 91

Conditions on Output Once the output of a dynamic gate is discharged, it cannot be charged again until the next precharge operation. Inputs to the gate can make at most one transition during evaluation. Output can be in the high impedance state during and after evaluation (PDN off), state is stored on C L 92

Properties of Dynamic Gates Logic function is implemented by the PDN only number of transistors is N + 2 (versus 2N for static complementary CMOS) Full swing outputs (V OL = GND and V OH = V DD ) Non-ratioed - sizing of the devices does not affect the logic levels Faster switching speeds reduced load capacitance due to lower input capacitance (C in ) reduced d load capacitance due to smaller output t loading (Cout) no I sc, so all the current provided by PDN goes into discharging C L 93

Properties of Dynamic Gates Overall power dissipation usually higher than static CMOS no static current path ever exists between V DD and GND (including P sc ) no glitching g higher transition probabilities extra load on Clk PDN starts to work as soon as the input signals exceed V Tn, so V M M, V IH and V IL equal to V Tn low noise margin (NM L ) Needs a precharge/evaluate clock 94

Issues in Dynamic Design 1: Charge Leakage CLK Clk M p Out Clk C L V Evaluate Out M e Precharge Leakage sources Dominant component is subthreshold current 95

Solution to Charge Leakage Keeper Clk M p M kp C L Out Clk M e Same approach as level restorer for pass-transistor logic 96

Issues in Dynamic Design 2: Charge Sharing Clk M p C L Out Charge stored originally on C L is redistributed (shared) over C L and C leading to reduced robustness =0 Clk M e C C 97

Charge Sharing Example Clk Out C L =50fF C a =15fF! C b =15fF C c =15fF C C C d =10fF Clk 98

Charge Sharing V DD case 1) if ΔVV out < V Tn Clk M p M a X Out C L C V = L DD C V t + C V V V L out ( ) a ( DD Tn ( X ) ) or ΔV out = V out ( t) V DD = C a -------V ( C DD V Tn ( V X )) L = 0 M b C a case 2) if ΔV out > V Tn Clk M e C b C a ΔV out V -------------------- = DD C a + C L 99

Solution to Charge Redistribution Clk M p M kp Clk Out Clk M e Precharge internal nodes using a clock-driven transistor (at the cost of increased area and power) 100

Issues in Dynamic Design 3: ackgate Coupling Clk M p Out1 =1 Out2 =0 C L1 C L2 =0 =0 In Clk M e Dynamic NND Static NND 101

ackgate Coupling Effect 3 2 1 Clk Out1 0 In Out2-1 0 2 4 6 Time, ns 102

Issues in Dynamic Design 4: Clock Feedthrough h Clk Coupling between Out and M p Clk input of the precharge Out device due to the gate to C L drain capacitance. So voltage of Out can rise above V DD. The fast rising Clk M e (and falling edges) of the clock couple to Out. 103

Clock Feedthrough Clk Out 2.5 Clock feedthrough In 1 In 2 1.5 In 3 In 4 0.5 In & Clk Out Clk -0.5 0 0.5 1 Time, ns Clock feedthrough 104

Other Effects Capacitive coupling Substrate coupling Minority charge injection Supply noise (ground bounce) 105

Cascading Dynamic Gates V Clk Clk M p M p In Out1 Out2 Clk In Clk M e Clk M e Out1 V Tn Out2 ΔV t Only 0 1 transitions allowed at inputs! 106

Domino Logic Clk M p 1 1 1 0 Out1 Clk 0 0 0 1 M p In 1 In 2 PDN In 4 PDN In In 3 5 Clk M e Clk M e M kp Out2 107

Why Domino? Clk In i In j Clk PDN In i PDN In i PDN In i PDN In j In j In j Like falling dominos! 108

Properties of Domino Logic Only non-inverting logic can be implemented Very high speed static inverter can be skewed, only L-H transition Input capacitance reduced smaller logical effort 109

Designing with Domino Logic V DD V DD V DD Clk M p Out1 Clk M p M r Out2 In 1 In 2 PDN In 4 PDN In 3 Can be eliminated! Clk M e Clk M e Inputs = 0 during gprecharge 110

Footless Domino 111

Differential (Dual Rail) Domino Out = off on Clk M kp Clk M p M kp 1 0 1 0!! M p Out = Clk M e Solves the problem of non-inverting logic 112

np-cmos Clk M p 1 1 1 0 Out1 Clk In In 1 4 PUN In 2 PDN In 5 0 0 In 3 Out2 0 1 (to PDN) Clk Clk M p M e M e Only 0 1 transitions allowed at inputs of PDN Only 1 0 transitions allowed at inputs of PUN 113

NOR Logic Clk In 1 In 2 M p PDN 1 1 1 0 Clk Out1 In 4 In M e PUN In 5 In 3 Clk Clk M M e M p 0 0 0 1 Out2 (to PDN) to other PDN s to other PUN s WRNING: Very sensitive to noise! 114