Interconnect (2): Buffering Techniques. Logical Effort
Lecture 14, 18-322, Fall 2002
Textbook: Sections 4.2.1, 8.2.3

A few announcements
- M1 is almost over.
  - The check-off is due today (by 9:30 PM).
  - Students in Sections A and B who checked off 1-2 days late will not get any points off. (Yes, we're trying to make it fair for everybody.)
  - The report is due tomorrow (Friday) by 4:00 PM, no exceptions.
  - M2 has already been posted. You can start working on it!
- Midterm 1 is nearly here.
  - Date: 10/15/02, time: 3:00-4:20 PM, place: in class (DH2210)
  - Material required: Lec1 through Lec12 (including Lec12)
  - Closed books, closed notes! (Calculators OK)
  - Review session: Monday 10/14, 4:30-6:00 PM in DH2210
- Final Exam scheduled for 12/16/02 (1:00-4:00 PM)
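Overview
- Electrical wire models
  - Lumped RC model
  - Distributed rc line
- Designing gates for performance
  - Progressive sizing
  - Input re-ordering
- Driving large capacitances
  - Buffering techniques
  - Logical effort

Delay Definitions
[Figure: input and output waveforms versus time t, marking the 50% crossings that define t_pHL and t_pLH and the 10% and 90% levels that define the fall time t_f and the rise time t_r.]
- Propagation delay: $t_p = (t_{pHL} + t_{pLH})/2$, measured between the 50% points of the input and output waveforms
- Signal slopes: t_f and t_r, measured between the 10% and 90% points
(Source: Irwin & Vijay, PSU, CSE 477, 2002)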
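The Lumped Model
[Figure: a driver with output resistance R_driver charging a single lumped capacitance C_lumped.]
$V_{out}(t) = V_{in}\,(1 - e^{-t/\tau})$, where $\tau = R_{driver} C_{lumped}$
Note:
- 0% to 50% of V_DD: $t_{LH} = 0.69\,RC$
- 10% to 90% of V_DD: $t_r = 2.2\,RC$

Lumped π Network
Assume the wire is modeled by N equal-length segments, each with $R = R_{line}/N$ and $C = C_{line}/N$.
[Figure: V_in driving a ladder of N series resistors R, each followed by a capacitor C to ground.]
- $\tau_N = \frac{N(N+1)}{2}\,RC$, so $\tau \to \frac{1}{2} R_{line} C_{line}$ as N grows
- Equivalent π model: R_line with $\frac{1}{2} C_{line}$ at each end, giving $\tau = R_{line}\,(C_{line}/2)$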
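The limit quoted for the N-segment model can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and it assumes the time constant is estimated by summing, for each capacitor, the total resistance between it and the source (which is what the N(N+1)/2 factor counts).

```python
def ladder_time_constant(r_line, c_line, n):
    """Time constant of an n-segment RC ladder with R = r_line/n, C = c_line/n."""
    r = r_line / n
    c = c_line / n
    # The capacitor at node k (1-based) is charged through k series resistors.
    return sum(k * r * c for k in range(1, n + 1))


def ladder_closed_form(r_line, c_line, n):
    """Same quantity from tau_N = N(N+1)/2 * R * C."""
    return (n * (n + 1) / 2) * (r_line / n) * (c_line / n)


for n in (1, 2, 10, 1000):
    # With r_line = c_line = 1 the value approaches 0.5 as n grows.
    print(n, ladder_time_constant(1.0, 1.0, n))
```

With a single segment the model reduces to the plain lumped RC product; adding segments moves the estimate toward half of R_line C_line.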
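Distributed rc lines
[Figure: distributed rc line; the delay τ plotted against wire length L grows as $L^2$.]

RC-Models in Spice
- Time to reach the 50% point: $t = \ln(2)\,\tau = 0.69\,\tau$
- 10% to 90% rise time: $t = \ln(9)\,\tau = 2.2\,\tau$ (reaching the 90% point from 0 takes $\ln(10)\,\tau \approx 2.3\,\tau$)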
Step Response Points
- Example: consider an Al1 wire 10 cm long and 1 µm wide.
  - Using a lumped C only model with a source resistance (R_Driver) of 10 kΩ and a total lumped capacitance (C_lumped) of 11 pF:
    $t_{50\%} = 0.69 \times 10\,k\Omega \times 11\,pF = 76\,ns$
    $t_{10\%-90\%} = 2.2 \times 10\,k\Omega \times 11\,pF = 242\,ns$
  - Using a distributed RC model with c = 110 aF/µm and r = 0.075 Ω/µm:
    $t_{50\%} = 0.38 \times (0.075\,\Omega/\mu m) \times (110\,aF/\mu m) \times (10^5\,\mu m)^2 = 31.35\,ns$
    $t_{10\%-90\%} = 0.9 \times (0.075\,\Omega/\mu m) \times (110\,aF/\mu m) \times (10^5\,\mu m)^2 = 74.25\,ns$

Putting It All Together
[Figure: V_in driving, through R_Driver, a wire of length L with per-unit-length resistance r_w and capacitance c_w.]
- Total propagation delay (driver and wire):
  $\tau_D = R_{Driver} C_w + (R_w C_w)/2 = R_{Driver} C_w + 0.5\,r_w c_w L^2$
  and $t_p = 0.69\,R_{Driver} C_w + 0.38\,R_w C_w$, where $R_w = r_w L$ and $C_w = c_w L$
- The delay introduced by wire resistance becomes dominant when $(R_w C_w)/2 \ge R_{Driver} C_w$, that is, when $L \ge 2 R_{Driver}/r_w$.
  - For R_Driver = 1 kΩ driving a 1 µm wide Al1 wire, L_crit is 2.67 cm.
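The arithmetic in this example is easy to reproduce. The following is a small sketch with hypothetical function names; the coefficients 0.69 and 2.2 (lumped) and 0.38 and 0.9 (distributed) are the ones quoted in the example, and lengths are kept in µm to match the per-unit-length values.

```python
def lumped_delays(r_driver, c_lumped):
    """Return (50% delay, 10%-90% rise time) for a lumped RC, in seconds."""
    tau = r_driver * c_lumped
    return 0.69 * tau, 2.2 * tau


def distributed_delays(r_per_um, c_per_um, length_um):
    """Return (50% delay, 10%-90% rise time) for a distributed rc line, in seconds."""
    rc_total = r_per_um * c_per_um * length_um ** 2
    return 0.38 * rc_total, 0.9 * rc_total


def critical_length_um(r_driver, r_per_um):
    """Length at which (R_w C_w)/2 equals R_Driver C_w, i.e. L = 2 R_Driver / r_w."""
    return 2.0 * r_driver / r_per_um


# Values from the example: 10 cm Al1 wire, 1 um wide.
print(lumped_delays(10e3, 11e-12))
print(distributed_delays(0.075, 110e-18, 1e5))
print(critical_length_um(1e3, 0.075) / 1e4, "cm")  # 1 cm = 1e4 um
```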
Design Rules of Thumb
- rc delays should be considered when $t_{pRC} > t_{pgate}$ of the driving gate.
  - $L_{crit} > \sqrt{t_{pgate}/(0.38\,rc)}$
  - The actual L_crit depends on the size of the driving gate and the interconnect material.
- rc delays should be considered only when the rise (fall) time at the line input is smaller than RC, the rise (fall) time of the line.
  - $t_{rise} < RC$ (R and C are the total resistance and capacitance of the wire)
  - When this is not met, the signal changes more slowly than the propagation delay of the wire, so a lumped C model suffices.
Design for Performance
- Reduce C_L
  - keep the drain diffusion as small as possible
  - interconnect capacitance
  - fanout
- Increase the W/L ratio of the transistor
  - the most effective performance optimization tool for the designer
- Increase V_DD
  - can trade off energy for performance
  - increasing V_DD above a certain level yields only minimal improvement
  - reliability concerns enforce a firm upper bound on V_DD
- Slope engineering
  - keep signal rise and fall times smaller than or equal to the gate propagation delays, and of approximately equal values
  - good for performance
  - good for power consumption

NMOS/PMOS Ratio
- So far we have sized the PMOS and NMOS so that the R_eq's match:
  - symmetrical VTC
  - equal high-to-low and low-to-high propagation delays
- If speed is the main concern:
  - Use minimum channel length (smallest possible L for all FETs).
  - Finding the width W that minimizes delay is more difficult.
  - Reduce the width of the PMOS device: widening the PMOS degrades t_pHL because of larger parasitic capacitances.
  - Widening both PMOS and NMOS by a factor S reduces R_eq by an identical factor ($R_{eq} = R_{ref}/S$), but raises the intrinsic capacitance by the same factor ($C_{int} = S\,C_{iref}$).
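The effect of the sizing factor S can be illustrated by combining the two relations above with the lumped 0.69 RC delay. This is a sketch under stated assumptions: C_ext (a fixed external load) and the function name are introduced here for illustration and do not appear on the slides.

```python
def inverter_delay(s, r_ref, c_iref, c_ext):
    """0.69 * R_eq * (C_int + C_ext) with R_eq = R_ref/S and C_int = S*C_iref.

    c_ext is an assumed fixed external load (wire plus fanout).
    """
    r_eq = r_ref / s
    c_int = s * c_iref
    return 0.69 * r_eq * (c_int + c_ext)


# Normalized units: R_ref = C_iref = 1, external load four times C_iref.
for s in (1, 2, 4, 8, 64):
    print(s, inverter_delay(s, 1.0, 1.0, 4.0))
```

In this model the delay falls as S grows but never drops below 0.69 R_ref C_iref, because the gate's own capacitance scales up with it. That is one way to read the remark that finding the best width is harder than choosing the channel length.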
Fast Complex Gates - Design Technique 1
- Transistor sizing: as long as fan-out capacitance dominates
- Progressive sizing: M1 > M2 > M3 > ... > MN
[Figure: series chain of N transistors, from M1 at the bottom to MN at Out, with inputs In_1 ... In_N and internal node capacitances C_1, C_2, C_3; the chain behaves like a distributed RC line.]
Can reduce delay by more than 25%!

Long N-Chains: Progressive Sizing
[Figure: the same chain, next to a plot of output voltage (up to V_DD) versus time with curves labeled 1, 2, 3 and time markers T_1 (0.38RC), T_2 (0.69RC), and T_d.]
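Progressive Sizing (cont'd)
[Figure: the chain M1 ... MN redrawn as an RC ladder, with series resistances R_1, R_2, R_3, ..., R_X, node capacitances C_1, C_2, C_3, and C_eq at Out.]
$T_d = R_1 C_1 + (R_1 + R_2)\,C_2 + \dots + (R_1 + R_2 + \dots + R_X)\,C_{eq}$
$R_1 = \alpha\,(L_1/W_1)$, $R_2 = \alpha\,(L_2/W_2)$

Fast Complex Gates: Design Technique 2
- Input re-ordering
  - when not all inputs arrive at the same time
[Figure: two versions of a three-transistor chain (M1 at the bottom, M3 at the output). In both, In_1 is the critical input switching 0→1 while In_2 = In_3 = 1.]
- Critical input In_1 on M1 (bottom of the chain): C_L, C_2, and C_1 all start charged, so the delay is determined by the time to discharge C_L, C_1, and C_2.
- Critical input In_1 on M3 (next to the output): C_2 and C_1 are already discharged and only C_L is charged, so the delay is determined by the time to discharge C_L.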
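The T_d expression can be evaluated directly to compare sizing choices. The sketch below makes simplifying assumptions that are not on the slides: node capacitances are held fixed regardless of transistor width, all devices share the same channel length, and the total width is a fixed budget. Under those assumptions, minimizing the sum gives widths proportional to the square root of the capacitance downstream of each device; the helper names are hypothetical.

```python
import math


def chain_delay(resistances, capacitances):
    """T_d = R1*C1 + (R1+R2)*C2 + ...; index 0 is M1, the last entry is the output node."""
    delay = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r          # resistance between this node and the bottom of the chain
        delay += upstream_r * c
    return delay


def resistances_from_widths(widths, alpha=1.0, length=1.0):
    """R_i = alpha * (L / W_i), with one assumed channel length for all devices."""
    return [alpha * length / w for w in widths]


def progressive_widths(capacitances, total_width):
    """Widths proportional to sqrt(capacitance at and above each device).

    R_j multiplies every capacitance from node j up to the output, so devices
    low in the chain (M1) carry the most weight and come out widest.
    """
    downstream = [sum(capacitances[j:]) for j in range(len(capacitances))]
    weights = [math.sqrt(d) for d in downstream]
    scale = total_width / sum(weights)
    return [scale * w for w in weights]


caps = [1.0, 1.0, 1.0]                      # C_1, C_2, C_eq in normalized units
uniform = [2.0, 2.0, 2.0]
graded = progressive_widths(caps, sum(uniform))
print(chain_delay(resistances_from_widths(uniform), caps))
print(graded, chain_delay(resistances_from_widths(graded), caps))
```

The result reproduces the ordering M1 > M2 > M3. Because real diffusion capacitance grows with width, the gain reported by this idealized model should not be read as a prediction for an actual layout.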
Reducing Wire Delay
[Figure: a wire of length L compared with the same wire split into two sections of length L/2 with an inverter between them.]
- Unbuffered wire: $rcL^2/2$
- With an inverter at the midpoint: $t_{inv} + 2\,(rc/2)\,(L/2)^2$
As long as t_inv is smaller than half the wire delay, the total delay may be reduced by inserting an inverter!

Example: two 1 mm sections with $r = 20\,\Omega/\mu m$ and $c = 4 \times 10^{-4}\,pF/\mu m$
- $t_1 = 0.69 \times 4 \times 10^{-15} \times L^2$ (delay of a 1 mm section, with L in µm and t_1 in seconds)
- $t_p = 2.8 \times 10^{-15}\,(1000)^2 + t_{inv} + 2.8 \times 10^{-15}\,(1000)^2 = 5.6\,ns + t_{inv}$, which beats the 11.2 ns of the unbuffered 2 mm wire whenever $t_{inv} < 5.6\,ns$
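The example can be reproduced with a few lines of code. This is a sketch with hypothetical function names, using the same 0.69 (rc/2) L^2 section delay as the example. The slides round 0.69 × 4 to 2.8, so the unrounded results come out slightly lower (5.52 ns and 11.04 ns).

```python
def section_delay(r_per_um, c_per_um, length_um):
    """0.69 * (rc/2) * L^2 for one wire section, in seconds."""
    return 0.69 * (r_per_um * c_per_um / 2.0) * length_um ** 2


def split_delay(r_per_um, c_per_um, length_um, t_inv):
    """Delay of the wire cut into two equal sections with one inverter in between."""
    return 2.0 * section_delay(r_per_um, c_per_um, length_um / 2.0) + t_inv


def break_even_t_inv(r_per_um, c_per_um, length_um):
    """Largest inverter delay for which splitting the wire still helps."""
    unbuffered = section_delay(r_per_um, c_per_um, length_um)
    return unbuffered - split_delay(r_per_um, c_per_um, length_um, 0.0)


R_PER_UM = 20.0      # ohm/um, from the example
C_PER_UM = 4e-16     # F/um, i.e. 4e-4 pF/um
print(section_delay(R_PER_UM, C_PER_UM, 2000.0))       # unbuffered 2 mm wire
print(split_delay(R_PER_UM, C_PER_UM, 2000.0, 0.0))    # two 1 mm sections, ideal inverter
print(break_even_t_inv(R_PER_UM, C_PER_UM, 2000.0))
```

The break-even value is exactly half of the unbuffered delay, which matches the rule stated on the slide.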
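Driving Large Capacitances
[Figure: inverter inv1 (P1, N1) driving inverter inv2 (P2, N2) through a wire modeled by R_line and C_line; V_in sees the input capacitance C_i.]
$\alpha_{opt} = \sqrt{\varepsilon\,(1 + C_w/C_n)}$
If $C_w = 0$ and $\varepsilon = 2.5$, then $\alpha \approx 1.6$.

Single Inverter Buffer
[Figure: a unit inverter (transistor labels α and 1, input capacitance C_i) followed by a buffer inverter u times larger (labels αu and u), driving a load $C_L = x\,C_i$.]
Q: What value of u minimizes the propagation delay (inverter + buffer)?
$u_{opt} = \sqrt{x}$, $t_{p,opt} = 2\,t_{p0}\sqrt{x}$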
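One way to see where the single-buffer answer comes from is to assume that each stage's delay is t_p0 times its fan-out, ignoring self-loading. The first inverter then contributes u t_p0 and the buffer (x/u) t_p0. That delay model and the function names below are assumptions made for this sketch; the minimum of u + x/u falls at u = sqrt(x), giving 2 t_p0 sqrt(x).

```python
import math


def two_stage_delay(u, x, t_p0=1.0):
    """Assumed model: inverter drives a load u times larger, buffer drives x/u times larger."""
    return t_p0 * (u + x / u)


def best_buffer_size(x, candidates):
    """Pick the candidate u with the smallest two-stage delay."""
    return min(candidates, key=lambda u: two_stage_delay(u, x))


x = 100.0
candidates = [k / 10.0 for k in range(10, 1001)]   # u from 1.0 to 100.0 in steps of 0.1
u_best = best_buffer_size(x, candidates)
print(u_best, math.sqrt(x), two_stage_delay(u_best, x))
```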
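Using Cascaded Buffers
- If C_L is given:
  - How should the inverters be sized?
  - How many stages are needed to minimize the delay?
[Figure: chain of inverters from In to Out with sizes 1, u, u^2, ..., and node capacitances C_i, C_1, C_2, ..., ending in the load C_L.]
$u_{opt} = e$, $t_{p,opt} = e\,t_{p0}\,\ln(C_L/C_i)$

[Figure: t_p as a function of u and x. The vertical axis shows u/ln(u) (0 to 60), the horizontal axis shows u (1 to 7), with one curve each for x = 10, 100, 1000, and 10,000.]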
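The optimum taper can be checked numerically. The sketch below assumes N stages that each see a fan-out of u, so that u^N = x = C_L/C_i, and a per-stage delay of u t_p0 with self-loading ignored. The total is then t_p0 · u · ln(x)/ln(u), which is the u/ln(u) curve in the plot. The function name and the real-valued stage count are simplifications introduced for illustration.

```python
import math


def cascaded_delay(u, x, t_p0=1.0):
    """Total delay of a tapered chain: N = ln(x)/ln(u) stages, each costing u * t_p0."""
    n_stages = math.log(x) / math.log(u)   # treated as a real number in this sketch
    return n_stages * u * t_p0


x = 1000.0
for u in (2.0, math.e, 4.0, 7.0):
    print(round(u, 3), cascaded_delay(u, x))
```

The curve is shallow around its minimum, so a stage ratio somewhat above e costs little extra delay in this model.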
Logical Effort
- A way of thinking about delay in MOS circuits. It seeks to determine quickly a circuit's maximum possible speed and how to achieve it.
- Book: "Logical Effort: Designing Fast CMOS Circuits" by I. Sutherland, B. Sproull, and D. Harris
Definitions
- The logical effort of a logic gate is defined as the ratio of its input capacitance to that of an inverter that delivers equal output current.
- Use the inverter as the reference gate.

Logical Effort (cont'd)
- Types of effort
  - logical ($G = \prod g_i$)
  - electrical ($H = C_{out}/C_{in}$)
  - branching ($B = \prod b_i$)
- Path effort
  - $F = GBH$
Optimization
- N-stage logic network
- Idea: the path delay is least when each stage in the path bears the same stage effort.
  - $f = g_i h_i = F^{1/N}$
- Main result: minimum delay achievable along a path
  - $D = N F^{1/N} + P$ (where $P = \sum p_i$)
  - $C_{in,i} = (1/f)\,g_i\,C_{out,i}$ (used for transistor sizing!)
- The method of logical effort achieves an approximate optimum!

Example
[Figure: three 2-input NAND gates in series from A to B. The first gate has input capacitance C, the second and third have input capacitances y and z, and the output B drives a load C.]
- $G = (4/3)^3 = 2.37$, $B = 1$, $H = C/C = 1$, so $F = GBH = 2.37$
- $D = 3\,(2.37)^{1/3} + 3\,(2 p_{inv}) = 10$ delay units (minimum delay)
- $f = (2.37)^{1/3} = 4/3$ (this is the stage effort)
- $z = C\,(4/3)/(4/3) = C$
- $y = z\,(4/3)/(4/3) = C$ (all 3 gates should have the same input capacitance)

Logical effort g by number of inputs
Gate    1 input   2 inputs   3 inputs
INV     1
NAND              4/3        5/3
NOR               5/3        7/3
XOR               4          12

Parasitic delay P (with p_inv = 1)
Gate      P
Inv       p_inv
n-NAND    n p_inv
n-NOR     n p_inv
XOR       4 p_inv
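The worked example can be reproduced step by step in code. This is a minimal sketch with a hypothetical function name; it assumes a single path without branching (B = 1, as in the example) and sizes the gates backward from the load using C_in,i = g_i C_out,i / f.

```python
import math


def optimize_path(logical_efforts, parasitics, c_in, c_out):
    """Return (F, f, D, input_caps) for an unbranched path (B = 1 assumed)."""
    n = len(logical_efforts)
    G = math.prod(logical_efforts)      # path logical effort
    H = c_out / c_in                    # path electrical effort
    F = G * H                           # path effort with B = 1
    f = F ** (1.0 / n)                  # equal stage effort
    D = n * f + sum(parasitics)         # minimum path delay in delay units

    # Size backward from the load: each gate's input capacitance is g * C_out / f.
    input_caps = []
    load = c_out
    for g in reversed(logical_efforts):
        load = g * load / f
        input_caps.append(load)
    input_caps.reverse()
    return F, f, D, input_caps


# Three 2-input NANDs: g = 4/3 and p = 2*p_inv = 2 each, with C_in = C_out = 1 (in units of C).
F, f, D, caps = optimize_path([4 / 3] * 3, [2.0] * 3, c_in=1.0, c_out=1.0)
print(F, f, D, caps)
```

The first entry of the returned capacitance list should come back equal to the given input capacitance, which is a useful consistency check on the computed stage effort.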