EE241 - Spring 2000
Advanced Digital Integrated Circuits
Lecture 3: Circuit Optimization for Speed

Announcements
Tu 2/8/00 class will be pre-taped on Friday, 2/4, 4-5:30, 203 McLaughlin
Class notes are available on the Web in the morning before the class
ISSCC preview seminars:
» Fri 1/28, 2-5pm, 531 Cory (Hogan Rm): 5 speakers from Stanford
» Tue 2/1, 3:30-5:30pm, 531 Cory: 4 speakers from Philips
» Thu 2/3, 4-5, 531 Cory: 2 speakers from UC Davis
Static CMOS Delay
Intrinsic logic delay
» From internal R, C = constant
RC delay of the load
» Fan-out
» Wire RC
For a given gate, delay is a function of:
» Load
» Input rise time(s)
» Intrinsic delay

Delay model application: intrinsic delay
$2C_D + C_G$
Close to Shoji
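Delay dependence on inputs
[Figure: two-input gate schematic (pull-up resistances $R_P$, pull-down resistances $R_N$, internal capacitance $C_{int}$, load $C_L$, output F) and output voltage versus time (0 to 400 ps) for three input transitions: A = B = 1→0; A = 1, B = 1→0; A = 1→0, B = 1]

Optimizing the intrinsic RC delay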
Progressive Sizing
Uniform versus progressive sizing
[Figure: uniform and non-uniform sizing of a stack of N devices; size labels 1, $k$, $k^2$, ..., $k^{n-1}$]
Sizing models

Case study

Example: Progressive Scaling of NMOS Devices in DOMINO CMOS

What if External Capacitance is Dominant?
Divide and Conquer!
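Divide and Conquer
[Figure: inverter chain from In to Out with stage sizes 1, $u$, $u^2$, ..., $u^{N-1}$ and node capacitances $C_i$, $C_1$, $C_2$, ..., $C_L$]
$u_{opt} = e$

$t_p$ as a function of $u$ and $x$
[Figure: $u/\ln(u)$ versus $u$ (1.0 to 7.0) for x = 10, 100, 1000, and 10,000]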
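The $u_{opt} = e$ result can be checked numerically. The sketch below assumes the usual first-order model for a tapered inverter chain, in which the normalized delay is $t_p/t_{p0} = \ln(x) \cdot u/\ln(u)$ with $x = C_L/C_i$ and $N = \ln(x)/\ln(u)$ stages. The model form, the function names, and the grid search are illustrative assumptions, not taken from the slide.

```python
import math

def chain_delay(u, x):
    """Normalized delay of a tapered chain under the assumed model:
    N = ln(x)/ln(u) stages, each with delay proportional to u."""
    return math.log(x) * u / math.log(u)

def best_taper(x, lo=1.1, hi=7.0, steps=5900):
    """Grid-search the tapering factor u that minimizes chain_delay."""
    candidates = [lo + i * (hi - lo) / steps for i in range(steps + 1)]
    return min(candidates, key=lambda u: chain_delay(u, x))

if __name__ == "__main__":
    for x in (10, 100, 1000, 10_000):
        u = best_taper(x)
        print(f"x={x:>6}: u_opt ~ {u:.3f}, normalized delay ~ {chain_delay(u, x):.2f}")
```

Under this model the minimizing u does not depend on x, which is why a single optimum tapering factor appears on the slide; x only scales the curve.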
Adding Intrinsic Load

Optimum Tapering Factor for Realistic Load
[Figure: $u_{opt}$ (2 to 5) versus α = gγ (0 to 3)]
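The slide does not reproduce the equation behind the curve. A minimal sketch follows, assuming the common self-loading condition $u(\ln u - 1) = \alpha$ (equivalently $u = e^{1 + \alpha/u}$), where α is taken to be the ratio of a stage's intrinsic output capacitance to its input capacitance. That condition and the name `optimum_taper` are assumptions made for illustration; they do reproduce the plotted range, with $u_{opt}$ rising from about e at α = 0 to just under 5 at α = 3.

```python
import math

def optimum_taper(alpha, tol=1e-10):
    """Solve u*(ln(u) - 1) = alpha for u >= e by bisection (assumed model)."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")

    def residual(u):
        return u * (math.log(u) - 1.0) - alpha

    lo, hi = math.e, math.e
    while residual(hi) < 0.0:      # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:           # residual is increasing for u > 1
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for alpha in (0, 1, 2, 3):
        print(f"alpha={alpha}: u_opt ~ {optimum_taper(alpha):.2f}")
```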
Tapering Factor for Realistic Load

Effect of Rise/Fall Times

Effect of Rise/Fall Times

Negative Delay?
Impact of Rise and Fall Times

Impact of Termination
[Figure: buffer chain driving $C_L$; labels m, f, N, $k_o$, $k_1$]
Required Number of Buffer Stages

What about power consumption (and area)?

Delay versus Area and Power

Optimizing $t_p$ versus $t_r/t_f$

Problem: Ground Bounce

Sizing in Presence of Noise
RC-line delay
0.35 µm process, minimum-width wires
r = 0.12 Ω/µm
c = 0.16 fF/µm
τ = 0.019 fs/µm^2
[Figure: waveforms for a 10 mm RC line, tapped from 0 mm to 10 mm: $t_p$ = 750 ps]

Delay and Rise Times for a 10 mm line
Simulation results
Analytically:
$t_d = 0.4\, d^2 RC$
$t_r = d^2 RC$
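The analytical expressions can be evaluated directly with the per-unit-length values quoted for the 0.35 µm process. The sketch below assumes R and C in the formulas are the per-micron r and c listed on the slide; the function names are illustrative.

```python
R_PER_UM = 0.12        # wire resistance, ohm per micron (from the slide)
C_PER_UM = 0.16e-15    # wire capacitance, farad per micron (0.16 fF/um)

def rc_line_delay_ps(length_um):
    """Distributed RC line delay, t_d = 0.4 * d^2 * r * c, in picoseconds."""
    return 0.4 * length_um ** 2 * R_PER_UM * C_PER_UM * 1e12

def rc_line_rise_ps(length_um):
    """Distributed RC line rise time, t_r = d^2 * r * c, in picoseconds."""
    return length_um ** 2 * R_PER_UM * C_PER_UM * 1e12

if __name__ == "__main__":
    d = 10_000  # 10 mm expressed in microns
    print(f"t_d = {rc_line_delay_ps(d):.0f} ps, t_r = {rc_line_rise_ps(d):.0f} ps")
```

For the 10 mm line this gives a delay of about 768 ps, in line with the roughly 750 ps simulated value, and shows the quadratic growth with length: halving the line cuts the delay by a factor of four.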
Repeaters
Optimum repetition rate
$t_d = (l/l_s) \cdot 2 t_b$
For 0.35 µm tech.:
$l_s$ = 3.5 mm
ν = 17.5 mm/ns

Increasing Wire Width and Spacing
Reduces impact of fringing capacitance and capacitance to neighboring wires
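With repeaters the delay grows linearly with length instead of quadratically. The sketch below assumes the slide's expression reads $t_d = (l/l_s) \cdot 2 t_b$ (each segment of length $l_s$ costs one buffer delay plus an equal wire delay) and that the quoted velocity is $\nu = l_s/(2 t_b)$. Under that reading the buffer delay is not given on the slide and is derived here; the function names are illustrative.

```python
L_S_MM = 3.5                 # optimum repeater spacing from the slide, mm
VELOCITY_MM_PER_NS = 17.5    # signal velocity from the slide, mm/ns

def implied_buffer_delay_ns():
    """Buffer delay t_b implied by v = l_s / (2 * t_b) (assumed relation)."""
    return L_S_MM / (2.0 * VELOCITY_MM_PER_NS)

def repeated_line_delay_ns(length_mm):
    """Delay of a repeated line, t_d = (l / l_s) * 2 * t_b (assumed reading)."""
    return (length_mm / L_S_MM) * 2.0 * implied_buffer_delay_ns()

if __name__ == "__main__":
    print(f"implied t_b = {implied_buffer_delay_ns() * 1e3:.0f} ps")
    print(f"10 mm repeated line: {repeated_line_delay_ns(10.0) * 1e3:.0f} ps")
```

Under these assumptions a 10 mm repeated line takes roughly 570 ps, compared with about 750 ps for the unrepeated 10 mm RC line; the gap widens quickly for longer wires.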
Overdrive of Low-Swing RC Lines
300 mV signal
1.5 V overdrive
750 ps -> 350 ps

Bipolar Overdrive Signaling
Issues:
» Crosstalk
» Delay still quadratic with respect to length
Impact of wire-delay on high-performance design
Critical Path Delay Model (Phil Fischer - Sematech)
[Figure: critical path of logic stages 1, 2, ..., N, each with a local net of length $L_{net}$ and delay $t_{d,stage}$, followed by a global wire of length $L_{Edge}$]
$t_{d,global} = t_{d,inv} + t_{d,Edge}$
$t_{d,cycle} = N \cdot t_{d,stage} + t_{d,global}$
$t_d = R_o (C_{out} + f.o. \cdot C_{g,in}) + R_o C_w + 0.4 \left[ (C_w R_w)^{1.6} + t_{of}^{1.6} \right]^{1/1.6} + 0.7 R_w C_{g,in}$

Transistor and Interconnect Parameters
Parameter                       250nm  180nm  150nm  130nm  100nm   70nm   50nm
Wint M1-2 local (nm)              320    230    195    170    130     95     70
Hint M1-2 local (nm)              576    420    390    360    312    257    210
Pitchw M1-2 local (nm)            640    460    390    340    260    190    140
tins M1-2 local (nm)              650    500    450    360    320    270    210
Wint M semi-global (nm)           500
Hint M semi-global (nm)           900
Pitchw M semi-global (nm)        1500   1000    900    800    600    400    350
tins M semi-global (nm)           900
Wint M global (nm)               2000
Hint M global (nm)               2000
Pitchw M global (nm)             4000   4000   4000   4000   4000   4000   4000
tins M global (nm)               1400
Chip area (cm^2)                  3.0    3.6    3.9    4.3    5.2    6.2    7.5
Vdd (V)                           2.5    1.8    1.8    1.5    1.2    0.9    0.6
Metal Levels                        6      6      6      7      8      9      9
Metal resistivity (µohm-cm)       3.3    2.2    2.2    2.2    2.2    1.8    1.8
Relative dielectric constant      3.9    2.7    2.5    2.0    1.5    1.4    1.4
Transistors:
Equivalent tox (nm)              4.50   3.50   2.80   2.50   1.80   1.40   1.00
In nominal (µA/µm)                600    600    600    600    600    600    600
Ip nominal (µA/µm)                280    280    280    280    280    280    280
NAND Gate Stage Delay (with scaled local interconnect, fi = fo = 3)
[Figure: delay (ps, 0 to 140) versus technology minimum feature (250 to 40 nm), with curves for stage delay, gate delay, and wire delay. Source: P. Fischer]

For future Cu-low k High-Performance Microprocessor
[Figure: clock frequency (MHz, 0 to 3200) versus technology minimum feature (250 to 40 nm), with curves for Cu-low k, Al-low k, Cu-SiO2, and Al-SiO2. Logic depth = 12 gates]
Clock-Cycle Model Summary
Parameter                           250nm     180nm     150nm     130nm     100nm      70nm      50nm
L net Logic wire CP (µm)              175       139       126       103        78        59        48
Cint M1 (pF/cm)                      1.92      1.21      1.16      0.97      0.77      0.76      0.81
Cint 2µm M global (pF/cm)            1.86      1.29      1.19      0.95      0.72      0.67      0.67
Rint M1 (Kohm/cm)                     1.8       2.5       2.9       3.6       5.4       7.4      12.3
Rint 2µm M global (ohm/cm)             83        55        55        55        55        45        45
tof (ps/cm)                            80        64        57        57        49        48        48
L edge (cm)                          1.73      1.90      1.96      2.07      2.28      2.49      2.74
td-global (ps)                        377       235       210       190       176       168       160
td-gates Logic D=12 (ps)              830       512       428       370       274       193       139
td-cycle Model (ps)                  1207       746       637       560       450       361       299
tcycle time (ps)                     1341       829       708       622       500       401       332
Ratio (td-cycle/t cycle time)        0.90      0.90      0.90      0.90      0.90      0.90      0.90
High Perf. Clock, µP (MHz)            746      1206      1412      1607      2001      2497      3015
Rounded Clock, µP (MHz)               750      1200      1400      1600      2000      2500      3000
Cost - Performance (D=25)
Cost Perf. Clock, µP (MHz)            380       615       727       833      1072      1406      1782
Rounded Clock, µP (MHz)               400       600       700       800      1100      1400      1800
Wn Output (µm)                        6.0       4.0       3.5       3.0       2.0       1.5       1.0
Logic Area (cm^2)                    1.20      1.15      1.18      1.10      1.11      1.11      1.10
Ng                               1.00E+06  1.50E+06  2.00E+06  3.00E+06  6.00E+06  1.20E+07  2.00E+07
Ntr Logic                        6.00E+06  9.00E+06  1.20E+07  1.80E+07  3.60E+07  7.20E+07  1.20E+08

Rules of Thumb
Keep the fan-in less than 3
Keep the fan-out less than 5
Same delays of gates in the critical path
Same rise/fall times
Size the transistors to drive the interconnect
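The rows of the clock-cycle model summary are related by simple arithmetic, which the sketch below reproduces from the tabulated td-gates and td-global values. It assumes td-cycle is the sum of those two rows, that the cycle time is td-cycle divided by the 0.90 ratio, and that the clock frequency is the reciprocal of the cycle time. The function and variable names are illustrative.

```python
# Values transcribed from the clock-cycle model summary table, in ps.
NODES = ["250nm", "180nm", "150nm", "130nm", "100nm", "70nm", "50nm"]
TD_GLOBAL_PS = [377, 235, 210, 190, 176, 168, 160]
TD_GATES_D12_PS = [830, 512, 428, 370, 274, 193, 139]
RATIO = 0.90  # td-cycle / cycle time, from the table

def cycle_model(td_gates_ps, td_global_ps, ratio=RATIO):
    """Return (td_cycle in ps, cycle time in ps, clock frequency in MHz)."""
    td_cycle = td_gates_ps + td_global_ps
    t_cycle = td_cycle / ratio
    f_mhz = 1e6 / t_cycle  # 1 / (1 ps) = 1e6 MHz
    return td_cycle, t_cycle, f_mhz

if __name__ == "__main__":
    for node, gates, glob in zip(NODES, TD_GATES_D12_PS, TD_GLOBAL_PS):
        td_cycle, t_cycle, f_mhz = cycle_model(gates, glob)
        print(f"{node}: td-cycle={td_cycle} ps, "
              f"t-cycle={t_cycle:.0f} ps, clock={f_mhz:.0f} MHz")
```

The results match the table to within the rounding of the tabulated inputs. They also show the trend the lecture is pointing at: the global-wire term shrinks much more slowly than the gate term, so its share of the cycle grows with each generation.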
Logical Effort
Designing for speed on the Back of an Envelope
Sutherland, Sproull, ARVLSI 91
Sutherland, Sproull, Harris, Morgan Kaufmann 99
A simple method for manual sizing of logic gates:
» simple delay model
» transistor sizes
» number of stages
Expanding the sizing rules of inverters to complex logic gates

Delay Model
[Figure: normalized delay (d) versus fan-out (1 to 7) for $t_{pNAND}$ and $t_{pINV}$; F (fan-in)]
Delay in a Logic Gate
Gate delay: $d = f + p$ (effort delay + parasitic delay)
Effort delay: $f = g h$ (logical effort × electrical effort), with electrical effort $h = C_{out}/C_{in}$
Logical effort is a function of topology, independent of sizing
Electrical effort is a function of load/gate size

Logical Effort
Logical effort is the ratio of input capacitance of a gate to the input capacitance of an inverter with the same output current
[Figure: three gate schematics with g = 1, g = 4/3, g = 5/3]
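The delay model $d = gh + p$ is simple enough to evaluate by hand, and the sketch below does the same in code. It assumes the three logical-effort values on the slide belong to an inverter, a 2-input NAND, and a 2-input NOR, in that order. The parasitic delays (1, 2, 2 in units of the basic delay τ) are the usual textbook estimates and do not appear on the slide. All names are illustrative.

```python
from fractions import Fraction

# gate -> (logical effort g, parasitic delay p in units of tau).
# The g values are from the slide; the gate names and p values are assumptions.
GATES = {
    "inv":   (Fraction(1),    1),
    "nand2": (Fraction(4, 3), 2),
    "nor2":  (Fraction(5, 3), 2),
}

def gate_delay(gate, c_out, c_in):
    """Normalized gate delay d = g*h + p, with electrical effort h = c_out/c_in."""
    g, p = GATES[gate]
    h = Fraction(c_out) / Fraction(c_in)
    return g * h + p

if __name__ == "__main__":
    for name in GATES:
        print(f"{name}: d = {float(gate_delay(name, 4, 1)):.2f} tau at h = 4")
```

Because g depends only on topology, resizing a gate changes its delay only through h, which is the separation of concerns the slide emphasizes.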
Logical Effort

Example
8-input AND
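The slide's worked example did not survive extraction. As a stand-in, the sketch below compares three 8-input AND topologies that are commonly used to illustrate the method. The topologies, the rule g = (n + 2)/3 and p = n for an n-input NAND, the NOR2 and inverter values, and the no-branching path-delay formula $D = N (G H)^{1/N} + P$ are all assumptions drawn from the standard logical-effort treatment, not from the slide.

```python
def nand(n):
    """(g, p) for an n-input NAND under the assumed textbook rule."""
    return ((n + 2) / 3.0, float(n))

NOR2 = (5.0 / 3.0, 2.0)
INV = (1.0, 1.0)

# Candidate 8-input AND paths (assumed topologies), one gate per stage.
OPTIONS = {
    "NAND8-INV": [nand(8), INV],
    "NAND4-NOR2": [nand(4), NOR2],
    "NAND2-NOR2-NAND2-INV": [nand(2), NOR2, nand(2), INV],
}

def path_delay(stages, electrical_effort):
    """Minimum path delay N*(G*H)**(1/N) + P for a path without branching."""
    n = len(stages)
    path_g = 1.0
    path_p = 0.0
    for g, p in stages:
        path_g *= g
        path_p += p
    return n * (path_g * electrical_effort) ** (1.0 / n) + path_p

def best_option(electrical_effort):
    """Name of the topology with the smallest estimated delay."""
    return min(OPTIONS, key=lambda name: path_delay(OPTIONS[name], electrical_effort))

if __name__ == "__main__":
    for h in (1, 12):
        for name, stages in OPTIONS.items():
            print(f"H={h:>2} {name}: {path_delay(stages, h):.2f} tau")
```

Under these assumptions the two-stage NAND4-NOR2 path is fastest for a light load, while the four-stage path wins once the electrical effort is large, which is the kind of back-of-the-envelope trade-off the method is meant to expose.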