CPE/EE 427, CPE 527 VLSI Design I L13: Wires, Design for Speed. Course Administration


CPE/EE 427, CPE 527 VLSI Design I
L13: Wires, Design for Speed
Department of Electrical and Computer Engineering, University of Alabama in Huntsville
Aleksandar Milenkovic (www.ece.uah.edu/~milenka)
www.ece.uah.edu/~milenka/cpe527-05f

Course Administration
Instructor: Aleksandar Milenkovic, milenka@ece.uah.edu, www.ece.uah.edu/~milenka
EB 27-L, Mon. 5:30 PM - 6:30 PM, Wed. 2:30 - 3:30 PM
URL: http://www.ece.uah.edu/~milenka/cpe527-05f
TA: Joel Wilder
Labs: Lab#4: due 0/4/05; Lab#5: 0/2/05
Hws: Solutions in secure directory /scr (cpe427fall05,?)
Project: Proposals were due on 0/0/05
Test I: 0/7/05
Text: CMOS VLSI Design, 3rd ed., Weste, Harris
Review: Chapters 1, 2, 3, 4
Today: Wires, Design for Speed (meet M in the Lab tonight)

0//2005 VLSI Design I; A. Milenkovic

Outline
  Introduction
  Wire Resistance
  Wire Capacitance
  Wire RC Delay
  Crosstalk
  Wire Engineering
  Repeaters

Introduction
Chips are mostly made of wires called interconnect
  In stick diagram, wires set size
  Transistors are little things under the wires
  Many layers of wires
Wires are as important as transistors
  Speed
  Power
  Noise
Alternating layers run orthogonally

Wire Geometry
Pitch = w + s
Aspect ratio: AR = t/w
  Old processes had AR << 1
  Modern processes have AR ≈ 2
  Pack in many skinny wires
[Figure: wire cross-section with width w, spacing s, length l, thickness t, and height h above the layer below]

Layer Stack
AMI 0.6 µm process has 3 metal layers
Modern processes use 6-10+ metal layers
Example: Intel 180 nm process
  M1: thin, narrow (< 3λ); high density cells
  M2-M4: thicker; for longer wires
  M5-M6: thickest; for V_DD, GND, clk

Layer   T (nm)   W (nm)   S (nm)   AR    Dielectric below (nm)
6       1720     860      860      2.0   1000
5       1600     800      800      2.0   1000
4       1080     540      540      2.0   700
3       700      320      320      2.2   700
2       700      320      320      2.2   700
1       480      250      250      1.9   800
Substrate

Wire Resistance
ρ = resistivity (Ω*m)
$R = \frac{\rho}{t}\,\frac{l}{w} = R_\square\,\frac{l}{w}$
$R_\square$ = sheet resistance (Ω/□)
  □ is a dimensionless unit(!)
Count number of squares
  $R = R_\square \times$ (# of squares)
  1 Rectangular Block: $R = R_\square (L/W)$ Ω
  4 Rectangular Blocks: $R = R_\square (2L/2W)$ Ω $= R_\square (L/W)$ Ω

Choice of Metals
Until 180 nm generation, most wires were aluminum
Modern processes often use copper
  Cu atoms diffuse into silicon and damage FETs
  Must be surrounded by a diffusion barrier

Metal             Bulk resistivity (µΩ*cm)
Silver (Ag)       1.6
Copper (Cu)       1.7
Gold (Au)         2.2
Aluminum (Al)     2.8
Tungsten (W)      5.3
Molybdenum (Mo)   5.3
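The counting-squares rule above can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the 0.7 µm film thickness is an assumed value, not one given on the slide.

```python
def sheet_resistance(resistivity_ohm_m, thickness_m):
    """Sheet resistance in ohms per square: R_sq = rho / t."""
    return resistivity_ohm_m / thickness_m


def wire_resistance(r_sheet, length, width):
    """Wire resistance: R = R_sq * (number of squares) = R_sq * l / w.

    length and width only need to share the same unit.
    """
    return r_sheet * length / width


# Bulk copper, 1.7 uOhm*cm = 1.7e-8 Ohm*m, from the metals table.
# The 0.7 um thickness is an assumed value for illustration.
r_sq_cu = sheet_resistance(1.7e-8, 0.7e-6)

# One square block and a 2L x 2W block of the same film have equal resistance.
one_block = wire_resistance(r_sq_cu, 1.0, 1.0)
four_blocks = wire_resistance(r_sq_cu, 2.0, 2.0)

print(f"R_sq = {r_sq_cu:.4f} ohm/sq")
print(f"1 block: {one_block:.4f} ohm, 4 blocks: {four_blocks:.4f} ohm")
```

Because only the ratio l/w enters, scaling both dimensions by the same factor leaves the resistance unchanged, which is the point of the four-block example.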

Sheet Resistance
Typical sheet resistances in 180 nm process

Layer                       Sheet Resistance (Ω/□)
Diffusion (silicided)       3-10
Diffusion (no silicide)     50-200
Polysilicon (silicided)     3-10
Polysilicon (no silicide)   50-400
Metal1                      0.08
Metal2                      0.05
Metal3                      0.05
Metal4                      0.03
Metal5                      0.02
Metal6                      0.02

Contacts Resistance
Contacts and vias also have 2-20 Ω
Use many contacts for lower R
  Many small contacts for current crowding around periphery

Wire Capacitance
Wire has capacitance per unit length
  To neighbors
  To layers above and below
$C_{total} = C_{top} + C_{bot} + 2C_{adj}$
[Figure: wire on layer n with $C_{top}$ to layer n+1, $C_{bot}$ to layer n-1, and $C_{adj}$ to each neighbor on the same layer]

Capacitance Trends
Parallel plate equation: $C = \varepsilon A/d$
  Wires are not parallel plates, but obey trends
  Increasing area (W, t) increases capacitance
  Increasing distance (s, h) decreases capacitance
Dielectric constant $\varepsilon = k\varepsilon_0$
  $\varepsilon_0 = 8.85 \times 10^{-14}$ F/cm
  k = 3.9 for SiO2
Processes are starting to use low-k dielectrics
  k ≈ 3 (or less) as dielectrics use air pockets

M2 Capacitance Data
Typical wires have ~0.2 fF/µm
  Compare to 2 fF/µm for gate capacitance
[Figure: $C_{total}$ (aF/µm) versus w (nm) for an M2 wire, with M1 and M3 planes present and for an isolated wire, at spacings s = 320, 480, 640 nm and s = ∞]

Diffusion & Polysilicon
Diffusion capacitance is very high (about 2 fF/µm)
  Comparable to gate capacitance
  Diffusion also has high resistance
  Avoid using diffusion runners for wires!
Polysilicon has lower C but high R
  Use for transistor gates
  Occasionally for very short wires between gates

Lumped Element Models
Wires are a distributed system
  Approximate with lumped element models
[Figure: a wire with total R and C split into N segments of R/N and C/N; single-segment L-model (R, C), π-model (R, C/2, C/2), and T-model (R/2, R/2, C)]
3-segment π-model is accurate to 3% in simulation
L-model needs 100 segments for same accuracy!
Use single segment π-model for Elmore delay

Example
Metal2 wire in 180 nm process
  5 mm long
  0.32 µm wide
Construct a 3-segment π-model
  $R_\square$ = 0.05 Ω/□ => R = 781 Ω
  $C_{permicron}$ = 0.2 fF/µm => C = 1 pF
[Figure: three π segments in series, each 260 Ω with 167 fF at either end]
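The numbers in this example follow directly from the sheet resistance and the per-micron capacitance. The sketch below recomputes them; `pi_model` is an illustrative helper name, not something defined in the lecture.

```python
def pi_model(r_sheet, c_per_um, length_um, width_um, n_segments):
    """Split a wire into n pi segments.

    Returns total R (ohm), total C (F), and a list of
    (C_left, R_segment, C_right) tuples, one per segment.
    """
    r_total = r_sheet * length_um / width_um   # squares * ohm/sq
    c_total = c_per_um * length_um
    r_seg = r_total / n_segments
    c_half = c_total / n_segments / 2          # half of a segment's C at each end
    return r_total, c_total, [(c_half, r_seg, c_half)] * n_segments


# Metal2 wire from the example: 0.05 ohm/sq, 0.2 fF/um, 5 mm x 0.32 um.
r_total, c_total, segments = pi_model(0.05, 0.2e-15, 5000, 0.32, 3)

print(f"R = {r_total:.0f} ohm, C = {c_total * 1e12:.2f} pF")
for c_left, r_seg, c_right in segments:
    print(f"{c_left * 1e15:.0f} fF | {r_seg:.0f} ohm | {c_right * 1e15:.0f} fF")
```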

Wire RC Delay
Estimate the delay of a 10x inverter driving a 2x inverter at the end of the 5 mm wire from the previous example.
  R = 2.5 kΩ*µm for gates
  Unit inverter: 0.36 µm nMOS, 0.72 µm pMOS
[Figure: driver resistance 690 Ω; wire as a single π segment, 781 Ω with 500 fF at each end; load 4 fF]
$t_{pd}$ = 1.1 ns

Simulated Wire Delays
[Figure: voltage (V) versus time (nsec) for $V_{in}$ and for $V_{out}$ observed at L/10, L/4, L/2, and L along the wire]
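The 1.1 ns estimate can be reproduced with the Elmore delay of the single-segment π-model. This is a sketch under stated assumptions: the driver's own diffusion capacitance is neglected, the 2 fF/µm gate capacitance is taken from the capacitance slides, and the helper names are illustrative.

```python
R_GATE_OHM_UM = 2500.0       # 2.5 kOhm*um for gates
C_GATE_F_PER_UM = 2e-15      # 2 fF/um gate capacitance
UNIT_NMOS_UM = 0.36          # unit inverter nMOS width
UNIT_PMOS_UM = 0.72          # unit inverter pMOS width


def driver_resistance(size):
    """Effective resistance of a size-x inverter (based on nMOS width)."""
    return R_GATE_OHM_UM / (size * UNIT_NMOS_UM)


def load_capacitance(size):
    """Gate capacitance presented by a size-x inverter."""
    return C_GATE_F_PER_UM * size * (UNIT_NMOS_UM + UNIT_PMOS_UM)


def elmore_delay(r_driver, r_wire, c_wire, c_load):
    """Elmore delay with the wire as one pi segment (C/2 at each end)."""
    c_near = c_wire / 2
    c_far = c_wire / 2 + c_load
    return r_driver * c_near + (r_driver + r_wire) * c_far


r_drv = driver_resistance(10)        # about 690 ohm
c_ld = load_capacitance(2)           # about 4 fF
t_pd = elmore_delay(r_drv, 781.25, 1e-12, c_ld)
print(f"R_driver = {r_drv:.0f} ohm, C_load = {c_ld * 1e15:.1f} fF, "
      f"t_pd = {t_pd * 1e9:.2f} ns")
```

The wire resistance only sees the far-end capacitance, which is why the second term dominates for a long wire.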

Wire Delay Models
Ideal wire
  same voltage is present at every segment of the wire at every point in time (equipotential)
  only holds for very short wires, i.e., interconnects between very nearest neighbor gates
Lumped C model
  when only a single parasitic component (C, R, or L) is dominant, the different fractions are lumped into a single circuit element
  when the resistive component is small and the switching frequency is low to medium, can consider only C; the wire itself does not introduce any delay; the only impact on performance comes from wire capacitance
  good for short wires; pessimistic and inaccurate for long wires
[Figure: driver with resistance $R_{Driver}$ charging $C_{lumped}$ at $V_{out}$; $c_{wire}$ is the capacitance per unit length]

Wire Delay Models, cont'd
Lumped RC model
  total wire resistance is lumped into a single R and total capacitance into a single C
  good for short wires; pessimistic and inaccurate for long wires
Distributed RC model
  circuit parasitics are distributed along the length, L, of the wire
  c and r are the capacitance and resistance per unit length
[Figure: $V_{in}$ driving a ladder of segments with resistance r∆L and capacitance c∆L, ending at $V_N$; equivalent symbol (r, c, L)]
Delay is determined using the Elmore delay equation
  $\tau_{Di} = \sum_{k=1}^{N} c_k r_{ik}$

Chain Network Elmore Delay
[Figure: RC chain from $V_{in}$ through resistors $r_1, r_2, \ldots, r_N$ to $V_N$, with capacitor $c_i$ to ground at node i]
$\tau_{D1} = c_1 r_1$
$\tau_{D2} = c_1 r_1 + c_2 (r_1 + r_2)$
$\tau_{Di} = c_1 r_1 + c_2 (r_1 + r_2) + \cdots + c_i (r_1 + r_2 + \cdots + r_i)$
Elmore delay equation
  $\tau_{DN} = \sum_{i=1}^{N} c_i r_{ii} = \sum_{i=1}^{N} c_i \sum_{j=1}^{i} r_j$
When every segment has the same resistance $r_{eq}$:
  $\tau_{Di} = c_1 r_{eq} + 2 c_2 r_{eq} + 3 c_3 r_{eq} + \cdots + i\, c_i r_{eq}$

Distributed RC Model for Simple Wires
An RC wire of length L can be modeled by N segments of length L/N
  The resistance and capacitance of each segment are r L/N and c L/N
  $\tau_{DN} = (L/N)^2 (cr + 2cr + \cdots + Ncr) = (c r L^2)\,\frac{N(N+1)}{2N^2} = CR\,\frac{N+1}{2N}$
  where R (= rL) and C (= cL) are the total lumped resistance and capacitance of the wire
For large N
  $\tau_{DN} = RC/2 = r c L^2/2$
  Delay of a wire is a quadratic function of its length, L
  The delay is 1/2 of that predicted by the lumped model

Putting It All Together
[Figure: $V_{in}$ driving a wire $(r_w, c_w, L)$ through $R_{Driver}$, output $V_{out}$]
Total propagation delay, considering driver and wire
  $\tau_D = R_{Driver} C_w + (R_w C_w)/2 = R_{Driver} C_w + 0.5\, r_w c_w L^2$
  and $t_p = 0.69\, R_{Driver} C_w + 0.38\, R_w C_w$
  where $R_w = r_w L$ and $C_w = c_w L$
The delay introduced by wire resistance becomes dominant when $(R_w C_w)/2 \ge R_{Driver} C_w$, that is, when $L \ge 2 R_{Driver}/r_w$
For an $R_{Driver}$ = 1 kΩ driving a 1 µm wide Al wire, $L_{crit}$ is 2.67 cm
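The convergence of the N-segment Elmore delay toward RC/2 is easy to see numerically. The sketch below evaluates the chain formula directly; the function names are illustrative, and R = C = 1 is used so the result reads as a fraction of the lumped RC product.

```python
def elmore_chain(resistances, capacitances):
    """Elmore delay of an RC chain: sum_i c_i * (r_1 + ... + r_i)."""
    total = 0.0
    r_upstream = 0.0
    for r, c in zip(resistances, capacitances):
        r_upstream += r            # resistance shared with the path to node i
        total += c * r_upstream
    return total


def distributed_wire_delay(r_total, c_total, n_segments):
    """Model a uniform wire as n equal RC segments."""
    r_seg = r_total / n_segments
    c_seg = c_total / n_segments
    return elmore_chain([r_seg] * n_segments, [c_seg] * n_segments)


for n in (1, 2, 3, 10, 100, 1000):
    tau = distributed_wire_delay(1.0, 1.0, n)
    closed_form = (n + 1) / (2 * n)
    print(f"N = {n:4d}: tau/RC = {tau:.4f} (closed form {closed_form:.4f})")
```

With N = 1 the result is the lumped RC value; as N grows it approaches RC/2, matching the closed form CR(N+1)/(2N).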

Design Rules of Thumb
rc delays should be considered when $t_{pRC} > t_{pgate}$ of the driving gate
  $L_{crit} > \sqrt{t_{pgate}/(0.38\, r c)}$
  actual $L_{crit}$ depends upon the size of the driving gate and the interconnect material
rc delays should be considered when the rise (fall) time at the line input is smaller than RC, the rise (fall) time of the line
  $t_{rise} < RC$
  when not met, the change in the signal is slower than the propagation delay of the wire, so a lumped C model suffices

Delay with Long Interconnects
When gates are farther apart, wire capacitance and resistance can no longer be ignored.
[Figure: $V_{in}$ driving a wire $(r_w, c_w, L)$ to $V_{out}$, with $c_{int}$ at the driver output and $c_{fan}$ at the far end]
  $t_p = 0.69 R_{dr} C_{int} + (0.69 R_{dr} + 0.38 R_w) C_w + 0.69 (R_{dr} + R_w) C_{fan}$
  where $R_{dr} = (R_{eqn} + R_{eqp})/2$
  $t_p = 0.69 R_{dr} (C_{int} + C_{fan}) + 0.69 (R_{dr} c_w + r_w C_{fan}) L + 0.38\, r_w c_w L^2$
Wire delay rapidly becomes the dominant factor (due to the quadratic term) in the delay budget for longer wires.

Crosstalk
A capacitor does not like to change its voltage instantaneously.
A wire has high capacitance to its neighbor.
  When the neighbor switches from 1 -> 0 or 0 -> 1, the wire tends to switch too.
  Called capacitive coupling or crosstalk.
Crosstalk effects
  Noise on nonswitching wires
  Increased delay on switching wires

Crosstalk Delay
Assume layers above and below on average are quiet
  Second terminal of capacitor can be ignored
  Model as $C_{gnd} = C_{top} + C_{bot}$
Effective $C_{adj}$ depends on behavior of neighbors
  Miller effect
[Figure: wires A and B, each with $C_{gnd}$ to ground, coupled by $C_{adj}$]

B                      ∆V         C_eff(A)             MCF
Constant               V_DD       C_gnd + C_adj        1
Switching with A       0          C_gnd                0
Switching opposite A   2V_DD      C_gnd + 2 C_adj      2

Crosstalk Noise
Crosstalk causes noise on nonswitching wires
If victim is floating: model as capacitive voltage divider
  $\Delta V_{victim} = \frac{C_{adj}}{C_{gnd-v} + C_{adj}}\,\Delta V_{aggressor}$
[Figure: aggressor with swing $\Delta V_{aggressor}$ coupled through $C_{adj}$ to a victim with $C_{gnd-v}$ to ground]
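The Miller coupling factor table and the floating-victim divider translate directly into code. A minimal sketch follows; the dictionary keys and function names are illustrative, and the capacitance values are made-up round numbers.

```python
# Miller coupling factor for each behavior of neighbor B relative to wire A.
MCF = {
    "constant": 1,     # B holds its value
    "with": 0,         # B switches in the same direction as A
    "opposite": 2,     # B switches in the opposite direction
}


def effective_capacitance(c_gnd, c_adj, neighbor):
    """Capacitance seen by wire A: C_gnd + MCF * C_adj."""
    return c_gnd + MCF[neighbor] * c_adj


def floating_victim_noise(c_adj, c_gnd_v, dv_aggressor):
    """Capacitive divider: noise coupled onto an undriven victim."""
    return c_adj / (c_gnd_v + c_adj) * dv_aggressor


# Assumed example values: equal ground and coupling capacitance, 1.8 V swing.
c_gnd, c_adj, vdd = 100e-15, 100e-15, 1.8
for behavior in MCF:
    c_eff = effective_capacitance(c_gnd, c_adj, behavior)
    print(f"{behavior:9s}: C_eff = {c_eff * 1e15:.0f} fF")
print(f"floating victim noise = {floating_victim_noise(c_adj, c_gnd, vdd):.2f} V")
```

With equal coupling and ground capacitance, a floating victim picks up half of the aggressor swing, and the effective load on a switching wire varies by a factor of three between the best and worst neighbor behavior.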

Driven Victims
Usually victim is driven by a gate that fights noise
  Noise depends on relative resistances
  Victim driver is in linear region, aggressor in saturation
  If sizes are same, $R_{aggressor}$ = 2-4 x $R_{victim}$
  $\Delta V_{victim} = \frac{C_{adj}}{C_{gnd-v} + C_{adj}}\,\frac{1}{1+k}\,\Delta V_{aggressor}$
  $k = \frac{\tau_{aggressor}}{\tau_{victim}} = \frac{R_{aggressor}\,(C_{gnd-a} + C_{adj})}{R_{victim}\,(C_{gnd-v} + C_{adj})}$
[Figure: aggressor driven through $R_{aggressor}$ with $C_{gnd-a}$, coupled by $C_{adj}$ to a victim driven through $R_{victim}$ with $C_{gnd-v}$]

Coupling Waveforms
Simulated coupling for $C_{adj} = C_{victim}$
[Figure: aggressor and victim voltage versus t (ps)]
  Victim (undriven): 50%
  Victim (half size driver): 16%
  Victim (equal size driver): 8%
  Victim (double size driver): 4%
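The driven-victim estimate adds a 1/(1+k) attenuation to the capacitive divider. The sketch below evaluates it for an assumed case where all capacitances are equal and the aggressor resistance is three times the victim resistance (inside the 2-4x range quoted above); the function name and values are illustrative. It is a first-order formula and is not expected to reproduce the simulated percentages exactly.

```python
def driven_victim_noise(c_adj, c_gnd_v, c_gnd_a, r_aggressor, r_victim, dv_aggressor):
    """First-order noise on a driven victim.

    k is the ratio of aggressor to victim time constants; a strong victim
    driver (small r_victim) gives a large k and therefore less noise.
    """
    k = (r_aggressor * (c_gnd_a + c_adj)) / (r_victim * (c_gnd_v + c_adj))
    return c_adj / (c_gnd_v + c_adj) * dv_aggressor / (1 + k)


# Assumed values: equal capacitances, R_aggressor = 3 * R_victim, unit swing.
c = 100e-15
noise_fraction = driven_victim_noise(c, c, c, 3000.0, 1000.0, 1.0)
print(f"noise = {noise_fraction * 100:.1f}% of the aggressor swing")
```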

Noise Implications
So what if we have noise?
If the noise is less than the noise margin, nothing happens
Static CMOS logic will eventually settle to correct output even if disturbed by large noise spikes
  But glitches cause extra delay
  Also cause extra power from false transitions
Dynamic logic never recovers from glitches
Memories and other sensitive circuits also can produce the wrong answer


Wire Engineering
Goal: achieve delay, area, power goals with acceptable noise
Degrees of freedom:
  Width
  Spacing
  Layer
  Shielding
[Figure: delay (ns), RC/2, versus pitch (nm), and coupling $2C_{adj}/(2C_{adj} + C_{gnd})$ versus pitch (nm), for wire spacings of 320, 480, and 640 nm]
[Figure: shielding arrangements: vdd a0 a1 gnd a2 a3 vdd; vdd a0 gnd a1 vdd a2 gnd; a0 b0 a1 b1 a2 b2]

Repeaters
R and C are proportional to l
RC delay is proportional to $l^2$
  Unacceptably great for long wires
Break long wires into N shorter segments
  Drive each one with an inverter or buffer
[Figure: a wire of length l between driver and receiver, compared with N segments of length l/N separated by repeaters]

Repeater Design
How many repeaters should we use?
How large should each one be?
Equivalent Circuit (one segment)
  Wire length l/N
    Wire Capacitance $C_w \cdot l/N$, Resistance $R_w \cdot l/N$
  Inverter width W (nMOS = W, pMOS = 2W)
    Gate Capacitance $C' \cdot W$, Resistance R/W
[Figure: driver resistance R/W; wire π segment with resistance $R_w l/N$ and $C_w l/2N$ at each end; load capacitance $C'W$]

Repeater Results
Write equation for Elmore Delay
  Differentiate with respect to W and N
  Set equal to 0, solve
  $\frac{l}{N} = \sqrt{\frac{2RC'}{R_w C_w}}$
  $\frac{t_{pd}}{l} = \left(2 + \sqrt{2}\right)\sqrt{R C' R_w C_w}$
  $W = \sqrt{\frac{R C_w}{R_w C'}}$
~60-80 ps/mm in 180 nm process
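The three repeater results can be evaluated with numbers from earlier in the lecture. This is a sketch under assumptions: R = 2.5 kΩ*µm and the Metal2 wire (0.05 Ω/□, 0.32 µm wide, 0.2 fF/µm) come from the slides, while C' = 6 fF per µm of W assumes 2 fF/µm of gate capacitance over the nMOS width W plus the pMOS width 2W and ignores diffusion capacitance.

```python
import math


def optimal_repeaters(r_gate, c_gate, r_wire, c_wire):
    """Optimal repeater spacing, width, and delay per unit length.

    r_gate: ohm*um, c_gate: F per um of inverter width W,
    r_wire: ohm/um, c_wire: F/um.
    """
    segment_length = math.sqrt(2 * r_gate * c_gate / (r_wire * c_wire))
    width = math.sqrt(r_gate * c_wire / (r_wire * c_gate))
    delay_per_length = (2 + math.sqrt(2)) * math.sqrt(
        r_gate * c_gate * r_wire * c_wire)
    return segment_length, width, delay_per_length


R_GATE = 2500.0          # ohm*um, from the wire RC delay example
C_GATE = 3 * 2e-15       # F/um of W (assumed: nMOS W + pMOS 2W at 2 fF/um)
R_WIRE = 0.05 / 0.32     # ohm/um for the 0.32 um wide Metal2 wire
C_WIRE = 0.2e-15         # F/um

seg_um, w_um, delay_s_per_um = optimal_repeaters(R_GATE, C_GATE, R_WIRE, C_WIRE)
ps_per_mm = delay_s_per_um * 1e12 * 1e3
print(f"segment length = {seg_um:.0f} um, W = {w_um:.1f} um, "
      f"delay = {ps_per_mm:.0f} ps/mm")
```

With these assumed inputs the delay per millimeter falls inside the 60-80 ps/mm range quoted on the slide.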

Designing for Speed
Department of Electrical and Computer Engineering, University of Alabama in Huntsville

Review: CMOS Inverter: Dynamic
[Figure: inverter with $V_{in} = V_{DD}$; the nMOS, modeled as $R_n$, discharges $C_L$ at $V_{out}$]
$t_{pHL} = f(R_n, C_L)$
$t_{pHL} = 0.69\, R_{eqn} C_L$
$t_{pHL} = 0.69 \left(\tfrac{3}{4}\, (C_L V_{DD})/I_{DSATn}\right) = 0.52\, C_L / \left((W/L)_n\, k'_n\, V_{DSATn}\right)$

Review: Designing Inverters for Performance
Reduce $C_L$
  internal diffusion capacitance of the gate itself
  interconnect capacitance
  fanout
Increase W/L ratio of the transistor
  the most powerful and effective performance optimization tool in the hands of the designer
  watch out for self-loading!
Increase $V_{DD}$
  only minimal improvement in performance at the cost of increased energy dissipation
Slope engineering: keeping signal rise and fall times smaller than or equal to the gate propagation delays and of approximately equal values
  good for performance
  good for power consumption

Switch Delay Model
[Figure: switch-level RC models of the INVERTER, NAND, and NOR gates, with each transistor replaced by a resistance $R_p$ or $R_n$, load $C_L$, and internal node capacitance $C_{int}$]

Input Pattern Effects on Delay
[Figure: 2-input NAND switch model with $R_p$ (inputs A, B) in parallel, $R_n$ (inputs A, B) in series, load $C_L$, and $C_{int}$]
Delay is dependent on the pattern of inputs
Low to high transition
  both inputs go low: delay is $0.69\, (R_p/2)\, C_L$ since two p-resistors are on in parallel
  one input goes low: delay is $0.69\, R_p C_L$
High to low transition
  both inputs go high: delay is $0.69 \cdot 2R_n C_L$
Adding transistors in series (without sizing) slows down the circuit

Delay Dependence on Input Patterns
2-input NAND with NMOS = 0.5 µm/0.25 µm, PMOS = 0.75 µm/0.25 µm, $C_L$ = 10 fF
[Figure: voltage (V) versus time (psec) for the input patterns A=B=1→0; A=1, B=1→0; A=1→0, B=1]

Input Data Pattern   Delay (psec)
A=B=0→1              69
A=1, B=0→1           62
A=0→1, B=1           50
A=B=1→0              35
A=1, B=1→0           76
A=1→0, B=1           57

Transistor Sizing
[Figure: sized NAND and NOR gates, with resistances $R_p$ and $R_n$, internal capacitance $C_{int}$, and load $C_L$]

Fan-In Considerations
[Figure: 4-input NAND with inputs A, B, C, D, load $C_L$, and internal node capacitances $C_3$, $C_2$, $C_1$ along the pull-down chain]
Distributed RC model (Elmore delay)
  $t_{pHL} = 0.69\, R_{eqn} (C_1 + 2C_2 + 3C_3 + 4C_L)$
Propagation delay deteriorates rapidly as a function of fan-in, quadratically in the worst case.

$t_p$ as a Function of Fan-In
[Figure: $t_p$ (psec) versus fan-in; $t_{pHL}$ is a quadratic function of fan-in, $t_{pLH}$ is a linear function of fan-in]
Gates with a fan-in greater than 4 should be avoided.

Fast Complex Gates: Design Technique 1
Transistor sizing
  as long as fan-out capacitance dominates
Progressive sizing
  Distributed RC line
  M1 > M2 > M3 > ... > MN (the FET closest to the output should be the smallest)
  Can reduce delay by more than 20%; decreasing gains as technology shrinks
[Figure: pull-down chain M1 ... MN with inputs In1 ... InN, internal capacitances $C_1$, $C_2$, $C_3$, and load $C_L$]

Fast Complex Gates: Design Technique 2
Input re-ordering
  when not all inputs arrive at the same time
[Figure: two 3-input pull-down stacks M1-M3 with load $C_L$ and internal capacitances $C_1$, $C_2$; the critical input is the last to switch 0→1 while the other inputs are already 1]
Critical input on the transistor farthest from the output
  $C_L$, $C_1$, and $C_2$ are all charged
  delay determined by time to discharge $C_L$, $C_1$ and $C_2$
Critical input on the transistor closest to the output
  $C_L$ is charged, $C_1$ and $C_2$ are already discharged
  delay determined by time to discharge $C_L$

Sizing and Ordering Effects
[Figure: 4-input NAND with inputs A, B, C, D, $C_L$ = 100 fF, internal capacitances $C_3$, $C_2$, $C_1$, and uniform versus progressively sized pull-down transistors]
Progressive sizing in pull-down chain gives up to a 23% improvement.
Input ordering saves 5%
  critical path A: 23%
  critical path D: 17%

Fast Complex Gates: Design Technique 3
Alternative logic structures
  F = ABCDEFGH

Fast Complex Gates: Design Technique 4
Isolating fan-in from fan-out using buffer insertion
[Figure: a high fan-in gate driving $C_L$ directly, compared with the same gate followed by a buffer driving $C_L$]
Real lesson is that optimizing the propagation delay of a gate in isolation is misguided.

Logical Effort: Design Technique 5
Logical effort generalizes to multistage networks
  Path Logical Effort: $G = \prod g_i$
  Path Electrical Effort: $H = \frac{C_{out-path}}{C_{in-path}}$
  Path Effort: $F = \prod f_i = \prod g_i h_i$
[Figure: four-stage path with input capacitance 10 and load 20; stage input capacitances 10, x, y, z]
  $g_1 = 1$, $h_1 = x/10$
  $g_2 = 5/3$, $h_2 = y/x$
  $g_3 = 4/3$, $h_3 = z/y$
  $g_4 = 1$, $h_4 = 20/z$

Branching Effort
Introduce branching effort
  Accounts for branching between stages in path
  $b = \frac{C_{on\ path} + C_{off\ path}}{C_{on\ path}}$
  $B = \prod b_i$
Now we compute the path effort
  $F = GBH$
Note: $\prod h_i = BH$

Multistage Delays
  Path Effort Delay: $D_F = \sum f_i$
  Path Parasitic Delay: $P = \sum p_i$
  Path Delay: $D = \sum d_i = D_F + P$

Designing Fast Circuits
  $D = \sum d_i = D_F + P$
Delay is smallest when each stage bears same effort
  $\hat{f} = g_i h_i = F^{1/N}$
Thus minimum delay of N stage path is
  $D = N F^{1/N} + P$
This is a key result of logical effort
  Find fastest possible delay
  Doesn't require calculating gate sizes

Gate Sizes
How wide should the gates be for least delay?
  $\hat{f} = gh = g\,\frac{C_{out}}{C_{in}} \;\Rightarrow\; C_{in_i} = \frac{g_i\, C_{out_i}}{\hat{f}}$
Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives.
Check work by verifying input cap spec is met.
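The backward capacitance transformation can be sketched in a few lines of Python, applied here to the four-stage path from the logical effort slide (g = 1, 5/3, 4/3, 1; input capacitance 10; load 20). The function name is illustrative, and branching effort is assumed to be 1.

```python
import math


def size_path(logical_efforts, c_in_path, c_load):
    """Return the best stage effort and the input capacitance of each stage.

    Assumes no branching (B = 1), so F = G * H.
    """
    n_stages = len(logical_efforts)
    path_logical_effort = math.prod(logical_efforts)
    path_electrical_effort = c_load / c_in_path
    path_effort = path_logical_effort * path_electrical_effort
    f_hat = path_effort ** (1.0 / n_stages)

    # Work backward from the load: C_in_i = g_i * C_out_i / f_hat.
    input_caps = []
    c_out = c_load
    for g in reversed(logical_efforts):
        c_in = g * c_out / f_hat
        input_caps.append(c_in)
        c_out = c_in
    input_caps.reverse()
    return f_hat, input_caps


f_hat, caps = size_path([1.0, 5.0 / 3.0, 4.0 / 3.0, 1.0], 10.0, 20.0)
print(f"best stage effort = {f_hat:.3f}")
print("stage input capacitances:", [round(c, 2) for c in caps])
```

The first entry of the result is the input capacitance of the first stage; it comes out equal to the specified 10, which is the check suggested on the slide.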

Best Number of Stages
How many stages should a path use?
  Minimizing number of stages is not always fastest
Example: drive 64-bit datapath with unit inverter
  $D = N F^{1/N} + P = N (64)^{1/N} + N$
[Figure: initial driver of size 1 driving a datapath load of 64 through chains of 1 to 4 inverters]

N    f     D
1    64    65
2    8     18
3    4     15     (fastest)
4    2.8   15.3
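The table can be regenerated by evaluating the delay expression for each stage count. A minimal sketch, assuming each inverter contributes a parasitic delay of 1 as in the expression above; the function name is illustrative.

```python
def path_delay(n_stages, path_effort, p_inv=1.0):
    """D = N * F^(1/N) + N * p_inv for a chain of N inverters."""
    return n_stages * path_effort ** (1.0 / n_stages) + n_stages * p_inv


delays = {n: path_delay(n, 64.0) for n in range(1, 5)}
best_n = min(delays, key=delays.get)

for n, d in delays.items():
    stage_effort = 64.0 ** (1.0 / n)
    marker = "  <- fastest" if n == best_n else ""
    print(f"N = {n}: f = {stage_effort:5.1f}, D = {d:5.1f}{marker}")
```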

Derivation
Consider adding inverters to end of path
  How many give least delay?
[Figure: logic block with $n_1$ stages and path effort F, followed by $N - n_1$ extra inverters]
  $D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1)\, p_{inv}$
  $\frac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$
Define best stage effort $\rho = F^{1/N}$
  $p_{inv} + \rho\,(1 - \ln \rho) = 0$

Best Stage Effort
$p_{inv} + \rho\,(1 - \ln \rho) = 0$ has no closed-form solution
Neglecting parasitics ($p_{inv} = 0$), we find ρ = 2.718 (e)
For $p_{inv} = 1$, solve numerically for ρ = 3.59
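Since the best stage effort equation has no closed form, a simple bisection is enough to solve it. The sketch below relies on the left-hand side being positive at ρ = 1 and strictly decreasing for ρ > 1; the search bracket [1, 100] and the function name are assumptions made for illustration.

```python
import math


def best_stage_effort(p_inv, lo=1.0, hi=100.0, iterations=100):
    """Solve p_inv + rho * (1 - ln(rho)) = 0 for rho > 1 by bisection."""
    def f(rho):
        return p_inv + rho * (1.0 - math.log(rho))

    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid      # root lies to the right
        else:
            hi = mid      # root lies to the left (or at mid)
    return (lo + hi) / 2.0


rho_no_parasitics = best_stage_effort(0.0)
rho_unit_parasitic = best_stage_effort(1.0)
print(f"p_inv = 0: rho = {rho_no_parasitics:.3f}")
print(f"p_inv = 1: rho = {rho_unit_parasitic:.2f}")
```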

Sensitivity Analysis
How sensitive is delay to using exactly the best number of stages?
[Figure: $D(N)/D(\hat{N})$ versus $N/\hat{N}$, with the points for ρ = 6 and ρ = 2.4 marked]
2.4 < ρ < 6 gives delay within 15% of optimal
  We can be sloppy!
  I like ρ = 4