ESE534: Computer Organization
Day 5: September 19, 2016
VLSI Scaling

Today
- VLSI Scaling
  - Rules
  - Effects
  - Historical/predicted scaling
  - Variations (cheating)
  - Limits
- Note: Days 5 and 6 have the most gory MOSFET equations we will see
  - The goal is to understand trends
  - Give equations, then push through scaling implications together

Why Care?
- In this game, we must be able to predict the future
- Technology advances rapidly
- Reason about changes and trends
- Re-evaluate prior solutions given technology at time X
- Make direct comparisons across technologies
  - E.g. to understand older designs
  - What comes from process vs. architecture

Why Care
- Cannot compare against what a competitor does today, but against what they can do at the time you can ship
- Development time > technology generation
- Careful not to fall off the curve
  - Lose out to someone who can stay on the curve

Scaling
- Old premise: features scale uniformly
  - Everything gets better in a predictable manner
- Parameters:
  - $\lambda$ (lambda): Mead and Conway
  - $F$: half pitch, ITRS ($F = 2\lambda$)
  - $S$: scale factor, Rabaey
  - $F' = S \times F$

ITRS Roadmap
- Semiconductor industry rides this scaling curve
- Try to predict where the industry is going (requirements become a self-fulfilling prophecy)
- http://public.itrs.net
Preclass
- Scale factor $S$ from 32nm → 22nm?

MOS Transistor Scaling (1974 to present)
- $S = 0.7$ [0.5x per 2 nodes]
- [Figure: pitch and gate dimensions]
- Source: 2001 ITRS - Exec. Summary, ORTC Figure [from Andrew Kahng]

Half Pitch (= Pitch/2) Definition
- Metal pitch (typical DRAM)
- Poly pitch (typical MPU/ASIC)
- Source: 2001 ITRS - Exec. Summary, ORTC Figure [from Andrew Kahng]

Scaling Calculator + Node Cycle Time
- Each node is a 0.7x linear shrink; two nodes (N → N+2) give 0.5x
- Nodes (nm): 250 -> 180 -> 130 -> 90 -> 65 -> 45 -> 32 -> 22 -> 16
- [Figure: log half-pitch vs. linear time; 1994 NTRS: 0.7x per 3 years; actual: 0.7x per 2 years]
- Node cycle time ($T$ years): $\mathrm{CARR}(T) = (0.5)^{1/(2T)} - 1$
- CARR($T$) = Compound Annual Reduction Rate (at cycle time period $T$)
  - CARR(3 yrs) = -10.9%
  - CARR(2 yrs) = -15.9%
- Source: 2001 ITRS - Exec. Summary, ORTC Figure [from Andrew Kahng]

Dimensions
- Channel Length ($L$)
- Channel Width ($W$)
- Oxide Thickness ($T_{ox}$)

Scaling
- Channel Length ($L$)
- Channel Width ($W$)
- Oxide Thickness ($T_{ox}$)
- Doping ($N_a$)
- Voltage ($V$)
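
The CARR figures on the Scaling Calculator slide can be checked numerically. The sketch below is only an illustration: the function names `carr` and `node_scale_factor` are made up here, and it assumes the slide's formula reads $(0.5)^{1/(2T)} - 1$ with $T$ in years per node.

```python
def carr(cycle_years: float) -> float:
    """Compound annual reduction rate for a 0.7x-per-node cadence.

    Two nodes give a 0.5x shrink, so one year contributes a factor of
    0.5 ** (1 / (2 * cycle_years)); subtracting 1 turns it into a rate.
    """
    return 0.5 ** (1.0 / (2.0 * cycle_years)) - 1.0


def node_scale_factor(old_nm: float, new_nm: float) -> float:
    """Linear scale factor S between two nodes (S < 1 for a shrink)."""
    return new_nm / old_nm


if __name__ == "__main__":
    # Compare against the slide's CARR(3 yrs) and CARR(2 yrs) values.
    for years in (3, 2):
        print(f"CARR({years} yrs) = {carr(years):.1%}")
    # Preclass question: scale factor from 32nm to 22nm.
    print(f"S(32nm -> 22nm) = {node_scale_factor(32, 22):.4f}")
```

The 32nm → 22nm ratio comes out close to the nominal $S = 0.7$ used throughout the rest of the lecture.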
Full Scaling (Ideal Scaling, Dennard Scaling)

  Parameter                    Scales by
  Channel Length ($L$)         $S$
  Channel Width ($W$)          $S$
  Oxide Thickness ($T_{ox}$)   $S$
  Doping ($N_a$)               $1/S$
  Voltage ($V$)                $S$

Effects on Physical Properties?
- Area
- Capacitance
- Resistance
- Threshold ($V_{th}$)
- Current ($I_d$)
- Gate Delay ($\tau_{gd}$)
- Wire Delay ($\tau_{wire}$)
- Energy
- Power
- Go through full (ideal) scaling, then come back and ask what still makes sense today.

Area
- $F \rightarrow S F$
- Area impact?
  - $A = L \times W$
  - $A \rightarrow S^2 A$
- $S = 0.7$ [0.5x per 2 nodes]
  - 32nm → 22nm
  - 50% area
  - 2x capacity in the same area
- [Figure: pitch and gate, with $L$ and $W$ marked]

Preclass
- When will we have a 100-core processor?
  - What feature size?
  - What time frame?

Capacity Scaling
- [Figure from Wikipedia]

Capacitance
- Capacitance per unit area scaling:
  - $C_{ox} = \epsilon_{SiO_2} / T_{ox}$
  - $T_{ox} \rightarrow S \times T_{ox}$
  - $C_{ox} \rightarrow C_{ox} / S$
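
The Area slide's $A \rightarrow S^2 A$ rule is enough to set up the 100-core preclass question as arithmetic. The following is a rough sketch, not the preclass answer: the helper names are hypothetical, and the starting point of 4 cores at 32nm is a made-up input chosen only to exercise the functions.

```python
import math


def area_factor(s: float) -> float:
    """Area of a fixed design after scaling linear dimensions by s."""
    return s ** 2


def capacity_factor(s: float) -> float:
    """How many more devices fit in the same area after scaling by s."""
    return 1.0 / s ** 2


def feature_size_for_capacity(start_nm: float, capacity_ratio: float) -> float:
    """Feature size needed to fit capacity_ratio times more in the same area."""
    return start_nm / math.sqrt(capacity_ratio)


def nodes_needed(capacity_ratio: float, s: float = 0.7) -> int:
    """Whole 0.7x nodes needed to reach at least capacity_ratio."""
    return math.ceil(math.log(capacity_ratio) / math.log(capacity_factor(s)))


if __name__ == "__main__":
    s = 22 / 32
    print(f"32nm -> 22nm: area x{area_factor(s):.2f}, capacity x{capacity_factor(s):.2f}")
    # ASSUMPTION (illustrative only): 4 cores fit at 32nm today.
    ratio = 100 / 4
    print(f"feature size for 100 cores: {feature_size_for_capacity(32, ratio):.1f} nm")
    print(f"nodes needed at S=0.7: {nodes_needed(ratio)}")
```

Multiplying the node count by a cycle time of 2 or 3 years per node turns it into a time-frame estimate.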
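Capacitance
- Gate capacitance
  - $C_{gate} = A \times C_{ox}$
  - $A \rightarrow S^2 A$
  - $C_{ox} \rightarrow C_{ox} / S$
  - $C_{gate} \rightarrow S \times C_{gate}$

Resistance
- $R = \rho L / (W \times t)$
- $W \rightarrow S \times W$
- $L$, $t$ scale similarly
- $R \rightarrow R / S$

Threshold Voltage
- $V_{TH} \rightarrow S \times V_{TH}$

Current
- Saturation current
  - $I_d = (\mu C_{OX} / 2)(W / L)(V_{gs} - V_{TH})^2$
- $V_{gs} = V \rightarrow S \times V$
- $V_{TH} \rightarrow S \times V_{TH}$
- $W \rightarrow S \times W$
- $L \rightarrow S \times L$
- $C_{ox} \rightarrow C_{ox} / S$
- $I_d \rightarrow S \times I_d$

Current
- Velocity saturation current scaling:
  - $V_{DSAT} \approx L \nu_{sat} / \mu_n$
  - $I_{DS} \approx \nu_{sat} C_{OX} W (V_{GS} - V_T - V_{DSAT}/2)$
- $V_{gs} = V \rightarrow S \times V$
- $V_{TH} \rightarrow S \times V_{TH}$
- $L \rightarrow S \times L$
- $V_{DSAT} \rightarrow S \times V_{DSAT}$
- $W \rightarrow S \times W$
- $C_{ox} \rightarrow C_{ox} / S$
- $I_d \rightarrow S \times I_d$

Gate Delay
- $\tau_{gd} = Q / I = (C V) / I$
- $C \rightarrow S \times C$
- $V \rightarrow S \times V$
- $I_d \rightarrow S \times I_d$
- $\tau_{gd} \rightarrow S \times \tau_{gd}$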
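
One way to sanity-check the $I_d \rightarrow S \times I_d$ and $\tau_{gd} \rightarrow S \times \tau_{gd}$ results is to plug scaled parameters straight into the saturation-current and gate-delay formulas above. This is a minimal sketch: the device values are arbitrary made-up numbers in arbitrary units, the dictionary layout and function names are hypothetical, and mobility is assumed not to scale.

```python
def saturation_current(mu, c_ox, w, l, v_gs, v_th):
    """I_d = (mu * C_ox / 2) * (W / L) * (V_gs - V_th)^2."""
    return (mu * c_ox / 2.0) * (w / l) * (v_gs - v_th) ** 2


def full_scale(device, s):
    """Apply full (Dennard) scaling to the device parameters."""
    return {
        "mu": device["mu"],          # ASSUMPTION: mobility unchanged
        "c_ox": device["c_ox"] / s,  # T_ox -> S*T_ox, so C_ox -> C_ox/S
        "w": device["w"] * s,
        "l": device["l"] * s,
        "v": device["v"] * s,
        "v_th": device["v_th"] * s,
    }


def evaluate(device):
    """Return (I_d, tau_gd) with C_gate = W * L * C_ox and tau_gd = C*V/I."""
    i_d = saturation_current(device["mu"], device["c_ox"], device["w"],
                             device["l"], device["v"], device["v_th"])
    c_gate = device["w"] * device["l"] * device["c_ox"]
    return i_d, c_gate * device["v"] / i_d


if __name__ == "__main__":
    s = 0.7
    # Arbitrary illustrative values, not real process data.
    base = {"mu": 1.0, "c_ox": 2.0, "w": 4.0, "l": 1.0, "v": 1.0, "v_th": 0.3}
    i0, t0 = evaluate(base)
    i1, t1 = evaluate(full_scale(base, s))
    print(f"I_d ratio    = {i1 / i0:.3f} (expect S = {s})")
    print(f"tau_gd ratio = {t1 / t0:.3f} (expect S = {s})")
```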
Overall Scaling Results, Transistor Speed and Leakage
- Preliminary data from 2005 ITRS
- HP = High-Performance Logic
- LOP = Low Operating Power Logic
- LSTP = Low Standby Power Logic
- [Figure: intrinsic transistor delay, $\tau = CV/I$ (lower delay = higher speed), and leakage current (HP: standby power dissipation issues), for planar bulk MOSFETs and advanced MOSFETs]
- HP → Target: 17%/yr, historical rate
- LSTP → Target: Isd,leak ~ 10 pA/µm

ITRS 2009 Transistor Speed
- [Figure: transistor speed for LOP, LSTP, and HP, with 17%/yr rate line; RO = Ring Oscillator]

Wire Delay
- Wire delay
  - $\tau_{wire} = R \times C$
  - $R \rightarrow R / S$
  - $C \rightarrow S \times C$
  - $\tau_{wire} \rightarrow \tau_{wire}$
- ...assuming (logical) wire lengths remain constant...

Impact of Wire and Gate Delay Scaling
- If gate delay scales down but wire delay does not scale, what does that suggest about the relative contribution of gate and wire delays to overall delay as we scale?

Energy
- Switching energy per operation
  - $E = \frac{1}{2} C V^2$
  - $C \rightarrow S \times C$
  - $V \rightarrow S \times V$
  - $E \rightarrow S^3 \times E$

Power Dissipation (Dynamic)
- Capacitive (dis)charging
  - $P = \frac{1}{2} C V^2 f$
  - $C \rightarrow S \times C$
  - $V \rightarrow S \times V$
  - $P \rightarrow S^3 \times P$
- Increase frequency?
  - $\tau_{gd} \rightarrow S \times \tau_{gd}$
  - So: $f \rightarrow f / S$?
  - $P \rightarrow S^2 \times P$
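
The question on the Impact of Wire and Gate Delay Scaling slide can be explored by iterating the two rules $\tau_{gd} \rightarrow S \times \tau_{gd}$ and $\tau_{wire} \rightarrow \tau_{wire}$ over several generations. A small sketch follows; the 90/10 starting split between gate and wire delay is a made-up input, and `delay_after_generations` is a hypothetical helper name.

```python
def delay_after_generations(gate_delay, wire_delay, generations, s=0.7):
    """Track gate delay, wire delay, and wire share of the total per generation.

    Gate delay shrinks by s each generation; wire delay stays constant
    (ideal scaling with constant logical wire lengths).
    """
    rows = []
    for n in range(generations + 1):
        gate = gate_delay * s ** n
        wire = wire_delay
        rows.append((n, gate, wire, wire / (gate + wire)))
    return rows


if __name__ == "__main__":
    # ASSUMPTION (illustrative only): delay starts as 90% gate, 10% wire.
    for n, gate, wire, share in delay_after_generations(0.9, 0.1, 8):
        print(f"gen {n}: gate={gate:.3f} wire={wire:.3f} wire share={share:.0%}")
```

Under these assumptions the wire share of total delay rises every generation, so overall delay improves by less than $S$.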
Effects?

  Property                      Scales by
  Area                          $S^2$
  Capacitance                   $S$
  Resistance                    $1/S$
  Threshold ($V_{th}$)          $S$
  Current ($I_d$)               $S$
  Gate Delay ($\tau_{gd}$)      $S$
  Wire Delay ($\tau_{wire}$)    $1$
  Energy                        $S^3$
  Power                         $S^2$ to $S^3$

Power Density
- $P \rightarrow S^2 P$ (increase frequency)
- $P \rightarrow S^3 P$ (dynamic, same freq.)
- $A \rightarrow S^2 A$
- Power density: $P/A$, two cases?
  - $P/A \rightarrow P/A$ (increase freq.)
  - $P/A \rightarrow S \times P/A$ (same freq.)

Cheating
- Don't like some of the implications
  - High resistance wires
  - Higher capacitance
  - Atomic-scale dimensions: quantum tunneling
  - Not scaling speed fast enough

Improving Resistance
- $R = \rho L / (W \times t)$
- $W \rightarrow S \times W$
- $L$, $t$ scale similarly
- $R \rightarrow R / S$
- What might we do?
  - Don't scale $t$ quite as fast → wires now taller than wide
  - Decrease $\rho$ (copper), introduced 1997
  - http://www.ibm.com/ibm100/us/en/icons/copperchip/

Capacitance and Leakage
- Capacitance per unit area
  - $C_{ox} = \epsilon_{SiO_2} / T_{ox}$
  - $T_{ox} \rightarrow S \times T_{ox}$
  - $C_{ox} \rightarrow C_{ox} / S$
- What's wrong with $T_{ox}$ = 1.2nm?
- [Figure; source: Borkar/Micro 2004]
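
The Effects? summary can be reproduced mechanically by tracking only the exponent of $S$ for each quantity: multiplying quantities adds exponents, and dividing subtracts them. The sketch below is one possible bookkeeping scheme, not the lecture's method; the dictionary keys and `derive_exponents` are hypothetical names, and it assumes $V_{th}$ scales with $V$ and that wire capacitance scales like gate capacitance.

```python
# Exponent of S for each primary parameter under full (ideal) scaling.
FULL_SCALING = {"L": 1, "W": 1, "T_ox": 1, "t": 1, "V": 1}


def derive_exponents(p, scale_frequency=True):
    """Derive the exponent of S for each derived property from primary ones."""
    e = {}
    e["area"] = p["L"] + p["W"]                        # A = L * W
    e["C_ox"] = -p["T_ox"]                             # C_ox = eps / T_ox
    e["C_gate"] = e["area"] + e["C_ox"]                # C_gate = A * C_ox
    e["R"] = p["L"] - p["W"] - p["t"]                  # R = rho * L / (W * t)
    # I_d ~ C_ox * (W / L) * V^2, ASSUMING V_th scales with V.
    e["I_d"] = e["C_ox"] + p["W"] - p["L"] + 2 * p["V"]
    e["tau_gd"] = e["C_gate"] + p["V"] - e["I_d"]      # tau_gd = C * V / I
    # ASSUMPTION: wire capacitance scales like gate capacitance.
    e["tau_wire"] = e["R"] + e["C_gate"]               # tau_wire = R * C
    e["energy"] = e["C_gate"] + 2 * p["V"]             # E = C * V^2 / 2
    e["f"] = -e["tau_gd"] if scale_frequency else 0    # f ~ 1 / tau_gd
    e["power"] = e["energy"] + e["f"]                  # P = E * f
    e["power_density"] = e["power"] - e["area"]        # P / A
    return e


if __name__ == "__main__":
    for label, scale_f in (("increase freq.", True), ("same freq.", False)):
        print(label)
        for name, exp in derive_exponents(FULL_SCALING, scale_f).items():
            print(f"  {name:14s} S^{exp}")
```

Setting `"V": 0` in the input models the case where voltage is not scaled, which makes it easy to see which rows of the table change.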
Capacitance and Leakage
- Capacitance per unit area
  - $C_{ox} = \epsilon_{SiO_2} / T_{ox}$
  - $T_{ox} \rightarrow S \times T_{ox}$
  - $C_{ox} \rightarrow C_{ox} / S$
- What might we do?
  - Reduce dielectric constant $\epsilon$ (interconnect)
  - Increase dielectric constant to substitute for scaling $T_{ox}$ (gate quantum tunneling)

High-K Dielectric Survey
- [Figure; source: Wong, IBM J. of R&D, V46N2/3, pp. 133-168, 2002]

Improving Gate Delay
- $\tau_{gd} = Q / I = (C V) / I$
- $V \rightarrow S \times V$
- $I_d = (\mu C_{OX} / 2)(W / L)(V_{gs} - V_{TH})^2$
- $I_d \rightarrow S \times I_d$
- $\tau_{gd} \rightarrow S \times \tau_{gd}$
- How might we accelerate?
  - Lower $C$
  - Don't scale $V$
- Don't scale $V$:
  - $V \rightarrow V$
  - $I \rightarrow I / S$
  - $\tau_{gd} \rightarrow S^2 \times \tau_{gd}$

But Power Dissipation (Dynamic)
- Capacitive (dis)charging
  - $P = \frac{1}{2} C V^2 f$
  - $V \rightarrow V$
- Increase frequency?
  - $f \rightarrow f / S^2$?
  - $P \rightarrow P / S$
- If we do not scale $V$, power dissipation does not scale down.

And Power Density
- $P \rightarrow P / S$ (increase frequency)
- But $A \rightarrow S^2 A$
- What happens to power density?
  - $P/A \rightarrow (1/S^3) \times P/A$
- Power density increases
  - ...this is where some companies have gotten into trouble

Historical Voltage Scaling
- [Figure] http://software.intel.com/en-us/articles/gigascale-integration-challenges-and-opportunities/
- Freq. and Power Density impact: HW3.1
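
To put numbers on the constant-voltage case, the sketch below applies one generation of scaling with $V$ held fixed and reports gate delay, power, and power density relative to the previous generation. It is illustrative only: the function names are hypothetical, and it assumes $C \rightarrow S \times C$ and $I \rightarrow I / S$ as on the slides above.

```python
def constant_voltage_step(s=0.7, scale_frequency=True):
    """Relative change after one generation when V is NOT scaled."""
    c = s                   # C -> S * C
    v = 1.0                 # V -> V (held constant)
    i = 1.0 / s             # I -> I / S
    tau_gd = c * v / i      # tau_gd = C * V / I  ->  S^2
    f = 1.0 / tau_gd if scale_frequency else 1.0
    power = c * v ** 2 * f  # P ~ C * V^2 * f
    area = s ** 2           # A -> S^2 * A
    return {"tau_gd": tau_gd, "power": power, "power_density": power / area}


def power_density_after(generations, s=0.7, scale_frequency=True):
    """Cumulative power density growth over several generations."""
    step = constant_voltage_step(s, scale_frequency)["power_density"]
    return step ** generations


if __name__ == "__main__":
    for label, scale_f in (("scale freq", True), ("fixed freq", False)):
        r = constant_voltage_step(0.7, scale_f)
        print(f"{label}: tau_gd x{r['tau_gd']:.2f}, power x{r['power']:.2f}, "
              f"power density x{r['power_density']:.2f}")
    print(f"power density after 4 generations (scale freq): "
          f"x{power_density_after(4):.1f}")
```

With $S = 0.7$ the per-generation power density factor is $1/S^3$ when frequency is scaled and $1/S$ when it is not, which is why this path runs into thermal limits quickly.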
µProc Clock Frequency
- [Figure: clock frequency (MHz)]
- The Future of Computing Performance: Game Over or Next Level? National Academy Press, 2011
- http://www.nap.edu/catalog.php?record_id=12980

µP Power Density
- [Figure: power density (Watts)]
- The Future of Computing Performance: Game Over or Next Level? National Academy Press, 2011
- http://www.nap.edu/catalog.php?record_id=12980

Power Density Impact
- $P/A \rightarrow (1/S^3) \times P/A$ (scale freq) or $(1/S) \times P/A$ (don't scale freq)
- Assuming no more voltage scaling, what is the impact of power density growing to hit our practical thermal density limit?

Physical Limits
- Doping?
- Features?

Physical Limits
- Depended on bulk effects
  - Doping
  - Current (many electrons)
  - Mean free path in conductor
  - Localized to conductors
- Eventually
  - Single electrons, atoms
  - Distances close enough to allow tunneling
Dopants/Transistor
- [Figure: mean number of dopant atoms (log scale) vs. technology node (nm)]

Leads to Variation
- [Figure; source: Borkar, IEEE Micro Nov.-Dec. 2005]

Conventional Scaling
- Ends in your lifetime
- Perhaps already has:
  - "Basically, this is the end of scaling." May 2005, Bernard Meyerson, V.P. and chief technologist for IBM's systems and technology group

[Figure: Bob Colwell, March 2013]

Feature Size Scaling
- [Figure: ITRS; IEEE Spectrum, Sep. 2016]

Conventional Scaling
- Ends in your lifetime
- Perhaps already:
  - "Basically, this is the end of scaling." May 2005, Bernard Meyerson, V.P. and chief technologist for IBM's systems and technology group
- Dennard scaling ended
- Feature scaling ends at 10nm?
- Transistor count integration may continue
  - E.g. 3D
Finishing Up...

Big Ideas [MSB Ideas]
- Moderately predictable VLSI scaling
  - Unprecedented capacity/capability growth for engineered systems
  - Change
  - Be prepared to exploit
  - Account for in comparing across time
  - ...but not for much longer

Big Ideas [MSB-1 Ideas]
- Uniform scaling reasonably accurate for past couple of decades
- Area (capacity) increases as $1/S^2$
  - Real capacity maybe a little less?
- Gate delay decreases ($S$)
  - ...maybe a little less
- Wire delay does not decrease, may increase
- Overall delay decreases by less than ($S$)
- Lack of $V$ scaling → power density limit
  - Cannot turn on all the transistors

Admin
- HW3 out
- HW1 graded
- Reading for Wednesday on Web