EE241 - Spring 2005
Advanced Digital Integrated Circuits
Lecture 10: Power Intro

Admin
- Project Phase 2 is due Monday, March 14, 5pm (by e-mail to jan@eecs.berkeley.edu and huifangq@eecs.berkeley.edu).
- Should be a 3-page (max) double-column, conference-style paper (submitted in Word or PDF). No fonts smaller than 10 point.
- Should describe the motivation and goals of your project, describe what you have learned so far from studying the background material and from your own analysis, and spell out what you expect to do by the end of the semester, that is, how you will evaluate and/or demonstrate your results.
- Two lectures on Wednesday afternoon (2pm and 3:30pm). No lecture next Monday.
Wrapping up HS: Other Logic Styles
- Dominant logic styles: static, PTL, dynamic
- The search for other options is continuously going on (noise margins, leakage, higher performance)
- The balance is shifting with every new technology generation

Sense-Amplifying Logic
[Figure: Matsui, JSSC 12/94]
GHz Logic with Sense Amplifiers
[Figure: Takahashi, JSSC 5/99]

Read-out scheme
[Figure]
Implemented Macros
[Figure]

Rotator (ROT)
[Figure]
Incrementer (INC)
[Figure]

Current-Mode Logic (CML)
[Figure: M. Mizuno, JSSC 6/96]
Current-Mode Logic (CML), continued
[Figures]
Optimization for Power

The Importance of Power Awareness
- Crucial for portable applications
  - Determines battery lifetime
  - Increased amount of computation
- Crucial for high-performance applications
  - Determines cooling and energy costs
  - Many designs today are power limited
  - Still need maximum performance
The Power Challenge
[Figure. Sources: Roger Schmidt, IBM Corp.; K. Yazawa, Sony]

Mobility: Battery storage is the limiting factor
- Little change in basic technology: store energy using a chemical reaction
- Battery capacity doubles every 10 years
- Energy density/size and safe handling are the limiting factors

Energy density (kWh/kg of material):
  Gasoline      14
  Lead-acid     0.04
  Li polymer    0.15
Battery Progress
[Figure: energy density (Wh/kg, 0 to 160) versus year of first commercial use (1940 to 2010), with trend line. Chemistries shown: NiCd, SLA, NiMH, Li-Ion, reusable alkaline, Li-polymer.]
- Factor of 4 over the last 10 years!

Fuel cells may increase stored energy by more than an order of magnitude
[Figure, courtesy Toshiba: duration (hours, 1 to 10,000) versus output (W, 1 to 1000) for lithium-ion batteries and direct methanol fuel cells. Applications marked: cellular, PDA, note PC.]
What can one do with 1 cm^3? Energy Storage

  Source               J/cm^3    µW/cm^3/year
  Micro fuel cell      3500      110
  Primary battery      2880      90
  Secondary battery    1080      34
  Ultra-capacitor      100       3.2

What can one do with 1 cm^3? Energy Generation

  Source               µW/cm^3
  Solar (outside)      15,000
  Air flow             380
  Human power          330
  Vibration            200
  Temperature          40
  Pressure var.        17
  Solar (inside)       10
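The µW/cm^3/year column of the energy-storage table reads as the average power available when the stored energy is drained evenly over one year. A minimal Python sketch of that conversion follows; the function and constant names are illustrative and not part of the lecture, and a 365-day year is assumed.

```python
# Sketch: average power if a stored energy budget is drained evenly over
# one year. Energy densities are the J/cm^3 figures from the storage table.
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s (leap years ignored)

STORAGE_J_PER_CM3 = {
    "Micro fuel cell": 3500,
    "Primary battery": 2880,
    "Secondary battery": 1080,
    "Ultra-capacitor": 100,
}

def average_power_uw(energy_j, lifetime_s=SECONDS_PER_YEAR):
    """Average power in microwatts for a given energy (J) and lifetime (s)."""
    return energy_j / lifetime_s * 1e6

for source, energy in STORAGE_J_PER_CM3.items():
    print(f"{source:18s} {average_power_uw(energy):6.1f} uW/cm^3 over one year")
```

The results land close to the rounded values in the table (about 111, 91, 34 and 3.2 µW/cm^3).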
What can one do with 1 cm^3? Reference case: the human brain
- P_avg(brain) = 20 W (20% of the total dissipation, 2% of the weight)
- Power density: ~15 mW/cm^3
- Nerve cells are only 4% of brain volume
- Average neuron density: 70 million/cm^3

What can one do with 1 cm^3? Perform computations
- 300 million 4-input NAND gates (90 nm)
- 7 million Xilinx gates (90 nm)
- Assuming a 500 MHz clock frequency, 1 V V_DD, a fanout of 4 and 10% activity: 15 Peta gate-ops/sec @ 45 W
- Reducing the supply voltage to 0.2 V and the clock rate to 10 MHz: 300 Tera gate-ops/sec @ 40 mW
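The computation figures can be checked with a few lines of arithmetic. The sketch below assumes that throughput is gates × clock × activity and that power scales as V_DD^2 × f with switched capacitance and activity unchanged and leakage ignored. These are simplifying assumptions, and the function names are illustrative.

```python
# Sketch: sanity check of the 1 cm^3 computation example.
def gate_ops_per_sec(num_gates, clock_hz, activity):
    """Gate operations per second: each gate switches in a fraction of cycles."""
    return num_gates * clock_hz * activity

def scaled_dynamic_power(p_ref_w, vdd_ref, f_ref_hz, vdd_new, f_new_hz):
    """Scale a reference dynamic power by (V_new/V_ref)^2 * (f_new/f_ref).

    Assumes the switched capacitance and activity stay the same and
    ignores leakage, which is a simplification.
    """
    return p_ref_w * (vdd_new / vdd_ref) ** 2 * (f_new_hz / f_ref_hz)

# 300 million gates, 500 MHz, 10% activity
print(gate_ops_per_sec(300e6, 500e6, 0.10))            # about 1.5e16 gate-ops/s
# 45 W at 1 V / 500 MHz, scaled to 0.2 V / 10 MHz
print(scaled_dynamic_power(45.0, 1.0, 500e6, 0.2, 10e6))  # about 0.036 W
```

The scaled power comes out near 36 mW, consistent with the roughly 40 mW quoted on the slide.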
Outline
1. Know your enemy: power consumption in CMOS
2. Leakage is here to stay
3. Power and performance are tightly coupled and have to be jointly optimized
4. Principles of power minimization

1. Know Your Enemy: Where does power go in CMOS?
- Switching power: charging capacitors
- Leakage power: transistors are imperfect switches
- Short-circuit power: both pull-up and pull-down on during a transition
- Static currents: biasing currents
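Dynamic Power Consumption
[Figure: static CMOS gate with inputs A_1 ... A_N, a PMOS network and an NMOS network between V_DD and ground, and supply current i_L charging the load C_L at V_out]
  $E_{0\to1} = C_L V_{DD}^2$
  $E_R = \frac{1}{2} C_L V_{DD}^2$
  $E_C = \frac{1}{2} C_L V_{DD}^2$
- One half of the energy drawn from the supply is dissipated in the pull-up network and one half is stored on C_L
- The charge on C_L is dumped during the 1→0 transition

Circuits with Reduced Swing
[Figure: load C_L charged from V_DD only up to V_DD - V_Th]
  $E_{0\to1} = C_L V_{DD} (V_{DD} - V_{Th})$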
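The energy expressions for full-swing and reduced-swing charging follow from integrating the supply current over a 0→1 transition. The derivation below is a standard sketch rather than a transcription of the slide; it assumes an ideal, constant load capacitance $C_L$ and writes the charging current as $i_L = C_L \, dv_{out}/dt$.

```latex
\begin{align}
E_{0\to1} &= \int_0^{\infty} i_L(t)\, V_{DD}\, dt
           = V_{DD} \int_0^{V_{DD}} C_L \, dv_{out}
           = C_L V_{DD}^2 \\
E_C       &= \int_0^{\infty} i_L(t)\, v_{out}(t)\, dt
           = C_L \int_0^{V_{DD}} v_{out} \, dv_{out}
           = \tfrac{1}{2} C_L V_{DD}^2 \\
E_R       &= E_{0\to1} - E_C = \tfrac{1}{2} C_L V_{DD}^2 \\
\intertext{With the output swing limited to $V_{DD} - V_{Th}$, only the upper integration limit changes:}
E_{0\to1}^{\text{reduced}} &= V_{DD} \int_0^{V_{DD} - V_{Th}} C_L \, dv_{out}
           = C_L V_{DD} \left(V_{DD} - V_{Th}\right)
\end{align}
```

The result does not depend on the resistance of the pull-up network, which is why sizing alone changes delay but not the energy per transition for a fixed $C_L$.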
Dynamic Power Consumption
Power = Energy/transition × Transition rate
  $P = C_L V_{DD}^2 f_{0\to1} = C_L V_{DD}^2 f P_{0\to1} = C_{switched} V_{DD}^2 f$
- Power dissipation is data dependent: it depends on the switching probability
- Switched capacitance: $C_{switched} = C_L P_{0\to1}$

Transition Activity and Power
- Energy consumed in N cycles, E_N:
  $E_N = C_L V_{DD}^2 \, n_{0\to1}$
  where $n_{0\to1}$ is the number of 0→1 transitions in N cycles
  $P_{avg} = \lim_{N\to\infty} \frac{E_N}{N} f = \left( \lim_{N\to\infty} \frac{n_{0\to1}}{N} \right) C_L V_{DD}^2 f$
  $\alpha_{0\to1} = \lim_{N\to\infty} \frac{n_{0\to1}}{N}$
  $P_{avg} = \alpha_{0\to1} C_L V_{DD}^2 f$
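For a finite observation window, the activity factor can be estimated by counting 0→1 transitions in a sampled node waveform and then plugging it into P_avg = α C_L V_DD^2 f. The Python sketch below is illustrative only; the sample waveform, the 10 fF load and the 1 GHz clock are made-up values.

```python
# Sketch: estimate the activity factor from a bit sequence and evaluate
# P_avg = alpha * C_L * V_DD^2 * f. All numeric inputs are hypothetical.
def count_rising_transitions(bits):
    """Number of 0 -> 1 transitions between consecutive samples."""
    return sum(1 for prev, cur in zip(bits, bits[1:]) if prev == 0 and cur == 1)

def activity_factor(bits):
    """Finite-N estimate of alpha: rising transitions per clock cycle."""
    cycles = len(bits) - 1
    if cycles < 1:
        raise ValueError("need at least two samples")
    return count_rising_transitions(bits) / cycles

def average_dynamic_power(alpha, c_load_f, vdd_v, freq_hz):
    """Average switching power in watts."""
    return alpha * c_load_f * vdd_v ** 2 * freq_hz

node = [0, 1, 1, 0, 1, 0, 0, 1, 0]        # one sample per clock cycle
alpha = activity_factor(node)             # 3 rising edges in 8 cycles
print(alpha, average_dynamic_power(alpha, 10e-15, 1.0, 1e9))
```

The estimate converges to the limit definition of α only as the window grows, so short traces give noisy values.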
Type of Logic Function: NOR vs. XOR
Example: Static 2-input NOR gate

  A  B  Out
  0  0  1
  0  1  0
  1  0  0
  1  1  0

- If inputs switch every cycle, assume signal probabilities $p_{A=1} = 1/2$, $p_{B=1} = 1/2$
- Then the transition probability is
  $p_{0\to1} = p_{Out=0} \times p_{Out=1} = 3/4 \times 1/4 = 3/16$
  $\alpha_{0\to1} = 3/16$

Type of Logic Function: NOR vs. XOR
Example: Static 2-input XOR gate

  A  B  Out
  0  0  0
  0  1  1
  1  0  1
  1  1  0

- If inputs switch every cycle, assume signal probabilities $p_{A=1} = 1/2$, $p_{B=1} = 1/2$
- Then the transition probability is
  $p_{0\to1} = p_{Out=0} \times p_{Out=1} = 1/2 \times 1/2 = 1/4$
  $\alpha_{0\to1} = 1/4$
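The NOR and XOR numbers can be reproduced by enumerating the truth table. The sketch below assumes equally likely, independent inputs and no correlation between consecutive cycles, as in the two examples; the helper name is illustrative.

```python
# Sketch: p(0->1) = p(Out=0) * p(Out=1) from a gate's truth table,
# assuming uniform, independent inputs.
from fractions import Fraction
from itertools import product

def transition_probability(gate, num_inputs=2):
    """Return p(0->1) for a static gate given as a Python function of bits."""
    outputs = [gate(*bits) for bits in product((0, 1), repeat=num_inputs)]
    p_one = Fraction(sum(outputs), len(outputs))
    return (1 - p_one) * p_one

def nor2(a, b):
    return int(not (a or b))

def xor2(a, b):
    return a ^ b

print(transition_probability(nor2))  # 3/16
print(transition_probability(xor2))  # 1/4
```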
Transition Probabilities
  $P_{0\to1}(\text{NOR, NAND}) = (2^N - 1) / 2^{2N}$
  $P_{0\to1}(\text{XOR}) = 1/4$

Transition Probabilities for Basic Gates

  Gate   $p_{0\to1}$
  AND    $(1 - p_A p_B)\, p_A p_B$
  OR     $(1 - p_A)(1 - p_B)\,(1 - (1 - p_A)(1 - p_B))$
  XOR    $(1 - (p_A + p_B - 2 p_A p_B))\,(p_A + p_B - 2 p_A p_B)$

- Transition probabilities for static CMOS gates: $p_{0\to1} = p_0 \, p_1$
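The closed-form entries are most useful when the input probabilities are not 1/2. A small Python sketch of the three two-input formulas and the N-input NOR/NAND expression follows, assuming independent inputs; the function names and the skewed example values are illustrative.

```python
# Sketch: closed-form 0->1 transition probabilities for static CMOS gates
# with independent inputs of signal probability p_a and p_b.
from fractions import Fraction

def p01_and(p_a, p_b):
    p_one = p_a * p_b
    return (1 - p_one) * p_one

def p01_or(p_a, p_b):
    p_zero = (1 - p_a) * (1 - p_b)
    return p_zero * (1 - p_zero)

def p01_xor(p_a, p_b):
    p_one = p_a + p_b - 2 * p_a * p_b
    return (1 - p_one) * p_one

def p01_nor_nand_uniform(n):
    """N-input NOR or NAND with all input probabilities equal to 1/2."""
    return Fraction(2 ** n - 1, 2 ** (2 * n))

half = Fraction(1, 2)
print(p01_and(half, half), p01_or(half, half), p01_xor(half, half))
print(p01_and(Fraction(1, 10), half))   # skewed input: 19/400
print(p01_nor_nand_uniform(3))          # 7/64
```

With a rarely-high input (p_A = 1/10) the AND gate's activity drops well below the 3/16 obtained for uniform inputs, which is the data dependence the slides refer to.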
Problem: Reconvergent Fanout
[Figure: inputs A and B drive node X; X and B reconverge at output Z]
  $P(Z = 1) = P(B = 1) \cdot P(X = 1 \mid B = 1)$
- Becomes complex and intractable fast

Inter-Signal Correlations
[Figure: two example circuits with signals A, B, C and Z, one without and one with reconvergent fanout]
- Logic without reconvergent fanout: $p_{0\to1} = (1 - p_A p_B)\, p_A p_B$
- Logic with reconvergent fanout: $P(Z = 1) = p(C = 1 \mid B = 1)\, p(B = 1)$, giving $p_{0\to1} = 0$
- Need to use conditional probabilities to model inter-signal correlations
- CAD tools are required for such analysis
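A small enumeration shows why the independence assumption breaks under reconvergent fanout. The circuit below is hypothetical (X = A OR B, Z = X AND B) and is not the one drawn on the slides; it is chosen only because B reaches Z along two paths. Inputs are assumed uniform and independent.

```python
# Sketch: exact vs. naive signal probability with reconvergent fanout.
# Hypothetical circuit: X = A OR B, Z = X AND B.
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
PATTERNS = list(product((0, 1), repeat=2))  # all (A, B) pairs

def gate_x(a, b):
    return a | b

def gate_z(x, b):
    return x & b

def exact_p_z():
    """Enumerate primary inputs; correlations are handled implicitly."""
    ones = sum(gate_z(gate_x(a, b), b) for a, b in PATTERNS)
    return Fraction(ones, len(PATTERNS))

def naive_p_z():
    """Wrongly treat X and B as independent: P(Z=1) = P(X=1) * P(B=1)."""
    p_x = Fraction(sum(gate_x(a, b) for a, b in PATTERNS), len(PATTERNS))
    return p_x * HALF

def conditional_p_z():
    """P(Z=1) = P(B=1) * P(X=1 | B=1), the form used on the slide."""
    x_given_b1 = [gate_x(a, 1) for a in (0, 1)]
    return HALF * Fraction(sum(x_given_b1), len(x_given_b1))

print(exact_p_z(), naive_p_z(), conditional_p_z())  # 1/2 3/8 1/2
```

Exhaustive enumeration grows as 2^n in the number of primary inputs, which is one reason the analysis is left to CAD tools.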
Glitching in Static CMOS
[Figure: gate network with inputs A, B, C, internal node X and output Z. Waveforms for ABC switching from 101 to 000 show X and Z changing one gate delay apart.]
- Also known as dynamic hazards
- The result is correct, but extra power is dissipated

Example: Chain of NOR Gates
[Figure: voltage (V, 0.0 to 3.0) versus time (ps, 0 to 600) for outputs Out 1 through Out 8 of the chain]
Short Circuit Current
[Figure: CMOS inverter with input V_in, output V_out and load C_L. Large load: I_sc ≈ 0. Small load: I_sc = I_MAX. Plot of I_sc (A, ×10^-4) versus time for C_L = 20 fF, 100 fF and 500 fF.]
- Short circuit current is usually well controlled

2. Transistors Leak
- Drain leakage
  - Diffusion currents
  - Drain-induced barrier lowering
- Junction leakages
  - Gate-induced drain leakage
- Gate leakage
  - Tunneling currents through thin oxide
Transistor Leakage
[Figure: log I_DS (log A, -9 to -3) versus V_GS (V, 0 to 1.2) at V_DS = 1.2 V. Inset: transistor with terminals G, S, D, Sub and capacitances C_i and C_d.]
  Subthreshold slope: $S = \frac{kT}{q} \ln 10 \left( 1 + \frac{C_d}{C_i} \right)$
- Drain leakage current is exponential in V_GS
- Subthreshold slope is ~70 mV/dec

Transistor Leakage
[Figure: I_DS (nA, 0 to 8) versus V_DS (V, 0 to 1.4). Annotation: 3-10x in current technologies.]
- Two effects:
  - diffusion current (like a bipolar transistor)
  - exponential increase with V_DS (DIBL)
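The subthreshold slope expression S = (kT/q) ln10 (1 + C_d/C_i) is easy to evaluate numerically. In the sketch below the temperature of 300 K and the capacitance ratio of 0.18 are assumptions chosen for illustration (the ratio is picked so that the result lands near the ~70 mV/dec quoted on the slide); they are not values given in the lecture.

```python
# Sketch: subthreshold slope S = (kT/q) * ln(10) * (1 + Cd/Ci).
import math

BOLTZMANN_J_PER_K = 1.380649e-23
ELEMENTARY_CHARGE_C = 1.602176634e-19

def subthreshold_slope_mv_per_dec(temp_k, cd_over_ci):
    """Gate voltage change (mV) needed for a 10x change in drain current."""
    thermal_voltage = BOLTZMANN_J_PER_K * temp_k / ELEMENTARY_CHARGE_C
    return thermal_voltage * math.log(10) * (1 + cd_over_ci) * 1e3

print(subthreshold_slope_mv_per_dec(300, 0.0))   # ideal limit, about 59.5
print(subthreshold_slope_mv_per_dec(300, 0.18))  # about 70 (assumed ratio)
```

The first call gives the familiar room-temperature limit of roughly 60 mV/dec, which no capacitance ratio can improve on in this model.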
Subthreshold Current
- Subthreshold behavior can be modeled physically:
  $I_{ds} = \mu \frac{W}{L} \left( \frac{kT}{q} \right)^2 e^{\frac{V_g - V_{Th}}{m kT/q}} \left( 1 - e^{-\frac{V_{ds}}{kT/q}} \right)$
- Or simplified to:
  $I_{ds} = I_0 \frac{W}{W_0} 10^{\frac{V_{gs} - V_{Th} + \gamma V_{ds}}{S}}$

From a design perspective
- Leakage is an exponential function of V_T
- Leakage dependence upon V_DD:
  - Initially quite linear
  - Goes up exponentially for larger voltages due to DIBL
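The simplified model I_ds = I_0 (W/W_0) 10^((V_gs - V_Th + γ V_ds)/S) makes the exponential dependence on threshold voltage easy to see. The Python sketch below uses made-up parameter values (I_0, γ, S and the two thresholds are all hypothetical) purely to illustrate the trend.

```python
# Sketch: simplified subthreshold current model. All parameter values
# used in the example calls are hypothetical.
def subthreshold_current(i0_a, w_over_w0, vgs_v, vth_v, vds_v, gamma, slope_v):
    """I_ds = I0 * (W/W0) * 10 ** ((Vgs - Vth + gamma * Vds) / S)."""
    exponent = (vgs_v - vth_v + gamma * vds_v) / slope_v
    return i0_a * w_over_w0 * 10 ** exponent

# Off-state leakage (Vgs = 0) at two thresholds that differ by one S.
high_vt = subthreshold_current(1e-7, 1.0, 0.0, 0.35, 1.0, 0.1, 0.07)
low_vt = subthreshold_current(1e-7, 1.0, 0.0, 0.28, 1.0, 0.1, 0.07)
print(low_vt / high_vt)  # close to 10: one decade per S of threshold shift
```

The same exponent shows the DIBL effect: raising V_ds by S/γ also costs a decade of leakage in this model.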
Gate Leakage Trends
- Tunneling at thin oxides
[Figure: courtesy of IEEE Press, New York, 2000]

Gate Tunneling
[Figure: transistor biased between V_DD and 0 V, showing the currents I_GD, I_GS, I_SUB and I_Leak]
  $I_{GD} \sim e^{-T_{ox}} e^{V_{gd}}$, $I_{GS} \sim e^{-T_{ox}} e^{V_{gs}}$
- Independent of the sub-threshold leakage
- Contributes to the total leakage
- Modeled in BSIM4
- Also in BSIM3v3, but foundries usually do not include it
- NMOS gate leakage is usually worse than PMOS
Power/Energy Optimization Space

              Constant Throughput/Latency              Variable Throughput/Latency
  Energy      Design Time           Sleep Mode         Run Time
  Active      Logic design,         Clock gating       DFS, DVS
              scaled V_DD,
              TSizing,
              multi-V_DD
  Leakage     Stack effects,        Sleep T's,         + Variable V_T
              + multi-V_T           multi-V_DD,
                                    variable V_T,
                                    + input control

Reducing active power
- Downsizing transistors (C_L): slows down logic
- Lowering the supply voltage (V_DD): slows down logic
- Reducing swing: slows down the succeeding stage
- Reducing frequency (f): does not reduce energy
- Reducing switching activity (α):
  - Logic restructuring
  - Reducing glitching
  - Balancing logic
  $P_{dyn} \sim \alpha C_L V_{swing} V_{DD} f$
  $E \sim \alpha C_L V_{swing} V_{DD}$
Relationship Between Power and Delay
  Power: $P = p_t f_{CLK} C_L V_{DD}^2 + I_0 10^{-V_{TH}/S} V_{DD}$
  Delay: $D = \frac{k C_L V_{DD}}{(V_{DD} - V_{TH})^{1.3}}$
[Figure, from Kuroda: surface plots of power (W) and delay (s) versus V_DD (V) and V_TH (V), with operating points A and B marked]
- Power is reduced while delay is unchanged if both V_DD and V_TH are lowered, such as from A to B.

Reducing Active Power
- Downsizing or lowering the supply on the critical path will lower the operating frequency
- Downsize non-critical paths
  - Narrows down the path delay distribution
  - Increases the impact of variations
[Figure: path count versus delay, showing the original delay distribution and the target delay]
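The A-to-B move on the power and delay surfaces can be imitated numerically with the two expressions from Kuroda. The sketch below picks a lower V_DD, solves the delay equation for the V_TH that keeps delay unchanged, and compares total power. Every numeric parameter (p_t, f_CLK, C_L, I_0, S, k and the two operating points) is a hypothetical value chosen for illustration, not data from the lecture.

```python
# Sketch: power/delay model
#   P = pt * f * C * Vdd^2 + I0 * 10^(-Vth/S) * Vdd
#   D = k * C * Vdd / (Vdd - Vth)^alpha, with alpha = 1.3
ALPHA = 1.3

def power(pt, f_clk, c_load, vdd, vth, i0, slope):
    dynamic = pt * f_clk * c_load * vdd ** 2
    leakage = i0 * 10 ** (-vth / slope) * vdd
    return dynamic + leakage

def delay(k, c_load, vdd, vth, alpha=ALPHA):
    return k * c_load * vdd / (vdd - vth) ** alpha

def vth_for_delay(target_delay, k, c_load, vdd, alpha=ALPHA):
    """Invert the delay equation: threshold that meets target_delay at vdd."""
    return vdd - (k * c_load * vdd / target_delay) ** (1 / alpha)

# Hypothetical technology and circuit parameters.
PT, F_CLK, C_LOAD, I0, SLOPE, K = 0.1, 100e6, 1e-12, 1e-5, 0.1, 100.0

vdd_a, vth_a = 2.0, 0.6                       # point A (assumed)
d_a = delay(K, C_LOAD, vdd_a, vth_a)
vdd_b = 1.5                                   # point B: lower supply
vth_b = vth_for_delay(d_a, K, C_LOAD, vdd_b)  # lower threshold, same delay

print(vth_b)
print(power(PT, F_CLK, C_LOAD, vdd_a, vth_a, I0, SLOPE))
print(power(PT, F_CLK, C_LOAD, vdd_b, vth_b, I0, SLOPE))
```

With these assumed numbers the dynamic term dominates, so power falls at B even though leakage rises; with a much larger I_0 or a smaller S margin the leakage term could reverse the conclusion.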
Multi-Level Approach
- Energy minimization subject to a delay constraint
- Optimal trade-off between energy and area
[Figure: design levels (architecture, micro-architecture, circuit (logic & FFs)) with trade-off labels Energy-Area (Cost), Performance, Energy-Performance and Energy-Delay]

Literature
Books:
- J. Rabaey, A. Chandrakasan, B. Nikolic, Digital Integrated Circuits: A Design Perspective, 2nd ed., Prentice Hall, 2003.
- A. Chandrakasan, W. Bowhill, F. Fox (eds.), Design of High-Performance Microprocessor Circuits, IEEE Press, 2001.
  - Chapter 4, Low-Voltage Technologies, by Kuroda and Sakurai
  - Chapter 3, Techniques for Leakage Power Reduction, by De, et al.
- A. Chandrakasan and R. Brodersen, Low Power CMOS Design, Kluwer Academic Publishers, 1995.
- J. Rabaey and M. Pedram (eds.), Low Power Design Methodologies, Kluwer Academic Publishers, 1995. 2nd ed., 2002.
- A. Chandrakasan and R. Brodersen, Low-Power CMOS Design, IEEE Press, 1998 (reprint volume).
Literature
Articles:
- A. P. Chandrakasan and R. W. Brodersen, "Minimizing power consumption in digital CMOS circuits," Proceedings of the IEEE, no. 4, pp. 498-523, April 1995.
- A. P. Chandrakasan, S. Sheng, R. W. Brodersen, "Low-power CMOS digital design," IEEE Journal of Solid-State Circuits, vol. 27, no. 4, pp. 473-484, April 1992.
- T. Kuroda, T. Sakurai, "Overview of low-power ULSI circuit techniques," IEICE Trans. on Electronics, vol. E78-C, no. 4, pp. 334-344, April 1995.
- S. Borkar, "Design challenges of technology scaling," IEEE Micro, vol. 19, no. 4, pp. 23-29, July-Aug. 1999.