Why Power Matters PE/EE 47, PE 57 VLSI Design I L5: Power and Designing for Low Power Department of Electrical and omputer Engineering University of labama in Huntsville leksandar Milenkovic ( www.ece.uah.edu/~milenka ) www.ece.uah.edu/~milenka/cpe57-5f Packaging costs Power supply rail design hip and system cooling costs Noise immunity and system reliability attery life (in portable systems) Environmental concerns Office equipment accounted for 5% of total US commercial energy usage in 993 Star compliant systems /8/5 VLSI Design I;. Milenkovic Why worry about power? -- Power Dissipation Problem Illustration Lead microprocessors power continues to increase P6 Pentium 86 486 886 386 885 88 88 44 Power (Watts). 97 974 978 985 99 Year Power delivery and dissipation will be prohibitive Source: orkar, De Intel /8/5 VLSI Design I;. Milenkovic 3 /8/5 VLSI Design I;. Milenkovic 4 attery (4+ lbs) Why worry about power? attery Size/Weight Nominal apacity (W-hr/lb) 5 4 3 65 7 75 8 85 9 95 Expected battery lifetime increase over the next 5 years: 3 to 4% Rechargable Lithium Nickel-admium Ni-Metal Hydride Year From Rabaey,, 995 /8/5 VLSI Design I;. Milenkovic 5 Why worry about power? -- Standby Power Drain leakage will increase as V T decreases to maintain noise margins and meet frequency demands, leading to excessive battery draining standby power consumption. 8KW Standby Power Year Power supply V dd (V) Threshold V T (V) 5% 4% 3% % % W 88W.5.4 4W 5..4.7KW 8.35.5 % Source: orkar, De Intel 4 6 8 /8/5 VLSI Design I;. Milenkovic 6.9.7.3 4.6 and phones leaky! VLSI Design I;. Milenkovic
better better Power and Figures of Merit Power consumption in Watts determines battery life in hours Peak power determines power ground wiring designs sets packaging limits impacts signal noise margin and reliability analysis efficiency in Joules rate at which power is consumed over time = power * delay Joules = Watts * seconds lower energy number means less power to perform a computation at the same frequency Watts Watts Power versus Power is height of curve Lower power design could simply be slower pproach pproach time is area under curve Two approaches require the same energy pproach pproach time /8/5 VLSI Design I;. Milenkovic 7 /8/5 VLSI Design I;. Milenkovic 8 PDP and EDP Power-delay product (PDP) = P av * t p = ( )/ PDP is the average energy consumed per switching event (Watts * sec = Joule) lower power design could simply be a slower design -delay product (EDP) = PDP * t p = P av * t p 5 EDP is the average energy energy-delay consumed multiplied by the computation time required takes into account that one can trade increased delay energy for lower energy/operation 5 (e.g., via supply voltage scaling that increases delay, delay but decreases energy consumption).5 allows one to understand tradeoffs better.5.5 Vdd (V) /8/5 VLSI Design I;. Milenkovic 9 -Delay (normalized) Understanding Tradeoffs Which design is the best (fastest, coolest, both)? b c a d /Delay better /8/5 VLSI Design I;. Milenkovic Understanding Tradeoffs MOS & Power Equations Which design is the best (fastest, coolest, both)? E = P + t sc I peak P + I leakage d b c a Lower EDP f = P * f clock P = f + t sc I peak f + I leakage /Delay better /8/5 VLSI Design I;. Milenkovic Dynamic power Short-circuit power Leakage power /8/5 VLSI Design I;. Milenkovic VLSI Design I;. Milenkovic
Dynamic Power onsumption Vdd Vin Pop Quiz onsider a.5 micron chip, 5 MHz clock, average load cap of 5fF/gate (fanout of 4),.5V supply. Dynamic Power consumption per gate is?? /transition = * * P P dyn = /transition * f = * * P * f P dyn = EFF * * f where EFF = P Not a function of transistor sizes! Data dependent - a function of switching activity! f /8/5 VLSI Design I;. Milenkovic 3 With million gates (assuming each transitions every clock) Dynamic Power of entire chip =??. /8/5 VLSI Design I;. Milenkovic 4 Lowering Dynamic Power Short ircuit Power onsumption apacitance: Function of fan-out, wire length, transistor sizes Supply Voltage: Has been dropping with successive generations Vin I sc P dyn = P f ctivity factor: How often, on average, do wires switch? lock frequency: Increasing Finite slope of the input signal causes a direct current path between and GND for a short period of time during switching when both the NMOS and PMOS transistors are conducting. /8/5 VLSI Design I;. Milenkovic 5 /8/5 VLSI Design I;. Milenkovic 6 Short ircuit urrents Determinates Impact of on P sc E sc = t sc I peak P I sc I sc I max P sc = t sc I peak f Vin Vin Duration and slope of the input signal, t sc I peak determined by the saturation current of the P and N transistors which depend on their sizes, process technology, temperature, etc. strong function of the ratio between input and output slopes a function of Large capacitive load Output fall time significantly larger than input rise time. Small capacitive load Output fall time substantially smaller than the input rise time. /8/5 VLSI Design I;. Milenkovic 7 /8/5 VLSI Design I;. Milenkovic 8 VLSI Design I;. Milenkovic 3
I peak () x -4.5.5.5 I peak as a Function of = ff = ff = 5 ff 4 6 x -.5 - time (sec) 5 psec input slope When load capacitance is small, I peak is large. Short circuit dissipation is minimized by matching the rise/fall times of the input and output signals - slope engineering. /8/5 VLSI Design I;. Milenkovic 9 P normalized 8 7 6 5 4 3 P sc as a Function of Rise/Fall Times = 3.3 V =.5 V =.5V When load capacitance is small (t sin /t sout > for > V) the power is dominated by P sc If < V Tn + V Tp then P sc is eliminated since both devices are never on at the same time. t sin /t sout 4 W/L p =.5 µm/.5 µm normalized wrt zero input W/L n =.375 µm/.5 µm rise-time dissipation = 3 ff /8/5 VLSI Design I;. Milenkovic Leakage (Static) Power onsumption Gate leakage I leakage Drain junction leakage Sub-threshold current Sub-threshold current is the dominant factor. ll increase exponentially with temperature! /8/5 VLSI Design I;. Milenkovic - ID () -7 Leakage as a Function of V T ontinued scaling of supply voltage and the subsequent scaling of threshold voltage will make subthreshold conduction a dominant component of power dissipation. VT=.4V VT=.V n 9mV/decade V T roll-off - so each 55mV increase in V T gives 3 orders of magnitude reduction in leakage (but adversely affects performance) -..4.6.8 VGS (V) /8/5 VLSI Design I;. Milenkovic TSM Processes Leakage and V T Exponential Increase in Leakage urrents V dd T ox (effective) L gate I DSat (n/p) (µ/µm) I off (leakage) (ρ/µm) V Tn FET Perf. (GHz) L8 G.8 V 4 Å.6 µm 6/6.4 V 3 L8 LP.8 V 4 Å.6 µm 5/8.6.63 V L8 ULP.8 V 4 Å.8 µm 3/3.5.73 V 4 L8 HS V 4 Å.3 µm 78/36 3.4 V 43 L5 HS.5 V 9 Å. µm 86/37,8.9 V 5 L3 HS. V 4 Å.8 µm 9/4 3,.5 V 8 I leakage (n/µm) 3 4 5 6 7 8 9 Temp().5.8.3. From MPR, /8/5 VLSI Design I;. Milenkovic 3 From De,999 /8/5 VLSI Design I;. Milenkovic 4 VLSI Design I;. Milenkovic 4
Review: & Power Equations Power and Design Space E = P I leakage + t sc I peak P + P = f + t sc I peak f + Dynamic power (~9% today and decreasing relatively) f = P * f clock Short-circuit power (~8% today and decreasing absolutely) I leakage Leakage power (~% today and increasing) ctive Leakage onstant Design Time Logic Design Reduced V dd Sizing + Multi-V T Non-active Modules lock Gating Sleep Transistors Variable V T Variable Run Time DFS, DVS (Dynamic Freq, Voltage Scaling) + Variable V T /8/5 VLSI Design I;. Milenkovic 5 /8/5 VLSI Design I;. Milenkovic 6 Dynamic Power as a Function of Device Size Device sizing affects dynamic energy consumption gain is largest for networks with large overall effective fan-outs (F = / g, ) The optimal gate sizing.5 factor (f) for dynamic energy F= is smaller than the one for performance, especially for F= large F s F=5 e.g., for F=, f opt (energy) = 3.53 while f opt (performance) = 4.47 If energy is a concern avoid oversizing beyond the optimal 3 4 5 6 7 From Nikolic, U /8/5 VLSI Design I;. Milenkovic 7 normalized energy.5 f F= F= Dynamic Power onsumption is Data Dependent Switching activity, P, has two components static component function of the logic topology dynamic component function of the timing behavior (glitching) Static transition probability P = P out= x P out= -input NOR Gate = P Out x (-P ) With input signal probabilities P = = / P = = / NOR static transition probability = 3/4 x /4 = 3/6 /8/5 VLSI Design I;. Milenkovic 8 NOR Gate Transition Probabilities Switching activity is a strong function of the input signal statistics P and P are the probabilities that inputs and are one P P = P x P = (-(-P )(-P )) (-P )(-P ) /8/5 VLSI Design I;. Milenkovic 9 P Transition Probabilities for Some asic Gates P = P out= x P out= NOR (-(-P )( - P )) x ( - P )( - P ) OR ( - P )( - P ) x (-(-P )( - P )) NND P P x ( - P P ) ND ( - P P ) x P P OR ( - (P + P -P P )) x (P + P -P P ).5.5 For : P = For : P = /8/5 VLSI Design I;. Milenkovic 3 VLSI Design I;. Milenkovic 5
Transition Probabilities for Some asic Gates NOR OR NND P = P out= x P out= (-(-P )( - P )) x ( - P )( - P ) ( - P )( - P ) x ( -( -P )( - P )) P P x ( - P P ) ND ( - P P ) x P P OR ( - (P + P -P P )) x (P + P -P P ).5.5 For : P = P x P = (-P ) P =.5 x.5 =.5 For : P = P x P = (-P P ) P P = ( (.5 x.5)) x (.5 x.5) = 3/6 /8/5 VLSI Design I;. Milenkovic 3 Inter-signal orrelations Determining switching activity is complicated by the fact that signals exhibit correlation in space and time reconvergent fan-out.5.5 Reconvergent fan-out P(=) = P(=) & P(= =) Have to use conditional probabilities /8/5 VLSI Design I;. Milenkovic 3.5.5 Inter-signal orrelations Determining switching activity is complicated by the fact that signals exhibit correlation in space and time reconvergent fan-out (-.5)(-.5)x(-(-.5)(-.5)) = 3/6 P(=) = P(=) * P(= =) =.5 * =.5 Reconvergent P(=) = P(=)*P(= =) =.5 P(->) =.5*.5 =.5 P(=) = P(=) & P(= =) Have to use conditional probabilities Logic Restructuring Logic restructuring: changing the topology of a logic network to reduce transitions ND: P = P x P = ( - P P ) x P P.5.5 (-.5)*.5 = 3/6 W 7/64.5 5/56.5.5.5 D F D.5.5 3/6 Y 3/6 hain implementation has a lower overall switching activity than the tree implementation for random inputs Ignores glitching effects 5/56 F /8/5 VLSI Design I;. Milenkovic 33 /8/5 VLSI Design I;. Milenkovic 34 Input Ordering Input Ordering.5.. F...5 F.5. (-.5x.)x(.5x.)=.9 F. (-.x.)x(.x.)=.96.. F.5 eneficial to postpone the introduction of signals with a high transition rate (signals with signal probability close to.5) eneficial to postpone the introduction of signals with a high transition rate (signals with signal probability close to.5) /8/5 VLSI Design I;. Milenkovic 35 /8/5 VLSI Design I;. Milenkovic 36 VLSI Design I;. Milenkovic 6
Glitching in Static MOS Networks Gates have a nonzero propagation delay resulting in spurious transitions or glitches (dynamic hazards) glitch: node exhibits multiple transitions in a single cycle before settling to the correct logic value Glitching in Static MOS Networks Gates have a nonzero propagation delay resulting in spurious transitions or glitches (dynamic hazards) glitch: node exhibits multiple transitions in a single cycle before settling to the correct logic value Unit Delay /8/5 VLSI Design I;. Milenkovic 37 Unit Delay /8/5 VLSI Design I;. Milenkovic 38 Glitching in an R in S5 S4 S S S 3 S Output Voltage (V) S3 S4 S5 in S S5 S S S 4 6 8 /8/5 VLSI Design Time I;. (ps) Milenkovic 39 alanced Delay Paths to Reduce Glitching Glitching is due to a mismatch in the path lengths in the logic network; if all input signals of a gate change simultaneously, no glitching occurs F F F 3 So equalize the lengths of timing paths through logic /8/5 VLSI Design I;. Milenkovic 4 F F F 3 Power and Design Space Dynamic Power as a Function of ctive Leakage onstant Design Time Logic Design Reduced V dd Sizing + Multi-V T Non-active Modules lock Gating Sleep Transistors Variable V T Variable Run Time DFS, DVS (Dynamic Freq, Voltage Scaling) + Variable V T Decreasing the decreases dynamic energy consumption (quadratically) ut, increases gate delay (decreases performance) t p(normalized) 5.5 5 4.5 4 3.5 3.5.5.8..4.6.8..4 (V) Determine the critical path(s) at design time and use high for the transistors on those paths for speed. Use a lower on the other gates, especially those that drive large capacitances (as this yields the largest energy benefits). /8/5 VLSI Design I;. Milenkovic 4 /8/5 VLSI Design I;. Milenkovic 4 VLSI Design I;. Milenkovic 7
Multiple onsiderations How many? Two is becoming common Many chips already have two supplies (one for core and one for I/O) When combining multiple supplies, level converters are required whenever a module at the lower supply drives a gate at the higher supply (step-up) If a gate supplied with L drives a gate at H, the PMOS never turns off H The cross-coupled PMOS transistors do the level conversion The NMOS transistor operate on a V out VDDL reduced supply V in Level converters are not needed for a step-down change in voltage Overhead of level converters can be mitigated by doing conversions at register boundaries and embedding the level conversion inside the flipflop /8/5 VLSI Design I;. Milenkovic 43 Dual-Supply Inside a Logic lock Minimum energy consumption is achieved if all logic paths are critical (have the same delay) lustered voltage-scaling Each path starts with H and switches to L (gray logic gates) when delay slack is available Level conversion is done in the flipflops at the end of the paths /8/5 VLSI Design I;. Milenkovic 44 ctive Leakage Power and Design Space onstant Design Time Logic Design Reduced V dd Sizing + Multi-V T Non-active Modules lock Gating Sleep Transistors Variable V T Variable Run Time DFS, DVS (Dynamic Freq, Voltage Scaling) + Variable V T /8/5 VLSI Design I;. Milenkovic 45 Stack Effect Leakage is a function of the circuit topology and the value of the inputs V T = V T + γ( -φ F + V S - -φ F ) where V T is the threshold voltage at V S = ; V S is the sourcebulk (substrate) voltage; γ is the body-effect coefficient V V T ln(+n) V GS =V S = -V Out V GS =V S = -V T V GS =V S = V V SG =V S = /8/5 VLSI Design I;. Milenkovic 46 I SU Leakage is least when = = Leakage reduction due to stacked transistors is called the stack effect Short hannel Factors and Stack Effect In short-channel devices, the subthreshold leakage current depends on V GS,V S and V DS. The V T of a short-channel device decreases with increasing V DS due to DIL (drain-induced barrier loading). Typical values for DIL are to 5mV change in V T per voltage change in V DS so the stack effect is even more significant for short-channel devices. V reduces the drain-source voltage of the top nfet, increasing its V T and lowering its leakage For our.5 micron technology, V settles to ~mv in steady state so V S = -mv and V DS = -mv which is times smaller than the leakage of a device with V S = mv and V DS = Leakage as a Function of Design Time V T Reducing the V T increases the subthreshold leakage current (exponentially) 9mV reduction in V T increases leakage by an order of magnitude ut, reducing V T decreases gate delay (increases performance) ID () VT=.4V VT=.V..4.6.8 VGS (V) Determine the critical path(s) at design time and use low V T devices on the transistors on those paths for speed. Use a high V T on the other logic for leakage control. careful assignment of V T s can reduce the leakage by as much as 8% /8/5 VLSI Design I;. Milenkovic 47 /8/5 VLSI Design I;. Milenkovic 48 VLSI Design I;. Milenkovic 8
V T (V) Dual-Thresholds Inside a Logic lock Variable V T () at Run Time Minimum energy consumption is achieved if all logic paths are critical (have the same delay) Use lower threshold on timing-critical paths ssignment can be done on a per gate or transistor basis; no clustering of the logic is needed No level converters are needed /8/5 VLSI Design I;. Milenkovic 49 V T = V T + γ( -φ F + V S - -φ F ) For an n-channel device, the substrate is normally tied to ground (V S = ) negative bias on V S causes V T to increase djusting the substrate bias at run time is called adaptive body-biasing () Requires a dual well fab process -.5 - -.5 - -.5 V S (V) /8/5 VLSI Design I;. Milenkovic 5.9.85.8.75.7.65.6.55.5.45.4 VLSI Design I;. Milenkovic 9