Power in Digital CMOS Circuits


Mark Horowitz
Computer Systems Laboratory, Stanford University
horowitz@stanford.edu
Copyright 2004 by Mark Horowitz

Fruits of Scaling
[Figure: SpecInt 2000 (log scale, 1 to 1000) versus year, 1985 to 2001. Processors plotted: Intel 386, 486, Pentium, Pentium 2, Pentium 3, Pentium 4, Itanium; Alpha 21064, 21164, 21264; Sparc, SuperSparc, Sparc64; Mips; HP PA; PowerPC; AMD K6, K7.]

The Darker Side of Scaling - Power
[Figure: power (log scale, 1 to 100) versus year, 1985 to 2001. Processors plotted: Intel 386, 486, Pentium, Pentium 2, Pentium 3, Pentium 4, Itanium; Alpha 21064, 21164, 21264; Sparc, SuperSparc, Sparc64; Mips; HP PA; PowerPC; AMD K6, K7.]
At least it is scaling slower than performance.

Power is Important
Three reasons we care about power:
1. Need to get the power into the chip
   60 W at 1.2 V is many amps (50 A)
2. Need to get the power out of the chip
   How low a thermal resistance is possible?
   Plastic packages without forced air have high thermal resistances
3. Energy is heavy
   Need to carry the energy: 20 Wh/lb

Important Questions
How did we end up in this situation?
  Power was not an issue in the mid 80s
  Scaling theory said power would be constant
  Energy efficiency would improve
Is there any hope for the future?
  More issues that need to be addressed?
  Silver bullets to solve the power problem?
  Techniques that will help

Power in CMOS Circuits
Dynamic power
  Proportional to $C \cdot V_{swing} \cdot V_{dd} \cdot F$
  Dominates most circuits
Static power
  $I_{dc} \cdot V_{dd}$, where $I_{dc}$ is usually leakage current now
  Has been very small, and is still small
  Issue when circuit is idle and dynamic power is zero
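The two power components on the "Power in CMOS Circuits" slide can be evaluated directly. The sketch below is only an illustration: the function names and all numeric values (switched capacitance, supply, frequency, leakage current) are hypothetical and do not come from the slides.

```python
def dynamic_power(c_farads, v_swing, vdd, freq_hz):
    """Switching power in watts: C * Vswing * Vdd * F."""
    return c_farads * v_swing * vdd * freq_hz


def static_power(i_dc_amps, vdd):
    """Static power in watts: Idc * Vdd (Idc is mostly leakage)."""
    return i_dc_amps * vdd


# Hypothetical chip: 20 nF switched per cycle, full-rail swing, 1.2 V, 2 GHz.
C_SWITCHED = 20e-9
VDD = 1.2
FREQ = 2e9
I_LEAK = 2.0  # amps, hypothetical

p_dyn = dynamic_power(C_SWITCHED, VDD, VDD, FREQ)
p_stat = static_power(I_LEAK, VDD)

print(f"dynamic: {p_dyn:.1f} W, static: {p_stat:.1f} W")
print(f"static share while active: {p_stat / (p_dyn + p_stat):.1%}")
print(f"power when idle (F = 0): {static_power(I_LEAK, VDD):.1f} W")
```

With these assumed numbers the dynamic term is 57.6 W and the static term 2.4 W, which matches the slide's point that static power only dominates once the circuit is idle.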

Historical Power Scaling
In current technology shrinks, X, Y, V all scale by $\alpha$ (about 0.7 per generation)
  Implies that C also scales
If scale a chip to a new technology, operate at F
  C and V both scale, so power DECREASES by $\alpha^3$
  Power decreases by 3x for each technology generation
If scale a chip to a new technology, operate at $F/\alpha$
  C and V both scale down, F scales up
  So power DECREASES by $\alpha^2$
  1.4 times faster chip, for 1/2 the power
  Every time you move to a new technology

Power Scaling - Static Power
X, Y, V all scale
  Gate oxide is scaling faster than $\alpha$
    Now have leakage current through the gate
    GIDL is another issue
Scaling Vth
  Performance depends on Vdd/Vth ratio
  Transistor leakage tied to Vth
    Leakage current increases exponentially
  If Vth does not scale
    Leakage power scales as $\alpha^2$
  $I_{leak} \propto I_s \, e^{-V_{th} / (\alpha v_T)}$, where $v_T$ is the thermal voltage
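The two cases on the "Historical Power Scaling" slide follow from $P = C V^2 F$ when C and V both shrink by the same factor. A minimal sketch, assuming $\alpha = 0.7$ per generation (inferred from the slide's "1.4 times faster" figure); the function name is hypothetical:

```python
def scaled_power_ratio(alpha, freq_ratio):
    """New power / old power after one shrink, using P = C * V^2 * F.

    C scales by alpha, V scales by alpha, F scales by freq_ratio.
    """
    return alpha * alpha ** 2 * freq_ratio


ALPHA = 0.7  # assumed linear shrink per generation

same_frequency = scaled_power_ratio(ALPHA, 1.0)           # alpha^3
faster_clock = scaled_power_ratio(ALPHA, 1.0 / ALPHA)     # alpha^2

print(f"same F:      power x {same_frequency:.3f}")
print(f"F / alpha:   power x {faster_clock:.3f}")
```

This gives 0.343 at the same frequency (roughly the 3x reduction quoted) and 0.49 when the clock is raised by $1/\alpha$ (about half the power for a chip that is about 1.4 times faster).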

New Problem - Leakage
Scaling Vth
  From the 0.5µ generation, Vdd = 10 V/µ * L
    Seems like this scaling is still on track
  Vth was Vth = 1.4 V/µ * L
    But that would mean that 0.18µ has 250 mV Vth
    Leakage issues
Vdd/Vth of gates is starting to fall
  Performance of gates will drop
    Haven't seen this yet, since technologies are pushed
  Variability of delay will increase
    Subtracting two large numbers to get overdrive
Additional leakage paths
  Gate tunnel current

Dilemma - What Vdd and Vth
Lower Vth
  Need less Vdd
    Less dynamic power
  More leakage current
Correct choice depends on operating condition
[Figure: gate delay as a function of Vdd and Vth.]
Also, at low Vdd to Vth ratios the variation in delay will be larger
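The rule-of-thumb values on the "New Problem - Leakage" slide can be tabulated, together with the leakage increase they would imply. The 10 V/µ and 1.4 V/µ coefficients are from the slide; the thermal voltage, the slope factor, and the list of generations are assumptions made only for this sketch.

```python
import math

VDD_PER_UM = 10.0        # from the slide: Vdd = 10 V/um * L
VTH_PER_UM = 1.4         # from the slide: Vth = 1.4 V/um * L
THERMAL_VOLTAGE = 0.026  # volts, roughly kT/q near room temperature (assumed)
SLOPE_FACTOR = 1.5       # hypothetical subthreshold slope factor


def supply_and_threshold(length_um):
    """Return (Vdd, Vth) in volts predicted by the linear scaling rules."""
    return VDD_PER_UM * length_um, VTH_PER_UM * length_um


def relative_leakage(vth, vth_ref):
    """Leakage at vth relative to vth_ref, using exp(-Vth / (n * vT))."""
    return math.exp((vth_ref - vth) / (SLOPE_FACTOR * THERMAL_VOLTAGE))


_, vth_half_micron = supply_and_threshold(0.5)
for length_um in (0.5, 0.35, 0.25, 0.18):  # hypothetical list of generations
    vdd, vth = supply_and_threshold(length_um)
    growth = relative_leakage(vth, vth_half_micron)
    print(f"L={length_um:.2f}um  Vdd={vdd:.2f}V  Vth={vth:.3f}V  "
          f"leakage x{growth:.3g} vs 0.5um")
```

At 0.18µ the rule gives Vth = 0.252 V, the roughly 250 mV the slide mentions. Under the assumed slope factor the exponential term grows by several orders of magnitude over those generations, which is why Vth cannot keep following the rule.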

Energy-Delay Trade-offs
[Figure: lines of constant delay placed on top of a contour plot of energy.]
Much lower energy/op if operate at low Vdd, Vth
  If, of course, the circuit is active

Do We Have a Power Problem?
That depends on your point of view:
Cost per operation is MUCH cheaper at same F
  Look at what we do in cell phones / laptop computers
  This cost will continue to fall
Cost per operation is cheaper even at higher F
  Get this reduced cost even when you run part faster
But we are greedy
  Want machine to run faster than technology scaling
  Build the most complex machines possible
  This combination means power has been growing
Power and performance are related
  Faster often means much more power
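The "Energy-Delay Trade-offs" slide describes choosing Vdd and Vth along a line of constant delay. The sketch below mimics that with a coarse grid search. Everything in it is an assumption made for illustration: the delay model (an alpha-power-law style expression), the normalized energy model, and every constant. It is not the model behind the slide's plot.

```python
import math

# All constants are hypothetical, normalized values.
THERMAL_VOLTAGE = 0.026  # volts, roughly kT/q near room temperature
SLOPE_FACTOR = 1.5       # subthreshold slope factor
VELOCITY_EXP = 1.5       # exponent in the assumed delay model
LEAK_SCALE = 50.0        # leakage strength relative to switching energy


def gate_delay(vdd, vth):
    """Assumed delay model: Vdd / (Vdd - Vth)^a, infinite below threshold."""
    if vdd <= vth:
        return math.inf
    return vdd / (vdd - vth) ** VELOCITY_EXP


def energy_per_op(vdd, vth):
    """Switching energy (C normalized to 1) plus leakage over one delay."""
    switching = vdd ** 2
    leak_current = LEAK_SCALE * math.exp(-vth / (SLOPE_FACTOR * THERMAL_VOLTAGE))
    return switching + leak_current * vdd * gate_delay(vdd, vth)


def min_energy_point(max_delay):
    """Grid-search (Vdd, Vth) for the lowest energy meeting the delay bound."""
    best = None
    for i in range(30, 201):          # Vdd from 0.30 V to 2.00 V
        vdd = i / 100
        for j in range(5, 61):        # Vth from 0.05 V to 0.60 V
            vth = j / 100
            if gate_delay(vdd, vth) > max_delay:
                continue
            energy = energy_per_op(vdd, vth)
            if best is None or energy < best[0]:
                best = (energy, vdd, vth)
    return best


fast = min_energy_point(max_delay=1.0)
slow = min_energy_point(max_delay=4.0)
for label, (energy, vdd, vth) in (("fast", fast), ("slow", slow)):
    print(f"{label}: energy/op={energy:.3f}  Vdd={vdd:.2f}  Vth={vth:.2f}")
```

Under these assumptions, relaxing the delay target lets the search settle on a lower supply voltage and a lower energy per operation, which is the qualitative behavior the slide describes.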

Future Scaling Will Be Much Worse
If Vdd does not scale
  Energy/gate scales only as $\alpha$
    This is because C scales
This means that:
  Power of gate will be constant if F increases

Architecture
Convert transistors to performance
  Use transistors to exploit parallelism
  Or create it (speculate)
Processor generations
  Simple machine
  Pipelined
  Super-scalar
  Out-of-order
  Speculation
Each design has more logic to accomplish same task
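The "Future Scaling Will Be Much Worse" argument can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: $\alpha = 0.7$ per generation, frequency rising as $1/\alpha$, and an ideal shrink that packs $1/\alpha^2$ more gates into the same area. The power-density line is an inference added here, not a statement from the slide.

```python
def fixed_vdd_generation(alpha):
    """Per-generation ratios when C scales by alpha but Vdd stays fixed."""
    energy_per_gate = alpha                 # E = C * V^2, only C scales
    frequency = 1.0 / alpha                 # assumed clock speed-up
    power_per_gate = energy_per_gate * frequency
    gates_per_area = 1.0 / alpha ** 2       # assumed ideal area shrink
    return {
        "energy_per_gate": energy_per_gate,
        "power_per_gate": power_per_gate,
        "power_density": power_per_gate * gates_per_area,
    }


ALPHA = 0.7  # assumed linear shrink per generation
ratios = fixed_vdd_generation(ALPHA)
for name, value in ratios.items():
    print(f"{name}: x {value:.2f}")
```

Power per gate comes out constant, as the slide states, so under the assumed area shrink the power per unit area roughly doubles each generation.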

Architecture Scaling
Plot of IPC
  Real IPC
  Compiler
Hardware
  Grows rapidly
  More FU
  Wires don't shrink
[Figure: IPC (0.00 to 0.05) versus date, Jan-85 to Jan-00, for the 80386, 80486, Pentium, Pentium II, Pentium III, and Pentium 4.]

[Figure: SpecInt/MHz (log scale, 0.01 to 1.00) versus year, 1985 to 2001.]

Clock Frequency Scaling
[Figure: clock frequency (log scale, 10 to 10000) versus year, 1985 to 2001.]

Clock Cycle in FO4
[Figure: clock cycle in FO4 (log scale, 10 to 100) versus year, 1985 to 2005; the Alpha is labeled.]

Power Scaling
Complexity of chip is scaling as $\alpha^2$
Freq is scaling as $1/\alpha^2$
  Means we have more flops/gate
  Higher average power per gate
Power should scale faster than $1/\alpha$
  For the biggest / fastest chips
This scaling has changed the rules of design
  It is not sustainable
  Processors have a limited power budget
  Fastest processor for a given power budget

Now What?
The power issue is more complex
  Power efficiency of digital hardware is improving
    Lots of hardware now can run on small batteries
  But demands are growing faster
    Power on top-end processors is growing
Technology scaling is becoming more interesting
  Scaling now has its own set of trade-offs
  Interesting techniques to deal with these issues

Evaluating Power Efficiency
Want to select the most energy efficient solution
  Can't use power, $P = C V^2 F$
    Lowering the operating frequency lowers the power
Many people use Mips/W
  This is really an energy metric
    Mips/W = reciprocal of Joules/million instructions
  Energy metrics are $k C V^2$
    Lower voltage will increase this metric
    Look at the Xscale processor
      Highest Mips/Watt for lowest supply voltage
Need to plot performance and power

Energy Performance Graphs
Use two axes
  Power or Energy/Op
  Performance
Normalize for technology
  Energy scales as $\alpha^3$
Makes tradeoff clearer
  At performance level: choose lowest energy
  At power level: choose highest performance
[Figure: Watt/Spec (0.00 to 1.60) versus Spec95 (0.00 to 25.00).]
Most processors are roughly on a line of 25 Spec^2/W in 98
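The "Evaluating Power Efficiency" slide argues that neither power nor Mips/W is a good single metric. The sketch below shows why, using $P = k C V^2 F$ and assuming performance proportional to F at a fixed IPC. The activity factor, capacitance, voltages, and frequencies are hypothetical.

```python
import math


def power_watts(k, c_farads, vdd, freq_hz):
    """P = k * C * V^2 * F."""
    return k * c_farads * vdd ** 2 * freq_hz


def mips(freq_hz, ipc=1.0):
    """Millions of instructions per second at a fixed, assumed IPC."""
    return freq_hz * ipc / 1e6


def mips_per_watt(k, c_farads, vdd, freq_hz):
    return mips(freq_hz) / power_watts(k, c_farads, vdd, freq_hz)


K, C = 0.2, 1e-9  # hypothetical activity factor and capacitance

base = mips_per_watt(K, C, vdd=1.5, freq_hz=600e6)
half_clock = mips_per_watt(K, C, vdd=1.5, freq_hz=300e6)
low_voltage = mips_per_watt(K, C, vdd=1.0, freq_hz=300e6)

print(f"power at 600 MHz: {power_watts(K, C, 1.5, 600e6):.3f} W")
print(f"power at 300 MHz: {power_watts(K, C, 1.5, 300e6):.3f} W")
print(f"Mips/W: base={base:.0f}  half clock={half_clock:.0f}  "
      f"low voltage={low_voltage:.0f}")
```

Halving the clock halves the power but leaves Mips/W unchanged, while dropping the supply from 1.5 V to 1.0 V improves Mips/W by $1.5^2 = 2.25$ regardless of performance. That is why the slide asks for performance and power to be plotted together.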

Energy/Instruction vs. Performance
[Figure: Watts/Spec (0 to 0.4) versus Spec2000 (0 to 800). Processors plotted: Intel 386, 486, Pentium, Pentium 2, Pentium 3, Pentium 4, Itanium; Alpha 21064, 21164, 21264; Sparc, SuperSparc, Sparc64; Mips; HP PA; PowerPC; AMD K6, K7.]

Energy/Instruction vs. Performance
[Figure: Watts/Spec (log scale, 0.01 to 1) versus Spec2000 (log scale, 1 to 10000).]

Truths About Power
Power is superlinear in required performance
  Lower performance is lower Energy/Op
  Means you can trade excess performance for power
Many low power solutions are really high-performance algorithms run slowly
  Key enabler of low power signal processing
    Create parallel solution, and then voltage scale
Create new algorithms that use less computing
  Best way to save power is to do fewer ops
  System level power management

Low Power Design
Define your problem at the correct level
Architectural changes make the most difference
  Turning the RF section off most of the time is easier than building a low power RF design
  Doing fewer OPs is easier than building low power OPs
  Doing OPs in parallel is lower power than sequentially
Technology also makes a huge difference
  Power scales by about 3 to 4x each generation
  Mobile parts almost always use the best process
Circuits is mostly about not wasting power
  Use the right supply, Vth, etc. for the job
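The "create parallel solution, and then voltage scale" point can be made concrete with a small calculation. This sketch assumes a simple frequency model, $f \propto (V_{dd} - V_{th})^{a} / V_{dd}$, a fixed Vth, and a 10% capacitance overhead for the parallel version. All of these are illustrative assumptions, not values from the slides.

```python
VTH = 0.3                 # volts, hypothetical
VELOCITY_EXP = 1.5        # exponent in the assumed frequency model
PARALLEL_OVERHEAD = 0.10  # hypothetical extra capacitance for muxing/routing


def max_frequency(vdd):
    """Assumed normalized maximum clock rate at a given supply."""
    if vdd <= VTH:
        return 0.0
    return (vdd - VTH) ** VELOCITY_EXP / vdd


def vdd_for_frequency(target, lo=VTH, hi=3.0):
    """Bisect for the lowest supply that still reaches the target rate."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if max_frequency(mid) < target:
            lo = mid
        else:
            hi = mid
    return hi


def parallel_power_ratio(units, vdd_ref=1.2):
    """Power of `units` slow copies relative to one fast unit, same throughput."""
    f_ref = max_frequency(vdd_ref)
    vdd_parallel = vdd_for_frequency(f_ref / units)
    # Total switched capacitance per second is unchanged (units * C * f / units),
    # so only the V^2 term and the overhead change the power.
    ratio = (1 + PARALLEL_OVERHEAD) * (vdd_parallel / vdd_ref) ** 2
    return vdd_parallel, ratio


vdd_two, ratio_two = parallel_power_ratio(units=2)
print(f"two units at Vdd={vdd_two:.3f} V use {ratio_two:.0%} of the original power")
```

Under this model two half-speed units deliver the same throughput at well under half the power, even after the assumed overhead. Leakage is ignored here, so the benefit is overstated for designs where static power matters.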

Solution - Multiple Flavors of Transistors
New technologies have many different transistors
  Vth, gate oxide
Get to select which technology to use
  Highest performance, low leakage current, etc.
Often have a couple of transistor types
  Dual gate oxides, multiple Vths for the transistors
Designer
  Needs to choose the transistor type
  Can choose the supply voltage too

Solution - Multiple Levels for Vdd
Basic concept
  Gates off the critical path run at VDDL (reduced voltage)
  Gates on the critical path run at VDDH (higher voltage)
  Minimize # of level converters
[Figure: CVS structure [Usami98a], showing a VDDH-only cluster on the critical path, a VDDL cluster, and level converters.]

Problem With Static Approach
Two issues with setting Vdd and Vth at design time
  Optimal point changes with operating task
  Variability of devices changes what you create
Might want to have periods of high performance
  Adaptive supplies
Might have periods of low activity
  Adaptive threshold
But how do you do this when you have variability?
  This is still an open question

Adaptive Power-Supply Regulator
[Figure: block diagram with labeled blocks: reference frequency fref, reference circuit, controller, buck converter (L, C) generating Vdd, and the digital system.]
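The regulator on the "Adaptive Power-Supply Regulator" slide can be imitated in a few lines. This is a behavioral sketch only: the reference circuit is replaced by an assumed frequency model, the controller is a plain integral loop, and the buck converter is treated as ideal. None of the names or constants come from the slide.

```python
VTH = 0.3                      # volts, hypothetical
VELOCITY_EXP = 1.5             # exponent in the assumed frequency model
VDD_MIN, VDD_MAX = 0.45, 1.8   # hypothetical regulator output range


def replica_frequency(vdd):
    """Assumed normalized speed of the reference circuit at a given supply."""
    if vdd <= VTH:
        return 0.0
    return (vdd - VTH) ** VELOCITY_EXP / vdd


def regulate(f_ref, vdd=1.2, gain=0.1, steps=300):
    """Integral control: nudge Vdd until the replica runs at f_ref."""
    for _ in range(steps):
        error = (f_ref - replica_frequency(vdd)) / f_ref
        vdd = min(max(vdd + gain * error, VDD_MIN), VDD_MAX)
    return vdd


f_full = replica_frequency(1.2)
vdd_half = regulate(f_ref=f_full / 2)
print(f"half-speed operation settles at Vdd = {vdd_half:.3f} V")
print(f"dynamic energy per op falls to {(vdd_half / 1.2) ** 2:.0%} of the 1.2 V value")
```

Because the loop tracks the replica's measured speed rather than a fixed table, it would also follow process and temperature shifts, provided the replica matches the real critical path.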

Adaptive Threshold Control
People have used substrate voltage
  Both forward and reverse bias
  Impedance control is the big issue
    Have strap transistors to tie to Vdd, Gnd
    Only drive the lines when in standby
Other groups have tried to control Vth more directly
  Adjust Ion to Ioff ratios to optimize performance
Problem is how to measure leakage
  If devices don't match, your control device might not be correct, and the optimal point depends on activity anyhow

Crazy Idea - Measure Power Directly
Assume you can change Vdd and Vth
  Why not use a simple adaptive algorithm?
Build a system that has a tracking Vdd control loop
  Vdd adapts to the value needed to run at F
Add a small modulation to Vthn, Vthp
  Measure power of the system as it is running
  Measure the change in power w.r.t. Vthn, Vthp using a synchronous detector
  This will tell whether lowering Vthn will increase or decrease power for this chip
Adapt Vthn, Vthp in a slow loop
  Naturally will track activity ratio, temp, etc.
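The "Crazy Idea" slide outlines a nested control scheme: a fast loop sets Vdd for the target frequency, and a slow loop dithers Vth and follows the measured power gradient. The sketch below simulates that idea with a single Vth instead of separate Vthn and Vthp. The power and frequency models, every constant, and the function names are assumptions made for illustration.

```python
import math

# All constants are hypothetical, normalized values.
THERMAL_VOLTAGE = 0.026
SLOPE_FACTOR = 1.5
VELOCITY_EXP = 1.5
LEAK_SCALE = 20.0
F_TARGET = (1.2 - 0.3) ** VELOCITY_EXP / 1.2   # speed at Vdd=1.2, Vth=0.3


def max_frequency(vdd, vth):
    if vdd <= vth:
        return 0.0
    return (vdd - vth) ** VELOCITY_EXP / vdd


def tracked_vdd(vth):
    """Stand-in for the tracking Vdd loop: lowest Vdd that meets F_TARGET."""
    lo, hi = vth, 3.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if max_frequency(mid, vth) < F_TARGET:
            lo = mid
        else:
            hi = mid
    return hi


def system_power(vth):
    """'Measured' power: switching (C*F normalized to 1) plus leakage."""
    vdd = tracked_vdd(vth)
    dynamic = vdd ** 2
    leakage = LEAK_SCALE * math.exp(-vth / (SLOPE_FACTOR * THERMAL_VOLTAGE)) * vdd
    return dynamic + leakage


def detect_gradient(vth, delta=0.005, cycles=8):
    """Synchronous detection: correlate power with a +/- square-wave dither."""
    acc = 0.0
    for i in range(cycles):
        sign = 1.0 if i % 2 == 0 else -1.0
        acc += sign * system_power(vth + sign * delta)
    return acc / (cycles * delta)


def adapt_vth(vth=0.30, step=0.001, iterations=300):
    """Slow loop: move Vth against the detected power gradient."""
    for _ in range(iterations):
        vth -= step * detect_gradient(vth)
        vth = min(max(vth, 0.05), 0.45)
    return vth


vth_final = adapt_vth()
print(f"Vth settles at {vth_final:.3f} V, Vdd at {tracked_vdd(vth_final):.3f} V")
print(f"power: {system_power(0.30):.3f} -> {system_power(vth_final):.3f}")
```

The loop never needs a leakage model or a matched monitor device, which is the appeal the slide points to. It only needs a power measurement that is clean enough for the dither to be detected.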

Summary
Power efficiency of CMOS logic scales
  Power per function is scaling about 3-4x per generation
  This scaling might slow down in a couple of generations
    Trouble with scaling Vdd if Vth does not scale
    Use adaptation to become more efficient with voltages
Power is a large problem for current IC designers
  We now have the capability to build very hot chips
    Since we can build very complex circuits in small spaces
  Need to balance performance and power
  Best way to lower power is to find a solution that needs less stuff done, or can use parallelism