Chapter 8. Low-Power VLSI Design Methodology

Similar documents
EE241 - Spring 2000 Advanced Digital Integrated Circuits. Announcements

MODULE III PHYSICAL DESIGN ISSUES

EE241 - Spring 2001 Advanced Digital Integrated Circuits

EE 466/586 VLSI Design. Partha Pande School of EECS Washington State University

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis

Digital Integrated Circuits A Design Perspective

Issues on Timing and Clocking

Power Dissipation. Where Does Power Go in CMOS?

EE115C Digital Electronic Circuits Homework #4

EE141Microelettronica. CMOS Logic

Nyquist-Rate A/D Converters

Name: Answers. Mean: 83, Standard Deviation: 12 Q1 Q2 Q3 Q4 Q5 Q6 Total. ESE370 Fall 2015

ECE 407 Computer Aided Design for Electronic Systems. Simulation. Instructor: Maria K. Michael. Overview

Digital Integrated Circuits A Design Perspective

Digital Integrated Circuits Designing Combinational Logic Circuits. Fuyuzhuo

Pass-Transistor Logic

C.K. Ken Yang UCLA Courtesy of MAH EE 215B

Digital Integrated Circuits A Design Perspective

9/18/2008 GMU, ECE 680 Physical VLSI Design

Chapter 5 CMOS Logic Gate Design

Where Does Power Go in CMOS?

Objective and Outline. Acknowledgement. Objective: Power Components. Outline: 1) Acknowledgements. Section 4: Power Components

Topics. Dynamic CMOS Sequential Design Memory and Control. John A. Chandy Dept. of Electrical and Computer Engineering University of Connecticut

Lecture 7 Circuit Delay, Area and Power

Nyquist-Rate D/A Converters. D/A Converter Basics.

Lecture 8-1. Low Power Design

CMPEN 411 VLSI Digital Circuits Spring Lecture 19: Adder Design

Lecture 2: CMOS technology. Energy-aware computing

Properties of CMOS Gates Snapshot

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

A Gray Code Based Time-to-Digital Converter Architecture and its FPGA Implementation

Semiconductor Memories

Spiral 2 7. Capacitance, Delay and Sizing. Mark Redekopp

GMU, ECE 680 Physical VLSI Design 1

School of EECS Seoul National University

Very Large Scale Integration (VLSI)

Design for Manufacturability and Power Estimation. Physical issues verification (DSM)

ESE570 Spring University of Pennsylvania Department of Electrical and System Engineering Digital Integrated Cicruits AND VLSI Fundamentals

Design of System Elements. Basics of VLSI

MODULE 5 Chapter 7. Clocked Storage Elements

CMPEN 411 VLSI Digital Circuits Spring Lecture 14: Designing for Low Power

EE 466/586 VLSI Design. Partha Pande School of EECS Washington State University

EECS 151/251A Homework 5

CSE140L: Components and Design Techniques for Digital Systems Lab. FSMs. Instructor: Mohsen Imani. Slides from Tajana Simunic Rosing

Digital Integrated Circuits A Design Perspective. Semiconductor. Memories. Memories

! Charge Leakage/Charge Sharing. " Domino Logic Design Considerations. ! Logic Comparisons. ! Memory. " Classification. " ROM Memories.

A Novel LUT Using Quaternary Logic

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

ELEC516 Digital VLSI System Design and Design Automation (spring, 2010) Assignment 4 Reference solution

Lecture 23. Dealing with Interconnect. Impact of Interconnect Parasitics

VLSI Design I; A. Milenkovic 1

Announcements. EE141- Fall 2002 Lecture 7. MOS Capacitances Inverter Delay Power

EECS 427 Lecture 11: Power and Energy Reading: EECS 427 F09 Lecture Reminders

Semiconductor Memory Classification

Review Problem 1. should be on. door state, false if light should be on when a door is open. v Describe when the dome/interior light of the car

VLSI Design. [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] ECE 4121 VLSI DEsign.1

Delay and Energy Consumption Analysis of Conventional SRAM

Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier

Chapter 7. VLSI System Components

Digital Integrated Circuits A Design Perspective

THE INVERTER. Inverter

CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues

EECS 141 F01 Lecture 17

Introduction to CMOS VLSI Design (E158) Lecture 20: Low Power Design

Lecture 9: Clocking, Clock Skew, Clock Jitter, Clock Distribution and some FM

COMP 103. Lecture 16. Dynamic Logic

MTJ-Based Nonvolatile Logic-in-Memory Architecture and Its Application

Lecture 6 Power Zhuo Feng. Z. Feng MTU EE4800 CMOS Digital IC Design & Analysis 2010

KINGS COLLEGE OF ENGINEERING DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING QUESTION BANK

Lecture 5. MOS Inverter: Switching Characteristics and Interconnection Effects

EECS 141: FALL 05 MIDTERM 1

Combinational Logic Design

Lecture Outline. ESE 570: Digital Integrated Circuits and VLSI Fundamentals. Restore Output. Pass Transistor Logic. How compare.

CMOS Digital Integrated Circuits Lec 13 Semiconductor Memories

EEC 118 Lecture #6: CMOS Logic. Rajeevan Amirtharajah University of California, Davis Jeff Parkhurst Intel Corporation

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006

ECE251. VLSI System Design

L15: Custom and ASIC VLSI Integration

Solution (a) We can draw Karnaugh maps for NS1, NS0 and OUT:

EEC 118 Lecture #5: CMOS Inverter AC Characteristics. Rajeevan Amirtharajah University of California, Davis Jeff Parkhurst Intel Corporation

Static CMOS Circuits. Example 1

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Sciences

ECE 438: Digital Integrated Circuits Assignment #4 Solution The Inverter

High-to-Low Propagation Delay t PHL

VLSI Signal Processing

ESE570 Spring University of Pennsylvania Department of Electrical and System Engineering Digital Integrated Cicruits AND VLSI Fundamentals

A novel Capacitor Array based Digital to Analog Converter

Static CMOS Circuits

S.E. Sem. III [ETRX] Digital Circuit Design. t phl. Fig.: Input and output voltage waveforms to define propagation delay times.

EE115C Winter 2017 Digital Electronic Circuits. Lecture 6: Power Consumption

EEC 216 Lecture #2: Metrics and Logic Level Power Estimation. Rajeevan Amirtharajah University of California, Davis

Topic 4. The CMOS Inverter

Dynamic operation 20

PLA Minimization for Low Power VLSI Designs

CMOS Comparators. Kyungpook National University. Integrated Systems Lab, Kyungpook National University. Comparators

Clock Strategy. VLSI System Design NCKUEE-KJLEE

GMU, ECE 680 Physical VLSI Design

ΗΜΥ 307 ΨΗΦΙΑΚΑ ΟΛΟΚΛΗΡΩΜΕΝΑ ΚΥΚΛΩΜΑΤΑ Εαρινό Εξάμηνο 2018

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

ECE 342 Solid State Devices & Circuits 4. CMOS

Transcription:

VLSI Design hapter 8 Low-Power VLSI Design Methodology Jin-Fu Li

hapter 8 Low-Power VLSI Design Methodology Introduction Low-Power Gate-Level Design Low-Power Architecture-Level Design Algorithmic-Level Power Reduction RTL Techniques for Optimizing Power Adiabatic Logic ircuits 2

Introduction Most SO design teams now regard power as one of their top design concerns Why low-power design? Battery lifetime (especially for portable devices) Reliability Power consumption Peak power Average power 3

Overview of Power onsumption Average power consumption Dynamic power consumption Short-circuit power consumption Leakage power consumption Static power consumption Dynamic power dissipation during switching input drain interconnect input 4

Overview of Power onsumption Generic representation of a MOS logic gate for switching power calculation V A V B pmos network V A V B nmos network drain V out + int erconnect + input P avg = 1 T / 2 T T [ dv dt out V ( ) + out load dt ( VDD 0 T / 2 V out )( load dv dt out ) dt] 5

Overview of Power onsumption The average power consumption can be expressed as P avg The node transition rate can be slower than the clock rate. To better represent this behavior, a node transition factor ( ) should be introduced P avg = 1 T = α load T V 2 DD load α T V The switching power expressed above are derived by taking into account the output node load capacitance = 2 DD f load LK V 2 DD f LK 6

Overview of Power onsumption V A V internal V A V B V B internal V internal V out V A V B load V out The generalized expression for the average power dissipation can be rewritten as P avg # ofnodes = α i = 1 Ti V i i V DD f LK 7

Gate-Level Design Technology Mapping The objective of logic minimization is to reduce the boolean function. For low-power design, the signal switching activity is minimized by restructuring a logic circuit The power minimization is constrained by the delay, however, the area may increase. During this phase of logic minimization, the function to be minimized is i P (1 P ) i i i 8

Gate-Level Design Technology Mapping The first step in technology mapping is to decompose each logic function into two-input gates The objective of this decomposition is to minimizing the total power dissipation by reducing the total switching activity A 0.2 B 0.2 α =0.0384 α = 0.0196 α = 0.0099 A B D 0.5 D 0.5 A 0.2 B 0.2 α = 0.0384 0.5 D 0.5 α = 0.1875 α = 0.0099 9

Gate-Level Design Phase Assignment High activity node High activity node A A B B 10

Gate-Level Design Pin Swapping a b c d a b c d d a Switching activity c b a Switching activity b c d d c b a b a d c 11

Gate-Level Design Glitching Power Glitches spurious transitions due to imbalanced path delays A design has more balanced delay paths has fewer glitches, and thus has less power dissipation Note that there will be no glitches in a dynamic MOS logic A B D E A B D E 12

Gate-Level Design Glitching Power A chain structure has more glitches A tree structure has fewer glitches A B hain structure D A B D Tree structure 13

Gate-Level Design Precomputation REG R1 ombinational Logic REG R2 REG R1 ombinational Logic REG R2 Precomputation Logic 14

Gate-Level Design Precomputation A<n-1> B<n-1> REG R1 1-bit omparator (MSB) A<n-2:0> REG R2 B<n-2:0> Precomputation logic Enable REG R3 (n-1)-bit omparator REG R4 F 15

Gate-Level Design Gating lock clk D Q D Q D Q D Q Fail DFT rule checking T clk D Q D Q D Q D Q Add control pin to solve DFT violation problem 16

Gate-Level Design Input Gating f1 clk + select f2 17

Architecture-Level Design Parallelism A 16 R A 16 R f ref B 16 R 16x16 multiplier 32 16 32 16x16 f ref/2 multiplier R M UX 32 f ref f ref/2 Assume that With the same 16x16 multiplier, the power supply can be reduced from V ref to V ref /1.83. Vref f 2 ref Pparallel = 2.2 ref ( ) = 0. 33P 1.83 2 ref B 16 16 R f ref/2 R 16x16 multiplier 32 f ref 18 f ref/2

Architecture-Level Design Pipelining The hardware between the pipeline stages is reduced then the reference voltage V ref can be reduced to V new to maintain the same worst case delay. For example, let a 50MHz multiplier is broken into two equal parts as shown below. The delay between the pipeline stages can be remained at 50MHz when the voltage V new is equal to V ref /1.83 (A,B) 32 REG Half multiplier REG Half multiplier 32 f ref Vref 2 Ppipeline = 1.2 ref ( ) f ref = 0. 36 1.83 P ref 19

Architecture-Level Design Retiming Retiming is a transformation technique used to change the locations of delay elements in a circuit without affecting the input/output characteristics of the circuit. Two versions of an IIR filter. (1) (1) x(n) y(n) x(n) y(n) D D a 2D w(n) (1) (2) b retiming D a (1) D (2) w 1 (n) D b 2D (2) w 2 (n) (2) 20

Architecture-Level Design Retiming Retiming for pipeline design REG 1 (6ns) 2 (2ns) REG 3 (4ns) f ref REG 1 (6ns) REG 2 (2ns) 3 (4ns) f ref 21

Architecture-Level Design Retiming lock cycle is 4 gate delays lock cycle is 2 gate delays 22

Architecture-Level Design Power Management 2 1 1 _FREEZE 2 _FREEZE 2 1 1 _FREEZE 2 _FREEZE 23

Architecture-Level Design Bus Segmentation Avoid the sharing of resources Reduce the switched capacitance For example: a global system bus A single shared bus is connected to all modules, this structure results in a large bus capacitance due to The large number of drivers and receivers sharing the same bus The parasitic capacitance of the long bus line A segmented bus structure Switched capacitance during each bus access is significantly reduced Overall routing area may be increased 24

Architecture-Level Design Bus Segmentation bus bus1 Bus Interface bus1 25

Algorithmic-Level Design f activity Reduction Minimization the switching activity, at high level, is one way to reduce the power dissipation of digital processors. One method to minimize the switching signals, at the algorithmic level, is to use an appropriate coding for the signals rather than straight binary code. The table shown below shows a comparison of 3-bit representation of the binary and Gray codes. Binary ode Gray ode Decimal Equivalent 000 001 010 011 100 101 110 111 000 001 011 010 110 111 101 100 0 1 2 3 4 5 6 7 26

RTL-Level Level Design Signal Gating Simple Decoder Decoder with enable module decoder (a, sel); input [1:0[ a; ouput [3:0] sel; reg [3:0] sel; always @(a) begin case (a) 2 b00: sel=4 b0001; 2 b01: sel=4 b0010; 2 b10: sel=4 b0100; 2 b11: sel=4 b1000; endcase end endmodule module decoder (en,a, sel); input en; input [1:0[ a; ouput [3:0] sel; reg [3:0] sel; always @({en,a}) begin case ({en,a}) 3 b100: sel=4 b0001; 3 b101: sel=4 b0010; 3 b110: sel=4 b0100; 3 b111: sel=4 b1000; default: sel=4 b0000; endcase end endmodule 27

RTL-Level Level Design Datapath Reordering Initial Reordered A<B glitchy Mux stable Mux stable Mux A<B glitchy Mux 28

RTL-Level Level Design Memory Partition din addr write 128x32 dout 32 pre_addr 8 d q addr[7:0] addr[7:1] noe M UX 32 dout clk noe write addr din dout 32 addr0 128x32 29

RTL-Level Level Design Memory Partition Application-driven memory partition 64K bytes Reads ARM ore Data Addr R/W 28K 4K 32K 64K Addr Range 30

RTL-Level Level Design Memory Partition A power-optimal partitioned memory organization Decoder ARM ore Data Addr R/W S Data Addr R/W S Data Addr R/W S 31

Adiabatic Logic ircuits Energy drawn from the power supply during 0-to-V transition is calculated as follows Q E = = V QV = V 2 The amount of stored energy from in the output node is expressed as follows V E = V o 0 vodv = 1 2 V 2 32

Adiabatic Logic ircuits To reduce the dissipation, the designer can minimize the switching events, decrease the capacitance, reduce the voltage swing, or apply a combination of these methods The energy drawn from the power supply is used only once To increase the energy efficiency of logic circuits, other methods can be introduced for recycling the energy drawn from the power supply A novel class of logic circuits called adiabatic logic offers the possibility of further reducing the energy dissipated during switching events, and the possibility of recycling 33

Adiabatic Logic ircuits Adiabatic Switching The equivalent circuit used to model the charge-up event in conventional MOS circuits t=0 R V I source V I 1 ( t) = I V = source source ( t) t t 34

Adiabatic Logic ircuits Adiabatic Switching The amount of energy dissipated in the resistor R from t=0 to t=t can be expressed as E E diss diss = R = T 0 R T I 2 source V 2 A number of simple observations can be obtained The energy is smaller than the conventional case if the charging time T is larger than 2R The dissipated energy is proportional to the resistance R an a portion of the energy stored in the capacitance be reclaimed by reversing the current direction? The possibility is unique to the adiabatic operation dt ( T ) = RI 2 source T 35

Adiabatic Logic ircuits Adiabatic Switching It is possible to reduce the dissipation to an arbitrary degree by increasing the switching time to ever-larger values This is referred as the principle of adiabatic charging The term adiabatic is used to indicate that all charge transfer is to occur without generating heat Switching circuits that charge and discharge their load capacitance adiabatically are said to use adiabatic switching The circuits rely on special power supplies that provide accurate pulsed-power delivery The circuits are useful only if the supplies can deliver power efficiently and recycle the power fed back to them 36

Adiabatic Logic ircuits Adiabatic Logic Gates The adiabatic amplifier circuit V A X X Y Y X X i Y X X Y X X V A Y Y 37

Adiabatic Logic ircuits Adiabatic Logic Gates F charge-up path charge-down path F charge-up path V out inputs V out inputs load2 V DD F load charge-down path V DD F V out load1 38

Adiabatic Logic ircuits Adiabatic Logic Gates A A B B V out V A A A B V out B 39

Adiabatic Logic ircuits Adiabatic Logic Gates low V A constant current V out V A R i c V out load V DD V DD /n V A V out t 40

Adiabatic Logic ircuits Adiabatic Logic Gates V N V out V 2 V 1 V out t 41