Performance Metrics & Architectural Adaptivity. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So

Size: px
Start display at page:

Download "Performance Metrics & Architectural Adaptivity. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So"

Transcription

1 Performance Metrics & Architectural Adaptivity ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So

2 What are the Options? Power Consumption Activity factor (amount of circuit switching) Load Capacitance (size of circuit) Voltage Swing Supply Voltage Clock frequency P total = α( C L V sw V dd f ) clk + I sc V dd + I leakage V dd Dynamic Static Energy per operation Total Energy Consumption E op P dyn / f clk = α C L V sw V dd E total = E op no. of operations Total Run Time T total = no. of operations CPI / f clk H. So, Sp10 Lecture 4 - ELEC8106/6102 2

3 Metrics How do we quantify Performance, Power and Energy? How do we quantify Energy Efficiency? Many different quantities: Power-delay product Energy-delay product Performance/Power Performance 2 /Power H. So, Sp10 Lecture 4 - ELEC8106/6102 3

4 Absolute wall clock time Measures the absolute wall clock time to finish a task Compared to virtual OS time, etc Measure in seconds Includes all overhead from hardware, software, as well as the OS Cons Doesn t tell you the portion of time spent because of time sharing machine, OS overhead, etc Solution: run in single-user mode H. So, Sp10 Lecture 4 - ELEC8106/6102 4

5 IPS, kips, MIPS, bips IPS = instruction per second Measures throughput of a computer A very simple first-order estimation of how much work a computer can perform per unit time. A machine that completes 1 instruction per cycle running at 1 Megahertz = 1 MIPS Varies GREATLY depending on the input benchmark IPS = IPC * Cycle per second (Hz) Very often used, but can be very misleading H. So, Sp10 Lecture 4 - ELEC8106/6102 5

6 FLOPS Floating Point Operations per Second Similarly calculated as MIPS A machine that performs 1 floating point operation per cycle running at 1 MHz is running at 1 FLOPS Similar to MIPS, FLOPS rating of a system depends highly on the input workload Also, system performances, e.g. memory, bus E.g. top500 list use FLOPS to measure super computers using LINPACK The sustained FLOPS of a system is used to rank, not the theoretical peak. H. So, Sp10 Lecture 4 - ELEC8106/6102 6

7 Energy and Power Power: Watt (W) Energy: Joules (J) kwh, (mah?) Power = Energy consumed per unit time H. So, Sp10 Lecture 4 - ELEC8106/6102 7

8 Energy efficiency How much performance can you get per unit of energy spent Three possible measurements: 1. Power/Throughput 2. Power/Throughput 2 3. BIPS/w H. So, Sp10 Lecture 4 - ELEC8106/6102 8

9 Power/Throughput Good for fixed throughput systems Such as DSP systems Data stream into system and processed continuously Power Throughput = Energy/time Operation/time = Energy Operation In such fixed throughput systems, number of operation per unit time is constant Drawback: Doesn t take performance into account System A may be better than System B in terms of Energy/Op, but System B may be much faster Similar to Power-Delay-Product in circuit design H. So, Sp10 Lecture 4 - ELEC8106/6102 9

10 Power/Throughput 2 ETR = E MAX T MAX = Power Throughput 2 Balances both Energy/Op and Throughput If system A has lower ETR than System B, then either 1. A has lower energy/op than B at the same throughput, OR 2. A has higher throughput than B at the same energy/op Similar to Energy-Delay-Product in circuit designs, But hasn t taken into account architectural effects H. So, Sp10 Lecture 4 - ELEC8106/

11 Optimizing ETR Need to apply powerdelay tradeoff Vdd scaling only works within a limited range Frequency scaling alone doesn t work Frequency doesn t affect E/op lowered frequency decrease throughput higher ETR H. So, Sp10 Lecture 4 - ELEC8106/

12 Optimizing ETR Algorithm and Architectural changes most effective! Architecture: If a machine can perform S operations at the same time, throughput is increased by S while energy/op is constant ETR decreased by S Algorithm: If the length of an application is reduced by a factor of S Then, switching capacitance is reduced by S, throughput increased by S, so ETR is reduced by S 2 H. So, Sp10 Lecture 4 - ELEC8106/

13 FPGA coprocessing on ETR Offloading computation to FPGA very likely improve ETR 1. Decreases code size in sw (S 2 factor) 2. Decreases overall run time (overall E) 3. Increases operation/cycle But Increased seconds/cycle decreases operation per second H. So, Sp10 Lecture 4 - ELEC8106/

14 Efficiency Trends and Limits from Comprehensive Microarchitectural Adaptivity Benjamin C. Lee and David Brooks, ASPLOS 08 H. So, Sp10 Lecture 4 - ELEC8106/

15 Overview Studied the effectiveness of architectural adaptation 15 different architectural parameters are adapted to the running application both temporally and spatially H. So, Sp10 Lecture 4 - ELEC8106/

16 Overview Temporal and Spatial sampling to reduce exploration space Genetic algorithm is used to find optimal configuration for an application at a given time H. So, Sp10 Lecture 4 - ELEC8106/

17 BIPS 3 per Watt Lee and Brooks used bips 3 /w as a metrics Argument: voltage invariant w V 2 f and f V w f 3 bips f Similar to Power Throughput 3 bips3 w = k What s the catch? H. So, Sp10 Lecture 4 - ELEC8106/

18 Temporal Adaptivity Improved power Improved performance Optimal architecture evaluated after a block of instructions have been executed Any or all of the 15 parameters can be adapted to their optimal values. Higher temporal adativity higher overall efficiency H. So, Sp10 Lecture 4 - ELEC8106/

19 Parameter Adaptation High temporal adaptivity smaller # of parameters changed smaller parameter delta values more high value changes H. So, Sp10 Lecture 4 - ELEC8106/

20 Spatial Adapativity What if not all 15 parameters can be changed at the same time? Pick the 3 optimal parameters? Which 3? H. So, Sp10 Lecture 4 - ELEC8106/

21 Limited spatial adaptivitiy Most benefits come from the 3 most important parameters But Some benchmarks require adapting all 15 parameters to be effective H. So, Sp10 Lecture 4 - ELEC8106/

22 DVFS as adaptation Each bar relative to case w/o DVFS During each iteration, after spatial parameters are changed, adjust voltage and the corresponding frequency to maximize m3pw Since all bars relative to th H. So, Sp10 Lecture 4 - ELEC8106/

23 DVFS as adaptation Each bar relative to case w/o DVFS During each iteration, after spatial parameters are changed, adjust voltage and the corresponding frequency to maximize m3pw DVFS only useful in a few cases in static (non-adapting) cases Why? H. So, Sp10 Lecture 4 - ELEC8106/

24 In conclusion Metrics for energy efficiency must be carefully constructed to suit actual need Energy/Throughput seems reasonable for this class Throughput 3 /Power also used Offloading computation to reconfigurable hardware accelerators can significantly improve energy efficiency metrics Reconfigurable systems allow flexible tradeoffs in energy efficiency tradeoff Reconfigurable systems allow run time architectural adaptation that can improve energy efficiency by factor of 5 H. So, Sp10 Lecture 4 - ELEC8106/

Performance, Power & Energy. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So

Performance, Power & Energy. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So Performance, Power & Energy ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So Recall: Goal of this class Performance Reconfiguration Power/ Energy H. So, Sp10 Lecture 3 - ELEC8106/6102 2 PERFORMANCE EVALUATION

More information

Lecture 2: Metrics to Evaluate Systems

Lecture 2: Metrics to Evaluate Systems Lecture 2: Metrics to Evaluate Systems Topics: Metrics: power, reliability, cost, benchmark suites, performance equation, summarizing performance with AM, GM, HM Sign up for the class mailing list! Video

More information

Performance, Power & Energy

Performance, Power & Energy Recall: Goal of this class Performance, Power & Energy ELE8106/ELE6102 Performance Reconfiguration Power/ Energy Spring 2010 Hayden Kwok-Hay So H. So, Sp10 Lecture 3 - ELE8106/6102 2 What is good performance?

More information

PERFORMANCE METRICS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah

PERFORMANCE METRICS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah PERFORMANCE METRICS Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Jan. 17 th : Homework 1 release (due on Jan.

More information

Performance Metrics for Computer Systems. CASS 2018 Lavanya Ramapantulu

Performance Metrics for Computer Systems. CASS 2018 Lavanya Ramapantulu Performance Metrics for Computer Systems CASS 2018 Lavanya Ramapantulu Eight Great Ideas in Computer Architecture Design for Moore s Law Use abstraction to simplify design Make the common case fast Performance

More information

CSE140L: Components and Design Techniques for Digital Systems Lab. Power Consumption in Digital Circuits. Pietro Mercati

CSE140L: Components and Design Techniques for Digital Systems Lab. Power Consumption in Digital Circuits. Pietro Mercati CSE140L: Components and Design Techniques for Digital Systems Lab Power Consumption in Digital Circuits Pietro Mercati 1 About the final Friday 09/02 at 11.30am in WLH2204 ~2hrs exam including (but not

More information

CS 700: Quantitative Methods & Experimental Design in Computer Science

CS 700: Quantitative Methods & Experimental Design in Computer Science CS 700: Quantitative Methods & Experimental Design in Computer Science Sanjeev Setia Dept of Computer Science George Mason University Logistics Grade: 35% project, 25% Homework assignments 20% midterm,

More information

Goals for Performance Lecture

Goals for Performance Lecture Goals for Performance Lecture Understand performance, speedup, throughput, latency Relationship between cycle time, cycles/instruction (CPI), number of instructions (the performance equation) Amdahl s

More information

Computer Architecture

Computer Architecture Lecture 2: Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture CPU Evolution What is? 2 Outline Measurements and metrics : Performance, Cost, Dependability, Power Guidelines

More information

Introduction to CMOS VLSI Design (E158) Lecture 20: Low Power Design

Introduction to CMOS VLSI Design (E158) Lecture 20: Low Power Design Harris Introduction to CMOS VLSI Design (E158) Lecture 20: Low Power Design David Harris Harvey Mudd College David_Harris@hmc.edu Based on EE271 developed by Mark Horowitz, Stanford University MAH E158

More information

EE382 Processor Design Winter 1999 Chapter 2 Lectures Clocking and Pipelining

EE382 Processor Design Winter 1999 Chapter 2 Lectures Clocking and Pipelining Slide 1 EE382 Processor Design Winter 1999 Chapter 2 Lectures Clocking and Pipelining Slide 2 Topics Clocking Clock Parameters Latch Types Requirements for reliable clocking Pipelining Optimal pipelining

More information

Amdahl's Law. Execution time new = ((1 f) + f/s) Execution time. S. Then:

Amdahl's Law. Execution time new = ((1 f) + f/s) Execution time. S. Then: Amdahl's Law Useful for evaluating the impact of a change. (A general observation.) Insight: Improving a feature cannot improve performance beyond the use of the feature Suppose we introduce a particular

More information

EE 466/586 VLSI Design. Partha Pande School of EECS Washington State University

EE 466/586 VLSI Design. Partha Pande School of EECS Washington State University EE 466/586 VLSI Design Partha Pande School of EECS Washington State University pande@eecs.wsu.edu Lecture 8 Power Dissipation in CMOS Gates Power in CMOS gates Dynamic Power Capacitance switching Crowbar

More information

Lecture 2: CMOS technology. Energy-aware computing

Lecture 2: CMOS technology. Energy-aware computing Energy-Aware Computing Lecture 2: CMOS technology Basic components Transistors Two types: NMOS, PMOS Wires (interconnect) Transistors as switches Gate Drain Source NMOS: When G is @ logic 1 (actually over

More information

Power Dissipation. Where Does Power Go in CMOS?

Power Dissipation. Where Does Power Go in CMOS? Power Dissipation [Adapted from Chapter 5 of Digital Integrated Circuits, 2003, J. Rabaey et al.] Where Does Power Go in CMOS? Dynamic Power Consumption Charging and Discharging Capacitors Short Circuit

More information

EECS150 - Digital Design Lecture 22 Power Consumption in CMOS. Announcements

EECS150 - Digital Design Lecture 22 Power Consumption in CMOS. Announcements EECS150 - Digital Design Lecture 22 Power Consumption in CMOS November 22, 2011 Elad Alon Electrical Engineering and Computer Sciences University of California, Berkeley http://www-inst.eecs.berkeley.edu/~cs150

More information

EE115C Winter 2017 Digital Electronic Circuits. Lecture 6: Power Consumption

EE115C Winter 2017 Digital Electronic Circuits. Lecture 6: Power Consumption EE115C Winter 2017 Digital Electronic Circuits Lecture 6: Power Consumption Four Key Design Metrics for Digital ICs Cost of ICs Reliability Speed Power EE115C Winter 2017 2 Power and Energy Challenges

More information

Where Does Power Go in CMOS?

Where Does Power Go in CMOS? Power Dissipation Where Does Power Go in CMOS? Dynamic Power Consumption Charging and Discharging Capacitors Short Circuit Currents Short Circuit Path between Supply Rails during Switching Leakage Leaking

More information

EE241 - Spring 2000 Advanced Digital Integrated Circuits. Announcements

EE241 - Spring 2000 Advanced Digital Integrated Circuits. Announcements EE241 - Spring 2 Advanced Digital Integrated Circuits Lecture 11 Low Power-Low Energy Circuit Design Announcements Homework #2 due Friday, 3/3 by 5pm Midterm project reports due in two weeks - 3/7 by 5pm

More information

CSE140L: Components and Design Techniques for Digital Systems Lab. FSMs. Instructor: Mohsen Imani. Slides from Tajana Simunic Rosing

CSE140L: Components and Design Techniques for Digital Systems Lab. FSMs. Instructor: Mohsen Imani. Slides from Tajana Simunic Rosing CSE140L: Components and Design Techniques for Digital Systems Lab FSMs Instructor: Mohsen Imani Slides from Tajana Simunic Rosing Source: Vahid, Katz 1 FSM design example Moore vs. Mealy Remove one 1 from

More information

Design for Manufacturability and Power Estimation. Physical issues verification (DSM)

Design for Manufacturability and Power Estimation. Physical issues verification (DSM) Design for Manufacturability and Power Estimation Lecture 25 Alessandra Nardi Thanks to Prof. Jan Rabaey and Prof. K. Keutzer Physical issues verification (DSM) Interconnects Signal Integrity P/G integrity

More information

Power in Digital CMOS Circuits. Fruits of Scaling SpecInt 2000

Power in Digital CMOS Circuits. Fruits of Scaling SpecInt 2000 Power in Digital CMOS Circuits Mark Horowitz Computer Systems Laboratory Stanford University horowitz@stanford.edu Copyright 2004 by Mark Horowitz MAH 1 Fruits of Scaling SpecInt 2000 1000.00 100.00 10.00

More information

EE241 - Spring 2001 Advanced Digital Integrated Circuits

EE241 - Spring 2001 Advanced Digital Integrated Circuits EE241 - Spring 21 Advanced Digital Integrated Circuits Lecture 12 Low Power Design Self-Resetting Logic Signals are pulses, not levels 1 Self-Resetting Logic Sense-Amplifying Logic Matsui, JSSC 12/94 2

More information

Measurement & Performance

Measurement & Performance Measurement & Performance Timers Performance measures Time-based metrics Rate-based metrics Benchmarking Amdahl s law Topics 2 Page The Nature of Time real (i.e. wall clock) time = User Time: time spent

More information

Measurement & Performance

Measurement & Performance Measurement & Performance Topics Timers Performance measures Time-based metrics Rate-based metrics Benchmarking Amdahl s law 2 The Nature of Time real (i.e. wall clock) time = User Time: time spent executing

More information

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture Rajeevan Amirtharajah University of California, Davis Outline Announcements Review: PDP, EDP, Intersignal Correlations, Glitching, Top

More information

Last Lecture. Power Dissipation CMOS Scaling. EECS 141 S02 Lecture 8

Last Lecture. Power Dissipation CMOS Scaling. EECS 141 S02 Lecture 8 EECS 141 S02 Lecture 8 Power Dissipation CMOS Scaling Last Lecture CMOS Inverter loading Switching Performance Evaluation Design optimization Inverter Sizing 1 Today CMOS Inverter power dissipation» Dynamic»

More information

Lecture Outline. ESE 570: Digital Integrated Circuits and VLSI Fundamentals. Total Power. Energy and Power Optimization. Worksheet Problem 1

Lecture Outline. ESE 570: Digital Integrated Circuits and VLSI Fundamentals. Total Power. Energy and Power Optimization. Worksheet Problem 1 ESE 570: Digital Integrated Circuits and VLSI Fundamentals Lec 16: March 20, 2018 Energy and Power Optimization, Design Space Exploration Lecture Outline! Energy and Power Optimization " Tradeoffs! Design

More information

ELEC516 Digital VLSI System Design and Design Automation (spring, 2010) Assignment 4 Reference solution

ELEC516 Digital VLSI System Design and Design Automation (spring, 2010) Assignment 4 Reference solution ELEC516 Digital VLSI System Design and Design Automation (spring, 010) Assignment 4 Reference solution 1) Pulse-plate 1T DRAM cell a) Timing diagrams for nodes and Y when writing 0 and 1 Timing diagram

More information

Energy Delay Optimization

Energy Delay Optimization EE M216A.:. Fall 21 Lecture 8 Energy Delay Optimization Prof. Dejan Marković ee216a@gmail.com Some Common Questions Is sizing better than V DD for energy reduction? What are the optimal values of gate

More information

Lecture 12: Energy and Power. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 12: Energy and Power. James C. Hoe Department of ECE Carnegie Mellon University 18 447 Lecture 12: Energy and Power James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L12 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today a working understanding of

More information

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

ESE 570: Digital Integrated Circuits and VLSI Fundamentals ESE 570: Digital Integrated Circuits and VLSI Fundamentals Lec 18: March 27, 2018 Dynamic Logic, Charge Injection Lecture Outline! Sequential MOS Logic " D-Latch " Timing Constraints! Dynamic Logic " Domino

More information

BeiHang Short Course, Part 7: HW Acceleration: It s about Performance, Energy and Power

BeiHang Short Course, Part 7: HW Acceleration: It s about Performance, Energy and Power BeiHang Short Course, Part 7: HW Acceleration: It s about Performance, Energy and Power James C. Hoe Department of ECE Carnegie Mellon niversity Eric S. Chung, et al., Single chip Heterogeneous Computing:

More information

Welcome to MCS 572. content and organization expectations of the course. definition and classification

Welcome to MCS 572. content and organization expectations of the course. definition and classification Welcome to MCS 572 1 About the Course content and organization expectations of the course 2 Supercomputing definition and classification 3 Measuring Performance speedup and efficiency Amdahl s Law Gustafson

More information

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

ESE 570: Digital Integrated Circuits and VLSI Fundamentals ESE 570: Digital Integrated Circuits and VLSI Fundamentals Lec 17: March 23, 2017 Energy and Power Optimization, Design Space Exploration, Synchronous MOS Logic Lecture Outline! Energy and Power Optimization

More information

8. Design Tradeoffs x Computation Structures Part 1 Digital Circuits. Copyright 2015 MIT EECS

8. Design Tradeoffs x Computation Structures Part 1 Digital Circuits. Copyright 2015 MIT EECS 8. Design Tradeoffs 6.004x Computation Structures Part 1 Digital Circuits Copyright 2015 MIT EECS 6.004 Computation Structures L08: Design Tradeoffs, Slide #1 There are a large number of implementations

More information

Parallel Processing and Circuit Design with Nano-Electro-Mechanical Relays

Parallel Processing and Circuit Design with Nano-Electro-Mechanical Relays Parallel Processing and Circuit Design with Nano-Electro-Mechanical Relays Elad Alon 1, Tsu-Jae King Liu 1, Vladimir Stojanovic 2, Dejan Markovic 3 1 University of California, Berkeley 2 Massachusetts

More information

8. Design Tradeoffs x Computation Structures Part 1 Digital Circuits. Copyright 2015 MIT EECS

8. Design Tradeoffs x Computation Structures Part 1 Digital Circuits. Copyright 2015 MIT EECS 8. Design Tradeoffs 6.004x Computation Structures Part 1 Digital Circuits Copyright 2015 MIT EECS 6.004 Computation Structures L08: Design Tradeoffs, Slide #1 There are a large number of implementations

More information

MICROPROCESSOR REPORT. THE INSIDER S GUIDE TO MICROPROCESSOR HARDWARE

MICROPROCESSOR REPORT.   THE INSIDER S GUIDE TO MICROPROCESSOR HARDWARE MICROPROCESSOR www.mpronline.com REPORT THE INSIDER S GUIDE TO MICROPROCESSOR HARDWARE ENERGY COROLLARIES TO AMDAHL S LAW Analyzing the Interactions Between Parallel Execution and Energy Consumption By

More information

COVER SHEET: Problem#: Points

COVER SHEET: Problem#: Points EEL 4712 Midterm 3 Spring 2017 VERSION 1 Name: UFID: Sign here to give permission for your test to be returned in class, where others might see your score: IMPORTANT: Please be neat and write (or draw)

More information

L16: Power Dissipation in Digital Systems. L16: Spring 2007 Introductory Digital Systems Laboratory

L16: Power Dissipation in Digital Systems. L16: Spring 2007 Introductory Digital Systems Laboratory L16: Power Dissipation in Digital Systems 1 Problem #1: Power Dissipation/Heat Power (Watts) 100000 10000 1000 100 10 1 0.1 4004 80088080 8085 808686 386 486 Pentium proc 18KW 5KW 1.5KW 500W 1971 1974

More information

Lecture 3, Performance

Lecture 3, Performance Repeating some definitions: Lecture 3, Performance CPI MHz MIPS MOPS Clocks Per Instruction megahertz, millions of cycles per second Millions of Instructions Per Second = MHz / CPI Millions of Operations

More information

COMP 103. Lecture 16. Dynamic Logic

COMP 103. Lecture 16. Dynamic Logic COMP 03 Lecture 6 Dynamic Logic Reading: 6.3, 6.4 [ll lecture notes are adapted from Mary Jane Irwin, Penn State, which were adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] COMP03

More information

Lecture 8-1. Low Power Design

Lecture 8-1. Low Power Design Lecture 8 Konstantinos Masselos Department of Electrical & Electronic Engineering Imperial College London URL: http://cas.ee.ic.ac.uk/~kostas E-mail: k.masselos@ic.ac.uk Lecture 8-1 Based on slides/material

More information

TU Wien. Energy Efficiency. H. Kopetz 26/11/2009 Model of a Real-Time System

TU Wien. Energy Efficiency. H. Kopetz 26/11/2009 Model of a Real-Time System TU Wien 1 Energy Efficiency Outline 2 Basic Concepts Energy Estimation Hardware Scaling Power Gating Software Techniques Energy Sources 3 Why are Energy and Power Awareness Important? The widespread use

More information

Lecture 3, Performance

Lecture 3, Performance Lecture 3, Performance Repeating some definitions: CPI Clocks Per Instruction MHz megahertz, millions of cycles per second MIPS Millions of Instructions Per Second = MHz / CPI MOPS Millions of Operations

More information

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Presenter: Tong Shu Authors: Tong Shu and Prof. Chase Q. Wu Big Data Center Department of Computer Science New Jersey Institute

More information

Chapter 8. Low-Power VLSI Design Methodology

Chapter 8. Low-Power VLSI Design Methodology VLSI Design hapter 8 Low-Power VLSI Design Methodology Jin-Fu Li hapter 8 Low-Power VLSI Design Methodology Introduction Low-Power Gate-Level Design Low-Power Architecture-Level Design Algorithmic-Level

More information

Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier

Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier Espen Stenersen Master of Science in Electronics Submission date: June 2008 Supervisor: Per Gunnar Kjeldsberg, IET Co-supervisor: Torstein

More information

CPU Consolidation versus Dynamic Voltage and Frequency Scaling in a Virtualized Multi-Core Server: Which is More Effective and When

CPU Consolidation versus Dynamic Voltage and Frequency Scaling in a Virtualized Multi-Core Server: Which is More Effective and When 1 CPU Consolidation versus Dynamic Voltage and Frequency Scaling in a Virtualized Multi-Core Server: Which is More Effective and When Inkwon Hwang, Student Member and Massoud Pedram, Fellow, IEEE Abstract

More information

Computer Architecture ELEC2401 & ELEC3441

Computer Architecture ELEC2401 & ELEC3441 Last Time Pipeline Hazard Computer Architecture ELEC2401 & ELEC3441 Lecture 8 Pipelining (3) Dr. Hayden Kwok-Hay So Department of Electrical and Electronic Engineering Structural Hazard Hazard Control

More information

Scheduling for Reduced CPU Energy

Scheduling for Reduced CPU Energy Scheduling for Reduced CPU Energy M. Weiser, B. Welch, A. Demers and S. Shenker Appears in "Proceedings of the First Symposium on Operating Systems Design and Implementation," Usenix Association, November

More information

Word-length Optimization and Error Analysis of a Multivariate Gaussian Random Number Generator

Word-length Optimization and Error Analysis of a Multivariate Gaussian Random Number Generator Word-length Optimization and Error Analysis of a Multivariate Gaussian Random Number Generator Chalermpol Saiprasert, Christos-Savvas Bouganis and George A. Constantinides Department of Electrical & Electronic

More information

ECE321 Electronics I

ECE321 Electronics I ECE321 Electronics I Lecture 1: Introduction to Digital Electronics Payman Zarkesh-Ha Office: ECE Bldg. 230B Office hours: Tuesday 2:00-3:00PM or by appointment E-mail: payman@ece.unm.edu Slide: 1 Textbook

More information

Scheduling I. Today. Next Time. ! Introduction to scheduling! Classical algorithms. ! Advanced topics on scheduling

Scheduling I. Today. Next Time. ! Introduction to scheduling! Classical algorithms. ! Advanced topics on scheduling Scheduling I Today! Introduction to scheduling! Classical algorithms Next Time! Advanced topics on scheduling Scheduling out there! You are the manager of a supermarket (ok, things don t always turn out

More information

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

ESE 570: Digital Integrated Circuits and VLSI Fundamentals ESE 570: Digital Integrated Circuits and VLSI Fundamentals Lec 15: March 15, 2018 Euler Paths, Energy Basics and Optimization Midterm! Midterm " Mean: 89.7 " Standard Dev: 8.12 2 Lecture Outline! Euler

More information

Session 8C-5: Inductive Issues in Power Grids and Packages. Controlling Inductive Cross-talk and Power in Off-chip Buses using CODECs

Session 8C-5: Inductive Issues in Power Grids and Packages. Controlling Inductive Cross-talk and Power in Off-chip Buses using CODECs ASP-DAC 2006 Session 8C-5: Inductive Issues in Power Grids and Packages Controlling Inductive Cross-talk and Power in Off-chip Buses using CODECs Authors: Brock J. LaMeres Agilent Technologies Kanupriya

More information

EECS 141 F01 Lecture 17

EECS 141 F01 Lecture 17 EECS 4 F0 Lecture 7 With major inputs/improvements From Mary-Jane Irwin (Penn State) Dynamic CMOS In static circuits at every point in time (except when switching) the output is connected to either GND

More information

EE141Microelettronica. CMOS Logic

EE141Microelettronica. CMOS Logic Microelettronica CMOS Logic CMOS logic Power consumption in CMOS logic gates Where Does Power Go in CMOS? Dynamic Power Consumption Charging and Discharging Capacitors Short Circuit Currents Short Circuit

More information

Advanced Testing. EE5375 ADD II Prof. MacDonald

Advanced Testing. EE5375 ADD II Prof. MacDonald Advanced Testing EE5375 ADD II Prof. MacDonald Functional Testing l Original testing method l Run chip from reset l Tester emulates the outside world l Chip runs functionally with internally generated

More information

Lecture 13: Sequential Circuits, FSM

Lecture 13: Sequential Circuits, FSM Lecture 13: Sequential Circuits, FSM Today s topics: Sequential circuits Finite state machines 1 Clocks A microprocessor is composed of many different circuits that are operating simultaneously if each

More information

A Mathematical Solution to. by Utilizing Soft Edge Flip Flops

A Mathematical Solution to. by Utilizing Soft Edge Flip Flops A Mathematical Solution to Power Optimal Pipeline Design by Utilizing Soft Edge Flip Flops M. Ghasemazar, B. Amelifard, M. Pedram University of Southern California Department of Electrical Engineering

More information

EXAMPLES 4/12/2018. The MIPS Pipeline. Hazard Summary. Show the pipeline diagram. Show the pipeline diagram. Pipeline Datapath and Control

EXAMPLES 4/12/2018. The MIPS Pipeline. Hazard Summary. Show the pipeline diagram. Show the pipeline diagram. Pipeline Datapath and Control The MIPS Pipeline CSCI206 - Computer Organization & Programming Pipeline Datapath and Control zybook: 11.6 Developed and maintained by the Bucknell University Computer Science Department - 2017 Hazard

More information

Today. ESE532: System-on-a-Chip Architecture. Energy. Message. Preclass Challenge: Power. Energy Today s bottleneck What drives Efficiency of

Today. ESE532: System-on-a-Chip Architecture. Energy. Message. Preclass Challenge: Power. Energy Today s bottleneck What drives Efficiency of ESE532: System-on-a-Chip Architecture Day 20: November 8, 2017 Energy Today Energy Today s bottleneck What drives Efficiency of Processors, FPGAs, accelerators How does parallelism impact energy? 1 2 Message

More information

Topic 4. The CMOS Inverter

Topic 4. The CMOS Inverter Topic 4 The CMOS Inverter Peter Cheung Department of Electrical & Electronic Engineering Imperial College London URL: www.ee.ic.ac.uk/pcheung/ E-mail: p.cheung@ic.ac.uk Topic 4-1 Noise in Digital Integrated

More information

Announcements. EE141- Fall 2002 Lecture 7. MOS Capacitances Inverter Delay Power

Announcements. EE141- Fall 2002 Lecture 7. MOS Capacitances Inverter Delay Power - Fall 2002 Lecture 7 MOS Capacitances Inverter Delay Power Announcements Wednesday 12-3pm lab cancelled Lab 4 this week Homework 2 due today at 5pm Homework 3 posted tonight Today s lecture MOS capacitances

More information

Midterm. ESE 570: Digital Integrated Circuits and VLSI Fundamentals. Lecture Outline. Pass Transistor Logic. Restore Output.

Midterm. ESE 570: Digital Integrated Circuits and VLSI Fundamentals. Lecture Outline. Pass Transistor Logic. Restore Output. ESE 570: Digital Integrated Circuits and VLSI Fundamentals Lec 16: March 21, 2017 Transmission Gates, Euler Paths, Energy Basics Review Midterm! Midterm " Mean: 79.5 " Standard Dev: 14.5 2 Lecture Outline!

More information

EECS150 - Digital Design Lecture 11 - Shifters & Counters. Register Summary

EECS150 - Digital Design Lecture 11 - Shifters & Counters. Register Summary EECS50 - Digital Design Lecture - Shifters & Counters February 24, 2003 John Wawrzynek Spring 2005 EECS50 - Lec-counters Page Register Summary All registers (this semester) based on Flip-flops: q 3 q 2

More information

Scheduling I. Today Introduction to scheduling Classical algorithms. Next Time Advanced topics on scheduling

Scheduling I. Today Introduction to scheduling Classical algorithms. Next Time Advanced topics on scheduling Scheduling I Today Introduction to scheduling Classical algorithms Next Time Advanced topics on scheduling Scheduling out there You are the manager of a supermarket (ok, things don t always turn out the

More information

ENGG 1203 Tutorial_9 - Review. Boolean Algebra. Simplifying Logic Circuits. Combinational Logic. 1. Combinational & Sequential Logic

ENGG 1203 Tutorial_9 - Review. Boolean Algebra. Simplifying Logic Circuits. Combinational Logic. 1. Combinational & Sequential Logic ENGG 1203 Tutorial_9 - Review Boolean Algebra 1. Combinational & Sequential Logic 2. Computer Systems 3. Electronic Circuits 4. Signals, Systems, and Control Remark : Multiple Choice Questions : ** Check

More information

Clocked Synchronous State-machine Analysis

Clocked Synchronous State-machine Analysis Clocked Synchronous State-machine Analysis Given the circuit diagram of a state machine: Analyze the combinational logic to determine flip-flop input (excitation) equations: D i = F i (Q, inputs) The input

More information

EECS 427 Lecture 11: Power and Energy Reading: EECS 427 F09 Lecture Reminders

EECS 427 Lecture 11: Power and Energy Reading: EECS 427 F09 Lecture Reminders EECS 47 Lecture 11: Power and Energy Reading: 5.55 [Adapted from Irwin and Narayanan] 1 Reminders CAD5 is due Wednesday 10/8 You can submit it by Thursday 10/9 at noon Lecture on 11/ will be taught by

More information

ECE 574 Cluster Computing Lecture 20

ECE 574 Cluster Computing Lecture 20 ECE 574 Cluster Computing Lecture 20 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 18 April 2017 Announcements Project updates, related work. HW#8 was due Big Data: Last HW not

More information

ESE570 Spring University of Pennsylvania Department of Electrical and System Engineering Digital Integrated Cicruits AND VLSI Fundamentals

ESE570 Spring University of Pennsylvania Department of Electrical and System Engineering Digital Integrated Cicruits AND VLSI Fundamentals University of Pennsylvania Department of Electrical and System Engineering Digital Integrated Cicruits AND VLSI Fundamentals ESE570, Spring 2018 Final Monday, Apr 0 5 Problems with point weightings shown.

More information

Today. ESE532: System-on-a-Chip Architecture. Energy. Message. Preclass Challenge: Power. Energy Today s bottleneck What drives Efficiency of

Today. ESE532: System-on-a-Chip Architecture. Energy. Message. Preclass Challenge: Power. Energy Today s bottleneck What drives Efficiency of ESE532: System-on-a-Chip Architecture Day 22: April 10, 2017 Today Today s bottleneck What drives Efficiency of Processors, FPGAs, accelerators 1 2 Message dominates Including limiting performance Make

More information

CPU SCHEDULING RONG ZHENG

CPU SCHEDULING RONG ZHENG CPU SCHEDULING RONG ZHENG OVERVIEW Why scheduling? Non-preemptive vs Preemptive policies FCFS, SJF, Round robin, multilevel queues with feedback, guaranteed scheduling 2 SHORT-TERM, MID-TERM, LONG- TERM

More information

EEC 216 Lecture #2: Metrics and Logic Level Power Estimation. Rajeevan Amirtharajah University of California, Davis

EEC 216 Lecture #2: Metrics and Logic Level Power Estimation. Rajeevan Amirtharajah University of California, Davis EEC 216 Lecture #2: Metrics and Logic Level Power Estimation Rajeevan Amirtharajah University of California, Davis Announcements PS1 available online tonight R. Amirtharajah, EEC216 Winter 2008 2 Outline

More information

Designing Sequential Logic Circuits

Designing Sequential Logic Circuits igital Integrated Circuits (83-313) Lecture 5: esigning Sequential Logic Circuits Semester B, 2016-17 Lecturer: r. Adam Teman TAs: Itamar Levi, Robert Giterman 26 April 2017 isclaimer: This course was

More information

Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks

Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks Yufei Ma, Yu Cao, Sarma Vrudhula,

More information

CMPEN 411 VLSI Digital Circuits Spring Lecture 14: Designing for Low Power

CMPEN 411 VLSI Digital Circuits Spring Lecture 14: Designing for Low Power CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 14: Designing for Low Power [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp12 CMPEN

More information

In 1980, the yield = 48% and the Die Area = 0.16 from figure In 1992, the yield = 48% and the Die Area = 0.97 from figure 1.31.

In 1980, the yield = 48% and the Die Area = 0.16 from figure In 1992, the yield = 48% and the Die Area = 0.97 from figure 1.31. CS152 Homework 1 Solutions Spring 2004 1.51 Yield = 1 / ((1 + (Defects per area * Die Area / 2))^2) Thus, if die area increases, defects per area must decrease. 1.52 Solving the yield equation for Defects

More information

Announcements. EE141- Spring 2003 Lecture 8. Power Inverter Chain

Announcements. EE141- Spring 2003 Lecture 8. Power Inverter Chain - Spring 2003 Lecture 8 Power Inverter Chain Announcements Homework 3 due today. Homework 4 will be posted later today. Special office hours from :30-3pm at BWRC (in lieu of Tuesday) Today s lecture Power

More information

EE115C Digital Electronic Circuits Homework #4

EE115C Digital Electronic Circuits Homework #4 EE115 Digital Electronic ircuits Homework #4 Problem 1 Power Dissipation Solution Vdd =1.0V onsider the source follower circuit used to drive a load L =20fF shown above. M1 and M2 are both NMOS transistors

More information

CMP 338: Third Class

CMP 338: Third Class CMP 338: Third Class HW 2 solution Conversion between bases The TINY processor Abstraction and separation of concerns Circuit design big picture Moore s law and chip fabrication cost Performance What does

More information

CMP N 301 Computer Architecture. Appendix C

CMP N 301 Computer Architecture. Appendix C CMP N 301 Computer Architecture Appendix C Outline Introduction Pipelining Hazards Pipelining Implementation Exception Handling Advanced Issues (Dynamic Scheduling, Out of order Issue, Superscalar, etc)

More information

School of EECS Seoul National University

School of EECS Seoul National University 4!4 07$ 8902808 3 School of EECS Seoul National University Introduction Low power design 3974/:.9 43 Increasing demand on performance and integrity of VLSI circuits Popularity of portable devices Low power

More information

Administrivia. Course Objectives. Overview. Lecture Notes Week markem/cs333/ 2. Staff. 3. Prerequisites. 4. Grading. 1. Theory and application

Administrivia. Course Objectives. Overview. Lecture Notes Week markem/cs333/ 2. Staff. 3. Prerequisites. 4. Grading. 1. Theory and application Administrivia 1. markem/cs333/ 2. Staff 3. Prerequisites 4. Grading Course Objectives 1. Theory and application 2. Benefits 3. Labs TAs Overview 1. What is a computer system? CPU PC ALU System bus Memory

More information

CMOS scaling rules Power density issues and challenges Approaches to a solution: Dimension scaling alone Scaling voltages as well

CMOS scaling rules Power density issues and challenges Approaches to a solution: Dimension scaling alone Scaling voltages as well 6.01 - Microelectronic Devices and Circuits Lecture 16 - CMOS scaling; The Roadmap - Outline Announcements PS #9 - Will be due next week Friday; no recitation tomorrow. Postings - CMOS scaling (multiple

More information

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

ESE 570: Digital Integrated Circuits and VLSI Fundamentals ESE 570: Digital Integrated Circuits and VLSI Fundamentals Lec 17: March 26, 2019 Energy Optimization & Design Space Exploration Penn ESE 570 Spring 2019 Khanna Lecture Outline! Energy Optimization! Design

More information

The Quantum Landscape

The Quantum Landscape The Quantum Landscape Computational drug discovery employing machine learning and quantum computing Contact us! lucas@proteinqure.com Or visit our blog to learn more @ www.proteinqure.com 2 Applications

More information

EECS 579: Logic and Fault Simulation. Simulation

EECS 579: Logic and Fault Simulation. Simulation EECS 579: Logic and Fault Simulation Simulation: Use of computer software models to verify correctness Fault Simulation: Use of simulation for fault analysis and ATPG Circuit description Input data for

More information

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2) INF2270 Spring 2010 Philipp Häfliger Summary/Repetition (1/2) content From Scalar to Superscalar Lecture Summary and Brief Repetition Binary numbers Boolean Algebra Combinational Logic Circuits Encoder/Decoder

More information

80386DX. 32-Bit Microprocessor FEATURES: DESCRIPTION: Logic Diagram

80386DX. 32-Bit Microprocessor FEATURES: DESCRIPTION: Logic Diagram 32-Bit Microprocessor 21 1 22 1 2 10 3 103 FEATURES: 32-Bit microprocessor RAD-PAK radiation-hardened agait natural space radiation Total dose hardness: - >100 Krad (Si), dependent upon space mission Single

More information

Lecture: Pipelining Basics

Lecture: Pipelining Basics Lecture: Pipelining Basics Topics: Performance equations wrap-up, Basic pipelining implementation Video 1: What is pipelining? Video 2: Clocks and latches Video 3: An example 5-stage pipeline Video 4:

More information

A novel Capacitor Array based Digital to Analog Converter

A novel Capacitor Array based Digital to Analog Converter Chapter 4 A novel Capacitor Array based Digital to Analog Converter We present a novel capacitor array digital to analog converter(dac architecture. This DAC architecture replaces the large MSB (Most Significant

More information

CSE 380 Computer Operating Systems

CSE 380 Computer Operating Systems CSE 380 Computer Operating Systems Instructor: Insup Lee & Dianna Xu University of Pennsylvania, Fall 2003 Lecture Note 3: CPU Scheduling 1 CPU SCHEDULING q How can OS schedule the allocation of CPU cycles

More information

Lecture 5: Performance and Efficiency. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 5: Performance and Efficiency. James C. Hoe Department of ECE Carnegie Mellon University 18 643 Lecture 5: Performance and Efficiency James C. Hoe Department of ECE Carnegie Mellon University 18 643 F17 L05 S1, James C. Hoe, CMU/ECE/CALCM, 2017 18 643 F17 L05 S2, James C. Hoe, CMU/ECE/CALCM,

More information

Chapter 5 CMOS Logic Gate Design

Chapter 5 CMOS Logic Gate Design Chapter 5 CMOS Logic Gate Design Section 5. -To achieve correct operation of integrated logic gates, we need to satisfy 1. Functional specification. Temporal (timing) constraint. (1) In CMOS, incorrect

More information

Stochastic Dynamic Thermal Management: A Markovian Decision-based Approach. Hwisung Jung, Massoud Pedram

Stochastic Dynamic Thermal Management: A Markovian Decision-based Approach. Hwisung Jung, Massoud Pedram Stochastic Dynamic Thermal Management: A Markovian Decision-based Approach Hwisung Jung, Massoud Pedram Outline Introduction Background Thermal Management Framework Accuracy of Modeling Policy Representation

More information

VLSI Signal Processing

VLSI Signal Processing VLSI Signal Processing Lecture 1 Pipelining & Retiming ADSP Lecture1 - Pipelining & Retiming (cwliu@twins.ee.nctu.edu.tw) 1-1 Introduction DSP System Real time requirement Data driven synchronized by data

More information