Microprocessor Power Analysis by Labeled Simulation
|
|
- Abel Osborne
- 5 years ago
- Views:
Transcription
1 Microprocessor Power Analysis by Labeled Simulation Cheng-Ta Hsieh, Kevin Chen and Massoud Pedram University of Southern California Dept. of EE-Systems Los Angeles CA 989 Outline! Introduction! Problem Formulation! Source and Sink! Architecture Patterns! Propagation Rules! Generalizations! Conclusions
2 Macro-Analysis vs. Micro-Analysis! Macro-Analysis can answer the questions:! How long the battery of a notebook computer can last if we run the Internet Explorer?! Is MIPS more power-efficient than Strong ARM for running Windows CE 2.?! Micro-Analysis can answer the questions:! What is the power consumption of a branch instruction, or a compare instruction?! How much power is consumed in some component for a certain instruction? Instruction Level Macro Modeling! Instruction Base Costs! Individual instruction! Effect of Circuit State! Consecutive instruction pair! Inter-Instruction Effects! Pipeline stall, cache misses 2
3 Review of Instruction Level Model Power Macro Model Instruction Base Power Program Consecutive Instruction Pair Power Program Instruction Stall and Cache Miss Power etc. Base Pair Other E = ( B N ) + ( O N ) + E P i i i, j i, j k i i, j k Problem Formulation! Given N gates, g,g 2, g n in a processor and k active instructions,,,i k! For each gate g i, find an instruction set or a labeling, L j ={, I m, such that the energy consumption of the gate in the current clock cycle is caused by instructions in L i! Calculate instruction power consumption by label propagation L j Gate j 3
4 Cycle-Accurate Energy Calculation! Let gates(i) denote the set of label indices that contain instruction I! Energy consumed by instruction I in the current clock cycle is: 2 EI () = CV j ddswj 2 L j gates( I) j An Example Pipeline IF ID EX MEM WB Control Control Control PC. File ALU 4
5 Example (cont d) I 4 I 5 I 4 A Simple Propagation Rule {I x {I x { {I x { { {I x {I x { {I x D SET CLR Q Q SET D Q {I x CLR Q Next Clock Cycle 5
6 Definitions! Source! Set of gates (or wires) from which the labels are originated! Sink! Set of gates (or flip-flops) where the instruction label is dropped! Label Propagation Rule! A description of how the labels are propagated through FSM s, MUX s, FF s, and primitive gates Architecture Pattern! Name! The handle that is used to described the intended architecture effect (e.g., pipeline-flush, pipeline-stall, data-forwarding)! Description! Explanation of how the pattern is caused and how the processor reacts to the pattern! Liable Set! Set of instructions that are responsible for the power dissipation caused by the architecture pattern! Required Rule! Specification of how the propagation rule should work in response to the pattern 6
7 An Architecture Pattern Example! Name! Streamlined Execution.! Description! Each pipeline stage performs the operation specified by the incoming instruction! Liable Set! The instruction being executed in a pipeline stage is responsible for the power dissipation of that stage! Required Rule! The instruction label in a pipeline is transferred to the next stage in the next clock cycle Feasible Labeling Problem! Given a! set of architecture patterns.! Find the! set of sources! setofsinks! set of label propagation rules! that satisfy all the required rules in the set of architecture patterns 7
8 Implication by Domination! An architecture pattern is dominated by a combination of other patterns if its required rules are covered by the required rules of these architecture patterns! A labeling scheme is feasible for a target processor if all of the architecture patterns of that processor are dominated by the architecture patterns that are captured by the labeling scheme Source PC 44 addr 4 : add $,$2,$3 44 : sub $4,$5,$6 48 : sti $8,$7, 52 I 4 : muli $5,$7,4 56 I 5 : lw $24, ($6)... { Instruction Pipeline Label Source 8
9 Sink add $3,$,$3 mov $, write reg number write Write Back mov $, -to-32 decoder write data (imm value : ) Q register D Q register D... Q D Q register 3 D Instruction Decode read reg num M U X M U X read reg num 2 add $3, $, $3 read data read data 2 Journey of a MIPS Instruction PC Write SR Write REG Read ALU MEM Read REG Write MEM Write 9
10 Pipeline-Stall Pattern sub and sub and sub and sub I2 sub $2,$,$3 and $2,$2,$5 and Hazard Detection Circuit src(i 4 ) dst( ) dst( ) dst( ) {I 4 { == == == Hazard Detection {I 5 {I 4 {I 4 {I 4, {I 4 { Credit Both {I 4, Credit Last {I 4 I 5 I 4
11 Data Forwarding Pattern ID. File M U X M U X EX MEM WB { { { ALU Forwarding Control { src(i 3 ) dst(i 2 ) dst( ) == == { { { { Pipeline Flush Pattern 4 beq $,$3,28 44 and $2,$2,$5 48 or $3,$6,$2 52 add $4,$2,$2 72 lw $4, 5($7)
12 Pipeline Flush Control Circuit control control control Flush from prev. stage L 2 L out to next stage hazard detected L hazard L flush =L Flush L predicted condition actual condition L or{ Queuing Pattern L a ={I 4 L c L b queue L b busy wait busy operands not available 2
13 General Propagation Rules " Primitive Gates in in 2 L 2-input gate L out L 2 " OR Gate in in 2 L out L L L 2 L Priority Rule: If L ={I i, L 2 ={I j, then L ={I max(i,j) Union Rule: L =L L 2 Rules for Primitive Gates (cont d) " AND Gate " XOR Gate in in 2 L out L L 2 L L in in 2 L out L L L L 3
14 MUX Propagation Rule L L 2 L out L s select select L s ==φ L ==φ L 2 ==φ L out x x L s x L x (L s +L ) x x L s x L 2 x (L s ) Generalization Pseudo Instruction Cache Miss Inst Cache (nop) {I cache_miss I 4 I 4 (nop) hazard_detected 4
15 Generalization - Instruction Splitting An ARM arithmetic instruction cond op S Rd Rn shifter_operand Operation if ConditionPassed (<cond>) then Rd = Rn <op> <shifter_operand> if S == and Rd == R5 then CPSR = SPSR else if S == then NFlag=Rd[3] ZFlag=ifRd==thenelse C Flag = CarryFrom (Rn + <shifter_operand>) V Flag = OverflowFrom (Rn + <shifter_operand>) A Design Example: ZILOG DSP CORE 5
16 Energy Calculation P = E sw + C Vdd sw x y ( ini ini ) ( oni i oni ) n= n= 2 Cell Power Net Power P = power dissipation for current clock cycle (µj); x = number of input pins; E in = energy associated with the n th input pin (µw/mhz) y = number of output pins; C on = external capacitive loading Experimental Results Instruction Class Average Energy( -8 J) Instruction Count NON.53 - SL MAC CTRL. 3 CAS.47 7 ALF
17 Summary! Proposed technique reports cycle-accurate (finegrain) power consumption for each instruction being executed in a pipelined (superscalar) machine! Proposed technique helps identify power problems during the processor design phase! Proposed technique is verified against MIPS, ARM, Pentium microprocessor, and a Zilog DSP! Pseudo instruction and instruction splitting are useful for building a high-level macro-model that accounts for hard-to-capture power effects 7
3. (2) What is the difference between fixed and hybrid instructions?
1. (2 pts) What is a "balanced" pipeline? 2. (2 pts) What are the two main ways to define performance? 3. (2) What is the difference between fixed and hybrid instructions? 4. (2 pts) Clock rates have grown
More informationCMP N 301 Computer Architecture. Appendix C
CMP N 301 Computer Architecture Appendix C Outline Introduction Pipelining Hazards Pipelining Implementation Exception Handling Advanced Issues (Dynamic Scheduling, Out of order Issue, Superscalar, etc)
More informationEXAMPLES 4/12/2018. The MIPS Pipeline. Hazard Summary. Show the pipeline diagram. Show the pipeline diagram. Pipeline Datapath and Control
The MIPS Pipeline CSCI206 - Computer Organization & Programming Pipeline Datapath and Control zybook: 11.6 Developed and maintained by the Bucknell University Computer Science Department - 2017 Hazard
More informationProject Two RISC Processor Implementation ECE 485
Project Two RISC Processor Implementation ECE 485 Chenqi Bao Peter Chinetti November 6, 2013 Instructor: Professor Borkar 1 Statement of Problem This project requires the design and test of a RISC processor
More information4. (3) What do we mean when we say something is an N-operand machine?
1. (2) What are the two main ways to define performance? 2. (2) When dealing with control hazards, a prediction is not enough - what else is necessary in order to eliminate stalls? 3. (3) What is an "unbalanced"
More informationComputer Architecture
Lecture 2: Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture CPU Evolution What is? 2 Outline Measurements and metrics : Performance, Cost, Dependability, Power Guidelines
More informationCSCI-564 Advanced Computer Architecture
CSCI-564 Advanced Computer Architecture Lecture 8: Handling Exceptions and Interrupts / Superscalar Bo Wu Colorado School of Mines Branch Delay Slots (expose control hazard to software) Change the ISA
More information1. (2 )Clock rates have grown by a factor of 1000 while power consumed has only grown by a factor of 30. How was this accomplished?
1. (2 )Clock rates have grown by a factor of 1000 while power consumed has only grown by a factor of 30. How was this accomplished? 2. (2 )What are the two main ways to define performance? 3. (2 )What
More informationICS 233 Computer Architecture & Assembly Language
ICS 233 Computer Architecture & Assembly Language Assignment 6 Solution 1. Identify all of the RAW data dependencies in the following code. Which dependencies are data hazards that will be resolved by
More informationComputer Engineering Department. CC 311- Computer Architecture. Chapter 4. The Processor: Datapath and Control. Single Cycle
Computer Engineering Department CC 311- Computer Architecture Chapter 4 The Processor: Datapath and Control Single Cycle Introduction The 5 classic components of a computer Processor Input Control Memory
More informationComputer Architecture ELEC2401 & ELEC3441
Last Time Pipeline Hazard Computer Architecture ELEC2401 & ELEC3441 Lecture 8 Pipelining (3) Dr. Hayden Kwok-Hay So Department of Electrical and Electronic Engineering Structural Hazard Hazard Control
More informationProcessor Design & ALU Design
3/8/2 Processor Design A. Sahu CSE, IIT Guwahati Please be updated with http://jatinga.iitg.ernet.in/~asahu/c22/ Outline Components of CPU Register, Multiplexor, Decoder, / Adder, substractor, Varity of
More information[2] Predicting the direction of a branch is not enough. What else is necessary?
[2] When we talk about the number of operands in an instruction (a 1-operand or a 2-operand instruction, for example), what do we mean? [2] What are the two main ways to define performance? [2] Predicting
More information[2] Predicting the direction of a branch is not enough. What else is necessary?
[2] What are the two main ways to define performance? [2] Predicting the direction of a branch is not enough. What else is necessary? [2] The power consumed by a chip has increased over time, but the clock
More informationECE 3401 Lecture 23. Pipeline Design. State Table for 2-Cycle Instructions. Control Unit. ISA: Instruction Specifications (for reference)
ECE 3401 Lecture 23 Pipeline Design Control State Register Combinational Control Logic New/ Modified Control Word ISA: Instruction Specifications (for reference) P C P C + 1 I N F I R M [ P C ] E X 0 PC
More informationLecture 13: Sequential Circuits, FSM
Lecture 13: Sequential Circuits, FSM Today s topics: Sequential circuits Finite state machines 1 Clocks A microprocessor is composed of many different circuits that are operating simultaneously if each
More informationUNIVERSITY OF WISCONSIN MADISON
CS/ECE 252: INTRODUCTION TO COMPUTER ENGINEERING UNIVERSITY OF WISCONSIN MADISON Prof. Gurindar Sohi TAs: Minsub Shin, Lisa Ossian, Sujith Surendran Midterm Examination 2 In Class (50 minutes) Friday,
More informationSimple Instruction-Pipelining. Pipelined Harvard Datapath
6.823, L8--1 Simple ruction-pipelining Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 Pipelined Harvard path 6.823, L8--2. I fetch decode & eg-fetch execute memory Clock period
More informationPerformance, Power & Energy. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So
Performance, Power & Energy ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So Recall: Goal of this class Performance Reconfiguration Power/ Energy H. So, Sp10 Lecture 3 - ELEC8106/6102 2 PERFORMANCE EVALUATION
More informationCprE 281: Digital Logic
CprE 28: Digital Logic Instructor: Alexander Stoytchev http://www.ece.iastate.edu/~alexs/classes/ Simple Processor CprE 28: Digital Logic Iowa State University, Ames, IA Copyright Alexander Stoytchev Digital
More informationPerformance, Power & Energy
Recall: Goal of this class Performance, Power & Energy ELE8106/ELE6102 Performance Reconfiguration Power/ Energy Spring 2010 Hayden Kwok-Hay So H. So, Sp10 Lecture 3 - ELE8106/6102 2 What is good performance?
More informationVerilog HDL:Digital Design and Modeling. Chapter 11. Additional Design Examples. Additional Figures
Chapter Additional Design Examples Verilog HDL:Digital Design and Modeling Chapter Additional Design Examples Additional Figures Chapter Additional Design Examples 2 Page 62 a b y y 2 y 3 c d e f Figure
More informationEECS150 - Digital Design Lecture 21 - Design Blocks
EECS150 - Digital Design Lecture 21 - Design Blocks April 3, 2012 John Wawrzynek Spring 2012 EECS150 - Lec21-db3 Page 1 Fixed Shifters / Rotators fixed shifters hardwire the shift amount into the circuit.
More informationLecture 13: Sequential Circuits, FSM
Lecture 13: Sequential Circuits, FSM Today s topics: Sequential circuits Finite state machines Reminder: midterm on Tue 2/28 will cover Chapters 1-3, App A, B if you understand all slides, assignments,
More informationPipelining. Traditional Execution. CS 365 Lecture 12 Prof. Yih Huang. add ld beq CS CS 365 2
Pipelining CS 365 Lecture 12 Prof. Yih Huang CS 365 1 Traditional Execution 1 2 3 4 1 2 3 4 5 1 2 3 add ld beq CS 365 2 1 Pipelined Execution 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
More informationCS 52 Computer rchitecture and Engineering Lecture 4 - Pipelining Krste sanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste! http://inst.eecs.berkeley.edu/~cs52!
More informationWorst-Case Execution Time Analysis. LS 12, TU Dortmund
Worst-Case Execution Time Analysis Prof. Dr. Jian-Jia Chen LS 12, TU Dortmund 02, 03 May 2016 Prof. Dr. Jian-Jia Chen (LS 12, TU Dortmund) 1 / 53 Most Essential Assumptions for Real-Time Systems Upper
More informationSpiral 1 / Unit 3
-3. Spiral / Unit 3 Minterm and Maxterms Canonical Sums and Products 2- and 3-Variable Boolean Algebra Theorems DeMorgan's Theorem Function Synthesis use Canonical Sums/Products -3.2 Outcomes I know the
More informationECE290 Fall 2012 Lecture 22. Dr. Zbigniew Kalbarczyk
ECE290 Fall 2012 Lecture 22 Dr. Zbigniew Kalbarczyk Today LC-3 Micro-sequencer (the control store) LC-3 Micro-programmed control memory LC-3 Micro-instruction format LC -3 Micro-sequencer (the circuitry)
More informationImplementing the Controller. Harvard-Style Datapath for DLX
6.823, L6--1 Implementing the Controller Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 6.823, L6--2 Harvard-Style Datapath for DLX Src1 ( j / ~j ) Src2 ( R / RInd) RegWrite MemWrite
More informationCPSC 3300 Spring 2017 Exam 2
CPSC 3300 Spring 2017 Exam 2 Name: 1. Matching. Write the correct term from the list into each blank. (2 pts. each) structural hazard EPIC forwarding precise exception hardwired load-use data hazard VLIW
More informationDesign for Testability
Design for Testability Outline Ad Hoc Design for Testability Techniques Method of test points Multiplexing and demultiplexing of test points Time sharing of I/O for normal working and testing modes Partitioning
More informationDesign for Testability
Design for Testability Outline Ad Hoc Design for Testability Techniques Method of test points Multiplexing and demultiplexing of test points Time sharing of I/O for normal working and testing modes Partitioning
More informationSimple Instruction-Pipelining. Pipelined Harvard Datapath
6.823, L8--1 Simple ruction-pipelining Updated March 6, 2000 Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 Pipelined Harvard path 6.823, L8--2. fetch decode & eg-fetch execute
More informationCSE 140 Midterm 3 version A Tajana Simunic Rosing Spring 2015
CSE 140 Midterm 3 version A Tajana Simunic Rosing Spring 2015 Name of the person on your left : Name of the person on your right: 1. 20 points 2. 20 points 3. 20 points 4. 15 points 5. 15 points 6. 10
More informationCSE 140 Midterm 2 Tajana Simunic Rosing. Spring 2008
CSE 14 Midterm 2 Tajana Simunic Rosing Spring 28 Do not start the exam until you are told to. Turn off any cell phones or pagers. Write your name and PID at the top of every page. Do not separate the pages.
More informationNCU EE -- DSP VLSI Design. Tsung-Han Tsai 1
NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 Multi-processor vs. Multi-computer architecture µp vs. DSP RISC vs. DSP RISC Reduced-instruction-set Register-to-register operation Higher throughput by using
More informationDesign. Dr. A. Sahu. Indian Institute of Technology Guwahati
CS222: Processor Design: Multi Cycle Design Dr. A. Sahu Dept of Comp. Sc. & Engg. Indian Institute of Technology Guwahati Mid Semester Exam Multi Cycle design Outline Clock periods in single cycle and
More informationEC 413 Computer Organization
EC 413 Computer Organization rithmetic Logic Unit (LU) and Register File Prof. Michel. Kinsy Computing: Computer Organization The DN of Modern Computing Computer CPU Memory System LU Register File Disks
More informationKing Fahd University of Petroleum and Minerals College of Computer Science and Engineering Computer Engineering Department
King Fahd University of Petroleum and Minerals College of Computer Science and Engineering Computer Engineering Department Page 1 of 13 COE 202: Digital Logic Design (3-0-3) Term 112 (Spring 2012) Final
More informationCPU DESIGN The Single-Cycle Implementation
CSE 202 Computer Organization CPU DESIGN The Single-Cycle Implementation Shakil M. Khan (adapted from Prof. H. Roumani) Dept of CS & Eng, York University Sequential vs. Combinational Circuits Digital circuits
More informationCMU Introduction to Computer Architecture, Spring 2015 HW 2: ISA Tradeoffs, Microprogramming and Pipelining
CMU 18-447 Introduction to Computer Architecture, Spring 2015 HW 2: ISA Tradeoffs, Microprogramming and Pipelining Instructor: Prof Onur Mutlu TAs: Rachata Ausavarungnirun, Kevin Chang, Albert Cho, Jeremie
More informationDigital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.
Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1 A Generic Digital Processor MEMORY INPUT-OUTPUT CONTROL DATAPATH
More informationCircuit Modeling for Practical Many-core Architecture Design Exploration
Circuit Modeling for Practical Many-core Architecture Design Exploration Redefining design abstractions Dean Truong Bevan Baas VLSI Computation Lab University of California, Davis Outline Motivation Circuit
More informationALU A functional unit
ALU A functional unit that performs arithmetic operations such as ADD, SUB, MPY logical operations such as AND, OR, XOR, NOT on given data types: 8-,16-,32-, or 64-bit values A n-1 A n-2... A 1 A 0 B n-1
More informationCSE Computer Architecture I
Execution Sequence Summary CSE 30321 Computer Architecture I Lecture 17 - Multi Cycle Control Michael Niemier Department of Computer Science and Engineering Step name Instruction fetch Instruction decode/register
More informationL07-L09 recap: Fundamental lesson(s)!
L7-L9 recap: Fundamental lesson(s)! Over the next 3 lectures (using the IPS ISA as context) I ll explain:! How functions are treated and processed in assembly! How system calls are enabled in assembly!
More informationCOE 328 Final Exam 2008
COE 328 Final Exam 2008 1. Design a comparator that compares a 4 bit number A to a 4 bit number B and gives an Output F=1 if A is not equal B. You must use 2 input LUTs only. 2. Given the following logic
More informationCMPEN 411 VLSI Digital Circuits Spring Lecture 19: Adder Design
CMPEN 411 VLSI Digital Circuits Spring 2011 Lecture 19: Adder Design [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp11 CMPEN 411 L19
More informationLecture 34: Portable Systems Technology Background Professor Randy H. Katz Computer Science 252 Fall 1995
Lecture 34: Portable Systems Technology Background Professor Randy H. Katz Computer Science 252 Fall 1995 RHK.F95 1 Technology Trends: Microprocessor Capacity 100000000 10000000 Pentium Transistors 1000000
More informationReview for Final Exam
CSE140: Components and Design Techniques for Digital Systems Review for Final Exam Mohsen Imani CAPE Please submit your evaluations!!!! RTL design Use the RTL design process to design a system that has
More informationCSE140: Components and Design Techniques for Digital Systems. Midterm Information. Instructor: Mohsen Imani. Sources: TSR, Katz, Boriello & Vahid
CSE140: Components and Design Techniques for Digital Systems Midterm Information Instructor: Mohsen Imani Midterm Topics In general: everything that was covered in homework 1 and 2 and related lectures,
More informationCOVER SHEET: Problem#: Points
EEL 4712 Midterm 3 Spring 2017 VERSION 1 Name: UFID: Sign here to give permission for your test to be returned in class, where others might see your score: IMPORTANT: Please be neat and write (or draw)
More informationALU, Latches and Flip-Flops
CSE14: Components and Design Techniques for Digital Systems ALU, Latches and Flip-Flops Tajana Simunic Rosing Where we are. Last time: ALUs Plan for today: ALU example, latches and flip flops Exam #1 grades
More informationENEE350 Lecture Notes-Weeks 14 and 15
Pipelining & Amdahl s Law ENEE350 Lecture Notes-Weeks 14 and 15 Pipelining is a method of processing in which a problem is divided into a number of sub problems and solved and the solu8ons of the sub problems
More informationEECS150 - Digital Design Lecture 11 - Shifters & Counters. Register Summary
EECS50 - Digital Design Lecture - Shifters & Counters February 24, 2003 John Wawrzynek Spring 2005 EECS50 - Lec-counters Page Register Summary All registers (this semester) based on Flip-flops: q 3 q 2
More informationECEN 651: Microprogrammed Control of Digital Systems Department of Electrical and Computer Engineering Texas A&M University
ECEN 651: Microprogrammed Control of Digital Systems Department of Electrical and Computer Engineering Texas A&M University Prof. Mi Lu TA: Ehsan Rohani Laboratory Exercise #4 MIPS Assembly and Simulation
More informationEECS150 - Digital Design Lecture 25 Shifters and Counters. Recap
EECS150 - Digital Design Lecture 25 Shifters and Counters Nov. 21, 2013 Prof. Ronald Fearing Electrical Engineering and Computer Sciences University of California, Berkeley (slides courtesy of Prof. John
More informationFundamentals of Computer Systems
Fundamentals of Computer Systems Review for the Final Stephen A. Edwards Columbia University Summer 25 The Final 2 hours 8 problems Closed book Simple calculators are OK, but unnecessary One double-sided
More informationProfessor Lee, Yong Surk. References. Topics Microprocessor & microcontroller. High Performance Microprocessor Architecture Overview
This lecture was mae by a generous contribution of C & S Technology corporation. (http://( http://www.cnstec.com) There is no copyright on this lecture. Processor Laboratory homepage, (http://( http://mpu.yonsei.ac.kr)
More informationEE 660: Computer Architecture Out-of-Order Processors
EE 660: Computer Architecture Out-of-Order Processors Yao Zheng Department of Electrical Engineering University of Hawaiʻi at Mānoa Based on the slides of Prof. David entzlaff Agenda I4 Processors I2O2
More informationDesign at the Register Transfer Level
Week-7 Design at the Register Transfer Level Algorithmic State Machines Algorithmic State Machine (ASM) q Our design methodologies do not scale well to real-world problems. q 232 - Logic Design / Algorithmic
More informationReview: Designing with FSM. EECS Components and Design Techniques for Digital Systems. Lec 09 Counters Outline.
Review: esigning with FSM EECS 150 - Components and esign Techniques for igital Systems Lec 09 Counters 9-28-0 avid Culler Electrical Engineering and Computer Sciences University of California, Berkeley
More informationIssue = Select + Wakeup. Out-of-order Pipeline. Issue. Issue = Select + Wakeup. OOO execution (2-wide) OOO execution (2-wide)
Out-of-order Pipeline Buffer of instructions Issue = Select + Wakeup Select N oldest, read instructions N=, xor N=, xor and sub Note: ma have execution resource constraints: i.e., load/store/fp Fetch Decode
More informationPERFORMANCE METRICS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah
PERFORMANCE METRICS Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Jan. 17 th : Homework 1 release (due on Jan.
More informationINF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)
INF2270 Spring 2010 Philipp Häfliger Summary/Repetition (1/2) content From Scalar to Superscalar Lecture Summary and Brief Repetition Binary numbers Boolean Algebra Combinational Logic Circuits Encoder/Decoder
More informationMenu. Excitation Tables (Bonus Slide) EEL3701 EEL3701. Registers, RALU, Asynch, Synch
Menu Registers >Storage Registers >Shift Registers More LSI Components >Arithmetic-Logic Units (ALUs) > Carry-Look-Ahead Circuitry (skip this) Asynchronous versus Synchronous Look into my... 1 Excitation
More informationCS/COE0447: Computer Organization
CS/COE0447: Computer Organization and Assembly Language Logic Design Review Sangyeun Cho Dept. of Computer Science Logic design? Digital hardware is implemented by way of logic design Digital circuits
More informationComputer Architecture. ECE 361 Lecture 5: The Design Process & ALU Design. 361 design.1
Computer Architecture ECE 361 Lecture 5: The Design Process & Design 361 design.1 Quick Review of Last Lecture 361 design.2 MIPS ISA Design Objectives and Implications Support general OS and C- style language
More informationA Second Datapath Example YH16
A Second Datapath Example YH16 Lecture 09 Prof. Yih Huang S365 1 A 16-Bit Architecture: YH16 A word is 16 bit wide 32 general purpose registers, 16 bits each Like MIPS, 0 is hardwired zero. 16 bit P 16
More informationNumbers and Arithmetic
Numbers and Arithmetic See: P&H Chapter 2.4 2.6, 3.2, C.5 C.6 Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Big Picture: Building a Processor memory inst register file alu
More informationDigital System Clocking: High-Performance and Low-Power Aspects. Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M.
Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience and IEEE Press, January 2003 Nov. 14,
More informationAmdahl's Law. Execution time new = ((1 f) + f/s) Execution time. S. Then:
Amdahl's Law Useful for evaluating the impact of a change. (A general observation.) Insight: Improving a feature cannot improve performance beyond the use of the feature Suppose we introduce a particular
More informationReview: Designing with FSM. EECS Components and Design Techniques for Digital Systems. Lec09 Counters Outline.
Review: Designing with FSM EECS 150 - Components and Design Techniques for Digital Systems Lec09 Counters 9-28-04 David Culler Electrical Engineering and Computer Sciences University of California, Berkeley
More informationLecture 12: Energy and Power. James C. Hoe Department of ECE Carnegie Mellon University
18 447 Lecture 12: Energy and Power James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L12 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today a working understanding of
More informationFundamentals of Digital Design
Fundamentals of Digital Design Digital Radiation Measurement and Spectroscopy NE/RHP 537 1 Binary Number System The binary numeral system, or base-2 number system, is a numeral system that represents numeric
More informationVHDL DESIGN AND IMPLEMENTATION OF C.P.U BY REVERSIBLE LOGIC GATES
VHDL DESIGN AND IMPLEMENTATION OF C.P.U BY REVERSIBLE LOGIC GATES 1.Devarasetty Vinod Kumar/ M.tech,2. Dr. Tata Jagannadha Swamy/Professor, Dept of Electronics and Commn. Engineering, Gokaraju Rangaraju
More informationNumbers and Arithmetic
Numbers and Arithmetic See: P&H Chapter 2.4 2.6, 3.2, C.5 C.6 Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Big Picture: Building a Processor memory inst register file alu
More informationState and Finite State Machines
State and Finite State Machines See P&H Appendix C.7. C.8, C.10, C.11 Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Big Picture: Building a Processor memory inst register
More informationMicro-architecture Pipelining Optimization with Throughput- Aware Floorplanning
Micro-architecture Pipelining Optimization with Throughput- Aware Floorplanning Yuchun Ma* Zhuoyuan Li* Jason Cong Xianlong Hong Glenn Reinman Sheqin Dong* Qiang Zhou *Department of Computer Science &
More informationSchool of EECS Seoul National University
4!4 07$ 8902808 3 School of EECS Seoul National University Introduction Low power design 3974/:.9 43 Increasing demand on performance and integrity of VLSI circuits Popularity of portable devices Low power
More informationDigital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.
Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1 A Generic Digital Processor MEM ORY INPUT-OUTPUT CONTROL DATAPATH
More informationSimple Instruction-Pipelining (cont.) Pipelining Jumps
6.823, L9--1 Simple ruction-pipelining (cont.) + Interrupts Updated March 6, 2000 Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 Src1 ( j / ~j ) Src2 ( / Ind) Pipelining Jumps
More informationDepartment of Electrical and Computer Engineering University of Wisconsin - Madison. ECE/CS 752 Advanced Computer Architecture I.
Last (family) name: Solution First (given) name: Student I.D. #: Department of Electrical and Computer Engineering University of Wisconsin - Madison ECE/CS 752 Advanced Computer Architecture I Midterm
More informationFall 2008 CSE Qualifying Exam. September 13, 2008
Fall 2008 CSE Qualifying Exam September 13, 2008 1 Architecture 1. (Quan, Fall 2008) Your company has just bought a new dual Pentium processor, and you have been tasked with optimizing your software for
More informationFall 2011 Prof. Hyesoon Kim
Fall 2011 Prof. Hyesoon Kim Add: 2 cycles FE_stage add r1, r2, r3 FE L ID L EX L MEM L WB L add add sub r4, r1, r3 sub sub add add mul r5, r2, r3 mul sub sub add add mul sub sub add add mul sub sub add
More informationIntroduction EE 224: INTRODUCTION TO DIGITAL CIRCUITS & COMPUTER DESIGN. Lecture 6: Sequential Logic 3 Registers & Counters 5/9/2010
EE 224: INTROUCTION TO IGITAL CIRCUITS & COMPUTER ESIGN Lecture 6: Sequential Logic 3 Registers & Counters 05/10/2010 Avinash Kodi, kodi@ohio.edu Introduction 2 A Flip-Flop stores one bit of information
More informationEECS150 - Digital Design Lecture 27 - misc2
EECS150 - Digital Design Lecture 27 - misc2 May 1, 2002 John Wawrzynek Spring 2002 EECS150 - Lec27-misc2 Page 1 Outline Linear Feedback Shift Registers Theory and practice Simple hardware division algorithms
More informationTable of Content. Chapter 11 Dedicated Microprocessors Page 1 of 25
Chapter 11 Dedicated Microprocessors Page 1 of 25 Table of Content Table of Content... 1 11 Dedicated Microprocessors... 2 11.1 Manual Construction of a Dedicated Microprocessor... 3 11.2 FSM + D Model
More informationCS/COE0447: Computer Organization
Logic design? CS/COE0447: Computer Organization and Assembly Language Logic Design Review Digital hardware is implemented by way of logic design Digital circuits process and produce two discrete values:
More informationToday. ESE532: System-on-a-Chip Architecture. Energy. Message. Preclass Challenge: Power. Energy Today s bottleneck What drives Efficiency of
ESE532: System-on-a-Chip Architecture Day 20: November 8, 2017 Energy Today Energy Today s bottleneck What drives Efficiency of Processors, FPGAs, accelerators How does parallelism impact energy? 1 2 Message
More informationCMP 334: Seventh Class
CMP 334: Seventh Class Performance HW 5 solution Averages and weighted averages (review) Amdahl's law Ripple-carry adder circuits Binary addition Half-adder circuits Full-adder circuits Subtraction, negative
More informationSISD SIMD. Flynn s Classification 8/8/2016. CS528 Parallel Architecture Classification & Single Core Architecture C P M
8/8/26 S528 arallel Architecture lassification & Single ore Architecture arallel Architecture lassification A Sahu Dept of SE, IIT Guwahati A Sahu Flynn s lassification SISD Architecture ategories M SISD
More informationCHAPTER log 2 64 = 6 lines/mux or decoder 9-2.* C = C 8 V = C 8 C * 9-4.* (Errata: Delete 1 after problem number) 9-5.
CHPTER 9 2008 Pearson Education, Inc. 9-. log 2 64 = 6 lines/mux or decoder 9-2.* C = C 8 V = C 8 C 7 Z = F 7 + F 6 + F 5 + F 4 + F 3 + F 2 + F + F 0 N = F 7 9-3.* = S + S = S + S S S S0 C in C 0 dder
More informationState & Finite State Machines
State & Finite State Machines Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University See P&H Appendix C.7. C.8, C.10, C.11 Big Picture: Building a Processor memory inst register file
More informationEECS Components and Design Techniques for Digital Systems. FSMs 9/11/2007
EECS 150 - Components and Design Techniques for Digital Systems FSMs 9/11/2007 Sarah Bird Electrical Engineering and Computer Sciences University of California, Berkeley Slides borrowed from David Culler
More informationGATE 2014 A Brief Analysis (Based on student test experiences in the stream of CS on 1 st March, Second Session)
GATE 4 A Brief Analysis (Based on student test experiences in the stream of CS on st March, 4 - Second Session) Section wise analysis of the paper Mark Marks Total No of Questions Engineering Mathematics
More informationEEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis
EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture Rajeevan Amirtharajah University of California, Davis Outline Announcements Review: PDP, EDP, Intersignal Correlations, Glitching, Top
More informationEXPERIMENT Bit Binary Sequential Multiplier
12.1 Objectives EXPERIMENT 12 12. -Bit Binary Sequential Multiplier Introduction of large digital system design, i.e. data path and control path. To apply the above concepts to the design of a sequential
More informationCS470: Computer Architecture. AMD Quad Core
CS470: Computer Architecture Yashwant K. Malaiya, Professor malaiya@cs.colostate.edu AMD Quad Core 1 Architecture Layers Building blocks Gates, flip-flops Functional bocks: Combinational, Sequential Instruction
More information