Computer Architecture


Lecture 2. Iakovos Mavroidis, Computer Science Department, University of Crete

Previous Lecture: CPU Evolution; What is?

Outline
Measurements and metrics: Performance, Cost, Dependability, Power
Guidelines and principles in the design of computers

Major Design Challenges: Everything Looks a Little Different
Power; CPU time; memory latency/bandwidth; storage latency/bandwidth; transactions per second; intercommunication; dependability
Power; performance; communication

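Power Consumption
Charge external capacitance (through $R_P$): the supply delivers $Q = C_L V_{DD}$, so $E_{\text{dynamic}} = Q V_{DD} = C_L V_{DD}^2$. Half of it, $\frac{1}{2} E_{\text{dynamic}}$, becomes thermal energy on $R_P$; the other half is stored on $C_L$ (since $E_{C_L} = \frac{1}{2} C_L V_{DD}^2$).
Discharge external capacitance (through $R_N$): the $\frac{1}{2} E_{\text{dynamic}}$ stored on $C_L$ becomes thermal energy on $R_N$.
$P_{\text{dynamic}} = \frac{1}{2} C_L V_{DD}^2 \times \text{frequency}$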
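A minimal Python sketch of the dynamic power formula on this slide. The capacitance, voltage, and frequency values are hypothetical, and "frequency" is taken to mean switching transitions per second, which is an assumption about the slide's convention. It illustrates that power scales with the square of the supply voltage.

```python
def charge_energy_joules(c_load_farads, vdd_volts):
    """Energy drawn from the supply to charge C_L from 0 V to V_DD: C_L * V_DD^2."""
    return c_load_farads * vdd_volts ** 2


def dynamic_power_watts(c_load_farads, vdd_volts, transitions_per_second):
    """P_dynamic = 1/2 * C_L * V_DD^2 * frequency, as written on the slide."""
    return 0.5 * c_load_farads * vdd_volts ** 2 * transitions_per_second


# Hypothetical values, chosen only for illustration.
C_L = 1e-9    # 1 nF of total switched capacitance
V_DD = 1.0    # volts
FREQ = 1e9    # transitions per second (assumed meaning of "frequency")

p_nominal = dynamic_power_watts(C_L, V_DD, FREQ)
p_scaled = dynamic_power_watts(C_L, 0.8 * V_DD, FREQ)

print(f"nominal: {p_nominal:.3f} W, at 0.8 * V_DD: {p_scaled:.3f} W")
print(f"power ratio after voltage scaling: {p_scaled / p_nominal:.2f}")
```

Lowering the supply voltage by 20% cuts dynamic power to about 64% of its nominal value in this sketch, because the voltage term is squared.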

Measuring Power

Power and Energy
Energy to complete operation (Joules): corresponds approximately to battery life. (Battery energy capacity actually depends on rate of discharge.)
Peak power dissipation (Watts = Joules/second): affects packaging (power and ground pins, thermal design).
di/dt, peak change in supply current (Amps/second): affects power supply noise (power and ground pins, decoupling capacitors).

Peak Power versus Lower Energy
[Figure: power versus time for systems A and B, with Peak A and Peak B marked.]
Integrate power curve to get energy.
System A has higher peak power, but lower total energy.
System B has lower peak power, but higher total energy.
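The idea of integrating the power curve can be sketched as follows. The two power traces are made-up samples, not data from the slide, and the trapezoidal rule is just one simple way to approximate the integral.

```python
def energy_joules(times_s, power_w):
    """Approximate the integral of power over time with the trapezoidal rule."""
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Hypothetical power samples (watts) taken once per second.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
power_a = [0.0, 10.0, 0.0, 0.0, 0.0]   # short, tall burst
power_b = [0.0, 4.0, 4.0, 4.0, 0.0]    # long, flat plateau

energy_a = energy_joules(times, power_a)
energy_b = energy_joules(times, power_b)

print(f"A: peak {max(power_a)} W, energy {energy_a} J")
print(f"B: peak {max(power_b)} W, energy {energy_b} J")
```

With these samples, system A has the higher peak power but the lower total energy, and system B the reverse.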

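Measuring Reliability (Dependability)
FIT (failures in time) = failures per $10^9$ hours of operation = $10^9 / \text{MTTF}$, with MTTF in hours
(MTBF = MTTF + MTTR)
MTTF = 1,000,000 hours, FIT = ?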
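The question on this slide can be worked through with a short Python sketch. It assumes the usual definition of FIT as failures per 10^9 hours of operation. The MTTR value is hypothetical and is only there to illustrate the MTBF relation.

```python
HOURS_PER_FIT_WINDOW = 10 ** 9  # FIT counts failures per 10^9 device-hours


def fit_from_mttf(mttf_hours):
    """Failures in time: expected failures per 10^9 hours, given MTTF in hours."""
    return HOURS_PER_FIT_WINDOW / mttf_hours


def mtbf_hours(mttf_hours, mttr_hours):
    """Mean time between failures = mean time to failure + mean time to repair."""
    return mttf_hours + mttr_hours


mttf = 1_000_000   # hours, from the slide
mttr = 24          # hours, hypothetical repair time

print(f"FIT  = {fit_from_mttf(mttf):.0f}")
print(f"MTBF = {mtbf_hours(mttf, mttr)} hours")
```

Under that definition, an MTTF of 1,000,000 hours corresponds to 1000 FIT.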

Comparing design alternatives

Benchmark Suites (SPEC = Standard Performance Evaluation Corporation): SPECrate, SPECWeb

Summarizing performance

Summarizing performance (cont.): running time of programs

Summarizing performance (cont.): used by SPEC89, SPEC92, SPEC95, ..., SPEC2006
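The formula on this slide did not survive extraction. Assuming it is the geometric mean of execution-time ratios against a reference machine (which is how SPEC CPU suites summarize results, and what the next slide's title suggests), a minimal Python sketch looks like this. All execution times below are hypothetical.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical execution times in seconds for three programs.
reference = [100.0, 200.0, 400.0]
machine_a = [50.0, 100.0, 100.0]
machine_b = [25.0, 200.0, 200.0]

# Ratio = reference time / measured time (higher is better).
ratios_a = [r / t for r, t in zip(reference, machine_a)]
ratios_b = [r / t for r, t in zip(reference, machine_b)]

gm_a = geometric_mean(ratios_a)
gm_b = geometric_mean(ratios_b)

# Comparing A to B directly gives the same answer as dividing the two means,
# so the choice of reference machine does not change the comparison.
direct = geometric_mean([tb / ta for ta, tb in zip(machine_a, machine_b)])

print(f"A: {gm_a:.3f}, B: {gm_b:.3f}, A over B: {gm_a / gm_b:.3f}, direct: {direct:.3f}")
```

The last line illustrates one commonly cited advantage of the geometric mean: the ratio of two geometric means equals the geometric mean of the per-program ratios.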

Pros and cons of geometric means

Qualitative principles of design (Spatial and temporal locality)

Qualitative principles of design (cont.)

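Amdahl's Law
$\text{Speedup}_{\text{overall}} = \dfrac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$
Best possible: $\text{Speedup}_{\text{maximum}} = \dfrac{1}{1 - \text{Fraction}_{\text{enhanced}}}$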

Amdahl's Law example
New CPU 10X faster. I/O bound server, so 60% time waiting for I/O.
$\text{Speedup}_{\text{overall}} = \dfrac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}} = \dfrac{1}{(1 - 0.4) + \dfrac{0.4}{10}} = \dfrac{1}{0.64} = 1.56$
Apparently, it's human nature to be attracted by 10X faster, vs. keeping in perspective it's just 1.6X faster.
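The example above can be checked with a few lines of Python. The function names are illustrative only. The inputs (40% of the time enhanced, 10X speedup of that part) come from the slide.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when only a fraction of execution time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def max_speedup(fraction_enhanced):
    """Upper bound on overall speedup as the enhancement becomes infinitely fast."""
    return 1.0 / (1.0 - fraction_enhanced)


# I/O-bound server: 60% of the time is I/O wait, so only 40% benefits from the CPU.
overall = amdahl_speedup(fraction_enhanced=0.4, speedup_enhanced=10.0)
bound = max_speedup(fraction_enhanced=0.4)

print(f"overall speedup: {overall:.2f}")
print(f"best possible:   {bound:.2f}")
```

Even an infinitely fast CPU would give at most about 1.67X here, since the 60% spent waiting for I/O is untouched.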

Computer Performance: CPU Performance

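Cycles Per Instruction (Throughput)
Average Cycles per Instruction:
$\text{CPI} = \dfrac{\text{CPU Time} \times \text{Clock Rate}}{\text{Instruction Count}} = \dfrac{\text{Cycles}}{\text{Instruction Count}}$
$\text{CPU time} = \text{Cycle Time} \times \sum_{j=1}^{n} \text{CPI}_j \times I_j$
Instruction Frequency:
$\text{CPI} = \sum_{j=1}^{n} \text{CPI}_j \times F_j$, where $F_j = \dfrac{I_j}{\text{Instruction Count}}$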
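A small Python sketch of the two sums on this slide. The instruction classes, counts, per-class CPI values, and clock are all hypothetical. The check at the end shows that the two ways of computing CPU time agree.

```python
def cpu_time_seconds(counts, cpi, cycle_time_s):
    """CPU time = cycle time * sum over classes of CPI_j * I_j."""
    cycles = sum(cpi[op] * counts[op] for op in counts)
    return cycles * cycle_time_s


def average_cpi(counts, cpi):
    """CPI = sum over classes of CPI_j * F_j, with F_j = I_j / instruction count."""
    instruction_count = sum(counts.values())
    return sum(cpi[op] * counts[op] / instruction_count for op in counts)


# Hypothetical program profile.
counts = {"alu": 500, "mem": 300, "branch": 200}   # I_j
cpi = {"alu": 1, "mem": 2, "branch": 3}            # CPI_j
cycle_time = 1e-9                                  # 1 ns cycle, i.e. 1 GHz clock

total_instructions = sum(counts.values())
avg = average_cpi(counts, cpi)
t_direct = cpu_time_seconds(counts, cpi, cycle_time)
t_from_cpi = total_instructions * avg * cycle_time

print(f"average CPI: {avg:.2f}")
print(f"CPU time: {t_direct:.3e} s (direct), {t_from_cpi:.3e} s (via average CPI)")
```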

Example: Calculating CPI bottom up
Run benchmark and collect workload characterization (simulate, machine counters, or sampling).
Base Machine (Reg / Reg), typical mix of instruction types in program:
Op      Freq  CPI_i  F*CPI_i  (% Time)
ALU     50%   1      0.5      (33%)
Load    20%   2      0.4      (27%)
Store   10%   2      0.2      (13%)
Branch  20%   2      0.4      (27%)
Total                1.5
Design guideline: Make the common case fast.
MIPS 1% rule: only consider adding an instruction if it is shown to add 1% performance improvement on reasonable benchmarks.
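The bottom-up calculation can be reproduced in Python. This sketch reads the slide's table as frequency and CPI pairs of ALU 50% / 1, Load 20% / 2, Store 10% / 2, and Branch 20% / 2. That reading is a reconstruction of the flattened table, chosen because it matches the stated total of 1.5 and the time percentages.

```python
# (frequency, CPI) per instruction class, as reconstructed from the slide's table.
mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}

# Overall CPI is the frequency-weighted sum of per-class CPI values.
total_cpi = sum(freq * cpi for freq, cpi in mix.values())

# Share of execution time per class = its CPI contribution / overall CPI.
time_share = {op: freq * cpi / total_cpi for op, (freq, cpi) in mix.items()}

for op, (freq, cpi) in mix.items():
    print(f"{op:<7} {freq:>4.0%}  CPI {cpi}  F*CPI {freq * cpi:.1f}  ({time_share[op]:.0%})")
print(f"Total CPI: {total_cpi:.1f}")
```

ALU instructions make up half of the mix but only a third of the execution time, which is why the time column, not the frequency column, shows where an optimization would pay off.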

Processor Performance

TPM (Transactions Per Minute)
Price/performance: TPM * $1000 / cost
What about maintenance and power?

Conclusion

Next Lecture: Pipelining
[Figure: pipelined datapath. Stages: Instruction Fetch; Instr. Decode / Reg. Fetch; Execute / Addr. Calc; Memory Access; Write Back. Pipeline registers: IF/ID, ID/EX, EX/MEM, MEM/WB.]
IR <= mem[PC]; PC <= PC + 4
A <= Reg[IR_rs]; B <= Reg[IR_rt]
rslt <= A op_IRop B
WB <= rslt
Reg[IR_rd] <= WB
Data stationary control: local decode for each instruction phase / pipeline stage.