Lecture 2
Iakovos Mavroidis, Computer Science Department, University of Crete
Previous Lecture: CPU Evolution; What is?
Outline: Measurements and metrics (Performance, Cost, Dependability, Power); Guidelines and principles in the design of computers
Major Design Challenges
Power; Performance; Communication; Dependability
Metrics shown in the figure: CPU time, memory latency/bandwidth, storage latency/bandwidth, transactions per second, intercommunication
Everything Looks a Little Different
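Power Consumption
Charging the external capacitance (through $R_p$, output goes from 0 V to $V_{DD}$): $Q = C_L V_{DD}$ and $E_{dynamic} = Q V_{DD} = C_L V_{DD}^2$. Half of $E_{dynamic}$ becomes thermal energy on $R_p$ and half is stored on $C_L$ (since $E_{C_L} = \frac{1}{2} C_L V_{DD}^2$).
Discharging the external capacitance (through $R_n$, output goes from $V_{DD}$ to 0 V): the $\frac{1}{2} E_{dynamic}$ stored on $C_L$ becomes thermal energy on $R_n$.
$P_{dynamic} = \frac{1}{2} C_L V_{DD}^2 \times \text{frequency}$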
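
The dynamic power formula above can be tried out numerically. The sketch below is only an illustration: the capacitance, supply voltage, and frequency are hypothetical values that do not come from the lecture, and the function name `dynamic_power` is made up for this example.

```python
def dynamic_power(c_load_farads, v_dd_volts, freq_hz):
    """Dynamic power in watts: P = 1/2 * C_L * V_DD^2 * frequency."""
    return 0.5 * c_load_farads * v_dd_volts ** 2 * freq_hz


# Hypothetical values, chosen only for illustration.
C_LOAD = 1e-9   # 1 nF of switched capacitance
V_DD = 1.0      # 1 V supply
FREQ = 1e9      # 1 GHz switching frequency

baseline = dynamic_power(C_LOAD, V_DD, FREQ)
# Lowering V_DD by 15% at the same frequency scales power by 0.85^2.
scaled = dynamic_power(C_LOAD, 0.85 * V_DD, FREQ)

print(f"baseline: {baseline:.3f} W")
print(f"V_DD reduced by 15%: {scaled:.3f} W ({scaled / baseline:.4f}x)")
```

Because power depends on the square of $V_{DD}$, a modest voltage reduction gives a larger power reduction than the same percentage cut in frequency would.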
Measuring Power
Power and Energy
Energy to complete operation (Joules): corresponds approximately to battery life (battery energy capacity actually depends on rate of discharge).
Peak power dissipation (Watts = Joules/second): affects packaging (power and ground pins, thermal design).
di/dt, peak change in supply current (Amps/second): affects power supply noise (power and ground pins, decoupling capacitors).
Peak Power versus Lower Energy
Figure: power versus time for systems A and B, with peak A above peak B. Integrate the power curve to get energy.
System A has higher peak power, but lower total energy.
System B has lower peak power, but higher total energy.
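
A minimal sketch of "integrate the power curve to get energy", using the trapezoidal rule on sampled power values. The two power traces are hypothetical and were made up so that system A has the higher peak but the lower total energy; they are not measurements from the lecture.

```python
def integrate_trapezoid(times_s, power_w):
    """Approximate energy in joules as the area under a sampled power curve."""
    energy = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return energy


# Hypothetical power traces in watts, sampled once per second.
times = [0, 1, 2, 3, 4, 5, 6]
system_a = [0, 10, 10, 0, 0, 0, 0]   # short burst at high power
system_b = [0, 6, 6, 6, 6, 6, 0]     # longer run at lower power

energy_a = integrate_trapezoid(times, system_a)
energy_b = integrate_trapezoid(times, system_b)

print(f"A: peak {max(system_a)} W, energy {energy_a} J")
print(f"B: peak {max(system_b)} W, energy {energy_b} J")
```

With these traces, peak power would drive the packaging and thermal design for A, while B is the one that drains more battery.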
Measuring Reliability (Dependability)
FIT (failures in time): failures per $10^9$ hours of operation
MTBF = MTTF + MTTR
MTTF = 1,000,000 hours, FIT = ?
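
The question on this slide can be worked out directly, since FIT counts failures per billion hours and is therefore $10^9 / MTTF$. The sketch below also computes MTBF and the usual availability ratio MTTF / (MTTF + MTTR); the availability formula is the standard definition rather than something stated on the slide, and the MTTR value is a hypothetical number chosen for illustration.

```python
HOURS_PER_BILLION = 1e9


def fit_from_mttf(mttf_hours):
    """Failures per 10^9 hours of operation."""
    return HOURS_PER_BILLION / mttf_hours


mttf = 1_000_000          # hours, from the slide
mttr = 24                 # hours, hypothetical repair time

fit = fit_from_mttf(mttf)
mtbf = mttf + mttr                      # MTBF = MTTF + MTTR
availability = mttf / (mttf + mttr)     # standard definition, assumed here

print(f"FIT = {fit:.0f}")
print(f"MTBF = {mtbf} hours")
print(f"availability = {availability:.6f}")
```

For an MTTF of 1,000,000 hours this gives FIT = 1000.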
Comparing design alternatives
Benchmark Suites (SPEC = Standard Performance Evaluation Corporation) SPECrate, SPECWeb
Summarizing performance
Summarizing performance (cont.) running time of programs
Summarizing performance (cont.) Used by SPEC89, SPEC92, SPEC95, ..., SPEC2006
Pros and cons of geometric means
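
One property usually listed in favor of the geometric mean is that the comparison between two machines does not depend on which reference machine the times are normalized to. The sketch below illustrates this under stated assumptions: the execution times are hypothetical, and the SPEC-style normalization (reference time divided by measured time) is assumed rather than taken from the slide text.

```python
import math


def geometric_mean(values):
    """n-th root of the product, computed with logs for numerical safety."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical execution times in seconds for three programs.
reference = [100.0, 200.0, 400.0]
machine_a = [50.0, 100.0, 100.0]
machine_b = [25.0, 200.0, 200.0]

# Normalized ratios: larger means faster than the reference machine.
ratios_a = [r / t for r, t in zip(reference, machine_a)]
ratios_b = [r / t for r, t in zip(reference, machine_b)]

gm_a = geometric_mean(ratios_a)
gm_b = geometric_mean(ratios_b)

# Comparing A to B directly, without the reference machine.
a_vs_b = geometric_mean([tb / ta for ta, tb in zip(machine_a, machine_b)])

print(f"GM(A) / GM(B) = {gm_a / gm_b:.4f}")
print(f"GM of direct A-vs-B ratios = {a_vs_b:.4f}")
```

The two printed values agree because the reference times cancel. A common objection is that the resulting number no longer tracks total running time of the workload.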
Qualitative principles of design (Spatial and temporal locality)
Qualitative principles of design (cont.)
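Amdahl's Law
$Speedup_{overall} = \frac{1}{(1 - Fraction_{enhanced}) + \frac{Fraction_{enhanced}}{Speedup_{enhanced}}}$
Best possible: $Speedup_{maximum} = \frac{1}{1 - Fraction_{enhanced}}$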
Amdahl's Law example
New CPU 10X faster. I/O-bound server, so 60% of the time is spent waiting for I/O.
$Speedup_{overall} = \frac{1}{(1 - Fraction_{enhanced}) + \frac{Fraction_{enhanced}}{Speedup_{enhanced}}} = \frac{1}{(1 - 0.4) + \frac{0.4}{10}} = \frac{1}{0.64} = 1.56$
Apparently, it's human nature to be attracted by 10X faster, vs. keeping in perspective that it's just 1.6X faster.
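
The example above can be checked with a few lines of code. This is a minimal sketch; the function names `amdahl_speedup` and `amdahl_limit` are made up for illustration.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when only part of the execution time is enhanced."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def amdahl_limit(fraction_enhanced):
    """Best possible speedup as the enhancement becomes infinitely fast."""
    if not 0.0 <= fraction_enhanced < 1.0:
        raise ValueError("fraction_enhanced must be in [0, 1)")
    return 1.0 / (1.0 - fraction_enhanced)


# Server example: 60% of time waiting for I/O, so only 40% benefits
# from a CPU that is 10X faster.
overall = amdahl_speedup(0.4, 10)
limit = amdahl_limit(0.4)

print(f"overall speedup: {overall:.4f}")
print(f"upper bound with an infinitely fast CPU: {limit:.4f}")
```

Even an infinitely fast CPU could not push this server past roughly 1.67X, because the I/O wait is untouched.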
Computer Performance: CPU Performance
Cycles Per Instruction (Throughput)
Average cycles per instruction: CPI = (CPU Time × Clock Rate) / Instruction Count = Cycles / Instruction Count
$CPU\ time = Cycle\ Time \times \sum_{j=1}^{n} CPI_j \times I_j$
$CPI = \sum_{j=1}^{n} CPI_j \times F_j$, where $F_j = \frac{I_j}{Instruction\ Count}$ is the instruction frequency
Example: Calculating CPI bottom up
Run benchmark and collect workload characterization (simulate, machine counters, or sampling).
Base Machine (Reg / Reg), typical mix of instruction types in program:
Op      Freq   CPI_i   F*CPI_i   (% Time)
ALU     50%    1       0.5       (33%)
Load    20%    2       0.4       (27%)
Store   10%    2       0.2       (13%)
Branch  20%    2       0.4       (27%)
Total                  1.5
Design guideline: Make the common case fast.
MIPS 1% rule: only consider adding an instruction if it is shown to add 1% performance improvement on reasonable benchmarks.
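
The bottom-up calculation can be reproduced in code, reading each row of the instruction mix as a frequency, a per-class CPI, and their product (ALU: 50% at CPI 1; Load, Store, Branch: CPI 2). The instruction count and clock rate used for the CPU time at the end are hypothetical values, and the function names are made up for this sketch.

```python
# Instruction mix from the example: op -> (frequency, CPI for that class).
MIX = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}


def average_cpi(mix):
    """CPI = sum over classes of CPI_j * F_j."""
    return sum(freq * cpi for freq, cpi in mix.values())


def time_share(mix):
    """Fraction of execution time spent in each instruction class."""
    total = average_cpi(mix)
    return {op: freq * cpi / total for op, (freq, cpi) in mix.items()}


def cpu_time_seconds(instruction_count, cpi, clock_rate_hz):
    """CPU time = instruction count * CPI * cycle time."""
    return instruction_count * cpi / clock_rate_hz


cpi = average_cpi(MIX)
shares = time_share(MIX)

print(f"CPI = {cpi:.2f}")
for op, share in shares.items():
    print(f"{op}: {share:.0%} of time")

# Hypothetical program: 10^9 instructions on a 1 GHz clock.
print(f"CPU time = {cpu_time_seconds(1e9, cpi, 1e9):.2f} s")
```

The time shares show why frequency alone is misleading: loads are 20% of the instructions but account for about 27% of the time.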
Processor Performance
TPM (Transactions Per Minute)
Price/performance: TPM × $1000 / cost
What about maintenance and power?
Conclusion
Next Lecture: Pipelining
Figure: pipelined datapath with the stages Instruction Fetch; Instr. Decode / Reg. Fetch; Execute / Addr. Calc; Memory Access; Write Back, separated by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers.
IR <= mem[pc]; PC <= PC + 4
A <= Reg[IR_rs]; B <= Reg[IR_rt]
rslt <= A op_IRop B
WB <= rslt
Reg[IR_rd] <= WB
Data stationary control: local decode for each instruction phase / pipeline stage