Qualitative vs Quantitative metrics


Qualitative vs Quantitative metrics

Quantitative: hard numbers, measurable
- Time, energy, space
- Signal-to-noise, frames-per-second, memory usage
- Money (?)

Qualitative: feelings, opinions
- Complexity: simple, tricky
- Design effort: easy, hard
- User experience: snappy, intuitive, pretty

Try to use quantitative measurements when possible
- Repeatable: all measurements yield the same result
- Verifiable: third parties should get the same result

HPCE / dt10 / 2014 / 13.1

Types of scale

- Nominal: categories with no intrinsic order (apples, oranges, pears)
- Ordinal: categories which can be compared (bad, ok, good)
- Interval: numbers with meaningful differences (year)
- Ratio: numbers with meaningful differences and a meaningful zero (frames per second)

Choosing scales

A metric can be defined in terms of many scales, e.g. temperature:
- Nominal: Balmy, Temperate
- Ordinal: Cold, Chilly, Warm, Hot
- Interval: degrees Fahrenheit
- Ratio: kelvin

Can create a mapping from the ratio scale to the other scales:
- Fahrenheit = 9 * K / 5 - 459.67
- Cold = {K < 280}; Chilly = {K >= 280 && K <= 288};
  Temperate = {K > 288 && K <= 293}; Balmy = {K > 293 && K < 300}; ...

Can't go from nominal to ratio without inventing information:
- Cold represents a range, but a kelvin value is a point
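The ratio-to-other-scale mappings above can be sketched in code (a Python illustration; the function names are mine, and the category boundaries are the ones given on the slide):

```python
def kelvin_to_fahrenheit(k):
    """Interval scale: meaningful differences, but an arbitrary zero."""
    return 9 * k / 5 - 459.67

def kelvin_to_ordinal(k):
    """Ordinal scale: ordered categories; differences are no longer meaningful.
    Boundaries from the slide; 'Hot' above 300 K is an assumed extension."""
    if k < 280:
        return "Cold"
    elif k <= 288:
        return "Chilly"
    elif k <= 293:
        return "Temperate"
    elif k < 300:
        return "Balmy"
    else:
        return "Hot"
```

Going the other way is impossible without inventing information: kelvin_to_ordinal(285) answers "Chilly", but "Chilly" alone cannot recover 285.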

The joys of ratio scales

Easy to see which system is better:
- For a higher-is-better metric, if metric(A) < metric(B) then B is better than A

Easy to see by how much the system is better:
- A is better than B by a factor of metric(A) / metric(B)

We can define meaningful combination metrics:
- metric(X) = max[metric(A), metric(B)]
- metric(Y) = metric(A) + metric(B)
- metric(Z) = sqrt[metric(A)^2 + metric(B)^2]
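These combination metrics only make sense because ratio scales have a true zero; a minimal sketch (illustrative names, e.g. combining latencies or error magnitudes of two components):

```python
import math

def worst_case(a, b):
    """e.g. the latency of two stages running in parallel."""
    return max(a, b)

def total(a, b):
    """e.g. the energy of two components combined."""
    return a + b

def euclidean(a, b):
    """e.g. the combined magnitude of two independent errors."""
    return math.sqrt(a ** 2 + b ** 2)
```

None of these would be meaningful on an interval scale like degrees Fahrenheit, where zero is arbitrary.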

Averaging of ratio metrics

Many metrics are formed from a number of measurements:
- Benchmarks: perform the same task on many different inputs
- Smoothing: remove statistical noise from different runs

Ratio metrics can be meaningfully averaged, but usually not using the arithmetic mean:
- Arithmetic mean: a(x) = (1/n) * sum_{i=1..n} x_i
- Geometric mean:  g(x) = (prod_{i=1..n} x_i)^(1/n)
  Good for combining values on different scales; useful for summarising improvements
- Harmonic mean:   h(x) = n / sum_{i=1..n} (1/x_i)
  Sometimes useful for averaging rates; usually better to stick to geometric
- For positive values, h(x) <= g(x) <= a(x)

How not to lie with statistics: the correct way to summarize benchmark results,
P. Fleming and J. Wallace, 1986, https://doi.org/10.1145/5666.5673
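The three means can be written out directly (a Python sketch; function names are mine). Computing the geometric mean via logarithms avoids overflow on long lists:

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # n-th root of the product, computed as exp of the mean log
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    # n over the sum of reciprocals: suited to averaging rates
    return len(xs) / sum(1 / x for x in xs)

speedups = [2.0, 8.0]
# arithmetic = 5.0, geometric = 4.0, harmonic = 3.2:
# the h <= g <= a ordering holds for any positive values
```

On a pair of speedups of 2x and 8x, the arithmetic mean (5x) overstates the typical improvement; the geometric mean (4x) is the conventional summary.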

Execution time

How long does it take to complete some task X?
- A fundamental property of a compute system
- Scale is elapsed time: seconds, hours, years(!)
- Ratio scale: can't take less than zero seconds
- Ratio scale: one second is 2x better than two seconds

Need to carefully define what the components mean:
- What precisely is task X?
- When does timing start, and when does it stop?
- What sort of time is being measured?
  Wall-clock time: e.g. a human with a stop-watch
  CPU time: only count time while the task was executing

Measuring elapsed time

This is often surprisingly difficult! We need some sort of timer as a base:
- Precision: smallest representable time difference
- Resolution: smallest time difference you can measure
- Accuracy: how reliable the timer is

The timer can be part of the system, or external:
- External: you with a stop-watch, or a pulse timer
- Internal: the system itself records time during execution

Internal timers are convenient, but tricky:
- Performing the timing may change the system's performance!
- The timer may not be consistent across the system
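One way to probe an internal timer's resolution is to take back-to-back readings until the value changes; the smallest observed step bounds the measurable difference. A sketch using Python's time.perf_counter (my choice of timer, not part of the course material):

```python
import time

def estimate_resolution(timer=time.perf_counter, samples=100):
    """Estimate the smallest measurable tick of `timer`."""
    deltas = []
    for _ in range(samples):
        t0 = timer()
        t1 = timer()
        while t1 == t0:      # spin until the reading advances
            t1 = timer()
        deltas.append(t1 - t0)
    return min(deltas)       # smallest step ever observed
```

Note this measures resolution, not accuracy: a timer can tick in nanoseconds yet still drift relative to true time.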

[Slides 13.8-13.13 are timing diagrams. They show a task measured with
S = time(); ...task...; E = time(); print(E - S), against a timer with
start-up offset o and resolution r. A single measurement is only accurate
to within the timer's resolution: a task shorter than r may be reported
as 0 or as r. Timing three back-to-back executions and printing
(E - S) / 3 reduces the per-execution quantisation error to a fraction of r.]
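The quantisation effect in the diagrams can be reproduced with a simulated timer of resolution r (an illustrative sketch, not course code):

```python
import math

def quantized(t, r):
    """What a timer of resolution r reports at true time t."""
    return math.floor(t / r) * r

r = 1.0       # timer resolution
task = 0.4    # true task duration: shorter than one tick

# A single measurement of the task reads 0 (it could also read r,
# depending on where the task falls relative to the ticks):
single = quantized(task, r) - quantized(0.0, r)

# Timing three back-to-back executions quantises the *total*,
# so the per-task error shrinks to at most r/3:
batched = (quantized(3 * task, r) - quantized(0.0, r)) / 3
```

Here the true duration is 0.4; the single measurement reports 0, while the batched estimate reports 1/3, already much closer.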

Internal timers

Most platforms provide calls which return elapsed time:

  Platform   Timer                      Resolution    Measures      Consistency
  Windows    QueryPerformanceCounter()  micro-second  wall-clock    entire system
  Linux      clock()                    milli-second  CPU time      current thread
  x86 CPU    RDTSC                      nano-second   clock cycles  current CPU
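Python offers rough analogues of the distinctions in the table (this pairing is my assumption; the slides discuss the platform calls above): time.perf_counter() is a high-resolution wall-clock timer, while time.process_time() counts only CPU time for the current process:

```python
import time

def wall_and_cpu(task):
    """Run `task` once; return (wall-clock seconds, CPU seconds)."""
    w0, c0 = time.perf_counter(), time.process_time()
    task()
    return time.perf_counter() - w0, time.process_time() - c0

# Sleeping consumes wall-clock time but almost no CPU time,
# so the two kinds of timer can disagree wildly:
wall, cpu = wall_and_cpu(lambda: time.sleep(0.05))
```

For a compute-bound task on an otherwise idle machine the two numbers converge; a large gap signals blocking, I/O, or interference.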

Problems when measuring time

Start-up overhead: the first execution usually takes longer
- Filling caches; compiling code; performing initialisation
- Cold start: restart the system; time the first execution of the task
- Warm start: execute the task once; measure the second execution

Interference: other tasks are running on the system
- Multi-user system: other users may be using the system
- Operating-system tasks: e.g. disk utilities affect performance

Intrinsic variability: computers are not very deterministic
- Cache contents differ significantly on each run
- Unpredictable scheduling of threads by the system
- Interaction between the timer's resolution and the length of the task

General policy: measure over multiple executions and average
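The general policy can be sketched as a small harness (illustrative, not course code): the discarded first run implements the slide's warm start, and the remaining runs are averaged:

```python
import time

def benchmark(task, runs=5):
    """Mean wall-clock time of `task` over `runs` warm executions."""
    task()                       # warm start: fill caches, trigger init
    times = []
    for _ in range(runs):
        s = time.perf_counter()
        task()
        times.append(time.perf_counter() - s)
    return sum(times) / len(times)

mean_time = benchmark(lambda: sum(range(100_000)))
```

Reporting the minimum instead of the mean is a common variant, on the argument that interference only ever adds time; either way, a single measurement is not enough.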

Throughput

At what rate can task X be executed?
- Unit is executions per unit time

Throughput is not just the inverse of elapsed time:
- Elapsed time is the latency of a single task
- Throughput is the execution rate over many tasks

There is often a trade-off between latency and throughput:
- Parallel systems can do lots of tasks at once: good throughput
- It is difficult to split a single task into many pieces: poor latency

Frames per second is our obvious throughput example:
- Gaming: needs high frames/second and low latency
- Video: needs high frames/second, but latency is not important
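The trade-off can be made concrete with a toy batched system (the numbers are illustrative, not from the slides): batching frames raises throughput but forces every frame to wait for its whole batch:

```python
def batch_metrics(batch_size, batch_seconds):
    """Latency and throughput of a system processing frames in batches."""
    latency = batch_seconds                  # a frame waits for its whole batch
    throughput = batch_size / batch_seconds  # frames per second
    return latency, throughput

# Serial: one frame every 20 ms
lat_a, thr_a = batch_metrics(1, 0.02)   # 20 ms latency, 50 fps
# Parallel: 8 frames per 50 ms batch
lat_b, thr_b = batch_metrics(8, 0.05)   # 50 ms latency, 160 fps
```

The batched system wins on throughput (good for video encoding) but loses on latency (bad for gaming), matching the slide's distinction.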

Energy

Definitions of energy consumption are application specific:
- How much energy is used while executing task X?
- What is the average power of a system capable of task X?
- What is the peak power of a system executing task X?

Closely related to temperature metrics:
- How hot does the device get while doing X?
- What are the cooling requirements for continually doing X?

Computers must consume energy: Landauer's principle
- Any irreversible bit-wise operation (e.g. and, or) requires at least k*T*log(2) joules
  k: Boltzmann constant; T: circuit temperature in kelvin
- e.g. a task requiring 2^64 bit-wise operations at 60 degrees C:
  3.2e-21 joules/bit -> ~0.05 joules
- We're not that close to the limit yet...
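The slide's worked example can be checked directly (the only ingredient not on the slide is the CODATA value of the Boltzmann constant):

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K

def landauer_joules_per_bit(temp_celsius):
    """Minimum energy to erase one bit at the given circuit temperature."""
    return K_BOLTZMANN * (temp_celsius + 273.15) * math.log(2)

per_bit = landauer_joules_per_bit(60)   # ~3.2e-21 J, as on the slide
total = per_bit * 2**64                 # ~0.06 J for 2^64 operations
```

Real systems dissipate many orders of magnitude more than this per operation, which is the sense in which we are "not that close to the limit yet".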

Memory constraints

All levels of storage: memory, cache, disk
- All the working memory needed during the calculation
- May include input and output storage: are they in the same place?

Memory constraints are balanced against communication costs;
there is an explicit size/bandwidth constraint:

  Level              Capacity   Bandwidth
  Remote disk (SAN)  10 EB      100 MB/sec
  Local disk (SSD)   10 TB      500 MB/sec
  Memory (GDDR)      8 GB       300 GB/sec
  Cache (local mem)  128 KB     2 TB/sec

Often we will trade off computation against storage, or programmer effort against IO.
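The size/bandwidth constraint in the table can be turned into a quick calculator (capacities and bandwidths taken from the table; the function names are mine):

```python
# capacity (bytes), bandwidth (bytes/sec), from the table above
LEVELS = {
    "remote disk": (10e18, 100e6),
    "local SSD":   (10e12, 500e6),
    "GDDR memory": (8e9,   300e9),
    "cache":       (128e3, 2e12),
}

def stream_seconds(size_bytes, level):
    """Time to stream a working set through the given storage level."""
    capacity, bandwidth = LEVELS[level]
    assert size_bytes <= capacity, "working set does not fit at this level"
    return size_bytes / bandwidth

ssd = stream_seconds(1e9, "local SSD")     # 1 GB from SSD: 2 seconds
mem = stream_seconds(1e9, "GDDR memory")   # 1 GB from memory: ~3.3 ms
```

The three-orders-of-magnitude gap between levels is why it is often worth recomputing a value rather than fetching it from a slower level.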

Quality

There is often a space of answers: how good is any given one?
- Compression quality
- Signal-to-noise ratio

Qualitative quality metrics can often be made quantitative:
- User-interface responsiveness -> input-to-display latency
- Image quality -> root-mean-square error against a reference model

If quality is a ratio measure, then there exists a perfect answer
- But the realities of floating-point maths mean it may not be achievable

Understanding quality is critical for high performance:
- Achieving the perfect answer is often computationally intractable
- There is usually a trade-off between quality and speed, and we must choose the best point on it
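The suggested quantitative quality metric, root-mean-square error against a reference, is a one-liner (a sketch; it assumes equal-length numeric sequences):

```python
import math

def rmse(result, reference):
    """Root-mean-square error of `result` against `reference`.
    Zero means a perfect answer; it is a ratio scale."""
    assert len(result) == len(reference)
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(result, reference)) / len(result)
    )
```

Because RMSE has a meaningful zero, it supports the ratio-scale reasoning from earlier slides: an answer with half the RMSE is twice as good.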

Making sure metrics are meaningful

Some things are quantifiable, but not very useful:
- CPU performance: MHz is not the same as performance
- Cameras: mega-pixels are not the same as quality

Consistent and quantifiable metrics provide open competition:
- Suppliers of systems always want to use the metrics that flatter them
- Metrics should be defined by users, not suppliers

People will optimise for metrics (it's what they are for!)
- Poor metrics lead to poor design and optimisation
- This is part of the specification problem