Summarizing Measured Data
1 Summarizing Measured Data Dr. John Mellor-Crummey Department of Computer Science Rice University COMP 528 Lecture 7 3 February 2005
2 Goals for Today
Finish discussion of the Normal distribution and its properties
Finish material on summarizing measured data
Solve a problem using a PMF
3 Normal Distribution
N(µ,σ): the most commonly used distribution in data analysis (also known as a Gaussian distribution)
  pdf: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$, $-\infty \le x \le \infty$
  µ = mean, σ = std dev
N(µ=0,σ=1): the unit normal distribution
  pdf: $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
4 Quantile, Percentile, Median & Mode
α-quantile: the x value at which the CDF takes the value α, denoted $x_\alpha$
  $P(x \le x_\alpha) = F(x_\alpha) = \alpha$
100α-percentile: the x value at which the CDF takes the value α
Median = 50-percentile = 0.5-quantile
Mode = most likely value
  for a discrete variable, the $x_i$ that has the highest probability
  for a continuous variable, the x where the pdf is maximum
5 Quantiles of the Normal Distribution
$z_\alpha$: the α-quantile of the unit normal variate z ~ N(0,1)
If x has a normal distribution, x ~ N(µ,σ), then
  $P\left(\frac{x-\mu}{\sigma} \le z_\alpha\right) = \alpha$
or equivalently,
  $P(x \le \mu + \sigma z_\alpha) = \alpha$
[Figure: PDF and CDF of N(0,1), marking the 0.5-quantile (50-percentile) and the 0.8-quantile (80-percentile)]
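The relation between $z_\alpha$ and the quantiles of a general normal variate can be checked numerically. The following is a minimal sketch using Python's standard `statistics.NormalDist`; the function name `normal_quantile` and the parameters µ = 100, σ = 15 are illustrative assumptions, not part of the lecture.

```python
from statistics import NormalDist

def normal_quantile(alpha: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Return the alpha-quantile of N(mu, sigma), computed as mu + sigma * z_alpha."""
    z_alpha = NormalDist(0.0, 1.0).inv_cdf(alpha)  # alpha-quantile of the unit normal
    return mu + sigma * z_alpha

# Hypothetical parameters, chosen only for illustration.
z_80 = normal_quantile(0.8)
x_80 = normal_quantile(0.8, mu=100.0, sigma=15.0)
print(f"z_0.8 = {z_80:.4f}")
print(f"0.8-quantile of N(100, 15) = {x_80:.2f}")
```

Feeding `x_80` back into the CDF of N(100, 15) should return 0.8, which is the statement $P(x \le \mu + \sigma z_\alpha) = \alpha$.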
6 Properties of the Normal Distribution
Linearity: a sum of n independent normal variates is a normal variate
  if $x_i \sim N(\mu_i, \sigma_i)$, then $x = \sum_{i=1}^{n} a_i x_i$ has a normal distribution with mean and variance
  $\mu = \sum_{i=1}^{n} a_i \mu_i$
  $\sigma^2 = \sum_{i=1}^{n} a_i^2 \sigma_i^2$
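As a small illustration of the linearity property, the sketch below computes the mean and standard deviation of a weighted sum of independent normal variates, using variance $\sum a_i^2 \sigma_i^2$. The helper name `linear_combination` and the example inputs are assumptions made for this example.

```python
import math

def linear_combination(coeffs, means, std_devs):
    """Mean and std dev of x = sum(a_i * x_i) for independent x_i ~ N(mu_i, sigma_i)."""
    mean = sum(a * mu for a, mu in zip(coeffs, means))
    variance = sum((a * s) ** 2 for a, s in zip(coeffs, std_devs))
    return mean, math.sqrt(variance)

# Hypothetical example: x_1 ~ N(1, 3), x_2 ~ N(2, 4), x = x_1 + x_2.
mu, sigma = linear_combination([1.0, 1.0], [1.0, 2.0], [3.0, 4.0])
print(mu, sigma)
```

Note that standard deviations do not add: here the variances 9 and 16 add to 25, so the standard deviation of the sum is 5, not 7.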
7 Central Limit Theorem
The sum of a large number of independent observations tends to have a normal distribution
  true whatever distribution the observations come from, provided its variance is finite
  thus experimental errors, which arise from many factors, are modeled with a normal distribution
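A quick simulation can make the central limit theorem concrete. This sketch sums 30 uniform(0,1) observations many times and compares the fraction of sums within one standard deviation of the mean against the normal value of about 0.683. The seed, the number of terms, and the sample count are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(528)   # fixed seed so the sketch is repeatable
TERMS = 30         # observations per sum
SAMPLES = 20_000   # number of sums

sums = [sum(random.random() for _ in range(TERMS)) for _ in range(SAMPLES)]
m = mean(sums)
s = stdev(sums)

# Fraction of sums within one standard deviation of the sample mean.
within_one_sigma = sum(abs(v - m) <= s for v in sums) / SAMPLES
expected = NormalDist().cdf(1.0) - NormalDist().cdf(-1.0)

print(f"mean = {m:.3f} (theory 15), std dev = {s:.3f} (theory {2.5 ** 0.5:.3f})")
print(f"within one sigma: {within_one_sigma:.3f} (normal: {expected:.3f})")
```

The theoretical values follow from a uniform(0,1) variate having mean 1/2 and variance 1/12, so a sum of 30 has mean 15 and variance 2.5.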
8 Means and Their Uses
9 Arithmetic Mean
Arithmetic mean of values $\{x_1, x_2, \ldots, x_n\}$: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Caution: the arithmetic mean is not always an appropriate index of central tendency
  Is the data categorical? yes: use mode; no: continue
  Is the total of interest? yes: use mean; no: continue
  Is the distribution skewed? yes: use median; no: use mean
Median = 50th percentile value
Mode = most frequent value, e.g. the most frequent destination for packets
10 Common Misuses of Arithmetic Means
Mean of significantly different values
  a correct index, but useless nonetheless
  not useful: mean CPU time is 505 ms when the values are 10 ms and 1000 ms
Using the mean without considering skew
  if variability is too large, the mean may not be a representative value
  e.g. mean({5,5,5,4,31}) = 10: the typical value is 5, so the mean is useless
Multiplying arithmetic means to get the mean of a product
  the mean of a product of random variables is equal to the product of the means only if the variables are independent
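The two numeric examples above are easy to reproduce. The sketch below uses Python's standard `statistics` module to compare the mean with the median and mode for the skewed sample {5,5,5,4,31}, and to show the 505 ms mean of 10 ms and 1000 ms.

```python
from statistics import mean, median, mode

values = [5, 5, 5, 4, 31]
print("mean:", mean(values))      # pulled upward by the single value 31
print("median:", median(values))  # the typical value
print("mode:", mode(values))      # the most frequent value

cpu_times_ms = [10, 1000]
print("mean CPU time (ms):", mean(cpu_times_ms))  # matches neither observation
```

For the skewed sample, the median and mode both report the typical value 5 while the mean reports 10, which is the reason the selection questions on the previous slide point to the median for skewed data.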
11 Geometric Mean
Geometric mean of a sample $\{x_1, x_2, \ldots, x_n\}$: $\dot{x} = \left( \prod_{i=1}^{n} x_i \right)^{1/n}$
Arithmetic mean vs. geometric mean
  geometric: if the product of the terms is of interest
  arithmetic: if the sum of the observations is of interest
Examples of metrics that work in a multiplicative manner
  cache miss ratios over several levels of cache
    L3misses = Loads * L1missrate * L2missrate * L3missrate
    Avg miss rate per level = (L1missrate * L2missrate * L3missrate)^(1/3)
  percentage improvement between successive versions
  average error rate per hop in a multi-hop network
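The cache example can be sketched in a few lines. The miss rates and load count below are hypothetical values chosen for illustration; the point is that applying the geometric-mean miss rate at every level reproduces the same number of L3 misses as the individual rates.

```python
import math
from statistics import geometric_mean

# Hypothetical L1, L2, L3 miss rates and load count (illustration only).
miss_rates = [0.10, 0.20, 0.40]
loads = 1_000_000

avg_miss_rate = geometric_mean(miss_rates)
l3_misses = loads * math.prod(miss_rates)
l3_misses_from_avg = loads * avg_miss_rate ** len(miss_rates)

print(f"average miss rate per level: {avg_miss_rate:.4f}")
print(f"L3 misses: {l3_misses:.1f}, from average rate: {l3_misses_from_avg:.1f}")
```

The arithmetic mean of these rates would not preserve the product, so it would predict a different number of L3 misses.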
12 Harmonic Mean
Harmonic mean of a sample $\{x_1, x_2, \ldots, x_n\}$: $\ddot{x} = \frac{n}{1/x_1 + 1/x_2 + \cdots + 1/x_n}$
Use whenever an arithmetic mean can be justified for $1/x_i$ (the sum of $1/x_i$ has physical meaning)
Example: MIPS rate
  suppose a benchmark has m million instructions
  the MIPS rate $x_i$ from the ith repetition is $m/t_i$
  avg. time: use the arithmetic mean, since average time has physical meaning
  avg. MIPS for multiple runs of one benchmark: use the harmonic mean
  $\ddot{x} = \frac{n}{\frac{1}{m/t_1} + \frac{1}{m/t_2} + \cdots + \frac{1}{m/t_n}} = \frac{m}{(1/n)(t_1 + t_2 + \cdots + t_n)}$
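The MIPS identity can be verified with a short sketch: the harmonic mean of the per-run MIPS rates equals m divided by the arithmetic mean of the run times. The instruction count and run times below are hypothetical.

```python
from statistics import harmonic_mean, mean

m = 100.0                 # hypothetical benchmark size, in millions of instructions
times = [2.0, 4.0, 5.0]   # hypothetical run times in seconds
mips = [m / t for t in times]

hm_mips = harmonic_mean(mips)
mips_from_avg_time = m / mean(times)
am_mips = mean(mips)

print(f"harmonic mean of MIPS rates: {hm_mips:.3f}")
print(f"m / (average time):          {mips_from_avg_time:.3f}")
print(f"arithmetic mean of rates:    {am_mips:.3f}")
```

The arithmetic mean of the rates comes out higher than the rate implied by the average run time, which is why it is the wrong summary here.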
13 Mean of Ratios
Problem: given a set of n ratios, summarize them as a single number
Example: summarize the MIPS rate for a processor for different workloads
  harmonic mean unsuitable: $\sum_i t_i/m_i$ has no meaning
Approach: consider additivity of numerators and denominators separately
14 Rules for Means of Ratios - I
If the numerator and denominator each have meaning
  compute the average of the ratios as the ratio of the averages
  e.g. average MIPS for different workloads:
    $\text{Average}\left(\frac{m_1}{t_1}, \frac{m_2}{t_2}, \ldots, \frac{m_n}{t_n}\right) = \frac{\sum_{i=1}^{n} m_i}{\sum_{i=1}^{n} t_i} = \frac{\bar{m}}{\bar{t}}$
  e.g. mean CPU utilization = (sum of CPU busy times) / (sum of measurement durations)
If the denominator is a constant and the numerator has meaning
  e.g. resource utilization per constant interval (page faults over one-hour intervals):
    $\text{Average}\left(\frac{p_1}{t}, \frac{p_2}{t}, \ldots, \frac{p_n}{t}\right) = \frac{\sum_{i=1}^{n} p_i}{n t}$ (i.e. the arithmetic mean of the ratios)
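A minimal sketch of these two rules follows. The workload sizes, times, and page-fault counts are hypothetical, and `ratio_of_sums` is a helper name introduced only for this example.

```python
from statistics import mean

def ratio_of_sums(numerators, denominators):
    """Average of ratios computed as (sum of numerators) / (sum of denominators)."""
    return sum(numerators) / sum(denominators)

# Rule 1: numerator and denominator both have meaning (MIPS across workloads).
instructions_millions = [100.0, 400.0]   # hypothetical workloads
times_s = [2.0, 16.0]
avg_mips = ratio_of_sums(instructions_millions, times_s)
naive_mips = mean(m / t for m, t in zip(instructions_millions, times_s))
print(f"ratio of sums: {avg_mips:.2f} MIPS, mean of ratios: {naive_mips:.2f} MIPS")

# Rule 2: constant denominator (page faults per one-hour interval).
page_faults = [10, 20, 30]               # hypothetical counts per interval
interval_hours = 1.0
avg_fault_rate = sum(page_faults) / (len(page_faults) * interval_hours)
print(f"average page faults per hour: {avg_fault_rate:.1f}")
```

In the first case the plain mean of the two ratios overweights the short workload; the ratio of sums corresponds to total instructions over total time.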
15 Rules for Means of Ratios - II
If the numerator is constant and the denominator has meaning
  the harmonic mean of the ratios should be used to summarize them
  e.g. computing the mean MIPS rate for a processor using n observations of the same benchmark:
    $\text{Average}\left(\frac{m}{t_1}, \frac{m}{t_2}, \ldots, \frac{m}{t_n}\right) = \frac{n}{t_1/m + t_2/m + \cdots + t_n/m} = \frac{nm}{\sum_{i=1}^{n} t_i}$
If the numerator and denominator approximately follow a multiplicative property
  i.e. $a_i = c\,b_i$, where c is approximately a constant being estimated
  estimate c from the geometric mean of $a_i/b_i$
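For the multiplicative case, the estimate of c is simply the geometric mean of the observed ratios. The sketch below uses hypothetical paired measurements in which $a_i$ is roughly twice $b_i$.

```python
from statistics import geometric_mean

# Hypothetical paired measurements with a_i approximately equal to c * b_i.
b = [10.0, 20.0, 40.0]
a = [21.0, 40.0, 76.0]

ratios = [ai / bi for ai, bi in zip(a, b)]
c_estimate = geometric_mean(ratios)
print(f"ratios: {ratios}, estimated c: {c_estimate:.4f}")
```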
16 SPEC Metrics?
The elapsed time in seconds for each of the benchmarks in the CINT2000 or CFP2000 suite is given and the ratio to the reference machine (Sun Ultra 10) is calculated.
How should one compute a summary ratio?
The SPECint_base2000 and SPECfp_base2000 metrics are calculated as a geometric mean of the individual ratios, where each ratio is based on the median execution time from an odd number of runs, greater than or equal to 3.
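The procedure described above (median of an odd number of runs per benchmark, ratio to the reference machine, geometric mean of the ratios) can be sketched as follows. The benchmark names and timings are made up, the ratio is assumed here to be reference time divided by measured time, and any scaling factor SPEC applies to the published number is left out.

```python
from statistics import geometric_mean, median

# Hypothetical reference-machine times and measured runs, in seconds.
reference_times = {"bench_a": 1400.0, "bench_b": 1800.0, "bench_c": 1000.0}
measured_runs = {
    "bench_a": [700.0, 710.0, 690.0],
    "bench_b": [450.0, 460.0, 440.0],
    "bench_c": [1000.0, 990.0, 1010.0],
}

def summary_ratio(reference, runs):
    """Geometric mean of per-benchmark ratios (reference time / median run time)."""
    ratios = []
    for name, ref_time in reference.items():
        times = runs[name]
        if len(times) < 3 or len(times) % 2 == 0:
            raise ValueError(f"{name}: expected an odd number of runs, at least 3")
        ratios.append(ref_time / median(times))
    return geometric_mean(ratios)

score = summary_ratio(reference_times, measured_runs)
print(f"summary ratio: {score:.3f}")
```

With per-benchmark ratios of 2, 4, and 1, the geometric mean is the cube root of 8.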
17 Code Size Optimization with a GA
Cooper, Schielke and Subramanian, LCTES 99
How should one compute a summary ratio?
18 Summarizing Variability
19 Selecting the Index of Dispersion
Is the distribution bounded? yes: use range; no: continue
Is the distribution unimodal and symmetrical? yes: use C.O.V. (coefficient of variation); no: use percentiles or SIQR (semi-interquartile range)
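The three indices named above can be computed with the standard library. This is a sketch under two stated assumptions: the coefficient of variation uses the sample standard deviation, and quartiles use the `inclusive` method of `statistics.quantiles`. The sample data is hypothetical.

```python
from statistics import mean, quantiles, stdev

def value_range(data):
    """Range: difference between the largest and smallest observation."""
    return max(data) - min(data)

def coefficient_of_variation(data):
    """C.O.V.: sample standard deviation divided by the mean."""
    return stdev(data) / mean(data)

def semi_interquartile_range(data):
    """SIQR: half the distance between the third and first quartiles."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return (q3 - q1) / 2

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical observations
print("range:", value_range(sample))
print("C.O.V.:", round(coefficient_of_variation(sample), 4))
print("SIQR:", semi_interquartile_range(sample))
```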
20 Determining Distribution of Data
Can summarize data by its average and variability
More complete summary: type of distribution
  e.g. "number of I/O calls is uniformly distributed between 1 and 25" is more meaningful than "mean is 13, variance is 48"
Distribution is useful for simulation or analytical modeling
How to determine the distribution?
  histogram: determine the range, divide it into cells, plot a histogram of the observations
    guideline: if a cell has < 5 observations, increase the cell size or use a variable cell size histogram
  quantile-quantile plot
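The histogram guideline can be sketched as a loop that widens equal-width cells until every cell holds at least five observations. The function names, the starting cell count of 10, and the sample data are assumptions for this example; variable-width cells, the other option mentioned above, are not shown.

```python
def histogram(data, cells):
    """Count observations in `cells` equal-width cells spanning [min, max]."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / cells
    counts = [0] * cells
    for x in data:
        index = min(int((x - lo) / width), cells - 1)  # maximum goes in the last cell
        counts[index] += 1
    return counts

def histogram_with_min_count(data, cells=10, min_count=5):
    """Reduce the number of cells (widening them) until each has >= min_count."""
    if min(data) == max(data):
        return [len(data)]  # all observations identical: a single cell
    while cells > 1:
        counts = histogram(data, cells)
        if min(counts) >= min_count:
            return counts
        cells -= 1
    return [len(data)]

# Hypothetical data: 50 observations spread evenly over 1..25.
observations = list(range(1, 26)) * 2
counts = histogram_with_min_count(observations)
print(len(counts), "cells:", counts)
```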
21 Quantile-Quantile Plots
Compare observed quantiles with those of a theoretical distribution
Suppose $y_{(j)}$ is the observed $\alpha_j$-quantile
  sort the observations; the α-quantile is $x_{[\alpha(n-1)+1]}$, where [.] denotes rounding to the nearest integer
Use the theoretical distribution to compute the $\alpha_j$-quantile $x_j$
  to determine $x_j$, need to invert the CDF: $\alpha_j = F(x_j)$; then $x_j = F^{-1}(\alpha_j)$
  if the CDF is invertible, then great!
  if not, use tables and interpolate, or compute iteratively
Plot $(x_j, y_{(j)})$
If the observations come from the theoretical distribution, the quantile-quantile plot will be linear
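Both halves of the procedure can be sketched briefly: picking the observed quantile from the sorted sample, and inverting a CDF in closed form. The exponential distribution is used here only as an example of an invertible CDF, $F(x) = 1 - e^{-\lambda x}$; it is not part of the lecture, and the function names are hypothetical.

```python
import math

def empirical_quantile(observations, alpha):
    """alpha-quantile of a sample: the order statistic x_[alpha*(n-1)+1] (1-based)."""
    ordered = sorted(observations)
    n = len(ordered)
    rank = int(alpha * (n - 1) + 1 + 0.5)  # round to the nearest integer rank
    return ordered[rank - 1]

def exponential_inverse_cdf(alpha, rate):
    """Invert F(x) = 1 - exp(-rate * x), giving x = -ln(1 - alpha) / rate."""
    return -math.log(1.0 - alpha) / rate

sample = [5, 1, 4, 2, 3]  # hypothetical observations
print("observed median:", empirical_quantile(sample, 0.5))
print("theoretical 0.3-quantile, rate 2:", exponential_inverse_cdf(0.3, 2.0))
```

When the CDF has no closed-form inverse, as with the normal distribution, a table, an approximation, or an iterative solver takes the place of `exponential_inverse_cdf`.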
22-24 Using Quantile-Quantile Plots
Difference between measured and predicted values on a system is the modeling error
Modeling error for 8 predictions: {-.04, -.19, .14, -.09, -.14, .19, .09, .04}
$x_j = 4.91\left[\alpha_j^{0.14} - (1-\alpha_j)^{0.14}\right]$ approximates inversion of the N(0,1) CDF
j   α_j = (j-.5)/n    y_j     x_j
1   1/16 = .0625      -.19    -1.535
2   3/16 = .1875      -.14    -0.885
3   5/16 = .3125      -.09    -0.487
4   7/16 = .4375      -.04    -0.157
5   9/16 = .5625       .04     0.157
6   11/16 = .6875      .09     0.487
7   13/16 = .8125      .14     0.885
8   15/16 = .9375      .19     1.535
[Figure: CDF for N(0,1)]
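The worked example can be reproduced with a short sketch. It sorts the eight modeling errors, computes $\alpha_j = (j - 0.5)/n$, and evaluates the approximation $x_j = 4.91[\alpha_j^{0.14} - (1-\alpha_j)^{0.14}]$ for the unit normal quantile. As a cross-check that is not part of the lecture, it also prints the quantile from Python's `statistics.NormalDist`.

```python
from statistics import NormalDist

errors = [-0.04, -0.19, 0.14, -0.09, -0.14, 0.19, 0.09, 0.04]

def approx_unit_normal_quantile(alpha):
    """Approximate inverse of the N(0,1) CDF: 4.91 * (alpha^0.14 - (1 - alpha)^0.14)."""
    return 4.91 * (alpha ** 0.14 - (1.0 - alpha) ** 0.14)

n = len(errors)
y = sorted(errors)                                   # observed quantiles y_(j)
alphas = [(j - 0.5) / n for j in range(1, n + 1)]    # alpha_j = (j - 0.5) / n
x_approx = [approx_unit_normal_quantile(a) for a in alphas]
x_exact = [NormalDist().inv_cdf(a) for a in alphas]  # cross-check only

print(" j  alpha_j    y_j   x_j(approx)  x_j(inv_cdf)")
for j, (a, yj, xa, xe) in enumerate(zip(alphas, y, x_approx, x_exact), start=1):
    print(f"{j:2d}  {a:.4f}  {yj:5.2f}  {xa:10.3f}  {xe:11.3f}")
```

Plotting the pairs `(x_approx[j], y[j])` gives the quantile-quantile plot; points close to a straight line suggest the modeling errors are approximately normal.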
25 Interpreting Normal Quantile-Quantile Plots
[Figure: four normal quantile-quantile plots, with panels titled Normal, Long tails, Asymmetric, and Short tails]
26 Working with PMF
Traffic arriving at a gateway is bursty. The burst size is distributed geometrically with the following PMF:
  $f(x) = (1-p)^{x-1} p$, $x = 1, 2, \ldots, \infty$
Compute the mean burst size
Compute the variance of the burst size
Compute the standard deviation of the burst size
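For a geometric PMF of this form, the standard closed-form results are mean $1/p$, variance $(1-p)/p^2$, and standard deviation $\sqrt{1-p}/p$. The sketch below checks them numerically by summing the PMF directly; the value p = 0.25 and the truncation at 2000 terms are arbitrary choices for illustration.

```python
import math

def geometric_pmf(x, p):
    """f(x) = (1 - p)^(x - 1) * p for x = 1, 2, ..."""
    return (1.0 - p) ** (x - 1) * p

def burst_size_moments(p, terms=2000):
    """Mean, variance, and std dev of the burst size from a truncated sum."""
    mean = sum(x * geometric_pmf(x, p) for x in range(1, terms + 1))
    second_moment = sum(x * x * geometric_pmf(x, p) for x in range(1, terms + 1))
    variance = second_moment - mean ** 2
    return mean, variance, math.sqrt(variance)

p = 0.25  # hypothetical parameter
mean_burst, var_burst, std_burst = burst_size_moments(p)
print(f"numeric:     mean={mean_burst:.4f} var={var_burst:.4f} std={std_burst:.4f}")
print(f"closed form: mean={1 / p:.4f} var={(1 - p) / p ** 2:.4f} "
      f"std={math.sqrt(1 - p) / p:.4f}")
```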
Computer Systems Modelling Computer Laboratory Computer Science Tripos, Part II Michaelmas Term 2003 R. J. Gibbens Problem sheet William Gates Building JJ Thomson Avenue Cambridge CB3 0FD http://www.cl.cam.ac.uk/
More informationMEASURES OF LOCATION AND SPREAD
MEASURES OF LOCATION AND SPREAD Frequency distributions and other methods of data summarization and presentation explained in the previous lectures provide a fairly detailed description of the data and
More informationNetwork Simulation Chapter 6: Output Data Analysis
Network Simulation Chapter 6: Output Data Analysis Prof. Dr. Jürgen Jasperneite 1 Contents Introduction Types of simulation output Transient detection When to terminate a simulation 2 Prof. Dr. J ürgen
More informationMATH Notebook 5 Fall 2018/2019
MATH442601 2 Notebook 5 Fall 2018/2019 prepared by Professor Jenny Baglivo c Copyright 2004-2019 by Jenny A. Baglivo. All Rights Reserved. 5 MATH442601 2 Notebook 5 3 5.1 Sequences of IID Random Variables.............................
More informationProf. Thistleton MAT 505 Introduction to Probability Lecture 13
Prof. Thistleton MAT 55 Introduction to Probability Lecture 3 Sections from Text and MIT Video Lecture: Sections 5.4, 5.6 http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-4- probabilisticsystems-analysis-and-applied-probability-fall-2/video-lectures/lecture-8-continuousrandomvariables/
More informationBasics of Stochastic Modeling: Part II
Basics of Stochastic Modeling: Part II Continuous Random Variables 1 Sandip Chakraborty Department of Computer Science and Engineering, INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR August 10, 2016 1 Reference
More informationBrief reminder on statistics
Brief reminder on statistics by Eric Marsden 1 Summary statistics If you have a sample of n values x i, the mean (sometimes called the average), μ, is the sum of the
More informationAnalysis of Experimental Designs
Analysis of Experimental Designs p. 1/? Analysis of Experimental Designs Gilles Lamothe Mathematics and Statistics University of Ottawa Analysis of Experimental Designs p. 2/? Review of Probability A short
More informationLecture 2: Metrics to Evaluate Systems
Lecture 2: Metrics to Evaluate Systems Topics: Metrics: power, reliability, cost, benchmark suites, performance equation, summarizing performance with AM, GM, HM Sign up for the class mailing list! Video
More information