Network Simulation Chapter 5: Traffic Modeling. Chapter Overview


Prof. Dr. Jürgen Jasperneite

Chapter Overview
1. Basic Simulation Modeling
2. OPNET IT Guru - A Tool for Discrete Event Simulation
3. Review of Basic Probabilities and Statistics
4. Building valid, credible Simulation Models
5. Traffic Modeling
6. Output Data Analysis

Chapter Overview
1. Basic Simulation Modeling
2. OMNeT++ - A Tool for Discrete Event Simulation
3. Review of Basic Probabilities and Statistics
4. Building valid, credible Simulation Models
5. Traffic Modeling
6. Output Data Analysis

Overview
- Introduction
- Quantifying models
- Goodness of fit tests

Introduction
[Figure: block diagram with the labels load parameter, system parameter, workload, traffic source, system under study, metrics]

Introduction
- Part of modeling is deciding which input probability distributions to use as input to the simulation, e.g. for interarrival times, message lengths, message types
- Characterization of traffic is very important
- Results of a simulation are only as good as the input: inappropriate input distributions can lead to incorrect output and bad decisions
- Many different methods are used to generate traffic sources
- Each method has advantages and disadvantages with respect to
  - Development time
  - Flexibility
  - Accuracy

Introduction
Traffic categories include:
- Statistical sources
  - Exponentially distributed interarrival (IA) times
  - ON-OFF
- Network applications
  - FTP
  - HTTP
  - Voice
  - Video
  - ...
- Captured packet traces (trace-driven simulation)

Simple Statistical Distributions
Statistical distributions are commonly used in performance analysis:
- Poisson (application traffic, interarrival times)
- Normal (packet sizes)
- Uniform (destination addresses)
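A statistical source built from the three simple distributions above might be sketched as follows. This is only an illustration under stated assumptions: the function name `generate_packets`, the rate, the size parameters, the clamping of sizes to 64..1500 bytes and the number of hosts are all hypothetical and not taken from the slides.

```python
import random

def generate_packets(n, rate, mean_size, sd_size, n_hosts, seed=1):
    """Return n tuples (arrival_time, size_bytes, destination)."""
    rng = random.Random(seed)
    t = 0.0
    packets = []
    for _ in range(n):
        # Exponential interarrival times yield a Poisson arrival process.
        t += rng.expovariate(rate)
        # Normally distributed packet size, clamped to 64..1500 bytes
        # (the clamping bounds are an assumption for this sketch).
        size = int(min(1500, max(64, rng.gauss(mean_size, sd_size))))
        # Uniformly chosen destination address (here: a host index).
        dest = rng.randrange(n_hosts)
        packets.append((t, size, dest))
    return packets

packets = generate_packets(n=1000, rate=100.0, mean_size=600,
                           sd_size=200, n_hosts=16)
mean_interarrival = packets[-1][0] / len(packets)
print(f"mean interarrival time: {mean_interarrival:.4f} s")
```

With a rate of 100 packets per second the printed mean interarrival time should be close to 0.01 s.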

Overview
- Introduction
- Quantifying models
- Goodness of fit tests

Introduction
Usually, observed data on the input quantities are available. Options for their use:

Trace-driven
  Use:  Use actual data values to drive the simulation
  Pros: Valid vis-a-vis the real world; direct
  Cons: Not generalizable

Empirical distribution
  Use:  Use data values to define a "connect-the-dots" distribution
  Pros: Fairly valid; simple; fairly direct
  Cons: May limit the range of generated variates (depending on form)

Fitted standard distribution
  Use:  Use data to fit a classical distribution (exponential, uniform, Poisson, etc.)
  Pros: Generalizable; fills in "holes" in the data
  Cons: May not be valid; may be difficult

Extracting distributions out of traces
- How to overcome the finiteness of a trace? How to characterize a trace in general?
- Consider a trace as a set $\{X_1, \dots, X_n\}$ of individual values
- Assumption: all samples come from the same distribution
- Construct the empirical distribution function from this set
  - Sort the $\{X_1, \dots, X_n\}$ in increasing order such that $X_{(1)} \le \dots \le X_{(n)}$
  - Define a piecewise-linear distribution function:

$$
F(x) =
\begin{cases}
0 & \text{if } x < X_{(1)} \\
\dfrac{i-1}{n-1} + \dfrac{x - X_{(i)}}{(n-1)\,(X_{(i+1)} - X_{(i)})} & \text{if } X_{(i)} \le x < X_{(i+1)} \text{ for } i = 1, \dots, n-1 \\
1 & \text{if } X_{(n)} \le x
\end{cases}
$$

Empirical distribution - example
The figure shows an empirical distribution function for six data points.
[Figure: empirical distribution F(x) over x, with the sorted data points X_(1), ..., X_(6) marked on the x-axis]
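The piecewise-linear distribution function defined above, and the generation of variates from it by inversion, can be sketched as follows. The six data values are hypothetical (the values behind the figure are not recoverable from the slide), and the helper names `empirical_cdf` and `sample_empirical` are made up for this illustration. The sketch assumes at least two distinct observations.

```python
import bisect
import random

def empirical_cdf(sorted_x, x):
    """Piecewise-linear empirical distribution function F(x)."""
    n = len(sorted_x)
    if x < sorted_x[0]:
        return 0.0
    if x >= sorted_x[-1]:
        return 1.0
    # i is the 1-based index with X_(i) <= x < X_(i+1).
    i = bisect.bisect_right(sorted_x, x)
    lo, hi = sorted_x[i - 1], sorted_x[i]
    return (i - 1) / (n - 1) + (x - lo) / ((n - 1) * (hi - lo))

def sample_empirical(sorted_x, rng):
    """Draw one variate by inverting the piecewise-linear F."""
    n = len(sorted_x)
    p = (n - 1) * rng.random()
    k = min(int(p), n - 2)  # 0-based index of the interval containing p
    return sorted_x[k] + (p - k) * (sorted_x[k + 1] - sorted_x[k])

# Hypothetical trace with six data points.
trace = sorted([7.0, 1.0, 10.0, 4.0, 2.0, 5.0])
rng = random.Random(42)
variates = [sample_empirical(trace, rng) for _ in range(1000)]

print(empirical_cdf(trace, 3.0))     # value halfway between X_(2) and X_(3)
print(min(variates), max(variates))  # never outside [X_(1), X_(n)]
```

Note that every generated value lies between the smallest and the largest observation, which is exactly the limitation of empirical distributions regarding the tail.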

Empirical distributions - Discussion
- For realistic sample sizes, no or only few data are available for the tail of a distribution
- Empirical distributions as defined above cannot generate values larger than the maximum $X_j$, although such values might be desirable
- Adding an exponential tail to the data is possible and often useful

Traces vs. empirical distributions
- Going from traces to empirical distributions seems to be quite attractive
  - An infinite number of samples can easily be generated
- Is there a downside?
- Example: suppose you want to use the waiting time of customers in a queue as an input to some other simulation model
  - Trace-driven: generate a long list of individual waiting times of customers (either by measurement or by simulation), store this list, and whenever a waiting time is needed, use one entry of this list
  - Empirical distribution function: take the list, compute an empirical distribution, and generate a random variate whenever a waiting time is required

Traces vs. empirical distributions
Difference?
- Generating random variates from a distribution happens one at a time; no information about the previous values is stored
  - All values are identically distributed (they come from the same distribution) and are independent of each other
  - Their corresponding random variables are called IID (independent and identically distributed) variables
- In a trace, the history of the system, i.e. how the values were generated, is still maintained (though implicitly)
  - Such history can result in a mutual dependence of values
  - Consider a queue: when the person before you has to wait long, it is quite likely that you will also have to wait long
  - Waiting times in a queue are positively correlated
- The correlation structure of traces is destroyed by simulation using empirical distributions!

Traces vs. empirical distributions - Example
- Consider the waiting times in an M/M/1 queue
- Compute the empirical distribution from one simulation run
- Use this empirical distribution to generate random numbers according to it
- The plot shows both distributions; they are reasonably similar
  - We will see what "reasonable" means shortly
[Figure: cumulative distribution function over waiting time; curves: empirical distribution, randomly generated according to empirical distribution]

Traces vs. empirical distributions - Example
- But: look at the autocorrelation of the two sets of numbers (measured from the simulation and randomly generated)
- Compare the graphs in the figure
  - Note the slowly decaying autocorrelation for the simulated/measured data
  - Randomly generated data is practically uncorrelated
- Generating random numbers in a direct fashion destroys the correlation structure
[Figure: autocorrelation over lag j; curves: randomly generated data, simulated/measured data]

$$
\rho_{i,i+j} = \frac{C_{i,i+j}}{\sqrt{\sigma_i^2 \, \sigma_{i+j}^2}},
\qquad
C_{i,i+j} = E\left[(X_i - \mu_i)(X_{i+j} - \mu_{i+j})\right]
$$

Generalizing empirical distributions
- Empirical distributions are essentially a big set of data points
  - Unwieldy, big description
  - Generating random numbers based on such an empirical distribution is quite time-consuming
  - We will soon see how
- Is there a possibility to have a more compact, smaller representation?
- Yes: look for an analytically described (closed-form) distribution function that matches the empirical distribution function!
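The loss of correlation structure described in the M/M/1 example can be reproduced with a small experiment. The sketch below makes several assumptions that are not in the slides: waiting times are produced with the Lindley recursion, the arrival rate is 0.8 and the service rate 1.0, the lag-j autocorrelation is estimated with the usual sample estimator for a covariance-stationary series, and the "randomly generated" data are drawn independently from the observed values (a step-function empirical distribution rather than the piecewise-linear one).

```python
import random

def mm1_waiting_times(n, arrival_rate, service_rate, rng):
    """Waiting times of n successive customers (Lindley recursion)."""
    waits = [0.0]
    for _ in range(n - 1):
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        waits.append(max(0.0, waits[-1] + service - interarrival))
    return waits

def autocorrelation(x, lag):
    """Sample estimate of the lag-`lag` autocorrelation of the series x."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + lag] - mean)
              for i in range(n - lag)) / n
    return cov / var

rng = random.Random(7)
trace = mm1_waiting_times(20000, arrival_rate=0.8, service_rate=1.0, rng=rng)
# Independent draws from the observed values: same distribution, no history.
resampled = [rng.choice(trace) for _ in trace]

for lag in (1, 5, 20):
    print(lag,
          round(autocorrelation(trace, lag), 3),
          round(autocorrelation(resampled, lag), 3))
```

The first column of estimates (trace) should decay slowly with the lag, while the second (resampled) should stay close to zero.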

Fitting empirical distributions
To replace an empirical distribution by an analytically described distribution, the following steps are required:
- Find an analytical distribution which fits the overall shape of the empirical distribution
- As an analytical distribution is usually parameterized, find appropriate values for these parameters
- Determine the quality of the fit

Finding proper families of distributions
- To find a suitable family of analytical distributions, prior knowledge about the underlying empirical distribution is often available
  - E.g., certain assumptions about the arrival process directly result in Poisson distributions, etc.
- Negative selection is also possible:
  - Some values have natural upper or lower bounds
  - E.g., values that can only be positive should not be modeled with distributions that take on negative values
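As a minimal sketch of the parameter-estimation step, consider the exponential family. The choice of the maximum-likelihood estimator (rate = 1 / sample mean) is an assumption of this example, as the slides do not prescribe an estimation method; the function name `fit_exponential` and the synthetic data are hypothetical. The bounds check illustrates the negative selection mentioned above.

```python
import random

def fit_exponential(samples):
    """Maximum-likelihood estimate of the rate of an exponential distribution."""
    # Negative selection: an exponential model cannot produce negative values,
    # so data containing them rule this family out.
    if any(x < 0 for x in samples):
        raise ValueError("negative values: exponential family is not suitable")
    return len(samples) / sum(samples)

rng = random.Random(3)
# Synthetic interarrival times with a true rate of 2.0 (illustration only).
interarrivals = [rng.expovariate(2.0) for _ in range(5000)]
rate_hat = fit_exponential(interarrivals)
print(f"estimated rate: {rate_hat:.3f}")
```

The quality of such a fit still has to be checked separately, for example with the graphical procedures discussed later in the chapter.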

Heuristics to choose distributions
- How to choose distributions to fit data when no prior knowledge is available?
- Some heuristics exist
  - Summary statistics
  - Histogram
- Note that most of these heuristics (as well as procedures to check the quality of a fit) require the underlying data (from which the empirical distribution has been generated) to be independent
  - One means to check independence is the autocovariance

Summary statistics
- Compute summary statistics such as mean, median, variance, coefficient of variation, or skewness (measure of symmetry) from the original sample
- Compare these results with properties of a possible distribution
  - E.g., for symmetric distributions mean and median are equal
  - For some distributions, the coefficient of variation must be smaller than 1; for others it must be equal to 1 (exponential distribution)
- This is more a means to quickly weed out inappropriate distributions from a large set of possible ones
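The summary statistics listed above can be computed with a few lines of code. The sketch below is illustrative only: the function name `summary` and the synthetic exponential sample are assumptions, and the skewness is computed as the third central moment divided by the cubed standard deviation, which is one of several common definitions.

```python
import math
import random
import statistics

def summary(samples):
    """Mean, median, variance, coefficient of variation and skewness."""
    n = len(samples)
    mean = statistics.fmean(samples)
    median = statistics.median(samples)
    variance = statistics.variance(samples, mean)  # sample variance
    sd = math.sqrt(variance)
    third_moment = sum((x - mean) ** 3 for x in samples) / n
    return {
        "mean": mean,
        "median": median,
        "variance": variance,
        "cv": sd / mean,                  # coefficient of variation
        "skewness": third_moment / sd ** 3,
    }

rng = random.Random(11)
sample = [rng.expovariate(1.0) for _ in range(20000)]
stats = summary(sample)
for name, value in stats.items():
    print(f"{name:9s} {value:.3f}")
```

For this exponential sample the coefficient of variation comes out close to 1 and the mean clearly exceeds the median, so a symmetric family such as the normal distribution could be weeded out quickly.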

Histograms
- Compute a histogram of the original data
  - Typically, equidistant buckets are useful
- Compare the shape of the histogram with that of the density of possible distributions
  - Many shapes are quite characteristic and easily recognized
  - Ignore differences in location and scale
- How to choose the width/number k of buckets?
  - Sturges's rule: $k = \lfloor 1 + \log_2 n \rfloor$, where n is the number of data points
  - Better: rely on the visual impression; look for a smooth shape, with buckets neither too wide (detail is lost, spikes at crucial points could be missed) nor too narrow (small differences are overemphasized)
- A histogram can often indicate whether the density is the sum of two individual densities

Histograms - Example of a multimodal distribution
- The histogram shows data traffic between a Programmable Logic Controller (PLC) and a Human-Machine Interface (HMI) [1]
- It is the result of a keep-alive function that exchanges packets every 5 seconds

[1] Jasperneite, Jürgen: Analyse und Modellierung von Kommunikationslasten in der Fertigungstechnik. In: at - Automatisierungstechnik, R. Oldenbourg Verlag (49), Apr 2001
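Sturges's rule and a histogram with equidistant buckets can be sketched as follows. The bimodal test data (a mixture of two normal distributions) and the helper names `sturges_bins` and `histogram` are assumptions made for this illustration; the sketch also assumes that the data are not all identical.

```python
import math
import random

def sturges_bins(n):
    """Number of buckets k according to Sturges's rule."""
    return int(math.floor(1 + math.log2(n)))

def histogram(data, k):
    """Counts for k equidistant buckets between min(data) and max(data)."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / k
    counts = [0] * k
    for x in data:
        # The maximum value would fall just past the last bucket; clamp it.
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    return lo, width, counts

rng = random.Random(5)
# Hypothetical bimodal data: mixture of two normal distributions.
data = [rng.gauss(2.0, 0.3) if rng.random() < 0.35 else rng.gauss(4.3, 0.4)
        for _ in range(272)]

k = sturges_bins(len(data))
lo, width, counts = histogram(data, k)
for i, c in enumerate(counts):
    print(f"{lo + i * width:6.2f} .. {lo + (i + 1) * width:6.2f} | {'#' * c}")
```

The text output gives a rough visual impression; two separate peaks indicate that the density is a sum of two individual densities.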

Overview
- Introduction
- Quantifying models
- Goodness of fit tests

Goodness of fit tests
- Based on a hypothesized distribution along with estimated parameters, how can we tell how well this hypothesis matches the real data?
- Heuristic procedures
  - Density/histogram overplots: plot both the empirical histogram and the estimated density function in one graph, look for differences
  - Frequency comparison: plot the empirical histogram and the calculated histogram side by side, look for differences
  - Distribution function difference plot: compute the difference between the empirical and the estimated distribution and plot this difference. Ideally, the result is a horizontal line at 0
    - Directly comparing two plots of distributions is difficult for most humans
  - Probability/quantile plots: see below
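The distribution function difference can be computed directly from the sorted sample. In the sketch below, the empirical distribution is taken as the step function i/n at the i-th sorted value, which is a simplifying assumption (the piecewise-linear form could be used instead). The synthetic data, the fitted exponential model, the deliberately wrong uniform model and the helper name `cdf_differences` are all hypothetical.

```python
import math
import random

def cdf_differences(samples, cdf):
    """Pairs (x, empirical F(x) - hypothesized F(x)) at the sorted samples."""
    xs = sorted(samples)
    n = len(xs)
    return [(x, (i + 1) / n - cdf(x)) for i, x in enumerate(xs)]

rng = random.Random(9)
samples = [rng.expovariate(2.0) for _ in range(2000)]

# Hypothesis 1: exponential distribution with the rate estimated from the data.
rate_hat = len(samples) / sum(samples)
good = cdf_differences(samples, lambda x: 1.0 - math.exp(-rate_hat * x))

# Hypothesis 2 (deliberately poor): uniform distribution on [0, max].
upper = max(samples)
bad = cdf_differences(samples, lambda x: x / upper)

max_good = max(abs(d) for _, d in good)
max_bad = max(abs(d) for _, d in bad)
print(f"largest difference, exponential fit: {max_good:.3f}")
print(f"largest difference, uniform fit:     {max_bad:.3f}")
```

Plotting the pairs returned by `cdf_differences` gives the difference plot; for a good fit the curve stays close to the horizontal line at 0.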

Q-Q plots
- A way of plotting the difference between two distributions: Q-Q plots
- A quantile is the value of the variable that corresponds to a fixed cumulative frequency
  - First quartile = 0.25 quantile
  - Second quartile = median = 0.5 quantile
  - Third quartile = 0.75 quantile
- Any quantile can be read from the cdf plot

A Q-Q plot ...
- ... compares two univariate 1) distributions
- ... is a plot of matching quantiles; a straight line implies that the two distributions have the same shape
- ... has the units of the data
- ... emphasizes differences in the tails

1) Involving one variable, as opposed to two (bivariate) or many (multivariate)
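The points of a normal Q-Q plot can be computed without any plotting library. The sketch below pairs each sorted sample value with the standard normal quantile at the plotting position (i + 0.5) / n; this plotting position, the helper names and the synthetic samples are assumptions of the example. As a crude numerical stand-in for "the points lie on a straight line", it also computes the correlation coefficient of the point pairs.

```python
import math
import random
from statistics import NormalDist, fmean

def normal_qq_points(samples):
    """Pairs (theoretical standard normal quantile, sample quantile)."""
    xs = sorted(samples)
    n = len(xs)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(xs)]

def straightness(points):
    """Pearson correlation of the Q-Q points (1.0 = perfectly straight line)."""
    tx = [p[0] for p in points]
    sx = [p[1] for p in points]
    mt, ms = fmean(tx), fmean(sx)
    sxy = sum((a - mt) * (b - ms) for a, b in points)
    sxx = sum((a - mt) ** 2 for a in tx)
    syy = sum((b - ms) ** 2 for b in sx)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(21)
normal_sample = [rng.gauss(4.3, 0.4) for _ in range(500)]
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]

r_normal = straightness(normal_qq_points(normal_sample))
r_skewed = straightness(normal_qq_points(skewed_sample))
print(f"normal sample: {r_normal:.4f}")
print(f"skewed sample: {r_skewed:.4f}")
```

Because the theoretical quantiles are those of the standard normal distribution, a sample with a different mean and standard deviation still gives a straight line; only its intercept and slope change.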

Example: Q-Q plot
[Figure: Q-Q plot; axis labels: Sample, Normal]

Example: Old Faithful eruption times
Data describing the durations of eruptions of a geyser (in minutes):
3.600, 1.800, 3.333, 2.283, 4.533, 2.883, 4.700, 3.600, 1.950, 4.350,
1.833, 3.917, 4.200, 1.750, 4.700, 2.167, 1.750, 4.800, 1.600, 4.250,
1.800, 1.750, 3.450, 3.067, 4.533, 3.600, 1.967, 4.083, 3.850, 4.433,
4.300, 4.467, 3.367, 4.033, 3.833, 2.017, 1.867, 4.833, 1.833, 4.783,
4.350, 1.883, 4.567, 1.750, 4.533, 3.317, 3.833, 2.100, 4.633, 2.000,
4.800, 4.716, 1.833, 4.833, 1.733, 4.883, 3.717, 1.667, 4.567, 4.317,
2.233, 4.500, 1.750, 4.800, 1.817, 4.400, 4.167, 4.700, 2.067, 4.700,
4.033, 1.967, 4.500, 4.000, 1.983, 5.067, 2.017, 4.567, 3.883, 3.600,
4.133, 4.333, 4.100, 2.633, 4.067, 4.933, 3.950, 4.517, 2.167, 4.000,
2.200, 4.333, 1.867, 4.817, 1.833, 4.300, 4.667, 3.750, 1.867, 4.900,
2.483, 4.367, 2.100, 4.500, 4.050, 1.867, 4.700, 1.783, 4.850, 3.683,
4.733, 2.300, 4.900, 4.417, 1.700, 4.633, 2.317, 4.600, 1.817, 4.417,
2.617, 4.067, 4.250, 1.967, 4.600, 3.767, 1.917, 4.500, 2.267, 4.650,
1.867, 4.167, 2.800, 4.333, 1.833, 4.383, 1.883, 4.933, 2.033, 3.733,
4.233, 2.233, 4.533, 4.817, 4.333, 1.983, 4.633, 2.017, 5.100, 1.800,
5.033, 4.000, 2.400, 4.600, 3.567, 4.000, 4.500, 4.083, 1.800, 3.967,
2.200, 4.150, 2.000, 3.833, 3.500, 4.583, 2.367, 5.000, 1.933, 4.617,
1.917, 2.083, 4.583, 3.333, 4.167, 4.333, 4.500, 2.417, 4.000, 4.167,
1.883, 4.583, 4.250, 3.767, 2.033, 4.433, 4.083, 1.833, 4.417, 2.183,
4.800, 1.833, 4.800, 4.100, 3.966, 4.233, 3.500, 4.366, 2.250, 4.667,
2.100, 4.350, 4.133, 1.867, 4.600, 1.783, 4.367, 3.850, 1.933, 4.500,
2.383, 4.700, 1.867, 3.833, 3.417, 4.233, 2.400, 4.800, 2.000, 4.150,
1.867, 4.267, 1.750, 4.483, 4.000, 4.117, 4.083, 4.267, 3.917, 4.550,
4.083, 2.417, 4.183, 2.217, 4.450, 1.883, 1.850, 4.283, 3.950, 2.333,
4.150, 2.350, 4.933, 2.900, 4.583, 3.833, 2.083, 4.367, 2.133, 4.350,
2.200, 4.450, 3.567, 4.500, 4.150, 3.817, 3.917, 4.450, 2.000, 4.283,
4.767, 4.533, 1.850, 4.250, 1.983, 2.250, 4.750, 4.117, 2.150, 4.417,
1.817,

Histogram of eruption data
[Figure: "Histogram of eruptions"; x-axis: eruptions, y-axis: Density]

Empirical distribution of eruption data
[Figure: "ecdf(eruptions)"; x-axis: x, y-axis: Fn(x)]
- The data are evidently bimodal, so no standard distribution will fit
- What about looking at only, e.g., the upper part?
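Restricting the bimodal eruption data to its upper part can be sketched as follows. The threshold of 3 minutes is an assumption of this example (it is the split commonly used with this data set), the list contains only the first 24 values of the data listed earlier, and estimating the normal parameters by sample mean and sample standard deviation is likewise an assumption.

```python
import statistics

# First 24 values of the eruption data (excerpt, for illustration only).
eruptions = [3.600, 1.800, 3.333, 2.283, 4.533, 2.883, 4.700, 3.600,
             1.950, 4.350, 1.833, 3.917, 4.200, 1.750, 4.700, 2.167,
             1.750, 4.800, 1.600, 4.250, 1.800, 1.750, 3.450, 3.067]

THRESHOLD = 3.0  # minutes; assumed boundary between the two modes

short = [x for x in eruptions if x <= THRESHOLD]
long_ = [x for x in eruptions if x > THRESHOLD]

# Candidate normal model for the upper part only.
mu = statistics.fmean(long_)
sigma = statistics.stdev(long_)
print(f"{len(long_)} long eruptions, mean {mu:.3f} min, sd {sigma:.3f} min")
```

Whether a normal distribution with these parameters is an acceptable model for the upper part still needs to be checked, for example with a Q-Q plot.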

Restricted empirical distribution
[Figure: "ecdf(long)"; x-axis: x, y-axis: Fn(x)]
- Looks like a reasonable fit with a normal distribution
- Check with a Q-Q plot!

Q-Q plot for eruption data
[Figure: "Normal Q-Q Plot"; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]
- Reasonable fit, but some differences in the tail
- The shifted mean for the theoretical quantiles is not taken into account
- Example taken from the R manual (see Web page)

Traffic Modeling
- Introduction
- Quantifying models
- Goodness of fit tests


More information

More on Input Distributions

More on Input Distributions More on Input Distributions Importance of Using the Correct Distribution Replacing a distribution with its mean Arrivals Waiting line Processing order System Service mean interarrival time = 1 minute mean

More information

Queueing Theory. VK Room: M Last updated: October 17, 2013.

Queueing Theory. VK Room: M Last updated: October 17, 2013. Queueing Theory VK Room: M1.30 knightva@cf.ac.uk www.vincent-knight.com Last updated: October 17, 2013. 1 / 63 Overview Description of Queueing Processes The Single Server Markovian Queue Multi Server

More information

b. ( ) ( ) ( ) ( ) ( ) 5. Independence: Two events (A & B) are independent if one of the conditions listed below is satisfied; ( ) ( ) ( )

b. ( ) ( ) ( ) ( ) ( ) 5. Independence: Two events (A & B) are independent if one of the conditions listed below is satisfied; ( ) ( ) ( ) 1. Set a. b. 2. Definitions a. Random Experiment: An experiment that can result in different outcomes, even though it is performed under the same conditions and in the same manner. b. Sample Space: This

More information

Introduction to Queueing Theory

Introduction to Queueing Theory Introduction to Queueing Theory Raj Jain Washington University in Saint Louis Jain@eecs.berkeley.edu or Jain@wustl.edu A Mini-Course offered at UC Berkeley, Sept-Oct 2012 These slides and audio/video recordings

More information

Fundamentals of Applied Probability and Random Processes

Fundamentals of Applied Probability and Random Processes Fundamentals of Applied Probability and Random Processes,nd 2 na Edition Oliver C. Ibe University of Massachusetts, LoweLL, Massachusetts ip^ W >!^ AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS

More information

U.S. - Canadian Border Traffic Prediction

U.S. - Canadian Border Traffic Prediction Western Washington University Western CEDAR WWU Honors Program Senior Projects WWU Graduate and Undergraduate Scholarship 12-14-2017 U.S. - Canadian Border Traffic Prediction Colin Middleton Western Washington

More information

EE/CpE 345. Modeling and Simulation. Fall Class 9

EE/CpE 345. Modeling and Simulation. Fall Class 9 EE/CpE 345 Modeling and Simulation Class 9 208 Input Modeling Inputs(t) Actual System Outputs(t) Parameters? Simulated System Outputs(t) The input data is the driving force for the simulation - the behavior

More information

Estimation of multivariate critical layers: Applications to rainfall data

Estimation of multivariate critical layers: Applications to rainfall data Elena Di Bernardino, ICRA 6 / RISK 2015 () Estimation of Multivariate critical layers Barcelona, May 26-29, 2015 Estimation of multivariate critical layers: Applications to rainfall data Elena Di Bernardino,

More information

Observations Homework Checkpoint quizzes Chapter assessments (Possibly Projects) Blocks of Algebra

Observations Homework Checkpoint quizzes Chapter assessments (Possibly Projects) Blocks of Algebra September The Building Blocks of Algebra Rates, Patterns and Problem Solving Variables and Expressions The Commutative and Associative Properties The Distributive Property Equivalent Expressions Seeing

More information

Chapter 4a Probability Models

Chapter 4a Probability Models Chapter 4a Probability Models 4a.2 Probability models for a variable with a finite number of values 297 4a.1 Introduction Chapters 2 and 3 are concerned with data description (descriptive statistics) where

More information

STAT Section 3.4: The Sign Test. The sign test, as we will typically use it, is a method for analyzing paired data.

STAT Section 3.4: The Sign Test. The sign test, as we will typically use it, is a method for analyzing paired data. STAT 518 --- Section 3.4: The Sign Test The sign test, as we will typically use it, is a method for analyzing paired data. Examples of Paired Data: Similar subjects are paired off and one of two treatments

More information

CPSC 531 Systems Modeling and Simulation FINAL EXAM

CPSC 531 Systems Modeling and Simulation FINAL EXAM CPSC 531 Systems Modeling and Simulation FINAL EXAM Department of Computer Science University of Calgary Professor: Carey Williamson December 21, 2017 This is a CLOSED BOOK exam. Textbooks, notes, laptops,

More information

Bounded Delay for Weighted Round Robin with Burst Crediting

Bounded Delay for Weighted Round Robin with Burst Crediting Bounded Delay for Weighted Round Robin with Burst Crediting Sponsor: Sprint Kert Mezger David W. Petr Technical Report TISL-0230-08 Telecommunications and Information Sciences Laboratory Department of

More information

You may use a calculator. Translation: Show all of your work; use a calculator only to do final calculations and/or to check your work.

You may use a calculator. Translation: Show all of your work; use a calculator only to do final calculations and/or to check your work. GROUND RULES: Print your name at the top of this page. This is a closed-book and closed-notes exam. You may use a calculator. Translation: Show all of your work; use a calculator only to do final calculations

More information

B. Maddah INDE 504 Discrete-Event Simulation. Output Analysis (1)

B. Maddah INDE 504 Discrete-Event Simulation. Output Analysis (1) B. Maddah INDE 504 Discrete-Event Simulation Output Analysis (1) Introduction The basic, most serious disadvantage of simulation is that we don t get exact answers. Two different runs of the same model

More information

Introduction to Queueing Theory with Applications to Air Transportation Systems

Introduction to Queueing Theory with Applications to Air Transportation Systems Introduction to Queueing Theory with Applications to Air Transportation Systems John Shortle George Mason University February 28, 2018 Outline Why stochastic models matter M/M/1 queue Little s law Priority

More information

Random Processes. DS GA 1002 Probability and Statistics for Data Science.

Random Processes. DS GA 1002 Probability and Statistics for Data Science. Random Processes DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Modeling quantities that evolve in time (or space)

More information

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur Lecture No. # 33 Probability Models using Gamma and Extreme Value

More information

Chapter 1 Descriptive Statistics

Chapter 1 Descriptive Statistics MICHIGAN STATE UNIVERSITY STT 351 SECTION 2 FALL 2008 LECTURE NOTES Chapter 1 Descriptive Statistics Nao Mimoto Contents 1 Overview 2 2 Pictorial Methods in Descriptive Statistics 3 2.1 Different Kinds

More information

15-388/688 - Practical Data Science: Basic probability. J. Zico Kolter Carnegie Mellon University Spring 2018

15-388/688 - Practical Data Science: Basic probability. J. Zico Kolter Carnegie Mellon University Spring 2018 15-388/688 - Practical Data Science: Basic probability J. Zico Kolter Carnegie Mellon University Spring 2018 1 Announcements Logistics of next few lectures Final project released, proposals/groups due

More information

HEAVY-TRAFFIC EXTREME-VALUE LIMITS FOR QUEUES

HEAVY-TRAFFIC EXTREME-VALUE LIMITS FOR QUEUES HEAVY-TRAFFIC EXTREME-VALUE LIMITS FOR QUEUES by Peter W. Glynn Department of Operations Research Stanford University Stanford, CA 94305-4022 and Ward Whitt AT&T Bell Laboratories Murray Hill, NJ 07974-0636

More information

Class 11 Maths Chapter 15. Statistics

Class 11 Maths Chapter 15. Statistics 1 P a g e Class 11 Maths Chapter 15. Statistics Statistics is the Science of collection, organization, presentation, analysis and interpretation of the numerical data. Useful Terms 1. Limit of the Class

More information

Additional Problems Additional Problem 1 Like the http://www.stat.umn.edu/geyer/5102/examp/rlike.html#lmax example of maximum likelihood done by computer except instead of the gamma shape model, we will

More information

Sample Problems for the Final Exam

Sample Problems for the Final Exam Sample Problems for the Final Exam 1. Hydraulic landing assemblies coming from an aircraft rework facility are each inspected for defects. Historical records indicate that 8% have defects in shafts only,

More information

Some Background Information on Long-Range Dependence and Self-Similarity On the Variability of Internet Traffic Outline Introduction and Motivation Ch

Some Background Information on Long-Range Dependence and Self-Similarity On the Variability of Internet Traffic Outline Introduction and Motivation Ch On the Variability of Internet Traffic Georgios Y Lazarou Information and Telecommunication Technology Center Department of Electrical Engineering and Computer Science The University of Kansas, Lawrence

More information

Since D has an exponential distribution, E[D] = 0.09 years. Since {A(t) : t 0} is a Poisson process with rate λ = 10, 000, A(0.

Since D has an exponential distribution, E[D] = 0.09 years. Since {A(t) : t 0} is a Poisson process with rate λ = 10, 000, A(0. IEOR 46: Introduction to Operations Research: Stochastic Models Chapters 5-6 in Ross, Thursday, April, 4:5-5:35pm SOLUTIONS to Second Midterm Exam, Spring 9, Open Book: but only the Ross textbook, the

More information

Exploratory Data Analysis August 26, 2004

Exploratory Data Analysis August 26, 2004 Exploratory Data Analysis August 26, 2004 Exploratory Data Analysis p. 1/?? Agent Orange Case Study (SS: Ch 3) Dioxin concentrations in parts per trillion (ppt) for 646 Vietnam veterans and 97 veterans

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

ALGEBRA 1 PACING GUIDE

ALGEBRA 1 PACING GUIDE Unit 8 Graphing Quadratic Functions F-BF.3 F-IF.2 F-IF.4 F-IF.7a F-BF.1 Identify the effect on the graph of replacing f(x) by f(x) + k, k f(x), f(kx), and f(x + k) for specific values of k (both positive

More information

A C E. Answers Investigation 4. Applications

A C E. Answers Investigation 4. Applications Answers Applications 1. 1 student 2. You can use the histogram with 5-minute intervals to determine the number of students that spend at least 15 minutes traveling to school. To find the number of students,

More information

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr. Simulation Discrete-Event System Simulation Chapter 9 Verification and Validation of Simulation Models Purpose & Overview The goal of the validation process is: To produce a model that represents true

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

CSE 312, 2017 Winter, W.L. Ruzzo. 7. continuous random variables

CSE 312, 2017 Winter, W.L. Ruzzo. 7. continuous random variables CSE 312, 2017 Winter, W.L. Ruzzo 7. continuous random variables The new bit continuous random variables Discrete random variable: values in a finite or countable set, e.g. X {1,2,..., 6} with equal probability

More information

CPSC 531: System Modeling and Simulation. Carey Williamson Department of Computer Science University of Calgary Fall 2017

CPSC 531: System Modeling and Simulation. Carey Williamson Department of Computer Science University of Calgary Fall 2017 CPSC 531: System Modeling and Simulation Carey Williamson Department of Computer Science University of Calgary Fall 2017 Quote of the Day A person with one watch knows what time it is. A person with two

More information

Discrete Random Variables (1) Solutions

Discrete Random Variables (1) Solutions STAT/MATH 394 A - PROBABILITY I UW Autumn Quarter 06 Néhémy Lim Discrete Random Variables ( Solutions Problem. The probability mass function p X of some discrete real-valued random variable X is given

More information

Chapter 3 Balance equations, birth-death processes, continuous Markov Chains

Chapter 3 Balance equations, birth-death processes, continuous Markov Chains Chapter 3 Balance equations, birth-death processes, continuous Markov Chains Ioannis Glaropoulos November 4, 2012 1 Exercise 3.2 Consider a birth-death process with 3 states, where the transition rate

More information

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data Review for Exam #1 1 Chapter 1 Population the complete collection of elements (scores, people, measurements, etc.) to be studied Sample a subcollection of elements drawn from a population 11 The Nature

More information

EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 19

EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 19 EEC 686/785 Modeling & Performance Evaluation of Computer Systems Lecture 19 Department of Electrical and Computer Engineering Cleveland State University wenbing@ieee.org (based on Dr. Raj Jain s lecture

More information

Discrete probability distributions

Discrete probability distributions Discrete probability s BSAD 30 Dave Novak Fall 08 Source: Anderson et al., 05 Quantitative Methods for Business th edition some slides are directly from J. Loucks 03 Cengage Learning Covered so far Chapter

More information

ANÁLISE DOS DADOS. Daniela Barreiro Claro

ANÁLISE DOS DADOS. Daniela Barreiro Claro ANÁLISE DOS DADOS Daniela Barreiro Claro Outline Data types Graphical Analysis Proimity measures Prof. Daniela Barreiro Claro Types of Data Sets Record Ordered Relational records Video data: sequence of

More information