Chapter 5: Statistical Models in Simulation (Prof. Dr. Mesut Güneş)



Contents
- Basic Probability Theory Concepts
- Discrete Distributions
- Continuous Distributions
- Poisson Process
- Empirical Distributions
- Useful Statistical Models
5.2

Purpose & Overview
The world the model-builder sees is probabilistic rather than deterministic; some statistical model may well describe the variations.
An appropriate model can be developed by sampling the phenomenon of interest:
- Select a known distribution through educated guesses
- Make estimates of the parameters
- Test for goodness of fit
In this chapter:
- Review several important probability distributions
- Present some typical applications of these models
5.3

Basic Probability Theory Concepts 5.4

Review of Terminology and Concepts
In this section, we will review the following concepts:
- Discrete random variables
- Continuous random variables
- Cumulative distribution function
- Expected value
- Variance
- Standard deviation
- Covariance
- Correlation
- Quantile
5.5

Discrete Random Variables
X is a discrete random variable if the number of possible values of X is finite or countably infinite.
Example: Consider packets arriving at a router. Let X be the number of packets arriving each second.
- Range space of X (the possible values of X): $R_X = \{0, 1, 2, \dots\}$
- $p(x_i) = P(X = x_i)$ is the probability that the random variable X takes the value $x_i$.
The $p(x_i)$, $i = 1, 2, \dots$, must satisfy:
1. $p(x_i) \geq 0$ for all $i$
2. $\sum_{i=1}^{\infty} p(x_i) = 1$
The collection of pairs $(x_i, p(x_i))$, $i = 1, 2, \dots$, is called the probability distribution of X, and $p(x_i)$ is called the probability mass function (PMF) of X.
5.6
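
The two conditions are easy to check numerically. A minimal Python sketch, using a small made-up PMF purely for illustration (the probabilities below are not from the router example):

```python
# Hypothetical PMF for a small discrete random variable, for illustration only
pmf = {0: 0.35, 1: 0.40, 2: 0.15, 3: 0.10}

# Condition 1: all probabilities are non-negative
assert all(p >= 0 for p in pmf.values())

# Condition 2: probabilities sum to 1
assert abs(sum(pmf.values()) - 1.0) < 1e-12

# P(X = 2), read directly from the probability mass function
print("P(X = 2) =", pmf[2])
```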

Continuous Random Variables
X is a continuous random variable if its range space $R_X$ is an interval or a collection of intervals.
The probability that X lies in the interval $[a, b]$ is given by:
$P(a \leq X \leq b) = \int_a^b f(x)\,dx$
$f(x)$ is called the probability density function (PDF) of X, and satisfies:
1. $f(x) \geq 0$ for all $x$ in $R_X$
2. $\int_{R_X} f(x)\,dx = 1$
3. $f(x) = 0$ if $x$ is not in $R_X$
Properties:
1. $P(X = x_0) = 0$, because $\int_{x_0}^{x_0} f(x)\,dx = 0$
2. $P(a \leq X \leq b) = P(a < X \leq b) = P(a \leq X < b) = P(a < X < b)$
5.7

Continuous Random Variables
Example: The life of an inspection device is given by X, a continuous random variable with PDF:
$f(x) = \frac{1}{2} e^{-x/2}$ for $x \geq 0$, and $f(x) = 0$ otherwise.
X has an exponential distribution with mean 2 years.
The probability that the device's life is between 2 and 3 years is:
$P(2 \leq X \leq 3) = \frac{1}{2}\int_2^3 e^{-x/2}\,dx = 0.145$
5.8

Cumulative Distribution Function
The cumulative distribution function (CDF) is denoted by $F(x)$, where $F(x) = P(X \leq x)$.
- If X is discrete, then $F(x) = \sum_{x_i \leq x} p(x_i)$
- If X is continuous, then $F(x) = \int_{-\infty}^{x} f(t)\,dt$
Properties:
1. $F$ is a nondecreasing function: if $a \leq b$, then $F(a) \leq F(b)$.
2. $\lim_{x \to \infty} F(x) = 1$
3. $\lim_{x \to -\infty} F(x) = 0$
All probability questions about X can be answered in terms of the CDF:
$P(a < X \leq b) = F(b) - F(a)$ for all $a \leq b$
5.9

Cumulative Distribution Function
Example: The inspection device has CDF:
$F(x) = \frac{1}{2}\int_0^x e^{-t/2}\,dt = 1 - e^{-x/2}$
The probability that the device lasts for less than 2 years:
$P(0 \leq X \leq 2) = F(2) - F(0) = 1 - e^{-1} = 0.632$
The probability that it lasts between 2 and 3 years:
$P(2 \leq X \leq 3) = F(3) - F(2) = (1 - e^{-3/2}) - (1 - e^{-1}) = 0.145$
5.10
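
These numbers can be checked quickly with a short Python sketch, assuming scipy is available (the exponential with mean 2 corresponds to scale=2):

```python
from scipy.stats import expon

# Inspection-device life: exponential with mean 2 years
X = expon(scale=2)

print("P(0 <= X <= 2) =", X.cdf(2) - X.cdf(0))   # ~0.632
print("P(2 <  X <= 3) =", X.cdf(3) - X.cdf(2))   # ~0.145
```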

Expected value
The expected value of X is denoted by $E(X)$.
- If X is discrete: $E(X) = \sum_{\text{all } i} x_i\, p(x_i)$
- If X is continuous: $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
Also known as the mean, $m$, $\mu$, or the 1st moment of X.
A measure of central tendency.
5.11

Variance
The variance of X is denoted by $V(X)$, $\text{Var}(X)$, or $\sigma^2$.
Definition: $V(X) = E\big[(X - E[X])^2\big]$
Also: $V(X) = E(X^2) - [E(X)]^2$
A measure of the spread or variation of the possible values of X around the mean.
(Figure: two PDFs with the same mean $\mu$, one with large $\sigma^2$ and one with small $\sigma^2$.)
5.12

Standard deviation
The standard deviation (SD) of X is denoted by $\sigma$.
Definition: $\sigma = \sqrt{V(X)}$
The standard deviation is expressed in the same units as the mean.
Interpret $\sigma$ always together with the mean.
Attention: the standard deviations of two different data sets may be difficult to compare.
5.13

Expected value and variance: Example
Example: The mean life of the previous inspection device is:
$E(X) = \frac{1}{2}\int_0^{\infty} x e^{-x/2}\,dx = \left[-x e^{-x/2}\right]_0^{\infty} + \int_0^{\infty} e^{-x/2}\,dx = 2$
To compute the variance of X, we first compute $E(X^2)$:
$E(X^2) = \frac{1}{2}\int_0^{\infty} x^2 e^{-x/2}\,dx = \left[-x^2 e^{-x/2}\right]_0^{\infty} + 2\int_0^{\infty} x e^{-x/2}\,dx = 8$
Hence, the variance and standard deviation of the device's life are:
$V(X) = E(X^2) - [E(X)]^2 = 8 - 4 = 4$, and $\sigma = \sqrt{V(X)} = 2$
5.14

Expected value and variance: Example
Partial integration: $\int u\,v'\,dx = u\,v - \int u'\,v\,dx$
Set $u = x$, $v' = \frac{1}{2}e^{-x/2}$, so that $u' = 1$, $v = -e^{-x/2}$:
$E(X) = \frac{1}{2}\int_0^{\infty} x e^{-x/2}\,dx = \left[-x e^{-x/2}\right]_0^{\infty} + \int_0^{\infty} e^{-x/2}\,dx = 0 + \left[-2 e^{-x/2}\right]_0^{\infty} = 2$
5.15
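
The integration-by-parts results can also be verified symbolically; a minimal sketch assuming sympy is available:

```python
import sympy as sp

x = sp.symbols('x', positive=True)
f = sp.Rational(1, 2) * sp.exp(-x / 2)      # PDF of the device life

EX  = sp.integrate(x * f, (x, 0, sp.oo))    # E[X]   -> 2
EX2 = sp.integrate(x**2 * f, (x, 0, sp.oo)) # E[X^2] -> 8
V   = sp.simplify(EX2 - EX**2)              # V[X]   -> 4

print(EX, EX2, V, sp.sqrt(V))               # 2 8 4 2
```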

Mean and variance of sums
If $X_1, X_2, \dots, X_k$ are $k$ random variables and $a_1, a_2, \dots, a_k$ are $k$ constants, then
$E(a_1 X_1 + a_2 X_2 + \dots + a_k X_k) = a_1 E(X_1) + a_2 E(X_2) + \dots + a_k E(X_k)$
For independent variables:
$\text{Var}(a_1 X_1 + a_2 X_2 + \dots + a_k X_k) = a_1^2 \text{Var}(X_1) + a_2^2 \text{Var}(X_2) + \dots + a_k^2 \text{Var}(X_k)$
5.16

Coefficient of variation
The ratio of the standard deviation to the mean is called the coefficient of variation (C.O.V.):
$\text{C.O.V.} = \frac{\text{standard deviation}}{\text{mean}} = \frac{\sigma}{\mu}, \quad \mu > 0$
- Dimensionless, normalized measure of dispersion
- Can be used to compare different data sets, unlike the standard deviation.
5.17

Covariance
Given two random variables $x$ and $y$ with means $\mu_x$ and $\mu_y$, their covariance is defined as
$\text{Cov}(x, y) = \sigma_{xy}^2 = E[(x - \mu_x)(y - \mu_y)] = E(xy) - E(x)E(y)$
Cov(x, y) measures the dependency of $x$ and $y$, i.e., how $x$ and $y$ vary together.
For independent variables the covariance is zero, since $E(xy) = E(x)E(y)$.
5.18

Correlation coefficient
The normalized value of the covariance is called the correlation coefficient, or simply the correlation:
$\text{Correlation}(x, y) = \rho_{xy} = \frac{\sigma_{xy}^2}{\sigma_x \sigma_y}$
The correlation lies between -1 and +1.
5.19

Quantile
The value $x_\alpha$ at which the CDF takes the value $\alpha$ is called the $\alpha$-quantile or $100\alpha$-percentile:
$P(X \leq x_\alpha) = F(x_\alpha) = \alpha, \quad \alpha \in [0, 1]$
Relationship: $x_\alpha = F^{-1}(\alpha)$
The median is the 50-percentile, or 0.5-quantile.
5.20
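
For the exponential device-life example, the quantile is simply the inverse CDF. A sketch using scipy's ppf (percent-point function, i.e. $F^{-1}$), assuming scipy is available:

```python
from scipy.stats import expon

X = expon(scale=2)                 # device life, mean 2 years

median = X.ppf(0.5)                # 0.5-quantile = 2*ln(2) ~ 1.386 years
q90    = X.ppf(0.9)                # 90-percentile

print("median =", median, " 90-percentile =", q90)
print("check: F(median) =", X.cdf(median))   # ~0.5
```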

Mean, median, and mode
Three different indices for the central tendency of a distribution:
- Mean: $\mu = E(X) = \sum_{i=1}^{n} x_i\, p(x_i)$ (discrete) or $\int_{-\infty}^{\infty} x f(x)\,dx$ (continuous)
- Median: the 0.5-quantile, i.e., the $x$ for which half of the values are smaller and the other half are larger.
- Mode: the most likely value, i.e., the $x_i$ that has the highest probability $p(x_i)$, or the $x$ at which the PDF is maximum.
5.21

Mean, median, and mode
(Figure: example PDFs illustrating the relative positions of median, mean, and mode, including distributions with a single mode, two modes, and no mode.)
5.22

Selecting among mean, median, and mode
Decision flow for selecting a measure of central tendency:
- Is the data categorical? Yes: use the mode.
- Otherwise, is the total of interest? Yes: use the mean.
- Otherwise, is the distribution skewed? Yes: use the median. No: use the mean.
5.23

Relationship between simulation and probability theory 5.24

Central limit theorem
Let $Z_n$ be the random variable $Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}$ and let $F_n(z)$ be the distribution function of $Z_n$ for a sample size of $n$, i.e., $F_n(z) = P(Z_n \leq z)$. Then
$F_n(z) \to \Phi(z)$ as $n \to \infty$,
where $\Phi(z)$ is the standard normal distribution function ($\mu = 0$, $\sigma^2 = 1$):
$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-y^2/2}\,dy$ for $-\infty < z < \infty$
5.25
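
A small simulation illustrates the theorem: standardized sample means of a decidedly non-normal distribution (here the exponential device life with $\mu = 2$, $\sigma = 2$) already behave like N(0,1) for moderate n. A sketch assuming numpy and scipy are available; sample size and repetition count are arbitrary illustrative choices:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
mu, sigma, n, reps = 2.0, 2.0, 30, 100_000

# reps independent samples of size n from exp(mean=2); standardize the sample means
samples = rng.exponential(scale=mu, size=(reps, n))
z = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))

# The empirical distribution of Z_n is already close to the standard normal
print("P(Z_n <= 1) ~", np.mean(z <= 1.0), " Phi(1) =", norm.cdf(1.0))
```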

Strong law of large numbers
Let $X_1, X_2, \dots, X_n$ be IID random variables with mean $\mu$. Then the sample mean
$\bar{X}_n \to \mu$ as $n \to \infty$, with probability 1.
5.26

Strong law of large numbers 5.27
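
The convergence of the sample mean can likewise be checked numerically; a minimal sketch assuming numpy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 2.0

# Running sample mean of IID exponential(mean=2) observations
x = rng.exponential(scale=mu, size=100_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in (10, 100, 1_000, 100_000):
    print(f"n = {n:>7}: sample mean = {running_mean[n - 1]:.4f}  (mu = {mu})")
```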

Discrete Distributions 5.28

Discrete Distributions
Discrete random variables are used to describe random phenomena in which only integer values can occur.
In this section, we will learn about:
- Bernoulli trials and the Bernoulli distribution
- Binomial distribution
- Geometric and negative binomial distribution
- Poisson distribution
5.29

Bernoulli Trials and Bernoulli Distribution
Bernoulli trials: Consider an experiment consisting of n trials, each of which can be a success or a failure.
$X_j = 1$ if the j-th experiment is a success, $X_j = 0$ if the j-th experiment is a failure.
The Bernoulli distribution (one trial):
$p_j(x_j) = p$ if $x_j = 1$ ($j = 1, 2, \dots, n$); $p_j(x_j) = q := 1 - p$ if $x_j = 0$
where $E(X_j) = p$ and $V(X_j) = p(1 - p) = pq$.
5.30

Bernoulli Trials and Bernoulli Distribution
Bernoulli process: n Bernoulli trials where the trials are independent:
$p(x_1, x_2, \dots, x_n) = p_1(x_1)\, p_2(x_2) \cdots p_n(x_n)$
5.31

Binomial Distribution
The number of successes in n Bernoulli trials, X, has a binomial distribution:
$p(x) = \binom{n}{x} p^x q^{n-x}$ for $x = 0, 1, 2, \dots, n$, and $p(x) = 0$ otherwise.
- $\binom{n}{x}$: the number of outcomes having the required number of successes and failures
- $p^x q^{n-x}$: the probability that there are $x$ successes and $n - x$ failures
The mean: $E(X) = p + p + \dots + p = np$
The variance: $V(X) = pq + pq + \dots + pq = npq$
5.32
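
The connection to Bernoulli trials can be checked by simulation: summing n independent Bernoulli outcomes gives a binomial variate, and the empirical mean and variance approach np and npq. A sketch assuming numpy and scipy are available, with arbitrary illustrative parameters:

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(1)
n, p, reps = 20, 0.3, 200_000

# Each row: n Bernoulli(p) trials; the row sum is one binomial(n, p) variate
trials = rng.random((reps, n)) < p
x = trials.sum(axis=1)

print("empirical mean/var:", x.mean(), x.var())     # ~ np = 6, npq = 4.2
print("theoretical       :", n * p, n * p * (1 - p))
print("P(X = 6), exact PMF:", binom.pmf(6, n, p))
```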

Geometric Distribution
The number of Bernoulli trials, X, needed to achieve the 1st success:
$p(x) = q^{x-1} p$ for $x = 1, 2, \dots$, and $p(x) = 0$ otherwise.
$E(X) = 1/p$ and $V(X) = q/p^2$
5.33

Negative Binomial Distribution
The number of Bernoulli trials, X, until the k-th success.
If X has a negative binomial distribution with parameters p and k, then:
$p(x) = \binom{x-1}{k-1} q^{x-k} p^k$ for $x = k, k+1, k+2, \dots$, and $p(x) = 0$ otherwise.
- $\binom{x-1}{k-1} q^{x-k} p^{k-1}$: k-1 successes (and x-k failures) in the first x-1 trials
- $p$: the k-th success on the x-th trial
$E(X) = k/p$ and $V(X) = kq/p^2$
5.34

Poisson Distribution
The Poisson distribution describes many random processes quite well and is mathematically quite simple.
With parameter $\alpha > 0$, the PMF and CDF are:
$p(x) = \frac{e^{-\alpha} \alpha^x}{x!}$ for $x = 0, 1, 2, \dots$, and $p(x) = 0$ otherwise
$F(x) = \sum_{i=0}^{x} \frac{e^{-\alpha} \alpha^i}{i!}$
$E(X) = \alpha = V(X)$
5.35

Poisson Distribution
Example: A computer repair person is beeped each time there is a call for service. The number of beeps per hour is Poisson distributed with $\alpha = 2$ per hour.
The probability of three beeps in the next hour:
$p(3) = \frac{2^3}{3!} e^{-2} = 0.18$; also $p(3) = F(3) - F(2) = 0.857 - 0.677 = 0.18$
The probability of two or more beeps in a 1-hour period:
$P(2 \text{ or more}) = 1 - p(0) - p(1) = 1 - F(1) = 0.594$
5.36
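
A quick check of the beeper example with $\alpha = 2$, assuming scipy is available:

```python
from scipy.stats import poisson

alpha = 2                                        # mean number of beeps per hour

p3 = poisson.pmf(3, alpha)                       # ~0.180
p3_via_cdf = poisson.cdf(3, alpha) - poisson.cdf(2, alpha)
p_two_or_more = 1 - poisson.cdf(1, alpha)        # 1 - p(0) - p(1) ~ 0.594

print(p3, p3_via_cdf, p_two_or_more)
```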

Continuous Distributions 5.37

Continuous Distributions
Continuous random variables can be used to describe random phenomena in which the variable can take on any value in some interval.
In this section, the distributions studied are:
- Uniform
- Exponential
- Weibull
- Normal
- Lognormal
5.38

Uniform Distribution
A random variable X is uniformly distributed on the interval $(a, b)$, written $U(a, b)$, if its PDF and CDF are:
$f(x) = \frac{1}{b-a}$ for $a \leq x \leq b$, and $f(x) = 0$ otherwise.
$F(x) = 0$ for $x < a$; $F(x) = \frac{x-a}{b-a}$ for $a \leq x < b$; $F(x) = 1$ for $x \geq b$.
Properties:
- $P(x_1 < X < x_2)$ is proportional to the length of the interval: $F(x_2) - F(x_1) = \frac{x_2 - x_1}{b - a}$
- $E(X) = \frac{a+b}{2}$, $V(X) = \frac{(b-a)^2}{12}$
U(0,1) provides the means to generate random numbers, from which random variates can be generated.
5.39
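
The last point is the basis of random-variate generation in simulation: draw $U \sim U(0,1)$ and apply the inverse CDF of the target distribution. A minimal inverse-transform sketch for the exponential distribution ($X = -\ln(1-U)/\lambda$), assuming numpy is available; the rate below is an arbitrary illustrative value:

```python
import numpy as np

rng = np.random.default_rng(7)
lam = 0.5                       # rate of the target exponential distribution

u = rng.random(100_000)         # U(0,1) random numbers
x = -np.log(1.0 - u) / lam      # inverse CDF of exp(lambda): F^{-1}(u) = -ln(1-u)/lambda

print("sample mean vs 1/lambda:", x.mean(), 1 / lam)
```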

Exponential Distribution
A random variable X is exponentially distributed with parameter $\lambda > 0$ if its PDF and CDF are:
$f(x) = \lambda e^{-\lambda x}$ for $x \geq 0$, and $f(x) = 0$ elsewhere.
$F(x) = 0$ for $x < 0$; $F(x) = \int_0^x \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda x}$ for $x \geq 0$.
$E(X) = 1/\lambda$, $V(X) = 1/\lambda^2$
5.40

Exponential Distribution
Used to model interarrival times when arrivals are completely random, and to model service times that are highly variable.
For several different exponential PDFs (see figure), the value of the intercept on the vertical axis is $\lambda$, and all PDFs eventually intersect.
5.41

Exponential Distribution
Memoryless property: for all $s, t \geq 0$: $P(X > s + t \mid X > s) = P(X > t)$
Example: A lamp has life $X \sim \exp(\lambda = 1/3)$ per hour, hence on average 1 failure per 3 hours.
- The probability that the lamp lasts longer than its mean life: $P(X > 3) = 1 - P(X \leq 3) = 1 - (1 - e^{-3/3}) = e^{-1} = 0.368$
- The probability that the lamp lasts between 2 and 3 hours: $P(2 \leq X \leq 3) = F(3) - F(2) = 0.145$
- The probability that it lasts for another hour given it has been operating for 2.5 hours: $P(X > 3.5 \mid X > 2.5) = P(X > 1) = e^{-1/3} = 0.717$
5.42

Exponential Distribution
Memoryless property, derivation:
$P(X > s + t \mid X > s) = \frac{P(X > s + t)}{P(X > s)} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t)$
5.43
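
A numeric check of the lamp example and the memoryless property, assuming scipy is available (sf is the survival function $P(X > x)$):

```python
from scipy.stats import expon

lamp = expon(scale=3)            # exp(lambda = 1/3): mean life 3 hours

print("P(X > 3)             =", lamp.sf(3))                   # ~0.368
print("P(2 <= X <= 3)       =", lamp.cdf(3) - lamp.cdf(2))    # ~0.145
print("P(X > 3.5 | X > 2.5) =", lamp.sf(3.5) / lamp.sf(2.5))  # ~0.717
print("P(X > 1)             =", lamp.sf(1))                   # same value: memoryless
```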

Weibull Distribution
A random variable X has a Weibull distribution if its PDF has the form:
$f(x) = \frac{\beta}{\alpha}\left(\frac{x-\nu}{\alpha}\right)^{\beta-1} \exp\left[-\left(\frac{x-\nu}{\alpha}\right)^{\beta}\right]$ for $x \geq \nu$, and $f(x) = 0$ otherwise.
3 parameters:
- Location parameter: $\nu$, $-\infty < \nu < \infty$
- Scale parameter: $\alpha$, $\alpha > 0$
- Shape parameter: $\beta$, $\beta > 0$
Example: $\nu = 0$ and $\alpha = 1$ (see figure for several shape parameters).
5.44

Weibull Distribution
For $\beta = 1$ and $\nu = 0$:
$f(x) = \frac{1}{\alpha} e^{-x/\alpha}$ for $x \geq 0$, and $f(x) = 0$ otherwise,
i.e., when $\beta = 1$, $X \sim \exp(\lambda = 1/\alpha)$.
5.45

Normal Distribution
A random variable X is normally distributed if it has the PDF:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$ for $-\infty < x < \infty$
Mean: $-\infty < \mu < \infty$; Variance: $\sigma^2 > 0$
Denoted as $X \sim N(\mu, \sigma^2)$
Properties:
- $\lim_{x \to -\infty} f(x) = 0$ and $\lim_{x \to \infty} f(x) = 0$
- $f(\mu - x) = f(\mu + x)$; the PDF is symmetric about $\mu$.
- The maximum value of the PDF occurs at $x = \mu$, so the mean and mode are equal.
5.46

Normal Distribution
Evaluating the distribution:
- Use numerical methods (no closed form)
- Independent of $\mu$ and $\sigma$, using the standard normal distribution $Z \sim N(0, 1)$
Transformation of variables: let $Z = \frac{X - \mu}{\sigma}$, then
$F(x) = P(X \leq x) = P\left(Z \leq \frac{x-\mu}{\sigma}\right) = \int_{-\infty}^{(x-\mu)/\sigma} \phi(z)\,dz = \Phi\left(\frac{x-\mu}{\sigma}\right)$,
where $\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\,dt$
5.47

Normal Distribution
Example: The time required to load an oceangoing vessel, X, is distributed as N(12, 4), i.e., $\mu = 12$, $\sigma = 2$.
The probability that the vessel is loaded in less than 10 hours:
$F(10) = \Phi\left(\frac{10-12}{2}\right) = \Phi(-1) = 0.1587$
Using the symmetry property, $\Phi(-1)$ is the complement of $\Phi(1)$, i.e., $\Phi(-1) = 1 - \Phi(1)$.
5.48
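
The same number can be obtained from the standard normal CDF $\Phi$; a sketch assuming scipy is available:

```python
from scipy.stats import norm

mu, sigma = 12, 2                               # loading time ~ N(12, 4)

print("F(10) =", norm.cdf((10 - mu) / sigma))   # Phi(-1) ~ 0.1587
print("      =", norm.cdf(10, loc=mu, scale=sigma))
print("symmetry: 1 - Phi(1) =", 1 - norm.cdf(1))
```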

Normal Distribution
Why is the normal distribution important?
- The most commonly used distribution in data analysis
- The sum of n independent normal variates is a normal variate.
- The sum of a large number of independent observations from any distribution is approximately normally distributed (central limit theorem).
5.49

Lognormal Distribution
A random variable X has a lognormal distribution if its PDF has the form:
$f(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left[-\frac{(\ln x - \mu)^2}{2\sigma^2}\right]$ for $x > 0$, and $f(x) = 0$ otherwise.
- Mean: $E(X) = e^{\mu + \sigma^2/2}$
- Variance: $V(X) = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right)$
(Figure: lognormal PDFs with $\mu = 1$ and $\sigma^2 = 0.5, 1, 2$.)
Relationship with the normal distribution:
- If $Y \sim N(\mu, \sigma^2)$, then $X = e^Y \sim \text{lognormal}(\mu, \sigma^2)$
- The parameters $\mu$ and $\sigma^2$ are not the mean and variance of the lognormal random variable X.
5.50

Poisson Process 5.51

Poisson Process
Definition: N(t) is a counting function that represents the number of events that occurred in [0, t].
A counting process $\{N(t), t \geq 0\}$ is a Poisson process with mean rate $\lambda$ if:
- Arrivals occur one at a time.
- $\{N(t), t \geq 0\}$ has stationary increments: the number of arrivals in [t, t+s] depends only on s, not on the starting point t. Arrivals are completely random.
- $\{N(t), t \geq 0\}$ has independent increments: the numbers of arrivals during non-overlapping time intervals are independent. Future arrivals occur completely at random.
5.52

Poisson Process
Properties:
$P(N(t) = n) = \frac{(\lambda t)^n}{n!} e^{-\lambda t}$ for $t \geq 0$ and $n = 0, 1, 2, \dots$
- Equal mean and variance: $E[N(t)] = V[N(t)] = \lambda t$
- Stationary increments: the number of arrivals between time s and time t, with $s < t$, is also Poisson distributed with mean $\lambda(t - s)$.
5.53

Poisson Process: Interarrival Times
Consider the interarrival times of a Poisson process, $A_1, A_2, \dots$, where $A_i$ is the elapsed time between arrival i and arrival i+1.
The 1st arrival occurs after time t iff there are no arrivals in the interval [0, t], hence:
$P(A_1 > t) = P(N(t) = 0) = e^{-\lambda t}$
$P(A_1 \leq t) = 1 - P(A_1 > t) = 1 - e^{-\lambda t}$  [CDF of $\exp(\lambda)$]
Interarrival times $A_1, A_2, \dots$ are exponentially distributed and independent, with mean $1/\lambda$.
Arrival counts ~ Poisson($\lambda$): stationary and independent increments. Interarrival times ~ exponential with mean $1/\lambda$: memoryless.
5.54
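
This relationship is exactly how a Poisson arrival process is typically generated in a discrete-event simulation: draw exponential interarrival times and accumulate them. A sketch assuming numpy is available, with arbitrary illustrative rate and horizon, that also checks $E[N(t)] = V[N(t)] = \lambda t$:

```python
import numpy as np

rng = np.random.default_rng(3)
lam, t_end, reps = 2.0, 10.0, 10_000     # rate 2 arrivals per time unit, horizon 10

counts = []
for _ in range(reps):
    # Exponential interarrival times with mean 1/lambda; arrival times are their cumulative sums.
    # Generating 5*lam*t_end interarrivals is (with overwhelming probability) enough to pass t_end.
    interarrivals = rng.exponential(scale=1 / lam, size=int(5 * lam * t_end))
    arrival_times = np.cumsum(interarrivals)
    counts.append(np.searchsorted(arrival_times, t_end))   # N(t_end)

counts = np.array(counts)
print("E[N(t)] ~", counts.mean(), " lambda*t =", lam * t_end)   # both ~20
print("V[N(t)] ~", counts.var())                                # also ~20
```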

Poisson Process: Splitting and Pooling
Splitting: Suppose each event of a Poisson process can be classified as Type I with probability p and Type II with probability 1-p.
Then $N(t) = N_1(t) + N_2(t)$, where $N_1(t)$ and $N_2(t)$ are both Poisson processes, with rates $\lambda p$ and $\lambda(1-p)$:
$N(t) \sim \text{Poisson}(\lambda)$ splits into $N_1(t) \sim \text{Poisson}(\lambda p)$ and $N_2(t) \sim \text{Poisson}(\lambda(1-p))$.
5.55

Poisson Process: Splitting and Pooling
Pooling: Suppose two Poisson processes are pooled together.
Then $N_1(t) + N_2(t) = N(t)$, where $N(t)$ is a Poisson process with rate $\lambda_1 + \lambda_2$:
$P(N(t) = n) = \sum_{j=0}^{n} P(N_1(t) = j)\, P(N_2(t) = n - j) = \sum_{j=0}^{n} \frac{(\lambda_1 t)^j}{j!} e^{-\lambda_1 t}\, \frac{(\lambda_2 t)^{n-j}}{(n-j)!} e^{-\lambda_2 t} = \frac{\big((\lambda_1 + \lambda_2)t\big)^n}{n!} e^{-(\lambda_1 + \lambda_2)t}$
$N_1(t) \sim \text{Poisson}(\lambda_1)$ and $N_2(t) \sim \text{Poisson}(\lambda_2)$ pool into $N(t) \sim \text{Poisson}(\lambda_1 + \lambda_2)$.
5.56

Empirical Distributions 5.57

Empirical Distributions
A distribution whose parameters are the observed values in a sample of data.
May be used when it is impossible or unnecessary to establish that a random variable has any particular parametric distribution.
- Advantage: no assumption beyond the observed values in the sample.
- Disadvantage: the sample might not cover the entire range of possible values.
5.58

Empirical Distributions: Example
Customers arrive in groups of 1 to 8 persons. Observations of the last 300 groups have been reported; the summary is in the table below.
Group Size | Frequency | Relative Frequency | Cumulative Relative Frequency
1 | 30 | 0.10 | 0.10
2 | 110 | 0.37 | 0.47
3 | 45 | 0.15 | 0.62
4 | 71 | 0.24 | 0.86
5 | 12 | 0.04 | 0.90
6 | 13 | 0.04 | 0.94
7 | 7 | 0.02 | 0.96
8 | 12 | 0.04 | 1.00
5.59

Empirical Distributions: Example
(Figures: histogram of the relative frequencies of the group sizes, and the empirical CDF of the group sizes.)
5.60
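
Sampling group sizes from this empirical distribution in a simulation amounts to an inverse transform on the empirical CDF. A sketch using the table above, assuming numpy is available:

```python
import numpy as np

rng = np.random.default_rng(5)

group_sizes = np.arange(1, 9)
rel_freq = np.array([0.10, 0.37, 0.15, 0.24, 0.04, 0.04, 0.02, 0.04])
cum_freq = np.cumsum(rel_freq)           # empirical CDF: 0.10, 0.47, 0.62, ...
cum_freq[-1] = 1.0                       # guard against floating-point round-off at the top

# Inverse transform: find the first group size whose cumulative frequency covers U
u = rng.random(100_000)
samples = group_sizes[np.searchsorted(cum_freq, u)]

print("empirical P(group = 2):", np.mean(samples == 2))   # ~0.37
```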

Relationships among distributions 5.61

Relationships among distributions
(Figure: chart of distributions, 10 discrete and 5 continuous, and the relationships among them.)
- A dashed arrow shows asymptotic relations.
- A solid arrow shows transformations or special cases.
5.62

Relationships among distributions 5.63

Useful Statistical Models 5.64

Useful Statistical Models
In this section, statistical models appropriate to some application areas are presented. The areas include:
- Queueing systems
- Inventory and supply-chain systems
- Reliability and maintainability
- Limited data
5.65

Useful models: Queueing Systems
In a queueing system, interarrival and service-time patterns can be probabilistic.
Sample statistical models for interarrival or service-time distributions:
- Exponential distribution: if service times are completely random
- Normal distribution: fairly constant, but with some random variability (either positive or negative)
- Truncated normal distribution: similar to the normal distribution, but with restricted values
- Gamma and Weibull distributions: more general than the exponential, involving the location of the modes of the PDFs and the shapes of the tails
(Figure: queueing system with calling population, waiting line, and server.)
5.66

Useful models: Inventory and supply chain
In realistic inventory and supply-chain systems, there are at least three random variables:
- The number of units demanded per order or per time period
- The time between demands
- The lead time (the time between placing an order and the receipt of that order)
Sample statistical model for the lead-time distribution: Gamma
Sample statistical models for the demand distribution:
- Poisson: simple and extensively tabulated
- Negative binomial: longer tail than the Poisson (more large demands)
- Geometric: special case of the negative binomial, given that at least one demand has occurred
5.67

Useful models: Reliability and maintainability
Time to failure (TTF):
- Exponential: failures are random
- Gamma: for standby redundancy, where each component has an exponential TTF
- Weibull: failure is due to the most serious of a large number of defects in a system of components
- Normal: failures are due to wear
(Figure: timeline of alternating runtime and downtime periods.)
5.68

Useful models: Other areas
For cases with limited data, some useful distributions are:
- Uniform
- Triangular
- Beta
Other distributions:
- Bernoulli
- Binomial
- Hyperexponential
5.69

Summary
The world that the simulation analyst sees is probabilistic, not deterministic.
In this chapter:
- Reviewed several important probability distributions
- Showed applications of the probability distributions in a simulation context
An important task in simulation modeling is the collection and analysis of input data, e.g., hypothesizing a distributional form for the input data.
Students should know:
- The difference between discrete, continuous, and empirical distributions
- The Poisson process and its properties
5.70