Stochastic Simulation

1 Introduction

Reading Assignment: Read Chapter 1 of text.

We shall introduce many of the key issues to be discussed in this course via a couple of model problems.

Model Problem 1 (Jackson networks) In the performance engineering context, an important class of stochastic models is the class of Jackson networks.

Background: See Reversibility and Stochastic Networks by F. P. Kelly.

Let $X(t) = (X_1(t), \ldots, X_d(t))$ be the vector number-in-system process for an irreducible $d$-station open Jackson network. Then $X = (X(t) : t \geq 0)$ is a Markov jump process with state space $\mathbb{Z}_+^d$. When $X$ is positive recurrent, the equilibrium distribution of $X$ can be computed in closed form. However, no corresponding analytical theory exists for computing transient probabilities of the form $\alpha = P_x(X(t) \in A)$.

Goal: Compute $\alpha = P_x(X(t) \in A)$.

One approach (perhaps the one most commonly suggested in books on stochastic modeling) is to compute $\alpha$ by solving the backwards equations for $X$. In particular, compute $\alpha$ by solving
$$\frac{d}{dt} u(t) = (Qu)(t) \quad \text{subject to} \quad u(0) = I_A,$$
where $u(t) = (u(t, y) : y \in \mathbb{Z}_+^d)$ and $I_A(y) = 1$ if $y \in A$ and $0$ otherwise. Here, $Q$ is the rate matrix of $X$, and $\alpha$ is obtained from the relation $u(t, x) = P_x(X(t) \in A)$.

Difficulty 1: The state space $\mathbb{Z}_+^d$ is countably infinite, so it must be truncated in order to do numerical computation:

- How do we construct a truncated problem? (There are many possible ways of doing this. Which is the best?)
- How do we construct a good error bound?
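To make Difficulty 1 concrete in the simplest case, here is a minimal sketch (not from the text; all parameter values are hypothetical) for a one-dimensional example: an M/M/1 queue truncated to $\{0, \ldots, r\}$, with the backwards equations solved via the matrix exponential $u(t) = e^{Qt} u(0)$.

```python
import numpy as np
from scipy.linalg import expm

lam, mu, r, t = 1.0, 2.0, 100, 5.0   # hypothetical rates, truncation level, horizon
A = {3}                              # target set: event {X(t) = 3}

# Truncated rate matrix Q on {0, 1, ..., r}.
Q = np.zeros((r + 1, r + 1))
for y in range(r + 1):
    if y < r:
        Q[y, y + 1] = lam            # arrival: y -> y + 1
    if y > 0:
        Q[y, y - 1] = mu             # service completion: y -> y - 1
    Q[y, y] = -Q[y].sum()            # rows of a rate matrix sum to zero

# Backwards equations: u'(t) = Q u(t), u(0) = 1_A, so u(t) = e^{Qt} u(0)
# and u(t, x) = P_x(X(t) in A), up to truncation error.
u0 = np.array([float(y in A) for y in range(r + 1)])
u_t = expm(Q * t) @ u0
print(f"P_0(X(t) in A) ~= {u_t[0]:.6f}")     # started empty: x = 0
```

Even in one dimension, the choice of $r$ trades truncation error against the roughly $O(r^3)$ cost of the matrix exponential; in $d$ dimensions the state count $(r+1)^d$ makes the approach collapse, which is Difficulty 2 below.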

Difficulty 2: The state space is $d$-dimensional. If one truncates each dimension to $\{0, 1, \ldots, r\}$, the number of states in the truncated state space is $(r+1)^d$. When $d$ is large, we must choose $r$ small. This is problematic.

To deal with these problems, an alternative is to sample (i.e. simulate stochastic trajectories of $X$). Suppose we simulate iid realizations $X_1, \ldots, X_n$ of $X$ over $[0, t]$. We estimate $\alpha$ via
$$\alpha_n = \frac{1}{n} \sum_{i=1}^{n} I(X_i(t) \in A).$$
The LLN asserts that this method is consistent, so that $\alpha_n \to \alpha$ a.s. as $n \to \infty$. Furthermore, the CLT states here that
$$n^{1/2}(\alpha_n - \alpha) \Rightarrow \sqrt{\alpha(1-\alpha)}\, N(0, 1) \tag{1.1}$$
as $n \to \infty$, where $N(0,1)$ is a (scalar) normal rv with mean zero and unit variance. Of course, (1.1) suggests the approximation
$$\alpha_n \overset{D}{\approx} \alpha + \sqrt{\frac{\alpha(1-\alpha)}{n}}\, N(0,1)$$
for large $n$ ($\overset{D}{\approx}$ means "has approximately the same distribution as"). The error is approximately:

- normal
- decays as $n^{-1/2}$ (i.e. slow)
- error's decay rate is independent of dimension $d$

More generally, in applying sampling-based methods to compute $\alpha = EX$, we replicate iid copies $X_1, \ldots, X_n$ of $X$. If $\mathrm{var}\, X < \infty$, then $\alpha_n = \frac{1}{n} \sum_{i=1}^{n} X_i$ is consistent for $\alpha$, and the CLT yields the approximation
$$\alpha_n \overset{D}{\approx} \alpha + \frac{\sigma}{\sqrt{n}}\, N(0,1) \tag{1.2}$$
for large $n$, where $\sigma^2 = \mathrm{var}\, X$. Note that the approximation (1.2) indicates that for large $n$, the error depends on the underlying problem data only through a simple scalar parameter, namely $\sigma$!

So, while simulation is slow, it:

- is dimensionally insensitive
- has an error that depends on the problem data in a simple way

Additional advantages:

- flexibility (think of what happens if we change the service times to a non-exponential distribution)
- visualization
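As a concrete illustration of the sampling approach, here is a minimal sketch (again with hypothetical values: an M/M/1 queue with made-up rates, started empty) of the estimator $\alpha_n$ together with the normal error approximation suggested by (1.1).

```python
import numpy as np

rng = np.random.default_rng(0)
lam, mu, t, n = 1.0, 2.0, 5.0, 100_000   # hypothetical rates, horizon, replications
A = {3}                                  # target set: event {X(t) = 3}

def mm1_at_time(t):
    """Simulate an M/M/1 path from X(0) = 0 and return X(t)."""
    x, clock = 0, 0.0
    while True:
        rate = lam + (mu if x > 0 else 0.0)      # total jump rate in state x
        clock += rng.exponential(1.0 / rate)      # time to the next transition
        if clock > t:
            return x
        # The next transition is an arrival w.p. lam/rate, else a departure.
        x += 1 if rng.random() < lam / rate else -1

hits = np.fromiter((mm1_at_time(t) in A for _ in range(n)), dtype=float, count=n)
alpha_n = hits.mean()
half_width = 1.96 * np.sqrt(alpha_n * (1 - alpha_n) / n)   # 95% interval via (1.1)
print(f"alpha_n = {alpha_n:.4f} +/- {half_width:.4f}")
```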

Problem 1.1 Consider the M/M/1 number-in-system process $X = (X(t) : t \geq 0)$. Suppose we wish to compute
$$\alpha = P(X(t) = j),$$
given that the system is started at $t = 0$ empty.

a.) Write down the specific backwards equations for computing this probability.

b.) Suppose that we now decide to change the service time distribution so that it is uniformly distributed with the same mean. Write down the corresponding integro-differential equations for computing $\alpha$.

Problem 1.2 In the approximation (1.2), one might feel uncomfortable with the fact that we have invoked the CLT. So, for any given sample size $n$, we have no a priori guarantee on the error. Show that the number of samples $n$ required for a given absolute precision $\epsilon$ (with prescribed probability $1 - \delta$) does indeed scale as $1/\epsilon^2$ (i.e. square root convergence rate) and does indeed depend only on $\mathrm{var}\, X$. (Hint: Apply Chebyshev's inequality.)

Model Problem 2 Let $X = (X(t) : t \geq 0)$ be a geometric Brownian motion, so that $X$ can be represented as
$$X(t) = X(0) \exp(\mu t + \sigma B(t))$$
for some constants $\mu$ and $\sigma^2$, where $B = (B(t) : t \geq 0)$ is a standard Brownian motion. Our goal is to compute the value of an Asian call
$$\alpha = E_x \left[ \int_0^t X(s)\, ds - K \right]^+.$$
We will solve this problem via simulation.

Difficulty: There exists no exact algorithm for sampling the rv
$$\int_0^t X(s)\, ds. \tag{1.3}$$
However, it is easy to sample a close approximation to (1.3), namely the Riemann approximation
$$\sum_{i=0}^{m-1} X\!\left(\frac{it}{m}\right) \frac{t}{m}. \tag{1.4}$$
Similar issues arise in many other applications! The random object of interest cannot be exactly sampled. Only an approximation to the random object can be sampled. The fact that we are using an approximation reflects itself in the bias of the estimator:
$$\text{bias} = E_x \left[ \sum_{i=0}^{m-1} X\!\left(\frac{it}{m}\right) \frac{t}{m} - K \right]^+ - \alpha.$$
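Here is a minimal sketch of sampling the Riemann approximation (1.4) under geometric Brownian motion (parameter values are hypothetical): simulate $B$ on the grid $\{0, t/m, \ldots, (m-1)t/m\}$, exponentiate to get $X$, and average the resulting payoffs.

```python
import numpy as np

rng = np.random.default_rng(1)
x0, mu, sigma, t, K = 1.0, 0.05, 0.2, 1.0, 1.0   # hypothetical GBM and strike
m, n = 64, 100_000                               # grid points, replications
dt = t / m
times = dt * np.arange(m)                        # 0, t/m, ..., (m-1)t/m

# Brownian motion on the grid: B(0) = 0, then cumulative sums of increments.
dB = rng.normal(0.0, np.sqrt(dt), size=(n, m - 1))
B = np.hstack([np.zeros((n, 1)), np.cumsum(dB, axis=1)])

X = x0 * np.exp(mu * times + sigma * B)          # X(it/m) along each path
riemann = X.sum(axis=1) * dt                     # the Riemann approximation (1.4)
payoff = np.maximum(riemann - K, 0.0)            # (sum - K)^+

est = payoff.mean()
se = payoff.std(ddof=1) / np.sqrt(n)
print(f"estimate = {est:.5f} +/- {1.96 * se:.5f}  (biased: m = {m} is finite)")
```

Note that the printed confidence interval quantifies only the sampling (variance) error; the discretization bias above is invisible to it and shrinks only as $m$ grows.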

This bias is the systematic error that is present in the sampling-based procedure, and it is present regardless of the number of independent samples of (1.4) that one simulates. Hence, computing the value of the Asian call accurately requires choosing both $m$ large and $n$ large. But there is a tension between $m$ and $n$, since $mn$ roughly equals the total computational effort (measured, say, in terms of floating point operations, i.e. flops). So, one needs to trade off $m$ and $n$ (the "variance-bias trade-off").

This problem also offers us the opportunity to exploit the problem structure so as to improve computational efficiency. (This will be a major emphasis of this course.) Note that $E_x \sum_{i=0}^{m-1} X(it/m)(t/m)$ can be easily computed in closed form. So, $C$ can be easily generated, along with $Z$, where
$$Z = \left[ \sum_{i=0}^{m-1} X\!\left(\frac{it}{m}\right)\frac{t}{m} - K \right]^+, \qquad C = \sum_{i=0}^{m-1} X\!\left(\frac{it}{m}\right)\frac{t}{m} - E_x \sum_{i=0}^{m-1} X\!\left(\frac{it}{m}\right)\frac{t}{m}.$$
Observe that $EC = 0$, so $Z - \lambda C$ has the same expectation as our (biased) estimator for $\alpha$. This suggests that we choose $\lambda$ so as to minimize $\mathrm{var}(Z - \lambda C)$:
$$\lambda^* = \frac{\mathrm{cov}(Z, C)}{\mathrm{var}\, C}.$$
With $\lambda$ chosen in this way, we now compute
$$E_x \left[ \sum_{i=0}^{m-1} X\!\left(\frac{it}{m}\right)\frac{t}{m} - K \right]^+ \tag{1.5}$$
by sampling $(Z, C)$ iid $n$ times and estimating (1.5) via
$$\frac{1}{n} \sum_{j=1}^{n} (Z_j - \lambda^* C_j).$$
The use of the control variate $C$ can substantially reduce the variance of the estimator, particularly when the option is deep in the money. This control variates technique is a special case of variance reduction.
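Here is a minimal sketch of the control-variate estimator in this setting (hypothetical parameters; under the representation above, $EX(s) = x_0 \exp((\mu + \sigma^2/2)s)$, so the mean of the Riemann sum is available in closed form), with $\lambda^*$ replaced by its sample analog as in Problem 1.3 b.) below.

```python
import numpy as np

rng = np.random.default_rng(2)
x0, mu, sigma, t, K = 1.0, 0.05, 0.2, 1.0, 1.0   # hypothetical GBM and strike
m, n = 64, 100_000
dt = t / m
times = dt * np.arange(m)

dB = rng.normal(0.0, np.sqrt(dt), size=(n, m - 1))
B = np.hstack([np.zeros((n, 1)), np.cumsum(dB, axis=1)])
riemann = (x0 * np.exp(mu * times + sigma * B)).sum(axis=1) * dt

Z = np.maximum(riemann - K, 0.0)        # the (biased) Asian-call payoff
# E X(s) = x0 exp((mu + sigma^2/2) s), so the Riemann sum's mean is explicit.
mean_riemann = dt * (x0 * np.exp((mu + 0.5 * sigma**2) * times)).sum()
C = riemann - mean_riemann              # control variate with EC = 0

lam_hat = np.cov(Z, C)[0, 1] / C.var(ddof=1)   # sample analog of cov(Z,C)/var C
cv = Z - lam_hat * C
print(f"plain MC:        {Z.mean():.5f}  (var {Z.var():.2e})")
print(f"control variate: {cv.mean():.5f}  (var {cv.var():.2e})")
```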

Problem 1.3 Let $(Z, C)$ be a jointly distributed random vector in which $Z$ is scalar and $C$ is a (random) column vector. Assume that $EZ^2 < \infty$ and $E\|C\|^2 < \infty$. We further presume that $EC = 0$ and that the covariance matrix $\Sigma = E\, CC^T$ is non-singular.

a.) Let $\lambda$ be a column vector and consider the control variate estimator
$$\frac{1}{n} \sum_{i=1}^{n} (Z_i - \lambda^T C_i),$$
where $(Z_1, C_1), \ldots, (Z_n, C_n)$ are iid copies of $(Z, C)$. What is the minimal variance choice of $\lambda$, assuming that the goal is to compute $EZ$?

b.) In practice, the variance-minimizing choice $\lambda^*$ must be estimated from the sample data $(Z_1, C_1), \ldots, (Z_n, C_n)$. Propose an estimator $\lambda_n$ for $\lambda^*$ and carefully prove that
$$n^{1/2}\left(\frac{1}{n}\sum_{i=1}^{n} (Z_i - \lambda_n^T C_i) - EZ\right) \Rightarrow \sigma(\lambda^*)\, N(0,1)$$
as $n \to \infty$, where $\sigma^2(\lambda^*)$ is the minimal variance. (In other words, at the level of the CLT approximation, there is no asymptotic loss associated with having to estimate $\lambda^*$.)

Problem 1.4 Show that
$$E_x\left[\sum_{i=0}^{m-1} X\!\left(\frac{it}{m}\right)\frac{t}{m} - K\right]^+ - E_x\left[\int_0^t X(s)\, ds - K\right]^+ \sim a m^{-\delta}$$
as $m \to \infty$, and compute $a$ and $\delta$.

2 Variate Generation

Reading Assignment: Read Sections 1, 2, 3a, 3b, 4 of Chapter 2 of the text.

Problem 2.1 Consider the Jackson network example in which the $d$ stations are in tandem (i.e. in series). Suppose that each station is a single-server station with $\mu_i \equiv \mu$ for $i \geq 1$. Provide an algorithm for generating paths of $X$ that has an expected complexity (in terms of flops) that scales linearly in $d$.

Problem 2.2 Problem 5.3 of the text.

3 Output Analysis

Reading Assignment: Read Chapter 3 of text.

Problem 3.1 Suppose that we wish to compute $q_p$, where $q_p$ is the root of $P(X \leq q_p) = p$, so that $q_p$ is the $p$-th quantile of $X$. We assume that $X$ is a continuous rv with a strictly positive and continuous density $f$. We estimate $q_p$ via $Q_p$, where $Q_p$ is the $\lceil pn \rceil$-th order statistic of an iid sample $X_1, \ldots, X_n$ from the distribution of $X$. Prove rigorously that
$$n^{1/2}(Q_p - q_p) \Rightarrow \frac{\sqrt{p(1-p)}}{f(q_p)}\, N(0,1)$$
as $n \to \infty$. (Hint: Reduce the problem to one in which the $X_i$'s are sampled from a uniform $(0,1)$ population.)
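The following is a small numerical illustration (not a proof, and not from the text) of the CLT in Problem 3.1, taking $X$ to be standard normal so that $q_p$ and $f(q_p)$ are known exactly.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
p, n, reps = 0.9, 4_000, 1_000          # quantile level, sample size, replications

samples = rng.normal(size=(reps, n))    # X is standard normal here
k = int(np.ceil(n * p))                 # the ceil(pn)-th order statistic
Q_p = np.sort(samples, axis=1)[:, k - 1]

q_p = norm.ppf(p)                       # true quantile
# The CLT predicts sd(Q_p) ~ sqrt(p(1-p)) / (f(q_p) sqrt(n)).
predicted_sd = np.sqrt(p * (1 - p)) / (norm.pdf(q_p) * np.sqrt(n))
print(f"empirical sd of Q_p: {Q_p.std(ddof=1):.5f}")
print(f"CLT prediction:      {predicted_sd:.5f}")
```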

Problem 3.2 Prove that
$$EQ_p - q_p \sim a n^{-\delta}$$
as $n \to \infty$, and compute $a$ and $\delta$.

Problem 3.3 Suppose that $(Y_1, \tau_1), \ldots, (Y_n, \tau_n)$ is an iid sequence sampled from the population of $(Y, \tau)$, where $\tau \geq 0$ a.s. Assume that there exists $c < \infty$ such that $|Y_i| \leq c\tau_i$ a.s. If $E\tau > 0$ and $E\tau^p < \infty$ for $p \geq 4$, prove that
$$E\left(\frac{\bar Y_n}{\bar \tau_n}\right) = \frac{EY}{E\tau} + \sum_{j=1}^{p/2 - 1} a_j n^{-j} + o\!\left(n^{-p/2 + 1}\right)$$
as $n \to \infty$, where $\bar Y_n$ and $\bar \tau_n$ denote the sample means.

Sequential Stopping: Suppose that we wish to compute $\alpha = EX$ to a given absolute precision $\epsilon$. We wish to continue drawing observations until we achieve precision $\epsilon$. More precisely, define
$$\tilde N(\epsilon) = \inf\{n \geq 1 : z^2 s_n^2 / n \leq \epsilon^2\} \tag{3.1}$$
(where $z$ is chosen so that $P(-z \leq N(0,1) \leq z) = 1 - \delta$, and $s_n^2$ is the sample variance of the first $n$ observations). Is it the case that this sequential confidence interval (with random sample size $\tilde N(\epsilon)$) is an asymptotic $100(1-\delta)\%$ confidence interval, in the sense that
$$P\left(\alpha \in \left[\alpha_{\tilde N(\epsilon)} - \frac{z s_{\tilde N(\epsilon)}}{\sqrt{\tilde N(\epsilon)}},\ \alpha_{\tilde N(\epsilon)} + \frac{z s_{\tilde N(\epsilon)}}{\sqrt{\tilde N(\epsilon)}}\right]\right) \to 1 - \delta$$
as $\epsilon \to 0$?

No!! The problem is that $s_n^2$ can be unusually small (even zero) for small sample sizes like $n = 2, 3, 4, \ldots$. This is known as the problem of early stopping. One theoretical way to avoid this is to modify the sequential rule (3.1) to
$$N(\epsilon) = \inf\{n \geq 1 : a_n + z^2 s_n^2 / n \leq \epsilon^2\},$$
where $(a_n : n \geq 1)$ is a deterministic positive non-increasing sequence for which $a_n = o(1/n)$. With the presence of $(a_n : n \geq 1)$ in the definition of $N(\epsilon)$, we have $N(\epsilon) \geq l(\epsilon)$, where $l(\epsilon) = \inf\{n \geq 1 : a_n \leq \epsilon^2\}$. Hence, $N(\epsilon) \to \infty$ a.s. as $\epsilon \to 0$, removing the possibility of early stopping.
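Here is a minimal sketch of the modified rule $N(\epsilon)$ (the population, the precision, and the particular choice $a_n = 1/(n\log(n+1))$, which is positive, non-increasing, and $o(1/n)$, are all hypothetical).

```python
import numpy as np

rng = np.random.default_rng(4)
eps, z = 0.05, 1.96                     # precision; z for 1 - delta = 0.95

def sample_X():
    return rng.exponential(1.0)         # hypothetical population with EX = 1

xs = [sample_X(), sample_X()]           # need n >= 2 for a sample variance
n = 2
while True:
    a_n = 1.0 / (n * np.log(n + 1.0))   # deterministic, positive, o(1/n)
    s2 = np.var(xs, ddof=1)             # s_n^2 (a running update would be faster)
    if a_n + z**2 * s2 / n <= eps**2:   # the modified stopping rule N(eps)
        break
    xs.append(sample_X())
    n += 1

alpha_hat = np.mean(xs)
hw = z * np.sqrt(s2 / n)
print(f"N(eps) = {n}, interval = [{alpha_hat - hw:.4f}, {alpha_hat + hw:.4f}]")
```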

Assuming that $EX^2 < \infty$ with $\sigma^2 = \mathrm{var}\, X > 0$, we can rigorously prove that
$$\left[\alpha_{N(\epsilon)} - \frac{z s_{N(\epsilon)}}{\sqrt{N(\epsilon)}},\ \alpha_{N(\epsilon)} + \frac{z s_{N(\epsilon)}}{\sqrt{N(\epsilon)}}\right]$$
is an asymptotic $100(1-\delta)\%$ confidence interval for $EX$ as $\epsilon \to 0$. The key steps are:

i.) Prove that $\epsilon^2 N(\epsilon) \to z^2 \sigma^2$ a.s. as $\epsilon \to 0$.

ii.) Prove that
$$\frac{\sqrt{N(\epsilon)}\,\left(\alpha_{N(\epsilon)} - \alpha\right)}{s_{N(\epsilon)}} \Rightarrow N(0,1)$$
as $\epsilon \to 0$.

To prove ii.), we invoke the following result (see, for example, A Course in Probability Theory by K. L. Chung):

Theorem 1 Let $Y_1, Y_2, \ldots$ be an iid sequence of rv's with $EY_1^2 < \infty$. Suppose that $(T_\epsilon : \epsilon > 0)$ is a family of $\mathbb{Z}_+$-valued rv's for which there exists a (deterministic) function $a_\epsilon \to 0$ as $\epsilon \to 0$ and a deterministic $\beta > 0$ satisfying $a_\epsilon T_\epsilon \Rightarrow \beta$ as $\epsilon \to 0$. Then,
$$\sqrt{T_\epsilon}\left(\frac{1}{T_\epsilon}\sum_{i=1}^{T_\epsilon} Y_i - EY_1\right) \Rightarrow \sqrt{\mathrm{var}\, Y_1}\, N(0,1)$$
as $\epsilon \to 0$.

The great majority of the fixed sample-size procedures that we will discuss in this course have sequential analogs to the above sequential procedure, described here in the simple setting of computing $EX$.