Stochastic Simulation

1 Introduction

Reading Assignment: Read Chapter 1 of text.

We shall introduce many of the key issues to be discussed in this course via a couple of model problems.

Model Problem 1 (Jackson networks) In the performance engineering context, an important class of stochastic models is the class of Jackson networks.

Background: See Reversibility and Stochastic Networks by F. P. Kelly.

Let X(t) = (X_1(t), ..., X_d(t)) be the vector number-in-system process for an irreducible d-station open Jackson network. Then, X = (X(t) : t ≥ 0) is a Markov jump process with state space Z_+^d. When X is positive recurrent, the equilibrium distribution of X can be computed in closed form. However, no corresponding analytical theory exists for computing transient probabilities of the form α = P_x(X(t) ∈ A).

Goal: Compute α = P_x(X(t) ∈ A).

One approach (perhaps the one most commonly suggested in books on stochastic modeling) is to compute α by solving the backwards equations for X. In particular, compute α by solving

    (d/dt) u(t) = (Qu)(t)   subject to   u(0) = I_A,

where u(t) = (u(t, y) : y ∈ Z_+^d) and I_A(y) = 1 if y ∈ A and 0 otherwise. Here, Q is the rate matrix of X, and α is obtained from the relation u(t, x) = P_x(X(t) ∈ A).

Difficulty 1: The state space Z_+^d is countably infinite, so it must be truncated in order to do numerical computation.

- How do we construct a truncated problem? (There are many possible ways of doing this. Which is the best?)
- How do we construct a good error bound?
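To make Difficulty 1 concrete, here is a minimal sketch (not from the text) of the backwards-equation approach for the simplest case d = 1, an M/M/1 queue with arrival rate λ and service rate µ, truncated to {0, ..., r}. The function name, the explicit Euler time-stepping, and the choice to truncate by simply dropping arrival transitions at state r (one of the "many possible ways") are all assumptions made for this illustration.

```python
def transient_prob_truncated(lam, mu, t, A, r, steps=10_000):
    """Solve du/dt = Qu, u(0) = I_A, for an M/M/1 queue truncated to
    states {0, ..., r}, via explicit Euler steps. Returns u(t, .), so that
    u[x] approximates P_x(X(t) in A)."""
    h = t / steps
    u = [1.0 if y in A else 0.0 for y in range(r + 1)]
    for _ in range(steps):
        new = u[:]
        for y in range(r + 1):
            rate = 0.0
            if y < r:                       # arrival transition y -> y+1
                rate += lam * (u[y + 1] - u[y])
            if y > 0:                       # departure transition y -> y-1
                rate += mu * (u[y - 1] - u[y])
            new[y] = u[y] + h * rate        # (Qu)(y) applied for one Euler step
        u = new
    return u
```

Even this toy version shows the dimensionality problem: for a d-station network the state vector would have (r+1)^d entries, so the same loop becomes infeasible for large d.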
Difficulty 2: The state space is d-dimensional. If one truncates each dimension to {0, 1, ..., r}, the number of states in the truncated state space is (r+1)^d. When d is large, we must choose r small. This is problematic.

To deal with these problems, an alternative is to sample (i.e. simulate stochastic trajectories of X). Suppose we simulate n iid realizations X_1, ..., X_n of X over [0, t]. We estimate α via

    α_n = (1/n) Σ_{i=1}^n I(X_i(t) ∈ A).

The LLN asserts that this method is consistent, so that α_n → α a.s. as n → ∞. Furthermore, the CLT states here that

    n^{1/2} (α_n − α) ⇒ (α(1−α))^{1/2} N(0,1)   (1.1)

as n → ∞, where N(0,1) is a (scalar) normal rv with mean zero and unit variance. Of course, (1.1) suggests the approximation

    α_n ≈_D α + (α(1−α)/n)^{1/2} N(0,1)

for large n ("≈_D" means "has approximately the same distribution as"). The error is approximately:

- normal
- decays as n^{−1/2} (i.e. slowly)
- the error's decay rate is independent of the dimension d

More generally, in applying sampling-based methods to compute α = EX, we replicate n iid copies X_1, ..., X_n of X. If var X < ∞, then α_n = (1/n) Σ_{i=1}^n X_i is consistent for α, and the CLT yields the approximation

    α_n ≈_D α + (σ/n^{1/2}) N(0,1)   (1.2)

for large n, where σ² = var X. Note that the approximation (1.2) indicates that for large n, the error depends on the underlying problem data through only a simple scalar parameter, namely σ! So, while simulation is slow, it:

- is dimensionally insensitive
- has an error that depends on the problem data in a simple way

Additional advantages:

- flexibility (think of what happens if we change the service times to a non-exponential distribution)
- visualization
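The sampling alternative above fits in a few lines of code. The following is a hypothetical sketch for the d = 1 (M/M/1) case: simulate n iid paths to time t, count how many land in A (here, a single state j), and report the CLT-based standard error (α_n(1−α_n)/n)^{1/2} from (1.1). Function names and parameter values are invented for the example.

```python
import math
import random

def simulate_mm1(lam, mu, t, x0=0, rng=random):
    """Simulate an M/M/1 number-in-system path from x0 up to time t;
    return X(t). Events occur at rate lam + mu*1{X > 0}."""
    x, clock = x0, 0.0
    while True:
        rate = lam + (mu if x > 0 else 0.0)
        clock += rng.expovariate(rate)
        if clock > t:
            return x
        # the next event is an arrival w.p. lam/rate, else a departure
        if rng.random() < lam / rate:
            x += 1
        else:
            x -= 1

def estimate_transient_prob(lam, mu, t, j, n, seed=0):
    """Estimate alpha = P_0(X(t) = j) from n iid replications, together
    with the CLT standard error sqrt(alpha_n * (1 - alpha_n) / n)."""
    rng = random.Random(seed)
    hits = sum(simulate_mm1(lam, mu, t, rng=rng) == j for _ in range(n))
    alpha_n = hits / n
    se = math.sqrt(alpha_n * (1.0 - alpha_n) / n)
    return alpha_n, se
```

Note that nothing in the estimator changes if d grows: only `simulate_mm1` would be replaced by a network simulator, which is exactly the dimensional insensitivity claimed above.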
Problem 1.1 Consider the M/M/1 number-in-system process X = (X(t) : t ≥ 0). Suppose we wish to compute

    α = P(X(t) = j),

given that the system is started at t = 0 empty.

a.) Write down the specific backwards equations for computing this probability.

b.) Suppose that we now decide to change the service time distribution so that it is uniformly distributed with the same mean. Write down the corresponding integro-differential equations for computing α.

Problem 1.2 In the approximation (1.2), one might feel uncomfortable with the fact that we have invoked the CLT. So, for any given sample size n, we have no a priori guarantee on the error. Show that the number of samples required for a given absolute precision ε (with prescribed probability 1 − δ) does indeed scale as 1/ε² (i.e. square root convergence rate) and does indeed depend only on var X. (Hint: Apply Chebyshev's inequality.)

Model Problem 2 Let X = (X(t) : t ≥ 0) be a geometric Brownian motion, so that X can be represented as

    X(t) = X(0) exp(µt + σB(t))

for some constants µ and σ², where B = (B(t) : t ≥ 0) is a standard Brownian motion. Our goal is to compute the value of an Asian call

    α = E_x [ (∫_0^t X(s) ds − K)^+ ].

We will solve this problem via simulation.

Difficulty: There exists no exact algorithm for sampling the rv

    ∫_0^t X(s) ds.   (1.3)

However, it is easy to sample a close approximation to (1.3), namely the Riemann approximation

    Σ_{i=0}^{m−1} X(it/m) · (t/m).   (1.4)

Similar issues arise in many other applications! The random object of interest cannot be exactly sampled; only an approximation to the random object can be sampled. The fact that we are using an approximation reflects itself in the bias of the estimator:

    bias = E_x [ (Σ_{i=0}^{m−1} X(it/m) · (t/m) − K)^+ ] − α.
This bias is the systematic error that is present in the sampling-based procedure, and it is present regardless of the number n of independent samples of (1.4) that one simulates. Hence, computing the value of the Asian call accurately requires choosing both m and n large. But there is a tension between m and n, since mn roughly equals the total computational effort (measured, say, in terms of floating point operations, i.e. flops). So, one needs to trade off m against n (the "variance-bias trade-off").

This problem also offers us the opportunity to exploit the problem structure so as to improve computational efficiency. (This will be a major emphasis of this course.)

Note that E_x Σ_{i=0}^{m−1} X(it/m) · (t/m) can be easily computed in closed form. So,

    C = Σ_{i=0}^{m−1} X(it/m) · (t/m) − E_x Σ_{i=0}^{m−1} X(it/m) · (t/m)

can be easily generated, along with

    Z = (Σ_{i=0}^{m−1} X(it/m) · (t/m) − K)^+.

Observe that EC = 0, so Z − λC has the same expectation as our (biased) estimator for α. This suggests that we choose λ so as to minimize var(Z − λC):

    λ* = cov(Z, C) / var C.

With λ chosen in this way, we now compute

    E_x [ (Σ_{i=0}^{m−1} X(it/m) · (t/m) − K)^+ ]   (1.5)

by sampling (Z, C) iid n times and estimating (1.5) via

    (1/n) Σ_{j=1}^n (Z_j − λ* C_j).

The use of the control variate C can substantially reduce the variance of the estimator, particularly when the option is deep in the money. This control variates technique is a special case of variance reduction.

Problem 1.3 Let (Z, C) be a jointly distributed random vector in which Z is scalar and C is a (random) column vector. Assume that EZ² < ∞ and E‖C‖² < ∞. We further presume that EC = 0 and that the covariance matrix Σ = E[C Cᵀ] is non-singular.
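The Riemann sampler and the control variate estimator described above can be sketched together as follows. This is an illustrative implementation, not the text's: the function names and parameter values are invented, and the closed-form mean uses the standard fact E X(s) = X(0) exp((µ + σ²/2)s) for the GBM as parameterized here.

```python
import math
import random

def riemann_sum(x0, mu, sigma, t, m, rng):
    """Sample S = sum_{i=0}^{m-1} X(it/m) * (t/m) for the GBM
    X(s) = x0 * exp(mu*s + sigma*B(s)), simulating B on the time grid."""
    h = t / m
    b, total = 0.0, 0.0                    # b holds B(i*h)
    for i in range(m):
        total += x0 * math.exp(mu * i * h + sigma * b) * h
        b += math.sqrt(h) * rng.gauss(0.0, 1.0)
    return total

def asian_call_cv(x0, mu, sigma, t, K, m, n, seed=0):
    """Estimate (1.5) using the centered Riemann sum C = S - E_x[S] as a
    control variate; lambda* = cov(Z, C)/var(C) is estimated from the data."""
    rng = random.Random(seed)
    h = t / m
    # E X(s) = x0 * exp((mu + sigma^2/2) * s), so E_x[S] has a closed form
    ES = sum(x0 * math.exp((mu + 0.5 * sigma**2) * i * h) * h for i in range(m))
    S = [riemann_sum(x0, mu, sigma, t, m, rng) for _ in range(n)]
    Z = [max(s - K, 0.0) for s in S]       # discounted payoffs omitted for simplicity
    C = [s - ES for s in S]                # control variate, EC = 0
    zbar, cbar = sum(Z) / n, sum(C) / n
    cov = sum((z - zbar) * (c - cbar) for z, c in zip(Z, C)) / n
    var_c = sum((c - cbar) ** 2 for c in C) / n
    lam = cov / var_c if var_c > 0 else 0.0
    return sum(z - lam * c for z, c in zip(Z, C)) / n
```

The deep-in-the-money remark is visible in the code: as K → 0 the payoff Z approaches S itself, the estimated λ approaches 1, and the estimator collapses onto the known mean E_x[S], so the variance vanishes.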
a.) Let λ be a column vector and consider the control variate estimator

    (1/n) Σ_{i=1}^n (Z_i − λᵀ C_i),

where (Z_1, C_1), ..., (Z_n, C_n) are iid copies of (Z, C). What is the minimal variance choice λ* of λ, assuming that the goal is to compute EZ?

b.) In practice, the variance-minimizing choice λ* must be estimated from the sample data (Z_1, C_1), ..., (Z_n, C_n). Propose an estimator λ_n for λ* and carefully prove that

    n^{1/2} ( (1/n) Σ_{i=1}^n (Z_i − λ_nᵀ C_i) − EZ ) ⇒ σ(λ*) N(0,1)

as n → ∞, where σ²(λ*) is the minimal variance. (In other words, at the level of the CLT approximation, there is no asymptotic loss associated with having to estimate λ*.)

Problem 1.4 Show that

    E_x [ (Σ_{i=0}^{m−1} X(it/m) · (t/m) − K)^+ ] − E_x [ (∫_0^t X(s) ds − K)^+ ] ~ a m^{−δ}

as m → ∞, and compute a and δ.

2 Variate Generation

Reading Assignment: Read Sections 1, 2, 3a, 3b, 4 of Chapter 2 of the text.

Problem 2.1 Consider the Jackson network example in which the d stations are in tandem (i.e. in series). Suppose that each station is a single-server station with µ_i ≡ µ for i ≥ 1. Provide an algorithm for generating paths of X that has an expected complexity (in terms of flops) that scales linearly in d.

Problem 2.2 Problem 5.3 of the text.

3 Output Analysis

Reading Assignment: Read Chapter 3 of text.

Problem 3.1 Suppose that we wish to compute q_p, where q_p is the root of P(X ≤ q_p) = p, so that q_p is the p-th quantile of X. We assume that X is a continuous rv with a strictly positive and continuous density f. We estimate q_p via Q_p, where Q_p is the ⌈np⌉-th order statistic of an iid sample X_1, ..., X_n from the distribution of X. Prove rigorously that

    n^{1/2} (Q_p − q_p) ⇒ ( (p(1−p))^{1/2} / f(q_p) ) N(0,1)

as n → ∞. (Hint: Reduce the problem to one in which the X_i's are sampled from a uniform (0,1) population.)
Problem 3.2 Prove that

    E Q_p − q_p ~ a n^{−δ}

as n → ∞, and compute a and δ.

Problem 3.3 Suppose that (Y_1, τ_1), ..., (Y_n, τ_n) is an iid sequence sampled from the population of (Y, τ), where τ ≥ 0 a.s. Assume that there exists c < ∞ such that |Y_i| ≤ c τ_i a.s. If Eτ > 0 and Eτ^p < ∞ for p ≥ 4, prove that

    E(Ȳ_n / τ̄_n) = EY/Eτ + Σ_{j=1}^{⌊p/2⌋−1} a_j n^{−j} + o(n^{−⌊p/2⌋+1})

as n → ∞, where Ȳ_n and τ̄_n denote the sample means of the Y_i's and τ_i's.

Sequential Stopping: Suppose that we wish to compute α = EX to a given absolute precision ε. We wish to continue drawing observations until we achieve precision ε. More precisely, define

    Ñ(ε) = inf{ n ≥ 1 : z² s_n²/n ≤ ε² }   (3.1)

(where z is chosen so that P(−z ≤ N(0,1) ≤ z) = 1 − δ). Is it the case that this sequential confidence interval (with random sample size Ñ(ε)) is an asymptotic 100(1−δ)% confidence interval, in the sense that

    P( α ∈ [ α_{Ñ(ε)} − z s_{Ñ(ε)} / Ñ(ε)^{1/2}, α_{Ñ(ε)} + z s_{Ñ(ε)} / Ñ(ε)^{1/2} ] ) → 1 − δ

as ε → 0?

No!! The problem is that s_n² can be unusually small (even zero) for small sample sizes like n = 2, 3, 4, .... This is known as the problem of early stopping. One theoretical way to avoid this is to modify the sequential rule (3.1) to

    N(ε) = inf{ n ≥ 1 : a_n + z² s_n²/n ≤ ε² },

where (a_n : n ≥ 1) is a deterministic positive non-increasing sequence for which a_n = o(1/n). With the presence of (a_n : n ≥ 1) in the definition of N(ε), N(ε) ≥ l(ε), where l(ε) = inf{ n ≥ 1 : a_n ≤ ε² }. Hence, N(ε) → ∞ a.s. as ε → 0, removing the possibility of early stopping. Assuming that EX² < ∞ with σ² = var X > 0, we can rigorously prove that

    [ α_{N(ε)} − z s_{N(ε)} / N(ε)^{1/2}, α_{N(ε)} + z s_{N(ε)} / N(ε)^{1/2} ]

is an asymptotic 100(1−δ)% confidence interval for EX as ε → 0. The key steps are:

i.) Prove that ε² N(ε) → z² σ² a.s. as ε → 0.
ii.) Prove that

    N(ε)^{1/2} (α_{N(ε)} − α) / s_{N(ε)} ⇒ N(0,1)

as ε → 0.

To prove ii.), we invoke the following result (see, for example, A Course in Probability Theory by K. L. Chung):

Theorem 1 Let Y_1, Y_2, ... be an iid sequence of rv's with EY_1² < ∞. Suppose that (T_ε : ε > 0) is a family of Z_+-valued rv's for which there exist a (deterministic) function a_ε → ∞ as ε → 0 and a deterministic constant β > 0 satisfying T_ε / a_ε → β as ε → 0. Then,

    T_ε^{1/2} ( (1/T_ε) Σ_{i=1}^{T_ε} Y_i − EY_1 ) ⇒ (var Y_1)^{1/2} N(0,1)

as ε → 0.

The great majority of the fixed sample-size procedures that we will discuss in this course have sequential analogs akin to the procedure described above in the simple setting of computing EX.
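The modified sequential rule N(ε) can be sketched directly. In this illustration the sequence a_n = 1/n² is my own choice (any deterministic positive non-increasing sequence that is o(1/n) would serve), z = 1.96 corresponds to δ ≈ 0.05, and the function name is invented.

```python
import math
import random

def sequential_estimate(sampler, eps, z=1.96, n0=2):
    """Sequential stopping: draw samples until a_n + z^2 * s_n^2 / n <= eps^2,
    with a_n = 1/n^2 guarding against early stopping (a_n is positive,
    non-increasing, and o(1/n)). Returns (mean, half_width, n)."""
    xs = []
    n = 0
    while True:
        xs.append(sampler())
        n += 1
        if n < n0:
            continue                        # s_n^2 needs at least two points
        mean = sum(xs) / n
        s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)
        a_n = 1.0 / (n * n)
        if a_n + z * z * s2 / n <= eps * eps:
            return mean, z * math.sqrt(s2 / n), n
```

Note that the a_n term forces n ≥ l(ε) = inf{n : a_n ≤ ε²} before stopping is even possible, so an unluckily small early s_n² (even s_n² = 0) cannot terminate the procedure prematurely.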