IEOR E4703: Monte Carlo Simulation
Columbia University
© 2017 by Martin Haugh

Output Analysis and Run-Length Control

In these notes we describe how the Central Limit Theorem can be used to construct approximate $100(1-\alpha)\%$ confidence intervals for the quantity, $\theta$, we are trying to estimate. We also describe methods to estimate the number of samples that are required to achieve a given confidence level and we end with a discussion of the bootstrap method for performing output analysis.

1 Output Analysis

Recall the simulation framework that we use when we want to estimate $\theta := \mathrm{E}[h(X)]$ where $X \in \mathbb{R}^n$. We first simulate $X_1, \ldots, X_n$ IID and then set
$$\hat{\theta}_n = \frac{h(X_1) + \cdots + h(X_n)}{n}.$$
The Strong Law of Large Numbers (SLLN) then implies $\hat{\theta}_n \to \theta$ as $n \to \infty$ w.p. 1. But at this point we don't know how large $n$ should be so that we can have confidence in $\hat{\theta}_n$ as an estimator of $\theta$. Put another way, for a fixed value of $n$, what can we say about the quality of $\hat{\theta}_n$? We will now answer this question and to simplify our notation we will take $Y_i := h(X_i)$.

1.1 Confidence Intervals

One way to answer this question is to use a confidence interval. Suppose then that we want to estimate $\theta$ and we have a random vector $Y = (Y_1, \ldots, Y_n)$ whose distribution depends on $\theta$. Then we seek $L(Y)$ and $U(Y)$ such that
$$P\left(L(Y) \le \theta \le U(Y)\right) = 1 - \alpha$$
where $0 < \alpha < 1$ is a pre-specified number. We then say that $[L(Y), U(Y)]$ is a $100(1-\alpha)\%$ confidence interval for $\theta$. Note that $[L(Y), U(Y)]$ is a random interval. However, once we replace $Y$ with a sample vector, $y$, then $[L(y), U(y)]$ becomes a real interval. We now discuss the Chebyshev Inequality and the Central Limit Theorem, both of which can be used to construct confidence intervals.

The Chebyshev Inequality

Since the $Y_i$'s are assumed to be IID we know the variance of $\hat{\theta}_n$ is given by $\mathrm{Var}(\hat{\theta}_n) = \sigma^2/n$ where $\sigma^2 := \mathrm{Var}(Y)$. Clearly a small value of $\mathrm{Var}(\hat{\theta}_n)$ implies a more accurate estimate of $\theta$ and this is indeed confirmed by Chebyshev's Inequality which for any $k > 0$ states that
$$P\left(|\hat{\theta}_n - \theta| \ge k\right) \le \frac{\mathrm{Var}(\hat{\theta}_n)}{k^2}. \tag{1}$$
We can see from (1) that a smaller value of $\mathrm{Var}(\hat{\theta}_n)$ therefore improves our confidence in $\hat{\theta}_n$. We could easily use Chebyshev's Inequality to construct (how?) confidence intervals for $\theta$ but these are generally very conservative.

Exercise 1 Why does Chebyshev's Inequality generally lead to conservative confidence intervals?

Instead, we will use the Central Limit Theorem to obtain better estimates of $P(|\hat{\theta}_n - \theta| \ge k)$ and as a result, narrower confidence intervals for $\theta$.
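To get a feel for how conservative a Chebyshev-based interval can be, the following sketch sets the right-hand side of (1) equal to $\alpha$, solves for $k$, and then measures how often the sample mean actually lands within $k$ of $\theta$. The choice of Uniform(0, 1) samples (so $\theta = 1/2$ and $\sigma^2 = 1/12$ are known) and the function names are our own assumptions for illustration.

```python
import math
import random


def chebyshev_half_width(sigma, n, alpha):
    """Half-width k solving sigma^2 / (n k^2) = alpha, so that by (1)
    P(|theta_hat - theta| >= k) <= alpha."""
    return sigma / math.sqrt(n * alpha)


def empirical_coverage(n, alpha, trials, seed=0):
    """Fraction of simulated sample means within k of the true mean.

    Assumption for illustration: Y ~ Uniform(0, 1), so theta = 0.5 and
    sigma^2 = 1/12 are known exactly.
    """
    rng = random.Random(seed)
    theta, sigma = 0.5, math.sqrt(1.0 / 12.0)
    k = chebyshev_half_width(sigma, n, alpha)
    hits = 0
    for _ in range(trials):
        theta_hat = sum(rng.random() for _ in range(n)) / n
        if abs(theta_hat - theta) <= k:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Nominal coverage is 1 - alpha = 0.95; the observed coverage is
    # typically far higher, which is what "conservative" means here.
    print(empirical_coverage(n=100, alpha=0.05, trials=2000))
```

The observed coverage should be well above the nominal $1 - \alpha$, since the bound in (1) holds for every distribution with finite variance and is rarely tight.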
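The Central Limit Theorem

The Central Limit Theorem is among the most important theorems in probability theory and we state it here for convenience with the symbol $\xrightarrow{d}$ denoting convergence in distribution.

Theorem 1 (Central Limit Theorem) Suppose $Y_1, \ldots, Y_n$ are IID and $\mathrm{E}[Y_i^2] < \infty$. Then
$$\frac{\hat{\theta}_n - \theta}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty$$
where $\hat{\theta}_n = \sum_{i=1}^n Y_i / n$, $\theta := \mathrm{E}[Y_i]$ and $\sigma^2 := \mathrm{Var}(Y_i)$.

Note that we assume nothing about the distribution of the $Y_i$'s other than that $\mathrm{E}[Y_i^2] < \infty$. If $n$ is sufficiently large in our simulations, then we can use the CLT to construct confidence intervals for $\theta := \mathrm{E}[Y]$. We now describe how to do this.

1.2 An Approximate $100(1-\alpha)\%$ Confidence Interval for $\theta$

Let $z_{1-\alpha/2}$ be the $(1-\alpha/2)$ quantile of the $N(0, 1)$ distribution so that
$$P\left(-z_{1-\alpha/2} \le Z \le z_{1-\alpha/2}\right) = 1 - \alpha$$
when $Z \sim N(0, 1)$. Suppose now that we have simulated IID samples, $Y_i$, for $i = 1, \ldots, n$, and that we want to construct a $100(1-\alpha)\%$ CI for $\theta = \mathrm{E}[Y]$. That is, we want $L(Y)$ and $U(Y)$ such that
$$P\left(L(Y) \le \theta \le U(Y)\right) = 1 - \alpha.$$
The CLT implies $\sqrt{n}\,(\hat{\theta}_n - \theta)/\sigma$ is approximately $N(0, 1)$ for large $n$ so we have
$$P\left(-z_{1-\alpha/2} \le \frac{\sqrt{n}\,(\hat{\theta}_n - \theta)}{\sigma} \le z_{1-\alpha/2}\right) \approx 1 - \alpha$$
$$\Rightarrow \quad P\left(-z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \hat{\theta}_n - \theta \le z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) \approx 1 - \alpha$$
$$\Rightarrow \quad P\left(\hat{\theta}_n - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \theta \le \hat{\theta}_n + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) \approx 1 - \alpha.$$
Our approximate $100(1-\alpha)\%$ CI for $\theta$ is therefore given by
$$[L(Y), U(Y)] = \left[\hat{\theta}_n - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \hat{\theta}_n + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right]. \tag{2}$$
Recall that $\hat{\theta}_n = (Y_1 + \cdots + Y_n)/n$, so $L$ and $U$ are indeed functions of $Y$. There is still a problem, however, as we do not usually know $\sigma^2$. We resolve this issue by estimating $\sigma^2$ with
$$\hat{\sigma}_n^2 = \frac{\sum_{i=1}^n (Y_i - \hat{\theta}_n)^2}{n - 1}.$$
It is easy to show that $\hat{\sigma}_n^2$ is an unbiased estimator of $\sigma^2$ and that $\hat{\sigma}_n^2 \to \sigma^2$ w.p. 1 as $n \to \infty$. So now we replace $\sigma$ with $\hat{\sigma}_n$ in (2) to obtain
$$[L(Y), U(Y)] = \left[\hat{\theta}_n - z_{1-\alpha/2}\,\frac{\hat{\sigma}_n}{\sqrt{n}},\ \hat{\theta}_n + z_{1-\alpha/2}\,\frac{\hat{\sigma}_n}{\sqrt{n}}\right] \tag{3}$$
as our approximate $100(1-\alpha)\%$ CI for $\theta$ when $n$ is large.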
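The interval in (3) is straightforward to compute from a list of samples. The following is a minimal sketch using only the Python standard library; the function name `clt_confidence_interval` and the toy choice $h(X) = X^2$ with $X \sim U(0, 1)$ are our own assumptions for illustration.

```python
import math
import random
import statistics


def clt_confidence_interval(samples, alpha=0.05):
    """Approximate 100(1 - alpha)% CI for E[Y] as in (3)."""
    n = len(samples)
    theta_hat = statistics.mean(samples)
    sigma_hat = statistics.stdev(samples)  # uses the n - 1 denominator
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma_hat / math.sqrt(n)
    return theta_hat - half_width, theta_hat + half_width


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical example: Y = h(X) = X^2 with X ~ U(0, 1), so theta = 1/3.
    ys = [rng.random() ** 2 for _ in range(10_000)]
    print(clt_confidence_interval(ys, alpha=0.05))
```

Any particular run either contains the true value or it does not; the $1 - \alpha$ statement refers to the long-run fraction of such intervals that do.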
Remark 1 Note that when we obtain sample values of $y = (y_1, \ldots, y_n)$, then $[L(y), U(y)]$ becomes a real interval. Then we can no longer say (why not?) that $P(\theta \in [L(y), U(y)]) = 1 - \alpha$. Instead, we say that we are $100(1-\alpha)\%$ confident that $[L(y), U(y)]$ contains $\theta$. Furthermore, the smaller the value of $U(y) - L(y)$, the more confidence we will have in our estimate of $\theta$.

Example 1 (Pricing a European Call Option) Suppose we want to estimate the price, $C_0$, of a call option on a stock whose price process, $S_t$, is a $\mathrm{GBM}(\mu, \sigma)$. The relevant parameters are $r = .05$, $T = 0.5$ years, $S_0 = \$100$, $\sigma = 0.2$ and strike $K = \$110$. Then we know that
$$C_0 = \mathrm{E}_0^{Q}\left[e^{-rT} \max(S_T - K, 0)\right]$$
where we can assume that $S_t \sim \mathrm{GBM}(r, \sigma)$ under the risk-neutral probability measure, $Q$. That is, we assume
$$S_T = S_0 \exp\left((r - \sigma^2/2)T + \sigma Z\right)$$
where $Z \sim N(0, T)$. Though we can of course compute $C_0$ exactly, we can also estimate $C_0$ using Monte Carlo with (3) yielding an approximate $100(1-\alpha)\%$ CI for $C_0$ with $Y_i := e^{-rT} \max(S_T^{(i)} - K, 0)$ denoting the $i$th discounted sample payoff of the option. Based on $n = 100$k samples, we obtain [15.16, 15.32] as our approximate 95% CI for $C_0$.

Properties of the Confidence Interval

The width of the confidence interval is given by
$$U - L = \frac{2\,\hat{\sigma}_n\, z_{1-\alpha/2}}{\sqrt{n}}$$
and so the half-width then is $(U - L)/2$. The width clearly depends on $\alpha$, $\hat{\sigma}_n$ and $n$. However, $\hat{\sigma}_n \to \sigma$ almost surely as $n \to \infty$, and $\sigma$ is a constant. Therefore, for a fixed $\alpha$, we need to increase $n$ if we are to decrease the width of the confidence interval. Indeed, since $U - L \propto 1/\sqrt{n}$, we can see for example that we would need to increase $n$ by a factor of four in order to decrease the width of the confidence interval by only a factor of two.

2 Run-Length Control

Up to this point we have selected $n$ in advance and then computed the approximate CI. The width of the CI is then a measure of the error in our estimator. Now we will do the reverse by first choosing some error criterion that we want our estimator to satisfy, and then choosing $n$ so that this criterion is satisfied. There are two types of error that we will consider:

1. Absolute error, which is given by $E_a := |\hat{\theta}_n - \theta|$ and
2. Relative error, which is given by $E_r := \left|\dfrac{\hat{\theta}_n - \theta}{\theta}\right|$.

Now we know that $\hat{\theta}_n \to \theta$ w.p. 1 as $n \to \infty$ so that $E_a$ and $E_r$ both $\to 0$ as $n \to \infty$. (If $\theta = 0$ then $E_r$ is not defined.) However, in practice $n < \infty$ and so the errors will be non-zero. We specify the following error criterion:

Error Criterion: Given $0 < \alpha < 1$ and $\epsilon > 0$, we want $P(E \le \epsilon) = 1 - \alpha$. $E$ is the error type we have specified, i.e., relative or absolute.

The goal then is to choose $n$ so that the error criterion is approximately satisfied and this is easily done. Suppose, for example, that we want to control absolute error, $E_a$. Then, as we saw earlier,
$$P\left(\hat{\theta}_n - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \theta \le \hat{\theta}_n + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) \approx 1 - \alpha.$$
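This then implies
$$P\left(|\hat{\theta}_n - \theta| \le z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) \approx 1 - \alpha,$$
so in terms of $E_a$ we have
$$P\left(E_a \le z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) \approx 1 - \alpha.$$
If we then want $P(E_a \le \epsilon) \approx 1 - \alpha$, it clearly suffices to choose $n$ such that
$$n = \frac{\sigma^2 z_{1-\alpha/2}^2}{\epsilon^2}.$$
If we are working with relative error, then a similar argument implies that $P(E_r \le \epsilon) \approx 1 - \alpha$ if
$$n = \frac{\sigma^2 z_{1-\alpha/2}^2}{\theta^2 \epsilon^2}.$$
There are still some problems, however:

1. When we are controlling $E_r$, we need to know $\sigma$ and $\theta$ in advance.
2. When we are controlling $E_a$, we need to know $\sigma$ in advance.

Of course we do not usually know $\sigma$ or $\theta$ in advance. In fact, $\theta$ is what we are trying to estimate! There are two methods we can use to overcome this problem: the two-stage method and the sequential method, both of which we will now describe.

2.1 The Two-Stage Procedure

Suppose we want to satisfy the condition $P(E_a \le \epsilon) = 1 - \alpha$ so that we are trying to control the absolute error. Then we saw earlier that we would like to set
$$n = \frac{\sigma^2 z_{1-\alpha/2}^2}{\epsilon^2}.$$
Unfortunately, we don't know $\sigma^2$ but we can solve this problem by first doing a pilot simulation to estimate it. The idea is to do a small number, $p$, of initial runs to estimate $\sigma^2$. We then use our estimate, $\hat{\sigma}_p^2$, to compute an estimate, $\hat{n}$, of $n$. Finally, we repeat the simulation, but now we use $\hat{n}$ runs. We have the following algorithm.

Two-Stage Monte Carlo Simulation for Estimating $\mathrm{E}[h(X)]$

    /* Do pilot simulation first */
    for i = 1 to p
        generate $X_i$
    end for
    set $\hat{\theta}_p = \sum_{i=1}^{p} h(X_i) / p$
    set $\hat{\sigma}_p^2 = \sum_{i=1}^{p} (h(X_i) - \hat{\theta}_p)^2 / (p - 1)$
    set $\hat{n} = \lceil \hat{\sigma}_p^2\, z_{1-\alpha/2}^2 / \epsilon^2 \rceil$

    /* Now do main simulation */
    for i = 1 to $\hat{n}$
        generate $X_i$
    end for
    set $\hat{\theta}_{\hat{n}} = \sum_{i=1}^{\hat{n}} h(X_i) / \hat{n}$
    set $\hat{\sigma}_{\hat{n}}^2 = \sum_{i=1}^{\hat{n}} (h(X_i) - \hat{\theta}_{\hat{n}})^2 / (\hat{n} - 1)$
    set $100(1-\alpha)\%$ CI $= \left[\hat{\theta}_{\hat{n}} - z_{1-\alpha/2}\, \hat{\sigma}_{\hat{n}} / \sqrt{\hat{n}},\ \hat{\theta}_{\hat{n}} + z_{1-\alpha/2}\, \hat{\sigma}_{\hat{n}} / \sqrt{\hat{n}}\right]$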
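The two-stage algorithm can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `sample_h` is a hypothetical callable that draws one value of $h(X)$ from a supplied random number generator, and flooring the main run length at $p$ is our own safeguard rather than part of the algorithm above.

```python
import math
import random
import statistics


def two_stage_estimate(sample_h, eps, alpha=0.05, p=50, rng=None):
    """Two-stage run-length control for absolute error eps.

    sample_h(rng) is a hypothetical user-supplied function returning one
    draw of h(X). Returns (theta_hat, (lower, upper), n_hat).
    """
    rng = rng or random.Random()
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)

    # Pilot simulation: estimate sigma^2 from p initial runs.
    pilot = [sample_h(rng) for _ in range(p)]
    sigma2_pilot = statistics.variance(pilot)

    # Estimated run length; never fewer than p runs (our own safeguard).
    n_hat = max(p, math.ceil(sigma2_pilot * z * z / eps ** 2))

    # Main simulation with fresh samples.
    main = [sample_h(rng) for _ in range(n_hat)]
    theta_hat = statistics.mean(main)
    half_width = z * statistics.stdev(main) / math.sqrt(n_hat)
    return theta_hat, (theta_hat - half_width, theta_hat + half_width), n_hat


if __name__ == "__main__":
    # Hypothetical example: h(X) = X with X ~ U(0, 1), absolute error 0.01.
    est, ci, n_used = two_stage_estimate(lambda r: r.random(), eps=0.01,
                                         rng=random.Random(7))
    print(est, ci, n_used)
```

Because the main-run variance estimate differs from the pilot estimate, the final half-width will be close to, but not exactly, $\epsilon$.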
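For this method to work, it is important that $\hat{\theta}_p$ and $\hat{\sigma}_p^2$ be sufficiently good estimates of $\theta$ and $\sigma^2$. Therefore, it is important to make $p$ sufficiently large. In practice, we usually take $p \ge 50$. We can use an analogous two-stage procedure if we want to control the relative error and have $P(E_r \le \epsilon) = 1 - \alpha$.

2.2 The Sequential Procedure

Suppose again that we wish to satisfy the condition $P(E_a \le \epsilon) = 1 - \alpha$. Then we saw earlier that we would like to set
$$n = \frac{\sigma^2 z_{1-\alpha/2}^2}{\epsilon^2}.$$
In contrast to the pilot procedure, we do not precompute $n$ during the sequential procedure. Instead, we continue to generate samples until
$$z_{1-\alpha/2}\,\frac{\hat{\sigma}_n}{\sqrt{n}} \le \epsilon$$
where $\hat{\sigma}_n$ is again the estimate of $\sigma$ based upon the first $n$ samples. It is important that $n$ be sufficiently large so that $\hat{\theta}_n$ and $\hat{\sigma}_n^2$ are sufficiently good estimates of $\theta$ and $\sigma^2$, respectively. As a result, we typically insist that $n \ge 50$ before we stop. Approximate confidence intervals are then computed as usual.

Question: Have we allowed any biases to creep in here?

We have the following algorithm:

Sequential Monte Carlo Simulation for Estimating $\mathrm{E}[h(X)]$

    set check = 0, n = 1
    while (check = 0)
        generate $X_n$
        set $\hat{\theta}_n = \sum_{i=1}^{n} h(X_i) / n$
        set $\hat{\sigma}_n^2 = \sum_{i=1}^{n} (h(X_i) - \hat{\theta}_n)^2 / (n - 1)$
        if ($n \ge p$ and $z_{1-\alpha/2}\, \hat{\sigma}_n / \sqrt{n} \le \epsilon$)
            check = 1
        else
            n = n + 1
        end if
    end while
    set $100(1-\alpha)\%$ CI $= \left[\hat{\theta}_n - z_{1-\alpha/2}\, \hat{\sigma}_n / \sqrt{n},\ \hat{\theta}_n + z_{1-\alpha/2}\, \hat{\sigma}_n / \sqrt{n}\right]$

In practice we do not need to store every value, $h(X_i)$ for $i = 1, \ldots, n$, in order to update $\hat{\theta}_n$ and $\hat{\sigma}_n$. Indeed, we can update $\hat{\theta}_n$ and $\hat{\sigma}_n^2$ efficiently by observing that
$$\hat{\theta}_n = \hat{\theta}_{n-1} + \frac{h(X_n) - \hat{\theta}_{n-1}}{n}$$
and
$$\hat{\sigma}_n^2 = \left(1 - \frac{1}{n-1}\right)\hat{\sigma}_{n-1}^2 + n\left(\hat{\theta}_n - \hat{\theta}_{n-1}\right)^2.$$
If we want to control the relative error and have $P(E_r \le \epsilon) = 1 - \alpha$, then we would simulate samples until
$$\frac{z_{1-\alpha/2}\, \hat{\sigma}_n}{\sqrt{n}\, |\hat{\theta}_n|} \le \epsilon.$$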
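The sequential procedure, together with the running updates for the mean and variance, might be sketched as follows. As before, `sample_h` is a hypothetical callable returning one draw of $h(X)$; the `max_n` cap is our own addition to guarantee termination, and we take the variance of a single sample to be 0 so that the recursion can start at $n = 2$.

```python
import math
import random
import statistics


def update_mean_var(theta_prev, var_prev, n, y):
    """Fold the n-th sample y (n >= 2) into the running mean and variance."""
    theta = theta_prev + (y - theta_prev) / n
    var = (1.0 - 1.0 / (n - 1)) * var_prev + n * (theta - theta_prev) ** 2
    return theta, var


def sequential_estimate(sample_h, eps, alpha=0.05, p=50, max_n=10**6, rng=None):
    """Sample until z * sigma_hat / sqrt(n) <= eps and n >= p.

    sample_h(rng) is a hypothetical user-supplied function returning one
    draw of h(X); max_n is a safety cap that is not part of the algorithm.
    Returns (theta_hat, (lower, upper), n).
    """
    rng = rng or random.Random()
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n, theta, var = 1, sample_h(rng), 0.0
    while not (n >= p and z * math.sqrt(var / n) <= eps) and n < max_n:
        n += 1
        theta, var = update_mean_var(theta, var, n, sample_h(rng))
    half_width = z * math.sqrt(var / n)
    return theta, (theta - half_width, theta + half_width), n


if __name__ == "__main__":
    # Hypothetical example: h(X) = X with X ~ U(0, 1), absolute error 0.02.
    print(sequential_estimate(lambda r: r.random(), eps=0.02,
                              rng=random.Random(1)))
```

Only the current mean, variance and count are stored, so memory use does not grow with $n$.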
3 Output Analysis Using the Bootstrap

We can view$^1$ our output analysis problem as one of estimating
$$\mathrm{MSE}(F) := \mathrm{E}_F\left[\left(g(Y_1, \ldots, Y_n) - \theta(F)\right)^2\right] \tag{4}$$
where $\theta(F) = \mathrm{E}_F[Y]$, $g(Y_1, \ldots, Y_n) := \bar{Y}$ and $F$ denotes the CDF of $Y$. In that case, we saw in Section 1 how we could use the CLT to construct approximate confidence intervals for $\theta$. While this is certainly the most common context in which we encounter (4), other situations arise where the CLT cannot be easily used to obtain a confidence interval for $\theta(F)$. For example, if $\theta(F) = \mathrm{Var}(Y)$ or $\theta(F) = \mathrm{E}[Y \mid Y \ge \alpha]$, then an alternative method of constructing a confidence interval for $\theta$ will be required. The bootstrap method provides such an alternative and in order to describe the method we will assume our problem is to estimate $\mathrm{MSE}(F)$ as in (4).

To begin with, recall that the empirical distribution, $F_e$, is defined to be the CDF of the distribution that places a weight of $1/n$ on each of the simulated values $Y_1, \ldots, Y_n$. The empirical CDF therefore satisfies
$$F_e(y) = \frac{1}{n}\sum_{i=1}^{n} 1_{\{Y_i \le y\}}$$
and for large $n$ it can be shown (and should be intuitively clear) that $F_e$ should be$^2$ a good approximation to $F$. Therefore, as long as $\theta$ is sufficiently well-behaved, i.e. a continuous function of $F$, then for sufficiently large $n$ we should have
$$\mathrm{MSE}(F) \approx \mathrm{MSE}(F_e) = \mathrm{E}_{F_e}\left[\left(g(Y_1, \ldots, Y_n) - \theta(F_e)\right)^2\right]. \tag{5}$$
The quantity $\mathrm{MSE}(F_e)$ is known as the bootstrap approximation to $\mathrm{MSE}(F)$ and is easy to estimate via simulation as we shall see below. First, however, we will consider an example where $\mathrm{MSE}(F_e)$ can be computed exactly. Indeed the bootstrap is not required in this case but it is nonetheless instructive to see the calculations written out explicitly.

Example 2 (Applying the Bootstrap to the Sample Mean) Suppose we wish to estimate $\theta(F) = \mathrm{E}_F[Y]$ via the estimator $\hat{\theta} = g(Y_1, \ldots, Y_n) := \bar{Y}$. As noted above, the bootstrap is not necessary in this case as we can apply the CLT directly as in Section 1 to obtain confidence intervals for $\theta$ or, equivalently, we can estimate the mean-squared error $\mathrm{E}\left[(\bar{Y} - \theta)^2\right] = \sigma^2/n$ with $\hat{\sigma}^2/n = \sum_{i=1}^{n} (y_i - \bar{y})^2 / (n(n-1))$.
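Letting $\bar{y}$ denote the mean of the observed, i.e. simulated, data-points $y_1, \ldots, y_n$, we obtain that the bootstrap estimator is given by
$$\mathrm{MSE}(F_e) = \mathrm{E}_{F_e}\left[\left(\frac{\sum_{i=1}^{n} Y_i}{n} - \bar{y}\right)^2\right]$$
$$= \mathrm{Var}_{F_e}\left(\frac{\sum_{i=1}^{n} Y_i}{n}\right) \tag{6}$$
$$= \frac{\mathrm{Var}_{F_e}(Y)}{n} \tag{7}$$
$$= \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n^2}$$
where (6) follows since $\mathrm{E}_{F_e}[Y] = \bar{y}$, and (7) follows since the $Y_i$'s are IID $F_e$. We therefore see that the bootstrap approximation to the MSE is almost identical to our usual estimator, $\hat{\sigma}^2/n$.

$^1$ We follow Sheldon M. Ross's Simulation in our development of the bootstrap here.
$^2$ Indeed it can be shown that $F_e(y)$ converges to $F(y)$ uniformly in $y$ w.p. 1 as $n \to \infty$.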
In contrast to Example 2, we cannot usually compute $\mathrm{MSE}(F_e)$ explicitly, but as it's an expectation we can easily use Monte-Carlo to estimate it. In this case we need to simulate from $F_e$ which is easy to do and so we obtain the following bootstrap algorithm for estimating $\mathrm{MSE}(F)$.

Bootstrap Simulation Algorithm for Estimating $\mathrm{MSE}(F)$

    for i = 1 to B
        generate $Y_1, \ldots, Y_n$ IID from $F_e$
        set $\hat{\theta}_i^b = g(Y_1, \ldots, Y_n)$
        set $Z_i^b = \left[\hat{\theta}_i^b - \theta(F_e)\right]^2$
    end for
    set $\widehat{\mathrm{MSE}}(F) = \sum_{i=1}^{B} Z_i^b / B$

The $Z_i^b$'s (or equivalently the $\hat{\theta}_i^b$'s) are the bootstrap samples and a value of $B = 100$ is often sufficient to obtain a sufficiently accurate estimate. In the next example we apply the bootstrap approach in a historical simulation context where we have real data observations as opposed to simulated data. (The disadvantage with historical simulation is that we typically have no control over $n$.)

Example 3 (Estimating the Minimum Variance Portfolio) Suppose we wish to invest a fixed sum of money in two financial assets, $X$ and $Z$ say, that yield random returns of $R_x$ and $R_z$, respectively. We invest a fraction $\theta$ of our wealth in $X$, and the remaining $1 - \theta$ in $Z$. The goal is to choose $\theta$ to minimize the total variance, $\mathrm{Var}(\theta R_x + (1 - \theta) R_z)$, of our investment return. It is easy to see that the minimizing $\theta$ is given by
$$\theta^* = \frac{\sigma_z^2 - \sigma_{xz}}{\sigma_x^2 + \sigma_z^2 - 2\sigma_{xz}} \tag{8}$$
where $\sigma_x^2 = \mathrm{Var}(R_x)$, $\sigma_z^2 = \mathrm{Var}(R_z)$ and $\sigma_{xz} = \mathrm{Cov}(R_x, R_z)$. In practice, we do not know these quantities and therefore have to estimate them from historical data. We therefore obtain
$$\hat{\theta} = \frac{\hat{\sigma}_z^2 - \hat{\sigma}_{xz}}{\hat{\sigma}_x^2 + \hat{\sigma}_z^2 - 2\hat{\sigma}_{xz}} \tag{9}$$
as our estimator of the minimum variance portfolio with $\hat{\sigma}_x^2$, $\hat{\sigma}_z^2$ and $\hat{\sigma}_{xz}$ estimated from historical return data $Y_1, \ldots, Y_n$ with $Y_i := \left(R_x^{(i)}, R_z^{(i)}\right)$ the joint return in period $i$. We would like to know how good an estimator $\hat{\theta}$ is. More specifically, what is the (mean-squared) error when we use $\hat{\theta}$? We can answer this question using the bootstrap with $\theta(F) := \theta^*$ and $g(Y_1, \ldots, Y_n) = \hat{\theta}$ the estimator given by (9).

Exercise 2 Provide pseudo-code for estimating $\mathrm{MSE}(\hat{\theta}) := \mathrm{MSE}(F)$ in Example 3.
Exercise 3 Consider the problem of estimating $\theta(F) = \mathrm{E}[Y \mid Y \ge \beta]$ for some fixed constant, $\beta$. Explain how you would use the bootstrap to estimate $\mathrm{MSE}(F)$ in this case given Monte-Carlo samples $Y_1, \ldots, Y_n$.

3.1 Constructing Bootstrap Confidence Intervals

The bootstrap method is also widely used to construct confidence intervals and here we will consider the so-called basic bootstrap interval. Consider our bootstrap samples $\hat{\theta}_1^b, \ldots, \hat{\theta}_B^b$ and suppose we want a $1 - \alpha$ confidence interval for $\theta = \theta(F)$. Let $q_l$ and $q_u$ be the $\alpha/2$ and $1 - \alpha/2$ sample quantiles, respectively, of the bootstrap samples. Then the fraction of bootstrap samples satisfying
$$q_l \le \hat{\theta}^b \le q_u \tag{10}$$
is $1 - \alpha$. But (10) is equivalent to
$$\hat{\theta} - q_u \le \hat{\theta} - \hat{\theta}^b \le \hat{\theta} - q_l \tag{11}$$
where $\hat{\theta} = g(y_1, \ldots, y_n)$ is our estimate of $\theta$ computed using the original data-set. This implies $\hat{\theta} - q_u$ and $\hat{\theta} - q_l$ are the lower and upper quantiles for $\hat{\theta} - \hat{\theta}^b$. The basic bootstrap assumes they are also the quantiles for $\theta - \hat{\theta}$. This makes sense intuitively and can be justified mathematically as $n \to \infty$ and if $\theta$ is a continuous function of $F$. It therefore follows that
$$\hat{\theta} - q_u \le \theta - \hat{\theta} \le \hat{\theta} - q_l \tag{12}$$
will occur in approximately a fraction $1 - \alpha$ of samples. Adding $\hat{\theta}$ across (12) yields an approximate $100(1-\alpha)\%$ CI for $\theta$ of $(2\hat{\theta} - q_u,\ 2\hat{\theta} - q_l)$.
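Both the bootstrap estimate of the MSE and the basic bootstrap interval reduce to resampling the data with replacement and re-applying the estimator $g$. The sketch below is one way this might look in Python. It makes two assumptions that should be checked for any particular problem: the plug-in value $\theta(F_e)$ is taken to be $g$ applied to the original data (true for the sample mean, only approximately so for some other estimators), and the sample quantiles are read off the sorted bootstrap values with a simple index rule of our own choosing.

```python
import math
import random
import statistics


def bootstrap_estimates(data, g, B, rng):
    """Return B values of g, each computed on a resample drawn IID from F_e."""
    n = len(data)
    return [g(rng.choices(data, k=n)) for _ in range(B)]


def bootstrap_mse(data, g, B=100, rng=None):
    """Monte Carlo estimate of MSE(F_e), taking theta(F_e) = g(data)."""
    rng = rng or random.Random()
    theta_fe = g(data)  # assumption: plug-in value equals g on the data
    ests = bootstrap_estimates(data, g, B, rng)
    return sum((t - theta_fe) ** 2 for t in ests) / B


def basic_bootstrap_ci(data, g, alpha=0.05, B=1000, rng=None):
    """Basic bootstrap interval (2*theta_hat - q_u, 2*theta_hat - q_l)."""
    rng = rng or random.Random()
    theta_hat = g(data)
    ests = sorted(bootstrap_estimates(data, g, B, rng))
    # Simple index rule for the alpha/2 and 1 - alpha/2 sample quantiles.
    q_l = ests[math.floor((alpha / 2.0) * (B - 1))]
    q_u = ests[math.ceil((1.0 - alpha / 2.0) * (B - 1))]
    return 2.0 * theta_hat - q_u, 2.0 * theta_hat - q_l


if __name__ == "__main__":
    rng = random.Random(3)
    # Hypothetical data: 200 standard normal draws; estimator is the mean.
    ys = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print(bootstrap_mse(ys, statistics.mean, B=2000, rng=rng))
    print(basic_bootstrap_ci(ys, statistics.mean, rng=rng))
```

For the sample mean the first number can be compared with the exact value $\sum_i (y_i - \bar{y})^2 / n^2$ from Example 2, which gives a quick sanity check on the resampling code.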