Monte Carlo method and application to random processes
Lecture 3: Variance reduction techniques (8/3/2017)
Lecturer: Ernesto Mordecki, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
Graduate program in engineering, major in applied mathematics, GIMAS8AA, Mines Nancy, University of Lorraine.
March 9, 2017.
Contents
- Variance reduction
- Antithetic variates
- Importance sampling
- Control variates
- Stratified sampling
- Conditional sampling
Variance reduction

As we have seen, a critical issue in the MC method is the quality of the estimation. The question we face is: can we devise a method that produces, with the same number of variates, a more precise estimation? The answer is yes, and the general idea is the following: if we want to estimate µ = E X, find Y such that

  µ = E X = E Y,  var Y < var X.

The way to produce a good Y usually departs from the knowledge that we have about X. There are several methods to reduce the variance; however, there does not exist a general method that always produces a gain in variance: each problem has its own good method.
Antithetic variates

The method is simple, and consists in using a symmetrized variate in the cases where this is possible. For instance, if we want to compute

  µ = ∫₀¹ f(x) dx,

we would have, with U uniform in [0, 1],

  X = f(U),  Y = (f(U) + f(1 − U))/2.

We have

  var Y = (var X + cov(f(U), f(1 − U)))/2.

If cov(f(U), f(1 − U)) < var X we have variance reduction.
Example: Uniform random variables

We compute

  π = 4 ∫₀¹ √(1 − x²) dx,

with n = 10⁶ variates. Our results:

                       Estimate   Variance
  Classical estimate   3.141379   0.000553
  Antithetic variates  3.141536   0.000205
  True value           3.141593
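The two estimators above can be sketched as follows (a minimal NumPy sketch; the seed and the reuse of one uniform sample for both estimators are our own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
u = rng.random(n)                      # U uniform in [0, 1)

def f(x):
    return 4 * np.sqrt(1 - x**2)       # 4*sqrt(1 - x^2), integrates to pi

x = f(u)                               # classical: X = f(U)
y = 0.5 * (f(u) + f(1 - u))            # antithetic: Y = (f(U) + f(1 - U))/2

print(x.mean(), x.var(ddof=1))         # classical estimate and sample variance
print(y.mean(), y.var(ddof=1))         # antithetic estimate: smaller variance
```

Note that the antithetic estimator reuses each uniform twice, so it needs no extra random numbers.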
Example: Tail probabilities of normal random variables

We want to compute the probability that a standard normal variable is larger than 3: µ = P(Z > 3),

  µ̂ = (1/n) Σ_{k=1}^n 1{Z_k > 3},
  µ̂_A = (1/2n) Σ_{k=1}^n (1{Z_k > 3} + 1{−Z_k > 3}).

Our results with n = 10⁴ variates:

                       Estimate   Variance
  Classical estimate   0.00121    0.00022
  Antithetic variates  0.00137    0.00016
  True value           0.0013499
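Both tail-probability estimators can be sketched as (a NumPy sketch; the seed is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**4
z = rng.standard_normal(n)

# Classical: (1/n) sum of 1{Z_k > 3}
classical = (z > 3).mean()
# Antithetic: (1/2n) sum of 1{Z_k > 3} + 1{-Z_k > 3}
antithetic = ((z > 3).astype(float) + (-z > 3)).mean() / 2

print(classical, antithetic)   # both near the true value 0.0013499
```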
Importance sampling

The importance sampling method consists in changing the underlying distribution of the variable used to simulate. It is especially suited for the estimation of small probabilities (rare events). Assuming that X ~ f and Y ~ g, it is based on the following identity:

  µ = E h(X) = ∫ h(x) f(x) dx = ∫ [h(x) f(x) / g(x)] g(x) dx = E H(Y),

where we define H(x) = h(x) f(x) / g(x). The main idea is to achieve that Y points to the set where h takes large values. If not correctly applied, the method can enlarge the variance.
Example: Tail probabilities of normal random variables

  µ = P(Z > 3) = ∫₃^∞ e^{−x²/2}/√(2π) dx
    = ∫₃^∞ [e^{−x²/2} / e^{−(x−3)²/2}] e^{−(x−3)²/2}/√(2π) dx
    = ∫₃^∞ e^{−3x+9/2} e^{−(x−3)²/2}/√(2π) dx
    = E(e^{−3Y+9/2} 1{Y > 3}),

where Y ~ N(3, 1). Our results with n = 10⁴ variates:

                       Estimate   Variance
  Classical estimate   0.00121    2.2e-04
  Antithetic variates  0.00137    1.6e-04
  Importance sampling  0.001340   1.5e-05
  True value           0.0013499
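The shifted estimator can be sketched as (a NumPy sketch; Y ~ N(3, 1) is simulated as Z + 3, and the seed is our own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**4

y = rng.standard_normal(n) + 3.0          # Y ~ N(3, 1)
h = np.exp(-3.0 * y + 4.5) * (y > 3)      # e^{-3Y + 9/2} 1{Y > 3}

print(h.mean())   # importance-sampling estimate of P(Z > 3)
```

Because the sampling distribution is centered at 3, roughly half the draws land in the rare set {Y > 3}, instead of about 0.13% under the original N(0, 1).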
Control variates

Given the problem of estimating µ = E h(X), the idea is to control the function h through a function g, as close as possible to h, and such that we know β = E g(X). We can add a constant c for a better adjustment. More concretely, the equation is

  µ = E h(X) = E h(X) − c(E g(X) − β) = E(h(X) − c g(X)) + cβ.

The coefficient c can be chosen in order to minimize the variance:

  var(h(X) − c g(X)) = var h(X) + c² var g(X) − 2c cov(h(X), g(X)).
This gives a minimum when

  c = cov(h(X), g(X)) / var(g(X)).

As these quantities are usually unknown, we can first run a pilot MC to estimate c, obtaining the following variance:

  var(h(X) − c g(X)) = var(h(X)) − cov(h(X), g(X))² / var(g(X)) = (1 − ρ(h(X), g(X))²) var(h(X)).

As |ρ(h(X), g(X))| ≤ 1, we usually obtain a variance reduction.
Example: The computation of π

We choose g(x) = 1 − x, which is close to √(1 − x²). We first estimate c. In this case we know β = E(1 − U) = 1/2 and var(1 − U) = 1/12. After simulation we obtain ĉ ≈ 0.7.
So we estimate

  π = 4 E(√(1 − U²) − 0.7((1 − U) − 1/2)).

Our results with n = 10⁶:

                       Estimate   Variance
  Classical estimate   3.141379   0.000553
  Antithetic variates  3.141536   0.000205
  Control variates     3.141517   0.000215
  True value           3.141593
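The control-variate estimator can be sketched as (a NumPy sketch; here c is estimated from the same sample rather than a separate pilot run, and the factor 4 is absorbed into h, so ĉ comes out near 4 × 0.7):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
u = rng.random(n)

h = 4 * np.sqrt(1 - u**2)                  # h(U), with E h(U) = pi
g = 1 - u                                  # control variate, beta = E g(U) = 1/2

c = np.cov(h, g)[0, 1] / g.var(ddof=1)     # estimated optimal coefficient
estimate = (h - c * (g - 0.5)).mean()      # E(h(U) - c g(U)) + c beta

print(estimate, c)
```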
Stratified sampling²

The idea that this method proposes to reduce the variance is to produce a partition of the probability space Ω, and to distribute the sampling effort over each set of the partition. Suppose we want to estimate µ = E(X), and suppose there is some discrete random variable Y, with possible values y₁, ..., y_k, such that, for each i = 1, ..., k:

(a) the probability p_i = P(Y = y_i) is known;
(b) we can simulate the value of X conditional on Y = y_i.

The proposal is to estimate

  E(X) = Σ_{i=1}^k E(X | Y = y_i) p_i

by estimating the k quantities E(X | Y = y_i), i = 1, ..., k.

² Adapted from Simulation, 5th ed., S. M. Ross (2013), Elsevier.
So, rather than generating n independent replications of X, we do n p_i of the simulations conditional on the event that Y = y_i, for each i = 1, ..., k. If we let X̂_i be the average of the n p_i observed values of X | Y = y_i, then we have the unbiased estimator

  µ̂ = Σ_{i=1}^k X̂_i p_i,

which is called a stratified sampling estimator of E(X). To compute its variance, we first have

  var(X̂_i) = var(X | Y = y_i) / (n p_i).
Consequently, using the preceding and that the X̂_i are independent, we see that

  var(µ̂) = Σ_{i=1}^k p_i² var(X̂_i) = (1/n) Σ_{i=1}^k p_i var(X | Y = y_i) = (1/n) E(var(X | Y)).

Because the variance of the classical estimator is (1/n) var(X), and var(µ̂) = (1/n) E(var(X | Y)), we see from the conditional variance formula

  var(X) = E(var(X | Y)) + var(E(X | Y))

that the variance reduction is

  (1/n) var(X) − (1/n) E(var(X | Y)) = (1/n) var E(X | Y).
That is, the variance saving per run is var E(X | Y), which can be substantial when the value of Y strongly affects the conditional expectation of X. On the contrary, if X and Y are independent, E(X | Y) = E X and var E(X | Y) = 0. Observe that the variance of the stratified sampling estimator can be estimated by

  var(µ̂) ≈ Σ_{i=1}^k p_i² s_i² / (n p_i),

where s_i² is the usual variance estimator of the sample of X | Y = y_i.

Remark: the simulation of n p_i variates for each i is called proportional sampling. Alternatively, one can choose n₁, ..., n_k such that n₁ + ··· + n_k = n so as to minimize the variance.
Example: Integrals in [0, 1]

Suppose that we want to estimate

  µ = E(h(U)) = ∫₀¹ h(x) dx.

We put

  Y = j, if (j − 1)/n ≤ U < j/n, for j = 1, ..., n.

We have

  µ = E E(h(U) | Y) = (1/n) Σ_{j=1}^n E(h(U⁽ʲ⁾)),

where U⁽ʲ⁾ is uniform in (j − 1)/n ≤ x < j/n. In this example we have k = n, and we use n_i = 1 variate for each value of Y. As

  U⁽ʲ⁾ ~ (U + j − 1)/n,

the resulting estimator is

  µ̂ = (1/n) Σ_{j=1}^n h((U_j + j − 1)/n).
To compute the variance, we have

  var(µ̂) = (1/n²) Σ_{j=1}^n var h((U + j − 1)/n) = (1/n) Σ_{j=1}^n ∫_{(j−1)/n}^{j/n} (h(x) − µ_j)² dx,

where µ_j = n ∫_{(j−1)/n}^{j/n} h(x) dx. The reduction is obtained because µ_j is closer to h than µ:

  var(µ̂_C) = (1/n) ∫₀¹ (h(x) − µ)² dx,

where µ̂_C stands for the classical MC estimator.
Example: computation of π

We return to

  π = 4 ∫₀¹ √(1 − x²) dx.

Observing that

  (j − U)/n ~ (U + j − 1)/n ~ U⁽ʲ⁾,

we combine stratified and antithetic sampling:

  µ̂ = (2/n) Σ_{j=1}^n [√(1 − ((U_j + j − 1)/n)²) + √(1 − ((j − U_j)/n)²)].

For n = 10⁵ we obtain the estimation µ̂ = 3.1415926537 (π = 3.14159265358979), with 10 correct digits.
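The combined stratified-antithetic estimator can be sketched as (a NumPy sketch using one uniform per stratum; the seed is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**5
j = np.arange(1, n + 1)                    # stratum indices 1..n
u = rng.random(n)                          # one uniform U_j per stratum

a = np.sqrt(1 - ((u + j - 1) / n) ** 2)    # stratified point in stratum j
b = np.sqrt(1 - ((j - u) / n) ** 2)        # its antithetic counterpart

estimate = (2 / n) * (a + b).sum()         # close to pi
print(estimate)
```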
Conditional sampling

Remember the telescopic (or tower) property of the conditional expectation:

  E(X) = E(E(X | θ)),

where θ is an auxiliary random variable. In case we are able to simulate Y = E(X | θ), we have the following variance reduction:

  var(Y) = var(E(X | θ)) = var(X) − E(var(X | θ)).
Example: Computing an expectation

Let U be uniform in [0, 1] and Z ~ N(0, 1). We want to compute

  µ = E(e^{UZ}).

We first compute

  E(e^{UZ} | U = u) = ∫_ℝ e^{ux} (1/√(2π)) e^{−x²/2} dx = e^{u²/2},

so Y = E(e^{UZ} | U) = e^{U²/2}, and E Y = ∫₀¹ e^{u²/2} du. Our results for two sample sizes:

               n = 10³          n = 10⁶
  Classical    1.2145 ± 0.020   1.1951 ± 0.00060
  Conditional  1.1962 ± 0.004   1.1949 ± 0.00012
  True         1.194958

Note that the classical method requires 2n samples.
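Both estimators of µ = E(e^{UZ}) can be sketched as (a NumPy sketch; note that the conditional estimator needs only the n uniforms):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
u = rng.random(n)
z = rng.standard_normal(n)

classical = np.exp(u * z).mean()        # uses 2n samples: n uniforms and n normals
conditional = np.exp(u**2 / 2).mean()   # Y = E(e^{UZ} | U) = e^{U^2/2}, n samples

print(classical, conditional)           # both near the true value 1.194958
```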