SPRING 2008, CSC KTH - Numerical methods for SDEs, Szepessy, Tempone

Monte Carlo Euler for SDEs

Consider the stochastic differential equation

  dX(t) = a(t, X(t)) dt + b(t, X(t)) dW(t),   t_0 \le t \le T.

How can one compute the value E[g(X(T))]? The Monte Carlo method is based on the approximation

  E[g(X(T))] \approx \frac{1}{M} \sum_{j=1}^{M} g(\bar{X}(T; \omega_j)),

where \bar{X} is an approximation of X, here given by the Euler method.
The error in the Monte Carlo method is

  E[g(X(T))] - \frac{1}{M} \sum_{j=1}^{M} g(\bar{X}(T; \omega_j))
    = E[g(X(T)) - g(\bar{X}(T))]                                                       (28)
    + \left( E[g(\bar{X}(T))] - \frac{1}{M} \sum_{j=1}^{M} g(\bar{X}(T; \omega_j)) \right).   (29)

In the right-hand side of this error representation, the first part (28) is the time discretization error, which we will consider later, and the second part (29) is the statistical error, which we study here.
Monte Carlo Statistical Error

Goal: Approximate the expected value E[Y] by a sample average of M i.i.d. samples,

  \frac{1}{M} \sum_{j=1}^{M} Y(\omega_j),

and choose M sufficiently large to control the statistical error,

  E[Y] - \frac{1}{M} \sum_{j=1}^{M} Y(\omega_j).
For M independent samples of Y, denote the sample average by A(Y; M) and the sample standard deviation by S(Y; M), defined by

  A(Y; M) \equiv \frac{1}{M} \sum_{j=1}^{M} Y(\omega_j),

  S(Y; M) \equiv \left[ A(Y^2; M) - (A(Y; M))^2 \right]^{1/2}.

Let

  \sigma_Y \equiv \left\{ E\left[ (Y - E[Y])^2 \right] \right\}^{1/2}.
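As a concrete illustration, the quantities A(Y; M) and S(Y; M) defined above can be computed as follows. This is a minimal Python sketch (the notes use MATLAB elsewhere); the function names are ours, not from the notes:

```python
import math

def sample_average(y):
    # A(Y; M) = (1/M) * sum_j Y(omega_j)
    return sum(y) / len(y)

def sample_std(y):
    # S(Y; M) = [A(Y^2; M) - (A(Y; M))^2]^(1/2), exactly as defined above
    a = sample_average(y)
    a2 = sample_average([v * v for v in y])
    return math.sqrt(max(a2 - a * a, 0.0))
```

Note that this is the biased (1/M) form matching the slide's definition; the unbiased 1/(M-1) variant appears later as \hat{\sigma}^2.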
Exercise 13  Compute the integral

  I = \int_{[0,1]^d} f(x) \, dx

by the Monte Carlo method, where we assume f : [0,1]^d \to R. We have

  I = \int_{[0,1]^d} f(x) \, dx
    = \int_{[0,1]^d} f(x) p(x) \, dx   (where p \equiv 1 is the uniform pdf)
    = E[f(x)]   (where x is uniformly distributed in [0,1]^d)
    \approx \frac{1}{M} \sum_{j=1}^{M} f(x(\omega_j)) \equiv I_M.
The values {x(\omega_j)} are sampled uniformly in the cube [0,1]^d by sampling the components x_i(\omega_j) independently and uniformly on the interval [0,1].

Remark 15 (Random number generators)  One can generate approximate random numbers, so-called pseudo random numbers; see the lecture notes. By using transformations, one can also generate more complicated distributions in terms of simpler ones.
Example 8  Let Y be a given real-valued random variable with P(Y \le x) = F_Y(x). Suppose that we want to sample i.i.d. realizations of Y and that we can cheaply compute F_Y^{-1}(u), u \in [0,1]. Then take U uniformly distributed in [0,1] and let Y(\omega) = F_Y^{-1}(U(\omega)). We then have

  P(Y \le x) = P(F_Y^{-1}(U) \le x) = P(U \le F_Y(x)) = F_Y(x),

as we wanted!
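Example 8 can be tried concretely for the exponential distribution, where F_Y(x) = 1 - e^{-\lambda x} has the closed-form inverse F_Y^{-1}(u) = -\ln(1-u)/\lambda. A Python sketch (the choice of distribution is ours, not from the notes):

```python
import math
import random

def sample_exponential(lam, rng=random):
    # Inverse transform: Y = F_Y^{-1}(U) with U uniform on [0, 1)
    u = rng.random()
    return -math.log(1.0 - u) / lam

random.seed(1)
samples = [sample_exponential(2.0) for _ in range(100_000)]
mean = sum(samples) / len(samples)   # should be close to 1/lambda = 0.5
```

By the sample-average argument above, the empirical mean converges to E[Y] = 1/\lambda as the number of samples grows.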
Acceptance-rejection sampling

It generates samples from an arbitrary pdf \rho_Y(x) by using an auxiliary pdf \rho_X(x).

Assumptions:
(i) It is simple to sample from \rho_X;
(ii) There exists 0 < \epsilon \le 1 such that

  \epsilon \frac{\rho_Y(x)}{\rho_X(x)} \le 1   for all x.
Idea: Rejection sampling is usually used in cases where the form of \rho_Y makes sampling difficult. Instead of sampling directly from \rho_Y, we use samples from \rho_X. These samples from \rho_X are probabilistically accepted or rejected.
Acceptance-rejection sampling

The steps below generate a single realization of Y with pdf \rho_Y.

Step 1  Set k = 1.
Step 2  Sample two independent random variables: X_k from \rho_X and U_k \sim U(0, 1).
Step 3  If U_k \le \epsilon \rho_Y(X_k)/\rho_X(X_k), then accept Y = X_k as a sample from \rho_Y. Otherwise reject X_k, increment k by 1 and go to Step 2.
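The three steps above can be sketched in Python. As an illustrative assumption (our choice, not from the notes) we take the target \rho_Y(x) = 2x on [0,1], the uniform proposal \rho_X \equiv 1, and \epsilon = 1/2, so that \epsilon \rho_Y(x)/\rho_X(x) = x \le 1 as assumption (ii) requires:

```python
import random

def sample_acceptance_rejection(rho_y, rho_x, sample_x, eps, rng=random):
    # Repeat Steps 2-3 until a proposal X_k is accepted.
    while True:
        x = sample_x(rng)                    # X_k ~ rho_X
        u = rng.random()                     # U_k ~ U(0, 1)
        if u <= eps * rho_y(x) / rho_x(x):   # accept with prob. eps*rho_Y/rho_X
            return x

random.seed(2)
ys = [sample_acceptance_rejection(lambda x: 2.0 * x, lambda x: 1.0,
                                  lambda rng: rng.random(), 0.5)
      for _ in range(50_000)]
mean = sum(ys) / len(ys)   # E[Y] = 2/3 for rho_Y(x) = 2x on [0, 1]
```

With \epsilon = 1/2 roughly two proposals are needed per accepted sample, consistent with the cost E[K] = 1/\epsilon discussed below.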
Let us see that Y sampled by acceptance-rejection indeed has density \rho_Y. We have the acceptance probability

  P\left( U_k \le \epsilon \frac{\rho_Y(X_k)}{\rho_X(X_k)} \right)
    = \int \int_0^{\epsilon \rho_Y(x)/\rho_X(x)} du \, \rho_X(x) \, dx
    = \epsilon \int \frac{\rho_Y(x)}{\rho_X(x)} \rho_X(x) \, dx
    = \epsilon \int \rho_Y(x) \, dx = \epsilon.
Let K(\omega) be the first value of k for which X_k is accepted as a realization of Y. We want to show that X_K has the desired density \rho_Y. Consider an open set B:

  P(X_K \in B) = \sum_{k \ge 1} P(X_k \in B, K = k)
    = \sum_{k \ge 1} P\left( X_k \in B, U_k \le \epsilon \frac{\rho_Y(X_k)}{\rho_X(X_k)} \right)
      \prod_{m=1}^{k-1} \underbrace{P\left( U_m > \epsilon \frac{\rho_Y(X_m)}{\rho_X(X_m)} \right)}_{= 1 - \epsilon}
    = P\left( X_k \in B, U_k \le \epsilon \frac{\rho_Y(X_k)}{\rho_X(X_k)} \right)
      \underbrace{\sum_{k \ge 1} (1 - \epsilon)^{k-1}}_{= 1/\epsilon},

where the probability in the last line does not depend on k.
To finish, compute

  P\left( X_k \in B, U_k \le \epsilon \frac{\rho_Y(X_k)}{\rho_X(X_k)} \right)
    = \int_B \int_0^{\epsilon \rho_Y(x)/\rho_X(x)} du \, \rho_X(x) \, dx
    = \epsilon \int_B \frac{\rho_Y(x)}{\rho_X(x)} \rho_X(x) \, dx
    = \epsilon \int_B \rho_Y(x) \, dx,

which implies

  P(X_K \in B) = \int_B \rho_Y(x) \, dx,

as we claimed.
Remark 16 (Acceptance-rejection cost)  Compute the expected number of proposed samples per accepted one:

  E[K] = \sum_{k \ge 1} k \, P(K = k) = \sum_{k \ge 1} k (1 - \epsilon)^{k-1} \epsilon = 1/\epsilon.

Can you interpret this result?
Monte Carlo: Numerical example

Consider the computation of the integral

  1 = \int_{[0,1]^N} \exp\left( \sum_{n=1}^{N} x_n \right) dx_1 \cdots dx_N \, / \, (e-1)^N.

M = 1e6; % Max. number of realizations
N = 20;  % Dimension of the problem
U = rand(M,N);
f = exp(sum(U,2));
run_aver = cumsum(f)./(((1:M)')*(exp(1)-1)^N);
figure, plot(1:M, run_aver), xlabel M
figure, plot(1:M, run_aver-1), xlabel M
figure, semilogy(1:M, abs(run_aver-1)), xlabel M
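The same experiment can be written in Python. A sketch mirroring the MATLAB code above (NumPy is assumed to be available; we use a smaller M here):

```python
import numpy as np

M, N = 100_000, 20                 # realizations and dimension
rng = np.random.default_rng(0)
u = rng.random((M, N))             # uniform samples in [0, 1]^N
# integrand exp(sum x_n) divided by the normalization (e - 1)^N
f = np.exp(u.sum(axis=1)) / (np.e - 1.0) ** N
run_aver = np.cumsum(f) / np.arange(1, M + 1)
# run_aver[-1] approaches the exact value 1 as M grows
```

Plotting run_aver - 1 against M on a semilog scale reproduces the O(1/sqrt(M)) decay of the statistical error seen in the figures.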
[Figures omitted: the running average I_M, the error I_M - 1, and |I_M - 1| on a semilog scale, versus the number of realizations M.]
Monte Carlo error analysis

Consider the scaled random variable

  Z_M \equiv \sqrt{M} \, \frac{A(Y; M) - E[Y]}{\sigma_Y}

with cumulative distribution function F_{Z_M}(x) \equiv P(Z_M \le x), x \in R.
The Central Limit Theorem is the fundamental result for understanding the statistical error of Monte Carlo methods.

Theorem 10 (The Central Limit Theorem)  Assume \xi_j, j = 1, 2, 3, ..., are independent, identically distributed (i.i.d.) and E[\xi_j] = 0, E[\xi_j^2] = 1. Then

  \frac{\sum_{j=1}^{M} \xi_j}{\sqrt{M}} \rightharpoonup \nu,   (30)

where \nu is N(0,1) and \rightharpoonup denotes convergence of the distributions, also called weak convergence; i.e. the convergence (30) means E[g(\sum_{j=1}^{M} \xi_j/\sqrt{M})] \to E[g(\nu)] for all bounded and continuous functions g.
Characteristic function

Let X be a random variable; then f(t) = E[e^{itX}] is called the characteristic function of X. This function completely identifies the distribution of X, namely:

Theorem 11  Two distributions having the same characteristic function are identical.

Example: Consider a standard normal distribution, X \sim N(0,1). Then

  f(t) = E[e^{itX}] = e^{-t^2/2}.
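The stated formula f(t) = e^{-t^2/2} can be checked numerically: since the standard normal pdf is even, the sine part of E[e^{itX}] vanishes and only \int \cos(tx) e^{-x^2/2}/\sqrt{2\pi} dx remains. A sketch using the trapezoidal rule on a truncated interval (the truncation at |x| = 10 is our choice and is harmless since the Gaussian tail there is negligible):

```python
import numpy as np

def normal_char_fun(t):
    # E[e^{itX}] for X ~ N(0,1), approximated by the trapezoidal rule;
    # the sin(t x) part integrates to zero by symmetry, leaving cos(t x).
    x = np.linspace(-10.0, 10.0, 20001)
    h = x[1] - x[0]
    g = np.cos(t * x) * np.exp(-x ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
    return h * (g.sum() - 0.5 * (g[0] + g[-1]))
```

For smooth integrands that decay like a Gaussian, the trapezoidal rule converges extremely fast, so the agreement with e^{-t^2/2} is essentially at machine precision.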
In fact, we have inversion formulas closely related to the Fourier transform (see [Petrov]):

Theorem 12  Let x_1, x_2 be continuity points of F_X. Then

  F_X(x_2) - F_X(x_1) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \frac{e^{-itx_1} - e^{-itx_2}}{it} f(t) \, dt.
Proof. Consider the characteristic function f(t) = E[e^{it\xi_j}]. Then its derivatives satisfy

  f^{(m)}(t) = E[i^m \xi_j^m e^{it\xi_j}].   (31)

For the sample average of the \xi_j variables we have

  E\left[ e^{it \sum_{j=1}^{M} \xi_j/\sqrt{M}} \right]
    = \left( f\left( \frac{t}{\sqrt{M}} \right) \right)^M
    = \left( f(0) + \frac{t}{\sqrt{M}} f'(0) + \frac{t^2}{2M} f''(0) + o\left( \frac{t^2}{M} \right) \right)^M.
The representation (31) implies

  f(0) = E[1] = 1,   f'(0) = i E[\xi_j] = 0,   f''(0) = -E[\xi_j^2] = -1.

Therefore

  E\left[ e^{it \sum_{j=1}^{M} \xi_j/\sqrt{M}} \right]
    = \left( 1 - \frac{t^2}{2M} + o\left( \frac{t^2}{M} \right) \right)^M
    \to e^{-t^2/2} = \int_R e^{itx} \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx   as M \to \infty,   (32)

and we conclude that the Fourier transform of the pdf
(i.e. the characteristic function) of \sum_{j=1}^{M} \xi_j/\sqrt{M} converges to the Fourier transform of the standard normal distribution. Therefore,

  E\left[ g\left( \sum_{j=1}^{M} \xi_j/\sqrt{M} \right) \right]
    = \int_R g(x) \, \rho_{\sum_{j=1}^{M} \xi_j/\sqrt{M}}(x) \, dx
    \underbrace{=}_{\text{Parseval}} \int_R f(t) \, F(g)(t) \, dt
    \to \int_R e^{-t^2/2} \, F(g)(t) \, dt
    \underbrace{=}_{\text{Parseval}} E[g(\nu)].
Exercise 14  What is the error I_M - I in Exercise 13? Let the error \epsilon_M be defined by

  \epsilon_M = \frac{1}{M} \sum_{j=1}^{M} f(x_j) - \int_{[0,1]^d} f(x) \, dx
    = \frac{1}{M} \sum_{j=1}^{M} \left( f(x_j) - E[f(x)] \right).

By the Central Limit Theorem, \sqrt{M} \, \epsilon_M \rightharpoonup \sigma\nu, where \nu is
N(0,1) and

  \sigma^2 = \int_{[0,1]^d} \left( f(x) - \int_{[0,1]^d} f(x) \, dx \right)^2 dx
    = \int_{[0,1]^d} f^2(x) \, dx - \left( \int_{[0,1]^d} f(x) \, dx \right)^2.

In practice, \sigma^2 is approximated by

  \hat{\sigma}^2 = \frac{1}{M-1} \sum_{j=1}^{M} \left( f(x_j) - \frac{1}{M} \sum_{m=1}^{M} f(x_m) \right)^2.
Approximate error bound: C_\alpha \hat{\sigma}/\sqrt{M}. Here C_\alpha = 3.
Theorem 13 (Berry-Esseen)  Assume

  \lambda \equiv \frac{\left( E\left[ |Y - E[Y]|^3 \right] \right)^{1/3}}{\sigma_Y} < +\infty;

then we have a uniform estimate in the central limit theorem:

  \left| F_{Z_M}(x) - \Phi(x) \right| \le \frac{C_{BE} \, \lambda^3}{(1 + |x|)^3 \sqrt{M}}.

Here \Phi is the distribution function of N(0,1),

  \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left( -\frac{s^2}{2} \right) ds,   (33)

and C_{BE} = 30.51175...
By the Berry-Esseen theorem, the statistical error

  E_S(Y; M) \equiv E[Y] - A(Y; M)

satisfies, for any c_0 > 0,

  P\left( |E_S(Y; M)| \le c_0 \frac{\sigma_Y}{\sqrt{M}} \right)
    \ge 2\Phi(c_0) - 1 - \frac{C_{BE} \, \lambda^3}{(1 + c_0)^3 \sqrt{M}}.
In practice choose c_0 \ge 1.65, so that 1 > 2\Phi(c_0) - 1 \ge 0.901, and the event

  |E_S(Y; M)| \le c_0 \frac{S(Y; M)}{\sqrt{M}} \equiv \bar{E}_S(Y; M)   (34)

has probability close to one; this involves the additional step of approximating \sigma_Y by S(Y; M). Thus, in the computations \bar{E}_S(Y; M) is a good approximation of the statistical error E_S(Y; M).
Numerical Example: Taking c_0 = 3 yields 2\Phi(c_0) - 1 = 0.9973... and

  P\left( |E_S(Y; M)| \le 3 \frac{\sigma_Y}{\sqrt{M}} \right) \ge 0.9973 - 0.4766 \frac{\lambda^3}{\sqrt{M}}.

In particular, if Y is a uniform random variable, then \lambda^3 = \frac{1}{4} 3^{1.5} = 1.299... and we have the bound

  P\left( |E_S(Y; M)| \le 3 \frac{\sigma_Y}{\sqrt{M}} \right) \ge 0.9973 - \frac{0.6196}{\sqrt{M}}.

Obs: the last term on the right determines the confidence level for M \lesssim 5 \cdot 10^4.
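The confidence bound above is easy to tabulate. A minimal Python sketch, using the constants from the slides (c_0 = 3, C_{BE} = 30.51175, and \lambda^3 = 3^{3/2}/4 for the uniform case):

```python
import math

C_BE = 30.51175
c0 = 3.0
lam3 = 3.0 ** 1.5 / 4.0          # lambda^3 for Y uniform on [0, 1]

def confidence_lower_bound(M):
    # 2*Phi(c0) - 1 - C_BE * lambda^3 / ((1 + c0)^3 * sqrt(M))
    phi = 0.5 * (1.0 + math.erf(c0 / math.sqrt(2.0)))
    return 2.0 * phi - 1.0 - C_BE * lam3 / ((1.0 + c0) ** 3 * math.sqrt(M))
```

Note that C_BE * lam3 / (1 + c0)^3 = 0.6193..., matching the 0.6196 of the slide up to rounding, and at M = 5*10^4 the correction 0.6196/sqrt(M) is about 0.0028, the same size as 1 - 0.9973, consistent with the observation above.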
Numerical Example: Consider a binomial random variable with parameter p = 1/2,

  X = \sum_{i=1}^{M} Y_i,

where the Y_i are i.i.d. Bernoulli random variables and \sigma^2 = p(1-p). Let

  Z = \frac{X - Mp}{\sigma \sqrt{M}};

then we compare its cdf (computed exactly) with the CLT approximation \Phi(z). We do it for several values of M...
[Figure omitted: exact cdf of Z versus the CLT approximation \Phi(z) for several values of M.]
Adaptive Monte Carlo

For a given TOL_S > 0, the goal is to find M such that \bar{E}_S(Y; M) \le TOL_S. The following algorithm adaptively finds the number of realizations M to compute the sample average A(Y; M) as an approximation to E[Y]. With probability close to one, depending on c_0, the statistical error in the approximation is then bounded by TOL_S.
routine Monte-Carlo(TOL_S, Y, M_0; E_Y)
  Set the batch counter m = 1, M[1] = M_0 and \bar{E}_S[1] = 2 TOL_S.
  Do while (\bar{E}_S[m] > TOL_S)
    Compute M[m] new samples of Y, along with the sample average
    E_Y \equiv A(Y; M[m]), the sample standard deviation
    S[m] \equiv S(Y; M[m]) and the error estimate
    \bar{E}_S[m+1] \equiv \bar{E}_S(Y; M[m]).
    Compute M[m+1] by change_M(M[m], S[m], TOL_S; M[m+1]).
    Increase m by 1.
  end-do
end of Monte-Carlo
routine change_M(M_in, S_in, TOL_S; M_out)
  M^* = \min\left\{ \text{integer part of } \left( \frac{c_0 S_{in}}{TOL_S} \right)^2, \; M_{CH} \cdot M_{in} \right\}
  n = \text{integer part of } (\log_2 M^*) + 1
  M_out = 2^n.   (35)
end of change_M
Remark 17 (Parameters for change_M)  Here M_0 is a given initial value for M, and M_CH > 1 is a positive integer parameter introduced to avoid a large new number of realizations in the next batch due to a possibly inaccurate sample standard deviation S[m]. Indeed, M[m+1] cannot be greater than M_CH M[m] before the rounding to a power of two. We will use M_CH = 2 in the next example.
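The two routines above can be sketched in Python as follows (MATLAB is used elsewhere in the notes). We take c_0 = 1.65 and M_CH = 2 as in the slides; for simplicity this sketch estimates E[Y] from the final batch only, which is one of several reasonable conventions:

```python
import math
import random

C0 = 1.65      # confidence parameter from the slides
M_CH = 2       # maximum batch growth factor

def change_M(M_in, S_in, tol_s):
    # M* = min(int((c0 * S_in / TOL_S)^2), M_CH * M_in), rounded up to 2^n
    M_star = min(int((C0 * S_in / tol_s) ** 2), M_CH * M_in)
    n = int(math.log2(max(M_star, 1))) + 1
    return 2 ** n

def monte_carlo(sample_Y, tol_s, M0=32):
    # Adaptively grow the batch size until c0 * S / sqrt(M) <= TOL_S.
    M = M0
    while True:
        ys = [sample_Y() for _ in range(M)]
        mean = sum(ys) / M
        var = sum(y * y for y in ys) / M - mean ** 2
        S = math.sqrt(max(var, 0.0))
        err = C0 * S / math.sqrt(M)          # \bar{E}_S(Y; M)
        if err <= tol_s:
            return mean, err, M
        M = change_M(M, S, tol_s)

random.seed(0)
est, err, M_final = monte_carlo(lambda: random.random(), tol_s=0.01)
# at return, err <= tol_s by construction, and est approximates E[Y] = 1/2
```

For the uniform target here, \sigma_Y \approx 0.289, so roughly (1.65 * 0.289 / 0.01)^2 \approx 2300 samples are needed; the power-of-two rounding stops the loop at the next 2^n above that.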
Numerical Example: Adaptive MC, TOL_S = 10^{-2}, for

  1 = \int_{[0,1]^N} \exp\left( \sum_{n=1}^{N} x_n \right) dx_1 \cdots dx_N \, / \, (e-1)^N,   N = 20.

  M        Sample mean  Sample std    Error est.   Computed error
  200      1.2526       4.4003e+00    9.3e-01       2.5e-01
  400      1.0411       2.1068e+00    3.1e-01       4.1e-02
  800      0.9889       1.8730e+00    2.0e-01      -1.1e-02
  1600     1.0699       2.3410e+00    1.7e-01       7.0e-02
  3200     1.0324       1.9087e+00    1.0e-01       3.2e-02
  6400     1.0764       2.1659e+00    8.1e-02       7.6e-02
  12800    1.0212       2.0788e+00    5.5e-02       2.1e-02
  25600    0.99549      1.9036e+00    3.6e-02      -4.5e-03
  51200    1.0104       1.8946e+00    2.5e-02       1.0e-02
  102400   0.98721      1.8248e+00    1.7e-02      -1.3e-02
  204800   0.99375      1.9103e+00    1.3e-02      -6.6e-03
  409600   0.99611      1.9320e+00    9.1e-03      -3.9e-03
Question: Can you compute the confidence level corresponding to the above computations as a function of M using the Berry-Esseen theorem?
Large Deviations

Theory for rare events, deep in the distribution tails.

Remember the CLT and the Berry-Esseen theorem: Assume \xi_j, j = 1, 2, 3, ..., are independent, identically distributed (i.i.d.) and E[\xi_j] = 0, E[\xi_j^2] = 1. Then

  \frac{\sum_{j=1}^{M} \xi_j}{\sqrt{M}} \rightharpoonup \nu.   (36)