Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan

Size: px
Start display at page:

Download "Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan"

Transcription

1 Monte-Carlo MMD-MA, Université Paris-Dauphine Xiaolu Tan Septembre 2015

2

3 Contents 1 Introduction The principle The error analysis of Monte-Carlo method Simulation of random vectors or stochastic processes Random variable of uniform distribution on 0, Inverse method Transformation method Reject method Simulation of Gaussian vector Simulation of Brownian motion Variance reduction techniques The antithetic variable Variate control method Stratification Importance sampling method Stochastic gradient algorithm 21 iii

4

5 Chapter 1 Introduction 1.1 The principle The principle of Monte-Carlo method is to use simulations of the random variables to estimate a quantity such as E fx) = R d fx) µdx), 1.1) where f : R d R is a function, X is some random vector taking value in R d with distribution µ. It may also be used to solve an optimization problem of the kind where f θ : R d R) θ Θ is a family of functions. min E f θ X), 1.2) θ Θ To solve the basic problem 1.1), the method consists in simulating a sequence of i.i.d. random vectors X k ) k 1 with the same distribution of X, and then estimate EfX) by the empirical mean value Y n := 1 n n Y k := 1 n n fx k ). 1.3) The advantages of the Monte-Carlo are usually its simplicity, flexibility and efficiency for high dimensional problems. It can also be served as an alternative method or benchmark) for other numerical methods. 1.2 The error analysis of Monte-Carlo method Theorem 1.1 Law of large number) Let Y k ) k 1 be a sequence of i.i.d. random variables such that E Y <. Then with Y n defined in 1.3), one has ) P lim Y n = EY n 1 = 1.

6 2 Theorem 1.2 Central limit theorem) Let Y k ) k 1 be a sequence of i.i.d. random variables such that E Y 2 <. Then n Y n EY ) N 0, VarY ) ). And consequently, n ˆσ n Y n EY ) N 0, 1 ), where ˆσ 2 n : = 1 n n ) 2. Yk Y n 1.4) Notice that ˆσ 2 n defined in 1.4) is an estimator of the variance VarY from the sequence Y k ) k 1, and it admits the representation ˆσ 2 n = 1 n n Y 2 k Y n) 2. The central limit theorem induces that the asymptotic confidence interval of level pr) := 1 2π e x2 /2 dx of the estimator Y n is given by x R Y n R n ˆσ n, Y n + R n ˆσ n. 1.5) More precisely, it means that P Y n n R ˆσ n EY Y n + n R ˆσ n pr), as n. Remark 1.1 In practice, for pr) = 95%, we know R Conclusions To utilize the Monte-Carlo method, the first issue is then how to simulate a sequence of random vector X k ) k 1 given its law µ or given its definition based on other random elements, the second issue is how to improve the estimator by reducing its error. The error of Monte-Carlo method is measured by its confidence interval 1.5), whose length is given by 2Rˆσ n / n. When the confidence level is fixed, R is fixed. One can then use a larger n, where the cost is the computation time which is proportional to n in general. Otherwise, one can reduce ˆσ n by find some other random variable Ỹ such that E Ỹ = EY and Var Ỹ < VarY. We shall address these issues in the following of the course.

7 Chapter 2 Simulation of random vectors or stochastic processes 2.1 Random variable of uniform distribution on 0, 1 Here we admits that we know how to simulate a random variable of uniform distribution U0, 1. In particular, most of the programming environment are equipped with the computer with a generator of uniform distribution. However, it worths noticing that a generator of random variables in a computer is a deterministic program, and hence it generates always a sequence of deterministic variables, in place of a sequence of independent random variables. In practice, we search for a generator such that the sequence of generated variables has similar performance statistically. 2.2 Inverse method Let X be random variable, its distribution function, defined by F x) := PX x), is a right-continuous non-decreasing function from R to 0, 1. We then define its rightcontinuous generalized inverse function by F 1 u) := inf { x R : F x) u }. Theorem 2.1 Let X be a random variable with distribution function F and U be a random variable of uniform distribution U0, 1 on interval 0, 1. Then X F 1 U) in law. 3

8 4 Proof. Notice that for any y R, we have F 1 u) y u F y). It follows that for any y R, P F 1 U) y = P U F y) = F y). Example 2.1 i) Let X be a random variable of discrete distribution, PX = x k ) = p k where x k R, p k 0 for k N and k N p k = 1. Then let U U0, 1, and Z be the random variable defined by Z := x n, if U n 1 p k, k=0 ) n p k. Then Z has the same distribution of X. The definition of Z can be interpreted as F 1 U) with the distribution function F of X. ii) Let X Eλ) be a random variable of exponential distribution of parameter λ > 0, i.e. the density function is given by fx) := λe λx 1 x 0, and the distribution is given by F x) := 1 e λx. By direct computation, F 1 u) = λ 1 log1 u) for every u 0, 1). Then for U U0, 1), k=0 F 1 U) = λ 1 log1 U) λ 1 logu) Eλ), since 1 U and U have the same distribution when U U0, 1). 2.3 Transformation method Proposition 2.1 Box-Muller) Suppose that U and V are independent random variables of uniform distribution on the interval 0, 1. Let X := 2 logu) cos2πv ) and Y := 2 logu) sin2πv ). Then X and Y are two independent random variables of Gassian distribution N0, 1). Proof. Exercise 2.1 Let U, V ) be a random vector which is uniformly distributed on the disk {u, v) : u 2 + v 2 1}. Let X := U 2 logu 2 + V 2 U 2 + V 2 and Y := V 2 logu 2 + V 2 U 2 + V 2.

9 5 Prove that X, Y ) N0, I 2 ). 2.4 Reject method Let f : R d R + and g : R d R + be two density functions such that, for some constant γ > 0, one has fx) γgx), for all x R d. In practice, g is the density function of some distribution with well-known simulation method such as Gaussian distribution, uniform distribution, exponential distribution, etc.), but f is the density function of some distribution without an easy simulation method. The objective is to use the simulations of random variables of distribution g, together with a rejection procedure, to simulate the random variable of distribution f. Proposition 2.2 Let Y k ) k 1 be an i.i.d. sequence of random variables of density g, and U k ) k 1 be an i.i.d. sequence of random variable of distribution U0, 1. Moreover, Y k ) k 1 and U k ) k 1 are also independent. Define a sequence X n ) n 1 of random variables { X n := Y Nn, with N 0 := 0, andn n+1 := min k > N n : U k fx } k). γgx k ) Then X n ) n 1 is a sequence of i.i.d. random variable of density f. Proof. Exercise 2.2 Let f : R R + be defined by fx) := 1 x ) +. Give a numerical algorithm based on the above reject method) to simulate an i.i.d. sequence of random variables of density f. 2.5 Simulation of Gaussian vector The case of dimension 2 of Gaussian distribution, ρ 1, 1 a constant. Define Let Z 1, Z 2 ) N0, I 2 ) be two independent random variable X 1 := Z 1 and X 2 := ρz ρ 2 Z 2.

10 6 It is clear that X 1 X 2 N0, 1) and CovX 1, X 2 ) = CovZ 1, ρz ρ 2 Z 2 ) = ρ, which means that X 1 X 2 ) N More generally, for Z 1, Z 2 ) N0, I 2 ), let 0 0 ), 1 ρ ρ 1 )) X 1 := µ 1 + σ 1 Z 1 and X 2 := µ 2 + σ 2 ρz ρ 2 Z 2 ),. then X 1 X 2 ) N µ 1 µ 2 ), σ 2 1 ρσ 1 σ 2 ρσ 1 σ 2 σ 2 2 )). 2.1) Notice also that any Gaussian vector of dimension 2 can be written in the form 2.1). General case: Cholesky s method Let Z N0, I d ) be standard Gaussian random vector of dimension d, and A be a lower triangular matrix of dimension d d, i.e. Z = Z 1 Z d and A = A A 21 A 22 0 A d1 A d2 A dd. Then the vector X := AZ N0, Σ) with variance-covariance matrix Σ := AA T. Cholesky s method consists in finding a lower triangular matrix A such that AA T = Σ, where Σ is a given variance-covariance matrix. Let us write the equation AA T = Σ as A A 11 A 21 A d1 Σ 11 Σ 12 Σ 1d A 21 A A 22 A d2 = Σ 21 Σ 22 Σ 2d. A d1 A d2 A dd. 0 0 A dd. Σ d1 Σ d2 Σ dd The solution is given by A 2 11 = Σ 11 A 21 A 11 = Σ 21 A d1 A 11 = Σ d1 A ii = Σ ii i 1 A2 ik A ij = Σ ii j 1 A ) ika jk /Ajj, j < i. Exercise 2.3 Provide a pseudo code for the algorithm of Cholesky s method.

11 7 2.6 Simulation of Brownian motion Definition 2.1 A standard Brownian motion W is a stochastic process starting from 0, and having i) continuous paths i.e. t W t is almost surely continuous), ii) independent increments i.e. W t W s W s W r, 0 r s t), iii) stationary and Gaussian increments i.e. W t W s N0, t s)). Forward simulation Using the the independent and stationary Gaussian increments property, one can simulate a path of a Brownian motion in a forward way. Let 0 = t 0 < t 1 < be a discrete grid of R +, Z k ) k 1 be a sequence of i.i.d. random variable of Gaussian distribution N0, 1), we define W by W 0 := 0 and W tk+1 := W tk + t k+1 t k Z k+1. Then W is a sample of paths of the Brownian motion on the discrete grid t k ) k 0. Brownian bridge The forward simulation method consists in simulating W tk+1 knowing the value of W tk. There is backward simulation method, i.e. one simulates first the variable W tn, and then simulates the variables W tn 1, W tn 2,, W t2, W t1 recursively. Proposition 2.3 Let 0 = t 0 < t 1 < be a discrete grid, then the conditional distribution of W k knowing W ti, i k ) is a Gaussian distribution Nµ, σ 2 ) with µ = t k+1 t t W tk 1 + t k t k 1 W tk+1 and σ 2 = t k+1 t k )t k t k 1 ), t k+1 t k 1 t k+1 t k 1 t k+1 t k 1 in particular, L W tk Proof. W tk 1 = x, W tk+1 = y ) = N x + t k t k 1 y, t k+1 t k 1 t k+1 t k )t k t k 1 ) ). t k+1 t k 1 Exercise 2.4 Give the backward simulation algorithm for a Brownian motion on 0, 1, using the above results.

12 8

13 Chapter 3 Variance reduction techniques Recall that the principle of the Monte-Carlo method is to estimate EY, where Y := fx) ) by Y n := 1 n n Y k := 1 n fx k ), 3.1) n with simulations Y k ) k 1 or more precisely X k ) k 1 ) of random variables Y. In view of the confidence interval 1.5), it is clear that to reduce the error, one should either augment the simulation number n in cost of computation time), or reduce the variance ˆσ n. 2 More precisely, since the variance VarY of Y is fixed, the real issue is to find some other random variable Ỹ satisfying E Ỹ = EY and Var Ỹ < VarY. 3.2) In most of cases, Ỹ admits the representation Ỹ := gx) with some function g : Rd R. Then using the simulations of Ỹ, one could expect an estimator of EY = E Ỹ ) with smaller error. 3.1 The antithetic variable For many random variables vectors) X, their distributions have some symmetric property and admits a simple transformation A : R d R d such that AX) and X have the same distribution. We call AX) the antithetic variable of X. For example, let X U0, 1, then AX) := 1 X U0, 1; let X N0, σ 2 ), then AX) := X N0, σ 2 ). It follows that 9

14 10 E fax)) = EfX), and hence E Ỹ = EY with Ỹ := fx) + f AX) ). 2 Then a new Monte-Carlo estimator can be given by Ỹ n := 1 n n fx k ) + fax k )) 2 = 1 2n n ) fx k ) + fax k )). 3.3) In some context, we can expect that VarỸ much smaller than VarY see the criteria 3.2)). Example 3.1 Naive Examples) i) Let fx) := x and X N0, σ 2 ) be a Gaussian r.v., then Y := fx) = X. The random variable X admits an antithetic variable X. fx)+f X) Then Ỹ := 2 0 and it is clear that i.e. 3.2) is true for this example. E Ỹ = EY and VarY > Var Ỹ = 0. ii) Let fx) := x, Y := fu) and U U0, 1 which admits an antithetic variable 1 U. fu)+f1 U) Then Ỹ := and it is clear that 3.2) holds true in this context. Exercise 3.1 Let U U0, 1, then X := logu) λ Then Comparer the variance of X and that of E X X + X = E 2 X+ X 2. Eλ) and X := log1 U) λ. Eλ). Variance analysis Then one has whenever Var Ỹ = 1 4 By direct computation, we have VarfX) + 2Cov fx), fax)) ) + VarfAX)) = 1 2 VarY Cov fx), fax)). Var Ỹ 1 VarY 3.4) 2 Cov fx), fax)) 0. Intuitively, since AX) is the antithetic variable, we can expect that AX) has a negative correlation with X. In practice, the computation error of estimators Ỹn in 3.3))

15 11 and Y 2n in 3.1)) should be the same, and under the condition 3.4), one has Var Ỹ n Var Y 2n. Remark 3.1 It is very important to use the same simulation X k for estimator Ỹn in 3.3). Otherwise, image that X k ) k 1 is i.i.d with X k ) k 1, and consider Ŷ n := 1 n n fx k ) + fa X k )). 2 Then one has Var Ŷ n = Var Y 2n, which means that the estimator Ŷn is not better than the classical estimator. Case of Gaussian distribution When X is of Gaussian distribution, we can provide more precise criteria for condition 3.4). Proposition 3.1 Let X Nµ, σ 2 ), which admits an antithetic variable AX) := 2µ X. Let f : R R be a monotone function, then Cov fx), fax)) 0. Proof. Without loss of generality, we can suppose that X N0, 1). Let X 1, X 2 be two independent r.v. of distribution N0, 1), then for a monotone function, one has fx1 ) fx 2 ) ) f X 1 ) f X 2 ) ) 0. And hence E fx1 ) fx 2 ) ) f X 1 ) f X 2 ) ) 0. By direct computation, it follows that fx1 0 E ) fx 2 ) ) f X 1 ) f X 2 ) ) = 2 Cov fx 1 ), f X 1 ) = 2 Cov fx), f X). Example 3.2 In application of finance, a problem may be E e rt S T K) + with S T := S 0 e r σ2 /2)T +σw T, where W is a Brownian motion, i.e. W T N0, T ). In this case, it is clear that Y := e rt S T K) + can be expressed as an increasing function of W T, and one can then use the antithetic variable technique in the Monte-Carlo method.

16 Variate control method We recall that the random variable takes the form fx) with some random vector X and f : R d R. Suppose that there is some other function g : R d R close to f) and such that the constant m := EgX) can be computed explicitly. Then for every constant b R, one has EY = EỸ b) with Ỹ b) := fx) b gx) m ). It follows another Monte-Carlo estimator of EY, with simulations X k ) k 1, 1 n n Ỹ k b) where Ỹ k b) := fx k ) b gx k ) m ). 3.5) Example 3.3 Let X U0, 1, f : 0, 1 R be some function, and Y := fx). By approximation, one may find some polynomial function g : 0, 1 R such that f g. Besides, the constant m := EgX) is known explicitly whenever g is a polynomial. Take b = 1, it follows that and we can expect that EY = E fx) gx) + m Var fx) gx) + m = Var fx) gx) < Var fx), since g is an approximation of f. Variance analysis By direct computation, it follows that Var Ỹ b) = Var fx) b gx) m ) = Var fx) 2b Cov fx), gx) + b 2 Var gx). We then minimize the variance on the control variable b: min Var Ỹ b) = VarY b R with the optimal control variable Cov fx), gx) ) 2 Var gx) = VarY 1 ρ 2 fx), gx) ), b := Cov fx), gx) Var gx). 3.6)

17 13 Remark 3.2 i) The above computation shows that to use the variate control method, one should search for a function g : R d R such that m := E gx) is known explicitly, and ρ fx), gx) ) is big. ii) As in Remark 3.1, it is very important to use the same simulation X k for estimator Ỹ n in 3.5). Otherwise, image that X k ) k 1 is i.i.d with X k ) k 1, and consider Ŷ n := 1 n fxk ) bg n X k ) m) ), Then one has ρ fx), g X) ) = 0, which means that the estimator Ŷn is not better than the classical estimator. Estimation of the optimal control variable b In practice, we use Monte-Carlo method to compute EfX) since it can be computed explicitly. Then there is no reason we know how to compute Cov fx), gx), and hence we need to estimate it to obtain an estimation of b in 3.6). A natural estimator of b should then be ˆbn := n Y k Y n )gx k ) G n ) n gx k) G n ) 2, with G n := 1 n n gx k ). 3.7) Further, to avoid the correlation between the estimator ˆb n and the simulations Ỹkˆb n ) in 3.5), we can estimate first ˆb n with a small number n of simulations of X k ) 1 k n, then use a large number m of simulations X k ) n+1 k n+m to estimate EY, i.e. to obtain the estimator 1 m m Ỹ n+k ˆb n ). Multi-variate controls On can also consider several functions g k : R d R),,n. Denote Z k := g k X), Z = Z 1,, Z n ) and suppose that EZ is known explicitly, we can then have a new variate control candidate Ỹ b) := Y b, Z EZ, b = b 1,, b n ) R n. It is clear that EY = E Ỹ b), and by similar computation, one has min Var Ỹ b) ) = min σ 2 b R n b R d Y 2bΣ Y Z + b T Σ ZZ b,

18 14 where Σ Y, Σ Y Z and Σ ZZ are given by Var Y Z = Σ Y Σ Y Z Σ Y Z Σ ZZ ). The optimal control b is provided by b := Σ 1 ZZ Σ ZY. 3.3 Stratification Recall that X is random vector and Y := fx) for some function f : R d R. Let g : R d R m be some function and denote Z := gx) another random vector. Let A k ) 1 k K be partition of the support of Z in R n, i.e. A 1,, A K are disjoints such that K p k = 1 with p k := PZ A k ), k = 1, K. It follows by Bayes s formula that E Y = K p k E Y Z A k. Assumption 3.1 i) The values of probability p k ) 1 k K are known explicitly. ii) One knows how to simulate a random variable following the conditional distribution LY Z A k ). Under Assumption 3.1, we can propose another Monte-Carlo estimator of E Y : for each k = 1,, K, let Y k) i ) i 1 be a sequence of i.i.d random variable such that LY k) 1 ) = LY Z A k ), then for n = n 1,, n K ) N K, denote Ŷ n := K p k 1 n k n k i=1 ) Y k) i. 3.8) It is clear that E Ŷ n = EY and Ŷ n EY as n 1,, n K ). Simulation of conditional distribution i)suppose that X is a random variable with distribution function F, Z = X and A k = a k, a k+1 for some constant a 1, a 2,, a K+1. Let X k) := F 1 F a k ) + U F a k+1 ) F a k ) )) where U U0, 1,

19 15 and Y k) := fx k) ). Then L X k)) = LX X A k ) and L Y k) ) ) = L fx k) ) ) = LY X A k ). ii) Suppose that X is a random vector of density ρ : R d R and Z = X. Define ρ k x) := 1 ρx)1 x Ak with p k := PX A k ) = ρx)dx. p k A k Then ρ k is the density function of the conditional distribution of LX X A k ). Variance analysis Denote µ k := EY k), σ 2 k := VarY k), q k := n k n with n := K n k. Then Var Ŷ n = = 1 n K p 2 k K 1 n k Var Y k) = K p 1 k 1 nq k σ 2 k p 2 k q k σ 2 k = 1 n σ2 q), where σ 2 q) := K p 2 k q k σ 2 k. Recall also that Var Y n = 1 n Var Y = 1 n σ2. Then one compare the value σ 2 q) and σ 2. i) Proportional allocation: Let n k n =: qk = p k, then σ 2 q ) := K p kσk 2, and one has σ 2 = σ 2 q ) + K K p k µ 2 k ) 2 p k µ k σ 2 q ), 3.9) where the last inequality follows by Jensen s inequality. Remark 3.3 Let us define a random variable η by η := K k1 Z Ak. Then we have µ k := EY k) = EY η = k and σ 2 k = VarY k) = VarY η = k. Moreover, E VarY η = K p k σk 2 and Var EY η = K K p k µ 2 k ) 2. p k µ k Then the equality 3.9) can be interpreted as a variance decomposition: σ 2 = VarY = E VarY η + Var EY η.

20 16 ii) Optimal allocation: Let us consider the minimal variance problem min q K p 2 k q k σ 2 k subject to q : q k 0, K q k = 1. The Lagrange multiplier is given by Lλ, q 1,, q K ) := K p 2 k σk 2 q + λ K ) q k 1. k Then the first order condition gives which implies that L = p2 k σ2 k q k qk 2 + λ = 0, p k σ k q k = λ, k = 1,, K. Thus q k = λp k σ k for all k = 1,, K, and it follows that q k = p k σ k K i=1 p iσ i. Application: Let S t be defined by S t = S 0 e σ2 /t+σw t, where S 0 and σ are some positive constant. Denote X := W T / T N0, 1), motivated by its application in finance, we usually need to compute the value E Y with Y := fs T ) = gx), for some function f : R R and g : R R notice that S T can be expressed as a function of X). Let Φ : R 0, 1) be the distribution function of the Gaussian distribution N0, 1), and take A k = a k, a k+1 for some constant a k ) 1 k K+1, X k) := Φ 1 a k + a k+1 a k )U ) Y k) := gx k) ). We then obtain the following algorithm: Algorithm 3.1 i) Choose the sequence of stratification a k ) 1 k K+1. ii) For each k = 1,, K, simulate a sequence of i.i.d. random variable Ui k) i 1 of uniform distribution U0, 1, and let X k) i := Φ 1 a k + a k+1 a k )Ui k ).

21 17 iii) Estimate EY by K ak+1 a k ) 1 n k n k i=1 ) gxi k ). 3.4 Importance sampling method For the importance sampling, let us begin with a simple example. Suppose that X N0, 1) and h : R R is some function, then E hx) 1 = hx) e x2 /2 dx = hx)e x2 /2+x µ) 2 /2 1 e x µ)2/2 dx R 2π R 2π = hx)e µx+µ2 /2 1 e x µ)2/2 dx R 2π = E hy )e µy +µ2 /2 where Y Nµ, 1)) = E hx + µ)e µx µ2 / ) In some context, we can expect that Var hx) > Var hx + µ)e µx µ2 /2, then we can use the latter expectation to propose a Monte-Carlo estimator. To deduce the equality 3.10), the main idea is to divide the function hx) by some density function and then multiple it. We can use this idea in a more general context. Importance sampling method Let X be a random vector of density function f : R d R + and h : R d R, the objective is to compute E hx). Suppose that there is some other density function g : R d R + such that gx) > 0 for every x R d such that fx) > 0. Then by direct computation, we have E hx) = R d hx)fx)dx = R d hx) fx) gx) gx)dx = E hz) fz) gz) where Z is a random vector of density function g. Then an importance sampling estimator for EhX) is given, with a sequence Z k ) k 1 of i.i.d. simulations of Y, by, 1 n n hz k ) fz k) gz k ). 3.11)

22 18 Variance analysis Let us compute the variance of the new estimator. Var hz) fz) gz) = E h 2 Z) f 2 Z) g 2 Z) E hz) fz) gz) ) 2 = h 2 z) f 2 z) R d g 2 z) gz)dz E hx) ) 2 = E h 2 X) fx) E hx) ) 2. gx) And hence the problem of minimizing the variance turns to be min g Var hz) fz) gz) min g E h 2 X) fx) gx). 3.12) Example 3.4 i) Suppose that hx) = 1 A x) for some subset A R d. Then the minimization problem min g Var hz) fz) gz) = min g Var 1 A Z) fz) gz) is solved by gz) := fz)1 Az) α, where α is the constant making g a density function. ii) Suppose that h is positive, then the minimization problem 3.12) is solved by gz) := 1 αfz)hz), where α is the constant making g a density function., The above two examples can not be implemented since to make g a density function, we need to choose α := R d fz)hz)dz = E hx), which is not known a priori. Therefore, the minimum variance problem 3.12) is not a well posed problem. In practice, we consider a family of density functions g θ ) ) θ Θ, and then solve the minimum variance problem: min Var hz) fz) θ Θ g θ Z) min E h 2 X) fx). θ Θ g θ X) Gaussian vector case function Let X = X 1, X n ) N0, σ 2 I n ), which admits density fx 1,, x n ) := Π n ρx k), with ρx) := 1 2πσ e x2 2σ 2. Let θ = θ 1,, θ n ) R n, and g θ x 1,, x n ) := Π n ρ θ k x k ), with ρ θk x k ) := 1 e x θ k ) 2 2πσ 2σ 2.

23 19 Then the ratio of the density function turns to be fx) g θ x) n µ2 k n = exp µ kx k + 1 ) 2 σ 2. Then E hx 1,, X n ) = E h ) X 1 + θ 1,, X n + θ n e n µ kx k 1 n ) 2 µ2 k /σ 2. Example 3.5 In application in finance, one usually considers a Brownian motion W, and denote W k := W tk W tk 1 on the discrete time grid 0 = t 0 < t 1 < and one needs to compute E h ) W 1,, W n for some function h. Let Xk = W k, σ 2 = t and µ k = θ k / t, then E h ) W 1,, W n = E h W 1 + µ 1 t,, W n + µ n t ) exp n µ k W k 1 2 µ2 k ). t

24 20

25 Chapter 4 Stochastic gradient algorithm The objective is solve an optimization problem of the form where F θ, )) θ Θ is a family of functions. min E F θ, X), 4.1) θ Θ An iterative algorithm to find the root Proposition 4.1 Let f : R d R d a bounded continuous function and θ R d such that fθ ) = 0 and θ θ ) fθ) > 0, θ R d \ {θ }. 4.2) Let γ n ) n 1 be a sequence of numbers satisfying γ n > 0, n 1, and n 1 γ n =, γn 2 <. 4.3) n 1 Further, with some θ 0 R d, define a sequence θ n ) n 1 by θ n+1 = θ n γ n+1 fθ n ), n 0. Then, θ n θ as n. Proof. i) First, by its definition, we have θ n+1 θ 2 = θ n θ 2 + 2θ n θ ) θ n+1 θ n ) + θ n+1 θ n 2 = θ n θ 2 2γ n+1 fθ n ) θ n θ ) + γn fθ 2 n ) 2 θ n θ 2 + γn fθ 2 n ) 2, 21

26 22 where the last inequality follows by 4.3). Define x n := θ n θ 2 n γk 2 fθ k 1) 2. Then the sequence x n ) n 1 is non-increasing. Moreover, it is bounded from below since x n f k 1 γ k. Therefore, there is some x such that x n x and hence θ n θ 2 l := x + k 1 γ 2 k fθ k 1) 2. It is clear l 0 since it is the limit of θ n θ 2. We claim that l = 0, which can conclude the proof. ii) We now prove l = 0 by contradiction. Assume that l > 0, then there is some N > 0 such that l/2 θ θ 2 2l for every n N. Besides, by the continuity of f and 4.2), we have Therefore, η := inf l/2 θ θ 2 2l θ θ ) fθ) > 0. n 1 γ n fθ n=1 ) θ n 1 θ ) n N γ n fθ n=1 ) θ n 1 θ ) η n N However, we have also θ n θ n 1 ) θ n 1 θ ) γ n =. γ n fθ n=1 ) θ n 1 θ ) = n 1 n 1 = 1 θ n θ 2 θ n θ n 1 2 θ n 1 θ 2) 2 n 1 = 1 γ 2 2 n fθ n 1 ) l + θ 0 θ 2) <. n 1 This is a contradiction, and hence the claim l = 0 is true. Stochastic gradient algorithm Theorem 4.1 In the context of Proposition 4.1, suppose that fθ) = E F θ, X) for some function F : R d R n R and some random vector X. Suppose that f satisfies 4.2) and a sequence of numbers γ) n 1 satisfies 4.3). Then, with some θ 0 R d and a sequence X n ) n 1 of i.i.d. simulations of X, we define a sequence θ n ) n 1 by θ n+1 = θ n γ n+1 F θ n, X n+1 ). 4.4) Then, θ n θ almost surely as n.

27 23 Proof. Application: optimal importance sampling the minimum variance problem Let us denote In the context of Section 3.4, we solve min Var hx + µ)e µx µ2 /2 min E h 2 X + µ)e 2µX µ2 µ R µ R min E h 2 X)e µx+µ2 /2 4.5) µ R Lµ, X) := h 2 X)e µx+µ2 /2 and lµ) := E Lµ, X), and F µ, X) := L µ µ, X) := µ X)h2 X)e µx+µ2 /2 and fµ) := E F µ, X). Then the minimum variance problem 4.5) is equivalent to find the µ such that fµ ) = l µ ) = 0. Notice that 1 f µ) = l µ) = E + µ X) 2 ) h 2 X)e µx+µ2 /2 > 0, and hence such a µ is separate for f. Therefore, we can use the stochastic gradient algorithm 4.4) to find the optimal µ. Algorithm 4.1 i) Simulate a sequence X n ) n 1 of i.i.d. simulations of X. ii) With µ 0 = 0, use the iteration: µ n+1 = µ n γ n+1 F µ n, X n+1 ) iii) The estimator of EhX) is given by Y n+1 := = 1 n + 1 n+1 hx k + µ k 1 )e µ k 1X k µ 2 /2) k 1 n n + 1 Y n + 1 n + 1 hx n+1 + µ n )e µnx n+1 µ 2 n/2. The advantage of the above algorithm is that one does not need to memorized the simulation X n ) n 1 in the iteration.

28 24

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( ) Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio (2014-2015) Etienne Tanré - Olivier Faugeras INRIA - Team Tosca October 22nd, 2014 E. Tanré (INRIA - Team Tosca) Mathematical

More information

Stochastic Simulation Variance reduction methods Bo Friis Nielsen

Stochastic Simulation Variance reduction methods Bo Friis Nielsen Stochastic Simulation Variance reduction methods Bo Friis Nielsen Applied Mathematics and Computer Science Technical University of Denmark 2800 Kgs. Lyngby Denmark Email: bfni@dtu.dk Variance reduction

More information

Random Variables and Their Distributions

Random Variables and Their Distributions Chapter 3 Random Variables and Their Distributions A random variable (r.v.) is a function that assigns one and only one numerical value to each simple event in an experiment. We will denote r.vs by capital

More information

Exercises and Answers to Chapter 1

Exercises and Answers to Chapter 1 Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean

More information

PCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities

PCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities PCMI 207 - Introduction to Random Matrix Theory Handout #2 06.27.207 REVIEW OF PROBABILITY THEORY Chapter - Events and Their Probabilities.. Events as Sets Definition (σ-field). A collection F of subsets

More information

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539 Brownian motion Samy Tindel Purdue University Probability Theory 2 - MA 539 Mostly taken from Brownian Motion and Stochastic Calculus by I. Karatzas and S. Shreve Samy T. Brownian motion Probability Theory

More information

Chp 4. Expectation and Variance

Chp 4. Expectation and Variance Chp 4. Expectation and Variance 1 Expectation In this chapter, we will introduce two objectives to directly reflect the properties of a random variable or vector, which are the Expectation and Variance.

More information

Statistics (1): Estimation

Statistics (1): Estimation Statistics (1): Estimation Marco Banterlé, Christian Robert and Judith Rousseau Practicals 2014-2015 L3, MIDO, Université Paris Dauphine 1 Table des matières 1 Random variables, probability, expectation

More information

UC Berkeley Department of Electrical Engineering and Computer Sciences. EECS 126: Probability and Random Processes

UC Berkeley Department of Electrical Engineering and Computer Sciences. EECS 126: Probability and Random Processes UC Berkeley Department of Electrical Engineering and Computer Sciences EECS 6: Probability and Random Processes Problem Set 3 Spring 9 Self-Graded Scores Due: February 8, 9 Submit your self-graded scores

More information

ELEMENTS OF PROBABILITY THEORY

ELEMENTS OF PROBABILITY THEORY ELEMENTS OF PROBABILITY THEORY Elements of Probability Theory A collection of subsets of a set Ω is called a σ algebra if it contains Ω and is closed under the operations of taking complements and countable

More information

2 Random Variable Generation

2 Random Variable Generation 2 Random Variable Generation Most Monte Carlo computations require, as a starting point, a sequence of i.i.d. random variables with given marginal distribution. We describe here some of the basic methods

More information

Lecture 6 Basic Probability

Lecture 6 Basic Probability Lecture 6: Basic Probability 1 of 17 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 6 Basic Probability Probability spaces A mathematical setup behind a probabilistic

More information

Nonlife Actuarial Models. Chapter 14 Basic Monte Carlo Methods

Nonlife Actuarial Models. Chapter 14 Basic Monte Carlo Methods Nonlife Actuarial Models Chapter 14 Basic Monte Carlo Methods Learning Objectives 1. Generation of uniform random numbers, mixed congruential method 2. Low discrepancy sequence 3. Inversion transformation

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Joint Distributions. (a) Scalar multiplication: k = c d. (b) Product of two matrices: c d. (c) The transpose of a matrix:

Joint Distributions. (a) Scalar multiplication: k = c d. (b) Product of two matrices: c d. (c) The transpose of a matrix: Joint Distributions Joint Distributions A bivariate normal distribution generalizes the concept of normal distribution to bivariate random variables It requires a matrix formulation of quadratic forms,

More information

Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama

Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Instructions This exam has 7 pages in total, numbered 1 to 7. Make sure your exam has all the pages. This exam will be 2 hours

More information

1 Simulating normal (Gaussian) rvs with applications to simulating Brownian motion and geometric Brownian motion in one and two dimensions

1 Simulating normal (Gaussian) rvs with applications to simulating Brownian motion and geometric Brownian motion in one and two dimensions Copyright c 2007 by Karl Sigman 1 Simulating normal Gaussian rvs with applications to simulating Brownian motion and geometric Brownian motion in one and two dimensions Fundamental to many applications

More information

5 Operations on Multiple Random Variables

5 Operations on Multiple Random Variables EE360 Random Signal analysis Chapter 5: Operations on Multiple Random Variables 5 Operations on Multiple Random Variables Expected value of a function of r.v. s Two r.v. s: ḡ = E[g(X, Y )] = g(x, y)f X,Y

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

µ X (A) = P ( X 1 (A) )

µ X (A) = P ( X 1 (A) ) 1 STOCHASTIC PROCESSES This appendix provides a very basic introduction to the language of probability theory and stochastic processes. We assume the reader is familiar with the general measure and integration

More information

Formulas for probability theory and linear models SF2941

Formulas for probability theory and linear models SF2941 Formulas for probability theory and linear models SF2941 These pages + Appendix 2 of Gut) are permitted as assistance at the exam. 11 maj 2008 Selected formulae of probability Bivariate probability Transforms

More information

MTH739U/P: Topics in Scientific Computing Autumn 2016 Week 6

MTH739U/P: Topics in Scientific Computing Autumn 2016 Week 6 MTH739U/P: Topics in Scientific Computing Autumn 16 Week 6 4.5 Generic algorithms for non-uniform variates We have seen that sampling from a uniform distribution in [, 1] is a relatively straightforward

More information

conditional cdf, conditional pdf, total probability theorem?

conditional cdf, conditional pdf, total probability theorem? 6 Multiple Random Variables 6.0 INTRODUCTION scalar vs. random variable cdf, pdf transformation of a random variable conditional cdf, conditional pdf, total probability theorem expectation of a random

More information

A Probability Review

A Probability Review A Probability Review Outline: A probability review Shorthand notation: RV stands for random variable EE 527, Detection and Estimation Theory, # 0b 1 A Probability Review Reading: Go over handouts 2 5 in

More information

Lecture 25: Review. Statistics 104. April 23, Colin Rundel

Lecture 25: Review. Statistics 104. April 23, Colin Rundel Lecture 25: Review Statistics 104 Colin Rundel April 23, 2012 Joint CDF F (x, y) = P [X x, Y y] = P [(X, Y ) lies south-west of the point (x, y)] Y (x,y) X Statistics 104 (Colin Rundel) Lecture 25 April

More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

Gaussian vectors and central limit theorem

Gaussian vectors and central limit theorem Gaussian vectors and central limit theorem Samy Tindel Purdue University Probability Theory 2 - MA 539 Samy T. Gaussian vectors & CLT Probability Theory 1 / 86 Outline 1 Real Gaussian random variables

More information

F denotes cumulative density. denotes probability density function; (.)

F denotes cumulative density. denotes probability density function; (.) BAYESIAN ANALYSIS: FOREWORDS Notation. System means the real thing and a model is an assumed mathematical form for the system.. he probability model class M contains the set of the all admissible models

More information

Scientific Computing: Monte Carlo

Scientific Computing: Monte Carlo Scientific Computing: Monte Carlo Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course MATH-GA.2043 or CSCI-GA.2112, Spring 2012 April 5th and 12th, 2012 A. Donev (Courant Institute)

More information

2905 Queueing Theory and Simulation PART IV: SIMULATION

2905 Queueing Theory and Simulation PART IV: SIMULATION 2905 Queueing Theory and Simulation PART IV: SIMULATION 22 Random Numbers A fundamental step in a simulation study is the generation of random numbers, where a random number represents the value of a random

More information

Unbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others.

Unbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it

More information

CSCI-6971 Lecture Notes: Monte Carlo integration

CSCI-6971 Lecture Notes: Monte Carlo integration CSCI-6971 Lecture otes: Monte Carlo integration Kristopher R. Beevers Department of Computer Science Rensselaer Polytechnic Institute beevek@cs.rpi.edu February 21, 2006 1 Overview Consider the following

More information

Probability Models. 4. What is the definition of the expectation of a discrete random variable?

Probability Models. 4. What is the definition of the expectation of a discrete random variable? 1 Probability Models The list of questions below is provided in order to help you to prepare for the test and exam. It reflects only the theoretical part of the course. You should expect the questions

More information

LIST OF FORMULAS FOR STK1100 AND STK1110

LIST OF FORMULAS FOR STK1100 AND STK1110 LIST OF FORMULAS FOR STK1100 AND STK1110 (Version of 11. November 2015) 1. Probability Let A, B, A 1, A 2,..., B 1, B 2,... be events, that is, subsets of a sample space Ω. a) Axioms: A probability function

More information

1 Exercises for lecture 1

1 Exercises for lecture 1 1 Exercises for lecture 1 Exercise 1 a) Show that if F is symmetric with respect to µ, and E( X )

More information

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:. MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss

More information

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS PROBABILITY: LIMIT THEOREMS II, SPRING 218. HOMEWORK PROBLEMS PROF. YURI BAKHTIN Instructions. You are allowed to work on solutions in groups, but you are required to write up solutions on your own. Please

More information

4. CONTINUOUS RANDOM VARIABLES

4. CONTINUOUS RANDOM VARIABLES IA Probability Lent Term 4 CONTINUOUS RANDOM VARIABLES 4 Introduction Up to now we have restricted consideration to sample spaces Ω which are finite, or countable; we will now relax that assumption We

More information

STOR Lecture 16. Properties of Expectation - I

STOR Lecture 16. Properties of Expectation - I STOR 435.001 Lecture 16 Properties of Expectation - I Jan Hannig UNC Chapel Hill 1 / 22 Motivation Recall we found joint distributions to be pretty complicated objects. Need various tools from combinatorics

More information

Lecture 11. Multivariate Normal theory

Lecture 11. Multivariate Normal theory 10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 18-27 Review Scott Sheffield MIT Outline Outline It s the coins, stupid Much of what we have done in this course can be motivated by the i.i.d. sequence X i where each X i is

More information

Introduction to Computational Finance and Financial Econometrics Probability Review - Part 2

Introduction to Computational Finance and Financial Econometrics Probability Review - Part 2 You can t see this text! Introduction to Computational Finance and Financial Econometrics Probability Review - Part 2 Eric Zivot Spring 2015 Eric Zivot (Copyright 2015) Probability Review - Part 2 1 /

More information

Chapter 7. Extremal Problems. 7.1 Extrema and Local Extrema

Chapter 7. Extremal Problems. 7.1 Extrema and Local Extrema Chapter 7 Extremal Problems No matter in theoretical context or in applications many problems can be formulated as problems of finding the maximum or minimum of a function. Whenever this is the case, advanced

More information

Approximation of BSDEs using least-squares regression and Malliavin weights

Approximation of BSDEs using least-squares regression and Malliavin weights Approximation of BSDEs using least-squares regression and Malliavin weights Plamen Turkedjiev (turkedji@math.hu-berlin.de) 3rd July, 2012 Joint work with Prof. Emmanuel Gobet (E cole Polytechnique) Plamen

More information

Variance reduction. Michel Bierlaire. Transport and Mobility Laboratory. Variance reduction p. 1/18

Variance reduction. Michel Bierlaire. Transport and Mobility Laboratory. Variance reduction p. 1/18 Variance reduction p. 1/18 Variance reduction Michel Bierlaire michel.bierlaire@epfl.ch Transport and Mobility Laboratory Variance reduction p. 2/18 Example Use simulation to compute I = 1 0 e x dx We

More information

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Theorems Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

DUBLIN CITY UNIVERSITY

DUBLIN CITY UNIVERSITY DUBLIN CITY UNIVERSITY SAMPLE EXAMINATIONS 2017/2018 MODULE: QUALIFICATIONS: Simulation for Finance MS455 B.Sc. Actuarial Mathematics ACM B.Sc. Financial Mathematics FIM YEAR OF STUDY: 4 EXAMINERS: Mr

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

Structural Reliability

Structural Reliability Structural Reliability Thuong Van DANG May 28, 2018 1 / 41 2 / 41 Introduction to Structural Reliability Concept of Limit State and Reliability Review of Probability Theory First Order Second Moment Method

More information

ACM 116: Lectures 3 4

ACM 116: Lectures 3 4 1 ACM 116: Lectures 3 4 Joint distributions The multivariate normal distribution Conditional distributions Independent random variables Conditional distributions and Monte Carlo: Rejection sampling Variance

More information

Negative Association, Ordering and Convergence of Resampling Methods

Negative Association, Ordering and Convergence of Resampling Methods Negative Association, Ordering and Convergence of Resampling Methods Nicolas Chopin ENSAE, Paristech (Joint work with Mathieu Gerber and Nick Whiteley, University of Bristol) Resampling schemes: Informal

More information

{σ x >t}p x. (σ x >t)=e at.

{σ x >t}p x. (σ x >t)=e at. 3.11. EXERCISES 121 3.11 Exercises Exercise 3.1 Consider the Ornstein Uhlenbeck process in example 3.1.7(B). Show that the defined process is a Markov process which converges in distribution to an N(0,σ

More information

Hochdimensionale Integration

Hochdimensionale Integration Oliver Ernst Institut für Numerische Mathematik und Optimierung Hochdimensionale Integration 14-tägige Vorlesung im Wintersemester 2010/11 im Rahmen des Moduls Ausgewählte Kapitel der Numerik Contents

More information

Mathematical Statistics

Mathematical Statistics Mathematical Statistics Chapter Three. Point Estimation 3.4 Uniformly Minimum Variance Unbiased Estimator(UMVUE) Criteria for Best Estimators MSE Criterion Let F = {p(x; θ) : θ Θ} be a parametric distribution

More information

A note on general sliding window processes

A note on general sliding window processes A note on general sliding window processes Noga Alon Ohad N. Feldheim September 19, 214 Abstract Let f : R k [r] = {1, 2,..., r} be a measurable function, and let {U i } i N be a sequence of i.i.d. random

More information

Lecture 21: Convergence of transformations and generating a random variable

Lecture 21: Convergence of transformations and generating a random variable Lecture 21: Convergence of transformations and generating a random variable If Z n converges to Z in some sense, we often need to check whether h(z n ) converges to h(z ) in the same sense. Continuous

More information

IEOR E4703: Monte-Carlo Simulation

IEOR E4703: Monte-Carlo Simulation IEOR E4703: Monte-Carlo Simulation Output Analysis for Monte-Carlo Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com Output Analysis

More information

Exercises with solutions (Set D)

Exercises with solutions (Set D) Exercises with solutions Set D. A fair die is rolled at the same time as a fair coin is tossed. Let A be the number on the upper surface of the die and let B describe the outcome of the coin toss, where

More information

CALCULUS JIA-MING (FRANK) LIOU

CALCULUS JIA-MING (FRANK) LIOU CALCULUS JIA-MING (FRANK) LIOU Abstract. Contents. Power Series.. Polynomials and Formal Power Series.2. Radius of Convergence 2.3. Derivative and Antiderivative of Power Series 4.4. Power Series Expansion

More information

Quick Tour of Basic Probability Theory and Linear Algebra

Quick Tour of Basic Probability Theory and Linear Algebra Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions

More information

1 Probability and Random Variables

1 Probability and Random Variables 1 Probability and Random Variables The models that you have seen thus far are deterministic models. For any time t, there is a unique solution X(t). On the other hand, stochastic models will result in

More information

For a stochastic process {Y t : t = 0, ±1, ±2, ±3, }, the mean function is defined by (2.2.1) ± 2..., γ t,

For a stochastic process {Y t : t = 0, ±1, ±2, ±3, }, the mean function is defined by (2.2.1) ± 2..., γ t, CHAPTER 2 FUNDAMENTAL CONCEPTS This chapter describes the fundamental concepts in the theory of time series models. In particular, we introduce the concepts of stochastic processes, mean and covariance

More information

Signed Measures. Chapter Basic Properties of Signed Measures. 4.2 Jordan and Hahn Decompositions

Signed Measures. Chapter Basic Properties of Signed Measures. 4.2 Jordan and Hahn Decompositions Chapter 4 Signed Measures Up until now our measures have always assumed values that were greater than or equal to 0. In this chapter we will extend our definition to allow for both positive negative values.

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 17-27 Review Scott Sheffield MIT 1 Outline Continuous random variables Problems motivated by coin tossing Random variable properties 2 Outline Continuous random variables Problems

More information

National Sun Yat-Sen University CSE Course: Information Theory. Maximum Entropy and Spectral Estimation

National Sun Yat-Sen University CSE Course: Information Theory. Maximum Entropy and Spectral Estimation Maximum Entropy and Spectral Estimation 1 Introduction What is the distribution of velocities in the gas at a given temperature? It is the Maxwell-Boltzmann distribution. The maximum entropy distribution

More information

IE 581 Introduction to Stochastic Simulation

IE 581 Introduction to Stochastic Simulation 1. List criteria for choosing the majorizing density r (x) when creating an acceptance/rejection random-variate generator for a specified density function f (x). 2. Suppose the rate function of a nonhomogeneous

More information

Markov Chain Monte Carlo Lecture 1

Markov Chain Monte Carlo Lecture 1 What are Monte Carlo Methods? The subject of Monte Carlo methods can be viewed as a branch of experimental mathematics in which one uses random numbers to conduct experiments. Typically the experiments

More information

On a class of stochastic differential equations in a financial network model

On a class of stochastic differential equations in a financial network model 1 On a class of stochastic differential equations in a financial network model Tomoyuki Ichiba Department of Statistics & Applied Probability, Center for Financial Mathematics and Actuarial Research, University

More information

Probability and Measure

Probability and Measure Part II Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2018 84 Paper 4, Section II 26J Let (X, A) be a measurable space. Let T : X X be a measurable map, and µ a probability

More information

Introduction to Machine Learning. Lecture 2

Introduction to Machine Learning. Lecture 2 Introduction to Machine Learning Lecturer: Eran Halperin Lecture 2 Fall Semester Scribe: Yishay Mansour Some of the material was not presented in class (and is marked with a side line) and is given for

More information

STAT Chapter 5 Continuous Distributions

STAT Chapter 5 Continuous Distributions STAT 270 - Chapter 5 Continuous Distributions June 27, 2012 Shirin Golchi () STAT270 June 27, 2012 1 / 59 Continuous rv s Definition: X is a continuous rv if it takes values in an interval, i.e., range

More information

Probability: Handout

Probability: Handout Probability: Handout Klaus Pötzelberger Vienna University of Economics and Business Institute for Statistics and Mathematics E-mail: Klaus.Poetzelberger@wu.ac.at Contents 1 Axioms of Probability 3 1.1

More information

We introduce methods that are useful in:

We introduce methods that are useful in: Instructor: Shengyu Zhang Content Derived Distributions Covariance and Correlation Conditional Expectation and Variance Revisited Transforms Sum of a Random Number of Independent Random Variables more

More information

A NEW NONLINEAR FILTER

A NEW NONLINEAR FILTER COMMUNICATIONS IN INFORMATION AND SYSTEMS c 006 International Press Vol 6, No 3, pp 03-0, 006 004 A NEW NONLINEAR FILTER ROBERT J ELLIOTT AND SIMON HAYKIN Abstract A discrete time filter is constructed

More information

Accumulation constants of iterated function systems with Bloch target domains

Accumulation constants of iterated function systems with Bloch target domains Accumulation constants of iterated function systems with Bloch target domains September 29, 2005 1 Introduction Linda Keen and Nikola Lakic 1 Suppose that we are given a random sequence of holomorphic

More information

Introduction to ARMA and GARCH processes

Introduction to ARMA and GARCH processes Introduction to ARMA and GARCH processes Fulvio Corsi SNS Pisa 3 March 2010 Fulvio Corsi Introduction to ARMA () and GARCH processes SNS Pisa 3 March 2010 1 / 24 Stationarity Strict stationarity: (X 1,

More information

Actuarial Science Exam 1/P

Actuarial Science Exam 1/P Actuarial Science Exam /P Ville A. Satopää December 5, 2009 Contents Review of Algebra and Calculus 2 2 Basic Probability Concepts 3 3 Conditional Probability and Independence 4 4 Combinatorial Principles,

More information

Discretization of SDEs: Euler Methods and Beyond

Discretization of SDEs: Euler Methods and Beyond Discretization of SDEs: Euler Methods and Beyond 09-26-2006 / PRisMa 2006 Workshop Outline Introduction 1 Introduction Motivation Stochastic Differential Equations 2 The Time Discretization of SDEs Monte-Carlo

More information

Chapter 4 : Expectation and Moments

Chapter 4 : Expectation and Moments ECE5: Analysis of Random Signals Fall 06 Chapter 4 : Expectation and Moments Dr. Salim El Rouayheb Scribe: Serge Kas Hanna, Lu Liu Expected Value of a Random Variable Definition. The expected or average

More information

MATH2715: Statistical Methods

MATH2715: Statistical Methods MATH2715: Statistical Methods Exercises V (based on lectures 9-10, work week 6, hand in lecture Mon 7 Nov) ALL questions count towards the continuous assessment for this module. Q1. If X gamma(α,λ), write

More information

Supermodular ordering of Poisson arrays

Supermodular ordering of Poisson arrays Supermodular ordering of Poisson arrays Bünyamin Kızıldemir Nicolas Privault Division of Mathematical Sciences School of Physical and Mathematical Sciences Nanyang Technological University 637371 Singapore

More information

Introduction to Normal Distribution

Introduction to Normal Distribution Introduction to Normal Distribution Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 17-Jan-2017 Nathaniel E. Helwig (U of Minnesota) Introduction

More information

Hence, (f(x) f(x 0 )) 2 + (g(x) g(x 0 )) 2 < ɛ

Hence, (f(x) f(x 0 )) 2 + (g(x) g(x 0 )) 2 < ɛ Matthew Straughn Math 402 Homework 5 Homework 5 (p. 429) 13.3.5, 13.3.6 (p. 432) 13.4.1, 13.4.2, 13.4.7*, 13.4.9 (p. 448-449) 14.2.1, 14.2.2 Exercise 13.3.5. Let (X, d X ) be a metric space, and let f

More information

Probability and Measure

Probability and Measure Probability and Measure Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Convergence of Random Variables 1. Convergence Concepts 1.1. Convergence of Real

More information

Empirical Processes: General Weak Convergence Theory

Empirical Processes: General Weak Convergence Theory Empirical Processes: General Weak Convergence Theory Moulinath Banerjee May 18, 2010 1 Extended Weak Convergence The lack of measurability of the empirical process with respect to the sigma-field generated

More information

In particular, if A is a square matrix and λ is one of its eigenvalues, then we can find a non-zero column vector X with

In particular, if A is a square matrix and λ is one of its eigenvalues, then we can find a non-zero column vector X with Appendix: Matrix Estimates and the Perron-Frobenius Theorem. This Appendix will first present some well known estimates. For any m n matrix A = [a ij ] over the real or complex numbers, it will be convenient

More information

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2) 14:17 11/16/2 TOPIC. Convergence in distribution and related notions. This section studies the notion of the so-called convergence in distribution of real random variables. This is the kind of convergence

More information

An idea how to solve some of the problems. diverges the same must hold for the original series. T 1 p T 1 p + 1 p 1 = 1. dt = lim

An idea how to solve some of the problems. diverges the same must hold for the original series. T 1 p T 1 p + 1 p 1 = 1. dt = lim An idea how to solve some of the problems 5.2-2. (a) Does not converge: By multiplying across we get Hence 2k 2k 2 /2 k 2k2 k 2 /2 k 2 /2 2k 2k 2 /2 k. As the series diverges the same must hold for the

More information

Gaussian, Markov and stationary processes

Gaussian, Markov and stationary processes Gaussian, Markov and stationary processes Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ November

More information

STA205 Probability: Week 8 R. Wolpert

STA205 Probability: Week 8 R. Wolpert INFINITE COIN-TOSS AND THE LAWS OF LARGE NUMBERS The traditional interpretation of the probability of an event E is its asymptotic frequency: the limit as n of the fraction of n repeated, similar, and

More information

CS 195-5: Machine Learning Problem Set 1

CS 195-5: Machine Learning Problem Set 1 CS 95-5: Machine Learning Problem Set Douglas Lanman dlanman@brown.edu 7 September Regression Problem Show that the prediction errors y f(x; ŵ) are necessarily uncorrelated with any linear function of

More information

Multivariate Distributions

Multivariate Distributions IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate

More information

P (A G) dp G P (A G)

P (A G) dp G P (A G) First homework assignment. Due at 12:15 on 22 September 2016. Homework 1. We roll two dices. X is the result of one of them and Z the sum of the results. Find E [X Z. Homework 2. Let X be a r.v.. Assume

More information

Stability of optimization problems with stochastic dominance constraints

Stability of optimization problems with stochastic dominance constraints Stability of optimization problems with stochastic dominance constraints D. Dentcheva and W. Römisch Stevens Institute of Technology, Hoboken Humboldt-University Berlin www.math.hu-berlin.de/~romisch SIAM

More information

2 n k In particular, using Stirling formula, we can calculate the asymptotic of obtaining heads exactly half of the time:

2 n k In particular, using Stirling formula, we can calculate the asymptotic of obtaining heads exactly half of the time: Chapter 1 Random Variables 1.1 Elementary Examples We will start with elementary and intuitive examples of probability. The most well-known example is that of a fair coin: if flipped, the probability of

More information

Continuous Functions on Metric Spaces

Continuous Functions on Metric Spaces Continuous Functions on Metric Spaces Math 201A, Fall 2016 1 Continuous functions Definition 1. Let (X, d X ) and (Y, d Y ) be metric spaces. A function f : X Y is continuous at a X if for every ɛ > 0

More information

Lecture 15 Random variables

Lecture 15 Random variables Lecture 15 Random variables Weinan E 1,2 and Tiejun Li 2 1 Department of Mathematics, Princeton University, weinan@princeton.edu 2 School of Mathematical Sciences, Peking University, tieli@pku.edu.cn No.1

More information

E n = n + 1. λ n k (x k ˆm) 2. n 1. (x k m) 332:525 Homework Set 1

E n = n + 1. λ n k (x k ˆm) 2. n 1. (x k m) 332:525 Homework Set 1 332:525 Homework Set Estimation Problems. Recursive Least-Squares (RLS) Estimators: Consider a sequence of iid random variables x n, n = 0,,..., and form the running average of the first numbers considered

More information

f(x) f(z) c x z > 0 1

f(x) f(z) c x z > 0 1 INVERSE AND IMPLICIT FUNCTION THEOREMS I use df x for the linear transformation that is the differential of f at x.. INVERSE FUNCTION THEOREM Definition. Suppose S R n is open, a S, and f : S R n is a

More information

Chapter 4: Monte-Carlo Methods

Chapter 4: Monte-Carlo Methods Chapter 4: Monte-Carlo Methods A Monte-Carlo method is a technique for the numerical realization of a stochastic process by means of normally distributed random variables. In financial mathematics, it

More information