Solving stochastic differential equations and Kolmogorov equations by means of deep learning
Christian Beck¹, Sebastian Becker², Philipp Grohs³, Nor Jaafari⁴, and Arnulf Jentzen⁵

arXiv preprint, v1 [math.NA], 1 Jun 2018

¹ Department of Mathematics, ETH Zurich, Zurich, Switzerland, christian.beck@math.ethz.ch
² ZENAI AG, Zurich, Switzerland, sebastian.becker@zenai.ch
³ Faculty of Mathematics and Research Platform Data Science, University of Vienna, Vienna, Austria, philipp.grohs@univie.ac.at
⁴ ZENAI AG, Zurich, Switzerland, nor.jaafari@zenai.ch
⁵ Department of Mathematics, ETH Zurich, Zurich, Switzerland, arnulf.jentzen@sam.math.ethz.ch

June 4, 2018

Abstract

Stochastic differential equations (SDEs) and the Kolmogorov partial differential equations (PDEs) associated to them have been widely used in models from engineering, finance, and the natural sciences. In particular, SDEs and Kolmogorov PDEs, respectively, are highly employed in models for the approximative pricing of financial derivatives. Kolmogorov PDEs and SDEs, respectively, can typically not be solved explicitly, and it has been and still is an active topic of research to design and analyze numerical methods which are able to approximately solve Kolmogorov PDEs and SDEs, respectively. Nearly all approximation methods for Kolmogorov PDEs in the literature suffer from the curse of dimensionality or only provide approximations of the solution of the PDE at a single fixed space-time point. In this paper we derive and propose a numerical approximation method which aims to overcome both of the above mentioned drawbacks and intends to deliver a numerical approximation of the Kolmogorov PDE on an entire region $[a,b]^d$ without suffering from the curse of dimensionality.
Numerical results on examples including the heat equation, the Black-Scholes model, the stochastic Lorenz equation, and the Heston model suggest that the proposed approximation algorithm is quite effective in high dimensions in terms of both accuracy and speed.
Contents

1 Introduction
2 Derivation and description of the proposed approximation algorithm
  2.1 Kolmogorov partial differential equations (PDEs)
  2.2 On stochastic differential equations and Kolmogorov PDEs
  2.3 Formulation as minimization problem
  2.4 Discretization of the stochastic differential equation
  2.5 Deep artificial neural network approximations
  2.6 Stochastic gradient descent-type minimization
  2.7 Description of the algorithm in a special case
  2.8 Description of the algorithm in the general case
3 Examples
  3.1 Setting
  3.2 Heat equation
  3.3 Geometric Brownian motions
  3.4 Black-Scholes model with correlated noise
  3.5 Stochastic Lorenz equations
  3.6 Heston model
4 Python source codes
  4.1 Python source code for the algorithm
  4.2 A Python source code associated to Subsection 3.2
  4.3 A Python source code associated to Subsection 3.3
  4.4 A Python source code associated to Subsection 3.4
  4.5 A Python source code associated to Subsection 3.5
  4.6 A Python source code associated to Subsection 3.6

1 Introduction

Stochastic differential equations (SDEs) and the Kolmogorov partial differential equations (PDEs) associated to them have been widely used in models from engineering, finance, and the natural sciences. In particular, SDEs and Kolmogorov PDEs, respectively, are highly employed in models for the approximative pricing of financial derivatives. Kolmogorov PDEs and SDEs, respectively, can typically not be solved explicitly, and it has been and still is an active topic of research to design and analyze numerical methods which are able to approximately solve Kolmogorov PDEs and SDEs, respectively (see, e.g., [17], [20], [21], [27], [28], [29], [32], [40], [41], [42], [44], [45], [46], [47], [49], [52]). In particular, there are nowadays several different types of numerical approximation methods for Kolmogorov
PDEs in the literature including deterministic numerical approximation methods such as finite differences based approximation methods (cf., for example, [7], [8], [23], [43], [58], [56]) and finite elements based approximation methods (cf., for example, [9], [10], [59]) as well as random numerical approximation methods based on Monte Carlo methods (cf., for example, [17], [20]) and discretizations of the underlying SDEs (cf., for example, [21], [27], [28], [29], [32], [40], [41], [42], [44], [45], [46], [47], [49], [52]). The above mentioned deterministic approximation methods for PDEs work quite efficiently in one or two space dimensions but cannot be used in the case of high-dimensional PDEs as they suffer from the so-called curse of dimensionality (cf. Bellman [5]) in the sense that the computational effort of the considered approximation algorithm grows exponentially in the PDE dimension. The above mentioned random numerical approximation methods involving Monte Carlo approximations typically overcome this curse of dimensionality but only provide approximations of the Kolmogorov PDE at a single fixed space-time point. The key contribution of this paper is to derive and propose a numerical approximation method which aims to overcome both of the above mentioned drawbacks and intends to deliver a numerical approximation of the Kolmogorov PDE on an entire region $[a,b]^d$ without suffering from the curse of dimensionality. The numerical scheme which we propose in this work is inspired by recently developed deep learning based approximation algorithms for PDEs in the literature (cf., for example, [3], [4], [14], [15], [16], [24], [26], [51], [57]). To derive the proposed approximation scheme we first reformulate the considered Kolmogorov PDE as a suitable infinite dimensional stochastic optimization problem (see items (ii)–(iii) in Proposition 2.7 below for details).
This infinite dimensional stochastic optimization problem is then temporally discretized by means of suitable discretizations of the underlying SDE, and it is spatially discretized by means of fully connected deep artificial neural network approximations (see (17) in Subsection 2.6 as well as Subsections 2.4–2.5 below). The resulting finite dimensional stochastic optimization problem is then solved by means of stochastic gradient descent type optimization algorithms (see (19) in Subsection 2.6, Framework 2.9 in Subsection 2.7, Framework 2.10 in Subsection 2.8, as well as (124)–(125) in Subsection 3.1). We test the proposed approximation method numerically in the case of several examples of SDEs and PDEs, respectively (see Subsections 3.2–3.6 below for details). The obtained numerical results indicate that the proposed approximation algorithm is quite effective in high dimensions in terms of both accuracy and speed. The remainder of this article is organized as follows. In Section 2 we derive the proposed approximation algorithm (see Subsections 2.1–2.6 below) and we present a detailed description of the proposed approximation algorithm in a special case (see Subsection 2.7 below) as well as in the general case (see Subsection 2.8 below). In Section 3 we test the proposed algorithm numerically in the case of several examples of SDEs and PDEs, respectively. The employed source codes for the numerical simulations in Section 3 are postponed to Section 4.
2 Derivation and description of the proposed approximation algorithm

In this section we describe the approximation problem which we intend to solve (see Subsection 2.1 below) and we derive (see Subsections 2.2–2.6 below) and specify (see Subsections 2.7–2.8 below) the numerical scheme which we suggest to use to solve this approximation problem (cf., for example, E et al. [14], Han et al. [24], Sirignano & Spiliopoulos [57], Beck et al. [3], Fujii, Takahashi, A., & Takahashi, M. [16], and Henry-Labordere [26] for related derivations and related approximation schemes).

2.1 Kolmogorov partial differential equations (PDEs)

Let $T \in (0,\infty)$, $d \in \mathbb{N}$, let $\mu \colon \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma \colon \mathbb{R}^d \to \mathbb{R}^{d \times d}$ be Lipschitz continuous functions, let $\varphi \colon \mathbb{R}^d \to \mathbb{R}$ be a function, and let $u = (u(t,x))_{(t,x) \in [0,T] \times \mathbb{R}^d} \in C^{1,2}([0,T] \times \mathbb{R}^d, \mathbb{R})$ be a function with at most polynomially growing partial derivatives which satisfies for every $t \in [0,T]$, $x \in \mathbb{R}^d$ that $u(0,x) = \varphi(x)$ and

$$ \tfrac{\partial u}{\partial t}(t,x) = \tfrac{1}{2} \operatorname{Trace}_{\mathbb{R}^d}\!\big( \sigma(x) [\sigma(x)]^* (\operatorname{Hess}_x u)(t,x) \big) + \langle \mu(x), (\nabla_x u)(t,x) \rangle_{\mathbb{R}^d}. \quad (1) $$

Our goal is to approximately calculate the function $\mathbb{R}^d \ni x \mapsto u(T,x) \in \mathbb{R}$ on some subset of $\mathbb{R}^d$. To fix ideas we consider real numbers $a, b \in \mathbb{R}$ with $a < b$ and we suppose that our goal is to approximately calculate the function $[a,b]^d \ni x \mapsto u(T,x) \in \mathbb{R}$.

2.2 On stochastic differential equations and Kolmogorov PDEs

In this subsection we provide a probabilistic representation for the solutions of the PDE (1), that is, we recall the classical Feynman-Kac formula for the PDE (1) (cf., for example, Øksendal [50, Chapter 8]). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space with a normal filtration $(\mathbb{F}_t)_{t \in [0,T]}$, let $W \colon [0,T] \times \Omega \to \mathbb{R}^d$ be a standard $(\Omega, \mathcal{F}, \mathbb{P}, (\mathbb{F}_t)_{t \in [0,T]})$-Brownian motion, and for every $x \in \mathbb{R}^d$ let $X^x = (X^x_t)_{t \in [0,T]} \colon [0,T] \times \Omega \to \mathbb{R}^d$ be an $(\mathbb{F}_t)_{t \in [0,T]}$-adapted stochastic process with continuous sample paths which satisfies that for every $t \in [0,T]$ it holds $\mathbb{P}$-a.s. that

$$ X^x_t = x + \int_0^t \mu(X^x_s) \, ds + \int_0^t \sigma(X^x_s) \, dW_s. \quad (2) $$

The Feynman-Kac formula (cf., for example, Hairer et al.
[22, Corollary 4.17 and Remark 4.1]) and (1) hence yield that for every $x \in \mathbb{R}^d$ it holds that

$$ u(T,x) = \mathbb{E}\big[ u(0, X^x_T) \big] = \mathbb{E}\big[ \varphi(X^x_T) \big]. \quad (3) $$
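The representation (3) can be checked numerically in a case where the SDE (2) has an explicit solution. The sketch below (our illustration, not part of the paper) takes $\mu = 0$ and $\sigma = \sqrt{2} \cdot \mathrm{Id}$, so that (1) becomes the heat equation $\tfrac{\partial u}{\partial t} = \Delta_x u$, the process is $X^x_T = x + \sqrt{2} \, W_T$, and for $\varphi(x) = \|x\|^2$ the exact solution is $u(T,x) = \|x\|^2 + 2 d T$.

```python
import numpy as np

def feynman_kac_mc(x, T, n_samples=200_000, seed=0):
    # Monte Carlo estimate of u(T, x) = E[phi(X_T^x)] for the heat
    # equation case mu = 0, sigma = sqrt(2) * Id, phi(y) = ||y||^2,
    # where X_T^x = x + sqrt(2) * W_T and W_T ~ N(0, T * Id).
    rng = np.random.default_rng(seed)
    d = len(x)
    X_T = x + np.sqrt(2.0 * T) * rng.standard_normal((n_samples, d))
    return np.mean(np.sum(X_T**2, axis=1))

d, T = 10, 1.0
x = np.ones(d)
exact = float(x @ x) + 2.0 * d * T        # u(T, x) = ||x||^2 + 2 d T
estimate = feynman_kac_mc(x, T)
assert abs(estimate - exact) / exact < 0.02
```

As the paper emphasizes, such a Monte Carlo evaluation is dimension-robust but yields $u(T,x)$ only at the single point $x$; the algorithm derived below approximates $u(T,\cdot)$ on all of $[a,b]^d$ at once.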
2.3 Formulation as minimization problem

In the next step we exploit (3) to formulate a minimization problem which is uniquely solved by the function $[a,b]^d \ni x \mapsto u(T,x) \in \mathbb{R}$ (cf. (1) above). For this we first recall the $L^2$-minimization property of the expectation of a real-valued random variable (see Lemma 2.1 below). Then we extend this minimization result to certain random fields (see Proposition 2.2 below). Thereafter, we apply Proposition 2.2 to random fields in the context of the Feynman-Kac representation (3) to obtain Proposition 2.7 below. Proposition 2.7 provides a minimization problem (see, for instance, (93) below) which has the function $[a,b]^d \ni x \mapsto u(T,x) \in \mathbb{R}$ as the unique global minimizer. Our proof of Proposition 2.7 is based on the elementary auxiliary results in Lemmas 2.1–2.6. For completeness we also present the proofs of these lemmas here. The statement and the proof of Lemma 2.3 are based on the proof of Da Prato & Zabczyk [13, Lemma 1.1].

Lemma 2.1. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and let $X \colon \Omega \to \mathbb{R}$ be an $\mathcal{F}/\mathcal{B}(\mathbb{R})$-measurable random variable which satisfies $\mathbb{E}[|X|^2] < \infty$. Then

(i) it holds for every $y \in \mathbb{R}$ that

$$ \mathbb{E}\big[ |X - y|^2 \big] = \mathbb{E}\big[ |X - \mathbb{E}[X]|^2 \big] + |\mathbb{E}[X] - y|^2, \quad (4) $$

(ii) it holds that there exists a unique real number $z \in \mathbb{R}$ such that

$$ \mathbb{E}\big[ |X - z|^2 \big] = \inf_{y \in \mathbb{R}} \mathbb{E}\big[ |X - y|^2 \big], \quad (5) $$

and

(iii) it holds that

$$ \mathbb{E}\big[ |X - \mathbb{E}[X]|^2 \big] = \inf_{y \in \mathbb{R}} \mathbb{E}\big[ |X - y|^2 \big]. \quad (6) $$

Proof of Lemma 2.1. Observe that the fact that $\mathbb{E}[|X|] < \infty$ ensures that for every $y \in \mathbb{R}$ it holds that

$$ \begin{aligned} \mathbb{E}\big[ |X - y|^2 \big] &= \mathbb{E}\big[ |X - \mathbb{E}[X] + \mathbb{E}[X] - y|^2 \big] \\ &= \mathbb{E}\big[ |X - \mathbb{E}[X]|^2 + 2 (X - \mathbb{E}[X]) (\mathbb{E}[X] - y) + |\mathbb{E}[X] - y|^2 \big] \\ &= \mathbb{E}\big[ |X - \mathbb{E}[X]|^2 \big] + 2 (\mathbb{E}[X] - y) \, \mathbb{E}\big[ X - \mathbb{E}[X] \big] + |\mathbb{E}[X] - y|^2 \\ &= \mathbb{E}\big[ |X - \mathbb{E}[X]|^2 \big] + |\mathbb{E}[X] - y|^2. \end{aligned} \quad (7) $$

This establishes item (i). Item (ii) and item (iii) are immediate consequences of item (i). The proof of Lemma 2.1 is thus completed.
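Identity (4) says that the $L^2$ error of any constant predictor splits into an irreducible variance term plus a squared bias, so the expectation is the best constant predictor in the $L^2$ sense. This is easy to verify numerically; the sketch below (our illustration) checks (4) on simulated data.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.exponential(scale=2.0, size=100_000)  # any square-integrable X

def mse(y):
    # empirical counterpart of E[|X - y|^2]
    return np.mean((X - y) ** 2)

m = X.mean()
for y in (-1.0, 0.5, 3.7):
    # item (i): E[|X - y|^2] = E[|X - m|^2] + |m - y|^2
    assert abs(mse(y) - (mse(m) + (m - y) ** 2)) < 1e-9
    # item (iii): the mean is the L^2-best constant predictor
    assert mse(m) <= mse(y)
```

The identity holds exactly for the empirical measure (up to floating-point error), since the sample mean of $X - m$ vanishes by construction.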
Proposition 2.2. Let $a \in \mathbb{R}$, $b \in (a,\infty)$, $d \in \mathbb{N}$, let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, let $X = (X^x)_{x \in [a,b]^d} \colon [a,b]^d \times \Omega \to \mathbb{R}$ be a $(\mathcal{B}([a,b]^d) \otimes \mathcal{F})/\mathcal{B}(\mathbb{R})$-measurable function, assume for every $x \in [a,b]^d$ that $\mathbb{E}[|X^x|^2] < \infty$, and assume that the function $[a,b]^d \ni x \mapsto \mathbb{E}[X^x] \in \mathbb{R}$ is continuous. Then

(i) it holds that there exists a unique continuous function $u \colon [a,b]^d \to \mathbb{R}$ such that

$$ \int_{[a,b]^d} \mathbb{E}\big[ |X^x - u(x)|^2 \big] \, dx = \inf_{v \in C([a,b]^d, \mathbb{R})} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |X^x - v(x)|^2 \big] \, dx \bigg) \quad (8) $$

and

(ii) it holds for every $x \in [a,b]^d$ that $u(x) = \mathbb{E}[X^x]$.

Proof of Proposition 2.2. Observe that item (i) in Lemma 2.1 and the hypothesis that $\forall \, x \in [a,b]^d \colon \mathbb{E}[|X^x|^2] < \infty$ ensure that for every function $u \colon [a,b]^d \to \mathbb{R}$ and every $x \in [a,b]^d$ it holds that

$$ \mathbb{E}\big[ |X^x - u(x)|^2 \big] = \mathbb{E}\big[ |X^x - \mathbb{E}[X^x]|^2 \big] + |\mathbb{E}[X^x] - u(x)|^2. \quad (9) $$

Fubini's theorem (see, e.g., Klenke [39, Theorem 14.16]) hence proves that for every continuous function $u \colon [a,b]^d \to \mathbb{R}$ it holds that

$$ \int_{[a,b]^d} \mathbb{E}\big[ |X^x - u(x)|^2 \big] \, dx = \int_{[a,b]^d} \mathbb{E}\big[ |X^x - \mathbb{E}[X^x]|^2 \big] \, dx + \int_{[a,b]^d} |\mathbb{E}[X^x] - u(x)|^2 \, dx. \quad (10) $$

The hypothesis that the function $[a,b]^d \ni x \mapsto \mathbb{E}[X^x] \in \mathbb{R}$ is continuous (and hence an admissible choice of $v$) therefore demonstrates that

$$ \begin{aligned} \int_{[a,b]^d} \mathbb{E}\big[ |X^x - \mathbb{E}[X^x]|^2 \big] \, dx &\geq \inf_{v \in C([a,b]^d, \mathbb{R})} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |X^x - v(x)|^2 \big] \, dx \bigg) \\ &= \inf_{v \in C([a,b]^d, \mathbb{R})} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |X^x - \mathbb{E}[X^x]|^2 \big] \, dx + \int_{[a,b]^d} |\mathbb{E}[X^x] - v(x)|^2 \, dx \bigg) \\ &\geq \inf_{v \in C([a,b]^d, \mathbb{R})} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |X^x - \mathbb{E}[X^x]|^2 \big] \, dx \bigg) = \int_{[a,b]^d} \mathbb{E}\big[ |X^x - \mathbb{E}[X^x]|^2 \big] \, dx. \end{aligned} \quad (11) $$

Hence, we obtain that

$$ \int_{[a,b]^d} \mathbb{E}\big[ |X^x - \mathbb{E}[X^x]|^2 \big] \, dx = \inf_{v \in C([a,b]^d, \mathbb{R})} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |X^x - v(x)|^2 \big] \, dx \bigg). \quad (12) $$
Again the fact that the function $[a,b]^d \ni x \mapsto \mathbb{E}[X^x] \in \mathbb{R}$ is continuous therefore proves that there exists a continuous function $u \colon [a,b]^d \to \mathbb{R}$ such that

$$ \int_{[a,b]^d} \mathbb{E}\big[ |X^x - u(x)|^2 \big] \, dx = \inf_{v \in C([a,b]^d, \mathbb{R})} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |X^x - v(x)|^2 \big] \, dx \bigg). \quad (13) $$

Next observe that (10) and (12) yield that for every continuous function $u \colon [a,b]^d \to \mathbb{R}$ with

$$ \int_{[a,b]^d} \mathbb{E}\big[ |X^x - u(x)|^2 \big] \, dx = \inf_{v \in C([a,b]^d, \mathbb{R})} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |X^x - v(x)|^2 \big] \, dx \bigg) \quad (14) $$

it holds that

$$ \begin{aligned} \int_{[a,b]^d} \mathbb{E}\big[ |X^x - \mathbb{E}[X^x]|^2 \big] \, dx &= \inf_{v \in C([a,b]^d, \mathbb{R})} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |X^x - v(x)|^2 \big] \, dx \bigg) = \int_{[a,b]^d} \mathbb{E}\big[ |X^x - u(x)|^2 \big] \, dx \\ &= \int_{[a,b]^d} \mathbb{E}\big[ |X^x - \mathbb{E}[X^x]|^2 \big] \, dx + \int_{[a,b]^d} |\mathbb{E}[X^x] - u(x)|^2 \, dx. \end{aligned} \quad (15) $$

Hence, we obtain that for every continuous function $u \colon [a,b]^d \to \mathbb{R}$ with

$$ \int_{[a,b]^d} \mathbb{E}\big[ |X^x - u(x)|^2 \big] \, dx = \inf_{v \in C([a,b]^d, \mathbb{R})} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |X^x - v(x)|^2 \big] \, dx \bigg) \quad (16) $$

it holds that

$$ \int_{[a,b]^d} |\mathbb{E}[X^x] - u(x)|^2 \, dx = 0. \quad (17) $$

This and again the hypothesis that the function $[a,b]^d \ni x \mapsto \mathbb{E}[X^x] \in \mathbb{R}$ is continuous yield that for every continuous function $u \colon [a,b]^d \to \mathbb{R}$ with

$$ \int_{[a,b]^d} \mathbb{E}\big[ |X^x - u(x)|^2 \big] \, dx = \inf_{v \in C([a,b]^d, \mathbb{R})} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |X^x - v(x)|^2 \big] \, dx \bigg) \quad (18) $$

and every $x \in [a,b]^d$ it holds that $u(x) = \mathbb{E}[X^x]$. Combining this with (13) completes the proof of Proposition 2.2.

Lemma 2.3 (Projections in metric spaces). Let $(E, d)$ be a metric space, let $n \in \mathbb{N}$, $e_1, e_2, \ldots, e_n \in E$, and let $P \colon E \to E$ be the function which satisfies for every $x \in E$ that

$$ P(x) = e_{\min\{ k \in \{1,2,\ldots,n\} \colon d(x,e_k) = \min\{ d(x,e_1), d(x,e_2), \ldots, d(x,e_n) \} \}}. \quad (19) $$

Then
(i) it holds for every $x \in E$ that

$$ d(x, P(x)) = \min_{k \in \{1,2,\ldots,n\}} d(x, e_k) \quad (20) $$

and

(ii) it holds for every $A \subseteq E$ that $P^{-1}(A) \in \mathcal{B}(E)$.

Proof of Lemma 2.3. Throughout this proof let $D = (D_1, \ldots, D_n) \colon E \to \mathbb{R}^n$ be the function which satisfies for every $x \in E$ that

$$ D(x) = (D_1(x), D_2(x), \ldots, D_n(x)) = (d(x,e_1), d(x,e_2), \ldots, d(x,e_n)). \quad (21) $$

Note that (19) ensures that for every $x \in E$ it holds that

$$ d(x, P(x)) = d\big( x, e_{\min\{ k \in \{1,2,\ldots,n\} \colon d(x,e_k) = \min\{ d(x,e_1), d(x,e_2), \ldots, d(x,e_n) \} \}} \big) = \min_{k \in \{1,2,\ldots,n\}} d(x, e_k). \quad (22) $$

This establishes item (i). It thus remains to prove item (ii). For this observe that the fact that the function $d \colon E \times E \to [0,\infty)$ is continuous ensures that the function $D \colon E \to \mathbb{R}^n$ is continuous. Hence, we obtain that the function $D \colon E \to \mathbb{R}^n$ is $\mathcal{B}(E)/\mathcal{B}(\mathbb{R}^n)$-measurable. Next note that item (i) demonstrates that for every $k \in \{1,2,\ldots,n\}$, $x \in P^{-1}(\{e_k\})$ it holds that

$$ d(x, e_k) = d(x, P(x)) = \min_{l \in \{1,2,\ldots,n\}} d(x, e_l). \quad (23) $$

Hence, we obtain that for every $k \in \{1,2,\ldots,n\}$, $x \in P^{-1}(\{e_k\})$ it holds that

$$ k \geq \min\{ l \in \{1,2,\ldots,n\} \colon d(x,e_l) = \min\{ d(x,e_1), d(x,e_2), \ldots, d(x,e_n) \} \}. \quad (24) $$

Moreover, note that (19) ensures that for every $k \in \{1,2,\ldots,n\}$, $x \in P^{-1}(\{e_k\})$ it holds that

$$ \min\Big\{ l \in \{1,2,\ldots,n\} \colon d(x,e_l) = \min_{u \in \{1,2,\ldots,n\}} d(x,e_u) \Big\} \in \big\{ l \in \{1,2,\ldots,n\} \colon e_l = e_k \big\}. \quad (25) $$

Therefore, since for every $k \in \{1,2,\ldots,n\}$ with $e_k \notin ( \cup_{l \in \mathbb{N} \cap [0,k)} \{e_l\} )$ it holds that $\{ l \in \{1,2,\ldots,n\} \colon e_l = e_k \} \subseteq \{ k, k+1, \ldots, n \}$, we obtain that for every $k \in \{1,2,\ldots,n\}$, $x \in P^{-1}(\{e_k\})$ with $e_k \notin ( \cup_{l \in \mathbb{N} \cap [0,k)} \{e_l\} )$ it holds that

$$ \min\Big\{ l \in \{1,2,\ldots,n\} \colon d(x,e_l) = \min_{u \in \{1,2,\ldots,n\}} d(x,e_u) \Big\} \geq k. \quad (26) $$

Combining this with (24) yields that for every $k \in \{1,2,\ldots,n\}$, $x \in P^{-1}(\{e_k\})$ with $e_k \notin ( \cup_{l \in \mathbb{N} \cap [0,k)} \{e_l\} )$ it holds that

$$ \min\Big\{ l \in \{1,2,\ldots,n\} \colon d(x,e_l) = \min_{u \in \{1,2,\ldots,n\}} d(x,e_u) \Big\} = k. \quad (27) $$
Hence, we obtain that for every $k \in \{1,2,\ldots,n\}$ with $e_k \notin ( \cup_{l \in \mathbb{N} \cap [0,k)} \{e_l\} )$ it holds that

$$ P^{-1}(\{e_k\}) \subseteq \Big\{ x \in E \colon \min\Big\{ l \in \{1,2,\ldots,n\} \colon d(x,e_l) = \min_{u \in \{1,2,\ldots,n\}} d(x,e_u) \Big\} = k \Big\}. \quad (28) $$

This and (19) show that for every $k \in \{1,2,\ldots,n\}$ with $e_k \notin ( \cup_{l \in \mathbb{N} \cap [0,k)} \{e_l\} )$ it holds that

$$ P^{-1}(\{e_k\}) = \Big\{ x \in E \colon \min\Big\{ l \in \{1,2,\ldots,n\} \colon d(x,e_l) = \min_{u \in \{1,2,\ldots,n\}} d(x,e_u) \Big\} = k \Big\}. \quad (29) $$

Combining (21) with the fact that the function $D \colon E \to \mathbb{R}^n$ is $\mathcal{B}(E)/\mathcal{B}(\mathbb{R}^n)$-measurable therefore demonstrates that for every $k \in \{1,2,\ldots,n\}$ with $e_k \notin ( \cup_{l \in \mathbb{N} \cap [0,k)} \{e_l\} )$ it holds that

$$ \begin{aligned} P^{-1}(\{e_k\}) &= \Big\{ x \in E \colon \min\Big\{ l \in \{1,2,\ldots,n\} \colon d(x,e_l) = \min_{u \in \{1,2,\ldots,n\}} d(x,e_u) \Big\} = k \Big\} \\ &= \Big\{ x \in E \colon \min\Big\{ l \in \{1,2,\ldots,n\} \colon D_l(x) = \min_{u \in \{1,2,\ldots,n\}} D_u(x) \Big\} = k \Big\} \\ &= \Big\{ x \in E \colon \big( \forall \, l \in \mathbb{N} \cap [0,k) \colon D_k(x) < D_l(x) \big) \text{ and } \big( \forall \, l \in \{1,2,\ldots,n\} \colon D_k(x) \leq D_l(x) \big) \Big\} \\ &= \bigg[ \bigcap_{l=1}^{k-1} \underbrace{\{ x \in E \colon D_k(x) < D_l(x) \}}_{\in \, \mathcal{B}(E)} \bigg] \cap \bigg[ \bigcap_{l=1}^{n} \underbrace{\{ x \in E \colon D_k(x) \leq D_l(x) \}}_{\in \, \mathcal{B}(E)} \bigg] \in \mathcal{B}(E). \end{aligned} \quad (30) $$

Hence, we obtain that for every $f \in \{e_1, e_2, \ldots, e_n\}$ it holds that

$$ P^{-1}(\{f\}) \in \mathcal{B}(E). \quad (31) $$

Therefore, we obtain that for every $A \subseteq E$ it holds that

$$ P^{-1}(A) = P^{-1}(A \cap \{e_1, e_2, \ldots, e_n\}) = \bigcup_{f \in A \cap \{e_1, e_2, \ldots, e_n\}} \underbrace{P^{-1}(\{f\})}_{\in \, \mathcal{B}(E)} \in \mathcal{B}(E). \quad (32) $$

This establishes item (ii). The proof of Lemma 2.3 is thus completed.

Lemma 2.4. Let $(E, d)$ be a separable metric space, let $(\mathcal{E}, \delta)$ be a metric space, let $(\Omega, \mathcal{F})$ be a measurable space, let $X \colon E \times \Omega \to \mathcal{E}$ be a function, assume for every $e \in E$ that the function $\Omega \ni \omega \mapsto X(e, \omega) \in \mathcal{E}$ is $\mathcal{F}/\mathcal{B}(\mathcal{E})$-measurable, and assume for every $\omega \in \Omega$ that the function $E \ni e \mapsto X(e, \omega) \in \mathcal{E}$ is continuous. Then it holds that the function $X \colon E \times \Omega \to \mathcal{E}$ is $(\mathcal{B}(E) \otimes \mathcal{F})/\mathcal{B}(\mathcal{E})$-measurable.
Proof of Lemma 2.4. Throughout this proof let $(e_m)_{m \in \mathbb{N}} \subseteq E$ be a sequence which satisfies that $\overline{\{ e_m \colon m \in \mathbb{N} \}} = E$, let $P_n \colon E \to E$, $n \in \mathbb{N}$, be the functions which satisfy for every $n \in \mathbb{N}$, $x \in E$ that

$$ P_n(x) = e_{\min\{ k \in \{1,2,\ldots,n\} \colon d(x,e_k) = \min\{ d(x,e_1), d(x,e_2), \ldots, d(x,e_n) \} \}}, \quad (33) $$

and let $X_n \colon E \times \Omega \to \mathcal{E}$, $n \in \mathbb{N}$, be the functions which satisfy for every $n \in \mathbb{N}$, $x \in E$, $\omega \in \Omega$ that

$$ X_n(x, \omega) = X(P_n(x), \omega). \quad (34) $$

Note that (34) shows that for all $n \in \mathbb{N}$, $B \in \mathcal{B}(\mathcal{E})$ it holds that

$$ \begin{aligned} (X_n)^{-1}(B) &= \{ (x,\omega) \in E \times \Omega \colon X_n(x,\omega) \in B \} = \bigcup_{y \in \operatorname{Im}(P_n)} \Big( \big[ (X_n)^{-1}(B) \big] \cap \big[ (P_n)^{-1}(\{y\}) \times \Omega \big] \Big) \\ &= \bigcup_{y \in \operatorname{Im}(P_n)} \big\{ (x,\omega) \in E \times \Omega \colon X_n(x,\omega) \in B \text{ and } x \in (P_n)^{-1}(\{y\}) \big\} \\ &= \bigcup_{y \in \operatorname{Im}(P_n)} \big\{ (x,\omega) \in E \times \Omega \colon X(P_n(x),\omega) \in B \text{ and } x \in (P_n)^{-1}(\{y\}) \big\}. \end{aligned} \quad (35) $$

Item (ii) in Lemma 2.3 hence implies that for all $n \in \mathbb{N}$, $B \in \mathcal{B}(\mathcal{E})$ it holds that

$$ \begin{aligned} (X_n)^{-1}(B) &= \bigcup_{y \in \operatorname{Im}(P_n)} \big\{ (x,\omega) \in E \times \Omega \colon X(y,\omega) \in B \text{ and } x \in (P_n)^{-1}(\{y\}) \big\} \\ &= \bigcup_{y \in \operatorname{Im}(P_n)} \Big( \{ (x,\omega) \in E \times \Omega \colon X(y,\omega) \in B \} \cap \big[ (P_n)^{-1}(\{y\}) \times \Omega \big] \Big) \\ &= \bigcup_{y \in \operatorname{Im}(P_n)} \Big( \underbrace{\big[ E \times \big( (X(y,\cdot))^{-1}(B) \big) \big]}_{\in \, \mathcal{B}(E) \otimes \mathcal{F}} \cap \underbrace{\big[ (P_n)^{-1}(\{y\}) \times \Omega \big]}_{\in \, \mathcal{B}(E) \otimes \mathcal{F}} \Big) \in \mathcal{B}(E) \otimes \mathcal{F}. \end{aligned} \quad (36) $$

This proves that for every $n \in \mathbb{N}$ it holds that the function $X_n$ is $(\mathcal{B}(E) \otimes \mathcal{F})/\mathcal{B}(\mathcal{E})$-measurable. In addition, note that item (i) in Lemma 2.3 and the hypothesis that for every $\omega \in \Omega$ it holds that the function $E \ni x \mapsto X(x,\omega) \in \mathcal{E}$ is continuous imply that for every $x \in E$, $\omega \in \Omega$ it holds that

$$ \lim_{n \to \infty} X_n(x,\omega) = \lim_{n \to \infty} X(P_n(x),\omega) = X(x,\omega). \quad (37) $$

Combining this with the fact that for every $n \in \mathbb{N}$ it holds that the function $X_n \colon E \times \Omega \to \mathcal{E}$ is $(\mathcal{B}(E) \otimes \mathcal{F})/\mathcal{B}(\mathcal{E})$-measurable shows that the function $X \colon E \times \Omega \to \mathcal{E}$ is $(\mathcal{B}(E) \otimes \mathcal{F})/\mathcal{B}(\mathcal{E})$-measurable. The proof of Lemma 2.4 is thus completed.
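The projection $P_n$ from (33) is just a nearest-neighbour map with ties broken by the smallest index, and along a dense sequence it converges pointwise to the identity, which is exactly what (37) uses. A small sketch (our illustration; the dyadic sequence below is a hypothetical choice of a sequence dense in the metric space $[0,1]$):

```python
import numpy as np

def project(x, points):
    # P_n from (33): the nearest point among `points`, ties broken
    # by the smallest index (np.argmin returns the first minimizer).
    dists = np.abs(np.asarray(points) - x)
    return points[int(np.argmin(dists))]

def dense_sequence(n):
    # e_1, e_2, ...: dyadic rationals, dense in [0, 1]:
    # 0, 1, 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, ...
    pts = [0.0, 1.0]
    level = 2
    while len(pts) < n:
        pts.extend(k / level for k in range(1, level, 2))
        level *= 2
    return pts[:n]

x = 0.3
errors = [abs(project(x, dense_sequence(n)) - x) for n in (2, 8, 32, 128)]
assert all(e2 <= e1 for e1, e2 in zip(errors, errors[1:]))  # P_n(x) -> x
assert errors[-1] < 0.01
```

Measurability of $P_n$ (item (ii) of Lemma 2.3) is what makes the approximations $X_n(x,\omega) = X(P_n(x),\omega)$ jointly measurable, and the pointwise limit then transfers joint measurability to $X$ itself.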
Lemma 2.5. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, let $(E, d)$ and $(\mathcal{E}, \delta)$ be separable metric spaces, let $X_n \colon \Omega \to E$, $n \in \mathbb{N}_0$, be random variables which satisfy for every $\varepsilon \in (0,\infty)$ that

$$ \limsup_{n \to \infty} \mathbb{P}\big( d(X_n, X_0) \geq \varepsilon \big) = 0, \quad (38) $$

and let $\Phi \colon E \to \mathcal{E}$ be a continuous function. Then it holds for every $\varepsilon \in (0,\infty)$ that

$$ \limsup_{n \to \infty} \mathbb{P}\big( \delta(\Phi(X_n), \Phi(X_0)) \geq \varepsilon \big) = 0. \quad (39) $$

Proof of Lemma 2.5. Note that (38), e.g., Cox et al. [12, Lemma 2.4], and, e.g., Hutzenthaler et al. [35, Lemma 4.2] establish (39). The proof of Lemma 2.5 is thus completed.

Lemma 2.6. Let $d, m \in \mathbb{N}$, $T \in (0,\infty)$, $L, a \in \mathbb{R}$, $b \in (a,\infty)$, let $\mu \colon \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma \colon \mathbb{R}^d \to \mathbb{R}^{d \times m}$ be functions which satisfy for every $x, y \in \mathbb{R}^d$ that

$$ \max\{ \| \mu(x) - \mu(y) \|_{\mathbb{R}^d}, \| \sigma(x) - \sigma(y) \|_{\mathrm{HS}(\mathbb{R}^m, \mathbb{R}^d)} \} \leq L \, \| x - y \|_{\mathbb{R}^d}, $$

let $\Phi \colon C([0,T], \mathbb{R}^d) \to \mathbb{R}$ be an at most polynomially growing continuous function, let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space with a normal filtration $(\mathbb{F}_t)_{t \in [0,T]}$, let $\xi \colon \Omega \to [a,b]^d$ be a continuously uniformly distributed $\mathbb{F}_0/\mathcal{B}([a,b]^d)$-measurable random variable, let $W \colon [0,T] \times \Omega \to \mathbb{R}^m$ be a standard $(\mathbb{F}_t)_{t \in [0,T]}$-Brownian motion, for every $x \in [a,b]^d$ let $X^x = (X^x_t)_{t \in [0,T]} \colon [0,T] \times \Omega \to \mathbb{R}^d$ be an $(\mathbb{F}_t)_{t \in [0,T]}$-adapted stochastic process with continuous sample paths which satisfies that for every $t \in [0,T]$ it holds $\mathbb{P}$-a.s. that

$$ X^x_t = x + \int_0^t \mu(X^x_s) \, ds + \int_0^t \sigma(X^x_s) \, dW_s, \quad (40) $$

and let $\mathbb{X} \colon [0,T] \times \Omega \to \mathbb{R}^d$ be an $(\mathbb{F}_t)_{t \in [0,T]}$-adapted stochastic process with continuous sample paths which satisfies that for every $t \in [0,T]$ it holds $\mathbb{P}$-a.s. that

$$ \mathbb{X}_t = \xi + \int_0^t \mu(\mathbb{X}_s) \, ds + \int_0^t \sigma(\mathbb{X}_s) \, dW_s. \quad (41) $$

Then

(i) it holds for every $x \in [a,b]^d$ that the functions $\Omega \ni \omega \mapsto \Phi((X^x_t(\omega))_{t \in [0,T]}) \in \mathbb{R}$ and $\Omega \ni \omega \mapsto \Phi((\mathbb{X}_t(\omega))_{t \in [0,T]}) \in \mathbb{R}$ are $\mathcal{F}/\mathcal{B}(\mathbb{R})$-measurable,

(ii) it holds for every $p \in [2,\infty)$, $x, y \in [a,b]^d$ that

$$ \Big( \mathbb{E}\Big[ \sup_{t \in [0,T]} \| X^x_t - X^y_t \|_{\mathbb{R}^d}^p \Big] \Big)^{1/p} \leq \sqrt{2} \, \exp\!\big( L^2 T \big[ \sqrt{p} + \sqrt{T} \big]^2 \big) \, \| x - y \|_{\mathbb{R}^d}, \quad (42) $$

(iii) it holds for every $x \in [a,b]^d$ that $\mathbb{E}\big[ |\Phi((X^x_t)_{t \in [0,T]})| + |\Phi((\mathbb{X}_t)_{t \in [0,T]})| \big] < \infty$,

(iv) it holds that the function $[a,b]^d \ni x \mapsto \mathbb{E}[\Phi((X^x_t)_{t \in [0,T]})] \in \mathbb{R}$ is continuous, and
(v) it holds that

$$ \mathbb{E}\big[ \Phi((\mathbb{X}_t)_{t \in [0,T]}) \big] = \frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[ \Phi((X^x_t)_{t \in [0,T]}) \big] \, dx. \quad (43) $$

Proof of Lemma 2.6. Throughout this proof let $c \in [1,\infty)$ be a real number which satisfies for every $w \in C([0,T], \mathbb{R}^d)$ that

$$ |\Phi(w)| \leq c \Big[ 1 + \sup_{t \in [0,T]} \| w_t \|_{\mathbb{R}^d}^c \Big], \quad (44) $$

let $p_t \colon C([0,T], \mathbb{R}^d) \to \mathbb{R}^d$, $t \in [0,T]$, be the functions which satisfy for every $t \in [0,T]$, $w = (w_s)_{s \in [0,T]} \in C([0,T], \mathbb{R}^d)$ that $p_t(w) = w_t$, and let $\Psi^N_{x,w} \colon [0,T] \to \mathbb{R}^d$, $N \in \mathbb{N}$, $x \in \mathbb{R}^d$, $w \in C([0,T], \mathbb{R}^m)$, be the functions which satisfy for every $w \in C([0,T], \mathbb{R}^m)$, $x \in \mathbb{R}^d$, $N \in \mathbb{N}$, $n \in \{0,1,\ldots,N-1\}$, $t \in [\frac{nT}{N}, \frac{(n+1)T}{N}]$ that $\Psi^N_{x,w}(0) = x$ and

$$ \Psi^N_{x,w}(t) = \Psi^N_{x,w}\big( \tfrac{nT}{N} \big) + \big( \tfrac{tN}{T} - n \big) \Big[ \mu\big( \Psi^N_{x,w}\big( \tfrac{nT}{N} \big) \big) \tfrac{T}{N} + \sigma\big( \Psi^N_{x,w}\big( \tfrac{nT}{N} \big) \big) \big( w_{\frac{(n+1)T}{N}} - w_{\frac{nT}{N}} \big) \Big]. \quad (45) $$

Observe that the fact that the Borel sigma-algebra $\mathcal{B}(C([0,T], \mathbb{R}^d))$ is generated by the set $\cup_{t \in [0,T]} \cup_{A \in \mathcal{B}(\mathbb{R}^d)} \{ (p_t)^{-1}(A) \}$ (cf., for example, Klenke [39, Theorem 21.31]), the hypothesis that for every $x \in [a,b]^d$ it holds that $X^x \colon [0,T] \times \Omega \to \mathbb{R}^d$ is an $(\mathbb{F}_t)_{t \in [0,T]}$-adapted stochastic process with continuous sample paths, and the hypothesis that $\mathbb{X} \colon [0,T] \times \Omega \to \mathbb{R}^d$ is an $(\mathbb{F}_t)_{t \in [0,T]}$-adapted stochastic process with continuous sample paths demonstrate that the functions

$$ \Omega \ni \omega \mapsto (X^x_t(\omega))_{t \in [0,T]} \in C([0,T], \mathbb{R}^d) \quad (46) $$

and

$$ \Omega \ni \omega \mapsto (\mathbb{X}_t(\omega))_{t \in [0,T]} \in C([0,T], \mathbb{R}^d) \quad (47) $$

are $\mathcal{F}/\mathcal{B}(C([0,T], \mathbb{R}^d))$-measurable. Combining this with the fact that the function $\Phi \colon C([0,T], \mathbb{R}^d) \to \mathbb{R}$ is $\mathcal{B}(C([0,T], \mathbb{R}^d))/\mathcal{B}(\mathbb{R})$-measurable implies that for every $x \in [a,b]^d$ it holds that the functions $\Omega \ni \omega \mapsto \Phi((X^x_t(\omega))_{t \in [0,T]}) \in \mathbb{R}$ and $\Omega \ni \omega \mapsto \Phi((\mathbb{X}_t(\omega))_{t \in [0,T]}) \in \mathbb{R}$ are $\mathcal{F}/\mathcal{B}(\mathbb{R})$-measurable. This proves item (i).
Next observe that (45), the hypothesis that $\mu \colon \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma \colon \mathbb{R}^d \to \mathbb{R}^{d \times m}$ are globally Lipschitz continuous, and the fact that for every $p \in (0,\infty)$ it holds that $\mathbb{E}[\| \xi \|_{\mathbb{R}^d}^p] < \infty$ ensure that for every $p \in (0,\infty)$ it holds that

$$ \begin{aligned} & \sup_{N \in \mathbb{N}} \sup_{x \in [a,b]^d} \Big( \mathbb{E}\Big[ \sup_{t \in [0,T]} \| \Psi^N_{x,W}(t) \|_{\mathbb{R}^d}^p \Big] + \mathbb{E}\Big[ \sup_{t \in [0,T]} \| \Psi^N_{\xi,W}(t) \|_{\mathbb{R}^d}^p \Big] \Big) \\ & = \sup_{N \in \mathbb{N}} \sup_{x \in [a,b]^d} \Big( \mathbb{E}\Big[ \max_{n \in \{0,1,\ldots,N\}} \big\| \Psi^N_{x,W}\big( \tfrac{nT}{N} \big) \big\|_{\mathbb{R}^d}^p \Big] + \mathbb{E}\Big[ \max_{n \in \{0,1,\ldots,N\}} \big\| \Psi^N_{\xi,W}\big( \tfrac{nT}{N} \big) \big\|_{\mathbb{R}^d}^p \Big] \Big) < \infty \quad (48) $$
(cf., for example, Kloeden & Platen [41, Section 10.6]). Next note that (40), (41), (45), the fact that $\mu \colon \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma \colon \mathbb{R}^d \to \mathbb{R}^{d \times m}$ are locally Lipschitz continuous functions, and, e.g., Hutzenthaler & Jentzen [32, Theorem 3.3] ensure that for every $x \in [a,b]^d$, $\varepsilon \in (0,\infty)$ it holds that

$$ \limsup_{N \to \infty} \mathbb{P}\Big( \sup_{t \in [0,T]} \| X^x_t - \Psi^N_{x,W}(t) \|_{\mathbb{R}^d} \geq \varepsilon \Big) = 0 \quad (49) $$

and

$$ \limsup_{N \to \infty} \mathbb{P}\Big( \sup_{t \in [0,T]} \| \mathbb{X}_t - \Psi^N_{\xi,W}(t) \|_{\mathbb{R}^d} \geq \varepsilon \Big) = 0. \quad (50) $$

Combining (46), (47), and, e.g., Hutzenthaler & Jentzen [32, Lemma 3.1] hence demonstrates that for every $x \in [a,b]^d$, $p \in (0,\infty)$ it holds that

$$ \mathbb{E}\Big[ \sup_{t \in [0,T]} \| X^x_t \|_{\mathbb{R}^d}^p \Big] \leq \liminf_{N \to \infty} \mathbb{E}\Big[ \sup_{t \in [0,T]} \| \Psi^N_{x,W}(t) \|_{\mathbb{R}^d}^p \Big] \leq \sup_{N \in \mathbb{N}} \mathbb{E}\Big[ \sup_{t \in [0,T]} \| \Psi^N_{x,W}(t) \|_{\mathbb{R}^d}^p \Big] \quad (51) $$

and

$$ \mathbb{E}\Big[ \sup_{t \in [0,T]} \| \mathbb{X}_t \|_{\mathbb{R}^d}^p \Big] \leq \liminf_{N \to \infty} \mathbb{E}\Big[ \sup_{t \in [0,T]} \| \Psi^N_{\xi,W}(t) \|_{\mathbb{R}^d}^p \Big] \leq \sup_{N \in \mathbb{N}} \mathbb{E}\Big[ \sup_{t \in [0,T]} \| \Psi^N_{\xi,W}(t) \|_{\mathbb{R}^d}^p \Big]. \quad (52) $$

This and (48) assure that for every $p \in (0,\infty)$ it holds that

$$ \sup_{x \in [a,b]^d} \Big( \mathbb{E}\Big[ \sup_{t \in [0,T]} \| X^x_t \|_{\mathbb{R}^d}^p \Big] + \mathbb{E}\Big[ \sup_{t \in [0,T]} \| \mathbb{X}_t \|_{\mathbb{R}^d}^p \Big] \Big) < \infty. \quad (53) $$

Combining (44) and the fact that $\forall \, r \in (0,\infty), \, a, b \in \mathbb{R} \colon |a+b|^r \leq 2^r (|a|^r + |b|^r)$ therefore demonstrates that for every $p \in (0,\infty)$ it holds that

$$ \begin{aligned} & \sup_{x \in [a,b]^d} \Big( \mathbb{E}\big[ |\Phi((X^x_t)_{t \in [0,T]})|^p \big] + \mathbb{E}\big[ |\Phi((\mathbb{X}_t)_{t \in [0,T]})|^p \big] \Big) \\ & \leq \sup_{x \in [a,b]^d} \Big( c^p \, \mathbb{E}\Big[ \big| 1 + \sup_{t \in [0,T]} \| X^x_t \|_{\mathbb{R}^d}^c \big|^p \Big] + c^p \, \mathbb{E}\Big[ \big| 1 + \sup_{t \in [0,T]} \| \mathbb{X}_t \|_{\mathbb{R}^d}^c \big|^p \Big] \Big) \\ & \leq 2^p c^p \sup_{x \in [a,b]^d} \mathbb{E}\Big[ 2 + \sup_{t \in [0,T]} \| X^x_t \|_{\mathbb{R}^d}^{cp} + \sup_{t \in [0,T]} \| \mathbb{X}_t \|_{\mathbb{R}^d}^{cp} \Big] < \infty. \end{aligned} \quad (54) $$
This establishes item (iii). In the next step we observe that (40) ensures that for every $x, y \in [a,b]^d$, $t \in [0,T]$ it holds $\mathbb{P}$-a.s. that

$$ X^x_t - X^y_t = x - y + \int_0^t \big( \mu(X^x_s) - \mu(X^y_s) \big) \, ds + \int_0^t \big( \sigma(X^x_s) - \sigma(X^y_s) \big) \, dW_s. \quad (55) $$

The triangle inequality hence ensures that for every $x, y \in [a,b]^d$, $t \in [0,T]$ it holds $\mathbb{P}$-a.s. that

$$ \begin{aligned} \sup_{s \in [0,t]} \| X^x_s - X^y_s \|_{\mathbb{R}^d} & \leq \| x - y \|_{\mathbb{R}^d} + \sup_{s \in [0,t]} \int_0^s \| \mu(X^x_r) - \mu(X^y_r) \|_{\mathbb{R}^d} \, dr + \sup_{s \in [0,t]} \Big\| \int_0^s \big( \sigma(X^x_r) - \sigma(X^y_r) \big) \, dW_r \Big\|_{\mathbb{R}^d} \\ & \leq \| x - y \|_{\mathbb{R}^d} + L \int_0^t \| X^x_r - X^y_r \|_{\mathbb{R}^d} \, dr + \sup_{s \in [0,t]} \Big\| \int_0^s \big( \sigma(X^x_r) - \sigma(X^y_r) \big) \, dW_r \Big\|_{\mathbb{R}^d}. \end{aligned} \quad (56) $$

Therefore, we obtain for every $p \in [1,\infty)$, $x, y \in [a,b]^d$, $t \in [0,T]$ that

$$ \Big( \mathbb{E}\Big[ \sup_{s \in [0,t]} \| X^x_s - X^y_s \|_{\mathbb{R}^d}^p \Big] \Big)^{1/p} \leq \| x - y \|_{\mathbb{R}^d} + L \int_0^t \big( \mathbb{E}\big[ \| X^x_r - X^y_r \|_{\mathbb{R}^d}^p \big] \big)^{1/p} \, dr + \Big( \mathbb{E}\Big[ \sup_{s \in [0,t]} \Big\| \int_0^s \big( \sigma(X^x_r) - \sigma(X^y_r) \big) \, dW_r \Big\|_{\mathbb{R}^d}^p \Big] \Big)^{1/p}. \quad (57) $$

The Burkholder-Davis-Gundy type inequality in Da Prato & Zabczyk [13, Lemma 7.2] hence shows that for every $p \in [2,\infty)$, $x, y \in [a,b]^d$, $t \in [0,T]$ it holds that

$$ \Big( \mathbb{E}\Big[ \sup_{s \in [0,t]} \| X^x_s - X^y_s \|_{\mathbb{R}^d}^p \Big] \Big)^{1/p} \leq \| x - y \|_{\mathbb{R}^d} + L \int_0^t \big( \mathbb{E}\big[ \| X^x_r - X^y_r \|_{\mathbb{R}^d}^p \big] \big)^{1/p} \, dr + \sqrt{p} \, \bigg[ \int_0^t \big( \mathbb{E}\big[ \| \sigma(X^x_r) - \sigma(X^y_r) \|_{\mathrm{HS}(\mathbb{R}^m,\mathbb{R}^d)}^p \big] \big)^{2/p} \, dr \bigg]^{1/2}. \quad (58) $$

This demonstrates that for every $p \in [2,\infty)$, $x, y \in [a,b]^d$, $t \in [0,T]$ it holds that

$$ \Big( \mathbb{E}\Big[ \sup_{s \in [0,t]} \| X^x_s - X^y_s \|_{\mathbb{R}^d}^p \Big] \Big)^{1/p} \leq \| x - y \|_{\mathbb{R}^d} + L \int_0^t \big( \mathbb{E}\big[ \| X^x_r - X^y_r \|_{\mathbb{R}^d}^p \big] \big)^{1/p} \, dr + L \sqrt{p} \, \bigg[ \int_0^t \big( \mathbb{E}\big[ \| X^x_r - X^y_r \|_{\mathbb{R}^d}^p \big] \big)^{2/p} \, dr \bigg]^{1/2}. \quad (59) $$
Hölder's inequality hence proves that for every $p \in [2,\infty)$, $x, y \in [a,b]^d$, $t \in [0,T]$ it holds that

$$ \begin{aligned} \Big( \mathbb{E}\Big[ \sup_{s \in [0,t]} \| X^x_s - X^y_s \|_{\mathbb{R}^d}^p \Big] \Big)^{1/p} & \leq \| x - y \|_{\mathbb{R}^d} + L \sqrt{t} \, \bigg[ \int_0^t \big( \mathbb{E}\big[ \| X^x_r - X^y_r \|_{\mathbb{R}^d}^p \big] \big)^{2/p} \, dr \bigg]^{1/2} + L \sqrt{p} \, \bigg[ \int_0^t \big( \mathbb{E}\big[ \| X^x_r - X^y_r \|_{\mathbb{R}^d}^p \big] \big)^{2/p} \, dr \bigg]^{1/2} \\ & \leq \| x - y \|_{\mathbb{R}^d} + L \big[ \sqrt{p} + \sqrt{T} \big] \bigg[ \int_0^t \big( \mathbb{E}\big[ \| X^x_r - X^y_r \|_{\mathbb{R}^d}^p \big] \big)^{2/p} \, dr \bigg]^{1/2}. \end{aligned} \quad (60) $$

The fact that $\forall \, v, w \in \mathbb{R} \colon |v+w|^2 \leq 2 v^2 + 2 w^2$ therefore shows that for every $p \in [2,\infty)$, $x, y \in [a,b]^d$, $t \in [0,T]$ it holds that

$$ \begin{aligned} \Big( \mathbb{E}\Big[ \sup_{s \in [0,t]} \| X^x_s - X^y_s \|_{\mathbb{R}^d}^p \Big] \Big)^{2/p} & \leq 2 \, \| x - y \|_{\mathbb{R}^d}^2 + 2 L^2 \big[ \sqrt{p} + \sqrt{T} \big]^2 \int_0^t \big( \mathbb{E}\big[ \| X^x_r - X^y_r \|_{\mathbb{R}^d}^p \big] \big)^{2/p} \, dr \\ & \leq 2 \, \| x - y \|_{\mathbb{R}^d}^2 + 2 L^2 \big[ \sqrt{p} + \sqrt{T} \big]^2 \int_0^t \Big( \mathbb{E}\Big[ \sup_{s \in [0,r]} \| X^x_s - X^y_s \|_{\mathbb{R}^d}^p \Big] \Big)^{2/p} \, dr. \end{aligned} \quad (61) $$

Combining the Gronwall inequality (cf., e.g., Andersson et al. [2, Lemma 2.6] (with $\alpha = 0$, $\beta = 0$, $a = 2 \| x - y \|_{\mathbb{R}^d}^2$, $b = 2 L^2 [\sqrt{p} + \sqrt{T}]^2$, $e = ([0,T] \ni t \mapsto ( \mathbb{E}[ \sup_{s \in [0,t]} \| X^x_s - X^y_s \|_{\mathbb{R}^d}^p ] )^{2/p} \in [0,\infty])$ in the notation of that lemma)) and (53) hence establishes that for every $p \in [2,\infty)$, $x, y \in [a,b]^d$, $t \in [0,T]$ it holds that

$$ \Big( \mathbb{E}\Big[ \sup_{s \in [0,t]} \| X^x_s - X^y_s \|_{\mathbb{R}^d}^p \Big] \Big)^{2/p} \leq 2 \, \| x - y \|_{\mathbb{R}^d}^2 \exp\!\big( 2 L^2 t \big[ \sqrt{p} + \sqrt{T} \big]^2 \big). \quad (62) $$

Therefore, we obtain that for every $p \in [2,\infty)$, $x, y \in [a,b]^d$ it holds that

$$ \Big( \mathbb{E}\Big[ \sup_{t \in [0,T]} \| X^x_t - X^y_t \|_{\mathbb{R}^d}^p \Big] \Big)^{1/p} \leq \sqrt{2} \, \exp\!\big( L^2 T \big[ \sqrt{p} + \sqrt{T} \big]^2 \big) \, \| x - y \|_{\mathbb{R}^d}. \quad (63) $$

This establishes item (ii). Next observe that item (ii) and Jensen's inequality imply that for every $p \in (0,\infty)$ it holds that

$$ \sup_{x, y \in [a,b]^d, \, x \neq y} \frac{ \big( \mathbb{E}\big[ \sup_{t \in [0,T]} \| X^x_t - X^y_t \|_{\mathbb{R}^d}^p \big] \big)^{1/p} }{ \| x - y \|_{\mathbb{R}^d} } < \infty. \quad (64) $$
The hypothesis that the function $\Phi \colon C([0,T], \mathbb{R}^d) \to \mathbb{R}$ is continuous and Lemma 2.5 hence ensure that for every $\varepsilon \in (0,\infty)$, $x \in [a,b]^d$, $(x_n)_{n \in \mathbb{N}} \subseteq [a,b]^d$ with $\limsup_{n \to \infty} \| x - x_n \|_{\mathbb{R}^d} = 0$ it holds that

$$ \limsup_{n \to \infty} \mathbb{P}\big( |\Phi((X^x_t)_{t \in [0,T]}) - \Phi((X^{x_n}_t)_{t \in [0,T]})| \geq \varepsilon \big) = 0. \quad (65) $$

Combining (54) with, e.g., Hutzenthaler et al. [35, Proposition 4.5] therefore implies that for every $x \in [a,b]^d$, $(x_n)_{n \in \mathbb{N}} \subseteq [a,b]^d$ with $\limsup_{n \to \infty} \| x - x_n \|_{\mathbb{R}^d} = 0$ it holds that

$$ \limsup_{n \to \infty} \mathbb{E}\big[ |\Phi((X^x_t)_{t \in [0,T]}) - \Phi((X^{x_n}_t)_{t \in [0,T]})| \big] = 0. \quad (66) $$

This establishes item (iv). In the next step we observe that (49) and (50) ensure that for every $\varepsilon \in (0,\infty)$, $x \in [a,b]^d$ it holds that

$$ \limsup_{N \to \infty} \Big[ \mathbb{P}\Big( \sup_{t \in [0,T]} \| \Psi^N_{x,W}(t) - X^x_t \|_{\mathbb{R}^d} \geq \varepsilon \Big) + \mathbb{P}\Big( \sup_{t \in [0,T]} \| \Psi^N_{\xi,W}(t) - \mathbb{X}_t \|_{\mathbb{R}^d} \geq \varepsilon \Big) \Big] = 0. \quad (67) $$

The hypothesis that the function $\Phi$ is continuous and Lemma 2.5 therefore demonstrate that for every $\varepsilon \in (0,\infty)$, $x \in [a,b]^d$ it holds that

$$ \limsup_{N \to \infty} \Big[ \mathbb{P}\Big( |\Phi(\Psi^N_{x,W}) - \Phi((X^x_t)_{t \in [0,T]})| + |\Phi(\Psi^N_{\xi,W}) - \Phi((\mathbb{X}_t)_{t \in [0,T]})| \geq \varepsilon \Big) \Big] = 0. \quad (68) $$

Next observe that (44) assures that for every $N \in \mathbb{N}$, $x \in [a,b]^d$, $p \in (0,\infty)$ it holds that

$$ \begin{aligned} & \mathbb{E}\big[ |\Phi(\Psi^N_{x,W}) - \Phi((X^x_t)_{t \in [0,T]})|^p + |\Phi(\Psi^N_{\xi,W}) - \Phi((\mathbb{X}_t)_{t \in [0,T]})|^p \big] \\ & \leq 2^p \, \mathbb{E}\big[ |\Phi(\Psi^N_{x,W})|^p + |\Phi((X^x_t)_{t \in [0,T]})|^p \big] + 2^p \, \mathbb{E}\big[ |\Phi(\Psi^N_{\xi,W})|^p + |\Phi((\mathbb{X}_t)_{t \in [0,T]})|^p \big] \\ & \leq 2^p c^p \, \mathbb{E}\Big[ \big| 1 + \sup_{t \in [0,T]} \| \Psi^N_{x,W}(t) \|_{\mathbb{R}^d}^c \big|^p + \big| 1 + \sup_{t \in [0,T]} \| X^x_t \|_{\mathbb{R}^d}^c \big|^p \Big] \\ & \quad + 2^p c^p \, \mathbb{E}\Big[ \big| 1 + \sup_{t \in [0,T]} \| \Psi^N_{\xi,W}(t) \|_{\mathbb{R}^d}^c \big|^p + \big| 1 + \sup_{t \in [0,T]} \| \mathbb{X}_t \|_{\mathbb{R}^d}^c \big|^p \Big] \\ & \leq 4^p c^p \, \mathbb{E}\Big[ 2 + \sup_{t \in [0,T]} \| \Psi^N_{x,W}(t) \|_{\mathbb{R}^d}^{cp} + \sup_{t \in [0,T]} \| X^x_t \|_{\mathbb{R}^d}^{cp} \Big] + 4^p c^p \, \mathbb{E}\Big[ 2 + \sup_{t \in [0,T]} \| \Psi^N_{\xi,W}(t) \|_{\mathbb{R}^d}^{cp} + \sup_{t \in [0,T]} \| \mathbb{X}_t \|_{\mathbb{R}^d}^{cp} \Big]. \end{aligned} \quad (69) $$

Combining (48) and (53) hence shows that for every $p \in (0,\infty)$ it holds that

$$ \sup_{N \in \mathbb{N}} \sup_{x \in [a,b]^d} \Big( \mathbb{E}\big[ |\Phi(\Psi^N_{x,W}) - \Phi((X^x_t)_{t \in [0,T]})|^p + |\Phi(\Psi^N_{\xi,W}) - \Phi((\mathbb{X}_t)_{t \in [0,T]})|^p \big] \Big) < \infty. \quad (70) $$

This, (68), and, e.g., Hutzenthaler et al. [35, Proposition 4.5] imply that for every $x \in [a,b]^d$ it holds that

$$ \limsup_{N \to \infty} \Big( \mathbb{E}\big[ |\Phi(\Psi^N_{x,W}) - \Phi((X^x_t)_{t \in [0,T]})| \big] + \mathbb{E}\big[ |\Phi(\Psi^N_{\xi,W}) - \Phi((\mathbb{X}_t)_{t \in [0,T]})| \big] \Big) = 0. \quad (71) $$
Combining (70) and (71) with Lebesgue's dominated convergence theorem therefore demonstrates that

$$ \limsup_{N \to \infty} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |\Phi(\Psi^N_{x,W}) - \Phi((X^x_t)_{t \in [0,T]})| \big] \, dx \bigg) = 0. \quad (72) $$

In addition, observe that (69), (48), and (53) prove that for all $p \in (0,\infty)$ it holds that

$$ \sup_{N \in \mathbb{N}} \sup_{x \in [a,b]^d} \mathbb{E}\big[ |\Phi(\Psi^N_{x,W})|^p \big] < \infty. \quad (73) $$

Next observe that (71) and the fact that $\xi$ and $W$ are independent imply that

$$ \begin{aligned} \mathbb{E}\big[ \Phi((\mathbb{X}_t)_{t \in [0,T]}) \big] &= \lim_{N \to \infty} \mathbb{E}\big[ \Phi\big( \Psi^N_{\xi,W} \big) \big] = \lim_{N \to \infty} \int_\Omega \Phi\big( \Psi^N_{\xi(\omega),(W_t(\omega))_{t \in [0,T]}} \big) \, \mathbb{P}(d\omega) \\ &= \lim_{N \to \infty} \int_{[a,b]^d \times C([0,T],\mathbb{R}^m)} \Phi\big( \Psi^N_{x,w} \big) \, \big( (\xi, W)(\mathbb{P}) \big)(dx, dw) \\ &= \lim_{N \to \infty} \int_{[a,b]^d \times C([0,T],\mathbb{R}^m)} \Phi\big( \Psi^N_{x,w} \big) \, \big( (\xi(\mathbb{P})) \otimes (W(\mathbb{P})) \big)(dx, dw). \end{aligned} \quad (74) $$

Combining Fubini's theorem, (72), and (73) with Lebesgue's dominated convergence theorem therefore assures that

$$ \begin{aligned} \mathbb{E}\big[ \Phi((\mathbb{X}_t)_{t \in [0,T]}) \big] &= \lim_{N \to \infty} \int_{[a,b]^d} \bigg( \int_{C([0,T],\mathbb{R}^m)} \Phi\big( \Psi^N_{x,w} \big) \, (W(\mathbb{P}))(dw) \bigg) (\xi(\mathbb{P}))(dx) \\ &= \lim_{N \to \infty} \int_{[a,b]^d} \bigg( \int_\Omega \Phi\big( \Psi^N_{x,W} \big) \, d\mathbb{P} \bigg) (\xi(\mathbb{P}))(dx) = \lim_{N \to \infty} \int_{[a,b]^d} \mathbb{E}\big[ \Phi\big( \Psi^N_{x,W} \big) \big] \, (\xi(\mathbb{P}))(dx) \\ &= \int_{[a,b]^d} \lim_{N \to \infty} \mathbb{E}\big[ \Phi\big( \Psi^N_{x,W} \big) \big] \, (\xi(\mathbb{P}))(dx) = \int_{[a,b]^d} \mathbb{E}\big[ \Phi((X^x_t)_{t \in [0,T]}) \big] \, (\xi(\mathbb{P}))(dx) \\ &= \frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[ \Phi((X^x_t)_{t \in [0,T]}) \big] \, dx. \end{aligned} \quad (75) $$

This establishes item (v). The proof of Lemma 2.6 is thus completed.

Proposition 2.7. Let $d, m \in \mathbb{N}$, $T \in (0,\infty)$, $a \in \mathbb{R}$, $b \in (a,\infty)$, let $\mu \colon \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma \colon \mathbb{R}^d \to \mathbb{R}^{d \times m}$ be globally Lipschitz continuous functions, let $\varphi \colon \mathbb{R}^d \to \mathbb{R}$ be a function, let $u = (u(t,x))_{(t,x) \in [0,T] \times \mathbb{R}^d} \in C^{1,2}([0,T] \times \mathbb{R}^d, \mathbb{R})$ be a function with at most polynomially
growing partial derivatives which satisfies for every $t \in [0,T]$, $x \in \mathbb{R}^d$ that $u(0,x) = \varphi(x)$ and

$$ \tfrac{\partial u}{\partial t}(t,x) = \tfrac{1}{2} \operatorname{Trace}_{\mathbb{R}^d}\!\big( \sigma(x) [\sigma(x)]^* (\operatorname{Hess}_x u)(t,x) \big) + \langle \mu(x), (\nabla_x u)(t,x) \rangle_{\mathbb{R}^d}, \quad (76) $$

let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space with a normal filtration $(\mathbb{F}_t)_{t \in [0,T]}$, let $W \colon [0,T] \times \Omega \to \mathbb{R}^m$ be a standard $(\mathbb{F}_t)_{t \in [0,T]}$-Brownian motion, let $\xi \colon \Omega \to [a,b]^d$ be a continuously uniformly distributed $\mathbb{F}_0/\mathcal{B}([a,b]^d)$-measurable random variable, and let $\mathbb{X} = (\mathbb{X}_t)_{t \in [0,T]} \colon [0,T] \times \Omega \to \mathbb{R}^d$ be an $(\mathbb{F}_t)_{t \in [0,T]}$-adapted stochastic process with continuous sample paths which satisfies that for every $t \in [0,T]$ it holds $\mathbb{P}$-a.s. that

$$ \mathbb{X}_t = \xi + \int_0^t \mu(\mathbb{X}_s) \, ds + \int_0^t \sigma(\mathbb{X}_s) \, dW_s. \quad (77) $$

Then

(i) it holds that the function $\varphi \colon \mathbb{R}^d \to \mathbb{R}$ is twice continuously differentiable with at most polynomially growing derivatives,

(ii) it holds that there exists a unique continuous function $U \colon [a,b]^d \to \mathbb{R}$ such that

$$ \mathbb{E}\big[ |\varphi(\mathbb{X}_T) - U(\xi)|^2 \big] = \inf_{v \in C([a,b]^d, \mathbb{R})} \mathbb{E}\big[ |\varphi(\mathbb{X}_T) - v(\xi)|^2 \big], \quad (78) $$

and

(iii) it holds for every $x \in [a,b]^d$ that $U(x) = u(T,x)$.

Proof of Proposition 2.7. Throughout this proof let $X^x = (X^x_t)_{t \in [0,T]} \colon [0,T] \times \Omega \to \mathbb{R}^d$, $x \in [a,b]^d$, be $(\mathbb{F}_t)_{t \in [0,T]}$-adapted stochastic processes with continuous sample paths

a) which satisfy that for every $t \in [0,T]$, $x \in [a,b]^d$ it holds $\mathbb{P}$-a.s. that

$$ X^x_t = x + \int_0^t \mu(X^x_s) \, ds + \int_0^t \sigma(X^x_s) \, dW_s \quad (79) $$

and

b) which satisfy that for every $\omega \in \Omega$ it holds that the function $[a,b]^d \ni x \mapsto X^x_T(\omega) \in \mathbb{R}^d$ is continuous

(cf., for example, Cox et al. [11, Theorem 3.5] and item (ii) in Lemma 2.6). Note that the assumption that $\forall \, x \in \mathbb{R}^d \colon u(0,x) = \varphi(x)$ and the assumption that $u \in C^{1,2}([0,T] \times \mathbb{R}^d, \mathbb{R})$ has at most polynomially growing partial derivatives establish item (i). Next note that item (i) and the fact that for every $p \in (0,\infty)$, $x \in [a,b]^d$ it holds that

$$ \sup_{t \in [0,T]} \mathbb{E}\big[ \| X^x_t \|_{\mathbb{R}^d}^p \big] < \infty \quad (80) $$
assure that for every $x \in [a,b]^d$ it holds that

$$ \mathbb{E}\big[ |\varphi(X^x_T)|^2 \big] < \infty. \quad (81) $$

Item (i), the assumption that for every $\omega \in \Omega$ it holds that the function $[a,b]^d \ni x \mapsto X^x_T(\omega) \in \mathbb{R}^d$ is continuous, and, e.g., Hutzenthaler et al. [35, Proposition 4.5] hence ensure that the function

$$ [a,b]^d \ni x \mapsto \mathbb{E}\big[ \varphi(X^x_T) \big] \in \mathbb{R} \quad (82) $$

is continuous. In the next step we combine the fact that for every $x \in [a,b]^d$ it holds that the function $\Omega \ni \omega \mapsto \varphi(X^x_T(\omega)) \in \mathbb{R}$ is $\mathcal{F}/\mathcal{B}(\mathbb{R})$-measurable, the fact that for every $\omega \in \Omega$ it holds that the function $[a,b]^d \ni x \mapsto \varphi(X^x_T(\omega)) \in \mathbb{R}$ is continuous, and Lemma 2.4 to obtain that the function

$$ [a,b]^d \times \Omega \ni (x,\omega) \mapsto \varphi(X^x_T(\omega)) \in \mathbb{R} \quad (83) $$

is $(\mathcal{B}([a,b]^d) \otimes \mathcal{F})/\mathcal{B}(\mathbb{R})$-measurable. Combining this, (81), (82), and Proposition 2.2 demonstrates

A) that there exists a unique continuous function $U \colon [a,b]^d \to \mathbb{R}$ which satisfies that

$$ \int_{[a,b]^d} \mathbb{E}\big[ |\varphi(X^x_T) - U(x)|^2 \big] \, dx = \inf_{v \in C([a,b]^d, \mathbb{R})} \bigg( \int_{[a,b]^d} \mathbb{E}\big[ |\varphi(X^x_T) - v(x)|^2 \big] \, dx \bigg) \quad (84) $$

and

B) that it holds for every $x \in [a,b]^d$ that

$$ U(x) = \mathbb{E}\big[ \varphi(X^x_T) \big]. \quad (85) $$

Next note that for every continuous function $V \colon [a,b]^d \to \mathbb{R}$ it holds that

$$ \sup_{x \in [a,b]^d} |V(x)| < \infty. \quad (86) $$

Item (i) hence implies that for every continuous function $V \colon [a,b]^d \to \mathbb{R}$ it holds that

$$ C([0,T], \mathbb{R}^d) \ni (z_t)_{t \in [0,T]} \mapsto |\varphi(z_T) - V(z_0)|^2 \in \mathbb{R} \quad (87) $$

is an at most polynomially growing continuous function. Combining Lemma 2.6 (with $\Phi = (C([0,T], \mathbb{R}^d) \ni (z_t)_{t \in [0,T]} \mapsto |\varphi(z_T) - V(z_0)|^2 \in \mathbb{R})$ for $V \in C([a,b]^d, \mathbb{R})$ in the notation of Lemma 2.6), (77), item (i), and (79) hence ensures that for every continuous function $V \colon [a,b]^d \to \mathbb{R}$ it holds that

$$ \mathbb{E}\big[ |\varphi(\mathbb{X}_T) - V(\xi)|^2 \big] = \frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[ |\varphi(X^x_T) - V(x)|^2 \big] \, dx. \quad (88) $$
Hence, we obtain that for every continuous function $V\colon [a,b]^d\to\mathbb{R}$ with $\mathbb{E}[|\varphi(X_T) - V(\xi)|^2] = \inf_{v\in C([a,b]^d,\mathbb{R})} \mathbb{E}[|\varphi(X_T) - v(\xi)|^2]$ it holds that
\[
\begin{aligned}
\int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - V(x)|^2\big]\,dx
&= (b-a)^d \left[ \frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - V(x)|^2\big]\,dx \right] \\
&= (b-a)^d\, \Big( \mathbb{E}\big[|\varphi(X_T) - V(\xi)|^2\big] \Big)
= (b-a)^d \left( \inf_{v\in C([a,b]^d,\mathbb{R})} \mathbb{E}\big[|\varphi(X_T) - v(\xi)|^2\big] \right) \\
&= (b-a)^d \left( \inf_{v\in C([a,b]^d,\mathbb{R})} \left[ \frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - v(x)|^2\big]\,dx \right] \right) \\
&= \inf_{v\in C([a,b]^d,\mathbb{R})} \left[ \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - v(x)|^2\big]\,dx \right].
\end{aligned} \tag{89}
\]
Combining this with (84) proves that for every continuous function $V\colon [a,b]^d\to\mathbb{R}$ with $\mathbb{E}[|\varphi(X_T) - V(\xi)|^2] = \inf_{v\in C([a,b]^d,\mathbb{R})} \mathbb{E}[|\varphi(X_T) - v(\xi)|^2]$ it holds that
\[
U = V. \tag{90}
\]
Next observe that (88) and (84) demonstrate that
\[
\begin{aligned}
\mathbb{E}\big[|\varphi(X_T) - U(\xi)|^2\big]
&= \frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - U(x)|^2\big]\,dx \\
&= \inf_{v\in C([a,b]^d,\mathbb{R})} \left[ \frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - v(x)|^2\big]\,dx \right] \\
&= \inf_{v\in C([a,b]^d,\mathbb{R})} \mathbb{E}\big[|\varphi(X_T) - v(\xi)|^2\big].
\end{aligned} \tag{91}
\]
Combining this with (90) proves item (ii). Next note that (76), (85), and the Feynman-Kac formula (cf., for example, Hairer et al. [22, Corollary 4.17]) imply that for every $x\in[a,b]^d$ it holds that $U(x) = \mathbb{E}[\varphi(X^x_T)] = u(T,x)$. This establishes item (iii). The proof of Proposition 2.7 is thus completed.

In the next step we use Proposition 2.7 to obtain a minimization problem which is uniquely solved by the function $[a,b]^d \ni x\mapsto u(T,x)\in\mathbb{R}$. More specifically, let $\xi\colon \Omega\to[a,b]^d$ be a continuously uniformly distributed $\mathcal{F}_0/\mathcal{B}([a,b]^d)$-measurable random variable, and let $X\colon [0,T]\times\Omega\to\mathbb{R}^d$ be an $(\mathbb{F}_t)_{t\in[0,T]}$-adapted stochastic process with continuous sample paths which satisfies that for every $t\in[0,T]$ it holds $\mathbb{P}$-a.s. that
\[
X_t = \xi + \int_0^t \mu(X_s)\,ds + \int_0^t \sigma(X_s)\,dW_s. \tag{92}
\]
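The structural fact behind this construction — the $L^2$-minimizer over all functions of $\xi$ is the conditional expectation $x\mapsto \mathbb{E}[\varphi(X_T)\,|\,\xi = x]$ — is easy to illustrate numerically. The following toy check (an illustrative discrete $\xi$, not from the paper) verifies that perturbing the conditional mean can only increase the squared loss:

```python
import numpy as np

# Toy check (not from the paper): the L^2-minimizing function of xi is
# the conditional expectation x -> E[Y | xi = x].
rng = np.random.default_rng(0)
xi = rng.integers(0, 3, size=200_000)                   # xi uniform on {0, 1, 2}
y = xi.astype(float) ** 2 + rng.normal(size=xi.shape)   # Y = xi^2 + noise

# empirical conditional expectation E[Y | xi = x] for each x in {0, 1, 2}
cond_mean = np.array([y[xi == x].mean() for x in range(3)])

def loss(v):
    """Empirical analogue of E|Y - v(xi)|^2 for a function v given by its values on {0,1,2}."""
    return np.mean((y - v[xi]) ** 2)

# the per-group mean minimizes the per-group sum of squares, so any perturbation loses
base = loss(cond_mean)
for _ in range(20):
    assert base <= loss(cond_mean + rng.normal(scale=0.1, size=3))
```

Here the empirical conditional means are close to $(0, 1, 4)$, the values of $x^2$ on the support of $\xi$.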
Proposition 2.7 then guarantees that the function $[a,b]^d\ni x\mapsto u(T,x)\in\mathbb{R}$ is the unique global minimizer of the function
\[
C([a,b]^d,\mathbb{R}) \ni v \mapsto \mathbb{E}\big[|\varphi(X_T) - v(\xi)|^2\big] \in \mathbb{R}. \tag{93}
\]
In the following two subsections we derive an approximated minimization problem by discretizing the stochastic process $X\colon [0,T]\times\Omega\to\mathbb{R}^d$ (see Subsection 2.4 below) and by employing a deep neural network approximation for the function $\mathbb{R}^d\ni x\mapsto u(T,x)\in\mathbb{R}$ (see Subsection 2.5 below).

2.4 Discretization of the stochastic differential equation

In this subsection we use the Euler-Maruyama scheme (cf., for example, Kloeden & Platen [41] and Maruyama [44]) to temporally discretize the solution process $X$ of the SDE (92). More specifically, let $N\in\mathbb{N}$, let $t_0, t_1, \ldots, t_N \in [0,\infty)$ be real numbers which satisfy that
\[
0 = t_0 < t_1 < \ldots < t_N = T. \tag{94}
\]
Note that (92) implies that for every $n\in\{0,1,\ldots,N-1\}$ it holds $\mathbb{P}$-a.s. that
\[
X_{t_{n+1}} = X_{t_n} + \int_{t_n}^{t_{n+1}} \mu(X_s)\,ds + \int_{t_n}^{t_{n+1}} \sigma(X_s)\,dW_s. \tag{95}
\]
This suggests that for sufficiently small mesh size $\sup_{n\in\{0,1,\ldots,N-1\}} (t_{n+1}-t_n)$ it holds that
\[
X_{t_{n+1}} \approx X_{t_n} + \mu(X_{t_n})\,(t_{n+1}-t_n) + \sigma(X_{t_n})\,\big(W_{t_{n+1}} - W_{t_n}\big). \tag{96}
\]
Let $\mathcal{X}\colon \{0,1,\ldots,N\}\times\Omega\to\mathbb{R}^d$ be the stochastic process which satisfies for every $n\in\{0,1,\ldots,N-1\}$ that $\mathcal{X}_0 = \xi$ and
\[
\mathcal{X}_{n+1} = \mathcal{X}_n + \mu(\mathcal{X}_n)\,(t_{n+1}-t_n) + \sigma(\mathcal{X}_n)\,\big(W_{t_{n+1}} - W_{t_n}\big). \tag{97}
\]
Observe that (96) and (97) suggest, in turn, that for every $n\in\{0,1,2,\ldots,N\}$ it holds that
\[
\mathcal{X}_n \approx X_{t_n} \tag{98}
\]
(cf., for example, Theorem 2.8 below for a strong convergence result for the Euler-Maruyama scheme).

Theorem 2.8 (Strong convergence rate for the Euler-Maruyama scheme). Let $T\in(0,\infty)$, $d\in\mathbb{N}$, $p\in[2,\infty)$, let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space with a normal filtration $(\mathbb{F}_t)_{t\in[0,T]}$, let $W\colon[0,T]\times\Omega\to\mathbb{R}^d$ be a standard $(\mathbb{F}_t)_{t\in[0,T]}$-Brownian motion, let $\xi\colon\Omega\to\mathbb{R}^d$ be a random variable which satisfies that $\mathbb{E}[\|\xi\|^p_{\mathbb{R}^d}] < \infty$, let $\mu\colon\mathbb{R}^d\to\mathbb{R}^d$ and $\sigma\colon\mathbb{R}^d\to\mathbb{R}^{d\times d}$ be Lipschitz
continuous functions, let $X\colon [0,T]\times\Omega\to\mathbb{R}^d$ be an $(\mathbb{F}_t)_{t\in[0,T]}$-adapted stochastic process with continuous sample paths which satisfies that for every $t\in[0,T]$ it holds $\mathbb{P}$-a.s. that
\[
X_t = \xi + \int_0^t \mu(X_s)\,ds + \int_0^t \sigma(X_s)\,dW_s, \tag{99}
\]
for every $N\in\mathbb{N}$ let $t^N_0, t^N_1, \ldots, t^N_N \in [0,T]$ be real numbers which satisfy that
\[
0 = t^N_0 < t^N_1 < \ldots < t^N_N = T, \tag{100}
\]
and for every $N\in\mathbb{N}$ let $\mathcal{X}^N\colon \{0,1,\ldots,N\}\times\Omega\to\mathbb{R}^d$ be the stochastic process which satisfies for every $n\in\{0,1,\ldots,N-1\}$ that $\mathcal{X}^N_0 = \xi$ and
\[
\mathcal{X}^N_{n+1} = \mathcal{X}^N_n + \mu(\mathcal{X}^N_n)\,(t^N_{n+1}-t^N_n) + \sigma(\mathcal{X}^N_n)\,\big(W_{t^N_{n+1}} - W_{t^N_n}\big). \tag{101}
\]
Then there exists a real number $C\in(0,\infty)$ such that for every $N\in\mathbb{N}$ it holds that
\[
\sup_{n\in\{0,1,\ldots,N\}} \Big( \mathbb{E}\big[\|X_{t^N_n} - \mathcal{X}^N_n\|^p_{\mathbb{R}^d}\big] \Big)^{1/p} \le C \Big[ \max_{n\in\{0,1,\ldots,N-1\}} |t^N_{n+1} - t^N_n| \Big]^{1/2}. \tag{102}
\]
The proof of Theorem 2.8 is well-known in the literature (cf., for instance, Kloeden & Platen [41], Milstein [46], Hofmann, Müller-Gronbach, & Ritter [30], Müller-Gronbach & Ritter [49], and the references mentioned therein).

2.5 Deep artificial neural network approximations

In this subsection we employ suitable approximations for the solution $\mathbb{R}^d\ni x\mapsto u(T,x)\in\mathbb{R}$ of the PDE (1) at time $T$. More specifically, let $\nu\in\mathbb{N}$ and let $U = (U(\theta,x))_{(\theta,x)\in\mathbb{R}^\nu\times\mathbb{R}^d}\colon \mathbb{R}^\nu\times\mathbb{R}^d\to\mathbb{R}$ be a continuous function. For every suitable $\theta\in\mathbb{R}^\nu$ and every $x\in[a,b]^d$ we think of $U(\theta,x)\in\mathbb{R}$ as an appropriate approximation
\[
U(\theta,x) \approx u(T,x) \tag{103}
\]
of $u(T,x)$. We suggest to choose the function $U\colon\mathbb{R}^\nu\times\mathbb{R}^d\to\mathbb{R}$ as a deep neural network (cf., for example, Bishop [6]). For instance, let $\mathbf{L}_d\colon\mathbb{R}^d\to\mathbb{R}^d$ be the function which satisfies for every $x = (x_1,x_2,\ldots,x_d)\in\mathbb{R}^d$ that
\[
\mathbf{L}_d(x) = \left( \frac{\exp(x_1)}{\exp(x_1)+1}, \frac{\exp(x_2)}{\exp(x_2)+1}, \ldots, \frac{\exp(x_d)}{\exp(x_d)+1} \right) \tag{104}
\]
(multidimensional version of the standard logistic function), for every $k,l\in\mathbb{N}$, $v\in\mathbb{N}_0 = \{0\}\cup\mathbb{N}$, $\theta = (\theta_1,\ldots,\theta_\nu)\in\mathbb{R}^\nu$ with $v + l(k+1) \le \nu$ let $A^{\theta,v}_{k,l}\colon\mathbb{R}^k\to\mathbb{R}^l$ be the function
which satisfies for every $x = (x_1,\ldots,x_k)\in\mathbb{R}^k$ that
\[
A^{\theta,v}_{k,l}(x) =
\begin{pmatrix}
\theta_{v+1} & \theta_{v+2} & \cdots & \theta_{v+k} \\
\theta_{v+k+1} & \theta_{v+k+2} & \cdots & \theta_{v+2k} \\
\theta_{v+2k+1} & \theta_{v+2k+2} & \cdots & \theta_{v+3k} \\
\vdots & \vdots & \ddots & \vdots \\
\theta_{v+(l-1)k+1} & \theta_{v+(l-1)k+2} & \cdots & \theta_{v+lk}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_k \end{pmatrix}
+
\begin{pmatrix} \theta_{v+kl+1} \\ \theta_{v+kl+2} \\ \theta_{v+kl+3} \\ \vdots \\ \theta_{v+kl+l} \end{pmatrix}, \tag{105}
\]
let $s\in\{3,4,5,6,\ldots\}$, assume that $(s-1)d(d+1) + d + 1 \le \nu$, and let $U\colon\mathbb{R}^\nu\times\mathbb{R}^d\to\mathbb{R}$ be the function which satisfies for every $\theta\in\mathbb{R}^\nu$, $x\in\mathbb{R}^d$ that
\[
U(\theta,x) = \big( A^{\theta,(s-1)d(d+1)}_{d,1} \circ \mathbf{L}_d \circ A^{\theta,(s-2)d(d+1)}_{d,d} \circ \cdots \circ \mathbf{L}_d \circ A^{\theta,d(d+1)}_{d,d} \circ \mathbf{L}_d \circ A^{\theta,0}_{d,d} \big)(x). \tag{106}
\]
The function $U\colon\mathbb{R}^\nu\times\mathbb{R}^d\to\mathbb{R}$ in (106) describes an artificial neural network with $s+1$ layers (1 input layer with $d$ neurons, $s-1$ hidden layers with $d$ neurons each, and 1 output layer with 1 neuron) and standard logistic functions as activation functions (cf., for instance, Bishop [6]).

2.6 Stochastic gradient descent-type minimization

As described in Subsection 2.5, for every suitable $\theta\in\mathbb{R}^\nu$ and every $x\in[a,b]^d$ we think of $U(\theta,x)\in\mathbb{R}$ as an appropriate approximation of $u(T,x)\in\mathbb{R}$. In this subsection we intend to find a suitable $\theta\in\mathbb{R}^\nu$ as an approximate minimizer of the function
\[
\mathbb{R}^\nu \ni \theta \mapsto \mathbb{E}\big[|\varphi(\mathcal{X}_N) - U(\theta,\xi)|^2\big] \in \mathbb{R}. \tag{107}
\]
To be more specific, we intend to find an approximate minimizer of the function in (107) through a stochastic gradient descent-type minimization algorithm (cf., for instance, Ruder [53, Section 4], Jentzen et al. [37], and the references mentioned therein). For this we approximate the derivative of the function in (107) by means of the Monte Carlo method.
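For concreteness, the parametrization in (104)–(106) — a flat parameter vector $\theta$ sliced into row-major weight matrices and bias vectors — can be sketched as a plain NumPy forward pass (an illustrative sketch, not the paper's TensorFlow implementation):

```python
import numpy as np

def logistic(x):
    # multidimensional standard logistic function L_d, cf. (104)
    return 1.0 / (1.0 + np.exp(-x))

def affine(theta, v, k, l, x):
    """A^{theta,v}_{k,l}(x): k*l weights read row-major from theta[v:], then l biases, cf. (105)."""
    w = theta[v:v + k * l].reshape(l, k)
    b = theta[v + k * l:v + k * l + l]
    return w @ x + b

def network(theta, x, d, s):
    """U(theta, x) in (106): s-1 hidden layers of width d with logistic activation, scalar output."""
    z = x
    for i in range(s - 1):                                   # hidden layers at offsets 0, d(d+1), ...
        z = logistic(affine(theta, i * d * (d + 1), d, d, z))
    return affine(theta, (s - 1) * d * (d + 1), d, 1, z)[0]  # linear output layer A_{d,1}

# usage: with all parameters zero, every hidden activation is logistic(0) = 0.5
# and the output is the final (zero) bias
d, s = 3, 4
nu = (s - 1) * d * (d + 1) + d + 1
out_zero = network(np.zeros(nu), np.ones(d), d, s)
```

Note that the offsets $0, d(d+1), \ldots, (s-1)d(d+1)$ match the superscripts in (106), and the total parameter count used is exactly $(s-1)d(d+1)+d+1$, the lower bound imposed on $\nu$.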
More precisely, let $\xi^{(m)}\colon\Omega\to[a,b]^d$, $m\in\mathbb{N}_0$, be independent continuously uniformly distributed $\mathcal{F}_0/\mathcal{B}([a,b]^d)$-measurable random variables, let $W^{(m)}\colon[0,T]\times\Omega\to\mathbb{R}^d$, $m\in\mathbb{N}_0$, be independent standard $(\mathbb{F}_t)_{t\in[0,T]}$-Brownian motions, for every $m\in\mathbb{N}_0$ let $\mathcal{X}^{(m)} = (\mathcal{X}^{(m)}_n)_{n\in\{0,1,\ldots,N\}}\colon\{0,1,\ldots,N\}\times\Omega\to\mathbb{R}^d$ be the stochastic process which satisfies for every $n\in\{0,1,\ldots,N-1\}$ that $\mathcal{X}^{(m)}_0 = \xi^{(m)}$ and
\[
\mathcal{X}^{(m)}_{n+1} = \mathcal{X}^{(m)}_n + \mu(\mathcal{X}^{(m)}_n)\,(t_{n+1}-t_n) + \sigma(\mathcal{X}^{(m)}_n)\,\big(W^{(m)}_{t_{n+1}} - W^{(m)}_{t_n}\big), \tag{108}
\]
let $\gamma\in(0,\infty)$, and let $\Theta\colon\mathbb{N}_0\times\Omega\to\mathbb{R}^\nu$ be a stochastic process which satisfies for every $m\in\mathbb{N}_0$ that
\[
\Theta_{m+1} = \Theta_m - 2\gamma\,\big(U(\Theta_m,\xi^{(m)}) - \varphi(\mathcal{X}^{(m)}_N)\big)\,(\nabla_\theta U)(\Theta_m,\xi^{(m)}). \tag{109}
\]
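Putting (108) and (109) together, one iteration consists of simulating a fresh Euler-Maruyama path started from a fresh uniform sample and taking one gradient step. The following sketch uses illustrative coefficient functions and a deliberately simplified model $U(\theta,x) = \theta_1 + \langle(\theta_2,\ldots,\theta_{d+1}),x\rangle$ that is linear in $\theta$, so that $\nabla_\theta U$ is explicit (the actual method uses the deep network of Subsection 2.5 together with backpropagation):

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, T, gamma, M = 2, 5, 1.0, 0.01, 30_000
dt = T / N
mu = lambda x: np.zeros(d)                # illustrative drift
sigma_coef = lambda x: 0.05 * np.eye(d)   # illustrative diffusion coefficient
phi = lambda x: x[0] + x[1]               # illustrative terminal condition

def U(theta, x):                          # simplified stand-in model, linear in theta
    return theta[0] + theta[1:] @ x

def grad_U(theta, x):                     # explicit gradient with respect to theta
    return np.concatenate(([1.0], x))

theta = np.zeros(d + 1)
for m in range(M):
    xi = rng.uniform(0.0, 1.0, size=d)            # xi^(m), uniform on [a,b]^d = [0,1]^d
    x = xi.copy()
    for n in range(N):                            # Euler-Maruyama recursion (108)
        dw = rng.normal(0.0, np.sqrt(dt), size=d)
        x = x + mu(x) * dt + sigma_coef(x) @ dw
    # stochastic gradient descent step (109)
    theta = theta - 2.0 * gamma * (U(theta, xi) - phi(x)) * grad_U(theta, xi)

# with zero drift, u(T, x) = E[phi(X_T^x)] = x_1 + x_2, so the fitted U should match it
test_pts = rng.uniform(0.0, 1.0, size=(100, d))
errs = [U(theta, x) - phi(x) for x in test_pts]
rmse = float(np.sqrt(np.mean(np.square(errs))))
```

For this toy problem the learned function comes close to $u(T,x) = x_1 + x_2$; the point of the sketch is only the shape of the loop, not the model class.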
Under appropriate hypotheses we think for every sufficiently large $m\in\mathbb{N}$ of the random variable $\Theta_m\colon\Omega\to\mathbb{R}^\nu$ as a suitable approximation of a local minimum point of the function (107) and we think for every sufficiently large $m\in\mathbb{N}$ of the random function $[a,b]^d\ni x\mapsto U(\Theta_m,x)\in\mathbb{R}$ as a suitable approximation of the function $[a,b]^d\ni x\mapsto u(T,x)\in\mathbb{R}$.

2.7 Description of the algorithm in a special case

In this subsection we give a description of the proposed approximation method in a special case, that is, we describe the proposed approximation method in the specific case where a particular neural network approximation is chosen and where the plain-vanilla stochastic gradient descent method with a constant learning rate is the employed stochastic minimization algorithm (cf. (109) above). For a more general description of the proposed approximation method we refer the reader to Subsection 2.8 below.

Framework 2.9. Let $T,\gamma\in(0,\infty)$, $a\in\mathbb{R}$, $b\in(a,\infty)$, $d,N\in\mathbb{N}$, $s\in\{3,4,5,\ldots\}$, let $\nu = sd(d+1)$, let $t_0,t_1,\ldots,t_N\in[0,T]$ be real numbers with
\[
0 = t_0 < t_1 < \ldots < t_N = T, \tag{110}
\]
let $\mu\colon\mathbb{R}^d\to\mathbb{R}^d$, $\sigma\colon\mathbb{R}^d\to\mathbb{R}^{d\times d}$, and $\varphi\colon\mathbb{R}^d\to\mathbb{R}$ be continuous functions, let $(\Omega,\mathcal{F},\mathbb{P},(\mathbb{F}_t)_{t\in[0,T]})$ be a filtered probability space, let $\xi^{(m)}\colon\Omega\to[a,b]^d$, $m\in\mathbb{N}_0$, be independent continuously uniformly distributed $\mathcal{F}_0/\mathcal{B}([a,b]^d)$-measurable random variables, let $W^{(m)}\colon[0,T]\times\Omega\to\mathbb{R}^d$, $m\in\mathbb{N}_0$, be i.i.d. standard $(\mathbb{F}_t)_{t\in[0,T]}$-Brownian motions, for every $m\in\mathbb{N}_0$ let $\mathcal{X}^{(m)}\colon\{0,1,\ldots,N\}\times\Omega\to\mathbb{R}^d$ be the stochastic process which satisfies for every $n\in\{0,1,\ldots,N-1\}$ that $\mathcal{X}^{(m)}_0 = \xi^{(m)}$ and
\[
\mathcal{X}^{(m)}_{n+1} = \mathcal{X}^{(m)}_n + \mu(\mathcal{X}^{(m)}_n)\,(t_{n+1}-t_n) + \sigma(\mathcal{X}^{(m)}_n)\,\big(W^{(m)}_{t_{n+1}} - W^{(m)}_{t_n}\big), \tag{111}
\]
let $\mathbf{L}_d\colon\mathbb{R}^d\to\mathbb{R}^d$ be the function which satisfies for every $x = (x_1,x_2,\ldots,x_d)\in\mathbb{R}^d$ that
\[
\mathbf{L}_d(x) = \left( \frac{\exp(x_1)}{\exp(x_1)+1}, \frac{\exp(x_2)}{\exp(x_2)+1}, \ldots, \frac{\exp(x_d)}{\exp(x_d)+1} \right), \tag{112}
\]
for every $k,l\in\mathbb{N}$, $v\in\mathbb{N}_0 = \{0\}\cup\mathbb{N}$, $\theta = (\theta_1,\ldots,\theta_\nu)\in\mathbb{R}^\nu$ with $v + l(k+1)\le\nu$ let $A^{\theta,v}_{k,l}\colon\mathbb{R}^k\to\mathbb{R}^l$ be the function which satisfies for every $x = (x_1,\ldots,x_k)\in\mathbb{R}^k$ that
\[
A^{\theta,v}_{k,l}(x) =
\begin{pmatrix}
\theta_{v+1} & \theta_{v+2} & \cdots & \theta_{v+k} \\
\theta_{v+k+1} & \theta_{v+k+2} & \cdots & \theta_{v+2k} \\
\theta_{v+2k+1} & \theta_{v+2k+2} & \cdots & \theta_{v+3k} \\
\vdots & \vdots & \ddots & \vdots \\
\theta_{v+(l-1)k+1} & \theta_{v+(l-1)k+2} & \cdots & \theta_{v+lk}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_k \end{pmatrix}
+
\begin{pmatrix} \theta_{v+kl+1} \\ \theta_{v+kl+2} \\ \theta_{v+kl+3} \\ \vdots \\ \theta_{v+kl+l} \end{pmatrix}, \tag{113}
\]
let $U\colon\mathbb{R}^\nu\times\mathbb{R}^d\to\mathbb{R}$ be the function which satisfies for every $\theta\in\mathbb{R}^\nu$, $x\in\mathbb{R}^d$ that
\[
U(\theta,x) = \big( A^{\theta,(s-1)d(d+1)}_{d,1} \circ \mathbf{L}_d \circ A^{\theta,(s-2)d(d+1)}_{d,d} \circ \cdots \circ \mathbf{L}_d \circ A^{\theta,d(d+1)}_{d,d} \circ \mathbf{L}_d \circ A^{\theta,0}_{d,d} \big)(x), \tag{114}
\]
and let $\Theta\colon\mathbb{N}_0\times\Omega\to\mathbb{R}^\nu$ be a stochastic process which satisfies for every $m\in\mathbb{N}_0$ that
\[
\Theta_{m+1} = \Theta_m - 2\gamma\,\big(U(\Theta_m,\xi^{(m)}) - \varphi(\mathcal{X}^{(m)}_N)\big)\,(\nabla_\theta U)(\Theta_m,\xi^{(m)}). \tag{115}
\]
Under appropriate hypotheses we think for every sufficiently large $m\in\mathbb{N}$ and every $x\in[a,b]^d$ of the random variable $U(\Theta_m,x)\colon\Omega\to\mathbb{R}$ in Framework 2.9 as a suitable approximation $U(\Theta_m,x)\approx u(T,x)$ of $u(T,x)\in\mathbb{R}$ where $u = (u(t,x))_{(t,x)\in[0,T]\times\mathbb{R}^d}\in C^{1,2}([0,T]\times\mathbb{R}^d,\mathbb{R})$ is a function with at most polynomially growing partial derivatives which satisfies for every $t\in[0,T]$, $x\in\mathbb{R}^d$ that $u(0,x) = \varphi(x)$ and
\[
\tfrac{\partial u}{\partial t}(t,x) = \tfrac{1}{2}\,\mathrm{Trace}_{\mathbb{R}^d}\!\big(\sigma(x)[\sigma(x)]^*(\mathrm{Hess}_x u)(t,x)\big) + \big\langle\mu(x),(\nabla_x u)(t,x)\big\rangle_{\mathbb{R}^d} \tag{116}
\]
(cf. (1) above).

2.8 Description of the algorithm in the general case

In this subsection we provide a general framework which covers the approximation method derived in the subsections above and which allows, in addition, to incorporate other minimization algorithms (cf., for example, Kingma & Ba [38], Ruder [53], E et al. [14], Han et al. [24], and Beck et al. [3]) than just the plain vanilla stochastic gradient descent method. The proposed approximation algorithm is an extension of the approximation algorithm in E et al. [14], Han et al. [24], and Beck et al. [3] in the special case of linear Kolmogorov partial differential equations.

Framework 2.10. Let $T\in(0,\infty)$, $N,d,\varrho,\nu,\varsigma\in\mathbb{N}$, let $H\colon[0,T]^2\times\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{R}^d$ and $\varphi\colon\mathbb{R}^d\to\mathbb{R}$ be functions, let $(\Omega,\mathcal{F},\mathbb{P},(\mathbb{F}_t)_{t\in[0,T]})$ be a filtered probability space, let $W^{m,j}\colon[0,T]\times\Omega\to\mathbb{R}^d$, $m\in\mathbb{N}_0$, $j\in\mathbb{N}$, be independent standard $(\mathbb{F}_t)_{t\in[0,T]}$-Brownian motions on $(\Omega,\mathcal{F},\mathbb{P})$, let $\xi^{m,j}\colon\Omega\to\mathbb{R}^d$, $m\in\mathbb{N}_0$, $j\in\mathbb{N}$, be i.i.d. $\mathcal{F}_0/\mathcal{B}(\mathbb{R}^d)$-measurable random variables, let $t_0,t_1,\ldots,t_N\in[0,T]$ be real numbers with $0 = t_0 < t_1 < \ldots < t_N = T$, for every $\theta\in\mathbb{R}^\nu$, $j\in\mathbb{N}$, $s\in\mathbb{R}^\varsigma$ let $U^{\theta,j,s}\colon\mathbb{R}^d\to\mathbb{R}$ be a function, for every $m\in\mathbb{N}_0$, $j\in\mathbb{N}$ let $\mathcal{X}^{m,j} = (\mathcal{X}^{m,j}_n)_{n\in\{0,1,\ldots,N\}}\colon\{0,1,\ldots,N\}\times\Omega\to\mathbb{R}^d$ be a stochastic process which satisfies for every $n\in\{0,1,\ldots,N-1\}$ that $\mathcal{X}^{m,j}_0 = \xi^{m,j}$ and
\[
\mathcal{X}^{m,j}_{n+1} = H\big(t_n, t_{n+1}, \mathcal{X}^{m,j}_n, W^{m,j}_{t_{n+1}} - W^{m,j}_{t_n}\big), \tag{117}
\]
let $(J_m)_{m\in\mathbb{N}_0}\subseteq\mathbb{N}$ be a sequence, for every $m\in\mathbb{N}_0$, $s\in\mathbb{R}^\varsigma$ let $\phi^{m,s}\colon\mathbb{R}^\nu\times\Omega\to\mathbb{R}$ be the function which satisfies for every $(\theta,\omega)\in\mathbb{R}^\nu\times\Omega$ that
\[
\phi^{m,s}(\theta,\omega) = \frac{1}{J_m} \sum_{j=1}^{J_m} \big| U^{\theta,j,s}\big(\xi^{m,j}(\omega)\big) - \varphi\big(\mathcal{X}^{m,j}_N(\omega)\big) \big|^2, \tag{118}
\]
26 for every m N, s R ς let Ψ m,s : R ν Ω R ν be a function which satisfies for every ω Ω, θ {η R ν : φ m,s (, ω): R ν R is differentiable at η} that Ψ m,s (θ, ω) = ( θ φ m,s )(θ, ω), (119) let S : R ς R ν (R d ) N R ς be a function, for every m N let Φ m : R ϱ R ν and Ψ m : R ϱ R ν R ϱ be functions, let Θ: N Ω R ν, S: N Ω R ς, and Ξ: N Ω R ϱ be stochastic processes which satisfy for every m N that S m+1 = S ( S m, Θ m, (X m,i N ) i N), Ξm+1 = Ψ m (Ξ m, Ψ m,s m+1 (Θ m )), (12) and Θ m+1 = Θ m Φ m (Ξ m+1 ). (121) Under appropriate hypotheses we think for every sufficiently large m N and every x [a, b] d of the random variable U Θm,1,Sm (x): Ω R in Framework 2.1 as a suitable approximation U Θm,1,Sm (x) u(t, x) of u(t, x) R where u = u(t, x) (t,x) [,T ] R d C 1,2 ([, T ] R d, R) is a function with at most polynomially growing partial derivatives which satisfies for every t [, T ], x R d that u(, x) = ϕ(x) and u (t, x) = 1 Trace ( t 2 R σ(x)[σ(x)] (Hess d x u)(t, x) ) + µ(x), ( x u)(t, x) R d, (122) where µ: R d R d and σ : R d R d d are sufficiently regular functions (cf. (1) above). 3 Examples In this section we test the proposed approximation algorithm (see Section 2 above) in the case of several examples of SDEs and Kolmogorov PDEs, respectively. In particular, in this section we apply the proposed approximation algorithm to the heat equation (cf. Subsection 3.2 below), to independent geometric Brownian motions (cf. Subsection 3.3 below), to the Black-Scholes model (cf. Subsection 3.4 below), to stochastic Lorenz equations (cf. Subsection 3.5 below), and to the Heston model (cf. Subsection 3.6 below). In the case of each of the examples below we employ the general approximation algorithm in Framework 2.1 above in conjunction with the Adam optimizer (cf. Kingma & Ba [38]) with mini-batches of size 8192 in each iteration step (see Subsection 3.1 below for a precise description). 
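In implementation terms, Adam is one particular choice of the abstract maps $\Psi_m$ (moment updates) and $\Phi_m$ (bias-corrected step) in (120)–(121). A minimal NumPy sketch of one Adam step, exercised on a hypothetical toy objective with an illustrative piecewise-constant learning-rate schedule (the numbers below are not the paper's):

```python
import numpy as np

def adam_step(theta, grad, x, y, m, gamma, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (cf. Kingma & Ba); (x, y) play the role of the moment state Xi_m."""
    x = beta1 * x + (1 - beta1) * grad          # first-moment estimate
    y = beta2 * y + (1 - beta2) * grad ** 2     # second-moment estimate
    x_hat = x / (1 - beta1 ** m)                # bias corrections (m counts from 1)
    y_hat = y / (1 - beta2 ** m)
    return theta - gamma * x_hat / (np.sqrt(y_hat) + eps), x, y

# illustrative usage on the toy objective f(theta) = |theta - 3|^2, with the
# learning rate dropped once after 1000 steps (hypothetical schedule)
theta, x, y = 0.0, 0.0, 0.0
for m in range(1, 3001):
    gamma = 1e-2 if m <= 1000 else 1e-3
    grad = 2.0 * (theta - 3.0)
    theta, x, y = adam_step(theta, grad, x, y, m, gamma)
```

In the paper's setting the gradient fed to such a step is the mini-batch gradient $\Psi^{m,\mathbb{S}_{m+1}}(\Theta_m)$ of the empirical loss (118) rather than an exact gradient.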
Moreover, we employ a fully-connected feedforward neural network with one input layer, two hidden layers, and one one-dimensional output layer in our implementations in the case of each of these examples. We also use batch normalization (cf. Ioffe & Szegedy [36]) just before the first linear transformation, just before each of the two nonlinear activation functions in front of the hidden layers, as well as just after the last linear transformation. For the two nonlinear activation functions we employ the multidimensional version of the function $\mathbb{R}\ni x\mapsto \tanh(x)\in(-1,1)$. All weights in the neural network are initialized by means of the Xavier initialization (cf. Glorot & Bengio [18]). All computations were
performed in single precision (float32) on a NVIDIA GeForce GTX 1080 GPU with 1974 MHz core clock and 8 GB GDDR5X memory with 10010 MHz clock rate. The underlying system consisted of an Intel Core i7-6800K CPU with 64 GB DDR4 memory running Tensorflow 1.5 on Ubuntu 16.04.

3.1 Setting

Framework 3.1. Assume Framework 2.10, let $\varepsilon = 10^{-8}$, $\beta_1 = \tfrac{9}{10}$, $\beta_2 = \tfrac{999}{1000}$, $(\gamma_m)_{m\in\mathbb{N}_0}\subseteq(0,\infty)$, let $\mathrm{Pow}_r\colon\mathbb{R}^\nu\to\mathbb{R}^\nu$, $r\in(0,\infty)$, be the functions which satisfy for every $r\in(0,\infty)$, $x = (x_1,\ldots,x_\nu)\in\mathbb{R}^\nu$ that
\[
\mathrm{Pow}_r(x) = \big(|x_1|^r, \ldots, |x_\nu|^r\big), \tag{123}
\]
assume for every $m\in\mathbb{N}_0$, $i\in\{0,1,\ldots,N\}$ that $J_m = 8192$, $t_i = \tfrac{iT}{N}$, $\varrho = 2\nu$, $T = 1$, $\gamma_m = 10^{-1}\mathbb{1}_{[0,250000]}(m) + 10^{-2}\mathbb{1}_{(250000,500000]}(m) + 10^{-3}\mathbb{1}_{(500000,\infty)}(m)$, and assume for every $m\in\mathbb{N}$, $x = (x_1,\ldots,x_\nu)$, $y = (y_1,\ldots,y_\nu)$, $\eta = (\eta_1,\ldots,\eta_\nu)\in\mathbb{R}^\nu$ that
\[
\Psi_m(x,y,\eta) = \big(\beta_1 x + (1-\beta_1)\eta,\; \beta_2 y + (1-\beta_2)\,\mathrm{Pow}_2(\eta)\big) \tag{124}
\]
and
\[
\Phi_m(x,y) = \left( \frac{\gamma_m\, x_1}{1-(\beta_1)^m} \left[ \sqrt{\frac{y_1}{1-(\beta_2)^m}} + \varepsilon \right]^{-1}, \ldots, \frac{\gamma_m\, x_\nu}{1-(\beta_1)^m} \left[ \sqrt{\frac{y_\nu}{1-(\beta_2)^m}} + \varepsilon \right]^{-1} \right). \tag{125}
\]
Equations (124) and (125) in Framework 3.1 describe the Adam optimizer (cf. Kingma & Ba [38], e.g., E et al. [24, (32)–(33) in Section 4.2 and (90)–(91) in Section 5.2], and line 84 in Python code 1 in Section 4 below).

3.2 Heat equation

In this subsection we apply the proposed approximation algorithm to the heat equation (see (126) below). Assume Framework 2.10, assume for every $s,t\in[0,T]$, $x,w\in\mathbb{R}^d$, $m\in\mathbb{N}_0$ that $N = 1$, $d = 100$, $\nu = d(2d) + (2d)^2 + 2d = 2d(3d+1)$, $\varphi(x) = \|x\|^2_{\mathbb{R}^d}$, $H(s,t,x,w) = x + \sqrt{2}\,\mathrm{Id}_{\mathbb{R}^d}\,w$, assume that $\xi^{0,1}\colon\Omega\to\mathbb{R}^d$ is continuously uniformly distributed on $[0,1]^d$, and let $u = (u(t,x))_{(t,x)\in[0,T]\times\mathbb{R}^d}\in C^{1,2}([0,T]\times\mathbb{R}^d,\mathbb{R})$ be an at most polynomially growing function which satisfies for every $t\in[0,T]$, $x\in\mathbb{R}^d$ that $u(0,x) = \varphi(x)$ and
\[
\big(\tfrac{\partial u}{\partial t}\big)(t,x) = (\Delta_x u)(t,x). \tag{126}
\]
Combining, e.g., Lemma 3.2 below with, e.g., Hairer et al. [22, Corollary 4.17 and Remark 4.1] shows that for every $t\in[0,T]$, $x\in\mathbb{R}^d$ it holds that
\[
u(t,x) = \|x\|^2_{\mathbb{R}^d} + 2td. \tag{127}
\]
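The closed-form solution (127) is straightforward to sanity-check by Monte Carlo: for the SDE $dX_t = \sqrt{2}\,dW_t$ one has $X_T = x + \sqrt{2}\,W_T$ and hence $\mathbb{E}[\|X_T\|^2] = \|x\|^2 + 2Td$. A quick check (dimension kept small here purely for speed; this is a verification sketch, not one of the paper's experiments):

```python
import numpy as np

# Monte Carlo check of (127): for dX_t = sqrt(2) dW_t and phi(x) = ||x||^2,
# u(T, x) = E[phi(X_T^x)] = ||x||^2 + 2*T*d.
rng = np.random.default_rng(0)
d, T, n = 10, 1.0, 200_000
x = rng.uniform(0.0, 1.0, size=d)                 # a point in [0,1]^d
w = rng.normal(0.0, np.sqrt(T), size=(n, d))      # W_T for n independent paths
mc = np.mean(np.sum((x + np.sqrt(2.0) * w) ** 2, axis=1))
exact = np.sum(x ** 2) + 2.0 * T * d
```

With $2\cdot 10^5$ paths the relative Monte Carlo error is well below one percent.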
More informationMathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )
Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio (2014-2015) Etienne Tanré - Olivier Faugeras INRIA - Team Tosca October 22nd, 2014 E. Tanré (INRIA - Team Tosca) Mathematical
More informationOPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS
APPLICATIONES MATHEMATICAE 29,4 (22), pp. 387 398 Mariusz Michta (Zielona Góra) OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS Abstract. A martingale problem approach is used first to analyze
More informationA random perturbation approach to some stochastic approximation algorithms in optimization.
A random perturbation approach to some stochastic approximation algorithms in optimization. Wenqing Hu. 1 (Presentation based on joint works with Chris Junchi Li 2, Weijie Su 3, Haoyi Xiong 4.) 1. Department
More informationSelected Exercises on Expectations and Some Probability Inequalities
Selected Exercises on Expectations and Some Probability Inequalities # If E(X 2 ) = and E X a > 0, then P( X λa) ( λ) 2 a 2 for 0 < λ
More informationLipschitz continuity for solutions of Hamilton-Jacobi equation with Ornstein-Uhlenbeck operator
Lipschitz continuity for solutions of Hamilton-Jacobi equation with Ornstein-Uhlenbeck operator Thi Tuyen Nguyen Ph.D student of University of Rennes 1 Joint work with: Prof. E. Chasseigne(University of
More informationOn Stopping Times and Impulse Control with Constraint
On Stopping Times and Impulse Control with Constraint Jose Luis Menaldi Based on joint papers with M. Robin (216, 217) Department of Mathematics Wayne State University Detroit, Michigan 4822, USA (e-mail:
More informationSolutions to the Exercises in Stochastic Analysis
Solutions to the Exercises in Stochastic Analysis Lecturer: Xue-Mei Li 1 Problem Sheet 1 In these solution I avoid using conditional expectations. But do try to give alternative proofs once we learnt conditional
More informationNumerical discretisations of stochastic wave equations
Numerical discretisations of stochastic wave equations David Cohen Matematik och matematisk statistik \ UMIT Research Lab, Umeå universitet Institut für Mathematik, Universität Innsbruck Joint works with
More informationApproximation of BSDEs using least-squares regression and Malliavin weights
Approximation of BSDEs using least-squares regression and Malliavin weights Plamen Turkedjiev (turkedji@math.hu-berlin.de) 3rd July, 2012 Joint work with Prof. Emmanuel Gobet (E cole Polytechnique) Plamen
More informationERRATA: Probabilistic Techniques in Analysis
ERRATA: Probabilistic Techniques in Analysis ERRATA 1 Updated April 25, 26 Page 3, line 13. A 1,..., A n are independent if P(A i1 A ij ) = P(A 1 ) P(A ij ) for every subset {i 1,..., i j } of {1,...,
More informationA D VA N C E D P R O B A B I L - I T Y
A N D R E W T U L L O C H A D VA N C E D P R O B A B I L - I T Y T R I N I T Y C O L L E G E T H E U N I V E R S I T Y O F C A M B R I D G E Contents 1 Conditional Expectation 5 1.1 Discrete Case 6 1.2
More informationChapter 4: Monte-Carlo Methods
Chapter 4: Monte-Carlo Methods A Monte-Carlo method is a technique for the numerical realization of a stochastic process by means of normally distributed random variables. In financial mathematics, it
More informationLarge Deviations Principles for McKean-Vlasov SDEs, Skeletons, Supports and the law of iterated logarithm
Large Deviations Principles for McKean-Vlasov SDEs, Skeletons, Supports and the law of iterated logarithm Gonçalo dos Reis University of Edinburgh (UK) & CMA/FCT/UNL (PT) jointly with: W. Salkeld, U. of
More informationNonlinear representation, backward SDEs, and application to the Principal-Agent problem
Nonlinear representation, backward SDEs, and application to the Principal-Agent problem Ecole Polytechnique, France April 4, 218 Outline The Principal-Agent problem Formulation 1 The Principal-Agent problem
More informationThe Pedestrian s Guide to Local Time
The Pedestrian s Guide to Local Time Tomas Björk, Department of Finance, Stockholm School of Economics, Box 651, SE-113 83 Stockholm, SWEDEN tomas.bjork@hhs.se November 19, 213 Preliminary version Comments
More informationConvergence at first and second order of some approximations of stochastic integrals
Convergence at first and second order of some approximations of stochastic integrals Bérard Bergery Blandine, Vallois Pierre IECN, Nancy-Université, CNRS, INRIA, Boulevard des Aiguillettes B.P. 239 F-5456
More informationRegularity of the density for the stochastic heat equation
Regularity of the density for the stochastic heat equation Carl Mueller 1 Department of Mathematics University of Rochester Rochester, NY 15627 USA email: cmlr@math.rochester.edu David Nualart 2 Department
More informationDeep Learning for Partial Differential Equations (PDEs)
Deep Learning for Partial Differential Equations (PDEs) Kailai Xu kailaix@stanford.edu Bella Shi bshi@stanford.edu Shuyi Yin syin3@stanford.edu Abstract Partial differential equations (PDEs) have been
More informationLarge deviations and averaging for systems of slow fast stochastic reaction diffusion equations.
Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations. Wenqing Hu. 1 (Joint work with Michael Salins 2, Konstantinos Spiliopoulos 3.) 1. Department of Mathematics
More informationCourse: ESO-209 Home Work: 1 Instructor: Debasis Kundu
Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear
More informationLecture No 1 Introduction to Diffusion equations The heat equat
Lecture No 1 Introduction to Diffusion equations The heat equation Columbia University IAS summer program June, 2009 Outline of the lectures We will discuss some basic models of diffusion equations and
More informationMathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )
Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio (2014-2015) Etienne Tanré - Olivier Faugeras INRIA - Team Tosca November 26th, 2014 E. Tanré (INRIA - Team Tosca) Mathematical
More informationPartial Differential Equations with Applications to Finance Seminar 1: Proving and applying Dynkin s formula
Partial Differential Equations with Applications to Finance Seminar 1: Proving and applying Dynkin s formula Group 4: Bertan Yilmaz, Richard Oti-Aboagye and Di Liu May, 15 Chapter 1 Proving Dynkin s formula
More informationOn a class of stochastic differential equations in a financial network model
1 On a class of stochastic differential equations in a financial network model Tomoyuki Ichiba Department of Statistics & Applied Probability, Center for Financial Mathematics and Actuarial Research, University
More informationSOLUTIONS OF SEMILINEAR WAVE EQUATION VIA STOCHASTIC CASCADES
Communications on Stochastic Analysis Vol. 4, No. 3 010) 45-431 Serials Publications www.serialspublications.com SOLUTIONS OF SEMILINEAR WAVE EQUATION VIA STOCHASTIC CASCADES YURI BAKHTIN* AND CARL MUELLER
More informationProperties of an infinite dimensional EDS system : the Muller s ratchet
Properties of an infinite dimensional EDS system : the Muller s ratchet LATP June 5, 2011 A ratchet source : wikipedia Plan 1 Introduction : The model of Haigh 2 3 Hypothesis (Biological) : The population
More informationRobustness for a Liouville type theorem in exterior domains
Robustness for a Liouville type theorem in exterior domains Juliette Bouhours 1 arxiv:1207.0329v3 [math.ap] 24 Oct 2014 1 UPMC Univ Paris 06, UMR 7598, Laboratoire Jacques-Louis Lions, F-75005, Paris,
More informationfor all f satisfying E[ f(x) ] <.
. Let (Ω, F, P ) be a probability space and D be a sub-σ-algebra of F. An (H, H)-valued random variable X is independent of D if and only if P ({X Γ} D) = P {X Γ}P (D) for all Γ H and D D. Prove that if
More informationQuantization of stochastic processes with applications on Euler-Maruyama schemes
UPTEC F15 063 Examensarbete 30 hp Oktober 2015 Quantization of stochastic processes with applications on Euler-Maruyama schemes Viktor Edward Abstract Quantization of stochastic processes with applications
More informationLECTURE 2: LOCAL TIME FOR BROWNIAN MOTION
LECTURE 2: LOCAL TIME FOR BROWNIAN MOTION We will define local time for one-dimensional Brownian motion, and deduce some of its properties. We will then use the generalized Ray-Knight theorem proved in
More informationLAN property for sde s with additive fractional noise and continuous time observation
LAN property for sde s with additive fractional noise and continuous time observation Eulalia Nualart (Universitat Pompeu Fabra, Barcelona) joint work with Samy Tindel (Purdue University) Vlad s 6th birthday,
More informationKolmogorov Equations and Markov Processes
Kolmogorov Equations and Markov Processes May 3, 013 1 Transition measures and functions Consider a stochastic process {X(t)} t 0 whose state space is a product of intervals contained in R n. We define
More informationA NOTE ON STOCHASTIC INTEGRALS AS L 2 -CURVES
A NOTE ON STOCHASTIC INTEGRALS AS L 2 -CURVES STEFAN TAPPE Abstract. In a work of van Gaans (25a) stochastic integrals are regarded as L 2 -curves. In Filipović and Tappe (28) we have shown the connection
More information1 Independent increments
Tel Aviv University, 2008 Brownian motion 1 1 Independent increments 1a Three convolution semigroups........... 1 1b Independent increments.............. 2 1c Continuous time................... 3 1d Bad
More informationStrong Solutions and a Bismut-Elworthy-Li Formula for Mean-F. Formula for Mean-Field SDE s with Irregular Drift
Strong Solutions and a Bismut-Elworthy-Li Formula for Mean-Field SDE s with Irregular Drift Thilo Meyer-Brandis University of Munich joint with M. Bauer, University of Munich Conference for the 1th Anniversary
More informationSome Terminology and Concepts that We will Use, But Not Emphasize (Section 6.2)
Some Terminology and Concepts that We will Use, But Not Emphasize (Section 6.2) Statistical analysis is based on probability theory. The fundamental object in probability theory is a probability space,
More information4 Expectation & the Lebesgue Theorems
STA 205: Probability & Measure Theory Robert L. Wolpert 4 Expectation & the Lebesgue Theorems Let X and {X n : n N} be random variables on a probability space (Ω,F,P). If X n (ω) X(ω) for each ω Ω, does
More informationUnbiased simulation of stochastic differential equations using parametrix expansions
Submitted to Bernoulli Unbiased simulation of stochastic differential equations using parametrix expansions PATRIK ANDERSSON and ARTURO KOHATSU-HIGA Department of Statistics, Uppsala University, Box 53,
More information