Solving stochastic differential equations and Kolmogorov equations by means of deep learning


Christian Beck¹, Sebastian Becker², Philipp Grohs³, Nor Jaafari⁴, and Arnulf Jentzen⁵

arXiv: v1 [math.NA] 1 Jun 2018

1 Department of Mathematics, ETH Zurich, Zurich, Switzerland, christian.beck@math.ethz.ch
2 ZENAI AG, Zurich, Switzerland, sebastian.becker@zenai.ch
3 Faculty of Mathematics and Research Platform Data Science, University of Vienna, Vienna, Austria, philipp.grohs@univie.ac.at
4 ZENAI AG, Zurich, Switzerland, nor.jaafari@zenai.ch
5 Department of Mathematics, ETH Zurich, Zurich, Switzerland, arnulf.jentzen@sam.math.ethz.ch

June 4, 2018

Abstract

Stochastic differential equations (SDEs) and the Kolmogorov partial differential equations (PDEs) associated to them are widely used in models from engineering, finance, and the natural sciences. In particular, SDEs and Kolmogorov PDEs are frequently employed in models for the approximative pricing of financial derivatives. Kolmogorov PDEs and SDEs can typically not be solved explicitly, and it has been and remains an active topic of research to design and analyze numerical methods which are able to approximately solve them. Nearly all approximation methods for Kolmogorov PDEs in the literature either suffer from the curse of dimensionality or only provide approximations of the solution of the PDE at a single fixed space-time point. In this paper we derive and propose a numerical approximation method which aims to overcome both of the above mentioned drawbacks and intends to deliver a numerical approximation of the Kolmogorov PDE on an entire region [a, b]^d without suffering from the curse of dimensionality.
Numerical results on examples including the heat equation, the Black-Scholes model, the stochastic Lorenz equation, and the Heston model suggest that the proposed approximation algorithm is quite effective in high dimensions in terms of both accuracy and speed.

Contents

1 Introduction
2 Derivation and description of the proposed approximation algorithm
2.1 Kolmogorov partial differential equations (PDEs)
2.2 On stochastic differential equations and Kolmogorov PDEs
2.3 Formulation as minimization problem
2.4 Discretization of the stochastic differential equation
2.5 Deep artificial neural network approximations
2.6 Stochastic gradient descent-type minimization
2.7 Description of the algorithm in a special case
2.8 Description of the algorithm in the general case
3 Examples
3.1 Setting
3.2 Heat equation
3.3 Geometric Brownian motions
3.4 Black-Scholes model with correlated noise
3.5 Stochastic Lorenz equations
3.6 Heston model
4 Python source codes
4.1 Python source code for the algorithm
4.2 A Python source code associated to Subsection 3.2
4.3 A Python source code associated to Subsection 3.3
4.4 A Python source code associated to Subsection 3.4
4.5 A Python source code associated to Subsection 3.5
4.6 A Python source code associated to Subsection 3.6

1 Introduction

Stochastic differential equations (SDEs) and the Kolmogorov partial differential equations (PDEs) associated to them are widely used in models from engineering, finance, and the natural sciences. In particular, SDEs and Kolmogorov PDEs are frequently employed in models for the approximative pricing of financial derivatives. Kolmogorov PDEs and SDEs can typically not be solved explicitly, and it has been and remains an active topic of research to design and analyze numerical methods which are able to approximately solve them (see, e.g., [17], [2], [21], [27], [28], [29], [32], [4], [41], [42], [44], [45], [46], [47], [49], [52]). In particular, there are nowadays several different types of numerical approximation methods for Kolmogorov PDEs in the literature, including deterministic numerical approximation methods such as finite differences based approximation methods (cf., for example, [7], [8], [23], [43], [58], [56]) and finite elements based approximation methods (cf., for example, [9], [1], [59]) as well as random numerical approximation methods based on Monte Carlo methods (cf., for example, [17], [2]) and on discretizations of the underlying SDEs (cf., for example, [21], [27], [28], [29], [32], [4], [41], [42], [44], [45], [46], [47], [49], [52]). The above mentioned deterministic approximation methods for PDEs work quite efficiently in one or two space dimensions but cannot be used in the case of high-dimensional PDEs, as they suffer from the so-called curse of dimensionality (cf. Bellman [5]) in the sense that the computational effort of the considered approximation algorithm grows exponentially in the PDE dimension. The above mentioned random numerical approximation methods involving Monte Carlo approximations typically overcome this curse of dimensionality, but only provide approximations of the Kolmogorov PDE at a single fixed space-time point. The key contribution of this paper is to derive and propose a numerical approximation method which aims to overcome both of the above mentioned drawbacks and intends to deliver a numerical approximation of the Kolmogorov PDE on an entire region [a, b]^d without suffering from the curse of dimensionality. The numerical scheme which we propose in this work is inspired by recently developed deep learning based approximation algorithms for PDEs in the literature (cf., for example, [3], [4], [14], [15], [16], [24], [26], [51], [57]). To derive the proposed approximation scheme we first reformulate the considered Kolmogorov PDE as a suitable infinite dimensional stochastic optimization problem (see items (ii) and (iii) in Proposition 2.7 below for details).
This infinite dimensional stochastic optimization problem is then temporally discretized by means of suitable discretizations of the underlying SDE, and it is spatially discretized by means of fully connected deep artificial neural network approximations (see (17) in Subsection 2.6 as well as the subsections below). The resulting finite dimensional stochastic optimization problem is then solved by means of stochastic gradient descent type optimization algorithms (see (19) in Subsection 2.6, Framework 2.9 in Subsection 2.7, Framework 2.10 in Subsection 2.8, as well as (124)-(125) in Subsection 3.1). We test the proposed approximation method numerically on several examples of SDEs and PDEs (see the subsections of Section 3 below for details). The obtained numerical results indicate that the proposed approximation algorithm is quite effective in high dimensions in terms of both accuracy and speed. The remainder of this article is organized as follows. In Section 2 we derive the proposed approximation algorithm and we present a detailed description of it in a special case (see Subsection 2.7 below) as well as in the general case (see Subsection 2.8 below). In Section 3 we test the proposed algorithm numerically on several examples of SDEs and PDEs. The source codes employed for the numerical simulations in Section 3 are deferred to Section 4.
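The three-step pipeline just described (simulate the SDE, parametrize the candidate solution, minimize an empirical L² loss by stochastic gradient descent) can be illustrated in a self-contained NumPy sketch. The example below treats the d-dimensional heat equation (μ ≡ 0, σ = √2·I_d, φ(x) = ‖x‖²), for which X_T = ξ + √(2T)·Z with Z standard normal can be sampled exactly; the tiny one-hidden-layer network, the plain SGD update, and all hyperparameters are illustrative stand-ins for the deep networks and the stochastic gradient descent type methods described later in Section 2, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Heat equation u_t = Laplace_x u on R^d: mu = 0, sigma = sqrt(2) * I_d,
# initial condition phi(x) = ||x||^2, so u(T, x) = ||x||^2 + 2*d*T in closed form.
d, T, a, b = 2, 1.0, -1.0, 1.0

def sample_batch(m):
    """Sample xi ~ Unif([a,b]^d) and X_T = xi + sqrt(2*T) * Z (exact for this SDE)."""
    xi = rng.uniform(a, b, size=(m, d))
    x_T = xi + np.sqrt(2.0 * T) * rng.standard_normal((m, d))
    return xi, np.sum(x_T**2, axis=1)

# One-hidden-layer network v(x) = W2 . tanh(W1 x + b1) + b2 as the approximating class.
h = 32
W1 = rng.standard_normal((h, d)) / np.sqrt(d); b1 = np.zeros(h)
W2 = rng.standard_normal(h) / np.sqrt(h); b2 = 0.0

def sgd_step(xi, y, lr=3e-3):
    """One stochastic gradient step on the empirical loss mean(|v(xi) - phi(X_T)|^2)."""
    global W1, b1, W2, b2
    z = np.tanh(xi @ W1.T + b1)          # hidden activations, shape (m, h)
    pred = z @ W2 + b2
    r = pred - y
    g = 2.0 * r / len(y)                 # d(loss)/d(pred)
    gW2 = z.T @ g; gb2 = g.sum()
    dz = np.outer(g, W2) * (1.0 - z**2)  # backpropagate through tanh
    gW1 = dz.T @ xi; gb1 = dz.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1
    return np.mean(r**2)

losses = [sgd_step(*sample_batch(256)) for _ in range(3000)]
print(np.mean(losses[:100]), np.mean(losses[-100:]))  # the empirical loss decreases
```

The loss cannot reach zero: its minimum over all functions v is the conditional variance of φ(X_T) given ξ, which is exactly the mechanism exploited in Proposition 2.7 below.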

2 Derivation and description of the proposed approximation algorithm

In this section we describe the approximation problem which we intend to solve (see Subsection 2.1 below) and we derive and specify, in the subsections below, the numerical scheme which we suggest for solving this approximation problem (cf., for example, E et al. [14], Han et al. [24], Sirignano & Spiliopoulos [57], Beck et al. [3], Fujii, Takahashi, A., & Takahashi, M. [16], and Henry-Labordere [26] for related derivations and related approximation schemes).

2.1 Kolmogorov partial differential equations (PDEs)

Let T ∈ (0,∞), d ∈ N, let μ: R^d → R^d and σ: R^d → R^{d×d} be Lipschitz continuous functions, let φ: R^d → R be a function, and let u = (u(t,x))_{(t,x)∈[0,T]×R^d} ∈ C^{1,2}([0,T]×R^d, R) be a function with at most polynomially growing partial derivatives which satisfies for every t ∈ [0,T], x ∈ R^d that u(0,x) = φ(x) and

(∂u/∂t)(t,x) = ½ Trace_{R^d}( σ(x)[σ(x)]* (Hess_x u)(t,x) ) + ⟨ μ(x), (∇_x u)(t,x) ⟩_{R^d}.   (1)

Our goal is to approximately calculate the function R^d ∋ x ↦ u(T,x) ∈ R on some subset of R^d. To fix ideas we consider real numbers a, b ∈ R with a < b and we suppose that our goal is to approximately calculate the function [a,b]^d ∋ x ↦ u(T,x) ∈ R.

2.2 On stochastic differential equations and Kolmogorov PDEs

In this subsection we provide a probabilistic representation for the solutions of the PDE (1), that is, we recall the classical Feynman-Kac formula for the PDE (1) (cf., for example, Øksendal [50, Chapter 8]). Let (Ω, F, P) be a probability space with a normal filtration (F_t)_{t∈[0,T]}, let W: [0,T] × Ω → R^d be a standard (Ω, F, P, (F_t)_{t∈[0,T]})-Brownian motion, and for every x ∈ R^d let X^x = (X^x_t)_{t∈[0,T]}: [0,T] × Ω → R^d be an (F_t)_{t∈[0,T]}-adapted stochastic process with continuous sample paths which satisfies that for every t ∈ [0,T] it holds P-a.s. that

X^x_t = x + ∫_0^t μ(X^x_s) ds + ∫_0^t σ(X^x_s) dW_s.   (2)

The Feynman-Kac formula (cf., for example, Hairer et al.
[22, Corollary 4.17 and Remark 4.1]) and (1) hence yield that for every x ∈ R^d it holds that

u(T,x) = E[ u(0, X^x_T) ] = E[ φ(X^x_T) ].   (3)
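For a case where both sides of (3) are explicit, take d = 1, μ(x) = μ̄x, σ(x) = σ̄x (geometric Brownian motion) and φ(x) = x; then E[φ(X^x_T)] = x·e^{μ̄T}. The following sketch (constants chosen arbitrarily for illustration) compares a Monte Carlo average of φ(X^x_T) with this closed form:

```python
import numpy as np

rng = np.random.default_rng(1)
mu_bar, sigma_bar, T, x0, n_paths = 0.05, 0.2, 1.0, 100.0, 200_000

# Exact simulation of geometric Brownian motion at time T:
# X_T = x0 * exp((mu_bar - sigma_bar^2 / 2) * T + sigma_bar * W_T).
w_T = np.sqrt(T) * rng.standard_normal(n_paths)
x_T = x0 * np.exp((mu_bar - 0.5 * sigma_bar**2) * T + sigma_bar * w_T)

mc_estimate = x_T.mean()           # Monte Carlo estimate of E[phi(X_T)], phi(x) = x
exact = x0 * np.exp(mu_bar * T)    # closed-form expectation from Feynman-Kac
print(mc_estimate, exact)
```

The agreement improves like 1/√(n_paths), which is the dimension-independent Monte Carlo rate mentioned in the introduction; the price is that the estimate is for a single point x only.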

2.3 Formulation as minimization problem

In the next step we exploit (3) to formulate a minimization problem which is uniquely solved by the function [a,b]^d ∋ x ↦ u(T,x) ∈ R (cf. (1) above). For this we first recall the L²-minimization property of the expectation of a real-valued random variable (see Lemma 2.1 below). Then we extend this minimization result to certain random fields (see Proposition 2.2 below). Thereafter, we apply Proposition 2.2 to random fields in the context of the Feynman-Kac representation (3) to obtain Proposition 2.7 below. Proposition 2.7 provides a minimization problem (see, for instance, (93) below) which has the function [a,b]^d ∋ x ↦ u(T,x) ∈ R as the unique global minimizer. Our proof of Proposition 2.7 is based on the elementary auxiliary results in the lemmas below. For completeness we also present the proofs of these lemmas here. The statement and the proof of Lemma 2.3 are based on the proof of Da Prato & Zabczyk [13, Lemma 1.1].

Lemma 2.1. Let (Ω, F, P) be a probability space and let X: Ω → R be an F/B(R)-measurable random variable which satisfies E[|X|²] < ∞. Then

(i) it holds for every y ∈ R that

E[ |X − y|² ] = E[ |X − E[X]|² ] + |E[X] − y|²,   (4)

(ii) it holds that there exists a unique real number z ∈ R such that

E[ |X − z|² ] = inf_{y∈R} E[ |X − y|² ],   (5)

and

(iii) it holds that

E[ |X − E[X]|² ] = inf_{y∈R} E[ |X − y|² ].   (6)

Proof of Lemma 2.1. Observe that the fact that E[|X|²] < ∞ ensures that for every y ∈ R it holds that

E[ |X − y|² ] = E[ |X − E[X] + E[X] − y|² ]
= E[ |X − E[X]|² + 2(X − E[X])(E[X] − y) + |E[X] − y|² ]
= E[ |X − E[X]|² ] + 2(E[X] − y) E[ X − E[X] ] + |E[X] − y|²
= E[ |X − E[X]|² ] + |E[X] − y|².   (7)

This establishes item (i). Items (ii) and (iii) are immediate consequences of item (i). The proof of Lemma 2.1 is thus completed.
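The identity in item (i) and the minimization property in item (iii) are easy to check numerically. In the sketch below (with an arbitrarily chosen square-integrable distribution for X), the identity (4) even holds exactly for the empirical measure of the sample, since it is pure algebra:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=2.0, size=100_000)   # any square-integrable X
mean = x.mean()

def risk(y):
    """Empirical version of E[|X - y|^2]."""
    return np.mean((x - y) ** 2)

# Identity (4): E|X - y|^2 = E|X - E[X]|^2 + |E[X] - y|^2, minimized at y = E[X].
for y in (-1.0, 0.0, 1.0, 5.0):
    print(y, risk(y), risk(mean) + (mean - y) ** 2)
```

Each printed pair agrees up to floating-point round-off, and `risk(mean)` is smaller than `risk(y)` for every other y, which is exactly items (i) and (iii) for the sample measure.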

Proposition 2.2. Let a ∈ R, b ∈ (a,∞), let (Ω, F, P) be a probability space, let X = (X^x)_{x∈[a,b]^d}: [a,b]^d × Ω → R be a (B([a,b]^d) ⊗ F)/B(R)-measurable function, assume for every x ∈ [a,b]^d that E[|X^x|²] < ∞, and assume that the function [a,b]^d ∋ x ↦ E[X^x] ∈ R is continuous. Then

(i) it holds that there exists a unique continuous function u: [a,b]^d → R such that

∫_{[a,b]^d} E[ |X^x − u(x)|² ] dx = inf_{v∈C([a,b]^d,R)} ∫_{[a,b]^d} E[ |X^x − v(x)|² ] dx   (8)

and

(ii) it holds for every x ∈ [a,b]^d that u(x) = E[X^x].

Proof of Proposition 2.2. Observe that item (i) in Lemma 2.1 and the hypothesis that ∀ x ∈ [a,b]^d: E[|X^x|²] < ∞ ensure that for every function u: [a,b]^d → R and every x ∈ [a,b]^d it holds that

E[ |X^x − u(x)|² ] = E[ |X^x − E[X^x]|² ] + |E[X^x] − u(x)|².   (9)

Fubini's theorem (see, e.g., Klenke [39, Theorem 14.16]) hence proves that for every continuous function u: [a,b]^d → R it holds that

∫_{[a,b]^d} E[ |X^x − u(x)|² ] dx = ∫_{[a,b]^d} E[ |X^x − E[X^x]|² ] dx + ∫_{[a,b]^d} |E[X^x] − u(x)|² dx.   (10)

The hypothesis that the function [a,b]^d ∋ x ↦ E[X^x] ∈ R is continuous therefore demonstrates that

∫_{[a,b]^d} E[ |X^x − E[X^x]|² ] dx ≥ inf_{v∈C([a,b]^d,R)} ∫_{[a,b]^d} E[ |X^x − v(x)|² ] dx
= inf_{v∈C([a,b]^d,R)} ( ∫_{[a,b]^d} E[ |X^x − E[X^x]|² ] dx + ∫_{[a,b]^d} |E[X^x] − v(x)|² dx )
≥ ∫_{[a,b]^d} E[ |X^x − E[X^x]|² ] dx.   (11)

Hence, we obtain that

∫_{[a,b]^d} E[ |X^x − E[X^x]|² ] dx = inf_{v∈C([a,b]^d,R)} ∫_{[a,b]^d} E[ |X^x − v(x)|² ] dx.   (12)

Again the fact that the function [a,b]^d ∋ x ↦ E[X^x] ∈ R is continuous therefore proves that there exists a continuous function u: [a,b]^d → R such that

∫_{[a,b]^d} E[ |X^x − u(x)|² ] dx = inf_{v∈C([a,b]^d,R)} ∫_{[a,b]^d} E[ |X^x − v(x)|² ] dx.   (13)

Next observe that (10) and (12) yield that for every continuous function u: [a,b]^d → R with

∫_{[a,b]^d} E[ |X^x − u(x)|² ] dx = inf_{v∈C([a,b]^d,R)} ∫_{[a,b]^d} E[ |X^x − v(x)|² ] dx   (14)

it holds that

∫_{[a,b]^d} E[ |X^x − E[X^x]|² ] dx = inf_{v∈C([a,b]^d,R)} ∫_{[a,b]^d} E[ |X^x − v(x)|² ] dx = ∫_{[a,b]^d} E[ |X^x − u(x)|² ] dx
= ∫_{[a,b]^d} E[ |X^x − E[X^x]|² ] dx + ∫_{[a,b]^d} |E[X^x] − u(x)|² dx.   (15)

Hence, we obtain that for every continuous function u: [a,b]^d → R with

∫_{[a,b]^d} E[ |X^x − u(x)|² ] dx = inf_{v∈C([a,b]^d,R)} ∫_{[a,b]^d} E[ |X^x − v(x)|² ] dx   (16)

it holds that

∫_{[a,b]^d} |E[X^x] − u(x)|² dx = 0.   (17)

This and again the hypothesis that the function [a,b]^d ∋ x ↦ E[X^x] ∈ R is continuous yield that for every continuous function u: [a,b]^d → R with

∫_{[a,b]^d} E[ |X^x − u(x)|² ] dx = inf_{v∈C([a,b]^d,R)} ∫_{[a,b]^d} E[ |X^x − v(x)|² ] dx   (18)

and every x ∈ [a,b]^d it holds that u(x) = E[X^x]. Combining this with (13) completes the proof of Proposition 2.2.

Lemma 2.3 (Projections in metric spaces). Let (E, d) be a metric space, let n ∈ N, e_1, e_2, ..., e_n ∈ E, and let P: E → E be the function which satisfies for every x ∈ E that

P(x) = e_{min{ k ∈ {1,2,...,n} : d(x,e_k) = min{d(x,e_1), d(x,e_2), ..., d(x,e_n)} }}.   (19)

Then

(i) it holds for every x ∈ E that

d(x, P(x)) = min_{k∈{1,2,...,n}} d(x, e_k)   (20)

and

(ii) it holds for every A ⊆ E that P⁻¹(A) ∈ B(E).

Proof of Lemma 2.3. Throughout this proof let D = (D_1, ..., D_n): E → R^n be the function which satisfies for every x ∈ E that

D(x) = (D_1(x), D_2(x), ..., D_n(x)) = (d(x,e_1), d(x,e_2), ..., d(x,e_n)).   (21)

Note that (19) ensures that for every x ∈ E it holds that

d(x, P(x)) = d(x, e_{min{k∈{1,...,n}: d(x,e_k)=min{d(x,e_1),...,d(x,e_n)}}}) = min_{k∈{1,2,...,n}} d(x, e_k).   (22)

This establishes item (i). It thus remains to prove item (ii). For this observe that the fact that the function d: E × E → [0,∞) is continuous ensures that the function D: E → R^n is continuous. Hence, we obtain that the function D: E → R^n is B(E)/B(R^n)-measurable. Next note that item (i) demonstrates that for every k ∈ {1,2,...,n}, x ∈ P⁻¹({e_k}) it holds that

d(x, e_k) = d(x, P(x)) = min_{l∈{1,2,...,n}} d(x, e_l).   (23)

Hence, we obtain that for every k ∈ {1,2,...,n}, x ∈ P⁻¹({e_k}) it holds that

k ≥ min{ l ∈ {1,2,...,n} : d(x,e_l) = min{d(x,e_1), d(x,e_2), ..., d(x,e_n)} }.   (24)

Moreover, note that (19) ensures that for every k ∈ {1,2,...,n}, x ∈ P⁻¹({e_k}) it holds that

min{ l ∈ {1,2,...,n} : d(x,e_l) = min_{u∈{1,2,...,n}} d(x,e_u) } ∈ { l ∈ {1,2,...,n} : e_l = e_k }.   (25)

Therefore, we obtain that for every k ∈ {1,2,...,n}, x ∈ P⁻¹({e_k}) with e_k ∉ {e_1, ..., e_{k−1}} it holds that

min{ l ∈ {1,2,...,n} : d(x,e_l) = min_{u∈{1,2,...,n}} d(x,e_u) } ≥ k.   (26)

Combining this with (24) yields that for every k ∈ {1,2,...,n}, x ∈ P⁻¹({e_k}) with e_k ∉ {e_1, ..., e_{k−1}} it holds that

min{ l ∈ {1,2,...,n} : d(x,e_l) = min_{u∈{1,2,...,n}} d(x,e_u) } = k.   (27)

Hence, we obtain that for every k ∈ {1,2,...,n} with e_k ∉ {e_1, ..., e_{k−1}} it holds that

P⁻¹({e_k}) ⊆ { x ∈ E : min{ l ∈ {1,2,...,n} : d(x,e_l) = min_{u∈{1,2,...,n}} d(x,e_u) } = k }.   (28)

This and (19) show that for every k ∈ {1,2,...,n} with e_k ∉ {e_1, ..., e_{k−1}} it holds that

P⁻¹({e_k}) = { x ∈ E : min{ l ∈ {1,2,...,n} : d(x,e_l) = min_{u∈{1,2,...,n}} d(x,e_u) } = k }.   (29)

Combining (21) with the fact that the function D: E → R^n is B(E)/B(R^n)-measurable therefore demonstrates that for every k ∈ {1,2,...,n} with e_k ∉ {e_1, ..., e_{k−1}} it holds that

P⁻¹({e_k}) = { x ∈ E : min{ l ∈ {1,2,...,n} : D_l(x) = min_{u∈{1,2,...,n}} D_u(x) } = k }
= { x ∈ E : (∀ l ∈ {1,...,k−1}: D_k(x) < D_l(x)) and (∀ l ∈ {1,2,...,n}: D_k(x) ≤ D_l(x)) }
= [ ∩_{l=1}^{k−1} { x ∈ E : D_k(x) < D_l(x) } ] ∩ [ ∩_{l=1}^{n} { x ∈ E : D_k(x) ≤ D_l(x) } ] ∈ B(E).   (30)

Hence, we obtain that for every f ∈ {e_1, e_2, ..., e_n} it holds that

P⁻¹({f}) ∈ B(E).   (31)

Therefore, we obtain that for every A ⊆ E it holds that

P⁻¹(A) = P⁻¹(A ∩ {e_1, e_2, ..., e_n}) = ∪_{f ∈ A ∩ {e_1,...,e_n}} P⁻¹({f}) ∈ B(E).   (32)

This establishes item (ii). The proof of Lemma 2.3 is thus completed.

Lemma 2.4. Let (E, d) be a separable metric space, let (Ẽ, δ) be a metric space, let (Ω, F) be a measurable space, let X: E × Ω → Ẽ be a function, assume for every e ∈ E that the function Ω ∋ ω ↦ X(e, ω) ∈ Ẽ is F/B(Ẽ)-measurable, and assume for every ω ∈ Ω that the function E ∋ e ↦ X(e, ω) ∈ Ẽ is continuous. Then it holds that the function X: E × Ω → Ẽ is (B(E) ⊗ F)/B(Ẽ)-measurable.

Proof of Lemma 2.4. Throughout this proof let (e_m)_{m∈N} ⊆ E be a sequence which satisfies that the closure of {e_m : m ∈ N} equals E, let P_n: E → E, n ∈ N, be the functions which satisfy for every n ∈ N, x ∈ E that

P_n(x) = e_{min{ k ∈ {1,2,...,n} : d(x,e_k) = min{d(x,e_1), d(x,e_2), ..., d(x,e_n)} }},   (33)

and let X_n: E × Ω → Ẽ, n ∈ N, be the functions which satisfy for every n ∈ N, x ∈ E, ω ∈ Ω that

X_n(x, ω) = X(P_n(x), ω).   (34)

Note that (34) shows that for all n ∈ N, B ∈ B(Ẽ) it holds that

(X_n)⁻¹(B) = { (x,ω) ∈ E × Ω : X_n(x,ω) ∈ B } = ∪_{y∈Im(P_n)} ( [(X_n)⁻¹(B)] ∩ [(P_n)⁻¹({y}) × Ω] )
= ∪_{y∈Im(P_n)} { (x,ω) ∈ E × Ω : X_n(x,ω) ∈ B and x ∈ (P_n)⁻¹({y}) }
= ∪_{y∈Im(P_n)} { (x,ω) ∈ E × Ω : X(P_n(x),ω) ∈ B and x ∈ (P_n)⁻¹({y}) }.   (35)

Item (ii) in Lemma 2.3 hence implies that for all n ∈ N, B ∈ B(Ẽ) it holds that

(X_n)⁻¹(B) = ∪_{y∈Im(P_n)} { (x,ω) ∈ E × Ω : X(y,ω) ∈ B and x ∈ (P_n)⁻¹({y}) }
= ∪_{y∈Im(P_n)} ( { (x,ω) ∈ E × Ω : X(y,ω) ∈ B } ∩ [(P_n)⁻¹({y}) × Ω] )
= ∪_{y∈Im(P_n)} ( [ E × (X(y,·))⁻¹(B) ] ∩ [ (P_n)⁻¹({y}) × Ω ] ) ∈ B(E) ⊗ F.   (36)

This proves that for every n ∈ N it holds that the function X_n is (B(E) ⊗ F)/B(Ẽ)-measurable. In addition, note that item (i) in Lemma 2.3 and the hypothesis that for every ω ∈ Ω it holds that the function E ∋ x ↦ X(x,ω) ∈ Ẽ is continuous imply that for every x ∈ E, ω ∈ Ω it holds that

lim_{n→∞} X_n(x,ω) = lim_{n→∞} X(P_n(x),ω) = X(x,ω).   (37)

Combining this with the fact that for every n ∈ N it holds that the function X_n: E × Ω → Ẽ is (B(E) ⊗ F)/B(Ẽ)-measurable shows that the function X: E × Ω → Ẽ is (B(E) ⊗ F)/B(Ẽ)-measurable. The proof of Lemma 2.4 is thus completed.
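For concreteness, the projection P of Lemma 2.3 (equivalently, the P_n used in the proof of Lemma 2.4) can be realized in code. The sketch below uses the Euclidean metric on R² and a point list with a deliberate duplicate; `np.argmin` returns the first minimizer, which is exactly the smallest-index tie-breaking of (19). The points themselves are illustrative:

```python
import numpy as np

def project(x, points):
    """Return the element of `points` nearest to x in the Euclidean metric,
    breaking ties toward the smallest index, as in (19)."""
    dists = np.linalg.norm(points - x, axis=1)
    return points[np.argmin(dists)]        # np.argmin picks the first minimizer

# e_3 duplicates e_2, mirroring the duplicate-handling in the proof of Lemma 2.3.
e = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
print(project(np.array([0.9, 0.1]), e))    # nearest point is e_2 = (1, 0)
print(project(np.array([0.5, 0.0]), e))    # tie between e_1 and e_2: e_1 wins
```

The measurability argument of Lemma 2.3 is exactly what makes such a coordinate-wise comparison well defined: the preimage of each point is cut out by finitely many inequalities between the distance functions D_k.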

Lemma 2.5. Let (Ω, F, P) be a probability space, let (E, d) and (Ẽ, δ) be separable metric spaces, let X_n: Ω → E, n ∈ N_0, be random variables which satisfy for every ε ∈ (0,∞) that

lim sup_{n→∞} P( d(X_n, X_0) ≥ ε ) = 0,   (38)

and let Φ: E → Ẽ be a continuous function. Then it holds for every ε ∈ (0,∞) that

lim sup_{n→∞} P( δ(Φ(X_n), Φ(X_0)) ≥ ε ) = 0.   (39)

Proof of Lemma 2.5. Note that (38), e.g., Cox et al. [12, Lemma 2.4], and, e.g., Hutzenthaler et al. [35, Lemma 4.2] establish (39). The proof of Lemma 2.5 is thus completed.

Lemma 2.6. Let d, m ∈ N, T ∈ (0,∞), L, a ∈ R, b ∈ (a,∞), let μ: R^d → R^d and σ: R^d → R^{d×m} be functions which satisfy for every x, y ∈ R^d that

max{ ‖μ(x) − μ(y)‖_{R^d}, ‖σ(x) − σ(y)‖_{HS(R^m,R^d)} } ≤ L ‖x − y‖_{R^d},

let Φ: C([0,T], R^d) → R be an at most polynomially growing continuous function, let (Ω, F, P) be a probability space with a normal filtration (F_t)_{t∈[0,T]}, let ξ: Ω → [a,b]^d be a continuously uniformly distributed F_0/B([a,b]^d)-measurable random variable, let W: [0,T] × Ω → R^m be a standard (F_t)_{t∈[0,T]}-Brownian motion, for every x ∈ [a,b]^d let X^x = (X^x_t)_{t∈[0,T]}: [0,T] × Ω → R^d be an (F_t)_{t∈[0,T]}-adapted stochastic process with continuous sample paths which satisfies that for every t ∈ [0,T] it holds P-a.s. that

X^x_t = x + ∫_0^t μ(X^x_s) ds + ∫_0^t σ(X^x_s) dW_s,   (40)

and let X: [0,T] × Ω → R^d be an (F_t)_{t∈[0,T]}-adapted stochastic process with continuous sample paths which satisfies that for every t ∈ [0,T] it holds P-a.s. that

X_t = ξ + ∫_0^t μ(X_s) ds + ∫_0^t σ(X_s) dW_s.   (41)

Then

(i) it holds for every x ∈ [a,b]^d that the functions Ω ∋ ω ↦ Φ((X^x_t(ω))_{t∈[0,T]}) ∈ R and Ω ∋ ω ↦ Φ((X_t(ω))_{t∈[0,T]}) ∈ R are F/B(R)-measurable,

(ii) it holds for every p ∈ [2,∞), x, y ∈ [a,b]^d that

( E[ sup_{t∈[0,T]} ‖X^x_t − X^y_t‖^p_{R^d} ] )^{1/p} ≤ √2 exp( L²T [√p + √T]² ) ‖x − y‖_{R^d},   (42)

(iii) it holds for every x ∈ [a,b]^d that E[ |Φ((X^x_t)_{t∈[0,T]})| + |Φ((X_t)_{t∈[0,T]})| ] < ∞,

(iv) it holds that the function [a,b]^d ∋ x ↦ E[Φ((X^x_t)_{t∈[0,T]})] ∈ R is continuous, and

(v) it holds that

E[ Φ((X_t)_{t∈[0,T]}) ] = (1/(b−a)^d) ∫_{[a,b]^d} E[ Φ((X^x_t)_{t∈[0,T]}) ] dx.   (43)

Proof of Lemma 2.6. Throughout this proof let c ∈ [1,∞) be a real number which satisfies for every w ∈ C([0,T], R^d) that

|Φ(w)| ≤ c [ 1 + sup_{t∈[0,T]} ‖w_t‖_{R^d} ]^c,   (44)

let p_t: C([0,T], R^d) → R^d, t ∈ [0,T], be the functions which satisfy for every t ∈ [0,T], w = (w_s)_{s∈[0,T]} ∈ C([0,T], R^d) that p_t(w) = w_t, and let Ψ^N_{x,w}: [0,T] → R^d, N ∈ N, x ∈ R^d, w ∈ C([0,T], R^m), be the functions which satisfy for every w ∈ C([0,T], R^m), x ∈ R^d, N ∈ N, n ∈ {0,1,...,N−1}, t ∈ [nT/N, (n+1)T/N] that Ψ^N_{x,w}(0) = x and

Ψ^N_{x,w}(t) = Ψ^N_{x,w}(nT/N) + (tN/T − n) [ μ(Ψ^N_{x,w}(nT/N)) T/N + σ(Ψ^N_{x,w}(nT/N)) ( w_{(n+1)T/N} − w_{nT/N} ) ].   (45)

Observe that the fact that the Borel sigma-algebra B(C([0,T], R^d)) is generated by the set ∪_{t∈[0,T]} ∪_{A∈B(R^d)} {(p_t)⁻¹(A)} (cf., for example, Klenke [39, Theorem 21.31]), the hypothesis that for every x ∈ [a,b]^d it holds that X^x: [0,T] × Ω → R^d is an (F_t)_{t∈[0,T]}-adapted stochastic process with continuous sample paths, and the hypothesis that X: [0,T] × Ω → R^d is an (F_t)_{t∈[0,T]}-adapted stochastic process with continuous sample paths demonstrate that the functions

Ω ∋ ω ↦ (X^x_t(ω))_{t∈[0,T]} ∈ C([0,T], R^d)   (46)

and

Ω ∋ ω ↦ (X_t(ω))_{t∈[0,T]} ∈ C([0,T], R^d)   (47)

are F/B(C([0,T], R^d))-measurable. Combining this with the fact that the function Φ: C([0,T], R^d) → R is B(C([0,T], R^d))/B(R)-measurable implies that for every x ∈ [a,b]^d it holds that the functions Ω ∋ ω ↦ Φ((X^x_t(ω))_{t∈[0,T]}) ∈ R and Ω ∋ ω ↦ Φ((X_t(ω))_{t∈[0,T]}) ∈ R are F/B(R)-measurable. This proves item (i).
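Evaluated at the grid points t = nT/N, the linearly interpolated scheme (45) is the classical Euler-Maruyama method. The sketch below (constants illustrative) applies it to one-dimensional geometric Brownian motion, where the SDE can also be solved exactly, and observes the strong error at time T shrink as N grows:

```python
import numpy as np

def euler_maruyama(x0, mu, sigma, w, T):
    """Psi^N_{x,w} of (45) evaluated at t = T, vectorized over paths:
    w has shape (n_paths, N + 1) and holds the Brownian path at times n*T/N."""
    N = w.shape[1] - 1
    dt = T / N
    x = np.full(w.shape[0], float(x0))
    for n in range(N):
        x = x + mu(x) * dt + sigma(x) * (w[:, n + 1] - w[:, n])
    return x

rng = np.random.default_rng(3)
mu_bar, sigma_bar, T, x0, n_paths = 0.1, 0.3, 1.0, 1.0, 20_000

errors = []
for N in (4, 16, 64):
    dw = np.sqrt(T / N) * rng.standard_normal((n_paths, N))
    w = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dw, axis=1)], axis=1)
    approx = euler_maruyama(x0, lambda x: mu_bar * x, lambda x: sigma_bar * x, w, T)
    exact = x0 * np.exp((mu_bar - 0.5 * sigma_bar**2) * T + sigma_bar * w[:, -1])
    errors.append(np.mean(np.abs(approx - exact)))  # strong L1 error at time T
print(errors)  # expected to decrease, roughly at the strong rate N^(-1/2)
```

This pathwise convergence (in probability, uniformly in time) is precisely what (49) and (50) below assert for the general Lipschitz setting of the lemma.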
Next observe that (45), the hypothesis that μ: R^d → R^d and σ: R^d → R^{d×m} are globally Lipschitz continuous, and the fact that for every p ∈ (0,∞) it holds that E[‖ξ‖^p_{R^d}] < ∞ ensure that for every p ∈ (0,∞) it holds that

sup_{N∈N} sup_{x∈[a,b]^d} ( E[ sup_{t∈[0,T]} ‖Ψ^N_{x,W}(t)‖^p_{R^d} ] + E[ sup_{t∈[0,T]} ‖Ψ^N_{ξ,W}(t)‖^p_{R^d} ] )
= sup_{N∈N} sup_{x∈[a,b]^d} ( E[ max_{n∈{0,1,...,N}} ‖Ψ^N_{x,W}(nT/N)‖^p_{R^d} ] + E[ max_{n∈{0,1,...,N}} ‖Ψ^N_{ξ,W}(nT/N)‖^p_{R^d} ] ) < ∞   (48)

(cf., for example, Kloeden & Platen [41, Section 1.6]). Next note that (40), (41), (45), the fact that μ: R^d → R^d and σ: R^d → R^{d×m} are locally Lipschitz continuous functions, and, e.g., Hutzenthaler & Jentzen [32, Theorem 3.3] ensure that for every x ∈ [a,b]^d, ε ∈ (0,∞) it holds that

lim sup_{N→∞} P( sup_{t∈[0,T]} ‖X^x_t − Ψ^N_{x,W}(t)‖_{R^d} ≥ ε ) = 0   (49)

and

lim sup_{N→∞} P( sup_{t∈[0,T]} ‖X_t − Ψ^N_{ξ,W}(t)‖_{R^d} ≥ ε ) = 0.   (50)

Combining (46), (47), and, e.g., Hutzenthaler & Jentzen [32, Lemma 3.1] hence demonstrates that for every x ∈ [a,b]^d, p ∈ (0,∞) it holds that

E[ sup_{t∈[0,T]} ‖X^x_t‖^p_{R^d} ] ≤ lim inf_{N→∞} E[ sup_{t∈[0,T]} ‖Ψ^N_{x,W}(t)‖^p_{R^d} ] ≤ sup_{N∈N} E[ sup_{t∈[0,T]} ‖Ψ^N_{x,W}(t)‖^p_{R^d} ]   (51)

and

E[ sup_{t∈[0,T]} ‖X_t‖^p_{R^d} ] ≤ lim inf_{N→∞} E[ sup_{t∈[0,T]} ‖Ψ^N_{ξ,W}(t)‖^p_{R^d} ] ≤ sup_{N∈N} E[ sup_{t∈[0,T]} ‖Ψ^N_{ξ,W}(t)‖^p_{R^d} ].   (52)

This and (48) assure that for every p ∈ (0,∞) it holds that

sup_{x∈[a,b]^d} ( E[ sup_{t∈[0,T]} ‖X^x_t‖^p_{R^d} ] + E[ sup_{t∈[0,T]} ‖X_t‖^p_{R^d} ] ) < ∞.   (53)

Combining (44) and the fact that ∀ r ∈ (0,∞), a, b ∈ R: |a + b|^r ≤ 2^r (|a|^r + |b|^r) therefore demonstrates that for every p ∈ (0,∞) it holds that

sup_{x∈[a,b]^d} ( E[ |Φ((X^x_t)_{t∈[0,T]})|^p ] + E[ |Φ((X_t)_{t∈[0,T]})|^p ] )
≤ sup_{x∈[a,b]^d} ( c^p E[ (1 + sup_{t∈[0,T]} ‖X^x_t‖_{R^d})^{cp} ] + c^p E[ (1 + sup_{t∈[0,T]} ‖X_t‖_{R^d})^{cp} ] )
≤ 2^{cp} c^p sup_{x∈[a,b]^d} E[ 2 + sup_{t∈[0,T]} ‖X^x_t‖^{cp}_{R^d} + sup_{t∈[0,T]} ‖X_t‖^{cp}_{R^d} ] < ∞.   (54)

This establishes item (iii). In the next step we observe that (40) ensures that for every x, y ∈ [a,b]^d, t ∈ [0,T] it holds P-a.s. that

X^x_t − X^y_t = x − y + ∫_0^t ( μ(X^x_s) − μ(X^y_s) ) ds + ∫_0^t ( σ(X^x_s) − σ(X^y_s) ) dW_s.   (55)

The triangle inequality hence ensures that for every x, y ∈ [a,b]^d, t ∈ [0,T] it holds P-a.s. that

sup_{s∈[0,t]} ‖X^x_s − X^y_s‖_{R^d}
≤ ‖x − y‖_{R^d} + ∫_0^t ‖μ(X^x_r) − μ(X^y_r)‖_{R^d} dr + sup_{s∈[0,t]} ‖ ∫_0^s ( σ(X^x_r) − σ(X^y_r) ) dW_r ‖_{R^d}
≤ ‖x − y‖_{R^d} + L ∫_0^t ‖X^x_r − X^y_r‖_{R^d} dr + sup_{s∈[0,t]} ‖ ∫_0^s ( σ(X^x_r) − σ(X^y_r) ) dW_r ‖_{R^d}.   (56)

Therefore, we obtain for every p ∈ [1,∞), x, y ∈ [a,b]^d, t ∈ [0,T] that

( E[ sup_{s∈[0,t]} ‖X^x_s − X^y_s‖^p_{R^d} ] )^{1/p}
≤ ‖x − y‖_{R^d} + L ∫_0^t ( E[ ‖X^x_r − X^y_r‖^p_{R^d} ] )^{1/p} dr + ( E[ sup_{s∈[0,t]} ‖ ∫_0^s ( σ(X^x_r) − σ(X^y_r) ) dW_r ‖^p_{R^d} ] )^{1/p}.   (57)

The Burkholder-Davis-Gundy type inequality in Da Prato & Zabczyk [13, Lemma 7.2] hence shows that for every p ∈ [2,∞), x, y ∈ [a,b]^d, t ∈ [0,T] it holds that

( E[ sup_{s∈[0,t]} ‖X^x_s − X^y_s‖^p_{R^d} ] )^{1/p}
≤ ‖x − y‖_{R^d} + L ∫_0^t ( E[ ‖X^x_r − X^y_r‖^p_{R^d} ] )^{1/p} dr + √p [ ∫_0^t ( E[ ‖σ(X^x_r) − σ(X^y_r)‖^p_{HS(R^m,R^d)} ] )^{2/p} dr ]^{1/2}.   (58)

This demonstrates that for every p ∈ [2,∞), x, y ∈ [a,b]^d, t ∈ [0,T] it holds that

( E[ sup_{s∈[0,t]} ‖X^x_s − X^y_s‖^p_{R^d} ] )^{1/p}
≤ ‖x − y‖_{R^d} + L ∫_0^t ( E[ ‖X^x_r − X^y_r‖^p_{R^d} ] )^{1/p} dr + L√p [ ∫_0^t ( E[ ‖X^x_r − X^y_r‖^p_{R^d} ] )^{2/p} dr ]^{1/2}.   (59)

Hölder's inequality hence proves that for every p ∈ [2,∞), x, y ∈ [a,b]^d, t ∈ [0,T] it holds that

( E[ sup_{s∈[0,t]} ‖X^x_s − X^y_s‖^p_{R^d} ] )^{1/p}
≤ ‖x − y‖_{R^d} + L√t [ ∫_0^t ( E[ ‖X^x_r − X^y_r‖^p_{R^d} ] )^{2/p} dr ]^{1/2} + L√p [ ∫_0^t ( E[ ‖X^x_r − X^y_r‖^p_{R^d} ] )^{2/p} dr ]^{1/2}
≤ ‖x − y‖_{R^d} + L [√p + √T] [ ∫_0^t ( E[ ‖X^x_r − X^y_r‖^p_{R^d} ] )^{2/p} dr ]^{1/2}.   (60)

The fact that ∀ v, w ∈ R: |v + w|² ≤ 2v² + 2w² therefore shows that for every p ∈ [2,∞), x, y ∈ [a,b]^d, t ∈ [0,T] it holds that

( E[ sup_{s∈[0,t]} ‖X^x_s − X^y_s‖^p_{R^d} ] )^{2/p}
≤ 2 ‖x − y‖²_{R^d} + 2L² [√p + √T]² ∫_0^t ( E[ ‖X^x_r − X^y_r‖^p_{R^d} ] )^{2/p} dr
≤ 2 ‖x − y‖²_{R^d} + 2L² [√p + √T]² ∫_0^t ( E[ sup_{s∈[0,r]} ‖X^x_s − X^y_s‖^p_{R^d} ] )^{2/p} dr.   (61)

Combining the Gronwall inequality (cf., e.g., Andersson et al. [2, Lemma 2.6], applied with α = 0, β = 0, a = 2‖x − y‖²_{R^d}, b = 2L²[√p + √T]², e = ([0,T] ∋ t ↦ (E[sup_{s∈[0,t]} ‖X^x_s − X^y_s‖^p_{R^d}])^{2/p} ∈ [0,∞]) in the notation of that lemma) and (53) hence establishes that for every p ∈ [2,∞), x, y ∈ [a,b]^d, t ∈ [0,T] it holds that

( E[ sup_{s∈[0,t]} ‖X^x_s − X^y_s‖^p_{R^d} ] )^{2/p} ≤ 2 ‖x − y‖²_{R^d} exp( 2L²t [√p + √T]² ).   (62)

Therefore, we obtain that for every p ∈ [2,∞), x, y ∈ [a,b]^d it holds that

( E[ sup_{t∈[0,T]} ‖X^x_t − X^y_t‖^p_{R^d} ] )^{1/p} ≤ √2 exp( L²T [√p + √T]² ) ‖x − y‖_{R^d}.   (63)

This establishes item (ii). Next observe that item (ii) and Jensen's inequality imply that for every p ∈ (0,∞) it holds that

sup_{x,y∈[a,b]^d, x≠y} [ ( E[ sup_{t∈[0,T]} ‖X^x_t − X^y_t‖^p_{R^d} ] )^{1/p} / ‖x − y‖_{R^d} ] < ∞.   (64)

The hypothesis that the function Φ: C([0,T], R^d) → R is continuous and Lemma 2.5 hence ensure that for every ε ∈ (0,∞), x ∈ [a,b]^d, and every sequence (x_n)_{n∈N} ⊆ [a,b]^d with lim sup_{n→∞} ‖x − x_n‖_{R^d} = 0 it holds that

lim sup_{n→∞} P( |Φ((X^x_t)_{t∈[0,T]}) − Φ((X^{x_n}_t)_{t∈[0,T]})| ≥ ε ) = 0.   (65)

Combining (54) with, e.g., Hutzenthaler et al. [35, Proposition 4.5] therefore implies that for every x ∈ [a,b]^d and every sequence (x_n)_{n∈N} ⊆ [a,b]^d with lim sup_{n→∞} ‖x − x_n‖_{R^d} = 0 it holds that

lim sup_{n→∞} E[ |Φ((X^x_t)_{t∈[0,T]}) − Φ((X^{x_n}_t)_{t∈[0,T]})| ] = 0.   (66)

This establishes item (iv). In the next step we observe that (49) and (50) ensure that for every ε ∈ (0,∞), x ∈ [a,b]^d it holds that

lim sup_{N→∞} [ P( sup_{t∈[0,T]} ‖Ψ^N_{x,W}(t) − X^x_t‖_{R^d} ≥ ε ) + P( sup_{t∈[0,T]} ‖Ψ^N_{ξ,W}(t) − X_t‖_{R^d} ≥ ε ) ] = 0.   (67)

The hypothesis that the function Φ is continuous and Lemma 2.5 therefore demonstrate that for every ε ∈ (0,∞), x ∈ [a,b]^d it holds that

lim sup_{N→∞} P( |Φ(Ψ^N_{x,W}) − Φ((X^x_t)_{t∈[0,T]})| + |Φ(Ψ^N_{ξ,W}) − Φ((X_t)_{t∈[0,T]})| ≥ ε ) = 0.   (68)

Next observe that (44) assures that for every N ∈ N, x ∈ [a,b]^d, p ∈ (0,∞) it holds that

E[ |Φ(Ψ^N_{x,W}) − Φ((X^x_t)_{t∈[0,T]})|^p + |Φ(Ψ^N_{ξ,W}) − Φ((X_t)_{t∈[0,T]})|^p ]
≤ 2^p E[ |Φ(Ψ^N_{x,W})|^p + |Φ((X^x_t)_{t∈[0,T]})|^p ] + 2^p E[ |Φ(Ψ^N_{ξ,W})|^p + |Φ((X_t)_{t∈[0,T]})|^p ]
≤ 2^p c^p E[ (1 + sup_{t∈[0,T]} ‖Ψ^N_{x,W}(t)‖_{R^d})^{cp} + (1 + sup_{t∈[0,T]} ‖X^x_t‖_{R^d})^{cp} ]
  + 2^p c^p E[ (1 + sup_{t∈[0,T]} ‖Ψ^N_{ξ,W}(t)‖_{R^d})^{cp} + (1 + sup_{t∈[0,T]} ‖X_t‖_{R^d})^{cp} ]
≤ 4^{cp} c^p E[ 2 + sup_{t∈[0,T]} ‖Ψ^N_{x,W}(t)‖^{cp}_{R^d} + sup_{t∈[0,T]} ‖X^x_t‖^{cp}_{R^d} ]
  + 4^{cp} c^p E[ 2 + sup_{t∈[0,T]} ‖Ψ^N_{ξ,W}(t)‖^{cp}_{R^d} + sup_{t∈[0,T]} ‖X_t‖^{cp}_{R^d} ].   (69)

Combining (48) and (53) hence shows that for every p ∈ (0,∞) it holds that

sup_{N∈N} sup_{x∈[a,b]^d} E[ |Φ(Ψ^N_{x,W}) − Φ((X^x_t)_{t∈[0,T]})|^p + |Φ(Ψ^N_{ξ,W}) − Φ((X_t)_{t∈[0,T]})|^p ] < ∞.   (70)

This, (68), and, e.g., Hutzenthaler et al. [35, Proposition 4.5] imply that for every x ∈ [a,b]^d it holds that

lim sup_{N→∞} ( E[ |Φ(Ψ^N_{x,W}) − Φ((X^x_t)_{t∈[0,T]})| ] + E[ |Φ(Ψ^N_{ξ,W}) − Φ((X_t)_{t∈[0,T]})| ] ) = 0.   (71)

Combining (70) with Lebesgue's dominated convergence theorem therefore demonstrates that

lim sup_{N→∞} ∫_{[a,b]^d} E[ |Φ(Ψ^N_{x,W}) − Φ((X^x_t)_{t∈[0,T]})| ] dx = 0.   (72)

In addition, observe that (69), (48), and (53) prove that for all p ∈ (0,∞) it holds that

sup_{N∈N} sup_{x∈[a,b]^d} E[ |Φ(Ψ^N_{x,W})|^p ] < ∞.   (73)

Next observe that (71) and the fact that ξ and W are independent imply that

E[ Φ((X_t)_{t∈[0,T]}) ] = lim_{N→∞} E[ Φ(Ψ^N_{ξ,W}) ]
= lim_{N→∞} ∫_Ω Φ( Ψ^N_{ξ(ω), (W_t(ω))_{t∈[0,T]}} ) P(dω)
= lim_{N→∞} ∫_{[a,b]^d × C([0,T],R^m)} Φ(Ψ^N_{x,w}) ( (ξ, W)(P) )(dx, dw)
= lim_{N→∞} ∫_{[a,b]^d × C([0,T],R^m)} Φ(Ψ^N_{x,w}) ( (ξ(P)) ⊗ (W(P)) )(dx, dw).   (74)

Combining Fubini's theorem, (72), and (73) with Lebesgue's dominated convergence theorem therefore assures that

E[ Φ((X_t)_{t∈[0,T]}) ] = lim_{N→∞} ∫_{[a,b]^d} [ ∫_{C([0,T],R^m)} Φ(Ψ^N_{x,w}) (W(P))(dw) ] (ξ(P))(dx)
= lim_{N→∞} ∫_{[a,b]^d} [ ∫_Ω Φ(Ψ^N_{x,W(ω)}) P(dω) ] (ξ(P))(dx)
= lim_{N→∞} ∫_{[a,b]^d} E[ Φ(Ψ^N_{x,W}) ] (ξ(P))(dx)
= ∫_{[a,b]^d} lim_{N→∞} E[ Φ(Ψ^N_{x,W}) ] (ξ(P))(dx)
= ∫_{[a,b]^d} E[ Φ((X^x_t)_{t∈[0,T]}) ] (ξ(P))(dx)
= (1/(b−a)^d) ∫_{[a,b]^d} E[ Φ((X^x_t)_{t∈[0,T]}) ] dx.   (75)

This establishes item (v). The proof of Lemma 2.6 is thus completed.

Proposition 2.7. Let d, m ∈ N, T ∈ (0,∞), a ∈ R, b ∈ (a,∞), let μ: R^d → R^d and σ: R^d → R^{d×m} be globally Lipschitz continuous functions, let φ: R^d → R be a function, let u = (u(t,x))_{(t,x)∈[0,T]×R^d} ∈ C^{1,2}([0,T]×R^d, R) be a function with at most polynomially growing partial derivatives which satisfies for every t ∈ [0,T], x ∈ R^d that u(0,x) = φ(x) and

(∂u/∂t)(t,x) = ½ Trace_{R^d}( σ(x)[σ(x)]* (Hess_x u)(t,x) ) + ⟨ μ(x), (∇_x u)(t,x) ⟩_{R^d},   (76)

let (Ω, F, P) be a probability space with a normal filtration (F_t)_{t∈[0,T]}, let W: [0,T] × Ω → R^m be a standard (F_t)_{t∈[0,T]}-Brownian motion, let ξ: Ω → [a,b]^d be a continuously uniformly distributed F_0/B([a,b]^d)-measurable random variable, and let X = (X_t)_{t∈[0,T]}: [0,T] × Ω → R^d be an (F_t)_{t∈[0,T]}-adapted stochastic process with continuous sample paths which satisfies that for every t ∈ [0,T] it holds P-a.s. that

X_t = ξ + ∫_0^t μ(X_s) ds + ∫_0^t σ(X_s) dW_s.   (77)

Then

(i) it holds that the function φ: R^d → R is twice continuously differentiable with at most polynomially growing derivatives,

(ii) it holds that there exists a unique continuous function U: [a,b]^d → R such that

E[ |φ(X_T) − U(ξ)|² ] = inf_{v∈C([a,b]^d,R)} E[ |φ(X_T) − v(ξ)|² ],   (78)

and

(iii) it holds for every x ∈ [a,b]^d that U(x) = u(T,x).

Proof of Proposition 2.7. Throughout this proof let X^x = (X^x_t)_{t∈[0,T]}: [0,T] × Ω → R^d, x ∈ [a,b]^d, be (F_t)_{t∈[0,T]}-adapted stochastic processes with continuous sample paths

a) which satisfy that for every t ∈ [0,T], x ∈ [a,b]^d it holds P-a.s. that

X^x_t = x + ∫_0^t μ(X^x_s) ds + ∫_0^t σ(X^x_s) dW_s   (79)

and

b) which satisfy that for every ω ∈ Ω it holds that the function [a,b]^d ∋ x ↦ X^x_T(ω) ∈ R^d is continuous

(cf., for example, Cox et al. [11, Theorem 3.5] and item (ii) in Lemma 2.6). Note that the assumption that ∀ x ∈ R^d: u(0,x) = φ(x) and the assumption that u ∈ C^{1,2}([0,T]×R^d, R) has at most polynomially growing partial derivatives establish item (i). Next note that item (i) and the fact that for every p ∈ (0,∞), x ∈ [a,b]^d it holds that

sup_{t∈[0,T]} E[ ‖X^x_t‖^p_{R^d} ] < ∞   (80)

assure that for every $x\in[a,b]^d$ it holds that
\[
\mathbb{E}\big[|\varphi(X^x_T)|^2\big] < \infty. \tag{81}
\]
Item (i), the assumption that for every $\omega\in\Omega$ it holds that the function $[a,b]^d \ni x \mapsto X^x_T(\omega) \in \mathbb{R}^d$ is continuous, and, e.g., Hutzenthaler et al. [35, Proposition 4.5] hence ensure that the function
\[
[a,b]^d \ni x \mapsto \mathbb{E}\big[\varphi(X^x_T)\big] \in \mathbb{R} \tag{82}
\]
is continuous. In the next step we combine the fact that for every $x\in[a,b]^d$ it holds that the function $\Omega \ni \omega \mapsto \varphi(X^x_T(\omega)) \in \mathbb{R}$ is $\mathcal{F}/\mathcal{B}(\mathbb{R})$-measurable, the fact that for every $\omega\in\Omega$ it holds that the function $[a,b]^d \ni x \mapsto \varphi(X^x_T(\omega)) \in \mathbb{R}$ is continuous, and Lemma 2.4 to obtain that the function
\[
[a,b]^d \times \Omega \ni (x,\omega) \mapsto \varphi(X^x_T(\omega)) \in \mathbb{R} \tag{83}
\]
is $(\mathcal{B}([a,b]^d)\otimes\mathcal{F})/\mathcal{B}(\mathbb{R})$-measurable. Combining this, (81), (82), and Proposition 2.2 demonstrates
A) that there exists a unique continuous function $U\colon [a,b]^d \to \mathbb{R}$ which satisfies that
\[
\int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - U(x)|^2\big]\,dx = \inf_{v\in C([a,b]^d,\mathbb{R})} \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - v(x)|^2\big]\,dx \tag{84}
\]
and
B) that it holds for every $x\in[a,b]^d$ that
\[
U(x) = \mathbb{E}\big[\varphi(X^x_T)\big]. \tag{85}
\]
Next note that for every continuous function $V\colon [a,b]^d \to \mathbb{R}$ it holds that
\[
\sup_{x\in[a,b]^d} |V(x)| < \infty. \tag{86}
\]
Item (i) hence implies that for every continuous function $V\colon [a,b]^d \to \mathbb{R}$ it holds that
\[
C([0,T],\mathbb{R}^d) \ni (z_t)_{t\in[0,T]} \mapsto |\varphi(z_T) - V(z_0)|^2 \in \mathbb{R} \tag{87}
\]
is an at most polynomially growing continuous function. Combining Lemma 2.6 (with $\Phi = (C([0,T],\mathbb{R}^d) \ni (z_t)_{t\in[0,T]} \mapsto |\varphi(z_T) - V(z_0)|^2 \in \mathbb{R})$ for $V \in C([a,b]^d,\mathbb{R})$ in the notation of Lemma 2.6), (77), item (i), and (79) hence ensures that for every continuous function $V\colon [a,b]^d \to \mathbb{R}$ it holds that
\[
\mathbb{E}\big[|\varphi(X_T) - V(\xi)|^2\big] = \frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - V(x)|^2\big]\,dx. \tag{88}
\]

Hence, we obtain that for every continuous function $V\colon [a,b]^d \to \mathbb{R}$ with $\mathbb{E}[|\varphi(X_T) - V(\xi)|^2] = \inf_{v\in C([a,b]^d,\mathbb{R})} \mathbb{E}[|\varphi(X_T) - v(\xi)|^2]$ it holds that
\[
\begin{split}
\int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - V(x)|^2\big]\,dx
&= (b-a)^d \bigg[\frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - V(x)|^2\big]\,dx\bigg] \\
&= (b-a)^d\, \mathbb{E}\big[|\varphi(X_T) - V(\xi)|^2\big]
= (b-a)^d \inf_{v\in C([a,b]^d,\mathbb{R})} \mathbb{E}\big[|\varphi(X_T) - v(\xi)|^2\big] \\
&= (b-a)^d \inf_{v\in C([a,b]^d,\mathbb{R})} \bigg[\frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - v(x)|^2\big]\,dx\bigg] \\
&= \inf_{v\in C([a,b]^d,\mathbb{R})} \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - v(x)|^2\big]\,dx.
\end{split} \tag{89}
\]
Combining this with (84) proves that for every continuous function $V\colon [a,b]^d \to \mathbb{R}$ with $\mathbb{E}[|\varphi(X_T) - V(\xi)|^2] = \inf_{v\in C([a,b]^d,\mathbb{R})} \mathbb{E}[|\varphi(X_T) - v(\xi)|^2]$ it holds that
\[
U = V. \tag{90}
\]
Next observe that (88) and (84) demonstrate that
\[
\begin{split}
\mathbb{E}\big[|\varphi(X_T) - U(\xi)|^2\big]
&= \frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - U(x)|^2\big]\,dx \\
&= \inf_{v\in C([a,b]^d,\mathbb{R})} \bigg[\frac{1}{(b-a)^d} \int_{[a,b]^d} \mathbb{E}\big[|\varphi(X^x_T) - v(x)|^2\big]\,dx\bigg] \\
&= \inf_{v\in C([a,b]^d,\mathbb{R})} \mathbb{E}\big[|\varphi(X_T) - v(\xi)|^2\big].
\end{split} \tag{91}
\]
Combining this with (90) proves item (ii). Next note that (76), (85), and the Feynman-Kac formula (cf., for example, Hairer et al. [22, Corollary 4.17]) imply that for every $x\in[a,b]^d$ it holds that $U(x) = \mathbb{E}[\varphi(X^x_T)] = u(T,x)$. This establishes item (iii). The proof of Proposition 2.7 is thus completed.

In the next step we use Proposition 2.7 to obtain a minimization problem which is uniquely solved by the function $[a,b]^d \ni x \mapsto u(T,x) \in \mathbb{R}$. More specifically, let $\xi\colon \Omega \to [a,b]^d$ be a continuously uniformly distributed $\mathbb{F}_0/\mathcal{B}([a,b]^d)$-measurable random variable, and let $X\colon [0,T]\times\Omega \to \mathbb{R}^d$ be an $(\mathbb{F}_t)_{t\in[0,T]}$-adapted stochastic process with continuous sample paths which satisfies that for every $t\in[0,T]$ it holds $\mathbb{P}$-a.s. that
\[
X_t = \xi + \int_0^t \mu(X_s)\,ds + \int_0^t \sigma(X_s)\,dW_s. \tag{92}
\]

Proposition 2.7 then guarantees that the function $[a,b]^d \ni x \mapsto u(T,x) \in \mathbb{R}$ is the unique global minimizer of the function
\[
C([a,b]^d,\mathbb{R}) \ni v \mapsto \mathbb{E}\big[|\varphi(X_T) - v(\xi)|^2\big] \in \mathbb{R}. \tag{93}
\]
In the following two subsections we derive an approximated minimization problem by discretizing the stochastic process $X\colon [0,T]\times\Omega \to \mathbb{R}^d$ (see Subsection 2.4 below) and by employing a deep neural network approximation for the function $\mathbb{R}^d \ni x \mapsto u(T,x) \in \mathbb{R}$ (see Subsection 2.5 below).

2.4 Discretization of the stochastic differential equation

In this subsection we use the Euler-Maruyama scheme (cf., for example, Kloeden & Platen [41] and Maruyama [44]) to temporally discretize the solution process $X$ of the SDE (92). More specifically, let $N\in\mathbb{N}$, let $t_0, t_1, \ldots, t_N \in [0,\infty)$ be real numbers which satisfy that
\[
0 = t_0 < t_1 < \ldots < t_N = T. \tag{94}
\]
Note that (92) implies that for every $n\in\{0,1,\ldots,N-1\}$ it holds $\mathbb{P}$-a.s. that
\[
X_{t_{n+1}} = X_{t_n} + \int_{t_n}^{t_{n+1}} \mu(X_s)\,ds + \int_{t_n}^{t_{n+1}} \sigma(X_s)\,dW_s. \tag{95}
\]
This suggests that for sufficiently small mesh size $\sup_{n\in\{0,1,\ldots,N-1\}} (t_{n+1} - t_n)$ it holds that
\[
X_{t_{n+1}} \approx X_{t_n} + \mu(X_{t_n})\,(t_{n+1} - t_n) + \sigma(X_{t_n})\,\big(W_{t_{n+1}} - W_{t_n}\big). \tag{96}
\]
Let $\mathcal{X}\colon \{0,1,\ldots,N\}\times\Omega \to \mathbb{R}^d$ be the stochastic process which satisfies $\mathcal{X}_0 = \xi$ and which satisfies for every $n\in\{0,1,\ldots,N-1\}$ that
\[
\mathcal{X}_{n+1} = \mathcal{X}_n + \mu(\mathcal{X}_n)\,(t_{n+1} - t_n) + \sigma(\mathcal{X}_n)\,\big(W_{t_{n+1}} - W_{t_n}\big). \tag{97}
\]
Observe that (96) and (97) suggest, in turn, that for every $n\in\{0,1,2,\ldots,N\}$ it holds that
\[
\mathcal{X}_n \approx X_{t_n} \tag{98}
\]
(cf., for example, Theorem 2.8 below for a strong convergence result for the Euler-Maruyama scheme).

Theorem 2.8 (Strong convergence rate for the Euler-Maruyama scheme). Let $T\in(0,\infty)$, $d\in\mathbb{N}$, $p\in[2,\infty)$, let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space with a normal filtration $(\mathbb{F}_t)_{t\in[0,T]}$, let $W\colon [0,T]\times\Omega \to \mathbb{R}^d$ be a standard $(\mathbb{F}_t)_{t\in[0,T]}$-Brownian motion, let $\xi\colon \Omega \to \mathbb{R}^d$ be a random variable which satisfies that $\mathbb{E}\big[\|\xi\|^p_{\mathbb{R}^d}\big] < \infty$, let $\mu\colon \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma\colon \mathbb{R}^d \to \mathbb{R}^{d\times d}$ be Lipschitz

continuous functions, let $X\colon [0,T]\times\Omega \to \mathbb{R}^d$ be an $(\mathbb{F}_t)_{t\in[0,T]}$-adapted stochastic process with continuous sample paths which satisfies that for every $t\in[0,T]$ it holds $\mathbb{P}$-a.s. that
\[
X_t = \xi + \int_0^t \mu(X_s)\,ds + \int_0^t \sigma(X_s)\,dW_s, \tag{99}
\]
for every $N\in\mathbb{N}$ let $t^N_0, t^N_1, \ldots, t^N_N \in [0,T]$ be real numbers which satisfy that
\[
0 = t^N_0 < t^N_1 < \ldots < t^N_N = T, \tag{100}
\]
and for every $N\in\mathbb{N}$ let $\mathcal{X}^N\colon \{0,1,\ldots,N\}\times\Omega \to \mathbb{R}^d$ be the stochastic process which satisfies $\mathcal{X}^N_0 = \xi$ and which satisfies for every $n\in\{0,1,\ldots,N-1\}$ that
\[
\mathcal{X}^N_{n+1} = \mathcal{X}^N_n + \mu\big(\mathcal{X}^N_n\big)\,(t^N_{n+1} - t^N_n) + \sigma\big(\mathcal{X}^N_n\big)\big(W_{t^N_{n+1}} - W_{t^N_n}\big). \tag{101}
\]
Then there exists a real number $C\in(0,\infty)$ such that for every $N\in\mathbb{N}$ it holds that
\[
\sup_{n\in\{0,1,\ldots,N\}} \Big(\mathbb{E}\big[\|X_{t^N_n} - \mathcal{X}^N_n\|^p_{\mathbb{R}^d}\big]\Big)^{1/p} \le C \Big[\max_{n\in\{0,1,\ldots,N-1\}} |t^N_{n+1} - t^N_n|\Big]^{1/2}. \tag{102}
\]
The proof of Theorem 2.8 is well known in the literature (cf., for instance, Kloeden & Platen [41], Milstein [46], Hofmann, Müller-Gronbach, & Ritter [30], Müller-Gronbach & Ritter [49], and the references mentioned therein).

2.5 Deep artificial neural network approximations

In this subsection we employ suitable approximations for the solution $\mathbb{R}^d \ni x \mapsto u(T,x) \in \mathbb{R}$ of the PDE (1) at time $T$. More specifically, let $\nu\in\mathbb{N}$ and let $U = (U(\theta,x))_{(\theta,x)\in\mathbb{R}^\nu\times\mathbb{R}^d}\colon \mathbb{R}^\nu\times\mathbb{R}^d \to \mathbb{R}$ be a continuous function. For every suitable $\theta\in\mathbb{R}^\nu$ and every $x\in[a,b]^d$ we think of $U(\theta,x)\in\mathbb{R}$ as an appropriate approximation
\[
U(\theta,x) \approx u(T,x) \tag{103}
\]
of $u(T,x)$. We suggest to choose the function $U\colon \mathbb{R}^\nu\times\mathbb{R}^d \to \mathbb{R}$ as a deep neural network (cf., for example, Bishop [6]). For instance, let $L_d\colon \mathbb{R}^d \to \mathbb{R}^d$ be the function which satisfies for every $x = (x_1, x_2, \ldots, x_d) \in \mathbb{R}^d$ that
\[
L_d(x) = \bigg(\frac{\exp(x_1)}{\exp(x_1)+1}, \frac{\exp(x_2)}{\exp(x_2)+1}, \ldots, \frac{\exp(x_d)}{\exp(x_d)+1}\bigg) \tag{104}
\]
(multidimensional version of the standard logistic function), for every $k, l \in \mathbb{N}$, $v \in \mathbb{N}_0 = \{0\}\cup\mathbb{N}$, $\theta = (\theta_1, \ldots, \theta_\nu) \in \mathbb{R}^\nu$ with $v + l(k+1) \le \nu$ let $A^{\theta,v}_{k,l}\colon \mathbb{R}^k \to \mathbb{R}^l$ be the function

which satisfies for every $x = (x_1, \ldots, x_k) \in \mathbb{R}^k$ that
\[
A^{\theta,v}_{k,l}(x) =
\begin{pmatrix}
\theta_{v+1} & \theta_{v+2} & \cdots & \theta_{v+k} \\
\theta_{v+k+1} & \theta_{v+k+2} & \cdots & \theta_{v+2k} \\
\theta_{v+2k+1} & \theta_{v+2k+2} & \cdots & \theta_{v+3k} \\
\vdots & \vdots & \ddots & \vdots \\
\theta_{v+(l-1)k+1} & \theta_{v+(l-1)k+2} & \cdots & \theta_{v+lk}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_k \end{pmatrix}
+
\begin{pmatrix} \theta_{v+kl+1} \\ \theta_{v+kl+2} \\ \theta_{v+kl+3} \\ \vdots \\ \theta_{v+kl+l} \end{pmatrix}, \tag{105}
\]
let $s \in \{3,4,5,6,\ldots\}$, assume that $(s-1)d(d+1) + d + 1 \le \nu$, and let $U\colon \mathbb{R}^\nu\times\mathbb{R}^d \to \mathbb{R}$ be the function which satisfies for every $\theta\in\mathbb{R}^\nu$, $x\in\mathbb{R}^d$ that
\[
U(\theta,x) = \big(A^{\theta,(s-1)d(d+1)}_{d,1} \circ L_d \circ A^{\theta,(s-2)d(d+1)}_{d,d} \circ \ldots \circ L_d \circ A^{\theta,d(d+1)}_{d,d} \circ L_d \circ A^{\theta,0}_{d,d}\big)(x). \tag{106}
\]
The function $U\colon \mathbb{R}^\nu\times\mathbb{R}^d \to \mathbb{R}$ in (106) describes an artificial neural network with $s+1$ layers (1 input layer with $d$ neurons, $s-1$ hidden layers with $d$ neurons each, and 1 output layer with 1 neuron) and standard logistic functions as activation functions (cf., for instance, Bishop [6]).

2.6 Stochastic gradient descent-type minimization

As described in Subsection 2.5, for every suitable $\theta\in\mathbb{R}^\nu$ and every $x\in[a,b]^d$ we think of $U(\theta,x)\in\mathbb{R}$ as an appropriate approximation of $u(T,x)\in\mathbb{R}$. In this subsection we intend to find a suitable $\theta\in\mathbb{R}^\nu$ as an approximate minimizer of the function
\[
\mathbb{R}^\nu \ni \theta \mapsto \mathbb{E}\big[|\varphi(\mathcal{X}_N) - U(\theta,\xi)|^2\big] \in \mathbb{R}. \tag{107}
\]
To be more specific, we intend to find an approximate minimizer of the function in (107) through a stochastic gradient descent-type minimization algorithm (cf., for instance, Ruder [53, Section 4], Jentzen et al. [37], and the references mentioned therein). For this we approximate the derivative of the function in (107) by means of the Monte Carlo method.
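As an informal aside, the network described in Subsection 2.5 above — affine maps $A^{\theta,v}_{k,l}$ whose weight matrices and bias vectors are read out of a single flat parameter vector $\theta$, interleaved with the logistic activation $L_d$ — can be sketched in NumPy as follows. This is a minimal forward-pass sketch of ours (the function names are not from the article):

```python
import numpy as np

def logistic(z):
    # componentwise standard logistic function, i.e. the activation L_d
    return 1.0 / (1.0 + np.exp(-z))

def affine(theta, v, k, l, x):
    # the affine map A^{theta,v}_{k,l}: the weights theta_{v+1},...,theta_{v+kl}
    # form an l x k matrix (row-major), followed by the bias
    # theta_{v+kl+1},...,theta_{v+kl+l}
    W = theta[v:v + k * l].reshape(l, k)
    b = theta[v + k * l:v + k * l + l]
    return W @ x + b

def U(theta, x, s):
    # s - 1 hidden layers of width d with logistic activations,
    # then an affine output layer of width 1
    d = x.shape[0]
    y = logistic(affine(theta, 0, d, d, x))
    for j in range(1, s - 1):
        y = logistic(affine(theta, j * d * (d + 1), d, d, y))
    return affine(theta, (s - 1) * d * (d + 1), d, 1, y)[0]

# example: d = 2, s = 3, so (s - 1) d (d + 1) + d + 1 = 15 parameters suffice
theta = np.random.default_rng(0).normal(size=15)
value = U(theta, np.array([0.5, -0.5]), 3)
```

Note that with all parameters equal to zero the output is zero, since every weight matrix and every bias vanishes.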
More precisely, let $\xi^{(m)}\colon \Omega \to [a,b]^d$, $m\in\mathbb{N}_0$, be independent continuously uniformly distributed $\mathbb{F}_0/\mathcal{B}([a,b]^d)$-measurable random variables, let $W^{(m)}\colon [0,T]\times\Omega \to \mathbb{R}^d$, $m\in\mathbb{N}_0$, be independent standard $(\mathbb{F}_t)_{t\in[0,T]}$-Brownian motions, for every $m\in\mathbb{N}_0$ let $\mathcal{X}^{(m)} = (\mathcal{X}^{(m)}_n)_{n\in\{0,1,\ldots,N\}}\colon \{0,1,\ldots,N\}\times\Omega \to \mathbb{R}^d$ be the stochastic process which satisfies $\mathcal{X}^{(m)}_0 = \xi^{(m)}$ and which satisfies for every $n\in\{0,1,\ldots,N-1\}$ that
\[
\mathcal{X}^{(m)}_{n+1} = \mathcal{X}^{(m)}_n + \mu\big(\mathcal{X}^{(m)}_n\big)\,(t_{n+1} - t_n) + \sigma\big(\mathcal{X}^{(m)}_n\big)\big(W^{(m)}_{t_{n+1}} - W^{(m)}_{t_n}\big), \tag{108}
\]
let $\gamma\in(0,\infty)$, and let $\Theta\colon \mathbb{N}_0\times\Omega \to \mathbb{R}^\nu$ be a stochastic process which satisfies for every $m\in\mathbb{N}_0$ that
\[
\Theta_{m+1} = \Theta_m - 2\gamma\,\big(U(\Theta_m, \xi^{(m)}) - \varphi\big(\mathcal{X}^{(m)}_N\big)\big)\,(\nabla_\theta U)(\Theta_m, \xi^{(m)}). \tag{109}
\]
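To make the Euler-Maruyama simulation and the gradient step just described concrete, the following sketch (ours, not from the article) runs them for a one-dimensional heat-equation toy problem with $\mu = 0$, $\sigma = \sqrt{2}$, $\varphi(x) = x^2$, and $[a,b] = [0,1]$, for which $u(T,x) = \mathbb{E}[\varphi(X^x_T)] = x^2 + 2T$. To keep the parameter gradient explicit we replace the deep network by a linear-in-parameters ansatz $\theta_1 + \theta_2 x + \theta_3 x^2$, and we use a small mini-batch and a piecewise-constant learning rate in the spirit of the paper's experiments:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, batch = 0.1, 10, 32
t = np.linspace(0.0, T, N + 1)

def euler_end(xi):
    # Euler-Maruyama recursion for dX = sqrt(2) dW, vectorized over the batch
    x = xi.copy()
    for n in range(N):
        dW = rng.normal(scale=np.sqrt(t[n + 1] - t[n]), size=xi.shape)
        x = x + np.sqrt(2.0) * dW
    return x

def feats(x):
    # ansatz U(theta, x) = theta . (1, x, x^2); its theta-gradient is
    # simply the feature vector
    return np.stack([np.ones_like(x), x, x * x], axis=-1)

theta = np.zeros(3)
for m in range(20000):
    gamma = 0.05 if m < 10000 else 0.005       # piecewise-constant learning rate
    xi = rng.uniform(0.0, 1.0, size=batch)     # start points, uniform on [0, 1]
    target = euler_end(xi) ** 2                # phi evaluated at the endpoint
    F = feats(xi)
    residual = F @ theta - target
    theta = theta - gamma * np.mean(2.0 * residual[:, None] * F, axis=0)

# theta should now be close to (2T, 0, 1), i.e. U(theta, x) ~ x^2 + 0.2
```

The trained coefficients recover the exact solution up to Monte Carlo noise, illustrating why minimizing the expected squared distance to $\varphi$ of the simulated endpoint yields the conditional expectation $x \mapsto u(T,x)$.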

Under appropriate hypotheses we think for every sufficiently large $m\in\mathbb{N}$ of the random variable $\Theta_m\colon \Omega \to \mathbb{R}^\nu$ as a suitable approximation of a local minimum point of the function (107) and we think for every sufficiently large $m\in\mathbb{N}$ of the random function $[a,b]^d \ni x \mapsto U(\Theta_m, x) \in \mathbb{R}$ as a suitable approximation of the function $[a,b]^d \ni x \mapsto u(T,x) \in \mathbb{R}$.

2.7 Description of the algorithm in a special case

In this subsection we give a description of the proposed approximation method in a special case, that is, we describe the proposed approximation method in the specific case where a particular neural network approximation is chosen and where the plain-vanilla stochastic gradient descent method with a constant learning rate is the employed stochastic minimization algorithm (cf. (109) above). For a more general description of the proposed approximation method we refer the reader to Subsection 2.8 below.

Framework 2.9. Let $T, \gamma \in (0,\infty)$, $a\in\mathbb{R}$, $b\in(a,\infty)$, $d, N\in\mathbb{N}$, $s\in\{3,4,5,\ldots\}$, let $\nu = sd(d+1)$, let $t_0, t_1, \ldots, t_N \in [0,T]$ be real numbers with
\[
0 = t_0 < t_1 < \ldots < t_N = T, \tag{110}
\]
let $\mu\colon \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma\colon \mathbb{R}^d \to \mathbb{R}^{d\times d}$ be continuous functions, let $(\Omega,\mathcal{F},\mathbb{P},(\mathbb{F}_t)_{t\in[0,T]})$ be a filtered probability space, let $\xi^{(m)}\colon \Omega \to [a,b]^d$, $m\in\mathbb{N}_0$, be independent continuously uniformly distributed $\mathbb{F}_0/\mathcal{B}([a,b]^d)$-measurable random variables, let $W^{(m)}\colon [0,T]\times\Omega \to \mathbb{R}^d$, $m\in\mathbb{N}_0$, be i.i.d. standard $(\mathbb{F}_t)_{t\in[0,T]}$-Brownian motions, for every $m\in\mathbb{N}_0$ let $\mathcal{X}^{(m)}\colon \{0,1,\ldots,N\}\times\Omega \to \mathbb{R}^d$ be the stochastic process which satisfies $\mathcal{X}^{(m)}_0 = \xi^{(m)}$ and which satisfies for every $n\in\{0,1,\ldots,N-1\}$ that
\[
\mathcal{X}^{(m)}_{n+1} = \mathcal{X}^{(m)}_n + \mu\big(\mathcal{X}^{(m)}_n\big)\,(t_{n+1} - t_n) + \sigma\big(\mathcal{X}^{(m)}_n\big)\big(W^{(m)}_{t_{n+1}} - W^{(m)}_{t_n}\big), \tag{111}
\]
let $L_d\colon \mathbb{R}^d \to \mathbb{R}^d$ be the function which satisfies for every $x = (x_1, x_2, \ldots, x_d) \in \mathbb{R}^d$ that
\[
L_d(x) = \bigg(\frac{\exp(x_1)}{\exp(x_1)+1}, \frac{\exp(x_2)}{\exp(x_2)+1}, \ldots, \frac{\exp(x_d)}{\exp(x_d)+1}\bigg), \tag{112}
\]
for every $k, l \in \mathbb{N}$, $v \in \mathbb{N}_0 = \{0\}\cup\mathbb{N}$, $\theta = (\theta_1, \ldots, \theta_\nu) \in \mathbb{R}^\nu$ with $v + l(k+1) \le \nu$ let $A^{\theta,v}_{k,l}\colon \mathbb{R}^k \to \mathbb{R}^l$ be the function which satisfies for every $x = (x_1, \ldots, x_k) \in \mathbb{R}^k$ that
\[
A^{\theta,v}_{k,l}(x) =
\begin{pmatrix}
\theta_{v+1} & \theta_{v+2} & \cdots & \theta_{v+k} \\
\theta_{v+k+1} & \theta_{v+k+2} & \cdots & \theta_{v+2k} \\
\theta_{v+2k+1} & \theta_{v+2k+2} & \cdots & \theta_{v+3k} \\
\vdots & \vdots & \ddots & \vdots \\
\theta_{v+(l-1)k+1} & \theta_{v+(l-1)k+2} & \cdots & \theta_{v+lk}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_k \end{pmatrix}
+
\begin{pmatrix} \theta_{v+kl+1} \\ \theta_{v+kl+2} \\ \theta_{v+kl+3} \\ \vdots \\ \theta_{v+kl+l} \end{pmatrix}, \tag{113}
\]

let $U\colon \mathbb{R}^\nu\times\mathbb{R}^d \to \mathbb{R}$ be the function which satisfies for every $\theta\in\mathbb{R}^\nu$, $x\in\mathbb{R}^d$ that
\[
U(\theta,x) = \big(A^{\theta,(s-1)d(d+1)}_{d,1} \circ L_d \circ A^{\theta,(s-2)d(d+1)}_{d,d} \circ \ldots \circ L_d \circ A^{\theta,d(d+1)}_{d,d} \circ L_d \circ A^{\theta,0}_{d,d}\big)(x), \tag{114}
\]
and let $\Theta\colon \mathbb{N}_0\times\Omega \to \mathbb{R}^\nu$ be a stochastic process which satisfies for every $m\in\mathbb{N}_0$ that
\[
\Theta_{m+1} = \Theta_m - 2\gamma\,\big(U(\Theta_m, \xi^{(m)}) - \varphi\big(\mathcal{X}^{(m)}_N\big)\big)\,(\nabla_\theta U)(\Theta_m, \xi^{(m)}). \tag{115}
\]
Under appropriate hypotheses we think for every sufficiently large $m\in\mathbb{N}$ and every $x\in[a,b]^d$ of the random variable $U(\Theta_m, x)\colon \Omega \to \mathbb{R}$ in Framework 2.9 as a suitable approximation $U(\Theta_m,x) \approx u(T,x)$ of $u(T,x)\in\mathbb{R}$ where $u = (u(t,x))_{(t,x)\in[0,T]\times\mathbb{R}^d} \in C^{1,2}([0,T]\times\mathbb{R}^d,\mathbb{R})$ is a function with at most polynomially growing partial derivatives which satisfies for every $t\in[0,T]$, $x\in\mathbb{R}^d$ that $u(0,x) = \varphi(x)$ and
\[
\tfrac{\partial u}{\partial t}(t,x) = \tfrac{1}{2}\,\mathrm{Trace}_{\mathbb{R}^d}\!\big(\sigma(x)[\sigma(x)]^*(\mathrm{Hess}_x u)(t,x)\big) + \big\langle \mu(x), (\nabla_x u)(t,x)\big\rangle_{\mathbb{R}^d} \tag{116}
\]
(cf. (1) above).

2.8 Description of the algorithm in the general case

In this subsection we provide a general framework which covers the approximation method derived in the subsections above and which allows, in addition, to incorporate other minimization algorithms (cf., for example, Kingma & Ba [38], Ruder [53], E et al. [14], Han et al. [24], and Beck et al. [3]) than just the plain vanilla stochastic gradient descent method. The proposed approximation algorithm is an extension of the approximation algorithm in E et al. [14], Han et al. [24], and Beck et al. [3] in the special case of linear Kolmogorov partial differential equations.

Framework 2.10. Let $T\in(0,\infty)$, $N, d, \varrho, \nu, \varsigma \in \mathbb{N}$, let $H\colon [0,T]^2\times\mathbb{R}^d\times\mathbb{R}^d \to \mathbb{R}^d$, $\varphi\colon \mathbb{R}^d \to \mathbb{R}$ be functions, let $(\Omega,\mathcal{F},\mathbb{P},(\mathbb{F}_t)_{t\in[0,T]})$ be a filtered probability space, let $W^{m,j}\colon [0,T]\times\Omega \to \mathbb{R}^d$, $m\in\mathbb{N}_0$, $j\in\mathbb{N}$, be independent standard $(\mathbb{F}_t)_{t\in[0,T]}$-Brownian motions on $(\Omega,\mathcal{F},\mathbb{P})$, let $\xi^{m,j}\colon \Omega \to \mathbb{R}^d$, $m\in\mathbb{N}_0$, $j\in\mathbb{N}$, be i.i.d. $\mathbb{F}_0/\mathcal{B}(\mathbb{R}^d)$-measurable random variables, let $t_0, t_1, \ldots, t_N \in [0,T]$ be real numbers with $0 = t_0 < t_1 < \ldots < t_N = T$, for every $\theta\in\mathbb{R}^\nu$, $j\in\mathbb{N}$, $s\in\mathbb{R}^\varsigma$ let $U^{\theta,j,s}\colon \mathbb{R}^d \to \mathbb{R}$ be a function, for every $m\in\mathbb{N}_0$, $j\in\mathbb{N}$ let $X^{m,j} = (X^{m,j}_n)_{n\in\{0,1,\ldots,N\}}\colon \{0,1,\ldots,N\}\times\Omega \to \mathbb{R}^d$ be a stochastic process which satisfies $X^{m,j}_0 = \xi^{m,j}$ and which satisfies for every $n\in\{0,1,\ldots,N-1\}$ that
\[
X^{m,j}_{n+1} = H\big(t_n, t_{n+1}, X^{m,j}_n, W^{m,j}_{t_{n+1}} - W^{m,j}_{t_n}\big), \tag{117}
\]
let $(J_m)_{m\in\mathbb{N}_0} \subseteq \mathbb{N}$ be a sequence, for every $m\in\mathbb{N}_0$, $s\in\mathbb{R}^\varsigma$ let $\phi^{m,s}\colon \mathbb{R}^\nu\times\Omega \to \mathbb{R}$ be the function which satisfies for every $(\theta,\omega)\in\mathbb{R}^\nu\times\Omega$ that
\[
\phi^{m,s}(\theta,\omega) = \frac{1}{J_m} \sum_{j=1}^{J_m} \big|U^{\theta,j,s}\big(\xi^{m,j}(\omega)\big) - \varphi\big(X^{m,j}_N(\omega)\big)\big|^2, \tag{118}
\]
for every $m\in\mathbb{N}_0$, $s\in\mathbb{R}^\varsigma$ let $\Psi^{m,s}\colon \mathbb{R}^\nu\times\Omega \to \mathbb{R}^\nu$ be a function which satisfies for every $\omega\in\Omega$, $\theta\in\{\eta\in\mathbb{R}^\nu\colon \phi^{m,s}(\cdot,\omega)\colon \mathbb{R}^\nu \to \mathbb{R} \text{ is differentiable at } \eta\}$ that
\[
\Psi^{m,s}(\theta,\omega) = (\nabla_\theta\, \phi^{m,s})(\theta,\omega), \tag{119}
\]
let $S\colon \mathbb{R}^\varsigma\times\mathbb{R}^\nu\times(\mathbb{R}^d)^{\mathbb{N}} \to \mathbb{R}^\varsigma$ be a function, for every $m\in\mathbb{N}_0$ let $\Phi_m\colon \mathbb{R}^\varrho \to \mathbb{R}^\nu$ and $\Psi_m\colon \mathbb{R}^\varrho\times\mathbb{R}^\nu \to \mathbb{R}^\varrho$ be functions, let $\Theta\colon \mathbb{N}_0\times\Omega \to \mathbb{R}^\nu$, $\mathbb{S}\colon \mathbb{N}_0\times\Omega \to \mathbb{R}^\varsigma$, and $\Xi\colon \mathbb{N}_0\times\Omega \to \mathbb{R}^\varrho$ be stochastic processes which satisfy for every $m\in\mathbb{N}_0$ that
\[
\mathbb{S}_{m+1} = S\big(\mathbb{S}_m, \Theta_m, (X^{m,i}_N)_{i\in\mathbb{N}}\big), \qquad \Xi_{m+1} = \Psi_m\big(\Xi_m, \Psi^{m,\mathbb{S}_{m+1}}(\Theta_m)\big), \tag{120}
\]
and
\[
\Theta_{m+1} = \Theta_m - \Phi_m(\Xi_{m+1}). \tag{121}
\]
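The structure of this recursion — a Monte Carlo gradient estimate, an optimizer-state update $\Psi_m$, and a parameter increment $\Phi_m$ — can be illustrated by the following minimal sketch of ours (plain SGD is recovered by taking the optimizer state to be the latest gradient estimate; schemes such as Adam instead keep running moments in the state):

```python
import numpy as np

def sgd_psi(xi_state, grad):
    # role of Psi_m: for plain SGD the new optimizer state is just the gradient
    return grad

def sgd_phi(xi_state, gamma=0.1):
    # role of Phi_m: the increment subtracted from the parameters
    return gamma * xi_state

def train(grad_estimate, theta, steps):
    xi_state = np.zeros_like(theta)
    for m in range(steps):
        g = grad_estimate(theta)             # Monte Carlo gradient of the loss
        xi_state = sgd_psi(xi_state, g)      # optimizer-state update
        theta = theta - sgd_phi(xi_state)    # parameter update
    return theta

# toy check: a noisy gradient oracle for the loss |theta - 3|^2
rng = np.random.default_rng(0)
theta_final = train(lambda th: 2.0 * (th - 3.0) + 0.01 * rng.normal(size=th.shape),
                    np.zeros(1), 200)
```

Swapping `sgd_psi`/`sgd_phi` for moment-tracking maps changes the optimizer without touching the simulation or loss, which is the point of the abstract formulation.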

26 for every m N, s R ς let Ψ m,s : R ν Ω R ν be a function which satisfies for every ω Ω, θ {η R ν : φ m,s (, ω): R ν R is differentiable at η} that Ψ m,s (θ, ω) = ( θ φ m,s )(θ, ω), (119) let S : R ς R ν (R d ) N R ς be a function, for every m N let Φ m : R ϱ R ν and Ψ m : R ϱ R ν R ϱ be functions, let Θ: N Ω R ν, S: N Ω R ς, and Ξ: N Ω R ϱ be stochastic processes which satisfy for every m N that S m+1 = S ( S m, Θ m, (X m,i N ) i N), Ξm+1 = Ψ m (Ξ m, Ψ m,s m+1 (Θ m )), (12) and Θ m+1 = Θ m Φ m (Ξ m+1 ). (121) Under appropriate hypotheses we think for every sufficiently large m N and every x [a, b] d of the random variable U Θm,1,Sm (x): Ω R in Framework 2.1 as a suitable approximation U Θm,1,Sm (x) u(t, x) of u(t, x) R where u = u(t, x) (t,x) [,T ] R d C 1,2 ([, T ] R d, R) is a function with at most polynomially growing partial derivatives which satisfies for every t [, T ], x R d that u(, x) = ϕ(x) and u (t, x) = 1 Trace ( t 2 R σ(x)[σ(x)] (Hess d x u)(t, x) ) + µ(x), ( x u)(t, x) R d, (122) where µ: R d R d and σ : R d R d d are sufficiently regular functions (cf. (1) above). 3 Examples In this section we test the proposed approximation algorithm (see Section 2 above) in the case of several examples of SDEs and Kolmogorov PDEs, respectively. In particular, in this section we apply the proposed approximation algorithm to the heat equation (cf. Subsection 3.2 below), to independent geometric Brownian motions (cf. Subsection 3.3 below), to the Black-Scholes model (cf. Subsection 3.4 below), to stochastic Lorenz equations (cf. Subsection 3.5 below), and to the Heston model (cf. Subsection 3.6 below). In the case of each of the examples below we employ the general approximation algorithm in Framework 2.1 above in conjunction with the Adam optimizer (cf. Kingma & Ba [38]) with mini-batches of size 8192 in each iteration step (see Subsection 3.1 below for a precise description). 
Moreover, we employ a fully-connected feedforward neural network with one input layer, two hidden layers, and one one-dimensional output layer in our implementations in the case of each of these examples. We also use batch normalization (cf. Ioffe & Szegedy [36]) just before the first linear transformation, just before each of the two nonlinear activation functions in front of the hidden layers as well as just after the last linear transformation. For the two nonlinear activation functions we employ the multidimensional version of the function R x tanh(x) ( 1, 1). All weights in the neural network are initialized by means of the Xavier initialization (cf. Glorot & Bengio [18]). All computations were 26

performed in single precision (float32) on a NVIDIA GeForce GTX 1080 GPU with 1974 MHz core clock and 8 GB GDDR5X memory. The underlying system consisted of an Intel Core i7-6800K CPU with 64 GB DDR memory running TensorFlow 1.5 on Ubuntu.

3.1 Setting

Framework 3.1. Assume Framework 2.10, let $\varepsilon = 10^{-8}$, $\beta_1 = \tfrac{9}{10}$, $\beta_2 = \tfrac{999}{1000}$, $(\gamma_m)_{m\in\mathbb{N}_0} \subseteq (0,\infty)$, let $\mathrm{Pow}_r\colon \mathbb{R}^\nu \to \mathbb{R}^\nu$, $r\in(0,\infty)$, be the functions which satisfy for every $r\in(0,\infty)$, $x = (x_1,\ldots,x_\nu)\in\mathbb{R}^\nu$ that
\[
\mathrm{Pow}_r(x) = \big(|x_1|^r, \ldots, |x_\nu|^r\big), \tag{123}
\]
assume for every $m\in\mathbb{N}_0$, $i\in\{0,1,\ldots,N\}$ that $J_m = 8192$, $t_i = \tfrac{iT}{N}$, $\varrho = 2\nu$, $T = 1$, $\gamma_m = 10^{-1}\,\mathbb{1}_{[0,250000]}(m) + 10^{-2}\,\mathbb{1}_{(250000,500000]}(m) + 10^{-3}\,\mathbb{1}_{(500000,\infty)}(m)$, assume for every $m\in\mathbb{N}_0$, $x = (x_1,\ldots,x_\nu)$, $y = (y_1,\ldots,y_\nu)$, $\eta = (\eta_1,\ldots,\eta_\nu) \in \mathbb{R}^\nu$ that
\[
\Psi_m(x, y, \eta) = \big(\beta_1 x + (1-\beta_1)\,\eta,\; \beta_2 y + (1-\beta_2)\,\mathrm{Pow}_2(\eta)\big) \tag{124}
\]
and
\[
\Phi_m(x, y) = \left(\frac{\gamma_m\, x_1}{\big(1-(\beta_1)^m\big)\Big(\big[\tfrac{y_1}{1-(\beta_2)^m}\big]^{1/2}+\varepsilon\Big)},\; \ldots,\; \frac{\gamma_m\, x_\nu}{\big(1-(\beta_1)^m\big)\Big(\big[\tfrac{y_\nu}{1-(\beta_2)^m}\big]^{1/2}+\varepsilon\Big)}\right). \tag{125}
\]
Equations (124) and (125) in Framework 3.1 describe the Adam optimizer (cf. Kingma & Ba [38], e.g., E et al. [24, (32)-(33) in Section 4.2 and (90)-(91) in Section 5.2], and line 84 in Python code 1 in Section 4 below).

3.2 Heat equation

In this subsection we apply the proposed approximation algorithm to the heat equation (see (126) below). Assume Framework 3.1, assume for every $s, t\in[0,T]$, $x, w\in\mathbb{R}^d$, $m\in\mathbb{N}_0$ that $N = 1$, $d = 100$, $\nu = d(2d) + (2d)^2 + 2d = 2d(3d+1)$, $\varphi(x) = \|x\|^2_{\mathbb{R}^d}$, $H(s,t,x,w) = x + \sqrt{2}\,\mathrm{Id}_{\mathbb{R}^d}\, w$, assume that $\xi^{0,1}\colon \Omega \to \mathbb{R}^d$ is continuously uniformly distributed on $[0,1]^d$, and let $u = (u(t,x))_{(t,x)\in[0,T]\times\mathbb{R}^d} \in C^{1,2}([0,T]\times\mathbb{R}^d,\mathbb{R})$ be an at most polynomially growing function which satisfies for every $t\in[0,T]$, $x\in\mathbb{R}^d$ that $u(0,x) = \varphi(x)$ and
\[
\big(\tfrac{\partial u}{\partial t}\big)(t,x) = (\Delta_x u)(t,x). \tag{126}
\]
Combining, e.g., Lemma 3.2 below with, e.g., Hairer et al. [22, Corollary 4.17 and Remark 4.1] shows that for every $t\in[0,T]$, $x\in\mathbb{R}^d$ it holds that
\[
u(t,x) = \|x\|^2_{\mathbb{R}^d} + 2dt. \tag{127}
\]
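A direct transcription of the two Adam maps (124) and (125) in the notation of Framework 3.1 reads as follows (our sketch; the optimizer state is kept as the pair of moment vectors $x$ and $y$, and the toy usage with exact gradients is ours):

```python
import numpy as np

eps, beta1, beta2 = 1e-8, 0.9, 0.999

def psi(x, y, eta):
    # (124): exponential moving averages of the gradient eta and of Pow_2(eta)
    return beta1 * x + (1.0 - beta1) * eta, beta2 * y + (1.0 - beta2) * eta**2

def phi(x, y, m, gamma):
    # (125): bias-corrected first moment, scaled componentwise by the
    # bias-corrected square root of the second moment (plus epsilon)
    return gamma * (x / (1.0 - beta1**m)) / (np.sqrt(y / (1.0 - beta2**m)) + eps)

# usage on a toy quadratic loss |theta - target|^2 with exact gradients
target = np.array([1.0, -2.0])
theta = np.zeros(2)
x = np.zeros(2)
y = np.zeros(2)
for m in range(1, 2001):
    grad = 2.0 * (theta - target)
    x, y = psi(x, y, grad)
    theta = theta - phi(x, y, m, gamma=0.01)
```

Because the per-coordinate step is normalized by the root of the second moment, the iterate moves at roughly the learning rate `gamma` per step regardless of the gradient scale, which is why a piecewise-constant schedule for $\gamma_m$ pairs naturally with this optimizer.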


More information

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure? MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due ). Show that the open disk x 2 + y 2 < 1 is a countable union of planar elementary sets. Show that the closed disk x 2 + y 2 1 is a countable

More information

Stochastic Differential Equations

Stochastic Differential Equations CHAPTER 1 Stochastic Differential Equations Consider a stochastic process X t satisfying dx t = bt, X t,w t dt + σt, X t,w t dw t. 1.1 Question. 1 Can we obtain the existence and uniqueness theorem for

More information

Weak convergence and large deviation theory

Weak convergence and large deviation theory First Prev Next Go To Go Back Full Screen Close Quit 1 Weak convergence and large deviation theory Large deviation principle Convergence in distribution The Bryc-Varadhan theorem Tightness and Prohorov

More information

5 December 2016 MAA136 Researcher presentation. Anatoliy Malyarenko. Topics for Bachelor and Master Theses. Anatoliy Malyarenko

5 December 2016 MAA136 Researcher presentation. Anatoliy Malyarenko. Topics for Bachelor and Master Theses. Anatoliy Malyarenko 5 December 216 MAA136 Researcher presentation 1 schemes The main problem of financial engineering: calculate E [f(x t (x))], where {X t (x): t T } is the solution of the system of X t (x) = x + Ṽ (X s

More information

ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME

ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME SAUL D. JACKA AND ALEKSANDAR MIJATOVIĆ Abstract. We develop a general approach to the Policy Improvement Algorithm (PIA) for stochastic control problems

More information

Stochastic differential equation models in biology Susanne Ditlevsen

Stochastic differential equation models in biology Susanne Ditlevsen Stochastic differential equation models in biology Susanne Ditlevsen Introduction This chapter is concerned with continuous time processes, which are often modeled as a system of ordinary differential

More information

Hölder continuity of the solution to the 3-dimensional stochastic wave equation

Hölder continuity of the solution to the 3-dimensional stochastic wave equation Hölder continuity of the solution to the 3-dimensional stochastic wave equation (joint work with Yaozhong Hu and Jingyu Huang) Department of Mathematics Kansas University CBMS Conference: Analysis of Stochastic

More information

2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure? MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due 9/5). Prove that every countable set A is measurable and µ(a) = 0. 2 (Bonus). Let A consist of points (x, y) such that either x or y is

More information

Weak convergence and Brownian Motion. (telegram style notes) P.J.C. Spreij

Weak convergence and Brownian Motion. (telegram style notes) P.J.C. Spreij Weak convergence and Brownian Motion (telegram style notes) P.J.C. Spreij this version: December 8, 2006 1 The space C[0, ) In this section we summarize some facts concerning the space C[0, ) of real

More information

On the fast convergence of random perturbations of the gradient flow.

On the fast convergence of random perturbations of the gradient flow. On the fast convergence of random perturbations of the gradient flow. Wenqing Hu. 1 (Joint work with Chris Junchi Li 2.) 1. Department of Mathematics and Statistics, Missouri S&T. 2. Department of Operations

More information

13 The martingale problem

13 The martingale problem 19-3-2012 Notations Ω complete metric space of all continuous functions from [0, + ) to R d endowed with the distance d(ω 1, ω 2 ) = k=1 ω 1 ω 2 C([0,k];H) 2 k (1 + ω 1 ω 2 C([0,k];H) ), ω 1, ω 2 Ω. F

More information

Weak solutions of mean-field stochastic differential equations

Weak solutions of mean-field stochastic differential equations Weak solutions of mean-field stochastic differential equations Juan Li School of Mathematics and Statistics, Shandong University (Weihai), Weihai 26429, China. Email: juanli@sdu.edu.cn Based on joint works

More information

Dynamical systems with Gaussian and Levy noise: analytical and stochastic approaches

Dynamical systems with Gaussian and Levy noise: analytical and stochastic approaches Dynamical systems with Gaussian and Levy noise: analytical and stochastic approaches Noise is often considered as some disturbing component of the system. In particular physical situations, noise becomes

More information

Lecture 6 Basic Probability

Lecture 6 Basic Probability Lecture 6: Basic Probability 1 of 17 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 6 Basic Probability Probability spaces A mathematical setup behind a probabilistic

More information

Lecture 4 Lebesgue spaces and inequalities

Lecture 4 Lebesgue spaces and inequalities Lecture 4: Lebesgue spaces and inequalities 1 of 10 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 4 Lebesgue spaces and inequalities Lebesgue spaces We have seen how

More information

Malliavin Calculus in Finance

Malliavin Calculus in Finance Malliavin Calculus in Finance Peter K. Friz 1 Greeks and the logarithmic derivative trick Model an underlying assent by a Markov process with values in R m with dynamics described by the SDE dx t = b(x

More information

A MODEL FOR THE LONG-TERM OPTIMAL CAPACITY LEVEL OF AN INVESTMENT PROJECT

A MODEL FOR THE LONG-TERM OPTIMAL CAPACITY LEVEL OF AN INVESTMENT PROJECT A MODEL FOR HE LONG-ERM OPIMAL CAPACIY LEVEL OF AN INVESMEN PROJEC ARNE LØKKA AND MIHAIL ZERVOS Abstract. We consider an investment project that produces a single commodity. he project s operation yields

More information

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( ) Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio (2014-2015) Etienne Tanré - Olivier Faugeras INRIA - Team Tosca October 22nd, 2014 E. Tanré (INRIA - Team Tosca) Mathematical

More information

OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS

OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS APPLICATIONES MATHEMATICAE 29,4 (22), pp. 387 398 Mariusz Michta (Zielona Góra) OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS Abstract. A martingale problem approach is used first to analyze

More information

A random perturbation approach to some stochastic approximation algorithms in optimization.

A random perturbation approach to some stochastic approximation algorithms in optimization. A random perturbation approach to some stochastic approximation algorithms in optimization. Wenqing Hu. 1 (Presentation based on joint works with Chris Junchi Li 2, Weijie Su 3, Haoyi Xiong 4.) 1. Department

More information

Selected Exercises on Expectations and Some Probability Inequalities

Selected Exercises on Expectations and Some Probability Inequalities Selected Exercises on Expectations and Some Probability Inequalities # If E(X 2 ) = and E X a > 0, then P( X λa) ( λ) 2 a 2 for 0 < λ

More information

Lipschitz continuity for solutions of Hamilton-Jacobi equation with Ornstein-Uhlenbeck operator

Lipschitz continuity for solutions of Hamilton-Jacobi equation with Ornstein-Uhlenbeck operator Lipschitz continuity for solutions of Hamilton-Jacobi equation with Ornstein-Uhlenbeck operator Thi Tuyen Nguyen Ph.D student of University of Rennes 1 Joint work with: Prof. E. Chasseigne(University of

More information

On Stopping Times and Impulse Control with Constraint

On Stopping Times and Impulse Control with Constraint On Stopping Times and Impulse Control with Constraint Jose Luis Menaldi Based on joint papers with M. Robin (216, 217) Department of Mathematics Wayne State University Detroit, Michigan 4822, USA (e-mail:

More information

Solutions to the Exercises in Stochastic Analysis

Solutions to the Exercises in Stochastic Analysis Solutions to the Exercises in Stochastic Analysis Lecturer: Xue-Mei Li 1 Problem Sheet 1 In these solution I avoid using conditional expectations. But do try to give alternative proofs once we learnt conditional

More information

Numerical discretisations of stochastic wave equations

Numerical discretisations of stochastic wave equations Numerical discretisations of stochastic wave equations David Cohen Matematik och matematisk statistik \ UMIT Research Lab, Umeå universitet Institut für Mathematik, Universität Innsbruck Joint works with

More information

Approximation of BSDEs using least-squares regression and Malliavin weights

Approximation of BSDEs using least-squares regression and Malliavin weights Approximation of BSDEs using least-squares regression and Malliavin weights Plamen Turkedjiev (turkedji@math.hu-berlin.de) 3rd July, 2012 Joint work with Prof. Emmanuel Gobet (E cole Polytechnique) Plamen

More information

ERRATA: Probabilistic Techniques in Analysis

ERRATA: Probabilistic Techniques in Analysis ERRATA: Probabilistic Techniques in Analysis ERRATA 1 Updated April 25, 26 Page 3, line 13. A 1,..., A n are independent if P(A i1 A ij ) = P(A 1 ) P(A ij ) for every subset {i 1,..., i j } of {1,...,

More information

A D VA N C E D P R O B A B I L - I T Y

A D VA N C E D P R O B A B I L - I T Y A N D R E W T U L L O C H A D VA N C E D P R O B A B I L - I T Y T R I N I T Y C O L L E G E T H E U N I V E R S I T Y O F C A M B R I D G E Contents 1 Conditional Expectation 5 1.1 Discrete Case 6 1.2

More information

Chapter 4: Monte-Carlo Methods

Chapter 4: Monte-Carlo Methods Chapter 4: Monte-Carlo Methods A Monte-Carlo method is a technique for the numerical realization of a stochastic process by means of normally distributed random variables. In financial mathematics, it

More information

Large Deviations Principles for McKean-Vlasov SDEs, Skeletons, Supports and the law of iterated logarithm

Large Deviations Principles for McKean-Vlasov SDEs, Skeletons, Supports and the law of iterated logarithm Large Deviations Principles for McKean-Vlasov SDEs, Skeletons, Supports and the law of iterated logarithm Gonçalo dos Reis University of Edinburgh (UK) & CMA/FCT/UNL (PT) jointly with: W. Salkeld, U. of

More information

Nonlinear representation, backward SDEs, and application to the Principal-Agent problem

Nonlinear representation, backward SDEs, and application to the Principal-Agent problem Nonlinear representation, backward SDEs, and application to the Principal-Agent problem Ecole Polytechnique, France April 4, 218 Outline The Principal-Agent problem Formulation 1 The Principal-Agent problem

More information

The Pedestrian s Guide to Local Time

The Pedestrian s Guide to Local Time The Pedestrian s Guide to Local Time Tomas Björk, Department of Finance, Stockholm School of Economics, Box 651, SE-113 83 Stockholm, SWEDEN tomas.bjork@hhs.se November 19, 213 Preliminary version Comments

More information

Convergence at first and second order of some approximations of stochastic integrals

Convergence at first and second order of some approximations of stochastic integrals Convergence at first and second order of some approximations of stochastic integrals Bérard Bergery Blandine, Vallois Pierre IECN, Nancy-Université, CNRS, INRIA, Boulevard des Aiguillettes B.P. 239 F-5456

More information

Regularity of the density for the stochastic heat equation

Regularity of the density for the stochastic heat equation Regularity of the density for the stochastic heat equation Carl Mueller 1 Department of Mathematics University of Rochester Rochester, NY 15627 USA email: cmlr@math.rochester.edu David Nualart 2 Department

More information

Deep Learning for Partial Differential Equations (PDEs)

Deep Learning for Partial Differential Equations (PDEs) Deep Learning for Partial Differential Equations (PDEs) Kailai Xu kailaix@stanford.edu Bella Shi bshi@stanford.edu Shuyi Yin syin3@stanford.edu Abstract Partial differential equations (PDEs) have been

More information

Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations.

Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations. Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations. Wenqing Hu. 1 (Joint work with Michael Salins 2, Konstantinos Spiliopoulos 3.) 1. Department of Mathematics

More information

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear

More information

Lecture No 1 Introduction to Diffusion equations The heat equat

Lecture No 1 Introduction to Diffusion equations The heat equat Lecture No 1 Introduction to Diffusion equations The heat equation Columbia University IAS summer program June, 2009 Outline of the lectures We will discuss some basic models of diffusion equations and

More information

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( ) Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio (2014-2015) Etienne Tanré - Olivier Faugeras INRIA - Team Tosca November 26th, 2014 E. Tanré (INRIA - Team Tosca) Mathematical

More information

Partial Differential Equations with Applications to Finance Seminar 1: Proving and applying Dynkin s formula

Partial Differential Equations with Applications to Finance Seminar 1: Proving and applying Dynkin s formula Partial Differential Equations with Applications to Finance Seminar 1: Proving and applying Dynkin s formula Group 4: Bertan Yilmaz, Richard Oti-Aboagye and Di Liu May, 15 Chapter 1 Proving Dynkin s formula

More information

On a class of stochastic differential equations in a financial network model

On a class of stochastic differential equations in a financial network model 1 On a class of stochastic differential equations in a financial network model Tomoyuki Ichiba Department of Statistics & Applied Probability, Center for Financial Mathematics and Actuarial Research, University

More information

SOLUTIONS OF SEMILINEAR WAVE EQUATION VIA STOCHASTIC CASCADES

SOLUTIONS OF SEMILINEAR WAVE EQUATION VIA STOCHASTIC CASCADES Communications on Stochastic Analysis Vol. 4, No. 3 010) 45-431 Serials Publications www.serialspublications.com SOLUTIONS OF SEMILINEAR WAVE EQUATION VIA STOCHASTIC CASCADES YURI BAKHTIN* AND CARL MUELLER

More information

Properties of an infinite dimensional EDS system : the Muller s ratchet

Properties of an infinite dimensional EDS system : the Muller s ratchet Properties of an infinite dimensional EDS system : the Muller s ratchet LATP June 5, 2011 A ratchet source : wikipedia Plan 1 Introduction : The model of Haigh 2 3 Hypothesis (Biological) : The population

More information

Robustness for a Liouville type theorem in exterior domains

Robustness for a Liouville type theorem in exterior domains Robustness for a Liouville type theorem in exterior domains Juliette Bouhours 1 arxiv:1207.0329v3 [math.ap] 24 Oct 2014 1 UPMC Univ Paris 06, UMR 7598, Laboratoire Jacques-Louis Lions, F-75005, Paris,

More information

for all f satisfying E[ f(x) ] <.

for all f satisfying E[ f(x) ] <. . Let (Ω, F, P ) be a probability space and D be a sub-σ-algebra of F. An (H, H)-valued random variable X is independent of D if and only if P ({X Γ} D) = P {X Γ}P (D) for all Γ H and D D. Prove that if

More information

Quantization of stochastic processes with applications on Euler-Maruyama schemes

Quantization of stochastic processes with applications on Euler-Maruyama schemes UPTEC F15 063 Examensarbete 30 hp Oktober 2015 Quantization of stochastic processes with applications on Euler-Maruyama schemes Viktor Edward Abstract Quantization of stochastic processes with applications

More information

LECTURE 2: LOCAL TIME FOR BROWNIAN MOTION

LECTURE 2: LOCAL TIME FOR BROWNIAN MOTION LECTURE 2: LOCAL TIME FOR BROWNIAN MOTION We will define local time for one-dimensional Brownian motion, and deduce some of its properties. We will then use the generalized Ray-Knight theorem proved in

More information

LAN property for sde s with additive fractional noise and continuous time observation

LAN property for sde s with additive fractional noise and continuous time observation LAN property for sde s with additive fractional noise and continuous time observation Eulalia Nualart (Universitat Pompeu Fabra, Barcelona) joint work with Samy Tindel (Purdue University) Vlad s 6th birthday,

More information

Kolmogorov Equations and Markov Processes

Kolmogorov Equations and Markov Processes Kolmogorov Equations and Markov Processes May 3, 013 1 Transition measures and functions Consider a stochastic process {X(t)} t 0 whose state space is a product of intervals contained in R n. We define

More information

A NOTE ON STOCHASTIC INTEGRALS AS L 2 -CURVES

A NOTE ON STOCHASTIC INTEGRALS AS L 2 -CURVES A NOTE ON STOCHASTIC INTEGRALS AS L 2 -CURVES STEFAN TAPPE Abstract. In a work of van Gaans (25a) stochastic integrals are regarded as L 2 -curves. In Filipović and Tappe (28) we have shown the connection

More information

1 Independent increments

1 Independent increments Tel Aviv University, 2008 Brownian motion 1 1 Independent increments 1a Three convolution semigroups........... 1 1b Independent increments.............. 2 1c Continuous time................... 3 1d Bad

More information

Strong Solutions and a Bismut-Elworthy-Li Formula for Mean-F. Formula for Mean-Field SDE s with Irregular Drift

Strong Solutions and a Bismut-Elworthy-Li Formula for Mean-F. Formula for Mean-Field SDE s with Irregular Drift Strong Solutions and a Bismut-Elworthy-Li Formula for Mean-Field SDE s with Irregular Drift Thilo Meyer-Brandis University of Munich joint with M. Bauer, University of Munich Conference for the 1th Anniversary

More information

Some Terminology and Concepts that We will Use, But Not Emphasize (Section 6.2)

Some Terminology and Concepts that We will Use, But Not Emphasize (Section 6.2) Some Terminology and Concepts that We will Use, But Not Emphasize (Section 6.2) Statistical analysis is based on probability theory. The fundamental object in probability theory is a probability space,

More information

4 Expectation & the Lebesgue Theorems

4 Expectation & the Lebesgue Theorems STA 205: Probability & Measure Theory Robert L. Wolpert 4 Expectation & the Lebesgue Theorems Let X and {X n : n N} be random variables on a probability space (Ω,F,P). If X n (ω) X(ω) for each ω Ω, does

More information

Unbiased simulation of stochastic differential equations using parametrix expansions

Unbiased simulation of stochastic differential equations using parametrix expansions Submitted to Bernoulli Unbiased simulation of stochastic differential equations using parametrix expansions PATRIK ANDERSSON and ARTURO KOHATSU-HIGA Department of Statistics, Uppsala University, Box 53,

More information