Approximation of BSDEs using least-squares regression and Malliavin weights

Plamen Turkedjiev (HU Berlin), turkedji@math.hu-berlin.de
3rd July, 2012
Joint work with Prof. Emmanuel Gobet (École Polytechnique)

FBSDEs

Let $T > 0$, let $W$ be a $q$-dimensional Brownian motion, and let $(\Omega, \mathcal{F}, \mathbb{P})$ be a filtered probability space satisfying the usual conditions; the filtration may be larger than the one generated by $W$. For $\xi \in L^2(\mathcal{F}_T)$,

$$Y_t = \xi + \int_t^T f(s, X_s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dW_s - (L_T - L_t),$$

where $X$ is $d$-dimensional and $(t, x, y, z) \mapsto f(t, x, y, z)$ is Borel measurable. Typically, either
- $X$ is a jump-diffusion driven by $W$ and a Poisson random measure, $L$ is a martingale orthogonal to $W$, and $\xi = \Phi(X_T)$; or
- $X$ is a diffusion driven by $W$, $L \equiv 0$, and $\xi = \Phi(X_{t_1}, \ldots, X_T)$ or $\xi = \Phi(X_T, \int_0^T X_t\,dt)$.

Local Lipschitz condition and Quadratic BSDEs

We consider a time-local Lipschitz continuous driver:

$$|f(t, x, y, z) - f(t, x', y', z')| \le L_f \left( \frac{|x - x'| + |y - y'| + |z - z'|}{(T - t)^{(1-\theta)/2}} \right).$$

MOTIVATION: Assume $X$ is a diffusion ($L \equiv 0$), the driver satisfies the quadratic growth condition

$$|f(t, x, y, z)| \le c(1 + |y| + |z|^2),$$
$$|f(t, x, y, z) - f(t, x, y', z')| \le c(1 + |z| + |z'|)(|y - y'| + |z - z'|),$$

and $x \mapsto \Phi(x)$ is Hölder continuous and bounded. Then

$$|Z_t| \le \frac{L_f}{(T - t)^{(1-\theta)/2}}$$

holds $\mathbb{P} \otimes dt$-a.e. for constants $L_f$ and $\theta$ independent of $t$.

Locally Lipschitz can replace quadratic in this special problem!

Driver with exploding bound and variance reduction

There exist $\alpha \in (0, 1]$ and $C_f > 0$ such that, for all $t \in [0, T)$ and $x \in \mathbb{R}^d$,

$$|f(t, x, 0, 0)| \le \frac{C_f}{(T - t)^{1-\alpha}}.$$

MOTIVATION:
- $\xi = \Phi(X_T)$ and $x \mapsto \Phi(x)$ is $\alpha$-Hölder continuous and bounded.
- $X$ is a diffusion process ($L \equiv 0$), so that $v_t(x) = E[\Phi(X_T) \mid X_t = x]$ is smooth.
- $f(t, x, y, z)$ is uniformly Lipschitz continuous and uniformly bounded at $(y, z) = (0, 0)$.
- $(v_t(X_t), \nabla v_t(X_t)\sigma(t, X_t))$ solves the BSDE with data $(\xi, 0)$. Suppose we can solve this BSDE!
- $|\nabla v_t(x)| \le C(T - t)^{\alpha - 1}$ is standard from PDE theory.
- $(Y_t - v_t(X_t), Z_t - \nabla v_t(X_t)\sigma(t, X_t))$ solves a BSDE with data $(0, f^0)$, where $f^0(t, x, y, z) = f(t, x, y + v_t(x), z + \nabla v_t(x)\sigma(t, x))$.
- This BSDE may be better behaved for simulation purposes.

Key property: discretizability of FBSDEs

Time-grid: $\pi = (0 = t_0 < \ldots < t_N = T)$. A particularly important grid is $\pi^\beta$, for $\beta \in (0, 1]$, defined by

$$t^\beta_i := T - T\left(1 - \frac{i}{N}\right)^{1/\beta}.$$

Theorem. If $\alpha = 1$, let $\beta = 1$; else let $\beta < \alpha$. Under the given assumptions, there exists a positive constant $C$, independent of $N$, such that

$$\max_{0 \le i \le N-1}\ \sup_{t^\beta_i \le t < t^\beta_{i+1}} E|Y_t - Y_{t^\beta_i}|^2 + \sum_{i=0}^{N-1} \int_{t^\beta_i}^{t^\beta_{i+1}} E|Z_t - Z_{t^\beta_i}|^2\,dt \le C N^{-1}.$$

We say that $O(N^{-1/2})$ is the optimal rate of convergence for a discrete-time approximation of the BSDE.
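The grid $\pi^\beta$ can be tabulated directly from its definition. The following is only a minimal sketch in plain Python; the function name `beta_grid` and the sample values of `T`, `N` and `beta` are illustrative assumptions, not part of the talk.

```python
def beta_grid(T, N, beta):
    """Grid points t_i = T - T * (1 - i/N)**(1/beta) for i = 0, ..., N.

    beta_grid is a hypothetical helper name; beta = 1 gives the uniform grid,
    beta < 1 concentrates the points near the terminal time T.
    """
    return [T - T * (1.0 - i / N) ** (1.0 / beta) for i in range(N + 1)]


# Illustrative parameters (assumed, not taken from the slides).
grid = beta_grid(T=1.0, N=8, beta=0.5)
steps = [b - a for a, b in zip(grid, grid[1:])]  # Delta_i = t_{i+1} - t_i
```

For $\beta < 1$ the step sizes shrink towards $T$, which is where the bounds on $Z$ and on the driver are allowed to blow up.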

Algorithm 1: Multistep dynamical programming

Let $\Delta_i = t_{i+1} - t_i$ and $\Delta W_i := W_{t_{i+1}} - W_{t_i}$. Recursively build an approximation of the solution, starting at $i = N - 1$:

$$\Delta_i Z_i = E_i\Big[\Delta W_i^\top \Big(\xi + \sum_{k=i+1}^{N-1} f(t_k, X_k, Y_{k+1}, Z_k)\Delta_k\Big)\Big],$$
$$Y_i = E_i\Big[\xi + \sum_{k=i}^{N-1} f(t_k, X_k, Y_{k+1}, Z_k)\Delta_k\Big], \qquad Y_N = \xi.$$

Consistency conditions for the time-grid:

$$\sup_{k < N} \frac{\Delta_k}{(T - t_k)^{1-\theta}} \to 0 \text{ as } N \to \infty, \qquad \limsup_{N \to \infty}\ \sup_{k < N-1} \frac{\Delta_k}{\Delta_{k+1}} < \infty.$$

Theorem. For $N$ sufficiently large, there exists a positive constant $C$, independent of the time-grid, such that

$$\max_{0 \le i \le N-1} E|Y_i - Y_{t^\beta_i}|^2 + \sum_{i=0}^{N-1} E|Z_i - Z_{t^\beta_i}|^2 \Delta_i \le C N^{-1}.$$

Assumptions and properties

Markov structure. Let $\xi = \Phi(X_N)$ and let $X$ be a Markov chain. This ensures $(Y_i, Z_i) = (y_i(X_i), z_i(X_i))$ for measurable (unknown) functions $y_i$ and $z_i$.

Almost sure bounds. Let $x \mapsto \Phi(x)$ be bounded. This ensures that there exists $C_y > 0$ such that, for all $k$, $|Y_k| \le C_y$ and $|Z_k| \le \frac{C_y}{\sqrt{\Delta_k}}$ $\mathbb{P}$-almost surely.

Basis functions. For each $0 \le l \le q$ and $0 \le k \le N - 1$, take a finite number of functions $p_{l,k}(\cdot) = (p^i_{l,k})_{1 \le i \le K}$ such that each $p^i_{l,k} : \mathbb{R}^d \to \mathbb{R}$ is deterministic and $E[|p_{l,k}(X_k)|^2] < \infty$. These form bases of finite-dimensional subspaces of $L^2(\mathcal{F}_{t_k})$.

Simulations. Take $M$ independent simulations of the Brownian increments $\Delta W$ and the explanatory Markov chain $X$. Denote these simulations by $(X^m_k)_{1 \le m \le M}$ and $(\Delta W^m_k)_{1 \le m \le M}$ respectively. Let $p^m_{l,k} := p_{l,k}(X^m_k)$.

Definition. For $R > 0$, the truncated Brownian increment is defined by

$$[\Delta W_i]_R = -R\sqrt{\Delta_i} \vee \Delta W_i \wedge R\sqrt{\Delta_i}.$$
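The truncated Brownian increment is a componentwise clip at level $R\sqrt{\Delta_i}$. A minimal sketch for a scalar increment, assuming plain Python floats; the name `truncate_increment` and the sample numbers are hypothetical.

```python
import math


def truncate_increment(dw, delta, R):
    """Clip a scalar Brownian increment dw to [-R*sqrt(delta), R*sqrt(delta)].

    truncate_increment is an illustrative name; delta is the time step
    Delta_i and R > 0 is the truncation level from the definition above.
    """
    bound = R * math.sqrt(delta)
    return max(-bound, min(dw, bound))


# Illustrative values: with delta = 0.25 and R = 2 the bound is 1.0.
clipped_large = truncate_increment(5.0, 0.25, 2.0)   # clipped to the upper bound
clipped_small = truncate_increment(0.3, 0.25, 2.0)   # left unchanged
```

The same clipping pattern applies to the almost sure bounds $C_y$ and $C_y/\sqrt{\Delta_k}$ on the approximating functions.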

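Empirical regression algorithm

Set $y^{R,M}_N(\cdot) = \Phi(\cdot)$. Then, for $i < N$, compute the coefficients

$$\alpha^M_{l,i} = \arg\min_\alpha \frac{1}{M} \sum_{m=1}^{M} \Big| \frac{[\Delta W^m_{l,i}]_R}{\Delta_i} \Big( \Phi(X^m_N) + \sum_{k=i+1}^{N-1} f_k\big(y^{R,M}_{k+1}(X^m_{k+1}), z^{R,M}_k(X^m_k)\big)\Delta_k \Big) - \alpha \cdot p^m_{l,i} \Big|^2,$$

$$\alpha^M_{0,i} = \arg\min_\alpha \frac{1}{M} \sum_{m=1}^{M} \Big| \Phi(X^m_N) + \sum_{k=i}^{N-1} f_k\big(y^{R,M}_{k+1}(X^m_{k+1}), z^{R,M}_k(X^m_k)\big)\Delta_k - \alpha \cdot p^m_{0,i} \Big|^2,$$

where $f_k(y, z)$ abbreviates $f(t_k, X^m_k, y, z)$.

The coefficients are not independent of one another!

Set

$$y^{R,M}_i(x) = -C_y \vee \alpha^M_{0,i} \cdot p_{0,i}(x) \wedge C_y, \qquad z^{R,M}_{l,i}(x) = -\frac{C_y}{\sqrt{\Delta_i}} \vee \alpha^M_{l,i} \cdot p_{l,i}(x) \wedge \frac{C_y}{\sqrt{\Delta_i}}.$$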
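One regression step of this kind can be sketched for the simplest possible case. The code below is not the scheme of the talk: it assumes a scalar state, the two-function basis $p(x) = (1, x)$, and responses that have already been assembled per simulation; the names `fit_linear_basis` and `truncated_estimate` are hypothetical. It only illustrates the two ingredients, an empirical least-squares fit followed by truncation at the almost sure bound.

```python
def fit_linear_basis(xs, responses):
    """Empirical least squares of responses on the basis p(x) = (1, x).

    Returns (a0, a1) minimising (1/M) * sum_m |responses[m] - a0 - a1*xs[m]|^2.
    Closed-form normal equations; assumes the xs are not all equal.
    """
    M = len(xs)
    mean_x = sum(xs) / M
    mean_r = sum(responses) / M
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, responses))
    slope = sxr / sxx
    return mean_r - slope * mean_x, slope


def truncated_estimate(coeffs, bound):
    """Return x -> (-bound) v (a0 + a1*x) ^ bound, the truncated regression function."""
    a0, a1 = coeffs
    return lambda x: max(-bound, min(a0 + a1 * x, bound))


# Toy data (assumed): responses lie exactly on the line 1 + 2x.
coeffs = fit_linear_basis([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
y_hat = truncated_estimate(coeffs, bound=4.0)
```

In the multistep scheme the responses at step $i$ are built from the already truncated functions at later steps, evaluated on the same simulations, which is why the coefficients are not independent of one another.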

Key ingredient: concentration of measure inequalities

Needed, amongst other things, to deal with the lack of independence between regression coefficients. The following example comes from [Györfi et al. 2002, Theorem 11.2]. Benefit: the estimates are distribution-free.

Theorem. Let $\mathcal{F} \subset \{f : \mathbb{R}^d \to [-B, B]\}$ and let $(Z_i)_{1 \le i \le n}$ be i.i.d. Then, for all $\varepsilon > 0$,

$$\mathbb{P}\Big( \exists f \in \mathcal{F} : \big(E[|f(Z)|^2]\big)^{1/2} - 2\Big(\frac{1}{n}\sum_{i=1}^{n} |f(Z_i)|^2\Big)^{1/2} > \varepsilon \Big) \le 3\,E\Big[N_2\Big(\frac{\sqrt{2}}{24}\varepsilon, \mathcal{F}, Z_{1:n}\Big)\Big] \exp\Big(-\frac{n\varepsilon^2}{288B^2}\Big).$$

Proposition. If $\mathcal{F}$ is in a $K$-dimensional vector space, then

$$N_2(\varepsilon, \mathcal{F}, z_{1:n}) \le 3\Big(\frac{2eB^2}{\varepsilon^2}\log\Big(\frac{3eB^2}{\varepsilon^2}\Big)\Big)^{K}.$$

Error estimates

Norm: for a function $\Psi$, define $\|\Psi\|^2_{k,M} := \frac{1}{M}\sum_{m=1}^{M} |\Psi(X^m_k)|^2$.

Theorem. For $N$ sufficiently large, there exists a positive constant $C$, independent of the time-grid, $M$ and the basis functions, such that

$$\sum_{k=0}^{N-1} \Delta_k \Big\{ E\big[\|y_k - y^{R,M}_k\|^2_{k,M}\big] + E\big[\|z_k - z^{R,M}_k\|^2_{k,M}\big] \Big\} \le C \sum_{k=0}^{N-1} \Delta_k \Big\{ \min_\alpha E|y^R_k(X_k) - \alpha \cdot p_{0,k}(X_k)|^2 + \sum_{l=1}^{q} \min_\alpha E|z^R_{l,k}(X_k) - \alpha \cdot p_{l,k}(X_k)|^2 \Big\} + \ldots$$

The remaining terms on the right-hand side are statistical error terms in $K$, $N$, $R$ and $M$ (of the type $KN\Delta_k/M$ and $KNR^2/M$), together with large-deviation terms of exponential type in $M$.

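Complexity analysis

Aim: reduce the error to $O(N^{-2\theta_{conv}})$.
- Assume $y_i \in C_b^{\kappa+1+\eta}$, $z_i \in C_b^{\kappa+\eta}$.
- Local polynomials on disjoint hypercubes, degree $\kappa + 1$ for $Y$ and $\kappa$ for $Z$.
- Bias approximation: $O(N^{-2\theta_{conv}})$ if $\delta_z = cN^{-\frac{\theta_{conv}}{\kappa+\eta}}$. $K = cN^{d\frac{\theta_{conv}}{\kappa+\eta}}$ up to log terms.
- Large deviation terms: $M = cKN^{2+2\theta_{conv}} = cN^{2+2\theta_{conv}+2d\frac{\theta_{conv}}{\kappa+\eta}}$ up to log terms.
- Computational work: $\mathcal{C} = cMN = cN^{3+2\theta_{conv}+2d\frac{\theta_{conv}}{\kappa+\eta}}$, so that $N^{-2\theta_{conv}} \le c\,\mathcal{C}^{-\frac{1}{2\left(1+\frac{3}{2\theta_{conv}}+\frac{d}{\kappa+\eta}\right)}}$ up to log terms.
- ODP scheme with $\theta_{conv} = 1/2$: $N^{-1} \le c\,\mathcal{C}^{-\frac{1}{2\left(4+\frac{2d}{\kappa+1+\eta}\right)}}$ up to log terms.

If $\kappa + \eta > 1$, MDP has better performance.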
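The comparison between the two schemes comes down to comparing the exponents of the computational work. The sketch below evaluates the two exponents as they are stated on this slide, with `smoothness` standing for $\kappa + \eta$; the function names and the sample dimension are illustrative assumptions.

```python
def mdp_exponent(theta_conv, d, smoothness):
    """Exponent 1 / (2 * (1 + 3/(2*theta_conv) + d/smoothness)) for the MDP scheme.

    smoothness stands for kappa + eta; larger exponents mean faster decay
    of the error in terms of computational work.
    """
    return 1.0 / (2.0 * (1.0 + 3.0 / (2.0 * theta_conv) + d / smoothness))


def odp_exponent(d, smoothness):
    """Exponent 1 / (2 * (4 + 2*d/(smoothness + 1))) for the ODP scheme."""
    return 1.0 / (2.0 * (4.0 + 2.0 * d / (smoothness + 1.0)))


# Illustrative case (assumed values): d = 3, theta_conv = 1/2.
smooth_case = (mdp_exponent(0.5, 3, 2.0), odp_exponent(3, 2.0))   # kappa + eta = 2
rough_case = (mdp_exponent(0.5, 3, 0.5), odp_exponent(3, 0.5))    # kappa + eta = 1/2
```

With $\theta_{conv} = 1/2$ the two exponents coincide exactly at $\kappa + \eta = 1$, and the MDP exponent is the larger one as soon as $\kappa + \eta > 1$.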

BSDE numerics: Implicit vs explicit

$q = 3$; $f(z) = 1.5\,z^{(1)}$; $\Phi(x) = (x^{(3)} - 100)_+$; $X^{(j)}_t = 100\,e^{(\sigma W_t)^{(j)}}$ with

$$\sigma = \begin{pmatrix} \sigma_1\sqrt{1-\rho^2} & 0 & \sigma_1\rho \\ 0 & \sigma_2\sqrt{1-\rho^2} & \sigma_2\rho \\ 0 & 0 & \sigma_3 \end{pmatrix}, \qquad \begin{pmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \\ \rho \end{pmatrix} = \begin{pmatrix} 0.01 \\ 0.05 \\ 0.03 \\ 0.1 \end{pmatrix}.$$

$N = 16$; basis $\prod_{i=0}^{q} g_i(\ln(x_i))$ for $g_i$ Hermite polynomials with $\sum_i \deg(g_i) \le 3$.

Explicit solution: $Y_t = \mathrm{BlackScholesCall}(t, X_t; \sigma_3, 100)$, $Z_t = (0, 0, \mathrm{BlackScholesHedge}(t, X_t; \sigma_3, 100))$.

BSDE numerics

[Figure: "Explicit vs Implicit", multistep forward scheme. log(error) plotted against log(work) for the Explicit and Implicit variants.]

Representation theorem due to Ma/Zhang

$X$ is a diffusion: $dX_t = b(t, X_t)\,dt + \sigma(t, X_t)\,dW_t$. We also need the gradient process and its inverse:

$$d\nabla X_t = b_x(t, X_t)\nabla X_t\,dt + \sigma_x(t, X_t)\nabla X_t\,dW_t,$$
$$d(\nabla X_t)^{-1} = -\big(b_x(t, X_t) - \sigma_x(t, X_t)^2\big)(\nabla X_t)^{-1}\,dt - \sigma_x(t, X_t)(\nabla X_t)^{-1}\,dW_t.$$

Representation theorem due to Ma/Zhang for $Z$:

$$Z_t = E_t\Big[\xi H^t_T + \int_t^T f(r, X_r, Y_r, Z_r) H^t_r\,dr\Big],$$

where

$$(r - t) H^t_r = \Big(\int_t^r \big[\sigma^{-1}(s, X_s)\nabla X_s (\nabla X_t)^{-1}\sigma(t, X_t)\big]^\top dW_s\Big)^\top.$$

The $H^t_r$ are the Malliavin weights; the representation formula is derived by means of Malliavin calculus, but remains true in the Lipschitz case, even though the BSDE is not Malliavin differentiable.

Algorithm 2: Malliavin weights

Let

$$(t_k - t_i) H^i_k = \Big(\sum_{j=i}^{k-1} \big[\sigma^{-1}(t_j, X_j)\nabla X_j (\nabla X_i)^{-1}\sigma(t_i, X_i)\big]^\top \Delta W_j\Big)^\top.$$

Recursively build the approximation starting at $i = N - 1$:

$$Z_i = E_i\Big[\xi H^i_N + \sum_{k=i+1}^{N-1} f(t_k, X_k, Y_{k+1}, Z_k) H^i_k \Delta_k\Big],$$
$$Y_i = E_i\Big[\xi + \sum_{k=i}^{N-1} f(t_k, X_k, Y_{k+1}, Z_k)\Delta_k\Big], \qquad Y_N = \xi.$$

Constraint on the time-grid: $\limsup_{N \to \infty}\ \sup_{i < N-1} \frac{\Delta_{i+1}}{\Delta_i} < \infty$.

Recall the special time-grid $\pi^\beta$.

Theorem. For sufficiently large $N$, there exists a positive constant $C$, independent of the time-grid, such that

$$\max_{0 \le i \le N-1} E|Y_i - Y_{t^\beta_i}|^2 + \sum_{i=0}^{N-1} E|Z_i - Z_{t^\beta_i}|^2 \Delta_i \le C N^{-1}.$$
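In the scalar case ($d = q = 1$) the discrete weight $H^i_k$ is a weighted sum of Brownian increments divided by the elapsed time. The following sketch assumes that the values of $\sigma(t_j, X_j)$, the gradient $\nabla X_j$ and the increments $\Delta W_j$ along one path are already available as lists; the function name `malliavin_weight` and all sample numbers are hypothetical.

```python
def malliavin_weight(i, k, times, sigma, grad_x, dw):
    """Scalar discrete Malliavin weight H^i_k along one simulated path.

    Implements (t_k - t_i) * H^i_k
        = sum_{j=i}^{k-1} sigma_j^{-1} * grad_j * grad_i^{-1} * sigma_i * dW_j,
    where sigma[j], grad_x[j], dw[j] are path values at grid index j (assumed given).
    """
    total = sum(
        grad_x[j] / sigma[j] * sigma[i] / grad_x[i] * dw[j]
        for j in range(i, k)
    )
    return total / (times[k] - times[i])


# Illustrative path (assumed): constant sigma and gradient, so that
# H^i_k reduces to (W_{t_k} - W_{t_i}) / (t_k - t_i).
times = [0.0, 0.25, 0.5, 1.0]
sigma = [0.5, 0.5, 0.5]
grad_x = [1.0, 1.0, 1.0]
dw = [0.5, -0.25, 0.125]
h_0_3 = malliavin_weight(0, 3, times, sigma, grad_x, dw)
h_1_3 = malliavin_weight(1, 3, times, sigma, grad_x, dw)
```

With constant coefficients the weight is just the rescaled Brownian increment, which makes the link with the factor $\Delta W_i / \Delta_i$ of Algorithm 1 visible.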

Projection estimates

Approximate the conditional expectation $E_i$ by projection on a finite-dimensional subspace of $L^2(\mathcal{F}_{t_i})$:

$$\hat{Z}_{l,i} = \arg\min_{\alpha \cdot p_{l,i}(X_i)} E\Big[\Big|\Phi(X_N) H^i_{l,N} + \sum_{k=i+1}^{N-1} f(t_k, X_k, \hat{Y}_{k+1}, \hat{Z}_k) H^i_{l,k}\Delta_k - \alpha \cdot p_{l,i}(X_i)\Big|^2\Big],$$

$$\hat{Y}_i = \arg\inf_{\alpha \cdot p_{0,i}(X_i)} E\Big[\Big|\Phi(X_N) + \sum_{k=i}^{N-1} f(t_k, X_k, \hat{Y}_{k+1}, \hat{Z}_k)\Delta_k - \alpha \cdot p_{0,i}(X_i)\Big|^2\Big], \qquad \hat{Y}_N = \Phi(X_N).$$

Projection estimates

Theorem. There exists a positive constant $C$, independent of the time-grid, such that

$$E|Y_i - \hat{Y}_i|^2 \le 2E|Y_i - P^Y_i Y_i|^2 + C\sum_{k=i}^{N-1} \big\{E|Y_{k+1} - P^Y_{k+1} Y_{k+1}|^2 + E|Z_k - P^Z_k Z_k|^2\big\}\frac{\Delta_k}{(T - t_k)^{1-\theta}},$$

$$E|Z_i - \hat{Z}_i|^2 \le 2E|Z_i - P^Z_i Z_i|^2 + C\sum_{k=i}^{N-1} \big\{E|Y_{k+1} - P^Y_{k+1} Y_{k+1}|^2 + E|Z_k - P^Z_k Z_k|^2\big\}\frac{\Delta_k}{(T - t_k)^{1-\theta}}.$$

Almost sure bounds

There exist positive constants $C_y$ and $C_z$, independent of the time-grid, such that, for all $i$,

$$|Y_i| \le C_y \quad \text{and} \quad |Z_i| \le \frac{C_z}{\sqrt{T - t_i}} \quad \mathbb{P}\text{-almost surely}.$$

Finally... Thank You For Your Attention!

References

[GT11] E. Gobet, P. Turkedjiev. Approximation for discrete BSDE using least-squares regression. http://hal.archives-ouvertes.fr/aut/turkedjiev/
[BD07] C. Bender and R. Denk. A forward scheme for backward SDEs. Stochastic Processes and their Applications, 117(12):1793-1823, 2007.
[CD11] D. Crisan, F. Delarue. Sharp gradient bounds for solutions of semi-linear PDEs. http://hal.archive-ouvert.fr/hal-00599543/fr, 2011.
[DG06] F. Delarue, G. Guatteri. Weak existence and uniqueness for forward-backward SDEs. Stochastic Processes and their Applications, 116(12):1712-1742, 2006.
[GGG11] C. Geiss, S. Geiss, and E. Gobet. Generalized fractional smoothness and $L_p$-variation of BSDEs with non-Lipschitz terminal condition. To appear in Stochastic Processes and their Applications, 2011.

[GLW05] E. Gobet, J.P. Lemor, and X. Warin. A regression-based Monte Carlo method to solve backward stochastic differential equations. Annals of Applied Probability, 15(3):2172-2202, 2005.
[LGW06] J.P. Lemor, E. Gobet, and X. Warin. Rate of convergence of an empirical regression method for solving generalized BSDEs. Bernoulli, 12(5):889-916, 2006.
[GM10] E. Gobet and A. Makhlouf. $L_2$-time regularity of BSDEs with irregular terminal functions. Stochastic Processes and their Applications, 120:1105-1132, 2010.
[GKKW02] L. Györfi, M. Kohler, A. Krzyżak, and H. Walk. A distribution-free theory of nonparametric regression. Springer Series in Statistics, 2002.

[MZ02] J. Ma, J. Zhang. Representation theorems for BSDEs. Annals of Applied Probability, 12(4):1390-1418, 2002.
[Mo10] T. Moseler. A Picard-type iteration for BSDEs: Convergence and importance sampling. PhD thesis.