Stochastic Differential Equations. Introduction to Stochastic Models for Pollutants Dispersion, Epidemic and Finance


15th March - 19th April, 2011, at Lappeenranta University of Technology (LUT), Finland

By Dr. W.M. Charles, University of Dar-Es-Salaam, Tanzania, and Dr. J.A.M. van der Weide, Delft University of Technology, The Netherlands


Contents

1 Introduction
   1.1 Objectives
   1.2 Stochastic modelling
      1.2.1 Probability Models
      1.2.2 Definitions

2 Conditional Probability and Expectation
   More Properties of Conditional Expectation
   2.1 Stochastic Processes
   2.2 The Gaussian Distribution
   2.3 Wiener Process
      Random walk Construction
      Diffusion processes

3 Stochastic Integrals

4 Itô Integral Process
   4.1 Motivation and problem formulation
   4.2 Stochastic Differential Equations
   4.3 Linear Stochastic Differential Equations

Itô's Formula
   The Multidimensional Itô Formula
   Applications of Itô's formula
   Examples of Linear SDEs with additive noise
   Examples of Linear SDEs with multiplicative noise
   Relation between Itô and Stratonovich SDEs

Connection between Stochastic Differentials and PDEs
   Markov processes and Transition Density
   Transition Density Estimation
   Forward density estimation
   The forward-reverse formulation
   The Generator of the Itô Diffusion
   Kolmogorov Backward equation (KBE)
   Feynman-Kač representation formula
   1-dimensional Fokker-Planck equation (FPE)
   d-dimensional Fokker-Planck equation (FPE)
   Definition of order of convergence of a numerical scheme

Derivation of Numerical schemes for SDEs
   Stochastic Taylor expansion and derivation of stochastic numerical schemes
   Numerical schemes

Application of SDEs
   Introduction to particle models and their application to model transport in shallow water
   Diffusion and dispersion
   Molecular diffusion
   Molecular diffusion with a constant diffusion coefficient
   Molecular diffusion with a space-varying diffusion coefficient
   Advection-diffusion process for a two-dimensional model
   Consistency of the particle model with the ADEs
   Introduction of SDEs to Model the Dynamics of Electricity and Oil Spot Prices

Application of SDEs to Finance
   Feynman-Kač representation formula
   Financial Markets
   The One-Period Binomial Model
   The Discrete Model
   The Multi-Period Binomial Model
   The Financial Market and The Black-Scholes Model
   The Black-Scholes Model
   Exercises

Appendix
Appendix II
Summary

Chapter 1

Introduction

This is an introduction to the theory of stochastic differential equations (SDEs) for those who wish to model the dynamics of systems in chemistry, biology, finance, economics and population dynamics, to mention but a few. It assumes that the learner has some background in statistics and probability theory, but it starts with the definitions of some important concepts that the learner will encounter in this course without discussing them in depth.

1.1 Objectives

1. The course intends to provide an understanding of modelling problems related to stochastic differential equations.
2. To introduce practical skills and solution methods, both numerical and analytical.
3. To describe areas of application, such as the dispersion of pollutants in shallow water.

1.2 Stochastic modelling

There are a number of textbooks that provide full details of the background in probability theory and stochastic calculus, for example Arnold (1974), Øksendal (2003), Gihman (1972) and Kloeden (1999). The main definitions discussed in this chapter are taken from these textbooks.

1.2.1 Probability Models

Stochastic calculus is concerned with the study of stochastic processes, which model uncertainty. Probability models can be used to model uncertainty. The basic object in a probability model is a probability space, which is a triple (Ω, F, P) consisting of a set Ω,

usually denoted as the sample space, a σ-field F of subsets of Ω and a probability P defined on F. The set Ω can be considered as the set of all possible scenarios that can occur. To any event we can associate the subset A ⊂ Ω consisting of all scenarios at which the event occurs. Such a subset will also be called an event, and F is the collection of all events. From a mathematical point of view, it is important to consider only collections of events that have the structure of a σ-field.

1.2.2 Definitions

Definition 1 A collection F of subsets of a set Ω is called a σ-field if
1. Ω ∈ F;
2. if A ∈ F, then A^c = Ω \ A ∈ F;
3. if (A_n) is a sequence in F, then ∪_{n=1}^∞ A_n ∈ F.

A measurable space is a pair (Ω, F), where Ω is a set and F a σ-field of subsets of Ω. As an example, the collection P(Ω) of all subsets of Ω is a σ-field.

Definition 2 A probability P defined on a σ-field F is a map from F to the interval [0, 1] such that
1. P(Ω) = 1;
2. P(∪_{n=1}^∞ A_n) = Σ_{n=1}^∞ P(A_n) for any pairwise disjoint sequence (A_n) ⊂ F. Pairwise disjoint means that A_i ∩ A_j = ∅ for i ≠ j.

Definition 3 (Random variable) A random variable is a real-valued function X(ω), ω ∈ Ω, that is measurable with respect to the σ-field F. That is, X : Ω → R.

Definition 4 (Distribution function) The probabilistic behaviour of X(ω) is completely and uniquely specified by the distribution function F(x) = P({ω ∈ Ω : X(ω) < x}).

Definition 5 (Continuous random variable) X(ω) is a continuous random variable if there exists a function f(x) (the density function) such that

f(x) ≥ 0,  ∫_{−∞}^{∞} f(x)dx = 1,  F(x) = ∫_{−∞}^{x} f(u)du.

Random variables can have different distributions, for example Poisson, exponential or Gaussian, and they can take widely varying values. The moments of a random variable describe various characteristics of its distribution.

Definition 6 (Expectation (mean) of a random variable) If X is a random variable defined on the probability space (Ω, F, P), then the expected value or mean of X is

E(X) = ∫_Ω X dP.

This is the average of X over the entire probability space. For a continuous random variable on R,

E(X) = ∫_{−∞}^{∞} x f(x)dx.

Definition 7 (Variance) The variance is a measure of the spread of the data about the mean µ:

Var(X) = E((X − µ)²) = E(X²) − µ².

The standard deviation is σ = √Var(X).

Definition 8 (The k-th order moment) The k-th order moment of a continuous random variable is defined by

µ_k = E(X^k) = ∫_{−∞}^{∞} x^k f(x)dx.

The expectation satisfies various properties, such as linearity; see Øksendal (2003) and Jazwinski (1970), for example.

Definition 9 (Gaussian random variable) A random variable X is a Gaussian random variable if it has the Gaussian (or normal) density function

f(x) = (1/(σ√(2π))) exp(−(x − µ)²/(2σ²)),

where µ is the mean and σ² is the variance of the normal distribution N(µ, σ²). The function f(x) is bell-shaped, centred at x = µ, and is stretched or compressed according to the magnitude of σ²; see Figure 1.1 (a)-(b). The maximum value is 1/(σ√(2π)). When µ = 0 and σ = 1, the distribution N(0, 1) is known as the standard Gaussian distribution.

Figure 1.1: Gaussian density functions p(x) on x ∈ [−4, 4]. (a) µ = 0, σ = 1: the maximum value is p(0) = 0.3989. (b) µ = 0, σ = 0.5: the maximum value is p(0) = 0.7979.
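The following short MATLAB sketch (not part of the original notes) reproduces plots of the kind shown in Figure 1.1; the value σ = 0.5 for panel (b) is inferred from the quoted maximum value 0.7979.

% Plot Gaussian densities with mu = 0 and sigma = 1, 0.5 (cf. Figure 1.1)
x = -4:0.01:4;
mu = 0;
for sigma = [1 0.5]
    p = exp(-(x - mu).^2/(2*sigma^2))/(sigma*sqrt(2*pi));  % N(mu,sigma^2) density
    figure; plot(x, p);
    xlabel('x'); ylabel('p(x)');
    title(sprintf('Gaussian density function with \\sigma = %g', sigma));
end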

Definition 10 (Covariance) The covariance of two random variables X and Y is defined to be

Cov(X, Y) = E((X − µ_1)(Y − µ_2)) = E(XY) − E(X)E(Y),

where µ_1 = E(X) and µ_2 = E(Y).

Let us now consider the convergence of random variables. Let X and X_n, n = 1, 2,..., be real-valued random variables defined on a probability space (Ω, F, P). The distribution functions of X and X_n are F and F_n respectively. The convergence of the sequence X_n to X has various definitions, depending on the way in which the difference between X_n and X is measured. Let us look at the following definitions.

Definition 11 (Convergence with probability one (w.p.1), that is, a.s.) A sequence of random variables {X_n(ω)} converges with probability one to X(ω) if

P({ω ∈ Ω : lim_{n→∞} X_n(ω) = X(ω)}) = 1.

This is also called almost sure convergence.

Definition 12 (Convergence in mean square) A sequence of random variables {X_n(ω)} such that E(X_n²(ω)) < ∞ for all n converges in mean square to X(ω) if

lim_{n→∞} E(|X_n − X|²) = 0.

Definition 13 (Convergence in distribution) A sequence of random variables {X_n(ω)} converges in distribution to X if

lim_{n→∞} F_n(x) = F(x) for every x ∈ R at which F is continuous.

Definition 14 (Convergence in probability) A sequence of random variables {X_n(ω)} converges in probability (or stochastically) to X if, for every ε > 0,

lim_{n→∞} P({ω ∈ Ω : |X_n(ω) − X(ω)| ≥ ε}) = 0.

Definition 15 (Stochastic process) A stochastic process is a family of random variables X(t, ω) of two variables, t ∈ T and ω ∈ Ω, on a common probability space (Ω, F, P), which assumes real values and is P-measurable as a function of ω for each fixed t. The parameter t is interpreted as time, with T being a time interval. X(t, ·) represents a random variable on the above probability space Ω, while X(·, ω) is called a sample path or trajectory of the stochastic process.

Definition 16 (Stationary process) A stochastic process X(t) such that E(|X(t)|²) < ∞ for all t ∈ T is said to be stationary if its distribution is invariant under time displacements, i.e. for every h,

F_{X(t_1+h),...,X(t_n+h)}(x_1,...,x_n) = F_{X(t_1),...,X(t_n)}(x_1,...,x_n).

That is, all finite-dimensional distributions of X are invariant under an arbitrary time shift. If X is stationary, then the finite-dimensional distributions of X depend only on the lags between the times {t_1,...,t_n} rather than on their values. In particular, the distribution of X(t) is the same for all t ∈ T.

Definition 17 A continuous-time stochastic process X = {X(t), t ≥ 0} is called a Markov process if it satisfies the Markov property, i.e.

P(X(t_{n+1}) ∈ B | X(t_1) = x_1,..., X(t_n) = x_n) = P(X(t_{n+1}) ∈ B | X(t_n) = x_n)

for all Borel subsets B of R, time instants 0 ≤ t_1 < t_2 < ... < t_n < t_{n+1} and all states x_1, x_2,..., x_n ∈ R for which the conditional probabilities are defined. That is, the future behaviour of the process depends on the past only through the current state.

Let X be a Markov process and write its transition probabilities as

P(s, x; t, B) = P(X(t) ∈ B | X(s) = x), s < t.

If the state space is discrete, the transition probabilities are uniquely determined by the transition matrix with components

P(s, i; t, j) = P(X(t) = x_j | X(s) = x_i),

the probability of moving from state i at time s to state j at time t; states can simply be taken as values of a random variable X. For the continuous case we have

P(s, x; t, B) = ∫_B f(s, x; t, y)dy for all Borel sets B,

where the density f(s, x; t, ·) is called the transition density. A Markov process is said to be homogeneous if all its transition probabilities depend only on the time difference t − s rather than on the specific values of s and t.

Definition 18 A stochastic process X is called an {F_t}-martingale if the following conditions hold:

1. X is adapted to the filtration {F_t}_{t≥0};

2. for all t ≥ 0, E[|X(t)|] < ∞;

3. for all s and t with s ≤ t, the relation E[X(t) | F_s] = X(s) holds.

Note: if E[X(t) | F_s] ≤ X(s) for s ≤ t, then X is said to be an {F_t}-supermartingale, while if E[X(t) | F_s] ≥ X(s), then X is said to be an {F_t}-submartingale. Condition (1) says that we can observe the value X(t) at time t, and condition (2) is a technical (integrability) condition. The really important condition is the third: it means that the expectation (best estimate) of a future value of X_t, given the information F_s available today, equals today's observed value X_s, for t ≥ s.


Chapter 2

Conditional Probability and Expectation

In this chapter we will give a review of conditioning, conditional probability and conditional expectation. From first courses in statistics we know the definition of the conditional probability of the event B given the occurrence of the event A:

P(B | A) = P(B ∩ A)/P(A).

Here it is required that P(A) > 0. So, if X is a random variable with a probability density f, i.e.

P(a ≤ X ≤ b) = ∫_a^b f(x)dx,

the definition of conditional probability cannot be applied if we condition on the event A = {X = a}. In first courses in statistics one usually defines this conditional probability by using a limit argument, as follows. Consider a pair of random variables (X, Y) with joint probability density f_{X,Y}, i.e.

P((X, Y) ∈ G) = ∫∫_G f_{X,Y}(u, v)du dv.

It follows that

P(Y ≤ b | a ≤ X ≤ a + Δ) = [∫_a^{a+Δ} ∫_{−∞}^b f_{X,Y}(u, v)dv du] / [∫_a^{a+Δ} ∫_{−∞}^{∞} f_{X,Y}(u, v)dv du].

Assuming that the joint density is smooth, we can calculate the limit as Δ ↓ 0:

lim_{Δ↓0} P(Y ≤ b | a ≤ X ≤ a + Δ) = ∫_{−∞}^b f_{X,Y}(a, v)dv / f_X(a),

where f_X(a) = ∫_{−∞}^{∞} f_{X,Y}(a, v)dv denotes the (marginal) density of the random variable X. The function

f_{Y|X=a}(v) = f_{X,Y}(a, v)/f_X(a)

is a probability density, called the conditional density of Y given X = a. Using this density, the conditional expectation of Y given X = a is defined as

E(Y | X = a) = ∫ v f_{Y|X=a}(v)dv = ∫ v (f_{X,Y}(a, v)/f_X(a))dv.

In this way we define E(Y | X) as a random variable: the value it takes depends on the value of X, namely E(Y | X) = E(Y | X = a) on {X = a}. It follows that for bounded Borel functions φ,

E[φ(X)E(Y | X)] = ∫ φ(a)E(Y | X = a)f_X(a)da
               = ∫ φ(a) ∫ v (f_{X,Y}(a, v)/f_X(a))dv f_X(a)da
               = ∫∫ φ(a)v f_{X,Y}(a, v)dv da
               = E[φ(X)Y].

It is this property that we will use as the defining property of the conditional expectation in a more general set-up, where we do not have to assume the existence of probability densities. Let (Ω, F, P) be a probability space and let G ⊂ F be a sub-σ-algebra. Let X be an integrable random variable, i.e. E|X| < ∞. Define the set function Q on G as follows:

Q(G) = ∫_G X dP.

It follows that Q is a (signed) measure on (Ω, G) which is absolutely continuous with respect to the restriction of P to G: P(G) = 0 implies Q(G) = 0. The Radon-Nikodym Theorem implies the existence of a density of Q with respect to P, i.e. a G-measurable random variable Y such that

Q(G) = ∫_G Y dP,

15 15 or more general Z dq = ZY dp, for any nonnegative, G-measurable random variable Z. The density Y is unique modulo P -null-sets and is sometimes denoted as a derivative: Y = dq dp. Definition 19 The conditional expectation E(X G) of X given G is defined as the G- measurable random variable satisfying the relation X dp = E(X G) dp, for any G G. G G To see what this means we consider the special case where G = {, A, A c, Ω} and X = 1 B. It follows from the definition that E(1 B G) = P (B A)1 A + P (B A c )1 A c. More general, let G be a finite sub-σ-algebra. Then, there exists a (unique) G-measurable partition A = {A 1,..., A n } such that every element of G can be represented as a union of partition elements. Every G-measurable random variable Y is constant on the partition elements and can be represented as Y = n y i 1 Ai. So, in particular, there exist real numbers x i such that Now, hence and i=1 E(X G) = n x i 1 Ai. i=1 X dp = A j E(X G) dp = A j A j n x i 1 Ai dp = x j P (A j ), i=1 x j = 1 X dp =: E(X A j ), P (A j ) A j E(X G) = n E(X A i )1 Ai. i=1

If G and X are independent, then for any G ∈ G,

∫_G X dP = E(X 1_G) = E(X)E(1_G) = ∫_G E(X) dP,

so E(X | G) = E(X). In the next theorem we present a number of properties of the conditional expectation.

Theorem 1
(a) If G = {∅, Ω}, then E(X | G) = E(X);
(b) if Z is bounded and G-measurable, then E(XZ | G) = Z E(X | G);
(c) if G_1 ⊂ G_2, then E(E(X | G_2) | G_1) = E(X | G_1);
(d) if g is a convex function on the range of X, then g(E(X | G)) ≤ E(g(X) | G);
(e) if X_n ≥ 0 and X_n ↑ X, then E(X_n | G) ↑ E(X | G);
(f) if X_n ≥ 0, then E(lim inf_n X_n | G) ≤ lim inf_n E(X_n | G);
(g) if lim_n X_n = X almost surely and |X_n| ≤ Y with E(Y) < ∞, then lim_n E(X_n | G) = E(X | G).

The conditional expectation E(X | G) can be considered as a projection of the random variable X onto the space of G-measurable random variables, as follows. Let X ∈ L²(Ω, F, P), the vector space of (equivalence classes of) square-integrable random variables. With the inner product ⟨X, Y⟩ = E(XY), L²(Ω, F, P) is a Hilbert space. It follows from Theorem 1(d) that

E(|E(X | G)|²) ≤ E(E(X² | G)) = E(X²) < ∞.

Hence the map X ∈ L²(Ω, F, P) ↦ E(X | G) ∈ L²(Ω, G, P) is a linear contraction. It is the orthogonal projection of L²(Ω, F, P) onto L²(Ω, G, P).

2.0.3 More Properties of Conditional Expectation

Definition 20 (Conditional expectation) Let (Ω, F, P) be a probability space and let G be a sub-σ-algebra of F. Let X be a random variable on (Ω, F, P). Then E[X | G] is defined to be any random variable Y that satisfies:
(a) Y is G-measurable;
(b) for every A ∈ G we have the partial averaging property ∫_A Y dP = ∫_A X dP.

1. (Role of independence): If a random variable X is independent of a σ-algebra H, then

E[X | H] = E[X].   (2.1)

The point of this statement is that if X is independent of H, then the best estimate of X based on the information in H is E[X], the same as the best estimate of X based on no information.

2. (Measurability): If a random variable X is G-measurable, then

E[X | G] = X.   (2.2)

The point of this statement is that if the information content of G is sufficient to determine X, then the best estimate of X based on the information G is X itself.

3. (Tower property): If H is a sub-σ-algebra of G, then

E[E(X | G) | H] = E[X | H].   (2.3)

The point of this statement is that H being a sub-σ-algebra of G means that G contains more information than H. If we estimate X based on the information in G, and then estimate the estimator based on the smaller amount of information in H, we get the same result as if we had estimated X directly based on the information in H.

4. (Taking out what is known): If Z is G-measurable, then

E[ZX | G] = Z E[X | G].   (2.4)

The point of this statement is that when conditioning on G, the G-measurable random variable Z acts like a constant, so we can take it out of the expectation sign.

2.1 Stochastic Processes

Let (Ω, F, P) be a probability space. A collection X = (X_t : t ∈ T) of random variables on (Ω, F, P) is called a stochastic process. The variable t is usually considered as a time parameter and X_t denotes the state of a system at time t. For a fixed state of the world ω ∈ Ω, the function t ∈ [0, ∞) ↦ X_t(ω) is called a sample path. The probability distribution of the process is a probability measure on the function space R^[0,T] of all sample paths, and it is determined by the finite-dimensional distributions, i.e. the probabilities

P(X(t_1) ∈ B_1,..., X(t_n) ∈ B_n)

for any finite sequence t_1 ≤ ... ≤ t_n in T and Borel sets B_1,...,B_n. Two stochastic processes X and Y with the same finite-dimensional distributions are identified, and we say that X and Y are versions (or modifications) of one another. Under certain conditions, we can show the existence of a version with sample paths satisfying certain regularity properties. For example, if there exist α > 0 and ε > 0 such that for any 0 ≤ u ≤ t ≤ T,

E(|X_t − X_u|^α) ≤ C(t − u)^{1+ε},   (2.5)

for some constant C, then there exists a version of X with continuous sample paths. The σ-algebra

F_t = σ(X_u, u ≤ t), t ∈ T,

represents the information available at time t to an observer of the process X. The increasing collection of σ-algebras (F_t)_{t∈T} is called a filtration. A stochastic process Y = (Y_t : t ∈ T) defined on the same probability space (Ω, F, P) is said to be adapted to the filtration (F_t)_{t∈T} if, for any t, Y_t is F_t-measurable. An (F_t)-adapted process Y is called a martingale if, for any t, Y_t is integrable and, for any s ≤ t in T,

E(Y_t | F_s) = Y_s.

2.2 The Gaussian Distribution

A random variable U has a standard normal distribution if its probability density is given by

f(u) = (1/√(2π)) e^{−u²/2}.

The characteristic function of a standard normal random variable is given by

φ(t) = E e^{itU} = e^{−t²/2}, t ∈ R.

It follows that E[U^{2k+1}] = 0 and E[U^{2k}] = (2k)!/(k! 2^k). A random variable X has a Gaussian distribution with mean µ and variance σ² if X = µ + σU, where U is a standard normal random variable and σ > 0. A standard normal random variable is Gaussian with mean 0 and variance 1. The probability density of X is given by

f(x) = (1/(√(2π)σ)) e^{−((x−µ)/σ)²/2}.

The characteristic function of a Gaussian distribution with mean µ and variance σ² is given by

φ(t) = e^{iµt − σ²t²/2}, t ∈ R.

A random n-vector is a measurable mapping defined on some probability space (Ω, F, P), taking values in a finite-dimensional vector space R^n. A random vector X can be represented as a column vector X = (X_1,...,X_n)^T, where the components X_i are random variables. If the components X_i have finite first moments, then the mean of the random vector is the column vector

µ = (µ_1,...,µ_n)^T = (E[X_1],...,E[X_n])^T.

If the second moments of the components are also finite, then the covariance matrix of the random vector is the n × n matrix Σ = (σ_ij)_{1≤i,j≤n}, where σ_ij = Cov(X_i, X_j). Note that a covariance matrix is a symmetric matrix, i.e. σ_ij = σ_ji. The covariance matrix of a non-degenerate random vector is positive definite:

⟨a, Σa⟩ = Var(⟨a, X⟩) > 0

for all a ∈ R^n \ {0}. A random n-vector X has a (non-degenerate) n-dimensional Gaussian distribution with mean vector µ and covariance matrix Σ if there exists an n × n matrix A such that det(A) ≠ 0 and

X = µ + AU,

where U is a random n-vector with independent standard normal components. It follows, as in the case of Gaussian random variables, that the matrix A can be considered as a square root of the covariance matrix Σ: Σ = AA^T. The characteristic function of the n-dimensional Gaussian distribution is given by

φ(t) = exp(i⟨t, µ⟩ − ½⟨t, Σt⟩), t ∈ R^n.

Let B be a k × n matrix; then the characteristic function of the random k-vector BX is

φ_BX(t) = exp(i⟨t, Bµ⟩ − ½⟨t, BΣB^T t⟩), t ∈ R^k,

and we see that BX is a Gaussian k-vector with mean Bµ and covariance matrix BΣB^T. The joint density of a non-degenerate Gaussian n-vector with mean µ and covariance matrix Σ is given by

f(x_1,...,x_n) = (2π)^{−n/2} √(det(Σ^{−1})) exp(−½⟨x − µ, Σ^{−1}(x − µ)⟩).

2.3 Wiener Process

The concept of the Wiener process is needed in order to model Gaussian disturbances. A stochastic process W = (W(t) : t ≥ 0) defined on some filtered probability space (Ω, F, (F_t)_{t≥0}, P) is a Wiener process if

1. W is adapted to the filtration (F_t)_{t≥0};
2. W(0) = 0;
3. W has independent increments: if r < s ≤ u < t, then W(t) − W(u) and W(s) − W(r) are independent random variables; in particular, if s ≤ t, then W(t) − W(s) is independent of F_s;
4. for s < t, the increment W(t) − W(s) has the Gaussian distribution N(0, t − s);
5. the trajectories (paths) t ∈ [0, ∞) ↦ W(t, ω) of W are continuous functions.

It follows immediately that the Wiener process is a martingale with respect to the filtration (F_t):

E(W(t) | F_s) = W(s), for s ≤ t.

For any finite sequence 0 < t_1 < ... < t_n, the increments W(t_n) − W(t_{n−1}),..., W(t_2) − W(t_1), W(t_1) are independent Gaussian random variables, hence

\[
\begin{pmatrix} W(t_1)\\ W(t_2)\\ \vdots\\ W(t_n)\end{pmatrix}
=
\begin{pmatrix}
\sqrt{t_1} & 0 & \cdots & 0\\
\sqrt{t_1} & \sqrt{t_2-t_1} & \cdots & 0\\
\vdots & & \ddots & \vdots\\
\sqrt{t_1} & \sqrt{t_2-t_1} & \cdots & \sqrt{t_n-t_{n-1}}
\end{pmatrix}
\begin{pmatrix} U_1\\ U_2\\ \vdots\\ U_n\end{pmatrix},
\]

where U_i = (W(t_i) − W(t_{i−1}))/√(t_i − t_{i−1}), i = 1,...,n, with t_0 = 0. It follows that the n-vector (W(t_1),...,W(t_n))^T is Gaussian with mean 0 and covariance matrix

\[
\Sigma=\begin{pmatrix}
t_1 & t_1 & t_1 & \cdots & t_1\\
t_1 & t_2 & t_2 & \cdots & t_2\\
t_1 & t_2 & t_3 & \cdots & t_3\\
\vdots & & & \ddots & \vdots\\
t_1 & t_2 & t_3 & \cdots & t_n
\end{pmatrix}.
\]

The Wiener process is called a Gaussian process, since all its finite-dimensional distributions are Gaussian. Since E(|W(t) − W(u)|⁴) = 3(t − u)², the existence of a continuous version follows from formula (2.5) in Section 2.1 with α = 4 and ε = 1. However, almost all sample paths of the Wiener process are nowhere differentiable. We will not give a rigorous proof here, but note that (W(t+h) − W(t))/√h is standard normal for every value of h > 0. So if we consider the ratio (W(t+h) − W(t))/h and let h tend to 0, we see that the variance of this ratio becomes arbitrarily large, and so we cannot expect the existence of a limit of the ratio for every ω, which would have to be the case for W_t(ω) to have a time derivative. We now consider a further property of the sample paths of the Wiener process. Define for any t > 0 the p-variation of the sample path by

V_p(t) = lim_{n→∞} Σ_{i=1}^{2^n} |W(t_i^n) − W(t_{i−1}^n)|^p,

where t_i^n = i t/2^n, i = 0, 1,..., 2^n, is a partition of [0, t]. Then

V_p(t) = ∞ if 1 ≤ p < 2,  V_p(t) = t if p = 2,  V_p(t) = 0 if p > 2.

To see this, define

S_n = Σ_{i=1}^{2^n} |W(t_i^n) − W(t_{i−1}^n)|².

Since

E[S_n] = Σ_{i=1}^{2^n} E(|W(t_i^n) − W(t_{i−1}^n)|²) = 2^n · t/2^n = t

and

Var[S_n] = Σ_{i=1}^{2^n} Var(|W(t_i^n) − W(t_{i−1}^n)|²) = 2^n · 2(t/2^n)² = 2t²/2^n,

it follows by monotone convergence that

E{Σ_{n=1}^∞ (S_n − t)²} = lim_{N→∞} E{Σ_{n=1}^N (S_n − t)²} = lim_{N→∞} Σ_{n=1}^N Var(S_n) = 2t² < ∞.

Indeed, note that E{(S_n − t)²} = E(S_n²) − 2tE(S_n) + t² = E(S_n²) − 2t² + t², and Var(S_n) = E[S_n²] − E[S_n]², therefore

E[S_n²] = Var(S_n) + E[S_n]² = 2t²/2^n + t²,

so that

E{(S_n − t)²} = E(S_n²) − 2t² + t² = 2t²/2^n + t² − 2t² + t² = 2t²/2^n.

Summing up to N and letting N → ∞, we get

Σ_{n=1}^∞ E{(S_n − t)²} = 2t² lim_{N→∞} Σ_{n=1}^N 1/2^n = 2t² < ∞.

It follows that the random variable Σ_{n=1}^∞ (S_n − t)² is finite a.s., which implies that

V_2(t) = lim_{n→∞} S_n = t, a.s.

Now let 1 ≤ p < 2, and assume that

V_p(t) = lim_{n→∞} Σ_{i=1}^{2^n} |W(t_i^n) − W(t_{i−1}^n)|^p = L < ∞.

Then

Σ_{i=1}^{2^n} |W(t_i^n) − W(t_{i−1}^n)|² ≤ max_i |W(t_i^n) − W(t_{i−1}^n)|^{2−p} Σ_{i=1}^{2^n} |W(t_i^n) − W(t_{i−1}^n)|^p ≤ (L + 1) max_i |W(t_i^n) − W(t_{i−1}^n)|^{2−p} → 0

by uniform continuity. This contradicts V_2(t) = t, so V_p(t) = ∞ for 1 ≤ p < 2. In the same way we find V_p(t) = 0 for p > 2. Since V_1(t) = ∞, we cannot define Stieltjes integration with respect to a path of the Wiener process. To understand the problem, we will consider the integral

∫_0^t W(s)dW(s).

Let ξ_i^n ∈ [t_{i−1}^n, t_i^n). Define the Riemann sums

R_n = Σ_{i=1}^{2^n} W(ξ_i^n)(W(t_i^n) − W(t_{i−1}^n)).

To study the limit behaviour of R_n, we write R_n as sums of terms that are squares of increments or products of increments over disjoint intervals, as follows:

R_n = ½W²(t) − ½W²(0) − ½ Σ_{i=1}^{2^n} (W(t_i^n) − W(t_{i−1}^n))² + Σ_{i=1}^{2^n} (W(ξ_i^n) − W(t_{i−1}^n))(W(t_i^n) − W(ξ_i^n)) + Σ_{i=1}^{2^n} (W(ξ_i^n) − W(t_{i−1}^n))².

1. Since Σ_{i=1}^{2^n} (W(t_i^n) − W(t_{i−1}^n))² → V_2(t) = t, the first sum converges to t.

2. Put T_n = Σ_{i=1}^{2^n} (W(ξ_i^n) − W(t_{i−1}^n))(W(t_i^n) − W(ξ_i^n)); then

E(T_n²) = Σ_{i=1}^{2^n} (ξ_i^n − t_{i−1}^n)(t_i^n − ξ_i^n) ≤ 2^n (t/2^n)² = t²/2^n,

hence E(Σ_n T_n²) < ∞ and T_n → 0 a.s.

3. Put U_n = Σ_{i=1}^{2^n} (W(ξ_i^n) − W(t_{i−1}^n))²; then

E(U_n) = Σ_{i=1}^{2^n} (ξ_i^n − t_{i−1}^n) and Var(U_n) = 2 Σ_{i=1}^{2^n} (ξ_i^n − t_{i−1}^n)² ≤ 2t²/2^n.

It follows that E(Σ_n (U_n − E(U_n))²) < ∞ and U_n − E(U_n) → 0 a.s.

If we choose ξ_i^n = (1 − λ)t_{i−1}^n + λ t_i^n, 0 ≤ λ ≤ 1, then E(U_n) = λt, and it follows that the Riemann sums converge:

lim_{n→∞} R_n = ½W²(t) + (λ − ½)t.

So the integral ∫_0^t W(s)dW(s) depends on the point at which we evaluate the integrand. The choice λ = 0, i.e. ξ_i^n = t_{i−1}^n, leads to the Itô integral. The choice λ = 1/2, i.e. ξ_i^n = (t_{i−1}^n + t_i^n)/2, leads to the Stratonovich integral.

2.3.1 Random walk Construction

Here the focus is on the simulation of the values (W(t_1),...,W(t_n)) of a Wiener process at a fixed set of points 0 < t_1 < ... < t_n. Because the Wiener process has independent, normally distributed increments, simulating the W(t_i) from their increments is straightforward. Let U_1,...,U_n be independent standard normal random variables, generated by any standard method. We set t_0 = 0 and W(0) = 0. Subsequent values can be generated as follows:

W(t_{i+1}) = W(t_i) + √(t_{i+1} − t_i) U_{i+1}, for i = 0, 1, 2,..., n−1.

For a Brownian motion X ~ BM(µ, σ²) with constants µ and σ and given X(0), set

X(t_{i+1}) = X(t_i) + µ(t_{i+1} − t_i) + σ√(t_{i+1} − t_i) U_{i+1}, for i = 0, 1, 2,..., n−1.
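As an illustration of the recursion above, here is a minimal MATLAB sketch (not part of the original notes) that simulates a Brownian motion X ~ BM(µ, σ²) on an equidistant grid; the parameter values are chosen only for illustration.

% Minimal sketch of the random walk construction above, for a Brownian
% motion X ~ BM(mu, sigma^2) on an equidistant grid (illustrative values).
mu = 0.5; sigma = 1; T = 10; n = 1000;
dt = T/n;  t = (0:n)*dt;
U  = randn(n,1);                       % independent standard normal variables
X  = zeros(n+1,1);  X(1) = 0;          % X(0) = 0
for i = 1:n
    X(i+1) = X(i) + mu*dt + sigma*sqrt(dt)*U(i);
end
plot(t, X); xlabel('time t'); ylabel('X(t)');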

25 2.3. WIENER PROCESS 25.8 dt= W(t) t Figure 2.1: Sample path of Wiener the sample path of the Wiener is continuous, that is, its sample paths are almost surely, continuous function of time. We can see from the fact that for s t we have Var (W (t) W (s)) = E [ (W (t) W (s)) 2] E [(W (t) W (s))] 2 = t s its mean and the variance grows without bound as time increases while the mean always remain zero, which means that many sample paths must attain larger and larger values, both positive and negative as time increases. However, almost all sample paths of the Wiener process are nowhere-differentiable. We will not give a rigorous proof here, this is easy to see in the mean-square sense [ (W ) ] 2 (t + h) W (t) E = E [ (W (t + h) W (t)) 2] = h h h 2 h 2 The properties E(W (t)(w (s)) = min(s, t) can be used to demonstrate the independence of the Wiener increments. Let us assume that the time interval : t <... < t i 1 < t i <... t j 1 < t j... < t n, thus E[(W (t i ) W (t i 1 ))(W (t j ) W (t j 1 ))] = E(W (t i )W (t j )) E(W (t i )W (t j 1 )) E(W (t i 1 )W (t j )) + E(W (t i 1 )W (t j 1 )) = t i t i t i 1 + t i 1 =. Where the increments (W (t i ) W (t i 1 )) and (W (t j ) W (t j 1 )) are independent.

The mean and variance of the Wiener process W_t (see Figure 2.2) can be estimated using the following program.

clear
dt=1; N=100;          % N time steps
n=1000;               % n sample paths
t=zeros(N,1); mean=zeros(N,1); var=zeros(N,1);
for i=1:N
    t(i)=i;
end
for j=1:n
    Rn=randn(N,1);
    W=zeros(N,1);
    W(1)=0;
    for i=2:N
        W(i)=W(i-1)+sqrt(dt)*Rn(i);
    end
    mean=mean+W;
    var=var+W.*W;
end
mean=mean/n; var=var/n;
plot(t,mean,'.-',t,var,'-');
set(gca,'FontName','Times New Roman','FontSize',16);
x=xlabel('time t');
set(x,'FontName','Times New Roman','FontSize',16);
y=ylabel('Mean and Variance');
set(y,'FontName','Times New Roman','FontSize',16);
title('Mean and Variance of samples of the standard Wiener Process')
legend('Mean','Variance')

Figure 2.2: The mean and variance of the Wiener process estimated using different numbers of samples: (a) a small number of samples, (b) a large number of samples.

In this example, a small and a large number of sample paths are used to estimate the mean and variance; see Figure 2.2(a). The plots show that the standard Wiener process has mean zero and variance

t. Note that, for a large number of samples, the variance is expected to be a straight line, growing linearly with time, as in Figure 2.2(b), where the larger number of samples has been used. Thus Var(W_t) = t, and so the spread of the process increases as time increases even though the mean stays at 0; see Figure 2.2. Because of this, typical sample paths of the Wiener process attain larger values in magnitude as time progresses, and consequently the sample paths of the Wiener process are not of bounded variation; see Øksendal (2003), for example. That is the reason why the stochastic integral cannot be considered as a Riemann-Stieltjes integral.

More general processes with the martingale property could be used here, but we restrict ourselves to the Wiener process W(t). The Wiener process is a martingale: for each t, W(t) is F_t-measurable, and F_t contains exactly the information learned by observing the Wiener process up to time t. {F_t}_{t≥0} is called the filtration generated by the Wiener process.

Example 1 If X_t = W_t² − t, show that the process X_t is a martingale.

E[X_t | F_s] = E[W_t² − t | F_s] = E[W_t² | F_s] − E[t | F_s] = (t − s) + W_s² − t = W_s² − s = X_s,

so W_t² − t is a martingale.

Example 2 Use the fact that E[W_t^{2k}] = (2k)! t^k/(k! 2^k) and E[W_t^{2k+1}] = 0 to calculate E[W_t W_s³] for s ≤ t.

Solution 1

E[W_t W_s³] = E[(W_t − W_s + W_s) W_s³]
            = E[W_s³ (W_t − W_s)] + E[W_s⁴]
            = E[W_s³] E[W_t − W_s] + E[W_s⁴]
            = 0 + 3s².
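The result of Example 2 can be checked by simulation. The following MATLAB sketch (added here, not in the original notes) estimates E[W_t W_s³] by Monte Carlo for the illustrative choice s = 1, t = 2 and compares it with 3s².

% Monte Carlo check of E[W_t W_s^3] = 3 s^2 for s = 1, t = 2,
% using W_s = sqrt(s) Z1 and W_t = W_s + sqrt(t-s) Z2.
rng(1); N = 1e6; s = 1; t = 2;
Ws = sqrt(s)*randn(N,1);
Wt = Ws + sqrt(t - s)*randn(N,1);
estimate = mean(Wt .* Ws.^3);
fprintf('MC estimate = %.4f, exact value 3*s^2 = %.4f\n', estimate, 3*s^2);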

Generation of a stochastic process

Consider the function x(w) = e^{bw} and the Wiener process W_t. We can then define another stochastic process X_t = e^{bW_t}. This can be achieved, for example, by using the following program, in which realisations of W_t and X_t are generated and plotted for b = 0.2.

clear
dt=0.1; b=0.2; N=1000;
randn('seed',134)
Rn=randn(N,1); t=zeros(N,1); W=zeros(N,1);
W0=0; t(1)=1; W(1)=W0;
for i=2:N
    t(i)=i;
    W(i)=W(i-1)+sqrt(dt)*Rn(i);
end
ew=exp(b*W);
plot(t,W,'-',t,ew,'-.');
set(gca,'FontName','Times New Roman','FontSize',16);
x=xlabel('time t');
set(x,'FontName','Times New Roman','FontSize',16);
y=ylabel('W(t),X(t)');
set(y,'FontName','Times New Roman','FontSize',16);
title('Generation of a Stochastic Process')
legend('W(t)','X(t)')

Figure 2.3: Generation of a stochastic process driven by Wiener noise.

The Wiener process is an example of a homogeneous Markov process and is a diffusion process for which the transition density is

p(s, x; t, y) = (1/√(2π(t−s))) exp(−(y − x)²/(2(t − s))).   (2.6)

2.3.2 Diffusion processes

The transition density of the standard Wiener process is a smooth function of its variables for t > s and satisfies the following PDEs:

∂p/∂t − ½ ∂²p/∂y² = 0, (s, x) fixed,   (2.7)
∂p/∂s + ½ ∂²p/∂x² = 0, (t, y) fixed.   (2.8)

The first equation is an example of a heat equation, which describes the variation in temperature as heat passes through a physical medium. The standard Wiener process serves as a prototypical example of a (stochastic) diffusion process. Diffusion processes, which we now define in the one-dimensional case, are a rich and useful class of Markov processes.

Definition 21 (Diffusion process) A Markov process with transition densities p(s, x; t, y) is called a diffusion if the following three limits exist:

(i) For all ε > 0, s ≥ 0 and x ∈ R,

lim_{t↓s} (1/(t−s)) ∫_{|x−y|>ε} p(s, x; t, y)dy = 0;   (2.9)

condition (2.9) tells us that it is very unlikely that the process X(t) undergoes large changes in a short period of time.

(ii) There exist functions µ(x, t) and σ(x, t) such that for all ε > 0, t ∈ [0, T] and x ∈ (−∞, ∞):

(a) lim_{t↓s} (1/(t−s)) ∫_{|y−x|<ε} (y − x) p(s, x; t, y)dy = µ(s, x),   (2.10)

where µ(s, x) represents the average velocity of the random process X(t);

(b) lim_{t↓s} (1/(t−s)) ∫_{|y−x|<ε} (y − x)² p(s, x; t, y)dy = σ²(s, x),   (2.11)

where in condition (2.11) σ²(s, x) measures the local magnitude of the fluctuations of X(t) − X(s) about the mean value. Here µ and σ are well-defined functions.

Exercise 1 Check whether the following processes X_t are martingales w.r.t. F_s:
(i) X_t = W_t + 4t
(ii) X_t = W_t²
(iii) X_t = t²W_t − 2∫_0^t sW_s ds
(iv) X_t = W_1(t)W_2(t), where (W_1(t), W_2(t)) is a 2-dimensional Brownian motion
(v) X_t = W_t³ − 3tW_t
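As a numerical sanity check for item (v) (added here, not part of the original notes), one can use the Markov property: given W_s = w, we have W_t = w + √(t−s) Z with Z standard normal, and the martingale property predicts E[W_t³ − 3tW_t | W_s = w] = w³ − 3sw. The values of s, t and w below are illustrative.

% Numerical sanity check for Exercise 1(v):
% for s < t and W_s = w, E[W_t^3 - 3 t W_t | W_s = w] should equal w^3 - 3 s w.
rng(2); N = 1e6; s = 1; t = 2; w = 0.7;
Wt = w + sqrt(t - s)*randn(N,1);            % W_t given W_s = w
lhs = mean(Wt.^3 - 3*t*Wt);                 % Monte Carlo conditional expectation
rhs = w^3 - 3*s*w;                          % martingale property prediction
fprintf('MC estimate = %.4f, predicted = %.4f\n', lhs, rhs);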

Chapter 3

Stochastic Integrals

We say that a stochastic process X is a diffusion if its local dynamics can be approximated by a stochastic differential equation of the type

X(t + Δt) − X(t) = µ(t, X(t))Δt + σ(t, X(t))Z(t),   (3.1)

where Z(t) is a normally distributed disturbance term which is independent of everything that has happened up to time t. Examples of processes that can be described by diffusion processes are asset prices and the position of a moving particle, such as a pollutant in air or water. The intuitive content of equation (3.1) is that over the interval [t, t + Δt] the process X is driven by two separate terms:

µ(t, X(t)) is a locally deterministic velocity, called the drift term;
σ(t, X(t)) is a locally deterministic amplification factor of the Gaussian disturbance Z(t), called the diffusion term.

In equation (3.1) we replace the disturbance term Z(t) by ΔW(t) = W(t + Δt) − W(t), where W = (W(t) : t ≥ 0) is the Wiener process, and get

X(t + Δt) − X(t) = µ(t, X(t))Δt + σ(t, X(t))ΔW(t).   (3.2)

How should we interpret equation (3.2)?

1. Fix ω, divide by Δt and let Δt tend to 0. We obtain

Ẋ(t, ω) = µ(t, X(t, ω)) + σ(t, X(t, ω))v(t, ω),

where v(t, ω) is the time derivative of a path of the Wiener process. This differential equation could in principle be solved. But a path of the Wiener process is nowhere differentiable, so this does not work.

2. Fix again ω and let Δt tend to 0 without dividing by Δt:

dX(t, ω) = µ(t, X(t, ω))dt + σ(t, X(t, ω))dW(t, ω).

Interpret this equation as a shorthand version of the integral equation

X(t, ω) − X(0, ω) = ∫_0^t µ(s, X(s, ω))ds + ∫_0^t σ(s, X(s, ω))dW(s, ω).

The first integral can be interpreted as a Riemann integral and the second as a Riemann-Stieltjes integral. But a path of the Wiener process is of unbounded variation, so this does not work either.

We have seen so far that there are problems with interpreting equation (3.2) for each trajectory of the Wiener process separately. Therefore, we will give a global construction for integrals of the form

∫_0^T g(s)dW(s),   (3.3)

for a class of (F_t)-adapted integrands g = (g(t))_{t≥0}, also defined on Ω. Consider first a simple, non-random integrand:

g(t) = c_0 if t = 0;  g(t) = c_{i−1} if t_{i−1} < t ≤ t_i, i = 1,...,n,

that is, g(t) = c_0 1_{{0}}(t) + Σ_{i=1}^n c_{i−1} 1_{(t_{i−1},t_i]}(t), where 0 = t_0 < t_1 < ... < t_n = T and c_0, c_1,...,c_{n−1} ∈ R. Define

∫_0^T g(s)dW(s) = Σ_{i=1}^n c_{i−1}(W(t_i) − W(t_{i−1})).

The outcome of the integral is a random variable defined on Ω with mean 0 and variance

Var(∫_0^T g(s)dW(s)) = Var(Σ_{i=1}^n c_{i−1}(W(t_i) − W(t_{i−1}))) = Σ_{i=1}^n c_{i−1}² Var(W(t_i) − W(t_{i−1})) = Σ_{i=1}^n c_{i−1}²(t_i − t_{i−1}).

Example 3 Let g(t) = 2 for 0 ≤ t ≤ 1, g(t) = 2 for 1 < t ≤ 2, and g(t) = 3 for 2 < t ≤ 3 (note that t_i = 0, 1, 2, 3 and c_{i−1} = g(t_i), so c_0 = 2, c_1 = 2, c_2 = 3). Find the mean and variance of the integral ∫_0^3 g(t)dW(t).

Solution 2

∫_0^3 g(t)dW(t) = c_0(W(1) − W(0)) + c_1(W(2) − W(1)) + c_2(W(3) − W(2))   (3.4)
              = 2W(1) + 2(W(2) − W(1)) + 3(W(3) − W(2)).   (3.5)

The distribution of the integral (3.4) is N(0, 17), which comes from the sum of the independent terms N(0, 4) + N(0, 4) + N(0, 9).

In order to get a random integrand it is natural to replace the constants c_{i−1} by random variables ξ_{i−1}, and to let the random variable ξ_{i−1} depend on the values W(t) for t ≤ t_{i−1} but not on future values W(t) for t > t_{i−1}. To get an adapted integrand, we have to assume that if F_t is the σ-field generated by Brownian motion up to time t, then ξ_{i−1} is F_{t_{i−1}}-measurable:

g(t) = ξ_0 1_{{0}}(t) + Σ_{i=1}^n ξ_{i−1} 1_{(t_{i−1},t_i]}(t).

Define, as before,

∫_0^T g(s)dW(s) = Σ_{i=1}^n ξ_{i−1}(W(t_i) − W(t_{i−1})).

We get

E[∫_0^T g(s)dW(s)] = Σ_{i=1}^n E[E(ξ_{i−1}(W(t_i) − W(t_{i−1})) | F_{t_{i−1}})].

Using the F_{t_{i−1}}-measurability of ξ_{i−1} and the independent increments property, we get

E[ξ_{i−1}(W(t_i) − W(t_{i−1})) | F_{t_{i−1}}] = ξ_{i−1} E[W(t_i) − W(t_{i−1})] = 0,

and it follows that

E[∫_0^T g(s)dW(s)] = 0.
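A quick Monte Carlo check of Example 3 (added here, not in the original notes): the integral is a sum of three independent Gaussian terms, so its sample mean and variance should be close to 0 and 17.

% Monte Carlo check of Example 3:
% int_0^3 g dW = 2 W(1) + 2 (W(2)-W(1)) + 3 (W(3)-W(2)) ~ N(0, 17).
rng(3); N = 1e6;
dW1 = randn(N,1);            % W(1) - W(0), variance 1
dW2 = randn(N,1);            % W(2) - W(1), variance 1
dW3 = randn(N,1);            % W(3) - W(2), variance 1
I   = 2*dW1 + 2*dW2 + 3*dW3; % the stochastic integral for each sample
fprintf('mean = %.4f (exact 0), variance = %.4f (exact 17)\n', mean(I), var(I));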

If we also assume that E[ξ_{i−1}²] < ∞, we get

E[(∫_0^T g(s)dW(s))²]
 = 2 Σ_{j<i} E[ξ_{i−1}ξ_{j−1}(W(t_i) − W(t_{i−1}))(W(t_j) − W(t_{j−1}))] + Σ_{k=1}^n E[ξ_{k−1}²(W(t_k) − W(t_{k−1}))²]
 = 2 Σ_{j<i} E[ξ_{i−1}ξ_{j−1}(W(t_i) − W(t_{i−1}))(W(t_j) − W(t_{j−1}))] + Σ_{k=1}^n E[E(ξ_{k−1}²(W(t_k) − W(t_{k−1}))² | F_{t_{k−1}})]
 = 2 Σ_{j<i} E[ξ_{i−1}ξ_{j−1}(W(t_i) − W(t_{i−1}))(W(t_j) − W(t_{j−1}))] + Σ_{k=1}^n E[ξ_{k−1}² E((W(t_k) − W(t_{k−1}))² | F_{t_{k−1}})]
 = 2 Σ_{j<i} E[ξ_{i−1}ξ_{j−1}(W(t_i) − W(t_{i−1}))(W(t_j) − W(t_{j−1}))] + Σ_{k=1}^n E[ξ_{k−1}²](t_k − t_{k−1})
 = Σ_{k=1}^n E[ξ_{k−1}²](t_k − t_{k−1})
 = ∫_0^T E[g²(s)]ds.   (3.6)

The result above has been obtained by a similar conditioning of the first sum: for j < i we get

E[ξ_{i−1}ξ_{j−1}(W(t_i) − W(t_{i−1}))(W(t_j) − W(t_{j−1}))] = 0,

since ξ_{i−1}ξ_{j−1}(W(t_j) − W(t_{j−1})) is F_{t_{i−1}}-measurable and E(W(t_i) − W(t_{i−1}) | F_{t_{i−1}}) = 0. Note also that the integral is linear:

∫_0^T (αX(t) + βY(t))dW(t) = α ∫_0^T X(t)dW(t) + β ∫_0^T Y(t)dW(t).

To complete the construction of the integral, we introduce the space L²[0, T] of (equivalence classes of) (F_t)-adapted processes g = (g(t))_{t≥0} satisfying

∫_0^T E[g(s)²]ds < ∞.

For a general process g ∈ L²[0, T], which need not be simple, we can find a sequence (g_n) of simple processes such that

∫_0^T E[{g_n(s) − g(s)}²]ds → 0.

Since {g_n(s) − g_m(s)}² ≤ 2{g_n(s) − g(s)}² + 2{g(s) − g_m(s)}², it follows that

∫_0^T E[{g_n(s) − g_m(s)}²]ds → 0.

Now g_n − g_m is simple, so formula (3.6) implies

∫_0^T E[{g_n(s) − g_m(s)}²]ds = E[(∫_0^T (g_n(s) − g_m(s))dW(s))²] = E[(∫_0^T g_n(s)dW(s) − ∫_0^T g_m(s)dW(s))²].

Define the random variables Z_n ∈ L²:

Z_n = ∫_0^T g_n(s)dW(s).

It follows that E[(Z_n − Z_m)²] → 0. One can then find a random variable Z ∈ L² such that Z_n → Z in L²:

lim_{n→∞} E[(Z_n − Z)²] = 0.

We define

∫_0^T g(s)dW(s) = Z.

If Z_n → Z in L², then lim_n E[Z_n] = E[Z] and lim_n E[Z_n²] = E[Z²]. It follows that

E[∫_0^T g(s)dW(s)] = 0,

and

E[(∫_0^T g(s)dW(s))²] = ∫_0^T E[g(s)²]ds.

Remark. By this procedure we have defined the stochastic integral ∫_0^T g(s)dW(s) as a random variable in the space L²(Ω, F_T, P). It is in general not true that the sequence of random variables ∫_0^T g_n(s)dW(s), n = 1, 2,..., converges P-a.s.

Note that the following stochastic integrals can be evaluated and shown to be

∫_a^b W(t)dW(t) = ½(W²(b) − W²(a)) + (λ − ½)(b − a),   (3.7)

when the integrand is evaluated at the intermediate points ξ_i = (1 − λ)t_{i−1} + λt_i (the Itô integral corresponds to λ = 0), and

∫_a^b W(t)dW(t) (Stratonovich, λ = ½) = ½(W²(b) − W²(a)).

It is known that stochastic calculus is about systems driven by white noise. Integrals involving white noise may be expressed as

Y(T) = ∫_0^T F(t)dW(t),

an Itô integral when F is random but adapted. The Itô integral, like the Riemann integral, is defined as a certain limit. The fundamental theorem of calculus allows one to evaluate Riemann integrals without going back to the original definition; Itô's formula plays that role for the Itô integral. Itô's formula has an extra term, not present in the fundamental theorem, that is due to the non-smoothness of Brownian motion paths. Let F_t be the filtration generated by Brownian motion up to time t, and let F(t) be an F_t-adapted stochastic process. Corresponding to the Riemann sum approximation of the Riemann integral, we define the following approximation to the Itô integral:

Y_Δt(t) = Σ_{t_k<t} F(t_k)ΔW_k,   (3.8)

with the usual notation t_k = kΔt and the forward difference ΔW_k = W(t_{k+1}) − W(t_k). If the limit exists, the Itô integral is

Y(t) = lim_{Δt→0} Y_Δt(t).

Each term of the sum (3.8) has zero conditional mean. That is,

E[F(t_k)ΔW_k | F_{t_k}] = E[F(t_k)(W(t_{k+1}) − W(t_k)) | F_{t_k}].   (3.9)

Hence, by taking conditional expectations term by term and using that F(t_k) is F_{t_k}-measurable, we get

Σ_{t_k<t} E[F(t_k)(W(t_{k+1}) − W(t_k)) | F_{t_k}]   (3.10)
= Σ_{t_k<t} F(t_k) E[(W(t_{k+1}) − W(t_k)) | F_{t_k}].   (3.11)

Since the Wiener increment W(t_{k+1}) − W(t_k) is independent of the information F_{t_k}, we get

Σ_{t_k<t} F(t_k) E[W(t_{k+1}) − W(t_k)] = 0,   (3.12)

because the Wiener increment has mean zero. Thus we have shown that

E[F(t_k)ΔW_k | F_{t_k}] = 0.   (3.13)

Note that it is essential that we use the forward difference ΔW_k = W(t_{k+1}) − W(t_k) and not the backward difference W(t_k) − W(t_{k−1}), so that the expression above equals zero. Note also that if you use W(t_{k+1})(W(t_{k+1}) − W(t_k)) in the integral (3.8) you will not get zero: taking the expectation inside the summation sign we get

E[W(t_{k+1})(W(t_{k+1}) − W(t_k))] = Δt.

Indeed, writing W(t_{k+1}) = (W(t_{k+1}) − W(t_k)) + W(t_k), it follows that

E[((W(t_{k+1}) − W(t_k)) + W(t_k))(W(t_{k+1}) − W(t_k))] = E[(W(t_{k+1}) − W(t_k))²] + E[W(t_k)(W(t_{k+1}) − W(t_k))] = t_{k+1} − t_k + 0 = Δt.

Each term in (3.8) is F_t-measurable, thus Y_Δt(t) is also measurable. If we evaluate at the discrete times t_n, Y_Δt is a martingale, that is,

E[Y_Δt(t_{n+1}) | F_{t_n}] = Y_Δt(t_n).   (3.14)

If the limit as Δt → 0 exists, this should make Y(t) also a martingale.

Example 4 (Itô integral) Let F(t) be the random function W(t); then

Y(T) = ∫_0^T W(t)dW(t).

Note that if W(t) were differentiable with respect to t, its derivative would be Ẇ(t), the limit in (3.8) could be calculated using dW(t) = Ẇ(t)dt, and that would wrongly lead to the following equation:

∫_0^T W(t)dW(t) = ½ ∫_0^T d/ds(W²(s))ds = ½W²(T).

When we use definition (3.8) we get a different expression, reflecting the actual rough path of Brownian motion. The steps are as follows. Write the Brownian motion as

W(t_k) = ½[W(t_{k+1}) + W(t_k)] − ½[W(t_{k+1}) − W(t_k)]

and put this in the Itô sum (3.8) to get

Y_Δt(t_n) = Σ_{k<n} W(t_k)[W(t_{k+1}) − W(t_k)]
         = Σ_{k<n} ½[W(t_{k+1}) + W(t_k)][W(t_{k+1}) − W(t_k)] − Σ_{k<n} ½(W(t_{k+1}) − W(t_k))²
         = Σ_{k<n} ½(W²(t_{k+1}) − W²(t_k)) − Σ_{k<n} ½(W(t_{k+1}) − W(t_k))².

The first sum on the bottom right telescopes to ½W²(t_n), since ½W²(0) = 0. The second term is the sum of n independent random variables, each with expected value

E[½(W(t_{k+1}) − W(t_k))²] = ½E[(W(t_{k+1}) − W(t_k))²] = ½(t_{k+1} − t_k) = Δt/2

and variance

Var[½(W(t_{k+1}) − W(t_k))²] = ¼Var[(W(t_{k+1}) − W(t_k))²] = ¼(3(Δt)² − (Δt)²) = (Δt)²/2.

As a consequence, the sum is a random variable with mean Σ_{k} Δt/2 = t_n/2 and variance n(Δt)²/2 = t_n Δt/2. This implies that

Σ_{t_k<T} ½(W(t_{k+1}) − W(t_k))² → ½T as Δt → 0.

Putting these results together, we arrive at the correct Itô answer:

∫_0^T W(t)dW(t) = ½W²(T) − ½T,

the same expression as in equation (3.7) with b = T, a = 0 and λ = 0.
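The following MATLAB sketch (not part of the original notes) checks this limit numerically by computing the Riemann sums for λ = 0 and λ = 1/2 on a fine grid; for the Stratonovich sum the average of the two endpoint values is used, which for this integrand has the same limit as evaluating W at the midpoint.

% Numerical check of the limits of the Riemann sums for int_0^T W dW with
% evaluation points lambda = 0 (Ito) and lambda = 1/2 (Stratonovich).
rng(4); T = 1; n = 2^16; dt = T/n;
dW = sqrt(dt)*randn(n,1);
W  = [0; cumsum(dW)];                       % W(t_k), k = 0,...,n
ito   = sum(W(1:n).*dW);                    % left endpoint, lambda = 0
strat = sum(0.5*(W(1:n)+W(2:n+1)).*dW);     % average of endpoint values, lambda = 1/2
fprintf('Ito sum   = %.4f, 0.5*W(T)^2 - 0.5*T = %.4f\n', ito,   0.5*W(end)^2 - 0.5*T);
fprintf('Strat sum = %.4f, 0.5*W(T)^2         = %.4f\n', strat, 0.5*W(end)^2);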


Chapter 4

Itô Integral Process

Let g be a very simple process: g(s, ω) = 1_A(ω)1_{(u,v]}(s), with A ∈ F_u. Define for t ≥ 0

I(t) = ∫_0^t g(s)dW(s) = 0 if t ≤ u;  1_A(W(t) − W(u)) if u < t ≤ v;  1_A(W(v) − W(u)) if t > v.

The process (I(t))_{t≥0} is an (F_t)-martingale. In general we have the following result: for any process g ∈ L²[0, T], the process (X(t)) defined by

X(t) = ∫_0^t g(s)dW(s)

is an (F_t)-martingale. Consider, for a process g ∈ L²[0, T], the process (X(t)) defined by X(t) = ∫_0^t g(s)dW(s), and let [X, X](t) be the quadratic variation of X over [0, t]:

[X, X](t) = lim_{n→∞} Σ_i |X(t_i^n) − X(t_{i−1}^n)|².

In Section 2.3 we derived the quadratic variation of W: [W, W](t) = t. Consider the example

g(t) = ξ_0 1_{[0,½)}(t) + ξ_1 1_{(½,1]}(t).

Then

X(t) = ∫_0^t g(s)dW(s) = ξ_0 W(t) if t ≤ ½;  ξ_0 W(½) + ξ_1(W(t) − W(½)) if t > ½.

Then, for example, for t ≤ 1/2 we get

[X, X](t) = lim_n Σ_i ξ_0²(W(t_i^n ∧ t) − W(t_{i−1}^n ∧ t))² = ξ_0² [W, W](t) = ξ_0² t.

In general, we have for g ∈ L²:

[X, X](t) = ∫_0^t g²(s)ds.

Written in differential form, (dX(t))² = g²(t)dt.

Let X = (X(t))_{t≥0} be a stochastic process and assume that there exist a real number x_0 and adapted processes µ = (µ(t)) and σ = (σ(t)), with ∫_0^T |µ(t)|dt < ∞ and σ ∈ L²[0, T], such that for all t

X(t) = x_0 + ∫_0^t µ(s)ds + ∫_0^t σ(s)dW(s).   (4.1)

Such a process X is called an Itô process. The processes µ and σ in the representation (4.1) are unique a.s. To see this, suppose that

X(t) = x_0 + ∫_0^t µ(s)ds + ∫_0^t σ(s)dW(s) = y_0 + ∫_0^t ν(s)ds + ∫_0^t τ(s)dW(s).

Then x_0 = y_0 and

∫_0^t (µ(s) − ν(s))ds = ∫_0^t (τ(s) − σ(s))dW(s).

Let M(t) = ∫_0^t (µ(s) − ν(s))ds. It follows that M is a martingale with finite variation, since

Σ_i |M(t_i^n) − M(t_{i−1}^n)| ≤ ∫_0^T |µ(s)|ds + ∫_0^T |ν(s)|ds < ∞.

So the quadratic variation of M is 0; see Section 2.3. Note that for s < t,

E[(M(t) − M(s))²] = E[M²(t)] − 2E[M(s)M(t)] + E[M²(s)] = E[M²(t)] − 2E[M(s)E(M(t) | F_s)] + E[M²(s)] = E[M²(t)] − E[M²(s)].

So, using this and monotone convergence,

0 = E[lim_n Σ_i |M(t_i^n) − M(t_{i−1}^n)|²] = lim_n Σ_i E(|M(t_i^n) − M(t_{i−1}^n)|²) = lim_n Σ_i {E[M²(t_i^n)] − E[M²(t_{i−1}^n)]} = E[M²(T)].

It follows that M(T) = 0 a.s. and M(t) = E(M(T) | F_t) = 0 a.s. for all t. So µ = ν a.s., and it follows that ∫_0^t (σ(s) − τ(s))dW(s) = 0 for all t. Hence

0 = E[(∫_0^t (σ(s) − τ(s))dW(s))²] = ∫_0^t E[(σ(s) − τ(s))²]ds,

and this implies σ = τ a.s.

Let X be an Itô process with representation

X(t) = x_0 + ∫_0^t µ(s, X_s)ds + ∫_0^t σ(s, X_s)dW(s).

Usually we write this equation in differential form:

dX(t) = µ(t, X_t)dt + σ(t, X_t)dW(t), X(0) = x_0.   (4.2)

Similarly, for coefficients depending on time only,

X(t) = x_0 + ∫_0^t µ(s)ds + ∫_0^t σ(s)dW(s), written as dX(t) = µ(t)dt + σ(t)dW(t), X(0) = x_0.

Note that a stochastic process X having a stochastic differential is a martingale if and only if the stochastic differential has the form dX(t) = g(t)dW(t), i.e. X has no dt term. The quantity µ(t, x) is called the drift of the diffusion process and σ(t, x) its diffusion coefficient at time t and position x. Equation (4.2) implies that

µ(s, x) = lim_{t↓s} (1/(t−s)) E[X(t) − X(s) | X(s) = x],   (4.3)

so the drift µ(s, x) is the instantaneous rate of change of the mean of the process, given that X(s) = x. Similarly,

σ²(s, x) = lim_{t↓s} (1/(t−s)) E[(X(t) − X(s))² | X(s) = x],   (4.4)

so the squared diffusion coefficient is the instantaneous rate of change of the squared fluctuations of the process, given that X(s) = x. When the drift µ and the diffusion coefficient σ of a diffusion process are sufficiently smooth functions, the transition density p(s, x; t, y) also satisfies the following partial differential equations:

∂p/∂t + ∂/∂y{µ(t, y)p} − ½ ∂²/∂y²{σ²(t, y)p} = 0, (s, x) fixed,   (4.5)
∂p/∂s + µ(s, x) ∂p/∂x + ½ σ²(s, x) ∂²p/∂x² = 0, (t, y) fixed,   (4.6)

with the former equation (4.5) giving the forward evolution with respect to the final state (t, y) and the latter equation (4.6) giving the backward evolution with respect to the initial state (s, x). The forward equation (4.5) is commonly called the Fokker-Planck equation, especially by physicists and engineers.
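As a consistency check (added here, not in the original notes), one can verify directly that the Wiener transition density (2.6) satisfies the heat equation (2.7), i.e. the forward equation (4.5) with µ = 0 and σ = 1:

\[
p(s,x;t,y)=\frac{1}{\sqrt{2\pi(t-s)}}\exp\!\Big(-\frac{(y-x)^2}{2(t-s)}\Big),
\]
\[
\frac{\partial p}{\partial t}=p\Big(\frac{(y-x)^2}{2(t-s)^2}-\frac{1}{2(t-s)}\Big),\qquad
\frac{\partial p}{\partial y}=-p\,\frac{y-x}{t-s},\qquad
\frac{\partial^2 p}{\partial y^2}=p\Big(\frac{(y-x)^2}{(t-s)^2}-\frac{1}{t-s}\Big),
\]

so that indeed ∂p/∂t = ½ ∂²p/∂y².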

4.1 Motivation and problem formulation

Many physical problems and time-varying behaviours are described by deterministic ordinary differential equations (ODEs). For instance, when the state of the physical system is denoted by x(t), we obtain the ordinary differential equation

dx/dt = f(t, x), x(t_0) = x(0),   (4.7)

as a degenerate form of a stochastic differential equation (about to be defined) in the absence of uncertainties. A detailed and thorough introduction to the theory of stochastic differential equations can be found in Øksendal (2003) [15]. The differential equation (4.7) can be written as dx(t) = f(t, x)dt, and in integral form as

x(t) = x(0) + ∫_{t_0}^t f(s, x(s))ds.

It follows that x(t) = x(t; x_0, t_0) is a solution with the initial condition x(t_0) = x_0. Nevertheless, when there are uncertainties, the behaviour of a physical system often can only be described in terms of probability, and must be described by means of a stochastic model. Therefore, in this chapter we discuss the stochastic differential equation as a model for a stochastic process X(t). Roughly, we can think of a stochastic differential equation (SDE) as an ordinary differential equation (ODE) with an added random perturbation in the dynamics. If {X(t), t ≥ 0} is a real-valued process describing the state of a system at each time t, the stochastic differential equation (SDE) governing the time evolution of this process X is given by

dX(t)/dt = f(t, X(t)) + g(t, X(t))ξ(t), X(t_0) = x_0.   (4.8)

The white noise ξ(t) is a stochastic process. It is introduced so as to model uncertainties in the underlying deterministic differential equation. Generally ξ(t) is understood in the engineering literature as a stationary Gaussian process on −∞ < t < ∞. The initial condition is also assumed to be a random variable, independent of ξ(t). Essentially, equation (4.8) should be Markov: the future behaviour of the state of the system depends only on the present state, if it is known, and not on its past (Arnold 1974, Øksendal 2003). SDEs such as (4.8) arise when a variety of random dynamic phenomena in the physical, biological, engineering and social sciences are modelled. Solutions of these equations are often diffusion processes and hence are connected to the subject of partial differential equations (PDEs).

For instance, let us consider a model for the Biochemical Oxygen Demand (BOD) in stream bodies, described by the equation

dB/dt = −K_1 B + s_1, B(t_0) = B_0,

where the deterministic process B(t) is the BOD (mg/l), K_1 is the reaction rate coefficient (1/day) and s_1 is the source or sink along the stream. Let us suppose that there are uncertainties associated with the source input s_1. This can be modelled by adding a white noise process ξ_t with intensity σ to s_1. The resulting stochastic model for the stochastic process B_t becomes

dB_t/dt = −K_1 B_t + s_1 + σξ_t, B_{t_0} = B_0.

Another source of uncertainty can be the parameter K_1. Adding a white noise process to this parameter results in

dB_t/dt = −(K_1 + σξ_t)B_t + s_1, B_{t_0} = B_0.

Note that both stochastic models are of the general type (4.8). An essential property of the stochastic model (4.8) is that it should be Markov. This property implies that information on the probability density of the state X_t at time t is sufficient for computing model predictions for times > t. If the model were not Markovian, information on the system state for times < t would also be required, which would make the model very impractical. As we will show in this chapter, the stochastic differential equation

(4.8) is Markovian if ξ_t is a continuous-time Gaussian white noise process. Here we assume that the stochastic process ξ(t) is Gaussian, i.e. for all t_0 ≤ t_1 ≤ ... ≤ t_n ≤ T the random vector Z = (ξ(t_1),...,ξ(t_n)) ∈ R^n has a normal distribution. Furthermore, we assume that ξ(t) is a white noise process, i.e. ξ(t) satisfies the conditions

E(ξ(t)) = 0,  E(ξ(t_1)ξ(t_2)) = δ(t_2 − t_1), t_2 ≥ t_1,

where δ(t) is the Dirac delta function. The name white noise comes from the fact that such a process has a spectrum in which all frequencies participate with the same intensity, which is characteristic of white light. Let us consider, for example, a 1-dimensional white noise process. White noise has a constant spectral density f(λ) on the entire real axis. If E[ξ(s)ξ(t + s)] = C(t) is the covariance function of ξ(t), then the spectral density is given by

f(λ) = (1/2π) ∫_{−∞}^{∞} e^{−iλt} C(t)dt = c/(2π), λ ∈ R.   (4.9)

The positive constant c can, without loss of generality, be taken equal to 1. White noise ξ(t) can be approximated by an ordinary stationary Gaussian process X(t), for example one with covariance

C(t) = a e^{−b|t|}, (a > 0, b > 0);

it can be shown that such a process has spectral density

f(λ) = ab/(π(b² + λ²)).

Indeed,

f(λ) = (1/2π) ∫_{−∞}^{∞} e^{−iλt} C(t)dt
     = (a/2π) ∫_{−∞}^{∞} e^{−b|t|} e^{−iλt} dt
     = (a/2π) ∫_{−∞}^{∞} e^{−b|t|}[cos(λt) − i sin(λt)]dt
     = (a/π) ∫_0^∞ e^{−bt}cos(λt)dt      (the cosine integrand is even, the sine integrand is odd)
     = (a/π) · b/(b² + λ²)
     = ab/(π(b² + λ²)).   (4.10)

If we now let a and b tend to infinity in such a way that a/b → 1/2, we get

f(λ) → 1/(2π) for all λ ∈ R,  C(t) → 0 for t ≠ 0,  C(0) → ∞,  ∫_{−∞}^{∞} C(t)dt = 2a/b → 1,

so that C(t) → δ(t); that is, X(t) converges in a certain sense to the white noise ξ(t) [9].

Figure 4.1: (a) A white noise process: it is discontinuous at every point and cannot be integrated in the sense of Lebesgue or Riemann integrals. (b) The Brownian motion (or Wiener process) can be considered as a formal integral of the white noise process.

The following MATLAB program produces a white noise and a Wiener track; see Figure 4.1.

clear
t=0:0.02:10;
l=size(t);
white_noise=randn(l(1),l(2));
white_noise(1)=0;
figure(1);
plot(t,white_noise);
set(gca,'FontName','Times New Roman','FontSize',16);
x=xlabel('time');
set(x,'FontName','Times New Roman','FontSize',16);
y=ylabel('white noise');
set(y,'FontName','Times New Roman','FontSize',16);
figure(2);
dt=t(2)-t(1);
wiener_process=zeros(l(1),l(2));
wiener_process(1)=0;
for k=2:l(2)
    wiener_process(k)=wiener_process(k-1)+sqrt(dt)*white_noise(k);
end
plot(t,wiener_process);
set(gca,'FontName','Times New Roman','FontSize',16);
x=xlabel('time');
set(x,'FontName','Times New Roman','FontSize',16);
y=ylabel('Wiener motion');
set(y,'FontName','Times New Roman','FontSize',16);

Therefore the formal integration of equation (4.8) leads us to the equation

X(t) = X(t_0) + ∫_{t_0}^t f(s, X(s))ds + ∫_{t_0}^t g(s, X(s))ξ(s)ds.   (4.11)

However, it is impossible to define the last integral in (4.11) using only the standard mathematical instruments known from real analysis [9]. The new mathematical theory that allows one to solve equation (4.11) was developed in the middle of the last century by K. Itô and R.L. Stratonovich. Formally, the white noise process ξ(t) is considered as the derivative of Brownian motion W(t) (see, for instance, Jazwinski 1970 [9]):

dW(t) = ξ(t)dt.   (4.12)

Equation (4.11) can then be written in the form

dX(t) = f(t, X(t))dt + g(t, X(t))dW(t),   (4.13)

or in the integral form

X(t) = X(t_0) + ∫_{t_0}^t f(s, X(s))ds + ∫_{t_0}^t g(s, X(s))dW(s).   (4.14)

Equation (4.13) is called a stochastic differential equation.

4.2 Stochastic Differential Equations

Let M(n, d) denote the class of n × d matrices. Consider as given:
a d-dimensional Wiener process W;
a function µ : R_+ × R^n → R^n;
a function σ : R_+ × R^n → M(n, d);
a vector x_0 ∈ R^n.

Consider the stochastic differential equation (SDE)

dX(t) = µ(t, X(t))dt + σ(t, X(t))dW(t).   (4.15)

The process X = (X(t)) is called a strong solution of the SDE (4.15) if, for all t > 0, X(t) is a function F(t, (W(s), s ≤ t)) of the given Wiener process W, the integrals ∫_0^t µ(s, X(s))ds and ∫_0^t σ(s, X(s))dW(s) exist, and the integral equation

X(t) = X(0) + ∫_0^t µ(s, X(s))ds + ∫_0^t σ(s, X(s))dW(s)

is satisfied. If the coefficients µ and σ satisfy the following conditions:

1. A Lipschitz condition in x and y: there exists K such that for all x, y ∈ R^n and all t,

   |µ(t, x) − µ(t, y)| + |σ(t, x) − σ(t, y)| ≤ K |x − y|.

2. A linear growth condition: there exists K such that for all x ∈ R^n and all t,

   |µ(t, x)| + |σ(t, x)| ≤ K(1 + |x|).

3. x_0 is a constant.

then there exists a unique strong solution X of the stochastic differential equation (4.15) with continuous trajectories, and there exists a constant C such that

E[|X_t|²] ≤ C e^{Ct} (1 + |x_0|²).

4.3 Linear Stochastic Differential Equations

The Itô stochastic integral provides us with the means for formulating stochastic differential equations. Such equations describe the dynamics of many important continuous-time stochastic systems.

General linear stochastic differential equation

In general, linear SDEs are written in the following form:

dX_t = (a_1(t) X_t + a_2(t)) dt + (b_1(t) X_t + b_2(t)) dW_t.   (4.16)

- The linear SDE is autonomous if all coefficients are constants.
- The linear SDE is homogeneous if a_2(t) = 0 and b_2(t) = 0.
- The SDE is linear in the additive sense if b_1(t) = 0. The Itô integral then takes the form ∫_0^t b_2(s) dW_s.
- The SDE is linear in the multiplicative sense if b_2(t) = 0. The Itô integral then takes the form ∫_0^t b_1(s) X_s dW_s.

The general solution to a linear SDE with additive noise,

dX_t = (a_1(t) X_t + a_2(t)) dt + b_2(t) dW_t,

is well detailed in the following books [13, 12], for example.

Types of solution to a stochastic differential equation

- Under some regularity conditions on f and g, the solution to the SDE (4.16) is a diffusion process.
- The solution is a strong solution if it is valid for each given Wiener process (and initial value), that is, it is sample-pathwise unique. A strong solution is an adapted function X(W(t), t), where the Brownian motion path W(t) again plays the role of the abstract random variable ω; X(W(t), t) is measurable with respect to F_t, implying that X(t) is a function of the values W(s) for s ≤ t. For example, X(t) = e^{(a_1 − (1/2) b_1²) t + b_1 W(t)} is a strong solution of the geometric Brownian motion obtained from equation (4.16) when a_1(t) = a_1, b_1(t) = b_1 and a_2(t) = b_2(t) = 0. Note that it depends only on W(t), while X(t) = σ ∫_0^t e^{−γ(t−s)} dW(s) is a strong solution of the Ornstein-Uhlenbeck equation dX(t) = −γ X(t) dt + σ dW(t) with X(0) = 0. This solution depends on the whole path up to time t.
- A diffusion process whose transition density satisfies the Fokker-Planck equation is a solution of the SDE.
- A solution is a weak solution if it is valid for the given coefficients but for an unspecified Wiener process, that is, its probability law is unique. In other words, a weak solution is a stochastic process X(t), defined perhaps on a different probability space and filtration, that reproduces the statistical properties of the SDE dX(t) = µ(X(t), t) dt + σ(X(t), t) dW(t), where, roughly speaking, the solution satisfies

E[(X(t + ∆t) − X(t)) | F_t] = µ(X(t), t) ∆t + O(∆t),   (4.17)
E[(X(t + ∆t) − X(t))² | F_t] = σ²(X(t), t) ∆t + O(∆t).   (4.18)

Thus a strong solution is also a weak solution, but not the other way around, since for a weak solution we have no information on how, or even whether, the solution depends on W(t).

Exercise 2 Consider the following SDE:

dX = γ X dt + α dW(t),   X(0) = x_0.   (4.19)

The solution X_t of equation (4.19) is a diffusion process that may be used to describe a range of problems, depending on the interpretation of the variables incorporated in the model. For instance, it may represent the location of a particle, initially released at a given point, as a function of time, or it may describe the concentration distribution of some colorant released as an infinitesimal droplet at the starting point. Here γ and α > 0 are constants.

The initial distribution of the variable X is given by a delta peak located at x_0. For negative values of γ, the process is of the Ornstein-Uhlenbeck type and will converge to a stable distribution. When γ equals zero, the model reduces to a scaled version of the Wiener process. The process X_t is Markov; so, in order to determine the value of x_{t+1}, an instance of the process X at time t + 1, we only need to know x_t. Since the process is Gaussian, that is X(t) ∼ N(e^{γt} x_0, σ²(t)), we are generally interested in the mean and the variance of the process. Therefore

(a) Show that the expectation of the process X_t is

E{X(t)} = e^{γt} x_0,   (4.20)

(b) and the variance of X_t is

σ²(t) = var{X(t)} = α² (e^{2γt} − 1) / (2γ).   (4.21)
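The claims (4.20) and (4.21) are easy to check by simulation. The following MATLAB sketch (not part of the original exercise) integrates (4.19) with the Euler-Maruyama scheme and compares the sample mean and variance at time T with the formulas; the values of gamma, alpha, x0, the horizon T, the step size dt and the number of realisations M are illustrative assumptions.

% Monte Carlo check of the mean (4.20) and variance (4.21) of the process
% dX = gamma*X dt + alpha dW, X(0) = x0. All parameter values are assumed.
clear
gamma=-0.5; alpha=0.3; x0=2;        % assumed parameters (gamma<0: OU type)
T=5; dt=0.01; n=round(T/dt);        % assumed horizon and step size
M=10000;                            % assumed number of realisations
X=x0*ones(M,1);
for k=1:n
    dW=sqrt(dt)*randn(M,1);         % Wiener increments
    X=X+gamma*X*dt+alpha*dW;        % Euler-Maruyama step
end
mean_mc=mean(X); var_mc=var(X);
mean_exact=exp(gamma*T)*x0;                       % equation (4.20)
var_exact=alpha^2*(exp(2*gamma*T)-1)/(2*gamma);   % equation (4.21)
fprintf('mean: MC %.4f  exact %.4f\n',mean_mc,mean_exact);
fprintf('var : MC %.4f  exact %.4f\n',var_mc,var_exact);

For small dt the Monte Carlo estimates should agree with (4.20) and (4.21) up to sampling noise of order 1/sqrt(M).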

Chapter 5

Itô's Formula

Let X be an Itô process with stochastic differential

dX(t) = µ(t) dt + σ(t) dW(t).

Assume now further that we are given a C^{1,2} function f : R_+ × R → R. Define a new process Z by Z(t) = f(t, X(t)). Then Z has a stochastic differential given by

df(t, X(t)) = (∂f/∂t) dt + (∂f/∂x) dX(t) + (1/2)(∂²f/∂x²) [dX(t)]²
            = { ∂f/∂t + µ ∂f/∂x + (1/2) σ² ∂²f/∂x² } dt + σ (∂f/∂x) dW(t),   (5.1)

where the term µ ∂f/∂x is shorthand for µ(t) (∂f/∂x)(t, X(t)), and so on. Note that formally

[dX(t)]² = [µ dt + σ dW(t)]² = µ² [dt]² + 2µσ [dt][dW(t)] + σ² [dW(t)]² = σ² dt,

where we used the following multiplication table

            dt     dW(t)
  dt         0       0
  dW(t)      0       dt

In the special case where the function f : R → R is twice differentiable, we get:

df(X(t)) = { µ f'(X(t)) + (1/2) σ² f''(X(t)) } dt + σ f'(X(t)) dW(t).

To check that this is really the case, consider the following example:

Example 5 As an example, we use Itô's formula to calculate E[e^{αW(t)}]. We have

d e^{αW(t)} = (1/2) α² e^{αW(t)} dt + α e^{αW(t)} dW(t),

or in integrated form

e^{αW(t)} = 1 + (1/2) α² ∫_0^t e^{αW(s)} ds + α ∫_0^t e^{αW(s)} dW(s).

Taking expected values will make the stochastic integral vanish:

E[e^{αW(t)}] = 1 + (1/2) α² ∫_0^t E[e^{αW(s)}] ds.

Define m(t) = E[e^{αW(t)}], then

m(t) = 1 + (1/2) α² ∫_0^t m(s) ds.

Taking the derivative with respect to t,

m'(t) = (1/2) α² m(t),   m(0) = 1.

Solving this equation we get

E[e^{αW(t)}] = e^{α² t / 2}.

Exercise 3 Evaluate the following:

(a) I = E[e^{β(W_t − W_s)}]

(b) E[e^{(t + (1/2) W_t)}]

Example 6 Let us now use Itô's formula to evaluate the integral

∫_0^t W_s dW_s.

Choose X_t = W_t and g(t, X_t) = (1/2) X_t². It follows that Y_t = g(t, X_t) = (1/2) W_t².

Then by the Itô formula (5.1)

dY_t = (∂g/∂t) dt + (∂g/∂x) dW_t + (1/2)(∂²g/∂x²)(dW_t)²
     = W_t dW_t + (1/2) dt.

Hence

d((1/2) W_t²) = W_t dW_t + (1/2) dt.

In other words,

(1/2) W_t² = ∫_0^t W_s dW_s + (1/2) t,

which means

∫_0^t W_s dW_s = (1/2) W_t² − (1/2) t.

Example 7 Let us now consider the population growth model as explained in chapter five of the book [15], where

dN_t / dt = a_t N_t,   N_0 given,   (5.2)

in which we choose a_t = r_t + σ ξ_t, i.e., we include uncertainty in the model. Let us assume that r_t = r is constant. By the Itô interpretation, equation (5.2) is equivalent to

dN_t = r N_t dt + σ N_t dW_t,   (5.3)

or equivalently

dN_t / N_t = r dt + σ dW_t.   (5.4)

It follows that

∫_0^t dN_s / N_s = rt + σ W_t,   where W_0 = 0.

One can see that the evaluation of the integral on the left-hand side requires the use of the Itô formula for the function g(t, x) = ln x, x > 0.

In this case we get

d(ln N_t) = (1/N_t) dN_t + (1/2)(−1/N_t²)(dN_t)²
          = dN_t/N_t − (1/(2N_t²)) σ² N_t² dt
          = dN_t/N_t − (1/2) σ² dt.

Therefore

dN_t/N_t = d(ln N_t) + (1/2) σ² dt.

Equating this with equation (5.4), we find that

d(ln N_t) + (1/2) σ² dt = r dt + σ dW_t.

It follows that

d(ln N_t) = (r − (1/2) σ²) dt + σ dW_t,

∫_0^t d(ln N_s) = ∫_0^t (r − (1/2) σ²) ds + ∫_0^t σ dW_s,

ln(N_t / N_0) = (r − (1/2) σ²) t + σ W_t.

Hence

N_t = N_0 e^{(r − (1/2) σ²) t + σ W_t}.   (5.5)

For the Stratonovich interpretation, we have

dN̂_t = r N̂_t dt + σ N̂_t dW_t   (Stratonovich),

dN̂_t / N̂_t = r dt + σ dW_t.   (5.6)

∫_0^t dN̂_s / N̂_s = ∫_0^t r ds + σ ∫_0^t dW_s.   (5.7)

Direct (Stratonovich) integration of (5.7) gives the Stratonovich solution N̂_t:

N̂_t = N_0 e^{rt + σ W_t}.   (5.8)

The solutions N_t and N̂_t are both processes of the type

X_t = X_0 e^{rt + σ W_t}.

Such processes are called geometric Brownian motions. They are important models for stochastic prices in economics. Note that it seems reasonable that if W_t is independent of N_0 we should have E[N_t] = E[N_0] e^{rt}, that is, the same as when there is no noise in a_t in equation (5.2). As anticipated, for the Itô solution we indeed obtain

E[N_t] = E[N_0] e^{rt},   (5.9)

but for the Stratonovich solution the same calculation gives

E[N̂_t] = E[N_0] e^{(r + (1/2) σ²) t}.   (5.10)

The explicit solutions N_t and N̂_t in (5.5) and (5.8), respectively, can be analysed by using our knowledge about the behaviour of W_t. For example, if we consider the Itô solution N_t, i.e. equation (5.5), we see that

(a) if r > (1/2) σ², then N_t → ∞ as t → ∞, a.s.;
(b) if r < (1/2) σ², then N_t → 0 as t → ∞, a.s.;
(c) if r = (1/2) σ², then N_t will fluctuate between arbitrarily large and arbitrarily small values as t → ∞, a.s.

The properties above are shown in Figure 5.1 (a)-(b) below.

[Figure 5.1: two panels, each showing the Milstein scheme compared with the analytical solution (legend: analytical, Milstein); N(t) plotted against t. Panel (a): r = 3, σ = 2, N_0 = 6. Panel (b): r = 0.5, σ = 2, N_0 = 6.]

Figure 5.1: The population grows in (a) and it decays in (b), as expected, due to the change of the parameters r and σ.
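An experiment of the kind shown in Figure 5.1 can be reproduced along the following lines. The MATLAB sketch below is not the original program used for the figure: the time horizon, the step size and the plotting details are assumptions, and the parameters are those quoted for panel (a). It integrates (5.3) with the Milstein scheme and overlays the exact Itô solution (5.5) driven by the same Wiener increments.

% Sketch: Milstein scheme for dN = r*N dt + sigma*N dW compared with the
% exact Ito solution (5.5). Parameters as in Figure 5.1(a); dt and T assumed.
clear
r=3; sigma=2; N0=6;             % panel (a) parameters
T=1; dt=0.001; n=round(T/dt);   % assumed horizon and step size
t=(0:n)*dt;
dW=sqrt(dt)*randn(1,n);         % Wiener increments
W=[0 cumsum(dW)];
N_exact=N0*exp((r-0.5*sigma^2)*t+sigma*W);    % analytical solution (5.5)
N_mil=zeros(1,n+1); N_mil(1)=N0;
for k=1:n
    % Milstein step: Euler term plus 0.5*g*g_x*((dW)^2-dt), with g = sigma*N
    N_mil(k+1)=N_mil(k)+r*N_mil(k)*dt+sigma*N_mil(k)*dW(k) ...
               +0.5*sigma^2*N_mil(k)*(dW(k)^2-dt);
end
plot(t,N_exact,'-',t,N_mil,'--');
legend('analytical','Milstein'); xlabel('t'); ylabel('N(t)');
title('Milstein scheme compared with the analytical solution');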

For the Stratonovich solution N̂_t we find by the same argument that N̂_t → 0 a.s. if r < 0 and N̂_t → ∞ a.s. if r > 0. Thus the two solutions have fundamentally different properties, and it is an interesting question which solution gives the best description of the process/situation.

Remark 1 There is a fundamental Itô formula (see Arnold (1974)) in stochastic calculus; for example, the differential form of the Itô formula applied to the function W(t) → W^n(t) for an integer n ≥ 1 and t ≥ 0 gives

d(W_t^n) = n W_t^{n−1} dW_t + (n(n−1)/2) W_t^{n−2} dt.

Exercise 4 Use the Itô formula to prove that

∫_0^t W_s² dW_s = (1/3) W_t³ − ∫_0^t W_s ds.

Hint: choose n = 3 and apply the above formula.

5.1 The Multidimensional Itô Formula

When we encounter a higher-dimensional situation we consider a vector of stochastic processes

X = (X_1, ..., X_n)^T,

where the component X_i has a stochastic differential

dX_i(t) = µ_i(t) dt + Σ_{j=1}^{d} σ_ij(t) dW_j(t),

with W_1, ..., W_d independent Wiener processes. Define the drift term µ and the d-dimensional Wiener process W by

µ = (µ_1, ..., µ_n)^T   and   W = (W_1, ..., W_d)^T,

respectively, and the n × d diffusion matrix σ by

σ = [ σ_11 ... σ_1d ; ... ; σ_n1 ... σ_nd ].

So in matrix notation we have

dX(t) = µ(t) dt + σ dW(t).

Let f : R_+ × R^n → R be a C^{1,2} mapping. Define a new process Z by Z(t) = f(t, X(t)). Then Z has a stochastic differential given by

df(t, X(t)) = (∂f/∂t) dt + Σ_{i=1}^{n} (∂f/∂x_i) dX_i + (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} (∂²f/∂x_i∂x_j) dX_i dX_j,

with multiplication table

             dt     dW_j(t)
  dt          0        0
  dW_i(t)     0      δ_ij dt

with δ_ij = 1 if i = j and δ_ij = 0 if i ≠ j. It follows that

dX_i dX_j = (µ_i dt + Σ_{k=1}^{d} σ_ik dW_k)(µ_j dt + Σ_{l=1}^{d} σ_jl dW_l)
          = Σ_{k=1}^{d} σ_ik σ_jk dt
          = C_ij dt,

where C is the n × n matrix

C = σ σ^T.

So we can write

df(t, X(t)) = { ∂f/∂t + Σ_{i=1}^{n} µ_i ∂f/∂x_i + (1/2) Σ_{i,j=1}^{n} C_ij ∂²f/∂x_i∂x_j } dt + Σ_{i=1}^{n} (∂f/∂x_i) Σ_{j=1}^{d} σ_ij dW_j.   (5.11)

5.2 Applications of Itô formula

In this part we present various examples of stochastic differential equations and show how their solutions can be found with the aid of the Itô formula.

5.2.1 Examples of Linear SDEs with additive noise

Let us consider the case which models the molecular bombardment of a speck of dust on a water surface, responsible for Brownian motion. The intensity of this bombardment does not depend on the state variables, for instance the position of the speck. Taking X_t as one of the components of the velocity of the particle, the Langevin equation (see [13]) can be written as

dX_t / dt = −a X_t + b ξ_t   (5.12)

for the acceleration of the particle, that is, the sum of the retarding frictional force depending on the velocity and the molecular forces represented by a white noise process ξ, with intensity b independent of the velocity of the particle. Here a and b are positive constants. The linear equation (5.12) can be written as an Itô SDE

dX_t = −a X_t dt + b dW_t,   (5.13)

or in stochastic integral form as

X(t) = X(0) − a ∫_0^t X_s ds + ∫_0^t b dW_s,   (5.14)

where the second integral is an Itô integral. Such a process is called an Ornstein-Uhlenbeck process. Note that for this kind of SDE it does not matter whether you choose the Itô or the Stratonovich integral, as both lead to the same process. Equation (5.14) is said to have additive noise because the noise term does not depend on the state variable of the system; it is also a linear equation. It can be shown that equation (5.14) has the explicit solution

X(t) = e^{−at} X_0 + e^{−at} ∫_0^t e^{as} b dW_s
     = e^{−at} X_0 + b ∫_0^t e^{−a(t−s)} dW_s,   0 ≤ t ≤ T.

But when there are external fluctuations, the intensity of the noise usually depends on the state of the system. For example, the growth coefficient in an exponential growth equation dX(t) = αX(t) dt may fluctuate on account of environmental effects, taking the form α = a + b ξ_t, where a and b are positive constants and ξ_t is a white noise process. As has been shown before, this leads to the following Itô SDE:

dX_t = a X_t dt + b X(t) dW(t),   (5.15)

or, in stochastic integral form,

X(t) = X(0) + ∫_0^t a X_s ds + ∫_0^t b X_s dW_s.   (5.16)

The second integral is again an Itô integral, but in contrast with equation (5.13) its integrand involves the unknown solution; equation (5.16) has multiplicative noise. Still, it is a linear equation.

1. (i) Show that the solution of the SDE

   dX_t = µ X(t) dt + σ dW(t)

   is given by

   X_t = e^{µt} ( X_0 + σ ∫_0^t e^{−µs} dW_s ).

   (ii) Show that the solution of the SDE

   dX_t = (a X(t) + b) dt + σ dW(t)

   is given by

   X_t = e^{at} ( X_0 + (b/a)(1 − e^{−at}) + σ ∫_0^t e^{−as} dW_s ).

   (iii) Show that the solution of the SDE

   dX_t = ((b − X_t)/(T − t)) dt + dW(t)

   is given by

   X_t = X_0 (1 − t/T) + b t/T + (T − t) ∫_0^t (1/(T − s)) dW_s.

2. Solve the Ornstein-Uhlenbeck equation (or Langevin equation)

   (a) dX_t = µ X_t dt + σ dW_t,   X_0 = x,

   where µ, σ are real constants. The solution is called the Ornstein-Uhlenbeck process.

   (b) Find E[X_t] and var[X_t] := E[(X_t − E[X_t])²].

5.2.2 Examples of Linear SDEs with multiplicative noise

Examples

1. By an application of Itô's formula, X(t) = e^{W(t) − t/2} is a strong solution of the SDE

   dX(t) = X(t) dW(t),   X(0) = 1.

2. Consider the SDE dX(t) = a(t) dW(t), where a(t) is a deterministic (non-random) C¹ function. The solution must be

   X(t) = X(0) + ∫_0^t a(s) dW(s).

   Integrating by parts,

   X(t) = X(0) + a(t) W(t) − ∫_0^t W(s) a'(s) ds,

   and we see that the solution is a strong solution.

3. The SDE

   dX(t) = r X(t) dt + σ X(t) dW(t),   X(0) = 1,

   where r and σ are real constants, has a strong solution given by

   X(t) = e^{(r − σ²/2) t + σ W(t)}.

64 64 CHAPTER 5. ITÔ S FORMULA 4. Consider the general linear SDE (scalar case, i.e. n = d = 1) dx(t) = (µx(t) + ν)dt + (σx(t) + τ)dw (t). The solution in the homogeneous case ν = τ = is U(t) = e (µ σ2 /2)t+dW (t). By an application of Itô s formula we get: It follows that It follows that du 1 (t) = U 1 (t) [ ( µ + σ 2 )dt σw (t) ]. d ( X(t)U 1 (t) ) = X(t)dU 1 (t) + U 1 (t)dx(t) + dx(t)du 1 (t) X(t) = U(t) = U 1 (t) [( ν + στ)dt + τdw (t)]. { x + ( ν + στ) t U 1 (s)ds + τ Example 8 By an application of Itô s formula, it can be shown that is a strong solution of the SDE X(t) = e W (t) t/2 dx(t) = X(t)dW (t), X() = 1. That is Let g(x t, t) = e W (t) t/2 = g(x, t) = e x t/2 then dg(t, X(t)) = g g dt + t x dw (t) t 2 g (dw (t))2 x2 } U 1 (s)dw (s). dg(t, X(t)) = 1 2 ew (t) t/2 dt + e W (t) t/2 dw (t) ew (t) t/2 dt dx t = 1 2 X tdt + X t dw (t) X tdt dx t = X t dw (t) Example 9 Using Itô formula we can verify that SDE (5.15) has the explicit solution X t = e (a b2 /2)t+bW t.

65 5.2. APPLICATIONS OF ITÔ FORMULA 65 That is Let g(x t, t) = e (a b2 /2)t+bW (t) = g(x, t) = e (a b2 /2)t+bx then Hence dg(t, X(t)) = g g dt + t x dw (t) g (dw (t))2 2 x2 ) dg(t, X(t)) = (a b2 e (a b2 /2)t+bW (t) dt + be (a b2 /2)t+bW (t) dw (t) + 2 b 2 /2)t+bW (t) 2 e(a b2 (dw (t)) 2 ) dx t = (a b2 X t dt + b 2 X t dw (t) + b2 2 2 X t(dw (t)) 2 dx(t) = ax(t)dt + bx(t)dw (t), X() = 1, while the Stratonovich would yield a different solution X t = e (at+bwt 1. ds t = µs t (t)dt + σs t dw (t) using Itô calculus we can verify that S t = S e ) (µ σ2 t+σw 2 t 2. ds t = 1 2 S t(t)dt + S t dw (t) using Itô calculus we can verify that S t = S e Wt 3. ds t = S t dw (t) using Itô calculus we can verify that S t = S e Wt 1 2 t

Example 10 Consider the SDE dX(t) = a(t) dW(t), where a(t) is a deterministic (non-random) C¹ function. The solution must be

X(t) = X(0) + ∫_0^t a(s) dW(s).

Integrating by parts,

X(t) = X(0) + a(t) W(t) − ∫_0^t W(s) a'(s) ds,

and we see that the solution is a strong solution.

Exercise 5 Consider the stochastic differential equation

dX_t = µ dt + σ dW(t),

where µ and σ are constants. This equation is well defined. The exact solution is written as

X_t = X_{t_0} + µ(t − t_0) + σ(W_t − W_{t_0}),   where X_{t_0} = 0, W_{t_0} = 0.

Such a process is called a generalized Wiener process. The random tracks of X_t can be generated using the next program, where µ = 0.2 and σ = 0.3.

clear
dt=1;
mu=0.2; sigma=0.3;
N=1000;                          % number of time steps (value assumed)
Rn=randn(N,1);
t=zeros(N,1); W=zeros(N,1);
t(1)=1; W(1)=0;
for i=2:N
    t(i)=i;
    W(i)=W(i-1)+sqrt(dt)*Rn(i);
end
X=mu*t+sigma*W;
plot(t,X,'-');
set(gca,'FontName','Times New Roman','FontSize',16);
x=xlabel('time t');
set(x,'FontName','Times New Roman','FontSize',16);
y=ylabel('X(t)');
set(y,'FontName','Times New Roman','FontSize',16);
title('Simulation of a Generalized Wiener Process')

The result is depicted below.

[Figure 5.2: X(t) plotted against time t.]

Figure 5.2: Simulation of a Generalised Wiener Process

5.2.3 Relation between Itô and Stratonovich SDEs

The difference between the two interpretations lies in the location where the diffusion function, say σ(t, X_t), is evaluated, as we shall soon see in the next part. Fortunately, transformations between the Itô and the Stratonovich concepts exist for models in any dimension (see below and in Jazwinski 1970 [9], for example). The same SDE can lead to different solutions depending on the type of numerical scheme one chooses to use. Therefore it is essential that one defines the SDE in a particular sense and uses the corresponding numerical schemes (more details on the choice of numerical schemes can be found in [13, 11], for example). Therefore, if a physical system is on the one hand defined by the Itô SDE

dX_t = µ(t, X_t) dt + σ(t, X_t) dW_t   (Itô),   (5.17)

then the same process can also be described with the Stratonovich equation

dX_t = ( µ(t, X_t) − (1/2) σ(t, X_t) ∂σ(t, X_t)/∂x ) dt + σ(t, X_t) dW_t   (Stratonovich),   (5.18)

that is,

dX_t = µ̄(t, X_t) dt + σ(t, X_t) dW_t   (Stratonovich),

where

µ̄(t, X_t) = µ(t, X_t) − (1/2) σ(t, X_t) ∂σ(t, X_t)/∂x.

On the other hand, if a physical process is described by the Stratonovich stochastic differential equation

dX_t = µ(t, X_t) dt + σ(t, X_t) dW_t   (Stratonovich),   (5.19)

then the same process can also be described with the Itô equation

dX_t = ( µ(t, X_t) + (1/2) σ(t, X_t) ∂σ(t, X_t)/∂x ) dt + σ(t, X_t) dW_t   (Itô),   (5.20)

that is, in general,

dX_t = µ̃(t, X_t) dt + σ(t, X_t) dW_t   (Itô),

where

µ̃(t, X_t) = µ(t, X_t) + (1/2) σ(t, X_t) ∂σ(t, X_t)/∂x.

Note that as long as the function σ(t, X(t)) = g(t) is only time dependent, both interpretations produce the same results. It is essential to note that the Stratonovich formula agrees with the classical differential formula: unlike in the Itô formula, there is no additional term (Kloeden (1999)).

Example 11 Let us consider the following geometric Brownian process that is often applied in finance as a model for stochastic prices, whose Itô SDE is written as

dX(t) = a X(t) dt + b X(t) dW(t)   (Itô),   X(t_0) = 1,   (5.21)

where a, b are positive constants. With the aid of Itô's differential rule and the function φ(x, t) = ln(x), x > 0, the following Itô solution can be obtained:

X(t) = e^{(a − b²/2) t + b W(t)},   X(t_0) = 1,   W(0) = 0.

If instead the same equation is interpreted in the Stratonovich sense,

dX(t) = a X(t) dt + b X(t) dW(t)   (Stratonovich),   X(t_0) = 1,   (5.22)

then by using equation (5.20) we obtain the equivalent Itô equation

dX(t) = (a + b²/2) X(t) dt + b X(t) dW(t)   (Itô),   X(t_0) = 1.   (5.23)

Again, with the aid of Itô's differential rule and the function φ(x, t) = ln(x), the following solution can be obtained:

X(t) = e^{a t + b W(t)},   X(t_0) = 1,   (5.24)

while the Stratonovich equation (5.22) has the Stratonovich solution

X(t) = e^{a t + b W(t)},   X(t_0) = 1,   (5.25)

which is the same solution.
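The correction in (5.20) and the conclusion of Example 11 can be illustrated numerically. The MATLAB sketch below is not part of the notes: the parameter values, step size and horizon are assumptions. It integrates the Stratonovich model (5.22) by first converting it to the Itô form (5.23) and applying the Euler-Maruyama scheme, and then compares the result with the exact Stratonovich solution (5.25) driven by the same Wiener increments.

% Sketch: solve the Stratonovich SDE (5.22) by converting it to its Ito
% form (5.23) and using Euler-Maruyama; compare with the exact Stratonovich
% solution (5.25). Parameter values a, b, dt, T are assumed.
clear
a=1; b=0.5; X0=1;                 % assumed parameters
T=1; dt=1e-4; n=round(T/dt);
t=(0:n)*dt;
dW=sqrt(dt)*randn(1,n); W=[0 cumsum(dW)];
X=zeros(1,n+1); X(1)=X0;
for k=1:n
    % corrected (Ito) drift: a*X + 0.5*b*(b*X) = (a + b^2/2)*X, see (5.23)
    X(k+1)=X(k)+(a+0.5*b^2)*X(k)*dt+b*X(k)*dW(k);
end
X_exact=X0*exp(a*t+b*W);          % Stratonovich solution (5.25)
plot(t,X_exact,'-',t,X,'--');
legend('exact Stratonovich solution','Euler on the converted Ito SDE');
xlabel('t'); ylabel('X(t)');

Applying the Euler scheme directly to (5.22) without the drift correction would instead approximate the Itô solution e^{(a − b²/2)t + bW(t)}, which is the point of the conversion formulas.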

69 5.2. APPLICATIONS OF ITÔ FORMULA 69 Example 12 Consider the process M t satisfying the stochastic differential equation: dm t = am t dt + adw t (5.26) Note that in this case there is no difference in the Itô and Stratonovich interpretation. It is easy to derive that the auto covariance of this process. For example E[M ] = and E[M 2 ] = a 2 then E[M t ] = E[M t M s ] = a 2 e a(t s) The solution of equation (5.26) can be shown to be: M t = M e at + a for τ > we have t + τ > t : we have t e a(t+r) dw r (5.27) t+τ M t+τ = M e a(t+τ) + a e a((t+τ)+s) dw s (5.28) If τ > we have t + τ > t : The autocovariance can be calculated in many ways let us use this approach: t+τ t E[M t+τ M t ] = E[M 2 e a(2t+τ) ] + + a 2 e a(t+τ) s e a(t+r) E[dW s dw r ] t+τ t E[M t+τ M t ] = E[M 2 e a(2t+τ) ] + + a 2 e a(t+τ) s e a(t+r) E[ξ s ξ r ]dsdr we have discussed earlier that the white noise has the property that E[ξ s ξ r ] = δ(s r) t+τ t E[M t+τ M t ] = E[M 2 e a(2t+τ) ] + a 2 e a(t+τ) s e a(t+r) δ(s r)dsdr t+τ E[M t+τ M t ] = E[M 2 e a(2t+τ) ] + a 2 e a(t+τ) s δ(s r) t+τ E[M t+τ M t ] = E[M 2 e a(2t+τ) ] + a 2 F (s)δ(s r) t t e a(t+r) dsdr e a(t+r) dsdr (5.29) By using the property of impulse/delta function (see Jazawinki (197) [9]) for instance, if F (t) is a function of t then q p F (t)δ(t a)dt = F (a).1 if t = a (5.3)

70 7 CHAPTER 5. ITÔ S FORMULA Using this property equation we see that t+τ F (s)δ(s r)ds = F (r) 1 if s = r thus F (r) now takes this value: F (r) = e a(t+τ r) If this expression is plugged into the eqn (5.29) above the expression becomes E[M t+τ M t ] = E[M 2 e a(2t+τ) ] + a 2 F (r) t e a(t+r) dr t E[M t+τ M t ] = E[M 2 e a(2t+τ) ] + a 2 e a(t+τ r) e a(t+r) dr E[M t+τ M t ] = a 2 e aτ E[M t M s ] = a 2 e a t s t = E[M 2 e a(2t+τ) ] + a 2 e a(2t+τ) e 2ar dr = a 2 e a(2t+τ) + a { e a2t aτ+2at e a(2t+τ)} 2 = a 2 e a(2t+τ) + a 2 e aτ e a(2t+τ) M t is called a continuous colored noise process. For very large values of the parameter a, the process M t approaches a white noise process ξ t. If the white noise forcing in the stochastic differential equation: or: dx t dt = f(t, X t ) + g(t, X t )ξ t (5.31) dx t = f(t, X t )dt + g(t, X t )dw t (5.32) is physically not accurate it is also possible to force the system with a colored noise process: dx t dt = f(t, X t ) + g(t, X t )M t (5.33) Unfortunately X t is now not a Markov process. However we can rewrite the system as a standard vector SDE of the form

dX_t = f(t, X_t) dt + g(t, X_t) M_t dt,   (5.34)
dM_t = −a M_t dt + a dW_t.   (5.35)

In vector form,

[ dX_t ]   [ f(t, X_t) + g(t, X_t) M_t ]        [ 0 ]
[ dM_t ] = [          −a M_t           ] dt  +  [ a ] dW_t.   (5.36)

The vector process [X_t, M_t] is now again a Markov process. The final set of model equations (5.36) in Example 12 has the same solution in either the Itô or the Stratonovich sense (verify this by using the Itô and Stratonovich relations discussed in section 5.2.3). Let us now consider the case that the value of the parameter a is very large. In this case the autocovariance of the process M_t is close to a delta function and M_t approaches a white noise process ξ_t. As a result, the set of equations (5.36) reduces to the standard stochastic differential equation in terms of the Wiener process, SDE (5.32). This equation does not have the same solution in the Itô and Stratonovich senses. The question now is: is this an Itô or a Stratonovich equation? The answer is: a Stratonovich SDE. If the white noise process (with zero time correlation) in the SDE is the mathematical approximation of a noise process with a very short correlation time, the SDE has to be interpreted in the Stratonovich sense. This is the case for most physical stochastic systems that have been derived by embedding a deterministic differential equation into a stochastic environment.

Exercise 6 Consider the stochastic BOD model considered before:

dB_t / dt = −K_1 B_t + s_1 − B_t σ ξ_t,   B_{t_0} = B_0,   (5.37)

or, in terms of the Wiener process,

dB_t = (−K_1 B_t + s_1) dt − B_t σ dW_t,   B_{t_0} = B_0.   (5.38)

Since the white noise process in this stochastic model is a mathematical approximation of a noise process with a relatively short correlation scale, this SDE has to be interpreted in the Stratonovich sense. Since the Euler scheme can only be used for Itô equations, the model above is rewritten as an Itô SDE:

dB_t = (−K_1 B_t + s_1 + (1/2) B_t σ²) dt − B_t σ dW_t,   B_{t_0} = B_0.   (5.39)

Generate different realizations of B_t using the following program:

%The programme below simulates the BOD model (5.39), i.e. in the Stratonovich sense
clear
dt=1;                  %time step size
%======= Initialisation of the column vectors (preallocation of memory)
N=100;                 % number of time steps (value assumed)
Rn=randn(N,1);
t=zeros(N,1); W=zeros(N,1); B=zeros(N,1);
%======================= Initial conditions
W(1)=0; t(1)=1;
%declaration of parameters
k1=0.1; s1=1; b=0.1;   % b is the noise intensity sigma
% Integration stage (Wiener track)
for i=2:N
    t(i)=i;
    W(i)=W(i-1)+sqrt(dt)*Rn(i);
end
%the numerical solution of the BOD model with the Euler scheme
B0=2.0;
B(1)=B0-k1*B0*dt+s1*dt+0.5*B0*b*b*dt-B0*b*sqrt(dt)*Rn(1);
for i=2:N
    B(i)=B(i-1)-k1*B(i-1)*dt+s1*dt+0.5*B(i-1)*b*b*dt-B(i-1)*b*sqrt(dt)*Rn(i);
end
%plot the results
h=plot(t,B,'-');
set(h,'LineWidth',2);
set(gca,'FontName','Times New Roman','FontSize',16);
title('Simulation of a Sample BOD realization-Stratonovitch')
xlabel('Distance')
ylabel('B(t)')

The mean and standard deviation of the BOD can be determined using this program:

clear;
%The programme below simulates the BOD model (5.39)
%and computes the mean and standard deviation of the process in the Stratonovich sense
N=100;                 % number of time steps per realization (value assumed)
M=1000;                % number of realizations (value assumed)
dt=1;
t=zeros(N,1);
meanB=zeros(N,1); varB=zeros(N,1);
B=zeros(N,1); W=zeros(N,1);
%Initial parameters in the BOD model (5.39)
k1=0.1; s1=1; b=0.1;
for i=1:N
    t(i)=i;
end
for j=1:M

    Rn=randn(N,1);
    W(1)=0;
    for i=2:N
        W(i)=W(i-1)+sqrt(dt)*Rn(i);
    end
    %the numerical solution of the BOD model with the Euler scheme
    B0=2.0;
    B(1)=B0-k1*B0*dt+s1*dt+0.5*B0*b*b*dt-B0*b*sqrt(dt)*Rn(1);
    for i=2:N
        B(i)=B(i-1)-k1*B(i-1)*dt+s1*dt+0.5*B(i-1)*b*b*dt-B(i-1)*b*sqrt(dt)*Rn(i);
    end
    %accumulate the mean and variance over the M samples of BOD
    meanB=meanB+B;
    varB=varB+B.*B;
end
meanB=meanB/M;
rms=sqrt(varB/M-meanB.*meanB);
h=plot(t,meanB,'-',t,rms,'-');
set(h,'LineWidth',2);
set(gca,'FontName','Times New Roman','FontSize',16);
title(['Mean and Standard Dev. of ',num2str(M),' BOD Realizations - Stratonovitch'])
xlabel('Distance')
ylabel('Mean and Std. Deviation')
legend('Mean','Std. Deviation');

Exercise 7 Consider the model describing the Carbonaceous Biochemical Oxygen Demand (CBOD) process. The model is an example of several models used in water quality modelling of river and estuary systems and is an extension of the so-called Streeter-Phelps model. In this model, the concentration levels of CBOD, Dissolved Oxygen (DO), and Nitrogenous Biochemical Oxygen Demand (NBOD) are related to each other by a set of differential equations. With b, o, and n representing the levels (mg/l) of CBOD, DO, and NBOD, respectively, the system is given by

 d  [ b ]   [ −k_b    0      0   ] [ b ]   [ s_1  ]
 -- [ o ] = [ −k_c  −k_2   −k_n  ] [ o ] + [ s̄_2 ],   (5.40)
 dt [ n ]   [  0      0    −k_n  ] [ n ]   [ s_3  ]

where we defined k_b = k_c + k_3 and s̄_2 = k_2 d_s + p − r + s_2 for the sake of conciseness. A description of the parameters used in the model, along with their units and typical values, is given in Table 5.1. Although the model is given as differentials with respect to time, the intention behind the model is to monitor the concentration levels within a fixed volume of water flowing downstream a river. An underlying assumption is that the velocity of the flow is constant and thus time and distance are linearly related. For more information on the model

and its background, we refer to [11, 2]; the parameters involved in the model are summarised in Table 5.1.

Par.   Description                                        Unit      Value
k_c    Reaction rate coefficient                          (1/day)   0.763
k_2    Reaeration rate coefficient                        (1/day)   4.25
k_3    Sedimentation and adsorption loss rate for CBOD    (1/day)   0.254
k_n    Decay rate of NBOD                                 (1/day)   0.978
p      Photosynthesis of oxygen                           (1/day)   7.28
r      Respiration of oxygen                              (1/day)   7.75
d_s    Saturation concentration of oxygen                 (1/day)   1.0
s_1    Independent source for CBOD                        (1/day)   3.0
s_2    Independent source for DO                          (1/day)   0.0
s_3    Independent source for NBOD                        (1/day)   0.0

Table 5.1: Description and typical values for the parameters used in the CBOD model.

A major problem with the deterministic CBOD model, as given above, is that the values of some of the parameters are hard to determine in practice. For example, k_c, k_2, and k_3 all depend on temperature and are difficult to measure. Therefore, it might be more realistic to incorporate some level of uncertainty in the model. This can be done by modelling the uncertain parameters as stochastic processes. The most convenient way of doing so is by adding white noise to the constant part of each equation, which effectively turns the deterministic sources and sinks into stochastic ones. Therefore, show that by adding white noise to the constant part of each equation in the system (5.40), the following stochastic version of the CBOD model is obtained:

 d [ B ]   [ −k_b    0      0   ] [ B ]        [ s_1  ]        [ σ_B ]
   [ O ] = [ −k_c  −k_2   −k_n  ] [ O ] dt  +  [ s̄_2 ] dt  +  [ σ_O ] dW_t,   (5.41)
   [ N ]   [  0      0    −k_n  ] [ N ]        [ s_3  ]        [ σ_N ]

where the dW_t term denotes the Wiener process increment at time t. Like the diffusion process, the stochastic CBOD model is Markov and its variables are Gaussian. In addition, because the added noise term does not depend on the values of B, O, or N, the model can be interpreted in both the Itô and the Stratonovich sense.
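A realization of the stochastic CBOD model can be generated along the following lines. This MATLAB sketch is not part of the exercise: the noise intensities σ_B, σ_O, σ_N, the initial concentrations, the step size, the horizon and the assumption of independent noise in each component are all illustrative choices; only the drift parameters come from Table 5.1.

% Sketch: Euler-Maruyama simulation of the stochastic CBOD model (5.41).
% Drift parameters from Table 5.1; sigma_B, sigma_O, sigma_N, the initial
% state, dt and T are illustrative assumptions.
clear
kc=0.763; k2=4.25; k3=0.254; kn=0.978;
p=7.28; r=7.75; ds=1.0; s1=3.0; s2=0.0; s3=0.0;
kb=kc+k3; s2bar=k2*ds+p-r+s2;
A=[-kb   0    0;
   -kc  -k2  -kn;
    0    0   -kn];
s=[s1; s2bar; s3];
sigma=[0.5; 0.5; 0.5];            % assumed noise intensities
X=[5; 8; 2];                      % assumed initial levels of B, O, N (mg/l)
T=10; dt=0.01; n=round(T/dt);
out=zeros(3,n+1); out(:,1)=X;
for k=1:n
    dW=sqrt(dt)*randn(3,1);       % independent Wiener increments (assumption)
    X=X+(A*X+s)*dt+sigma.*dW;     % Euler-Maruyama step for (5.41)
    out(:,k+1)=X;
end
t=(0:n)*dt;
plot(t,out);
legend('CBOD','DO','NBOD'); xlabel('time (days)'); ylabel('concentration (mg/l)');

Because the noise is additive, the same tracks would be obtained whether (5.41) is read in the Itô or the Stratonovich sense, which is the point made in the exercise.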

75 Chapter 6 Connection between Stochastic differential and PDES 6.1 Markov processes and Transition Density A stochastic process is Gaussian if the probability law of the process is normal. One of the nice properties of Gaussian processes is that you can apply linear algebraic operations on them, to get a new process, that is also Gaussian. Another interesting fact about the Gaussian distribution is that it may be approximated by adding up properly scaled, independent random variables, regardless of their individual distribution. A process {X t, t T } is called a Markov process if, for any finite parameter set {t i : t i < t i+1 } T, and for every real λ, Pr{X tn λ x t1,..., x tn 1 } = Pr{X tn λ x tn 1 }. (6.1) This means that future realisations of the process can be based on the current state, without having to know how the current state came about, a property expressed in the generalised causality principle: the future can be predicted from a knowledge of the present. Finally, we note that the probability law of a Markov process can be specified by giving p(x t ), and p(x t X τ ), for all t > τ T. The conditional densities, p(x t X τ ) are called the transition probability densities of the Markov process and will play a major role in this thesis. Before moving on, we would like to mention the Chapman-Kolmogorov equation p(x n X n 2 ) = p(x n x n 1 )p(x n 1 X n 2 )dx n 1, (6.2) which is valid for all Markov processes. 6.2 Transition Density Estimation In most cases, the solution of stochastic differential equations can be approximated using various numerical schemes by stepping through time in small strides. Often, we are not 75

really interested in the exact values that the realisations of the state vector assume between the initial state and the final state. What we are interested in is the distribution of the final state vector, and nothing describes this distribution better than the probability density function. That is, given an equation like

dX(s) = a(s, X_s) ds + σ(s, X_s) dW_s,   t ≤ s ≤ T,   (6.3)

we are particularly interested in either the transition density function p(t, x, T, y) or the density function p(T, y). Estimation of the transition density function for fixed values of t, x, T, and y can be done by several methods: for example, the classical and well-established forward estimation method, and a newer method by Milstein et al. [19], which is based on a combined forward-reverse estimation. These techniques can be conveniently implemented using Monte Carlo methods. This kind of method is often used to generate data about stochastic processes or variables. To do so, a vast number of realisations of the random variable is generated. The laws of probability state that, as the number of realisations goes to infinity, the distribution of the sampled realisations approaches the exact distribution, and the instances of the stochastic variable can be used instead of the real variable. In practice, only a limited number of realisations is required to obtain, within a given level of uncertainty, the sought-after result of some problem that depends on the variable.

6.3 Forward density estimation

Suppose we are interested in the transition density p(t, x, T, y) of the process of equation (6.3). When we fix the initial condition of the process to x, X_t = x, the transition density becomes a conditional probability density function p_{X_t = x}(T, y) which, because of the limitation imposed on the initial condition, is more specific and hence easier to determine. The first step in our forward density estimation is to approximate X_{t,x}, e.g. using numerical techniques, and obtain X̄_{t,x}. Using this process, we then generate N independent realisations for time T, denoted by X̄^{(n)}_{t,x}(T) for n = 1, ..., N. The second and final step is to use these realisations to obtain an estimate of the probability density for one single value of y at a time. There are many ways to do this; one of them is to use a kernel density estimator to approximate the probability density. In this technique, every realisation is replaced by a distribution function (the kernel) whose mean coincides with the realisation. Before application, two choices have to be made. First, the kernel function K needs to be selected, which determines the shape of the kernel and, in many cases, the efficiency of the method. Second, the influence of the kernel (i.e. the scaling of the distribution function) has to be determined by choosing an appropriate bandwidth δ. The approximation of the transition density for a d-dimensional process is now given by

p̂(t, x, T, y) = (1/N) Σ_{n=1}^{N} K_δ( X̄^{(n)}_{t,x}(T) − y ),   (6.4)

where K_δ denotes the kernel scaled by the bandwidth δ. Depending on the shape of the kernel K, the resulting probability density can be continuous and smooth; for more details the reader is advised to read, for example, [19, 11].
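The estimator (6.4) is easy to try out on a problem whose transition density is known in closed form. The MATLAB sketch below is illustrative only: it uses a geometric Brownian motion (whose exact transition density is lognormal), a Gaussian kernel, and Silverman's rule of thumb for the bandwidth; none of these choices is prescribed by the notes.

% Sketch of forward density estimation (6.4): simulate N endpoints of a
% geometric Brownian motion with the Euler scheme and estimate the
% transition density with a scaled Gaussian kernel. All parameter values
% and the bandwidth rule are illustrative assumptions.
clear
mu=0.1; sigma=0.3; x0=1;          % assumed coefficients and start value
t0=0; T=1; dt=1e-3; n=round((T-t0)/dt);
N=20000;                          % number of realisations (assumed)
X=x0*ones(N,1);
for k=1:n
    dW=sqrt(dt)*randn(N,1);
    X=X+mu*X*dt+sigma*X.*dW;      % Euler step for dX = mu X dt + sigma X dW
end
delta=1.06*std(X)*N^(-1/5);       % bandwidth (Silverman's rule, an assumption)
y=linspace(0.3,2.5,200);
p_hat=zeros(size(y));
for m=1:numel(y)
    u=(X-y(m))/delta;
    p_hat(m)=mean(exp(-0.5*u.^2)/sqrt(2*pi))/delta;   % Gaussian kernel estimate
end
% exact lognormal transition density of the test problem, for comparison
mlog=log(x0)+(mu-0.5*sigma^2)*(T-t0); slog=sigma*sqrt(T-t0);
p_exact=exp(-(log(y)-mlog).^2/(2*slog^2))./(y*slog*sqrt(2*pi));
plot(y,p_hat,'--',y,p_exact,'-');
legend('kernel estimate','exact density'); xlabel('y'); ylabel('p(t_0,x_0,T,y)');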

6.4 The forward-reverse formulation

The idea behind the forward-reverse (FRE) method is that by generating both the forward process with initial value x at time t and the reverse process with initial value y at time T, and combining these two, we can obtain an estimate of the transition density p(t, x, T, y). This way, we avoid generating a huge number of forward realisations only to hope that a significant number of them end up in the vicinity of y. Note that one needs to rewrite the SDE (6.3) so as to obtain the reverse process for an SDE in the Itô sense, with a forward process in the form of equation (6.3). In addition, one needs to combine the realisations of these two processes to get the wanted probability density; interested readers are referred to the original article [19] and to [2, 22, 23] for the full derivation of the forward-reverse model and its applications.

6.5 The Generator of the Itô Diffusion

If X_t is an Itô diffusion process, it is in many applications fundamentally associated with a second-order partial differential operator A. The basic connection between the diffusion process and the operator is that A is the generator of the process.

Definition 22 (cf. [15]) If X_t is a time-homogeneous Itô diffusion in R^n, then the (infinitesimal) generator A of the process X_t is defined by

A f(x) = lim_{t ↓ 0} ( E^x[f(X_t)] − f(x) ) / t,   x ∈ R^n.   (6.5)

The set of functions f : R^n → R such that the limit exists at x is denoted by D_A(x), while D_A denotes the set of all functions for which the limit exists for all x ∈ R^n. A detailed exposition, including the proofs of this and the next theorem, is found in Chapter VII of the book by Øksendal [15]. The relation between the differential operator A of X_t and the coefficients µ and σ of the Itô SDE dX_t = µ(X_t) dt + σ(X_t) dW_t, t ≥ s, X_s = x, can be obtained by applying the Itô formula to the function f(X_t), writing it in integral form, and taking expectations on both sides of the resulting integral. Let f(X_t) be a twice continuously differentiable function; by using the Itô formula, we get

f(X_t) − f(x) = ∫_s^t ( Σ_i µ_i(X_u) ∂f/∂x_i + (1/2) Σ_{i,j} (σσ^T)_{i,j} ∂²f/∂x_i∂x_j ) du + Σ_{i,k} ∫_s^t σ_{i,k} (∂f/∂x_i) dW_k(u)

78 78CHAPTER 6. CONNECTION BETWEEN STOCHASTIC DIFFERENTIAL AND PDES [ ( ) t E[f(X t )] f(x)) = E µ i (x) f + 1 (σσ T 2 f ) i,j ds + x i i 2 x i,j i x j i,k [ ( ) ] t E[f(X t )] f(x)) = E µ i (x) f + 1 (σσ T 2 f ) i,j ds x i 2 x i x j E[f(X t )] f(x)) lim t (t ) E[f(X t )] f(x)) lim t (t) = i [ i = Af(x) Consider the following theorem as it is in [15]: Theorem 2 Let X t be the Itô diffusion. If f C 2 (R n ), then f D A and i,j µ i (x) f x i dx t = µ(x t )dt + σ(x t )dw t, t s; X s = x Af(x) = i µ i (x) f x i Example 13 Find the generator of the following Itô diffusions: µ σ are constants solution t ] (σσ T 2 f ) i,j x i x j i,j ( ) ] f σ i,k dw k (s) x i (σσ T 2 f ) i,j (6.6) x i x j i,j dx t = µx t dt + σdw t (W t R) To answer the following questions make use the above of the Theorem 2: µ σ are constants solution Af(x) = µx f x σ2 2 f x 2 ; f C2 (R) dx t = rx t dt + σx t dw t (GBM), (W t R) Af(x) =rx f x σ2 x 2 2 f x 2 ; f C2 (R)

79 6.5. THE GENERATOR OF THE ITÔ DIFFUSION 79 Exercises Find the generator of the following Itô diffusions: 1. dy t = [ dt dx(t) ] where dx t = µx t dt + σw t 2. (i) dx(t) = [ dx1 (t) dx 2 (t) ] = [ 1 ] dt + [ 1 X 1 (t) ] [ dw1 (t) dw 2 (t) ] Example Find an Itô diffusions (i.e., write down the stochastic differential equations for it) whose generator is the following: 2. solution dx t = dt + 2dB t solution Af(x) = f x + 2 f x 2 ; f C2 (R) Af(t, x) = f t + cx f x α2 x 2 2 f x 2 dx(t) = [ dx1 (t) dx 2 (t) ] [ = 1 cx 2 (t) ] dt + [ αx 2 (t) ] dw t Exercise Find an Itô diffusions (i.e., write down the stochastic differential equations for it) whose generator is the following: Af(x 1, x 2 ) = 2x 2 f x 1 + ln(1 + x x 2 2) f x (1 + x2 1) 2 f x x 1 2 f x 1 x f x 2 2

Transition density of the Wiener process

Let p(t, x, y) be the probability density that a stochastic process, in particular the Wiener/Brownian process, changes value from x to y in time t,

p(t, x, y) = (1/√(2πt)) e^{−(y − x)²/(2t)}.

If h is a function and g(x) = E^x[h(W(t))], then

g(x) = ∫ h(y) p(t, x, y) dy,    E(h(W(s + t)) | F_s) = g(W(s)) = ∫ h(y) p(t, W(s), y) dy.

Now denote by p(t_0, t_1; x, y) the density (in the y variable) of a stochastic process X(t_1), conditioned on X(t_0) = x, and let h(y) be a function; E_{t_0,x}[h(X(t_1))] is the expectation of h(X(t_1)) given that X(t_0) = x. In other words,

E_{t_0,x}( h(X(t_1)) ) = ∫_R h(y) p(t_0, t_1; x, y) dy.

Example 15 (Drifted Brownian/Wiener motion) Consider the SDE

dX(t) = a dt + dW(t),   X(t_0) = x.

In integral form,

X(t_1) = x + a(t_1 − t_0) + (W(t_1) − W(t_0)).

Conditioned on X(t_0) = x, the random variable X(t_1) is normal with mean x + a(t_1 − t_0) and variance (t_1 − t_0), that is,

p(t_0, t_1; x, y) = (1/√(2π(t_1 − t_0))) e^{−(y − (x + a(t_1 − t_0)))²/(2(t_1 − t_0))}.

Remark 2 Note that p depends on t_0 and t_1 only through their difference (t_1 − t_0). This is always the case when the coefficients a(t, x) and σ(t, x) do not depend on t. Note that, in general, the transition density function should satisfy the Chapman-Kolmogorov equation: for any t ∈ [t_0, T],

p(t_0, x; s, y) = ∫_R p(t_0, x; t, u) p(t, u; s, y) du.   (6.7)
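The transition density in Example 15 can be verified directly by simulation. The following MATLAB sketch is illustrative only; the drift a, the start value x, the times t_0, t_1 and the sample size are assumptions.

% Sketch: check the transition density of the drifted Brownian motion of
% Example 15 against a histogram of simulated endpoints. Parameters assumed.
clear
a=0.8; x=1; t0=0; t1=2;                  % assumed drift, start value, times
N=100000;                                % assumed sample size
X1=x+a*(t1-t0)+sqrt(t1-t0)*randn(N,1);   % X(t1) = x + a(t1-t0) + (W(t1)-W(t0))
histogram(X1,'Normalization','pdf');     % empirical density of X(t1)
hold on
y=linspace(min(X1),max(X1),200);
p=exp(-(y-(x+a*(t1-t0))).^2/(2*(t1-t0)))/sqrt(2*pi*(t1-t0));
plot(y,p,'LineWidth',2);                 % density formula from Example 15
hold off
legend('simulated X(t_1)','p(t_0,t_1;x,y)'); xlabel('y');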

81 6.6. KOLMOGOROV BACKWARD EQUATION (KBE) Kolmogorov Backward equation (KBE) Consider the SDEs dx(t) = a(t, X t )dt + σ(t, X t )dw (t) (6.8) and let p(t, t 1 ; x, y) be the transition density. Then the Kolmogorov Backward equation(kbe) is p(t, t 1 ; x, y) t + a(t, x) x p(t, t 1 ; x, y) σ2 (t, x) 2 x p(t, t 2 1 ; x, y) p(t, t 1 ; x, y) t + a(t, x) x p(t, t 1 ; x, y) σ2 (t, x) 2 x p(t, t 2 1 ; x, y) = IC p(t, t, x, y) = δ(x y) variables t and x in KBE are called backward variables. In general the d-dimensional Kolmogorov Backward Equation can be represented if its transition density function p(t, x; s, y) is continuous with respect to t and s, and all first and second derivatives respect to x and y exist and are continuous. Then p(t, x; s, y) is the solution of the Kolmogorov Backward equation (t o t s T ): p t + d a i (t, x) p + 1 i=1 x i 2 p(s, x; s, y) = δ(x y) d i,j=1 2 p b ij (t, x) = x i x j (6.9) Note that differential operator for KBE operates on backward variables (t, x) hence, backward equation. In the case that a and σ are function of x alone and p(t, t 1 ; x, y) depend on t and t 1 only through their difference τ = t 1 t then p(t, t 1 ; x, y) is written as p(τ; x, y) and the KBE is: p(τ; x, y) = a(x) τ x p(τ; x, y) σ2 (x) 2 p(τ; x, y) x2 Example 16 Drifted Brownian motion dx(t) = adt + dw (t), X(t ) = x has the KBE : p(τ, x, y) = a τ x p(τ, x, y) Which can be shown where p becomes: 2 p(τ, x, y) = (6.1) x2

82 82CHAPTER 6. CONNECTION BETWEEN STOCHASTIC DIFFERENTIAL AND PDES p(τ; x, y) = ( ) 1 (y (x+aτ))2 2τ e 2πτ (6.11) p(τ, x, y) τ = p(τ, x, y) x = 2 p(τ, x, y) x2 = [ 1 a(y x aτ) + + 2τ τ [ ] (y x aτ) p τ ] 1τ (y x aτ)2 [ p + p τ 2 ] (y x aτ)2 p 2τ 2 Finally we see that KBE is satisfied by p, p τ = ap x p xx that is ap x p xx p τ = 6.7 Feynman-Kač representation formula In this section, explore the intimate connection which exists between stochastic differential equations and certain partial differential equations. The detailed discussion can be found in [21]. For simplicity we consider the one-dimensional case. Let us consider the following so called Cauchy problem. We are given three scalar functions µ(t, x), σ(t, x) and Φ(x). Our task is to find a function F (t, x) which satisfies the following boundary value problem on [t, T ] R Consider the n-dimensional stochastic differential equation dx(s) = µ(s, X(s))ds + σ(s, X(s))dW (s). The infinitesimal operator of X is the partial differential operator A defined for any function h C 2 (R n ) by Ah(t, x) = n i=1 µ i (t, x) h x i (x) n 2 h C ij (t, x) (x), x i x j i,j=1 where C = σσ T. The Itô formula takes now the form: dh(t, X(t)) = { } h t + Ah dt + n h x i i=1 d σ ij dw j (t) j=1

83 6.7. FEYNMAN-KAČ REPRESENTATION FORMULA 83 Now we are going to discuss the remarkable property that the solution of certain partial differential equations of parabolic type can be represented as a mean value of a random variable. Consider the partial differential equation where we are given three scalars µ(t, x), σ(t, x) and Φ(x), so our task is to find Find a function F that satisfies the Boundary value problems on [, T ] R: F (t, x) + µ(t, x) F t x (t, x) σ2 (t, x) 2 F (t, x) x2 =, F (T, x) = Φ(x), where (t, x) [, T ] R and Φ is a given real function. Fix a point (t, x) [, T ] R and define the stochastic process X on the time interval [t, T ] as the solution of the SDE dx(s) = µ(s, X(s))ds + σ(s, X(s))dW (s) X(t) = x. The infinitesimal operator for this process is given by The given PDE can now be written as A = µ(s, x) x σ2 (s, x) 2 x 2. F (t, x) + AF (t, x) t =, F (T, x) = Φ(x). Let F be a solution of the PDE, then applying the Itô formula to F (s, X(s)), we get df (s, X(s)) { F = s + µ F x F 2 σ2 x { 2 F = (s, X(s)) + AF (s, X(s)) t = σ(s, X(s)) F (s, X(s))dW (s), x and it follows, after integration, that } ds + σ F x } dw (s) ds + σ(s, X(s)) F (s, X(s))dW (s) x hence F (T, X(T )) F (t, X(t)) = T t σ(s, X(s)) F (s, X(s))dW (s), x Φ(X(T )) F (t, x) = T t σ(s, X(s)) F (s, X(s))dW (s). x

84 84CHAPTER 6. CONNECTION BETWEEN STOCHASTIC DIFFERENTIAL AND PDES If the process ( σ(s, X(s)) F x (s, X(s))) 2 [t, T ], we can take expectations and we get F (t, x) = E[Φ(X(T ))]. This result is known as the Feynman-Kač stochastic representation formula. The following closely related partial differential equation is important in finance: F (t, x) + µ(t, x) F t x (t, x) σ2 (t, x) 2 F (t, x) rf (t, x) x2 =, F (T, x) = Φ(x), where r is a given real number, (t, x) [, T ] R, and Φ is a given real function. Fix a point (t, x) [, T ] R and define the stochastic process X on the time interval [t, T ] and the operator A as before. Apply now Itô s formula to the process e rs F (s, X(s)), de rs F (s, X(s)) { ( ) F = e rs (s, X(s)) rf (s, X(s)) t +σ(s, X(s)) F (s, X(s))dW (s) x = e rs σ(s, X(s)) F (s, X(s))dW (s). x } + e rs AF (s, X(s)) ds Integrating and taking expectations, again assuming ( e rs σ(s, X(s)) F x (s, X(s))) 2 [t, T ], we get: e rt Φ(X(T )) e rt F (t, x) =, and F (t, x) = e r(t t) E t,x [Φ(X(T ))]. Example 17 Solve The following partial differential equation F t (t, x) σ2 (t, x) 2 F (t, x) x2 = F (T, x) = x 2, where σ is constant Solution 3 From the proposition above we know that F (t, x) = E[X 2 T ] where dx(s) = ds + σdw (s), X(t) = x. This equation can be solved, and we get thus X T has the distribution X T = x + σ (W T W t )

85 6.7. FEYNMAN-KAČ REPRESENTATION FORMULA 85 Hence, Example 18 Solve The following PDE F (t, x) = E[X 2 T ] F (t, x) = Var[X T ] + [EX T ] 2 F (t, x) = σ 2 (T t) + x 2 F (t, x) = σ 2 (T t) + x 2 F (t, x) + µx F t x (t, x) σ2 x 2 2 F (t, x) x2 = F (T, x) = ln(x), where σ is constant Therefore its SDE is: Solution 4 dx(s) = µx s ds + σx s dw (s), X(t) = x. By applying Itô formula to the function F (s, X s ) and taking expectation to the resulting equation, we get F (t, x) = E[ln(X T )] Where Hence, X T = xe (µ σ2 2 ) (T t)+σ(w T W t) ) (µ σ2 (T t)+σ(w T W t) ) 2 F (t, x) = E[ln(xe [ ) ] F (t, x) = E ln(x) + (µ σ2 (T t) + σ(w T W t ) 2 [ ) ) F (t, x) = ln(x) + (µ σ2 (T t) + 2 ) F (t, x) = ln(x) + (µ σ2 (T t) 2 Example 19 Solve The following PDE F (t, x) + µx F t x (t, x) σ2 x 2 2 F (t, x) x2 = F (T, x) = ln(x 2 ), where σ is constant Therefore its SDE is:

86 86CHAPTER 6. CONNECTION BETWEEN STOCHASTIC DIFFERENTIAL AND PDES Solution 5 dx(s) = µx s ds + σx s dw (s), X(t) = x. By applying Itô formula to the function F (s, X s ) and taking expectation to the resulting equation, we get F (t, x) = E[ln(XT 2)] Where X T = xe X 2 T = ( xe X 2 T = x 2 ( e 2 ( (µ σ2 2 (µ σ2 2 µ σ2 2 ) (T t)+σ(w T W t) ) ) 2 (T t)+σ(w T W t) ) ) (T t)+2σ(w T W t) Hence, [ ( F (t, x) = E ln (x 2 e 2 [ F (t, x) = E ln(x 2 ) + 2 [ F (t, x) = ln(x 2 ) + 2 F (t, x) = ln(x 2 ) + 2 µ σ2 2 ) )] (T t)+2σ(w T W t) ) (µ σ2 2 ) ) (µ σ2 (T t) + 2 ) (µ σ2 (T t) 2 F (t, x) = ln(x 2 ) + 2µ(T t) σ 2 (T t) ] (T t) + 2σ(W T W t ) dimensional Fokker Planck equation(fpe) If X(s) is a diffusion Markov process which is a solution of the Itô stochastic differential equation: dx(s) Itô = µ(x(s), s)ds + σ(x s, s)dw s, X(s) = x (6.12) and if there exist appropriate continuous derivatives, the transition density p(t, x; s, y) is a solution of the FPE. To gain insight into the probability evolution of the process X s, we need to know the transition density function of X t. Assume that its transition density function p(t, x; s, y) where the interval [t,s].then p(t, x; s, y) will satisfy the Kolmogorov forward equation commonly known as the Fokker-Planck equation:

87 DIMENSIONAL FOKKER PLANCK EQUATION(FPE) 87 p s (t, x; s, y) + y (µ(s, y)p(t, x; s, y)) y 2 (σ2 (s, y)p(t, x; s, y)) = (6.13) p(t, x, t, y) = δ(x y) The decisive property of the stochastic process X(t) is that its transition density function p(t, x; s, y) is uniquely determined merely by the drift vector function µ (t, x) and diffusion matrix C (t, x) := σ T (t, x) σ (t, x) d- dimensional Fokker Planck equation(fpe)) Suppose that the transition density function p(t, x; s, y) is continuous with respect to t and s, and all first and second derivatives respect to x and y exist and are continuous. Then p(t, x; s, y) is the solution of the Kolmogorov forward (Fokker-Planck) equation (t o t s T ) p s + d (µ i (s, y)p) 1 d 2 (C i,j (s, y)p) = i=1 y i 2 i,j=1 y i y j (6.14) p(t, x; t, y) = δ(x y) where δ(x y) is the Dirac function. The proof of this theorem is given in (Gihman (1972, for example). Note that differential operator for operates on forward variables (s, y) hence, forward equation. Where the differential operator for FPE operates forward in time. The probabilistic structure of a diffusion X s can be determined via the following deterministic PDE: L = n i=1 µ i (s, y) y i n 2 C ij (s, y), y i y j where C = σσ T. Thus, the Fokker-Planck equation (6.14) can be written in the compact form: i,j=1 p = Lp (6.15) s Note that the Fokker-Planck equation is a deterministic partial differential equation that in general has to be solved numerically. For vector systems with dimension larger than, say, 3 this is very time consuming. In this case the probability density can be determined more efficiently by generating a large number of tracks of the underlying SDE. Therefore FPE is one way of solving SDEs by solving the deterministic parabolic differential equation for the density function. However, if a physical system represented by the FPE, it can be rewritten as a system of differential equations.

Example 20 It can be shown that the FPE is an advection-diffusion type of partial differential equation that describes the evolution of the conditional probability density function. For the more general one-dimensional SDE given by the Itô equation

dX(t) = f(t, X_t) dt + g(t, X_t) dW(t),   X(0) = X_0,   (6.16)

the transition probability density function p = p(t, x | 0, X_0) of the stochastic process X_t is propagated according to the following Fokker-Planck equation:

∂p/∂t = −(∂/∂x)( f(t, x) p ) + (1/2)(∂²/∂x²)( g²(t, x) p ),   (6.17)
p(0, x | 0, X_0) = δ(x − X_0).

Note that the SDE (6.16) must be interpreted in the Itô sense if it is to be consistent with the FPE (6.17). The solution of the FPE is the conditional probability density function associated with the solution X_t of the stochastic differential equation (6.16). Therefore, the PDE (6.17) is completely consistent with the Lagrangian description of equation (6.16). It gives a complete description of the variation in time and space of the probability density function.

Formulation of the underlying SDE epidemic model (SIS) from Fokker-Planck equations

Let us also show how SIS stochastic models in SDE form can be derived from, or related to, their FPE. Note that the stochastic SIS epidemic model depends on the number of infectives, {I(t)}, t ∈ [0, ∞), where I(t) has an associated probability density function p(x, t) with

Pr{a ≤ I(t) ≤ b} = ∫_a^b p(x, t) dx.

It is also assumed that the SIS model obeys the Markov property

Pr{I(t_n) | I(t_0), I(t_1), ..., I(t_{n−1})} = Pr{I(t_n) | I(t_{n−1})}

for any sequence of real numbers t_0 ≤ t_1 < ... < t_{n−1} < t_n. Thus, the transition probability density function for the stochastic process is denoted by p(y, t + ∆t; x, t), where at time t, I(t) = x, and at time t + ∆t we consider I(t + ∆t); the process is time homogeneous if the transition probability density function does not depend on the time t but only on the length of time ∆t. As we discussed in earlier topics, a stochastic process is referred to as a diffusion process if it is a Markov process for which the infinitesimal mean and variance exist. The stochastic SIS epidemic models are time-homogeneous diffusion processes whose infinitesimal mean and variance can be derived from the FPE. Therefore, for the stochastic SIS epidemic model, it can be shown that the probability density function satisfies a forward Kolmogorov differential equation.

This is a second-order PDE, commonly called the FPE. For instance, the FPE of an epidemic model (SIS) can be written as

∂p(x, t)/∂t = −(∂/∂x){ [ β x(N − x)/N − (b + γ) x ] p(x, t) } + (1/2)(∂²/∂x²){ [ β x(N − x)/N + (b + γ) x ] p(x, t) }.   (6.18)

The corresponding SDE can be extracted from (6.18):

dI/dt = β I(N − I)/N − (b + γ) I + √( β I(N − I)/N + (b + γ) I ) dW(t)/dt,   (6.19)

that is,

dI = [ β I(N − I)/N − (b + γ) I ] dt + √( β I(N − I)/N + (b + γ) I ) dW(t),   (6.20)

where S(t) and I(t) are continuous random variables for the numbers of susceptible and infected individuals at time t, with S(t), I(t) ∈ [0, ∞); β is the contact rate, γ the recovery rate, b the birth rate and N the total population. For comprehensive details of stochastic epidemic models the reader is referred to, for example, [24].
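To see what realizations of (6.20) look like, the following MATLAB sketch applies the Euler-Maruyama scheme to the SIS SDE. It is not from the notes: the parameter values, the initial number of infectives, the step size and the clamping of the state at zero are all illustrative assumptions.

% Sketch: Euler-Maruyama simulation of the stochastic SIS model (6.20).
% beta, gamma, b, N, I0, dt and T are illustrative assumptions.
clear
beta=0.6; gamma=0.2; b=0.01; Npop=1000;   % assumed epidemic parameters
I=20;                                     % assumed initial number of infectives
T=50; dt=0.01; nsteps=round(T/dt);
path=zeros(1,nsteps+1); path(1)=I;
for k=1:nsteps
    drift=beta*I*(Npop-I)/Npop-(b+gamma)*I;
    diff2=beta*I*(Npop-I)/Npop+(b+gamma)*I;      % infinitesimal variance
    I=I+drift*dt+sqrt(max(diff2,0))*sqrt(dt)*randn;
    I=max(I,0);          % practical safeguard: keep the state non-negative
    path(k+1)=I;
end
plot((0:nsteps)*dt,path);
xlabel('time t'); ylabel('I(t)'); title('One realization of the SIS model (6.20)');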

6.9 Definition of order of convergence of Numerical scheme

Strong convergence

In numerical methods there are two ways of measuring accuracy, namely strong convergence and weak convergence. For strong convergence, we require an instance of the stochastic process to match, as closely as possible, the exact solution of the process driven by the same random function.

Definition 23 (Strong order of convergence) Under suitable conditions on the SDE, for a fixed time T = N ∆t, the strong order of convergence is β_1 if there exist a positive constant K and a positive constant ∆_0 such that, for all 0 < ∆t < ∆_0,

E{ |X(T) − X̄(T_N)| } ≤ K (∆t)^{β_1},   (6.21)

where X(T) is the exact solution and X̄(T_N) the approximated solution. The strong concept measures the rate at which the mean of the error decays as ∆t → 0. But when one is interested only in the distribution of the random process, such as that of X(t), a less demanding notion suffices. This leads to the concept of weak convergence.

Weak convergence

Strong convergence is often too strict, and in many cases we can attenuate our demands and require only weak convergence. For this type of convergence it is not necessary for the tracks to closely match the exact ones, as long as the characteristics of the stochastic state vector remain the same as those found for the exact solution.

Definition 24 (Weak order of convergence) The weak order of convergence is β_2 if there exist a positive constant K and a positive constant ∆_0 such that, for a fixed time T = N ∆t,

| E{h(X(T), T)} − E{h(X̄(T_N), T_N)} | ≤ K (∆t)^{β_2}   (6.22)

for all 0 < ∆t < ∆_0 and for each function h with polynomial growth.

The errors in (6.21) and (6.22) are global discretisation errors, and the largest possible values of β_1 and β_2 give the corresponding strong and weak orders, respectively, of the scheme. Apart from the order of convergence, the choice of the numerical scheme itself matters. Sometimes it may be required to transform the original stochastic differential equation into Itô or Stratonovich form in order to be able to use a specific scheme, so as to avoid solving numerically a different physical concept. Any reader interested in a rigorous analysis of the accuracy of the numerical schemes is referred to (Stijnen 2002, Kloeden 2003).

6.10 Derivation of Numerical schemes for SDEs

The numerical schemes used for SDEs are quite similar to those used for ODEs. One important difference, however, is that different schemes solve different interpretations of the same differential equation. Therefore, in order to approximate the right solution, care must be taken with the selection of the stochastic numerical scheme. The derivation of the stochastic schemes can be done in several ways, some of which are the expansion of the stochastic Taylor series and the use of derivative-free schemes (Kloeden (1999)).

6.10.1 Stochastic Taylor expansion and derivation of stochastic numerical schemes

The stochastic Taylor expansion requires the inclusion of more terms of the Taylor series so as to obtain a higher order of convergence. Expansions can be derived in both the Stratonovich and the Itô sense, but let us consider only the expansion of the following Itô SDE:

dX(t) = f(t, X(t)) dt + g(t, X(t)) dW(t),   X(t_0) = x_0   (Itô),   (6.23)

91 6.1. DERIVATION OF NUMERICAL SCHEMES FOR SDES 91 with the solution such as X(t) Itô = X(t ) + t t f(s, X(s))ds + t t g(s, X(s))dW (s). (6.24) Let us assume that v is sufficiently smooth function and by the help of 1-dimensional Itô SDE (6.23), the differential of v(t, X(t)) is evaluated and leads to the following Itô s formula: d[v(t, X(t))] = v t t,x(t)dt + f(t, X(t)) v x t,x(t)dt g2 (t, X(t)) 2 v x 2 t,x(t)dt Consequently; dv(t, X(t)) = [ v t + f(t, X(t)) v x + 1 ] 2 g2 (t, X(t)) 2 v dt. x 2 + g(t, X(t)) v dw (t) + odt. x + g(t, X(t)) v dw (t) (6.25) x with the following partial operators; dv = L vdt + L 1 vdw (t), L = + f(t, X(t)) t x g2 (t, X(t)) 2 x 2 L 1 = g(t, X(t)) x. Next, we employ the Itô formula to the drift function f(s, X(s)) in equation (6.24), to get the differential: d[f(s, X(s))] = L fds + L 1 fdw (s), or we write it in the integral form: f(s, X(s)) = f(x(t ), t ) + Similarly for diffusion coefficient function g(s, X(s)) we get d(g(s, X(s))) = L gds + L 1 gdw (s), s s L fdz + L 1 fdw (z). (6.26) t t

92 92CHAPTER 6. CONNECTION BETWEEN STOCHASTIC DIFFERENTIAL AND PDES whose solution is given by g(s, X(s)) = g(x(t ), t ) + s t L gdz + By substituting equations (6.26) and (6.27) into (6.24), we get X(t) = X(t ) + + t t s t L 1 gdw (z). (6.27) t { s s } f(t, X(t )) + L f(z, X(z))dz + L 1 f(x(z), z)dw (z) ds t t t { s s } g(t, X(t )) + L g(z, X(z))dz + L 1 g(z, X(z))dW (z) dw (s). t t t X(t) = X(t ) + f(x(t ), t ) ds + g(t, X(t )) t + t s t t L fdzds + + t t s This leads to a first approximation of the form; t t t s t t dw (s) L 1 fdw (z)ds t L gdzdw (s) + t s X(t) = X(t ) + f(t, X(t ))(t t ) + g(t, X(t ))(W (t) W (t )) + + t s t t s t t L f(x(z), z)dzds + t L g(z, X(z))dW (z)ds + t s t t t s t t t L 1 f(x(z), z)dw (z)ds L 1 gdw (z)dw (s). t L 1 g(z, X(z))dW (z)dw (s). (6.28) X(t) = X(t ) + f(t, X(t ))(t t ) + g(t, X(t )) (W (t) W (t )) + Errt1, (6.29) X(t + t) = X(t) + f(t, X(t)) t + g(t, X(t)) (W (t + t) W (t)), or with t = n t we get an iterative equation: X n+1 = X n + f(t n, X n ) t n + g(t n, X n ) W tn. (6.3) The equation (6.28) is the simplest non-trivial Itô Taylor expansion of X t. It involves integrals with respect to both the time and the Wiener processes, with multiple integrals with respect to both in the remainder. In this derivation it has been assumed that the coefficient functions f and g are sufficiently smooth. Again have applied the Itô s rule

The Itô rule can be applied again to the higher-order terms of the integrand in (6.28) to obtain schemes with a higher order of convergence. Note that the first three terms of (6.28) lead to the stochastic Euler scheme, where

Err_{t,1} = ∫_{t_0}^{t} ∫_{t_0}^{s} L⁰f(z, X(z)) dz ds + ∫_{t_0}^{t} ∫_{t_0}^{s} L¹f(z, X(z)) dW(z) ds
          + ∫_{t_0}^{t} ∫_{t_0}^{s} L⁰g(z, X(z)) dz dW(s) + ∫_{t_0}^{t} ∫_{t_0}^{s} L¹g(z, X(z)) dW(z) dW(s).    (6.31)

Furthermore, consider the error term with the lowest order in equation (6.31),

∫_{t_0}^{t} ∫_{t_0}^{s} L¹g(z, X(z)) dW(z) dW(s).    (6.32)

The next higher-order approximation can be obtained by applying the Itô differentiation formula to the function L¹g, which gives

d[L¹g] = L⁰L¹g dz + L¹L¹g dW(z),    (6.33)

or, in integral form,

L¹g(z, X(z)) = L¹g(t_0, X(t_0)) + ∫_{t_0}^{z} L⁰L¹g(r, X(r)) dr + ∫_{t_0}^{z} L¹L¹g(r, X(r)) dW(r).    (6.34)

Substitution of Eqn. (6.34) into Eqn. (6.28) yields

X(t) = X(t_0) + f(t_0, X(t_0))(t − t_0) + g(t_0, X(t_0))[W(t) − W(t_0)]
     + L¹g(t_0, X(t_0)) ∫_{t_0}^{t} ∫_{t_0}^{s} dW(z) dW(s) + Err_{t,2}.    (6.35)

This result gives a scheme with a higher order of accuracy, thus a more accurate scheme for scalar stochastic differential equations. It is called the Milstein scheme and is defined by

X(t) = X(t_0) + f(t_0, X(t_0))(t − t_0) + g(t_0, X(t_0))[W(t) − W(t_0)]
     + ½ g(t_0, X(t_0)) ∂g/∂x (t_0, X(t_0)) { [W(t) − W(t_0)]² − (t − t_0) } + Err_{t,2},    (6.36)

where the remainder is

Err_{t,2} = ∫_{t_0}^{t} ∫_{t_0}^{s} L⁰f(z, X(z)) dz ds + ∫_{t_0}^{t} ∫_{t_0}^{s} L¹f(z, X(z)) dW(z) ds
          + ∫_{t_0}^{t} ∫_{t_0}^{s} L⁰g(z, X(z)) dz dW(s)
          + ∫_{t_0}^{t} ∫_{t_0}^{s} ∫_{t_0}^{z} L⁰L¹g(r, X(r)) dr dW(z) dW(s)
          + ∫_{t_0}^{t} ∫_{t_0}^{s} ∫_{t_0}^{z} L¹L¹g(r, X(r)) dW(r) dW(z) dW(s).    (6.37)

Let us now analyse the error in the fourth term of Eqn. (6.35):

| ∫_{t_0}^{t} ∫_{t_0}^{s} L¹g(t_0, X(t_0)) dW(z) dW(s) | < K_4 | ∫_{t_0}^{t} ∫_{t_0}^{s} dW(z) dW(s) |,

where K_4 = |L¹g(t_0, X(t_0))| is a known constant. This last error term dominates and determines the strong order of convergence of the Euler scheme. The error can be analysed as follows:

K_4 ∫_{t_0}^{t} (W_s − W_{t_0}) dW_s = K_4 { ∫_{t_0}^{t} W_s dW_s − ∫_{t_0}^{t} W_{t_0} dW_s }
  = K_4 { (W_t² − W_{t_0}²)/2 − (t − t_0)/2 − W_{t_0}(W_t − W_{t_0}) }
  = (K_4/2) { W_t² − W_{t_0}² − 2W_{t_0}(W_t − W_{t_0}) − (t − t_0) }
  = (K_4/2) { [W(t) − W(t_0)]² − (t − t_0) }
  = (K_4/2) [ ΔW²(t_n) − Δt ] = O(Δt).

Note that the stochastic integral (6.32) therefore has a local truncation error of O(Δt). The increments of the Wiener process over successive time steps of size Δt are independent of each other. Due to this fact, we need to add the variances of these local errors in order to obtain the variance of the global error, instead of adding the local errors themselves for every time step. Therefore the stochastic integral (6.32) gives a global truncation error of O(Δt^{1/2}). This term dominates and determines the strong order of convergence of the Euler scheme.

Note that, for weak convergence, many realizations are generated and averaged to determine an approximation of (recall the definition of weak order of convergence) E[h(X_T, T)]. Because of the averaging procedure, the random error terms cancel out and vanish as the number of realizations (samples) increases. As a result, for the weak order of convergence only the deterministic error term has to be taken into account. This results in a weak order of convergence of the Euler scheme of O(Δt). In other words, if we use the Euler scheme and generate many tracks, the individual tracks are only half-order accurate (strong convergence), while, for example, the results on the mean and variance of the tracks are first-order accurate (weak convergence). This is caused by the fact that the errors due to the random term in the SDE, which are present in trackwise computations, cancel out when the ensemble mean is computed. The Milstein scheme, by contrast, has strong order O(Δt) for scalar equations; for vector systems it is generally only O(Δt^{1/2}), which is the same order as that of the Euler scheme. Nevertheless, in the weak sense both schemes have the same order of convergence.
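These orders can be checked empirically. The sketch below (an illustration added here, with assumed parameters) integrates the scalar SDE dX = aX dt + bX dW, whose exact solution X(T) = x_0 exp((a − ½b²)T + bW(T)) is known, with the Euler and Milstein schemes driven by the same Brownian path, and prints the mean absolute endpoint errors for two step sizes. Halving Δt should roughly halve the Milstein error but reduce the Euler error only by a factor of about √2.

% Sketch: empirical strong errors of the Euler and Milstein schemes for
% dX = a*X dt + b*X dW (geometric Brownian motion). Parameters are assumed.
a = 1.5; b = 1.0; x0 = 1; T = 1; M = 2000;          % M realizations
for dt = [2^-8, 2^-9]                                % two step sizes
    N = round(T/dt); errE = zeros(M,1); errM = zeros(M,1);
    for m = 1:M
        dW = sqrt(dt)*randn(N,1); W = sum(dW);
        XE = x0; XM = x0;
        for n = 1:N
            XE = XE + a*XE*dt + b*XE*dW(n);                  % Euler
            XM = XM + a*XM*dt + b*XM*dW(n) ...
                    + 0.5*b*XM*b*(dW(n)^2 - dt);             % Milstein
        end
        Xexact  = x0*exp((a - 0.5*b^2)*T + b*W);             % exact solution
        errE(m) = abs(XE - Xexact); errM(m) = abs(XM - Xexact);
    end
    fprintf('dt = %g: Euler error %g, Milstein error %g\n', dt, mean(errE), mean(errM));
end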

The 1-dimensional Milstein scheme can be written as follows:

X(t) = X(t_0) + f(t_0, X(t_0)) Δt + g(t_0, X(t_0)) ΔW(t_0)
     + ½ g(t_0, X(t_0)) ∂g/∂x { [W(t) − W(t_0)]² − (t − t_0) } + Err_{t,2}.    (6.38)

Thus the Milstein scheme for a scalar stochastic differential equation, which has a higher strong order than the Euler scheme, has been obtained:

X_{n+1} = X_n + f(t_n, X_n) Δt + g(t_n, X_n) ΔW_n + ½ g(t_n, X_n) ∂g/∂x [ ΔW²(t_n) − Δt ].    (6.39)

Further analysis of the error terms in Eqn. (6.37) heuristically leads to numerical schemes with higher orders of convergence. For example, the following error term satisfies

| ∫_{t_0}^{t} ∫_{t_0}^{s} L⁰f(z, X(z)) dz ds | < K_1 ∫_{t_0}^{t} ∫_{t_0}^{s} dz ds = O(Δt²),

where K_1 is a constant. This deterministic error term introduces a local error of O(Δt²) and, as a consequence, a global error of O(Δt). Next, consider

| ∫_{t_0}^{t} ∫_{t_0}^{s} L¹f(z, X(z)) dW(z) ds | < K_2 | ∫_{t_0}^{t} ∫_{t_0}^{s} dW(z) ds | = O(ΔW(t_n) Δt) = O(Δt^{1.5}),

| ∫_{t_0}^{t} ∫_{t_0}^{s} L⁰g(z, X(z)) dz dW(s) | < K_3 | ∫_{t_0}^{t} ∫_{t_0}^{s} dz dW(s) | = O(Δt ΔW(t_n)) = O(Δt^{1.5}).

These two stochastic terms introduce a strong local error of O(Δt^{1.5}) and, as a consequence, a strong global error of O(Δt). One can keep expanding the next multiple Itô integrals to an arbitrarily high order. Each time, the remainder will involve the next set of multiple Itô integrals with non-constant integrands. Some of these integrals can be solved analytically. Take the following examples, whose solutions are well known and are given, for example, in [15]:

∫_{t_0}^{t} ds = t − t_0,
∫_{t_0}^{t} ∫_{t_0}^{s} dz ds = ½ (t − t_0)²,
∫_{t_0}^{t} ∫_{t_0}^{s} dW(z) dW(s) = ½ { [W(t) − W(t_0)]² − (t − t_0) }.
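The last identity can also be verified by simulation, by approximating the double integral on a fine partition of [t_0, t] and comparing it with ½{[W(t) − W(t_0)]² − (t − t_0)} for the same path. The small check below is an illustration added here, with an assumed interval and partition.

% Sketch: Monte Carlo check of  int int dW(z) dW(s) = 0.5*((W(t)-W(t0))^2 - (t-t0)).
% Interval and partition are assumed for illustration.
t0 = 0; t = 1; N = 1e4; dt = (t - t0)/N;
dW = sqrt(dt)*randn(N,1);
W  = [0; cumsum(dW)];                          % W(t0) = 0, path on the partition
I  = sum(W(1:N).*dW);                          % Ito sum with left-endpoint evaluation
rhs = 0.5*((W(end) - W(1))^2 - (t - t0));
disp([I, rhs])                                 % the two numbers should be close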

However, it is not possible to derive an analytical expression for most of the stochastic integrals. Consequently, these terms also have to be evaluated numerically. Higher-order numerical methods based on the stochastic Taylor expansion are therefore generally not very useful in practice. The exception is that in some applications, owing to the specific nature of the functions f and g, terms in the stochastic Taylor expansion may drop out; in this way a higher-order scheme can still be obtained. Having discussed the stochastic Taylor expansion, it is now time to discuss various numerical schemes in the following section.

6.1.2 Numerical schemes

The numerical schemes that can be used to implement Itô or Stratonovich SDEs, such as equations (5.17) and (5.19) respectively, are different. Therefore, in order to approximate the right solution, care must be taken with the selection of the numerical scheme, see (Kloeden 2003, Milstein 1995). The schemes discussed in this section can only be applied to either Itô or Stratonovich SDEs. Because transformation rules between these two interpretations exist, as seen earlier, the selection of the numerical scheme is unrestricted, as long as the model is given in the right interpretation, or is transformed accordingly.

Euler scheme

The Euler scheme is a result of the stochastic Taylor expansion. The basic Euler scheme for scalar SDEs is derived from the Itô stochastic differential equation (6.23) (see Eqn. (6.30) in Section 6.1.1 for the 1-dimensional Euler scheme):

X_{n+1} = X_n + f(t_n, X_n) Δt_n + g(t_n, X_n) ΔW(t_n).    (6.40)

If Y(t) = X_2(t), the 2-dimensional Itô SDE with a 2-dimensional Brownian process is written as

X_{n+1} = X_n + f(t_n, X_n) Δt_n + g(t_n, X_n) ΔW_1(t_n),
Y_{n+1} = Y_n + f(t_n, Y_n) Δt_n + g(t_n, Y_n) ΔW_2(t_n).

The scheme computes discrete approximations X_n ≈ X(t_n) at times t_n = Σ_{l=0}^{n−1} Δt_l. In practice it is common to use a single pre-chosen value for the step size Δt_l. The stochastic Euler scheme is consistent with the Itô calculus because the noise term in (6.40) approximates the relevant stochastic integral over [t_n, t_{n+1}] by evaluating the integrand at the lower end point, thus

∫_{t_n}^{t_{n+1}} g(s, X(s)) dW(s) ≈ g(t_n, X_n) ΔW(t_n).

As in the scalar case, the vector stochastic Euler scheme has strong convergence order β_1 = 1/2 and weak order β_2 = 1 (see e.g. Desmond (2001)). The Euler scheme is applicable only in the sense of the Itô interpretation; therefore, if we are given Stratonovich SDEs, we first have to transform them into their Itô equivalents.
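As a small illustration of that last remark, the sketch below (added here) converts a scalar Stratonovich SDE into its Itô equivalent by adding the drift correction ½ g ∂g/∂x before applying the Euler scheme (6.40); the example coefficient functions are assumptions chosen only for illustration.

% Sketch (not from the notes): Euler scheme applied to a Stratonovich SDE
%   dX = fs(t,X) dt + g(t,X) o dW
% after converting it to Ito form by adding the drift correction 0.5*g*dg/dx.
fs = @(t,x) -x;            % assumed Stratonovich drift
g  = @(t,x) 0.5*x;         % assumed diffusion coefficient
gx = @(t,x) 0.5;           % dg/dx, known analytically for this example
fi = @(t,x) fs(t,x) + 0.5*g(t,x).*gx(t,x);   % equivalent Ito drift

dt = 1e-3; N = 1000; X = zeros(1,N+1); X(1) = 1;
for n = 1:N
    dW = sqrt(dt)*randn;
    X(n+1) = X(n) + fi(n*dt, X(n))*dt + g(n*dt, X(n))*dW;   % Euler step (6.40)
end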

Milstein Scheme

In Section 6.1.1, equation (6.39) is what we call the Milstein scheme. This is a more accurate scheme for scalar stochastic differential equations than the Euler scheme (see, for example, the paper [1] for more details about the two schemes):

X_{n+1} = X_n + f(t_n, X_n) Δt + g(t_n, X_n) ΔW_n + ½ g(t_n, X_n) ∂g/∂x ( ΔW²(t_n) − Δt ).    (6.41)

The Milstein scheme is O(Δt), i.e. β_1 = 1, in the strong sense for scalar equations. For vector systems it is generally only O(Δt^{1/2}), i.e. β_1 = 1/2. In the weak sense the Milstein scheme has the same order of convergence as the Euler scheme. The 2-dimensional Milstein scheme can be written as follows:

X_{n+1} = X_n + f(t_n, X_n) Δt + g(t_n, X_n) ΔW_n + ½ g(t_n, X_n) ∂g/∂x ( ΔW²(t_n) − Δt ),
Y_{n+1} = Y_n + f(t_n, Y_n) Δt + g(t_n, Y_n) ΔW_n + ½ g(t_n, Y_n) ∂g/∂y ( ΔW²(t_n) − Δt ).

Note that the partial derivative of the diffusion coefficient g must be available. Furthermore, similar lines with minor changes can be followed to derive the Stratonovich schemes. Note that the development of higher-order schemes based on Taylor expansions requires more and more derivatives of the coefficient functions f and g. It is also necessary to deal with multiple stochastic integrals (see Section 6.1.1), which cannot be calculated exactly anymore and thus require numerical approximations. An alternative is the use of derivative-free explicit schemes, which avoid the derivatives of the drift f and the diffusion coefficient g. Examples of such schemes are the Heun and the Runge-Kutta schemes; more information can be found in [13, 12, 14].

Heun Scheme

The Heun scheme evaluates both the f and g functions at the current point as well as at an estimated succeeding point, and the results of both evaluations are averaged to get the definitive rate of change. Using these rates, an improved approximation of the next point is made:

X*_{n+1} = X_n + f(t_n, X_n) Δt + g(t_n, X_n) ΔW(t_n),
X_{n+1} = X_n + ½ { f(t_n, X_n) + f(t_{n+1}, X*_{n+1}) } Δt + ½ { g(t_n, X_n) + g(t_{n+1}, X*_{n+1}) } ΔW(t_n),
Y*_{n+1} = Y_n + f(t_n, Y_n) Δt + g(t_n, Y_n) ΔW(t_n),
Y_{n+1} = Y_n + ½ { f(t_n, Y_n) + f(t_{n+1}, Y*_{n+1}) } Δt + ½ { g(t_n, Y_n) + g(t_{n+1}, Y*_{n+1}) } ΔW(t_n).

The Heun scheme is sometimes called the improved Euler scheme. It uses the basic Euler method as an intermediate step and is an example of a predictor-corrector method. This means that we first make a prediction of the next value of X and, using this predicted value, apply some form of correction to get our final estimate. The Heun scheme uses the basic Euler method to find an intermediate value for X_{n+1}, denoted above by X*_{n+1}. Using this value, the rate of change at time t_{n+1} is determined. The final estimate is then made by using the average of the slope at the current point and the approximated slope at the next time step. This scheme can only be used for SDEs formulated in the Stratonovich sense. For Stratonovich SDEs the Heun scheme has order O(Δt), i.e. β_1 = 1, in the strong sense and O(Δt), i.e. β_2 = 1, in the weak sense [13, 1, 11].

Runge-Kutta

The stochastic extension of the Runge-Kutta scheme described earlier solves Stratonovich SDEs and is given by

K_0 = f(X_i, t_i),    G_0 = g(X_i, t_i),
X_i^{(0)} = X_i + ½ K_0 Δt_i + ½ G_0 ΔW_i,
K_1 = f(X_i^{(0)}, t_i + ½ Δt_i),    G_1 = g(X_i^{(0)}, t_i + ½ Δt_i),
X_i^{(1)} = X_i + ½ K_1 Δt_i + ½ G_1 ΔW_i,
K_2 = f(X_i^{(1)}, t_i + ½ Δt_i),    G_2 = g(X_i^{(1)}, t_i + ½ Δt_i),
X_i^{(2)} = X_i + K_2 Δt_i + G_2 ΔW_i,
K_3 = f(X_i^{(2)}, t_i + Δt_i),    G_3 = g(X_i^{(2)}, t_i + Δt_i),    (6.42)
X_{i+1} = X_i + (1/6)(K_0 + 2K_1 + 2K_2 + K_3) Δt_i + (1/6)(G_0 + 2G_1 + 2G_2 + G_3) ΔW_i.

Let us consider the diffusion model represented by the SDE

dX = aX dt + b dW(t),    X(0) = x_0,    (6.43)

with a and b constants. The initial distribution of the variable X is given by a delta peak located at x_0. For negative values of a, the process is of the Ornstein-Uhlenbeck type and will converge to a stable distribution. When a equals zero, the model reduces to a scaled version of the Wiener process. The model is stochastic due to the presence of the Brownian motion term W(t), and Gaussian as a result of this and the linearity of the terms. Note that these properties obviously hold only for non-zero values of b. Finally, the process is Markov: in order to determine the value of X_{t+1}, an instance of the process X at time t + 1, we only need to know X_t. The following program implements three numerical schemes that approximate the SDE (6.43):

clear;
a = -1; b = 1; x0 = 1;                 % model parameters and initial value
steps = 500;                           % number of time steps (assumed value)
n = 1;                                 % n = 1: a single sample (strong sense);
                                       % use many samples, e.g. n = 1000, for the weak sense
t = linspace(0, 1, steps+1); dt = t(2) - t(1);
xeuler = ones(n,length(t)) * x0; xheun = ones(n,length(t)) * x0;
xrunge = ones(n,length(t)) * x0;
for i = 2:length(t)
    dwiener = sqrt(dt) * randn(n,1);
    % Euler scheme
    xeuler(:,i) = xeuler(:,i-1) + a * xeuler(:,i-1) * dt + b * dwiener;
    % Heun scheme (predictor-corrector)
    xheunstar   = xheun(:,i-1) + a * xheun(:,i-1) * dt + b * dwiener;
    xheun(:,i)  = xheun(:,i-1) + .5 * a * (xheunstar + xheun(:,i-1)) * dt ...
                  + b * dwiener;
    % Stochastic Runge-Kutta scheme (6.42), with f(x) = a*x and g(x) = b
    k  = xrunge(:,i-1);
    y  = xrunge(:,i-1) + .5 * a * k  * dt + .5 * b * dwiener;
    k1 = y;
    y1 = xrunge(:,i-1) + .5 * a * k1 * dt + .5 * b * dwiener;
    k2 = y1;
    y2 = xrunge(:,i-1) +      a * k2 * dt +      b * dwiener;
    k3 = y2;
    xrunge(:,i) = xrunge(:,i-1) + a * (k + 2*k1 + 2*k2 + k3) * dt/6 + b * dwiener;
end
figure
h = plot(t, xeuler(:,:), 'b', t, xheun(:,:), 'g--', t, xrunge(:,:), 'r.-');
set(h, 'LineWidth', 2); set(gca, 'fontsize', 20);
xlabel('time'); ylabel('X(t)');

Exercise 8 Using the fact that the process X_t is Gaussian, show that the mean and variance of the process described by equation (6.43) are

E{X(t)} = e^{at} x_0    and    σ²(t) = var{X(t)} = b² (e^{2at} − 1) / (2a).
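A quick way to sanity-check these formulas numerically (an illustration added here, with assumed parameter values) is to average many Euler realizations of (6.43) and compare the sample mean and variance at the final time with the expressions above.

% Sketch: Monte Carlo check of the mean and variance of the model (6.43).
% Parameter values are assumed for illustration only.
a = -1; b = 1; x0 = 1; T = 1; dt = 1e-3; N = round(T/dt); M = 10000;
X = x0*ones(M,1);
for n = 1:N
    X = X + a*X*dt + b*sqrt(dt)*randn(M,1);   % Euler-Maruyama step
end
[mean(X), exp(a*T)*x0]                        % sample vs analytic mean
[var(X), b^2*(exp(2*a*T)-1)/(2*a)]            % sample vs analytic variance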


Chapter 7

Application of SDEs

7.1 Introduction to particle models and their application to model transport in shallow water

There are basically two ways to look at the movement of a group of particles. The first approach is to describe what happens at a fixed point (or region) in space and time; this is called the Eulerian viewpoint. It allows the observation of the phenomena at a specific location. The other approach, the Lagrangian method, follows the particles through space at every time step and allows the observation of fluctuations in the paths of the particles. The deterministic approach of solving the differential equations is closely related to the Eulerian approach, while the stochastic approach is connected with the Lagrangian approach. The latter is the one we shall discuss in the application part of this course. Some basic properties, characteristics and background information of this method are briefly introduced in the following sections.

7.2 Diffusion and dispersion

7.2.1 Molecular diffusion

In this section some equations and concepts underlying molecular diffusion are presented. Molecular diffusion itself is not of great direct consequence in environmental problems; it needs to be taken into account only on a microscopic scale. However, in many cases environmental dispersion problems can be described by processes that are strongly analogous to molecular diffusion, but on a larger scale. Also, as we will see later, molecular diffusion plays a significant role in multi-particle models. The law of molecular diffusion was first formulated by the German physiologist Adolf Fick in 1855. Fick's law says that the flux of solute mass, that is, the mass of solute crossing a unit area per unit time in a given direction, is proportional to the gradient of the solute concentration in that direction [5].

7.3 Molecular diffusion with a constant diffusion coefficient

Diffusion takes place at different scales, the smallest one being the molecular level. Firstly we shall assume that the diffusion coefficient is constant. When considering molecular diffusion, it is Fick's law that plays the main role. Fick's law states that the mass of a solute crossing a unit area per unit time in a given direction is proportional to the gradient of the solute concentration in that direction (Fischer (1979) [5]). Another relationship between flux and concentration, irrespective of how the molecules are transported, comes from the conservation of mass. Combining this relationship with Fick's law, the following diffusion equation in three dimensions can be obtained:

∂C/∂t = D ( ∂²C/∂x² + ∂²C/∂y² + ∂²C/∂z² ),    (7.1)

where C(t; x, y, z) stands for the concentration in kg/m³ and D is the diffusion coefficient in m²/s, still at the molecular level.

In 1828 the botanist R. Brown, observing pollen particles suspended in a fluid, noted that they moved in an irregular and random way. This random movement of a particle is due to the bombardment of the particle by the molecules of the fluid, which are constantly in motion. These collisions make the particles quickly lose track of their previous velocities. In 1905 Albert Einstein [2] suggested that the diffusion process could be modelled as a Brownian process. He showed that, as long as the diffusion coefficient remains constant, the average squared distance over which a single particle travels from its starting point is linearly proportional to time:

⟨x²⟩ = 2Dt,    (7.2)

where ⟨·⟩ denotes the ensemble average over many trials, D is the diffusion coefficient (m²/s, i.e. a unit area per unit time) and is independent of the dimension of the process, t is time and x is the distance. As time increases, the most probable position of the particle lies on an expanding spherical surface centred at its starting location (see [11] for more details). The area of this surface increases linearly with time, since the distance of the surface from the starting point increases with the square root of time. In this way, the diffusion coefficient can be imagined as the rate at which that probability sphere grows. It has been shown that, if the concentration is interpreted as a probability density function, the diffusion equation becomes consistent with a particle model (Heemink 1990 [8]).
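Relation (7.2) is easy to check with a simple particle simulation: move many independent particles by Gaussian increments of variance 2DΔt and monitor the mean-square displacement. The sketch below is an illustration added here, with assumed values of D, the time step and the number of particles.

% Sketch: verify <x^2> = 2*D*t for Brownian particles (parameter values assumed).
D = 1; dt = 0.01; nsteps = 500; npart = 5000;
x = zeros(npart,1);
msd = zeros(1,nsteps);
for n = 1:nsteps
    x = x + sqrt(2*D*dt)*randn(npart,1);   % displacement with variance 2*D*dt
    msd(n) = mean(x.^2);                   % ensemble-averaged square distance
end
t = (1:nsteps)*dt;
plot(t, msd, t, 2*D*t, '--'); xlabel('t'); ylabel('<x^2>');
legend('simulated', '2Dt');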

Let us consider an example of a one-dimensional diffusion model and two different ways of solving it. Consider the spreading of a cloud of pollutant in a channel with constant depth and constant diffusion coefficient. At this point we do not consider advection; we only consider the dispersion of the cloud of particles. The partial differential equation describing the evolution of the concentration of this process, with the initial deployment of particles at x = 0 at time t = t_0, is given by

∂C/∂t = D ∂²C/∂x²,    C(t_0, x) = δ(x).    (7.3)

The solution of this equation is known exactly and is given by

C(t, x) = 1/√(4πDt) e^{−x²/(4Dt)}    (7.4)

under the assumption that the domain is infinite (see (Stijnen 2002) for more details). The behaviour of the cloud can on the one hand be obtained by discretizing the diffusion equation (7.3), but it can equally well be obtained by using the stochastic approach. Traditional Brownian motion can be used to model the diffusion process. The positions of the particles are randomly disturbed as follows:

dX_t = σ dW_t  (Itô),    X(0) = X_0,    (7.5)

where dW_t is standard Brownian motion with E[dW_t] = 0 and E[dW_t dW_t] = dt. The constant σ is an indicator of the intensity of the diffusion. In order to find out how the probability density of this process varies in space and time, we use an advection-diffusion type partial differential equation, called the Fokker-Planck equation (FPE). For this particular example the FPE for the probability density function p(t, x) reads

∂p/∂t = ½ σ² ∂²p/∂x²,    p(t_0, x) = δ(x − X_0).    (7.6)

Note that when one thinks in terms of particles, one must realize that the probability density at a location is in fact the concentration at that location (more details can be found in (Stijnen 2002, Fischer 1979)): the concentration C(t, x) at a certain location x is nothing more than the probability that a particle ends up at that location. Thus, when we substitute σ² = 2D (this comes from equation (7.2)) and p(t, x) = C(t, x) in equation (7.6), we end up with the original diffusion equation (7.3). It can be concluded that they describe the same process and therefore have the same solution:

p(t, x) = 1/√(2πσ²t) e^{−x²/(2σ²t)} = N(0, σ²t),    (7.7)

where N(0, σ²t) denotes a normal distribution with zero mean and variance σ²t = 2Dt. Therefore equation (7.5) is consistent with equation (7.3).
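This consistency can be illustrated numerically by releasing many particles at x = 0, moving them according to (7.5) with σ = √(2D), and comparing a histogram of their positions with the analytical solution (7.4). The sketch below is an illustration added here, with assumed parameter values.

% Sketch: particle model (7.5) versus the analytic solution (7.4). Values assumed.
D = 1; T = 1; dt = 0.01; nsteps = round(T/dt); npart = 20000;
x = zeros(npart,1);
for n = 1:nsteps
    x = x + sqrt(2*D)*sqrt(dt)*randn(npart,1);     % dX = sigma dW with sigma = sqrt(2D)
end
[counts, centers] = hist(x, 50);
binw = centers(2) - centers(1);
plot(centers, counts/(npart*binw), 'o', ...
     centers, exp(-centers.^2/(4*D*T))/sqrt(4*pi*D*T), '-');
legend('particle histogram', 'analytic C(T,x)');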

The Eulerian approach is thus an alternative to the Lagrangian approach, in which one follows the tracks of the individual particles described by the SDE. Note that the FPE is an advection-diffusion type partial differential equation that describes the evolution of the (conditional) probability density function.

Example 21 For the more general one-dimensional SDE given by the Itô equation

dX(t) = f(t, X_t) dt + g(t, X_t) dW(t),    X(0) = X_0,    (7.8)

the transition probability density function p = p(t, x | t_0, x_0) of the stochastic process X_t is propagated according to the following Fokker-Planck equation:

∂p/∂t = −∂/∂x [ f(t, x) p ] + ½ ∂²/∂x² [ g²(t, x) p ],    p(t_0, x | t_0, x_0) = δ(x − x_0).    (7.9)

Note that the SDE (7.8) must be interpreted in the Itô sense if it is to be consistent with the FPE (7.9). The solution of the FPE is the conditional probability density function associated with the solution X_t of the stochastic differential equation (7.8). Therefore, the PDE (7.9) is completely consistent with the Lagrangian description of equation (7.8). It gives a complete description of the variation in time and space of the probability density function. More detailed information can be found in Heemink (1990).

For a one-dimensional diffusion process, Fick's law can be stated mathematically as

q = −D ∂C/∂x,    (7.10)

where q is the solute mass flux, C is the concentration and D is the coefficient of proportionality, called the diffusion coefficient. The minus sign indicates that transport is from high to low concentration. For diffusion in three dimensions Fick's law can be written as

q = −D ( ∂C/∂x_1, ∂C/∂x_2, ∂C/∂x_3 ).    (7.11)

Equations (7.10) and (7.11) can be written in another form if we use the law of mass conservation:

∂C/∂t = D ∂²C/∂x²    (7.12)

and

∂C/∂t = D ( ∂²C/∂x_1² + ∂²C/∂x_2² + ∂²C/∂x_3² ).    (7.13)

Let us note that equations (7.12) and (7.13) describe the spreading of mass in a fluid with no mean velocity.
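The Eulerian route mentioned above, discretizing the diffusion equation directly, can be sketched as follows for (7.12). This is an illustration added here: the grid, time step and diffusion coefficient are assumed values, and the explicit scheme requires D Δt/Δx² ≤ 1/2 for stability.

% Sketch: explicit finite-difference solution of dC/dt = D d2C/dx2, eq. (7.12).
% Grid, time step and D are assumed values chosen for illustration.
D = 1; L = 10; nx = 201; x = linspace(-L, L, nx); dx = x(2) - x(1);
dt = 0.4*dx^2/D;                      % satisfies the stability condition D*dt/dx^2 <= 1/2
nt = 200;
C = zeros(1, nx); [~, i0] = min(abs(x)); C(i0) = 1/dx;   % approximate delta-peak initial condition
for n = 1:nt
    lap = [0, diff(C,2), 0]/dx^2;     % second difference, kept zero at the boundaries
    C = C + D*dt*lap;
end
T = nt*dt;
plot(x, C, x, exp(-x.^2/(4*D*T))/sqrt(4*pi*D*T), '--');   % compare with (7.4)
legend('finite differences', 'analytic solution (7.4)');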

Figure 7.1: Concentration profiles C(x, t; x_0, t_0) as a function of x at times t = 1/π, 1/(4π), 1/(16π) and 1/(64π). The initial delta function can be thought of as a spike distribution. This illustration uses M = 1 kg and D = 1 m²/s.

Example
Suppose that the solute is spreading in one direction. Suppose also that an initial slug of mass M was introduced at time t_0 at the origin x_0, and that there are no boundaries to prevent the mass from diffusing to infinity in both directions. Mathematically this initial condition can be written as

C(x, t_0; x_0, t_0) = M δ(x − x_0).    (7.14)

Physically, the delta function represents a unit mass concentrated into an infinitely small space with an infinitely large concentration, and M δ(x − x_0) represents a mass M concentrated into a very small space. For example, if an accident takes place and some pollutant is spilled, we can represent the initial concentration distribution by a delta function [25]. The solution of the one-dimensional diffusion equation (7.12) with the initial condition (7.14) is the function

C(x, t; x_0, t_0) = M / √(4πD(t − t_0)) exp( −(x − x_0)² / (4D(t − t_0)) ).

With M = 1 kg, C(x, t; x_0, t_0) is the density function of a normal distribution with parameters (x_0, 2D(t − t_0)). Examples of this concentration at different times t are shown in Figure 7.1.

7.4 Molecular diffusion with a space varying diffusion coefficient

Here we are interested in the stochastic version of the diffusion equation in which the diffusion coefficient varies in space:

∂C/∂t = ∂/∂x ( D(x) ∂C/∂x ),

with C(t, x) the concentration and D(x) the diffusion coefficient. Rewriting this equation leads to

∂C/∂t = ∂²/∂x² ( D C ) − ∂/∂x ( C dD/dx ).

With the aid of the FPE (7.9), and by substituting C(t, x) = p(t, x), it is possible to identify the drift f(t, X_t) and the diffusion g(t, X_t) that are necessary for a description of the stochastic model:

f(t, X_t) = dD/dx,    g(t, X_t) = √(2D).

Finally, we find the following stochastic differential equation:

dX(t) = dD/dx dt + √(2D) dW(t).
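A particle implementation of this SDE has to include the drift correction dD/dx; omitting it would unphysically concentrate particles in regions of low diffusivity. The sketch below (added here, with an assumed smooth profile D(x)) advances particles with the Euler scheme.

% Sketch: Euler scheme for dX = D'(X) dt + sqrt(2*D(X)) dW with space-varying D.
% The diffusivity profile D(x) is an assumed example.
Dfun  = @(x) 1 + 0.5*sin(x);        % assumed D(x) > 0
Dprim = @(x) 0.5*cos(x);            % its derivative dD/dx
dt = 0.01; nsteps = 1000; npart = 10000;
x = zeros(npart,1);
for n = 1:nsteps
    dW = sqrt(dt)*randn(npart,1);
    x  = x + Dprim(x)*dt + sqrt(2*Dfun(x)).*dW;   % drift correction + diffusion
end
hist(x, 50);                         % particle positions after nsteps steps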

7.5 Advection-diffusion process for a two dimensional model

The mathematical description of the transport processes developed in this section is founded on the relationship between the advection-diffusion equation and Kolmogorov's forward equation, also known as the Fokker-Planck equation (FPE). The transport of substances in shallow waters is often described by the depth-averaged advection-diffusion equation:

∂(HC)/∂t = − Σ_{i=1}^{2} ∂/∂x_i ( U_i H C ) + Σ_{i=1}^{2} Σ_{j=1}^{2} ∂/∂x_i ( H D_{i,j} ∂C/∂x_j ) + S + Q,    (7.15)

where H is the water depth, C the concentration, U_i the flow velocity in the x_i direction, D_{i,j} the dispersion coefficient in the x_i-direction due to the concentration gradient in the x_j-direction, and S and Q are terms catering for sinks and sources. The advection-diffusion equation is widely applied in a variety of engineering problems. For example, with S = Q = 0, equation (7.15) can be used for the prediction of the dispersion of pollutants in shallow waters (see Heemink (1990), Robert (2005), for example).

In addition to Eqn. (7.15), boundary conditions are needed. Boundaries in transport models are defined by physical boundaries such as banks, shores, the water surface and the bed, or by numerical (open) boundaries positioned at, for instance, tidal inlets. At a closed boundary a Neumann boundary condition is often prescribed, which excludes mass transfer through such a boundary; mathematically this is denoted by ∂C/∂n = 0, with n the normal vector to the boundary.

Dirichlet boundary conditions are often imposed at, for instance, bottom or open boundaries, to prescribe a fixed concentration [4]. In regions far away from the discharge location it is sometimes justified to prescribe C = 0 at open boundaries. In sediment transport problems an equilibrium bed concentration C_e is sometimes assumed, so that the bottom boundary condition may be C = C_e. Initial conditions address a concentration distribution measured at the initial state and account for instantaneous discharges of, for instance, waste material. More information on the initial and boundary conditions can be found in [4, 8]. In many practical situations the analytical solution of equation (7.15) cannot easily be obtained, which creates the need to approximate it numerically.

7.6 Consistence of particle model with the ADEs

The position of a particle (X(t), Y(t)) at time t is assumed to be a Markov process. Thus a 2-dimensional system of Itô SDEs describing the position of a particle is given by the following equations:

dX(t) = [ U + (D ∂H/∂x)/H + ∂D/∂x ] dt + √(2D) dW_1(t)  (Itô),    (7.16)
dY(t) = [ V + (D ∂H/∂y)/H + ∂D/∂y ] dt + √(2D) dW_2(t)  (Itô).    (7.17)

Here D(x, y) stands for the dispersion coefficient, and W(t) = (W_1(t), W_2(t)) is a Wiener process with independent increments that are normally distributed with zero mean and variance Δt. The probability density function p(x, y, t) describing the variation in time and space of the positions of the particles in two dimensions is given by the Fokker-Planck equation. Thus the probability density function p(x, y, t), t ≥ t_0, is determined by the following Itô Fokker-Planck equation (Heemink (1990)):

∂p/∂t = − ∂/∂x [ (U + (D ∂H/∂x)/H + ∂D/∂x) p ] − ∂/∂y [ (V + (D ∂H/∂y)/H + ∂D/∂y) p ]
        + ½ ∂²/∂x² (2Dp) + ½ ∂²/∂y² (2Dp),    (7.18)

with the initial condition

p(x, y, t_0) = δ(x − x_0) δ(y − y_0)    (7.19)

corresponding to the Itô SDEs (7.16)-(7.17). If we relate the particle concentration to the probability density function p via

C(x, y, t) = p(x, y, t) / H(x, y, t)    (7.20)

and substitute Eqn. (7.20) into the Fokker-Planck Eqn. (7.18), the resulting equation is the advection-diffusion equation.

It was shown in [8] that the underlying SDEs (7.16)-(7.17) are consistent with the 2-dimensional advection-diffusion equation (7.15) with S = Q = 0:

∂(HC)/∂t = − ∂(HUC)/∂x − ∂(HVC)/∂y + ∂/∂x ( H D ∂C/∂x ) + ∂/∂y ( H D ∂C/∂y ).    (7.21)

The drift and diffusion coefficients of the Fokker-Planck Eqn. (7.18), i.e. A_i and L_{i,j} respectively, are

A_i = U_i + Σ_{j=1}^{2} ( D_{i,j} ∂H/∂x_j )/H + Σ_{j=1}^{2} ∂D_{i,j}/∂x_j,    L_{i,j} = D_{i,j},

and p = CH is the relation between C and p; with this relation, equation (7.18) is referred to as the Itô Fokker-Planck equation. However, the Stratonovich integration rule leads to different values of A_i and L_{i,j}. Both rules are correct in the sense that both can be used in the simulation process. By matching the Fokker-Planck equation with the advection-diffusion equation, the underlying particle model is shown to be consistent with the ADE. The drift component A_i consists of a contribution due to the local flow velocity and a contribution due to a correction term. The hydrodynamic flow model provides the inputs to the particle model. It also provides a diffusion tensor which is isotropic in the horizontal plane, that is, the off-diagonal elements of the diffusion tensor are set to zero while the diagonal elements are equal, D_11 = D_22 = D_H.

The first numerical experiment below shows how a cloud of particles spreads in the absence of the drift term. The model SDEs (7.16)-(7.17) were implemented and the simulation was carried out with the drift/advective part set equal to zero; in this case we were interested only in the spreading of particles due to the diffusion process. The spreading cloud of particles takes a Gaussian shape, and if the variance of the cloud is determined one can see that it grows linearly with time. The particles were released at location (x, y) = (0, 0) in an ideal, empty domain. The domain was partitioned into grid cells of equal size (Δx = Δy), and all particles were assumed to be of the same size and the same mass. The integration of the particle movement was done with an Euler numerical scheme with a fixed step Δt = 86.4 s, and the simulation was carried out for a number of iterations.

The second numerical experiment shows how a cloud of particles spreads in the presence of both the drift term and the diffusion term. The model SDEs (7.16)-(7.17) were implemented and the simulation was carried out to see how the spreading cloud of particles behaves. The particles were released at location (x, y) = (0, 0) in an ideal domain in which artificial flow fields were created. The domain was again partitioned into grid cells of equal size (Δx = Δy), and all particles were assumed to be of the same size and the same mass. The integration of the particle movement was done with an Euler numerical scheme with a fixed step Δt = 86.4 s; snapshots of the particle positions were taken at various times, and the results obtained are shown below.
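A minimal sketch of the particle-stepping loop behind such an experiment is given below; it is an illustration added here, with constant, artificial fields U, V, H and D (assumed values, so that the correction terms vanish) rather than the output of a hydrodynamic model.

% Sketch: Euler stepping of the particle model (7.16)-(7.17) with constant,
% artificial fields U, V, H and D (assumed values; correction terms are zero here).
U = 0.2; V = 0.05; D = 10; dt = 86.4; nsteps = 1000; npart = 5000;
x = zeros(npart,1); y = zeros(npart,1);          % release at (0,0)
for n = 1:nsteps
    dW1 = sqrt(dt)*randn(npart,1);
    dW2 = sqrt(dt)*randn(npart,1);
    x = x + U*dt + sqrt(2*D)*dW1;                % eq. (7.16) with constant H and D
    y = y + V*dt + sqrt(2*D)*dW2;                % eq. (7.17)
end
plot(x, y, '.'); xlabel('x [m]'); ylabel('y [m]');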

[Figure: four snapshots of the particle positions, plotted as y (grid index m) against x (grid index n), at the release time and at t = 5, 15 and 35 days.]

A real-life application of this model (7.16)-(7.17), predicting the dispersion of pollutants in the Dutch coastal waters, was simulated by Heemink (1990) [8].



Stochastic Processes II/ Wahrscheinlichkeitstheorie III. Lecture Notes BMS Basic Course Stochastic Processes II/ Wahrscheinlichkeitstheorie III Michael Scheutzow Lecture Notes Technische Universität Berlin Sommersemester 218 preliminary version October 12th 218 Contents

More information

Probability and Measure

Probability and Measure Chapter 4 Probability and Measure 4.1 Introduction In this chapter we will examine probability theory from the measure theoretic perspective. The realisation that measure theory is the foundation of probability

More information

Stochastic Differential Equations

Stochastic Differential Equations CHAPTER 1 Stochastic Differential Equations Consider a stochastic process X t satisfying dx t = bt, X t,w t dt + σt, X t,w t dw t. 1.1 Question. 1 Can we obtain the existence and uniqueness theorem for

More information

M5A42 APPLIED STOCHASTIC PROCESSES

M5A42 APPLIED STOCHASTIC PROCESSES M5A42 APPLIED STOCHASTIC PROCESSES Professor G.A. Pavliotis Department of Mathematics Imperial College London, UK LECTURE 1 06/10/2016 Lectures: Thursdays 14:00-15:00, Huxley 140, Fridays 10:00-12:00,

More information

1 Presessional Probability

1 Presessional Probability 1 Presessional Probability Probability theory is essential for the development of mathematical models in finance, because of the randomness nature of price fluctuations in the markets. This presessional

More information

Selected Exercises on Expectations and Some Probability Inequalities

Selected Exercises on Expectations and Some Probability Inequalities Selected Exercises on Expectations and Some Probability Inequalities # If E(X 2 ) = and E X a > 0, then P( X λa) ( λ) 2 a 2 for 0 < λ

More information

Stochastic Processes. Winter Term Paolo Di Tella Technische Universität Dresden Institut für Stochastik

Stochastic Processes. Winter Term Paolo Di Tella Technische Universität Dresden Institut für Stochastik Stochastic Processes Winter Term 2016-2017 Paolo Di Tella Technische Universität Dresden Institut für Stochastik Contents 1 Preliminaries 5 1.1 Uniform integrability.............................. 5 1.2

More information

Homogenization with stochastic differential equations

Homogenization with stochastic differential equations Homogenization with stochastic differential equations Scott Hottovy shottovy@math.arizona.edu University of Arizona Program in Applied Mathematics October 12, 2011 Modeling with SDE Use SDE to model system

More information

Lecture 22 Girsanov s Theorem

Lecture 22 Girsanov s Theorem Lecture 22: Girsanov s Theorem of 8 Course: Theory of Probability II Term: Spring 25 Instructor: Gordan Zitkovic Lecture 22 Girsanov s Theorem An example Consider a finite Gaussian random walk X n = n

More information

IEOR 4701: Stochastic Models in Financial Engineering. Summer 2007, Professor Whitt. SOLUTIONS to Homework Assignment 9: Brownian motion

IEOR 4701: Stochastic Models in Financial Engineering. Summer 2007, Professor Whitt. SOLUTIONS to Homework Assignment 9: Brownian motion IEOR 471: Stochastic Models in Financial Engineering Summer 27, Professor Whitt SOLUTIONS to Homework Assignment 9: Brownian motion In Ross, read Sections 1.1-1.3 and 1.6. (The total required reading there

More information

Poisson Jumps in Credit Risk Modeling: a Partial Integro-differential Equation Formulation

Poisson Jumps in Credit Risk Modeling: a Partial Integro-differential Equation Formulation Poisson Jumps in Credit Risk Modeling: a Partial Integro-differential Equation Formulation Jingyi Zhu Department of Mathematics University of Utah zhu@math.utah.edu Collaborator: Marco Avellaneda (Courant

More information

6. Brownian Motion. Q(A) = P [ ω : x(, ω) A )

6. Brownian Motion. Q(A) = P [ ω : x(, ω) A ) 6. Brownian Motion. stochastic process can be thought of in one of many equivalent ways. We can begin with an underlying probability space (Ω, Σ, P) and a real valued stochastic process can be defined

More information

Probability and Distributions

Probability and Distributions Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated

More information

GAUSSIAN PROCESSES; KOLMOGOROV-CHENTSOV THEOREM

GAUSSIAN PROCESSES; KOLMOGOROV-CHENTSOV THEOREM GAUSSIAN PROCESSES; KOLMOGOROV-CHENTSOV THEOREM STEVEN P. LALLEY 1. GAUSSIAN PROCESSES: DEFINITIONS AND EXAMPLES Definition 1.1. A standard (one-dimensional) Wiener process (also called Brownian motion)

More information