APPLIED STOCHASTIC PROCESSES


APPLIED STOCHASTIC PROCESSES
G.A. Pavliotis
Department of Mathematics, Imperial College London, UK
January 16, 2011
Pavliotis (IC) StochProc January 16, 2011 1 / 367

Lectures: Mondays, 10:00-12:00, Huxley 6M42.
Office Hours: By appointment.
Course webpage: http://www.ma.imperial.ac.uk/~pavl/stoch_proc.htm
Text: Lecture notes, available from the course webpage. Also, recommended reading from various textbooks/review articles. The lecture notes are still in progress; please send me your comments and suggestions, and let me know of any typos/errors that you have spotted.

This is a basic graduate course on stochastic processes, aimed towards PhD students in applied mathematics and theoretical physics. The emphasis of the course will be on the presentation of analytical tools that are useful in the study of stochastic models that appear in various problems in applied mathematics, physics, chemistry and biology. The course will consist of three parts: fundamentals of the theory of stochastic processes, applications (reaction rate theory, surface diffusion...) and non-equilibrium statistical mechanics.

1. PART I: FUNDAMENTALS OF CONTINUOUS TIME STOCHASTIC PROCESSES
- Elements of probability theory.
- Stochastic processes: basic definitions, examples.
- Continuous time Markov processes. Brownian motion.
- Diffusion processes: basic definitions, the generator.
- Backward Kolmogorov and the Fokker-Planck (forward Kolmogorov) equations.
- Stochastic differential equations (SDEs); Itô calculus, Itô and Stratonovich stochastic integrals, connection between SDEs and the Fokker-Planck equation.
- Methods of solution for SDEs and for the Fokker-Planck equation.
- Ergodic properties and convergence to equilibrium.

2. PART II: APPLICATIONS
- Asymptotic problems for the Fokker-Planck equation: overdamped (Smoluchowski) and underdamped (Freidlin-Wentzell) limits.
- Bistable stochastic systems: escape over a potential barrier, mean first passage time, calculation of escape rates etc.
- Brownian motion in a periodic potential. Stochastic models of molecular motors.
- Multiscale problems: averaging and homogenization.

3. PART III: NON-EQUILIBRIUM STATISTICAL MECHANICS
- Derivation of stochastic differential equations from deterministic dynamics (heat bath models, projection operator techniques etc.).
- The fluctuation-dissipation theorem. Linear response theory.
- Derivation of macroscopic equations (hydrodynamics) and calculation of transport coefficients.
4. ADDITIONAL TOPICS (time permitting): numerical methods, stochastic PDEs, Markov chain Monte Carlo (MCMC).

Prerequisites: Basic knowledge of ODEs and PDEs. Elementary probability theory. Some familiarity with the theory of stochastic processes.

Lecture notes will be provided for all the material that we will cover in this course. They will be posted on the course webpage. There are many excellent textbooks/review articles on applied stochastic processes, at a level and style similar to that of this course. Standard textbooks are:
- Gardiner: Handbook of stochastic methods (1985).
- Van Kampen: Stochastic processes in physics and chemistry (1981).
- Horsthemke and Lefever: Noise induced transitions (1984).
- Risken: The Fokker-Planck equation (1989).
- Oksendal: Stochastic differential equations (2003).
- Mazo: Brownian motion: fluctuations, dynamics and applications (2002).
- Bhattacharya and Waymire: Stochastic processes and applications (1990).

Other standard textbooks are:
- Nelson: Dynamical theories of Brownian motion (1967). Available from the web (includes a very interesting historical overview of the theory of Brownian motion).
- Chorin and Hald: Stochastic tools in mathematics and science (2006).
- Zwanzig: Nonequilibrium statistical mechanics (2001).
The material on multiscale methods for stochastic processes, together with some of the introductory material, will be taken from
- Pavliotis and Stuart: Multiscale methods: averaging and homogenization (2008).

Excellent books on the mathematical theory of stochastic processes are:
- Karatzas and Shreve: Brownian motion and stochastic calculus (1991).
- Revuz and Yor: Continuous martingales and Brownian motion (1999).
- Stroock: Probability theory, an analytic view (1993).

Well known review articles (available from the web) are:
- Chandrasekhar: Stochastic problems in physics and astronomy (1943).
- Hanggi and Thomas: Stochastic processes: time evolution, symmetries and linear response (1982).
- Hanggi, Talkner and Borkovec: Reaction rate theory: fifty years after Kramers (1990).

The theory of stochastic processes started with Einstein's work on the theory of Brownian motion, "Concerning the motion, as required by the molecular-kinetic theory of heat, of particles suspended in liquids at rest" (1905), an explanation of Brown's observation (1827): when suspended in water, small pollen grains are found to be in a very animated and irregular state of motion. Einstein's theory is based on:
- A Markov chain model for the motion of the particle (molecule, pollen grain...).
- The idea that it makes more sense to talk about the probability of finding the particle at position x at time t, rather than about individual trajectories.

In his work many of the main aspects of the modern theory of stochastic processes can be found:
- The assumption of Markovianity (no memory) expressed through the Chapman-Kolmogorov equation.
- The Fokker-Planck equation (in this case, the diffusion equation).
- The derivation of the Fokker-Planck equation from the master (Chapman-Kolmogorov) equation through a Kramers-Moyal expansion.
- The calculation of a transport coefficient (the diffusion coefficient) using macroscopic (kinetic theory-based) considerations:
D = k_B T / (6πηa).
Here k_B is Boltzmann's constant, T is the temperature, η is the viscosity of the fluid and a is the radius of the particle.

Einstein's theory is based on the Fokker-Planck equation. Langevin (1908) developed a theory based on a stochastic differential equation. The equation of motion for a Brownian particle is
m d²x/dt² = -6πηa dx/dt + ξ,
where ξ is a random force. There is complete agreement between Einstein's theory and Langevin's theory. The theory of Brownian motion was developed independently by Smoluchowski.

The approaches of Langevin and Einstein represent the two main approaches in the theory of stochastic processes:
- Study individual trajectories of Brownian particles. Their evolution is governed by a stochastic differential equation:
dX/dt = F(X) + Σ(X) ξ(t),
where ξ(t) is a random force.
- Study the probability ρ(x, t) of finding a particle at position x at time t. This probability distribution satisfies the Fokker-Planck equation:
∂ρ/∂t = -∇·(F(x) ρ) + (1/2) ∇∇ : (A(x) ρ),
where A(x) = Σ(x) Σ(x)^T.

The theory of stochastic processes was developed during the 20th century:
- Physics: Smoluchowski, Planck (1917), Klein (1922), Ornstein and Uhlenbeck (1930), Kramers (1940), Chandrasekhar (1943)...
- Mathematics: Wiener (1922), Kolmogorov (1931), Itô (1940s), Doob (1940s and 1950s)...

The One-Dimensional Random Walk
We let time be discrete, i.e. t = 0, 1, ... Consider the following stochastic process S_n: S_0 = 0; at each time step it moves by ±1 with equal probability 1/2.
In other words, at each time step we flip a fair coin. If the outcome is heads, we move one unit to the right. If the outcome is tails, we move one unit to the left.
Alternatively, we can think of the random walk as a sum of independent random variables:
S_n = Σ_{j=1}^n X_j,
where X_j ∈ {-1, 1} with P(X_j = ±1) = 1/2.

We can simulate the random walk on a computer:
- We need a (pseudo)random number generator to generate n independent random variables which are uniformly distributed in the interval [0,1].
- If the value of the random variable is less than 1/2 then the particle moves to the left, otherwise it moves to the right.
- We then take the sum of all these random moves.
- The sequence {S_n}_{n=1}^N indexed by the discrete time T = {1, 2, ..., N} is the path of the random walk. We use a linear interpolation (i.e. connect the points {n, S_n} by straight lines) to generate a continuous path.
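As a concrete illustration (not part of the notes), the simulation recipe above can be sketched in Python with NumPy; the function name and the choices N = 50 steps, 10,000 paths are my own:

```python
import numpy as np

def random_walk_paths(n_steps, n_paths, rng=None):
    """Simulate paths of the simple symmetric random walk S_n."""
    rng = np.random.default_rng(rng)
    # Uniform [0,1) draws: a value < 1/2 means a step to the left, otherwise right.
    u = rng.random((n_paths, n_steps))
    steps = np.where(u < 0.5, -1, 1)
    # S_n = X_1 + ... + X_n; prepend S_0 = 0.
    return np.concatenate([np.zeros((n_paths, 1), dtype=int),
                           np.cumsum(steps, axis=1)], axis=1)

paths = random_walk_paths(n_steps=50, n_paths=10_000, rng=0)
# E(S_n) = 0 and E(S_n^2) = n, up to Monte Carlo error:
print(paths[:, -1].mean(), paths[:, -1].var())
```

Plotting a few rows of `paths` against 0, 1, ..., N reproduces figures like the ones below.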

Figure: Three paths of the random walk of length N = 50.

Figure: Three paths of the random walk of length N = 1000.

Every path of the random walk is different: it depends on the outcome of a sequence of independent random experiments. We can compute statistics by generating a large number of paths and computing averages. For example, E(S_n) = 0 and E(S_n²) = n. The paths of the random walk (without the linear interpolation) are not continuous: the random walk has a jump of size 1 at each time step. This is an example of a discrete time, discrete space stochastic process.
The random walk is a time-homogeneous (the probabilistic law of evolution is independent of time) Markov (the future depends only on the present and not on the past) process. If we take a large number of steps, the random walk starts looking like a continuous time process with continuous paths.

Consider the sequence of continuous time stochastic processes
Z_t^n := (1/√n) S_{[nt]}.
In the limit as n → ∞, the sequence {Z_t^n} converges (in some appropriate sense) to a Brownian motion with diffusion coefficient
D = (Δx)²/(2Δt) = 1/2.
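A quick numerical sanity check (my own addition; the choices n = 10,000, t = 0.7 and 20,000 paths are arbitrary): for a Brownian motion with D = 1/2 the variance at time t is 2Dt = t, and the rescaled walk should match this:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000        # walk steps per unit of rescaled time
t = 0.7           # time at which we inspect Z_t^n
n_paths = 20_000

# Z_t^n = (1/sqrt(n)) S_[nt]: sum [nt] independent +/-1 steps, then rescale.
steps = rng.choice([-1, 1], size=(n_paths, int(n * t)))
Z = steps.sum(axis=1) / np.sqrt(n)

# For Brownian motion with D = 1/2: E(Z_t) = 0 and Var(Z_t) = 2 D t = t.
print(Z.mean(), Z.var())
```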

Figure: Sample Brownian paths U(t): the mean of 1000 paths together with 5 individual paths.

Brownian motion W(t) is a continuous time stochastic process with continuous paths that starts at 0 (W(0) = 0) and has independent, normally distributed (Gaussian) increments. We can simulate Brownian motion on a computer using a random number generator that generates normally distributed, independent random variables.

We can write an equation for the evolution of the paths of a Brownian motion X_t with diffusion coefficient D starting at x:
dX_t = √(2D) dW_t,  X_0 = x.
This is an example of a stochastic differential equation. The probability of finding X_t at y at time t, given that it was at x at time t = 0, the transition probability density ρ(y, t), satisfies the PDE
∂ρ/∂t = D ∂²ρ/∂y²,  ρ(y, 0) = δ(y - x).
This is an example of the Fokker-Planck equation. The connection between Brownian motion and the diffusion equation was made by Einstein in 1905.
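A hedged sketch (not from the notes; parameter values are arbitrary) of simulating this SDE with the Euler-Maruyama scheme, which is exact here because the coefficient √(2D) is constant. The Fokker-Planck solution is the Gaussian heat kernel, so Var(X_T - x) = 2DT:

```python
import numpy as np

def euler_maruyama(D, x0, T, n_steps, n_paths, rng=None):
    """Simulate dX_t = sqrt(2D) dW_t on [0, T] by Euler-Maruyama
    (exact here, since the diffusion coefficient is constant)."""
    rng = np.random.default_rng(rng)
    dt = T / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    return x0 + np.sqrt(2.0 * D) * np.cumsum(dW, axis=1)

X = euler_maruyama(D=0.5, x0=0.0, T=1.0, n_steps=200, n_paths=20_000, rng=2)
# The Fokker-Planck solution gives X_T ~ N(x0, 2DT), i.e. Var = 1 here.
print(X[:, -1].mean(), X[:, -1].var())
```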

Why introduce randomness in the description of physical systems?
- To describe outcomes of a repeated set of experiments. Think of tossing a coin repeatedly or of throwing a die.
- To describe a deterministic system for which we have incomplete information: we have imprecise knowledge of initial and boundary conditions or of model parameters. ODEs with random initial conditions are equivalent to stochastic processes that can be described using stochastic differential equations.
- To describe systems for which we are not confident about the validity of our mathematical model.

- To describe a dynamical system exhibiting very complicated behavior (chaotic dynamical systems). Determinism versus predictability.
- To describe a high dimensional deterministic system using a simpler, low dimensional stochastic system. Think of the physical model for Brownian motion (a heavy particle colliding with many small particles).
- To describe a system that is inherently random. Think of quantum mechanics.

ELEMENTS OF PROBABILITY THEORY

A collection of subsets of a set Ω is called a σ-algebra if it contains Ω and is closed under the operations of taking complements and countable unions of its elements. A sub-σ-algebra is a collection of subsets of a σ-algebra which satisfies the axioms of a σ-algebra. A measurable space is a pair (Ω, F) where Ω is a set and F is a σ-algebra of subsets of Ω.
Let (Ω, F) and (E, G) be two measurable spaces. A function X : Ω → E such that the event {ω ∈ Ω : X(ω) ∈ A} =: {X ∈ A} belongs to F for arbitrary A ∈ G is called a measurable function or random variable.

Let (Ω, F) be a measurable space. A function µ : F → [0, 1] is called a probability measure if µ(∅) = 0, µ(Ω) = 1 and µ(∪_{k=1}^∞ A_k) = Σ_{k=1}^∞ µ(A_k) for all sequences of pairwise disjoint sets {A_k}_{k=1}^∞ ⊂ F. The triplet (Ω, F, µ) is called a probability space.
Let X be a random variable (measurable function) from (Ω, F, µ) to (E, G). If E is a metric space then we may define expectation with respect to the measure µ by
E[X] = ∫_Ω X(ω) dµ(ω).
More generally, let f : E → R be G-measurable. Then
E[f(X)] = ∫_Ω f(X(ω)) dµ(ω).

Let U be a topological space. We will use the notation B(U) to denote the Borel σ-algebra of U: the smallest σ-algebra containing all open sets of U. Every random variable from a probability space (Ω, F, µ) to a measurable space (E, B(E)) induces a probability measure on E:
µ_X(B) = P(X^{-1}(B)) = µ(ω ∈ Ω; X(ω) ∈ B),  B ∈ B(E).
The measure µ_X is called the distribution (or sometimes the law) of X.
Example: Let I denote a subset of the positive integers. A vector ρ_0 = {ρ_{0,i}, i ∈ I} is a distribution on I if it has nonnegative entries and its total mass equals 1: Σ_{i∈I} ρ_{0,i} = 1.

We can use the distribution of a random variable to compute expectations and probabilities:
E[f(X)] = ∫_E f(x) dµ_X(x)  and  P[X ∈ G] = ∫_G dµ_X(x),  G ∈ B(E).
When E = R^d and we can write dµ_X(x) = ρ(x) dx, then we refer to ρ(x) as the probability density function (pdf), or density with respect to Lebesgue measure, for X.
When E = R^d then by L^p(Ω; R^d), or sometimes L^p(Ω; µ) or even simply L^p(µ), we mean the Banach space of measurable functions on Ω with norm
||X||_{L^p} = (E|X|^p)^{1/p}.

Example: Consider the random variable X : Ω → R with pdf
γ_{σ,m}(x) := (2πσ)^{-1/2} exp(-(x - m)²/(2σ)).
Such an X is termed a Gaussian or normal random variable. The mean is
EX = ∫_R x γ_{σ,m}(x) dx = m
and the variance is
E(X - m)² = ∫_R (x - m)² γ_{σ,m}(x) dx = σ.

Example (continued): Let m ∈ R^d and Σ ∈ R^{d×d} be symmetric and positive definite. The random variable X : Ω → R^d with pdf
γ_{Σ,m}(x) := ((2π)^d det Σ)^{-1/2} exp(-(1/2) ⟨Σ^{-1}(x - m), (x - m)⟩)
is termed a multivariate Gaussian or normal random variable. The mean is
E(X) = m  (1)
and the covariance matrix is
E((X - m) ⊗ (X - m)) = Σ.  (2)

Since the mean and variance specify completely a Gaussian random variable on R, the Gaussian is commonly denoted by N(m, σ). The standard normal random variable is N(0, 1). Since the mean and covariance matrix completely specify a Gaussian random variable on R^d, the Gaussian is commonly denoted by N(m, Σ).
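A standard way to sample N(m, Σ) on a computer, shown here as an illustrative sketch (the particular m and Σ below are my own choices): factor Σ = LL^T by Cholesky and set X = m + LZ with Z ~ N(0, I):

```python
import numpy as np

def sample_gaussian(m, Sigma, n, rng=None):
    """Draw n samples from N(m, Sigma): if Sigma = L L^T (Cholesky) and
    Z ~ N(0, I), then m + L Z ~ N(m, Sigma)."""
    rng = np.random.default_rng(rng)
    L = np.linalg.cholesky(Sigma)
    Z = rng.standard_normal((n, len(m)))
    return m + Z @ L.T

m = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])
X = sample_gaussian(m, Sigma, n=100_000, rng=3)
print(X.mean(axis=0))           # approximately m
print(np.cov(X, rowvar=False))  # approximately Sigma
```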

The Characteristic Function
Many of the properties of (sums of) random variables can be studied using the Fourier transform of the distribution function. The characteristic function of the random variable X is defined to be the Fourier transform of the distribution function:
φ(t) = ∫_R e^{itλ} dµ_X(λ) = E(e^{itX}).  (3)
The characteristic function determines uniquely the distribution function of the random variable, in the sense that there is a one-to-one correspondence between F(λ) and φ(t). The characteristic function of a N(m, Σ) is
φ(t) = e^{i⟨m,t⟩ - (1/2)⟨t,Σt⟩}.

Lemma: Let {X_1, X_2, ..., X_n} be independent random variables with characteristic functions φ_j(t), j = 1, ..., n, and let Y = Σ_{j=1}^n X_j with characteristic function φ_Y(t). Then
φ_Y(t) = Π_{j=1}^n φ_j(t).
Lemma: Let X be a random variable with characteristic function φ(t) and assume that it has finite moments. Then
E(X^k) = (1/i^k) φ^{(k)}(0).
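The first lemma can be checked numerically (an illustration of mine, not from the notes): estimate E e^{itY} by Monte Carlo for a sum of three independent variables and compare with the product of the individual characteristic functions, whose closed forms are standard:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200_000
t = 0.7

# Three independent variables with different laws.
X1 = rng.exponential(1.0, n)     # phi(t) = 1 / (1 - it)
X2 = rng.uniform(-1.0, 1.0, n)   # phi(t) = sin(t) / t
X3 = rng.normal(0.0, 2.0, n)     # phi(t) = exp(-(1/2) * 4 * t^2)

phi = lambda s: np.mean(np.exp(1j * t * s))  # Monte Carlo estimate of E e^{itX}

lhs = phi(X1 + X2 + X3)            # characteristic function of the sum Y
rhs = phi(X1) * phi(X2) * phi(X3)  # product of the individual ones
exact = (1.0 / (1.0 - 1j * t)) * (np.sin(t) / t) * np.exp(-2.0 * t**2)
print(abs(lhs - rhs), abs(lhs - exact))
```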

Types of Convergence and Limit Theorems
One of the most important aspects of the theory of random variables is the study of limit theorems for sums of random variables. The most well known limit theorems in probability theory are the law of large numbers and the central limit theorem. There are various different types of convergence for sequences of random variables.

Definition: Let {Z_n}_{n=1}^∞ be a sequence of random variables. We will say that
(a) Z_n converges to Z with probability one (almost surely) if
P(lim_{n→+∞} Z_n = Z) = 1.
(b) Z_n converges to Z in probability if for every ε > 0
lim_{n→+∞} P(|Z_n - Z| > ε) = 0.
(c) Z_n converges to Z in L^p if
lim_{n→+∞} E[|Z_n - Z|^p] = 0.
(d) Let F_n(λ), n = 1, 2, ..., and F(λ) be the distribution functions of Z_n, n = 1, 2, ..., and Z, respectively. Then Z_n converges to Z in distribution if
lim_{n→+∞} F_n(λ) = F(λ)
for all λ at which F is continuous.

Let {X_n}_{n=1}^∞ be iid random variables with EX_n = V. Then the strong law of large numbers states that the average of the sum of the iid random variables converges to V with probability one:
P(lim_{N→+∞} (1/N) Σ_{n=1}^N X_n = V) = 1.
The strong law of large numbers provides us with information about the behavior of a sum of random variables (or, a large number of repetitions of the same experiment) on average. We can also study fluctuations around the average behavior. Indeed, let E(X_n - V)² = σ². Define the centered iid random variables Y_n = X_n - V. Then the sequence of random variables (1/(σ√N)) Σ_{n=1}^N Y_n converges in distribution to a N(0, 1) random variable:
lim_{N→+∞} P((1/(σ√N)) Σ_{n=1}^N Y_n ≤ a) = ∫_{-∞}^a (1/√(2π)) e^{-x²/2} dx.
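Both limit theorems can be illustrated by simulation (my own sketch; the exponential distribution and the sample sizes are arbitrary choices). For exponential(1) variables, V = 1 and σ² = 1:

```python
import math
import numpy as np

rng = np.random.default_rng(4)
N = 500           # number of terms in each sum
n_sums = 10_000   # independent replicas of the sum

# iid exponential(1): V = E X_n = 1 and sigma^2 = 1.
X = rng.exponential(1.0, size=(n_sums, N))

# Law of large numbers: the sample averages concentrate at V = 1.
print(X.mean())

# Central limit theorem: (1/(sigma sqrt(N))) sum (X_n - V) is close to N(0,1),
# so its empirical CDF should match the standard normal CDF.
S = (X - 1.0).sum(axis=1) / math.sqrt(N)

def Phi(a):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

for a in (-1.0, 0.0, 1.0):
    print(a, (S <= a).mean(), Phi(a))
```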

Assume that E|X| < ∞ and let G be a sub-σ-algebra of F. The conditional expectation of X with respect to G is defined to be the function E[X|G] : Ω → E which is G-measurable and satisfies
∫_G E[X|G] dµ = ∫_G X dµ  for all G ∈ G.
We can define E[f(X)|G] and the conditional probability P[X ∈ F|G] = E[I_F(X)|G], where I_F is the indicator function of F, in a similar manner.
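For a σ-algebra generated by a finite partition, E[X|G] is simply the conditional average of X on each partition element. The following sketch (my own, with X = U² for U uniform on [0,1] and the partition {U < 1/2}, {U ≥ 1/2}) checks the defining property numerically:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 400_000
U = rng.random(n)
X = U**2                 # the random variable X = U^2

# G: sub-sigma-algebra generated by the partition {U < 1/2}, {U >= 1/2}.
right = (U >= 0.5)

# E[X|G] is G-measurable: constant on each partition element, equal to the
# average of X over that element.
cond_exp = np.where(right, X[right].mean(), X[~right].mean())

# Exact values: E[U^2 | U < 1/2] = 1/12, E[U^2 | U >= 1/2] = 7/12.
print(X[~right].mean(), X[right].mean())

# Defining property: integrals over G-measurable sets agree with those of X.
print(np.mean(cond_exp * right), np.mean(X * right))
```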

ELEMENTS OF THE THEORY OF STOCHASTIC PROCESSES

Let T be an ordered set. A stochastic process is a collection of random variables X = {X_t; t ∈ T} where, for each fixed t ∈ T, X_t is a random variable from (Ω, F) to (E, G). The measurable space (Ω, F) is called the sample space. The space (E, G) is called the state space.
In this course we will take the set T to be [0, +∞). The state space E will usually be R^d equipped with the σ-algebra of Borel sets. A stochastic process X may be viewed as a function of both t ∈ T and ω ∈ Ω. We will sometimes write X(t), X(t, ω) or X_t(ω) instead of X_t. For a fixed sample point ω ∈ Ω, the function X_t(ω) : T → E is called a sample path (realization, trajectory) of the process X.

The finite dimensional distributions (fdd) of a stochastic process are the distributions of the E^k-valued random variables (X(t_1), X(t_2), ..., X(t_k)) for arbitrary positive integer k and arbitrary times t_i ∈ T, i ∈ {1, ..., k}:
F(x) = P(X(t_i) ≤ x_i, i = 1, ..., k),
with x = (x_1, ..., x_k).
We will say that two processes X_t and Y_t are equivalent if they have the same finite dimensional distributions. From experiments or numerical simulations we can only obtain information about the fdd of a process.

Definition: A Gaussian process is a stochastic process for which E = R^d and all the finite dimensional distributions are Gaussian, i.e. each (X(t_1), ..., X(t_k)) has density
(2π)^{-n/2} (det K_k)^{-1/2} exp(-(1/2) ⟨K_k^{-1}(x - µ_k), x - µ_k⟩),
for some vector µ_k and a symmetric positive definite matrix K_k.
A Gaussian process x(t) is characterized by its mean
m(t) := E x(t)
and the covariance function
C(t, s) = E((x(t) - m(t))(x(s) - m(s))).
Thus, the first two moments of a Gaussian process are sufficient for a complete characterization of the process.

Let (Ω, F, P) be a probability space. Let X_t, t ∈ T (with T = R or Z), be a real-valued random process on this probability space with finite second moment, E|X_t|² < +∞ (i.e. X_t ∈ L²).
Definition: A stochastic process X_t ∈ L² is called second-order stationary or wide-sense stationary if the first moment EX_t is a constant and the second moment E(X_t X_s) depends only on the difference t - s:
EX_t = µ,  E(X_t X_s) = C(t - s).

The constant µ is called the expectation of the process X_t. We will set µ = 0. The function C(t) is called the covariance or the autocorrelation function of X_t. Notice that C(t) = E(X_t X_0), whereas C(0) = E(X_t²), which is finite, by assumption. Since we have assumed that X_t is a real valued process, we have that C(t) = C(-t), t ∈ R.

Continuity properties of the covariance function are equivalent to continuity properties of the paths of X t. Lemma Assume that the covariance function C(t) of a second order stationary process is continuous at t = 0. Then it is continuous for all t R. Furthermore, the continuity of C(t) is equivalent to the continuity of the process X t in the L 2 -sense. Pavliotis (IC) StochProc January 16, 2011 48 / 367

Proof. Fix t ∈ R. We calculate:
|C(t + h) - C(t)|² = |E(X_{t+h} X_0) - E(X_t X_0)|² = |E((X_{t+h} - X_t) X_0)|²
≤ E(X_0²) E(X_{t+h} - X_t)²
= C(0) (EX²_{t+h} + EX²_t - 2EX_t X_{t+h})
= 2C(0) (C(0) - C(h)) → 0,
as h → 0. Assume now that C(t) is continuous. From the above calculation we have
E|X_{t+h} - X_t|² = 2(C(0) - C(h)),  (4)
which converges to 0 as h → 0. Conversely, assume that X_t is L²-continuous. Then, from the above equation we get lim_{h→0} C(h) = C(0).

Notice that from (4) we immediately conclude that C(0) ≥ C(h), h ∈ R.
The Fourier transform of the covariance function of a second order stationary process always exists. This enables us to study second order stationary processes using tools from Fourier analysis. To make the link between second order stationary processes and Fourier analysis we will use Bochner's theorem, which applies to all nonnegative definite functions.
Definition: A function f(x) : R → R is called nonnegative definite if
Σ_{i,j=1}^n f(t_i - t_j) c_i c̄_j ≥ 0  (5)
for all n ∈ N, t_1, ..., t_n ∈ R, c_1, ..., c_n ∈ C.

Lemma: The covariance function of a second order stationary process is a nonnegative definite function.
Proof. We will use the notation X_t^c := Σ_{i=1}^n X_{t_i} c_i. We have
Σ_{i,j=1}^n C(t_i - t_j) c_i c̄_j = Σ_{i,j=1}^n E(X_{t_i} X_{t_j}) c_i c̄_j
= E(Σ_{i=1}^n X_{t_i} c_i Σ_{j=1}^n X_{t_j} c̄_j) = E(X_t^c X̄_t^c)
= E|X_t^c|² ≥ 0.

Theorem (Bochner): There is a one-to-one correspondence between the set of continuous nonnegative definite functions and the set of finite measures on the Borel σ-algebra of R: if ρ is a finite measure, then
C(t) = ∫_R e^{ixt} ρ(dx)  (6)
is nonnegative definite. Conversely, any continuous nonnegative definite function can be represented in the form (6).
Definition: Let X_t be a second order stationary process with covariance C(t) whose Fourier transform is the measure ρ(dx). The measure ρ(dx) is called the spectral measure of the process X_t.

In the following we will assume that the spectral measure is absolutely continuous with respect to the Lebesgue measure on R with density f(x), i.e. ρ(dx) = f(x) dx. The Fourier transform f(x) of the covariance function is called the spectral density of the process:
f(x) = (1/2π) ∫_{-∞}^∞ e^{-itx} C(t) dt.
From (6) it follows that the covariance function of a mean zero, second order stationary process is given by the inverse Fourier transform of the spectral density:
C(t) = ∫_{-∞}^∞ e^{itx} f(x) dx.
In most cases, the experimentally measured quantity is the spectral density (or power spectrum) of the stochastic process.

The correlation function of a second order stationary process enables us to associate a time scale to X_t, the correlation time τ_cor:
τ_cor = (1/C(0)) ∫_0^∞ C(τ) dτ = ∫_0^∞ E(X_τ X_0)/E(X_0²) dτ.
The slower the decay of the correlation function, the larger the correlation time is. We have to assume sufficiently fast decay of correlations so that the correlation time is finite.

Example: Consider the mean zero, second order stationary process with covariance function
R(t) = (D/α) e^{-α|t|}.  (7)
The spectral density of this process is:
f(x) = (1/2π)(D/α) ∫_{-∞}^∞ e^{-ixt} e^{-α|t|} dt
= (1/2π)(D/α) (∫_{-∞}^0 e^{-ixt} e^{αt} dt + ∫_0^∞ e^{-ixt} e^{-αt} dt)
= (1/2π)(D/α) (1/(-ix + α) + 1/(ix + α))
= (D/π) 1/(x² + α²).

Example (Continued) This function is called the Cauchy or the Lorentz distribution. The Gaussian stochastic process with covariance function (7) is called the stationary Ornstein-Uhlenbeck process. The correlation time is (we have that $C(0) = D/\alpha$)

$$\tau_{cor} = \int_0^{\infty} e^{-\alpha t}\, dt = \alpha^{-1}.$$
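As a quick consistency check (a sketch, not part of the original notes; the values of $D$ and $\alpha$ are arbitrary choices), one can verify numerically that the Lorentzian spectral density integrates back to the variance $C(0) = D/\alpha$, as the inversion formula requires:

```python
import numpy as np

# Numerically confirm that f(x) = (D/pi) / (x^2 + alpha^2) integrates
# to C(0) = D/alpha.  D and alpha are arbitrary illustrative values.
D, alpha = 2.0, 0.5

x = np.linspace(-2000.0, 2000.0, 4_000_001)   # wide grid; tails decay like 1/x^2
f = (D / np.pi) / (x**2 + alpha**2)
dx = x[1] - x[0]
integral = (np.sum(f) - 0.5 * (f[0] + f[-1])) * dx   # trapezoid rule

print(integral, D / alpha)   # the two numbers should nearly agree
```

The small discrepancy comes from truncating the $1/x^2$ tails of the Lorentzian at $\pm 2000$.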

The OU process was introduced by Ornstein and Uhlenbeck in 1930 (G.E. Uhlenbeck, L.S. Ornstein, Phys. Rev. 36, 823 (1930)) as a model for the velocity of a Brownian particle. It is of interest to calculate the statistics of the position of the Brownian particle, i.e. of the integral

$$X(t) = \int_0^t Y(s)\, ds, \qquad (8)$$

where $Y(t)$ denotes the stationary OU process.

Lemma Let $Y(t)$ denote the stationary OU process with covariance function (7), normalized so that $D = \alpha = 1$. Then the position process (8) is a mean zero Gaussian process with covariance function

$$E(X(t)X(s)) = 2\min(t,s) + e^{-\min(t,s)} + e^{-\max(t,s)} - e^{-|t-s|} - 1.$$
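The lemma's formula can be checked by Monte Carlo (a sketch under the normalization $D = \alpha = 1$; the step size and path count are arbitrary choices). At $s = t$ it gives $\operatorname{Var} X(t) = 2t + 2e^{-t} - 2$, and the stationary OU process admits the exact one-step update $V_{k+1} = e^{-\Delta t} V_k + \sqrt{1 - e^{-2\Delta t}}\,\xi_k$:

```python
import numpy as np

# Monte Carlo check of Var X(t) = 2t + 2 e^{-t} - 2 for the integrated
# stationary OU process (normalized case D = alpha = 1).
rng = np.random.default_rng(0)

t, dt, n_paths = 2.0, 0.001, 20_000
n_steps = int(t / dt)
a, b = np.exp(-dt), np.sqrt(1.0 - np.exp(-2.0 * dt))

V = rng.standard_normal(n_paths)   # stationary initial condition V_0 ~ N(0, 1)
X = np.zeros(n_paths)              # running integral of V
for _ in range(n_steps):
    X += V * dt                    # left-point Riemann sum for int_0^t V ds
    V = a * V + b * rng.standard_normal(n_paths)

var_mc = np.mean(X**2)
var_exact = 2.0 * t + 2.0 * np.exp(-t) - 2.0
print(var_mc, var_exact)           # should agree to a few percent
```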

Second order stationary processes are ergodic: time averages equal phase space (ensemble) averages. An example of an ergodic theorem for stationary processes is the following $L^2$ (mean-square) ergodic theorem.

Theorem Let $\{X_t\}_{t \geq 0}$ be a second order stationary process on a probability space $(\Omega, \mathcal{F}, P)$ with mean $\mu$ and covariance $R(t)$, and assume that $R(t) \in L^1(0, +\infty)$. Then

$$\lim_{T \to +\infty} E\left|\frac{1}{T}\int_0^T X(s)\, ds - \mu\right|^2 = 0. \qquad (9)$$

Proof. We have

$$\begin{aligned}
E\left|\frac{1}{T}\int_0^T X(s)\, ds - \mu\right|^2 &= \frac{1}{T^2}\int_0^T\!\!\int_0^T R(t-s)\, dt\, ds \\
&= \frac{2}{T^2}\int_0^T\!\!\int_0^t R(t-s)\, ds\, dt \\
&= \frac{2}{T^2}\int_0^T (T-u)\, R(u)\, du \to 0,
\end{aligned}$$

using the dominated convergence theorem and the assumption $R(\cdot) \in L^1$. In the above we used the fact that $R$ is a symmetric function, together with the change of variables $u = t - s$, $v = t$ and an integration over $v$.

Assume that $\mu = 0$. From the above calculation we can conclude that, under the assumption that $R(\cdot)$ decays sufficiently fast at infinity, for $t \gg 1$ we have

$$E\left(\int_0^t X(s)\, ds\right)^2 \approx 2Dt, \qquad \text{where } D = \int_0^{\infty} R(t)\, dt$$

is the diffusion coefficient. Thus, one expects that at sufficiently long times and under appropriate assumptions on the correlation function, the time integral of a stationary process will approximate a Brownian motion with diffusion coefficient $D$. This kind of analysis was initiated in G.I. Taylor, "Diffusion by Continuous Movements", Proc. London Math. Soc. 1922; s2-20: 196-212.
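For the normalized OU covariance $R(t) = e^{-t}$ this can be made concrete numerically (a sketch; the choice of covariance and of $T = 50$ is illustrative): quadrature gives $D = \int_0^\infty e^{-t} dt = 1$, and the exact second moment $2\int_0^T (T-u) R(u)\, du$ from the proof above should approach $2DT$ for large $T$:

```python
import numpy as np

# For R(t) = e^{-t}: D = 1, and E (int_0^T X)^2 = 2 int_0^T (T-u) R(u) du,
# which approaches 2 D T for large T (here 2(T - 1 + e^{-T}) ~ 98 vs 100).
def trapz(y, dx):
    return (np.sum(y) - 0.5 * (y[0] + y[-1])) * dx

u = np.linspace(0.0, 50.0, 500_001)
du = u[1] - u[0]
R = np.exp(-u)
T = u[-1]

D = trapz(R, du)                               # ~ 1
second_moment = 2.0 * trapz((T - u) * R, du)   # exact value 2(T - 1 + e^{-T}) ~ 98

print(D, second_moment, 2.0 * D * T)
```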

Definition A stochastic process is called (strictly) stationary if all finite dimensional distributions are invariant under time translation: for any integer $k$ and times $t_i \in T$, the distribution of $(X(t_1), X(t_2), \dots, X(t_k))$ is equal to that of $(X(s+t_1), X(s+t_2), \dots, X(s+t_k))$ for any $s$ such that $s + t_i \in T$ for all $i \in \{1, \dots, k\}$. In other words,

$$P(X_{t_1+t} \in A_1, X_{t_2+t} \in A_2, \dots, X_{t_k+t} \in A_k) = P(X_{t_1} \in A_1, X_{t_2} \in A_2, \dots, X_{t_k} \in A_k), \quad \forall t \in T.$$

Let $X_t$ be a strictly stationary stochastic process with finite second moment (i.e. $X_t \in L^2$). The definition of strict stationarity implies that $EX_t = \mu$, a constant, and $E((X_t - \mu)(X_s - \mu)) = C(t-s)$. Hence, a strictly stationary process with finite second moment is also stationary in the wide sense. The converse is not true.

Remarks
1. A sequence $Y_0, Y_1, \dots$ of independent, identically distributed random variables is a stationary process with $R(k) = \sigma^2 \delta_{k0}$, $\sigma^2 = E(Y_k)^2$.
2. Let $Z$ be a single random variable with known distribution and set $Z_j = Z$, $j = 0, 1, 2, \dots$. Then the sequence $Z_0, Z_1, Z_2, \dots$ is a stationary sequence with $R(k) = \sigma^2$.
3. The first two moments of a Gaussian process are sufficient for a complete characterization of the process. A corollary of this is that a second order stationary Gaussian process is also a (strictly) stationary process.

The most important continuous time stochastic process is Brownian motion. Brownian motion is a mean zero, continuous (i.e. it has continuous sample paths: for a.e. $\omega \in \Omega$ the function $X_t$ is a continuous function of time) process with independent Gaussian increments. A process $X_t$ has independent increments if for every sequence $t_0 < t_1 < \dots < t_n$ the random variables

$$X_{t_1} - X_{t_0},\; X_{t_2} - X_{t_1},\; \dots,\; X_{t_n} - X_{t_{n-1}}$$

are independent. If, furthermore, for any $t_1, t_2$ and Borel set $B \subset \mathbb{R}$, $P(X_{t_2+s} - X_{t_1+s} \in B)$ is independent of $s$, then the process $X_t$ has stationary independent increments.

Definition A one dimensional standard Brownian motion $W(t): \mathbb{R}^+ \to \mathbb{R}$ is a real valued stochastic process with the following properties:
1. $W(0) = 0$;
2. $W(t)$ is continuous;
3. $W(t)$ has independent increments;
4. for every $t > s \geq 0$, $W(t) - W(s)$ has a Gaussian distribution with mean $0$ and variance $t - s$. That is, the density of the random variable $W(t) - W(s)$ is

$$g(x; t, s) = \big(2\pi(t-s)\big)^{-1/2} \exp\left(-\frac{x^2}{2(t-s)}\right). \qquad (10)$$

Definition (Continued) A $d$-dimensional standard Brownian motion $W(t): \mathbb{R}^+ \to \mathbb{R}^d$ is a collection of $d$ independent one dimensional Brownian motions:

$$W(t) = (W_1(t), \dots, W_d(t)),$$

where $W_i(t)$, $i = 1, \dots, d$, are independent one dimensional Brownian motions. The density of the Gaussian random vector $W(t) - W(s)$ is thus

$$g(x; t, s) = \big(2\pi(t-s)\big)^{-d/2} \exp\left(-\frac{\|x\|^2}{2(t-s)}\right).$$

Brownian motion is sometimes referred to as the Wiener process.

Figure: Brownian sample paths ($U(t)$ against $t \in [0, 1]$: the mean of 1000 paths together with 5 individual paths).
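Data of the kind shown in the figure can be regenerated from the definition (a sketch; the step count is an arbitrary choice): each path is the cumulative sum of independent $N(0, \Delta t)$ increments, and the ensemble mean over 1000 paths stays close to $0$ while individual paths fluctuate on the order of $\sqrt{t}$:

```python
import numpy as np

# Simulate 1000 Brownian sample paths on [0, 1] from independent
# N(0, dt) increments, as in the figure.
rng = np.random.default_rng(1)

n_paths, n_steps = 1000, 500
dt = 1.0 / n_steps
increments = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
W = np.cumsum(increments, axis=1)   # W[i, k] ~ path i at time (k+1) dt

mean_at_1 = W[:, -1].mean()         # ensemble mean at t = 1
print(mean_at_1)                    # close to 0 (std of the mean ~ 1/sqrt(1000))
```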

It is possible to prove rigorously the existence of the Wiener process (Brownian motion):

Theorem (Wiener) There exists an almost-surely continuous process $W_t$ with independent increments such that $W_0 = 0$ and, for each $t \geq 0$, the random variable $W_t$ is $N(0, t)$. Furthermore, $W_t$ is almost surely locally Hölder continuous with exponent $\alpha$ for any $\alpha \in (0, \tfrac12)$.

Notice that Brownian paths are not differentiable.

Brownian motion is a Gaussian process. For the $d$-dimensional Brownian motion, and for $I$ the $d \times d$ identity matrix, we have (see (1) and (2))

$$EW(t) = 0 \quad \forall\, t \geq 0$$

and

$$E\big((W(t) - W(s)) \otimes (W(t) - W(s))\big) = (t-s)\, I. \qquad (11)$$

Moreover,

$$E\big(W(t) \otimes W(s)\big) = \min(t, s)\, I. \qquad (12)$$

From the formula for the Gaussian density $g(x; t - s)$, eqn. (10), we immediately conclude that $W(t) - W(s)$ and $W(t+u) - W(s+u)$ have the same pdf. Consequently, Brownian motion has stationary increments. Notice, however, that Brownian motion itself is not a stationary process. Since $W(t) = W(t) - W(0)$, the pdf of $W(t)$ is

$$g(x; t) = \frac{1}{\sqrt{2\pi t}}\, e^{-x^2/2t}.$$

We can easily calculate all moments of Brownian motion:

$$E(W^n(t)) = \frac{1}{\sqrt{2\pi t}}\int_{-\infty}^{+\infty} x^n e^{-x^2/2t}\, dx = \begin{cases} 1 \cdot 3 \cdots (n-1)\, t^{n/2}, & n \text{ even}, \\ 0, & n \text{ odd}. \end{cases}$$
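The moment formula is easy to test by Monte Carlo (a sketch; the case $n = 4$, $t = 1$ is an arbitrary choice), where it predicts $E\,W^4(1) = 1 \cdot 3 = 3$:

```python
import numpy as np

# Monte Carlo check of E W^n(t) = 1*3*...*(n-1) t^{n/2} for n = 4, t = 1.
rng = np.random.default_rng(2)

t, n_samples = 1.0, 400_000
W1 = rng.standard_normal(n_samples) * np.sqrt(t)   # W(1) ~ N(0, 1)
m4 = np.mean(W1**4)
print(m4)   # close to 3
```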

We can define the OU process through the Brownian motion via a time change.

Lemma Let $W(t)$ be a standard Brownian motion and consider the process

$$V(t) = e^{-t} W(e^{2t}).$$

Then $V(t)$ is a Gaussian second order stationary process with mean $0$ and covariance

$$K(s, t) = e^{-|t-s|}.$$
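The covariance claim can be verified by Monte Carlo (a sketch; the choice of times $0$ and $0.5$ is illustrative). Sampling $W$ at the two required times $e^0 = 1$ and $e^1 = e$ via independent Gaussian increments, $E\,V(0)V(0.5)$ should match $e^{-0.5}$:

```python
import numpy as np

# Check that V(t) = e^{-t} W(e^{2t}) has covariance e^{-|t-s|} at s = 0, t = 0.5.
rng = np.random.default_rng(3)

n = 400_000
W1 = rng.standard_normal(n)                               # W(1) ~ N(0, 1)
We = W1 + rng.standard_normal(n) * np.sqrt(np.e - 1.0)    # W(e) = W(1) + indep. increment

V0 = W1                        # V(0)   = e^0     * W(e^0)
V05 = np.exp(-0.5) * We        # V(0.5) = e^{-1/2} * W(e^1)

cov_mc = np.mean(V0 * V05)
print(cov_mc, np.exp(-0.5))    # both ~ 0.6065
```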

Definition A (normalized) fractional Brownian motion $W_t^H$, $t \geq 0$, with Hurst parameter $H \in (0, 1)$ is a centered Gaussian process with continuous sample paths whose covariance is given by

$$E(W_t^H W_s^H) = \tfrac{1}{2}\left(s^{2H} + t^{2H} - |t-s|^{2H}\right). \qquad (13)$$

Fractional Brownian motion has the following properties.
1. When $H = \tfrac12$, $W_t^{1/2}$ becomes the standard Brownian motion.
2. $W_0^H = 0$, $E W_t^H = 0$, $E(W_t^H)^2 = t^{2H}$, $t \geq 0$.
3. It has stationary increments, $E(W_t^H - W_s^H)^2 = |t-s|^{2H}$.
4. It has the following self similarity property: $(W_{\alpha t}^H,\, t \geq 0) = (\alpha^H W_t^H,\, t \geq 0)$, $\alpha > 0$, where the equivalence is in law.

The Poisson Process

Another fundamental continuous time process is the Poisson process:

Definition The Poisson process with intensity $\lambda$, denoted by $N(t)$, is an integer-valued, continuous time, stochastic process with independent increments satisfying

$$P[(N(t) - N(s)) = k] = \frac{e^{-\lambda(t-s)}\big(\lambda(t-s)\big)^k}{k!}, \qquad t > s \geq 0,\; k \in \mathbb{N}.$$

Both Brownian motion and the Poisson process are homogeneous (or time-homogeneous): the increments between successive times $s$ and $t$ depend only on $t - s$.
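The Poisson process can be constructed from i.i.d. $\mathrm{Exp}(\lambda)$ interarrival times (a standard construction, sketched here with illustrative parameters $\lambda = 3$, $t = 1$); the distribution above then implies $E\,N(t) = \operatorname{Var} N(t) = \lambda t$:

```python
import numpy as np

# Build N(t) from exponential interarrival times and check E N(t) = lambda t.
rng = np.random.default_rng(4)

lam, t, n_paths = 3.0, 1.0, 100_000
# 30 interarrivals per path suffice: P(N(1) > 30) is negligible for lam = 3.
arrivals = np.cumsum(rng.exponential(1.0 / lam, size=(n_paths, 30)), axis=1)
N_t = np.sum(arrivals <= t, axis=1)   # N(t) = number of arrivals by time t

print(N_t.mean(), N_t.var(), lam * t)   # mean and variance both ~ 3
```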

A very useful result is that we can expand every centered ($EX_t = 0$) stochastic process with continuous covariance (and hence, every $L^2$-continuous centered stochastic process) into a random Fourier series. Let $(\Omega, \mathcal{F}, P)$ be a probability space and $\{X_t\}_{t \in T}$ a centered process which is mean square continuous. Then, the Karhunen-Loève theorem states that $X_t$ can be expanded in the form

$$X_t = \sum_{n=1}^{\infty} \xi_n \phi_n(t), \qquad (14)$$

where the $\xi_n$ are an orthogonal sequence of random variables with $E|\xi_k|^2 = \lambda_k$, and $\{\lambda_k, \phi_k\}_{k=1}^{\infty}$ are the eigenvalues and eigenfunctions of the integral operator whose kernel is the covariance of the process $X_t$. The convergence is in $L^2(P)$ for every $t \in T$. If $X_t$ is a Gaussian process, then the $\xi_k$ are independent Gaussian random variables.

Set $T = [0, 1]$. Let $\mathcal{K}: L^2(T) \to L^2(T)$ be defined by

$$\mathcal{K}\psi(t) = \int_0^1 K(s, t)\psi(s)\, ds.$$

The kernel of this integral operator is a continuous (in both $s$ and $t$), symmetric, nonnegative function. Hence, the corresponding integral operator has eigenvalues and eigenfunctions

$$\mathcal{K}\phi_n = \lambda_n \phi_n, \qquad \lambda_n \geq 0, \qquad (\phi_n, \phi_m)_{L^2} = \delta_{nm},$$

such that the covariance operator can be expanded in a uniformly convergent series

$$B(t, s) = \sum_{n=1}^{\infty} \lambda_n \phi_n(t)\phi_n(s).$$

The random variables $\xi_n$ are defined as

$$\xi_n = \int_0^1 X(t)\phi_n(t)\, dt.$$

The orthogonality of the eigenfunctions of the covariance operator implies that

$$E(\xi_n \xi_m) = \lambda_n \delta_{nm}.$$

Thus the random variables are orthogonal. When $X(t)$ is Gaussian, the $\xi_k$ are Gaussian random variables. Furthermore, since they are also orthogonal (hence uncorrelated), they are independent Gaussian random variables.

Example (The Karhunen-Loève Expansion for Brownian Motion) We set $T = [0, 1]$. The covariance function of Brownian motion is $C(t, s) = \min(t, s)$. The eigenvalue problem $\mathcal{C}\psi_n = \lambda_n \psi_n$ becomes

$$\int_0^1 \min(t, s)\psi_n(s)\, ds = \lambda_n \psi_n(t).$$

Or,

$$\int_0^t s\psi_n(s)\, ds + t\int_t^1 \psi_n(s)\, ds = \lambda_n \psi_n(t).$$

We differentiate this equation twice:

$$\int_t^1 \psi_n(s)\, ds = \lambda_n \psi_n'(t) \qquad \text{and} \qquad -\psi_n(t) = \lambda_n \psi_n''(t),$$

where primes denote differentiation with respect to $t$.

Example (Continued) From the above equations we immediately see that the right boundary conditions are $\psi(0) = \psi'(1) = 0$. The eigenvalues and eigenfunctions are

$$\psi_n(t) = \sqrt{2}\,\sin\!\left(\frac{(2n+1)\pi t}{2}\right), \qquad \lambda_n = \left(\frac{2}{(2n+1)\pi}\right)^2, \qquad n = 0, 1, 2, \dots$$

Thus, the Karhunen-Loève expansion of Brownian motion on $[0, 1]$ is

$$W_t = \sqrt{2}\sum_{n=0}^{\infty} \xi_n \frac{2}{(2n+1)\pi} \sin\!\left(\frac{(2n+1)\pi t}{2}\right), \qquad (15)$$

where the $\xi_n$ are independent standard normal random variables.
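A deterministic sanity check of this expansion (a sketch, not part of the notes) uses Mercer's theorem: the series $\sum_n \lambda_n \psi_n(t)^2$ should reproduce $C(t, t) = \min(t, t) = t$. Evaluating the partial sum at $t = 0.5$:

```python
import numpy as np

# Mercer check: sum_n lambda_n psi_n(t)^2 converges to C(t, t) = t.
t = 0.5
n = np.arange(0, 20_000)
lam = (2.0 / ((2 * n + 1) * np.pi))**2
psi_t = np.sqrt(2.0) * np.sin((2 * n + 1) * np.pi * t / 2.0)

partial = np.sum(lam * psi_t**2)
print(partial)   # converges to 0.5 as more terms are added
```

The tail of the series decays like $1/n$, so a few thousand terms already give several digits.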

The Path Space

Let $(\Omega, \mathcal{F}, \mu)$ be a probability space, $(E, \rho)$ a metric space and let $T = [0, \infty)$. Let $\{X_t\}$ be a stochastic process from $(\Omega, \mathcal{F}, \mu)$ to $(E, \rho)$ with continuous sample paths. The above means that for every $\omega \in \Omega$ we have that $X_t \in C_E := C([0, \infty); E)$. The space of continuous functions $C_E$ is called the path space of the stochastic process. We can put a metric on $C_E$ as follows:

$$\rho_E(X^1, X^2) := \sum_{n=1}^{\infty} \frac{1}{2^n} \max_{0 \leq t \leq n} \min\big(\rho(X_t^1, X_t^2),\, 1\big).$$

We can then define the Borel sets on $C_E$, using the topology induced by this metric, and $\{X_t\}$ can be thought of as a random variable on $(\Omega, \mathcal{F}, \mu)$ with state space $(C_E, \mathcal{B}(C_E))$.

The probability measure $P X_t^{-1}$ on $(C_E, \mathcal{B}(C_E))$ is called the law of $\{X_t\}$. The law of a stochastic process is a probability measure on its path space.

Example The space of continuous functions $C_E$ is the path space of Brownian motion (the Wiener process). The law of Brownian motion, that is the measure that it induces on $C([0, \infty), \mathbb{R}^d)$, is known as the Wiener measure.

MARKOV STOCHASTIC PROCESSES

Let $(\Omega, \mathcal{F})$ be a measurable space and $T$ an ordered set. Let $X = X_t(\omega)$ be a stochastic process from the sample space $(\Omega, \mathcal{F})$ to the state space $(E, \mathcal{G})$. It is a function of two variables, $t \in T$ and $\omega \in \Omega$. For a fixed $\omega \in \Omega$ the function $X_t(\omega)$, $t \in T$, is the sample path of the process $X$ associated with $\omega$. Let $K$ be a collection of subsets of $\Omega$. The smallest σ-algebra on $\Omega$ which contains $K$ is denoted by $\sigma(K)$ and is called the σ-algebra generated by $K$. Let $X_t: \Omega \to E$, $t \in T$. The smallest σ-algebra $\sigma(X_t,\, t \in T)$, such that the family of mappings $\{X_t,\, t \in T\}$ is a stochastic process with sample space $(\Omega, \sigma(X_t,\, t \in T))$ and state space $(E, \mathcal{G})$, is called the σ-algebra generated by $\{X_t,\, t \in T\}$.

A filtration on $(\Omega, \mathcal{F})$ is a nondecreasing family $\{\mathcal{F}_t,\, t \in T\}$ of sub-σ-algebras of $\mathcal{F}$: $\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}$ for $s \leq t$. We set $\mathcal{F}_\infty = \sigma(\cup_{t \in T} \mathcal{F}_t)$. The filtration generated by $X_t$, where $X_t$ is a stochastic process, is

$$\mathcal{F}_t^X := \sigma(X_s;\; s \leq t).$$

We say that a stochastic process $\{X_t;\, t \in T\}$ is adapted to the filtration $\{\mathcal{F}_t\} := \{\mathcal{F}_t,\, t \in T\}$ if for all $t \in T$, $X_t$ is an $\mathcal{F}_t$-measurable random variable.

Definition Let $\{X_t\}$ be a stochastic process defined on a probability space $(\Omega, \mathcal{F}, \mu)$ with values in $E$ and let $\mathcal{F}_t^X$ be the filtration generated by $\{X_t\}$. Then $\{X_t\}$ is a Markov process if

$$P(X_t \in \Gamma \mid \mathcal{F}_s^X) = P(X_t \in \Gamma \mid X_s) \qquad (16)$$

for all $t, s \in T$ with $t \geq s$, and $\Gamma \in \mathcal{B}(E)$.

The filtration $\mathcal{F}_t^X$ is generated by events of the form

$$\{\omega \mid X_{s_1} \in B_1,\; X_{s_2} \in B_2,\; \dots,\; X_{s_n} \in B_n\},$$

with $0 \leq s_1 < s_2 < \dots < s_n \leq s$ and $B_i \in \mathcal{B}(E)$. The definition of a Markov process is thus equivalent to the hierarchy of equations

$$P(X_t \in \Gamma \mid X_{t_1}, X_{t_2}, \dots, X_{t_n}) = P(X_t \in \Gamma \mid X_{t_n}) \quad \text{a.s.}$$

for $n \geq 1$, $0 \leq t_1 < t_2 < \dots < t_n \leq t$ and $\Gamma \in \mathcal{B}(E)$.

Roughly speaking, the statistics of $X_t$ for $t \geq s$ are completely determined once $X_s$ is known; information about $X_t$ for $t < s$ is superfluous. In other words: a Markov process has no memory. More precisely: when a Markov process is conditioned on the present state, there is no memory of the past. The past and future of a Markov process are statistically independent when the present is known. A typical example of a Markov process is the random walk: in order to find the position $x(t+1)$ of the random walker at time $t+1$ it is enough to know its position $x(t)$ at time $t$; how it got to $x(t)$ is irrelevant. A non-Markovian process $X_t$ can be described through a Markovian one $Y_t$ by enlarging the state space: the additional variables that we introduce account for the memory in $X_t$. This "Markovianization" trick is very useful since there are many more tools for analyzing Markovian processes.

With a Markov process $\{X_t\}$ we can associate a function $P: T \times T \times E \times \mathcal{B}(E) \to \mathbb{R}^+$ defined through the relation

$$P\big[X_t \in \Gamma \mid \mathcal{F}_s^X\big] = P(s, t, X_s, \Gamma),$$

for all $t, s \in T$ with $t \geq s$ and all $\Gamma \in \mathcal{B}(E)$. Assume that $X_s = x$. Since $P\big[X_t \in \Gamma \mid \mathcal{F}_s^X\big] = P[X_t \in \Gamma \mid X_s]$, we can write

$$P(\Gamma, t \mid x, s) = P[X_t \in \Gamma \mid X_s = x].$$

The transition function $P(\Gamma, t \mid x, s)$ is (for fixed $t$, $x$, $s$) a probability measure on $E$ with $P(E, t \mid x, s) = 1$; it is $\mathcal{B}(E)$-measurable in $x$ (for fixed $t$, $s$, $\Gamma$) and satisfies the Chapman–Kolmogorov equation

$$P(\Gamma, t \mid x, s) = \int_E P(\Gamma, t \mid y, u)\, P(dy, u \mid x, s), \qquad (17)$$

for all $x \in E$, $\Gamma \in \mathcal{B}(E)$ and $s, u, t \in T$ with $s \leq u \leq t$.

The derivation of the Chapman-Kolmogorov equation is based on the assumption of Markovianity and on properties of the conditional probability:
1. Let $(\Omega, \mathcal{F}, \mu)$ be a probability space, $X$ a random variable from $(\Omega, \mathcal{F}, \mu)$ to $(E, \mathcal{G})$ and let $\mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \mathcal{F}$. Then

$$E(E(X \mid \mathcal{F}_2) \mid \mathcal{F}_1) = E(E(X \mid \mathcal{F}_1) \mid \mathcal{F}_2) = E(X \mid \mathcal{F}_1). \qquad (18)$$

2. Given $\mathcal{G} \subseteq \mathcal{F}$ we define the function $P_X(B \mid \mathcal{G}) = P(X \in B \mid \mathcal{G})$ for $B \in \mathcal{F}$. Assume that $f$ is such that $E(f(X)) < \infty$. Then

$$E(f(X) \mid \mathcal{G}) = \int_{\mathbb{R}} f(x)\, P_X(dx \mid \mathcal{G}). \qquad (19)$$

Now we use the Markov property, together with equations (18) and (19) and the fact that $s < u$ implies $\mathcal{F}_s^X \subseteq \mathcal{F}_u^X$, to calculate:

$$\begin{aligned}
P(\Gamma, t \mid x, s) &:= P(X_t \in \Gamma \mid X_s = x) = P(X_t \in \Gamma \mid \mathcal{F}_s^X) \\
&= E(I_\Gamma(X_t) \mid \mathcal{F}_s^X) = E(E(I_\Gamma(X_t) \mid \mathcal{F}_u^X) \mid \mathcal{F}_s^X) \\
&= E(P(X_t \in \Gamma \mid X_u) \mid \mathcal{F}_s^X) = E(P(X_t \in \Gamma \mid X_u = y) \mid X_s = x) \\
&= \int_{\mathbb{R}} P(\Gamma, t \mid y, u)\, P(dy, u \mid x, s).
\end{aligned}$$

$I_\Gamma(\cdot)$ denotes the indicator function of the set $\Gamma$. We have also set $E = \mathbb{R}$.

The CK equation is an integral equation and is the fundamental equation in the theory of Markov processes. Under additional assumptions we will derive from it the Fokker-Planck PDE, which is the fundamental equation in the theory of diffusion processes, and will be the main object of study in this course. A Markov process is homogeneous if

$$P(t, \Gamma \mid X_s = x) := P(s, t, x, \Gamma) = P(0, t-s, x, \Gamma).$$

We set $P(0, t, \cdot, \cdot) = P(t, \cdot, \cdot)$. The Chapman–Kolmogorov (CK) equation becomes

$$P(t+s, x, \Gamma) = \int_E P(s, x, dz)\, P(t, z, \Gamma). \qquad (20)$$

Let $X_t$ be a homogeneous Markov process and assume that the initial distribution of $X_t$ is given by the probability measure $\nu(\Gamma) = P(X_0 \in \Gamma)$ (for deterministic initial conditions $X_0 = x$ we have that $\nu(\Gamma) = I_\Gamma(x)$). The transition function $P(x, t, \Gamma)$ and the initial distribution $\nu$ determine the finite dimensional distributions of $X$ by

$$P(X_0 \in \Gamma_0,\, X_{t_1} \in \Gamma_1,\, \dots,\, X_{t_n} \in \Gamma_n) = \int_{\Gamma_0}\int_{\Gamma_1}\cdots\int_{\Gamma_{n-1}} P(t_n - t_{n-1}, y_{n-1}, \Gamma_n)\, P(t_{n-1} - t_{n-2}, y_{n-2}, dy_{n-1}) \cdots P(t_1, y_0, dy_1)\, \nu(dy_0). \qquad (21)$$

Theorem (Ethier and Kurtz 1986, Sec. 4.1) Let $P(t, x, \Gamma)$ satisfy (20) and assume that $(E, \rho)$ is a complete separable metric space. Then there exists a Markov process $X$ in $E$ whose finite-dimensional distributions are uniquely determined by (21).

Let $X_t$ be a homogeneous Markov process with initial distribution $\nu(\Gamma) = P(X_0 \in \Gamma)$ and transition function $P(x, t, \Gamma)$. We can calculate the probability of finding $X_t$ in a set $\Gamma$ at time $t$:

$$P(X_t \in \Gamma) = \int_E P(x, t, \Gamma)\, \nu(dx).$$

Thus, the initial distribution and the transition function are sufficient to characterize a homogeneous Markov process. Notice that they do not provide us with any information about the actual paths of the Markov process.

The transition probability $P(\Gamma, t \mid x, s)$ is a probability measure. Assume that it has a density for all $t > s$:

$$P(\Gamma, t \mid x, s) = \int_\Gamma p(y, t \mid x, s)\, dy.$$

Clearly, for $t = s$ we have $P(\Gamma, s \mid x, s) = I_\Gamma(x)$. The Chapman-Kolmogorov equation becomes:

$$\int_\Gamma p(y, t \mid x, s)\, dy = \int_\Gamma \int_{\mathbb{R}} p(y, t \mid z, u)\, p(z, u \mid x, s)\, dz\, dy,$$

and, since $\Gamma \in \mathcal{B}(\mathbb{R})$ is arbitrary, we obtain the equation

$$p(y, t \mid x, s) = \int_{\mathbb{R}} p(y, t \mid z, u)\, p(z, u \mid x, s)\, dz. \qquad (22)$$

The transition probability density is a function of 4 arguments: the initial position and time $x, s$ and the final position and time $y, t$.
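Equation (22) can be checked numerically for Brownian motion (a sketch; the times and positions are arbitrary choices), whose transition density is $p(y, t \mid x, s) = g(y - x; t - s)$ with $g$ a centered Gaussian of variance $t - s$. Convolving the densities for $s \to u$ and $u \to t$ over the intermediate state $z$ should reproduce the density for $s \to t$:

```python
import numpy as np

# Numerical check of the CK equation (22) for Brownian motion.
def g(x, tau):
    return np.exp(-x**2 / (2.0 * tau)) / np.sqrt(2.0 * np.pi * tau)

s, u, t = 0.0, 0.4, 1.0
x, y = 0.3, -0.2

z = np.linspace(-15.0, 15.0, 60_001)
dz = z[1] - z[0]
integrand = g(y - z, t - u) * g(z - x, u - s)
lhs = np.sum(integrand) * dz   # right-hand side of (22), on a grid over z
rhs = g(y - x, t - s)          # p(y, t | x, s) directly

print(lhs, rhs)                # should agree to high accuracy
```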

In words, the CK equation tells us that, for a Markov process, the transition from $x, s$ to $y, t$ can be done in two steps: first the system moves from $x$ to $z$ at some intermediate time $u$; then it moves from $z$ to $y$ at time $t$. In order to calculate the probability for the transition from $(x, s)$ to $(y, t)$ we need to sum (integrate) the transitions over all possible intermediary states $z$. The above description suggests that a Markov process can be described through a semigroup of operators, i.e. a one-parameter family of linear operators with the properties

$$P_0 = I, \qquad P_{t+s} = P_t \circ P_s, \quad \forall\, t, s \geq 0.$$

A semigroup of operators is characterized through its generator.

Indeed, let $P(t, x, dy)$ be the transition function of a homogeneous Markov process. It satisfies the CK equation (20):

$$P(t+s, x, \Gamma) = \int_E P(s, x, dz)\, P(t, z, \Gamma).$$

Let $X := C_b(E)$ and define the operator

$$(P_t f)(x) := E(f(X_t) \mid X_0 = x) = \int_E f(y)\, P(t, x, dy).$$

This is a linear operator with

$$(P_0 f)(x) = E(f(X_0) \mid X_0 = x) = f(x), \qquad \text{i.e. } P_0 = I.$$

Furthermore:

$$\begin{aligned}
(P_{t+s} f)(x) &= \int f(y)\, P(t+s, x, dy) \\
&= \int\!\!\int f(y)\, P(s, z, dy)\, P(t, x, dz) \\
&= \int \left(\int f(y)\, P(s, z, dy)\right) P(t, x, dz) \\
&= \int (P_s f)(z)\, P(t, x, dz) = (P_t P_s f)(x).
\end{aligned}$$

Consequently:

$$P_{t+s} = P_t \circ P_s.$$
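The semigroup property can be illustrated numerically for the Brownian (heat) semigroup, where $(P_t f)(x) = \int f(y)\, g(y-x; t)\, dy$ (a sketch; the test function $f(x) = \cos x$ and the times are arbitrary choices, and for this $f$ the exact answer is $e^{-t/2}\cos x$). Applying $P_{0.2}$ then $P_{0.3}$ on a grid should match $P_{0.5}$ applied directly:

```python
import numpy as np

# Numerical illustration of P_{t+s} = P_t P_s for the heat semigroup.
def heat(f_vals, y, tau, x_eval):
    # (P_tau f)(x) evaluated by quadrature on the grid y
    dy = y[1] - y[0]
    kern = np.exp(-(y[None, :] - x_eval[:, None])**2 / (2.0 * tau)) \
           / np.sqrt(2.0 * np.pi * tau)
    return (kern * f_vals[None, :]).sum(axis=1) * dy

y = np.linspace(-20.0, 20.0, 2001)
f = np.cos(y)

step = heat(heat(f, y, 0.2, y), y, 0.3, y)   # P_0.3 (P_0.2 f)
direct = heat(f, y, 0.5, y)                  # P_0.5 f

i0 = len(y) // 2                             # index of x = 0
print(step[i0], direct[i0], np.exp(-0.25))   # all ~ 0.7788
```

Boundary truncation only affects grid points far from $x = 0$, where the Gaussian weight is negligible, so the two computations agree essentially to machine precision at the center.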

Let $(E, \rho)$ be a metric space and let $\{X_t\}$ be an $E$-valued homogeneous Markov process. Define the one parameter family of operators $P_t$ through

$$P_t f(x) = \int f(y)\, P(t, x, dy) = E[f(X_t) \mid X_0 = x]$$

for all $f(x) \in C_b(E)$ (continuous bounded functions on $E$). Assume for simplicity that $P_t: C_b(E) \to C_b(E)$. Then the one-parameter family of operators $P_t$ forms a semigroup of operators on $C_b(E)$. We define by $D(\mathcal{L})$ the set of all $f \in C_b(E)$ such that the strong limit

$$\mathcal{L}f = \lim_{t \to 0} \frac{P_t f - f}{t}$$

exists.

Definition The operator $\mathcal{L}: D(\mathcal{L}) \to C_b(E)$ is called the infinitesimal generator of the operator semigroup $P_t$.

Definition The operator $\mathcal{L}: C_b(E) \to C_b(E)$ defined above is called the generator of the Markov process $\{X_t\}$.

The study of operator semigroups started in the late 40's independently by Hille and Yosida. Semigroup theory was developed in the 50's and 60's by Feller, Dynkin and others, mostly in connection to the theory of Markov processes. Necessary and sufficient conditions for an operator $\mathcal{L}$ to be the generator of a (contraction) semigroup are given by the Hille-Yosida theorem (e.g. Evans, Partial Differential Equations, AMS 1998, Ch. 7).

The semigroup property and the definition of the generator of a semigroup imply that, formally at least, we can write:

$$P_t = \exp(\mathcal{L}t).$$

Consider the function $u(x, t) := (P_t f)(x)$. We calculate its time derivative:

$$\frac{\partial u}{\partial t} = \frac{d}{dt}(P_t f) = \frac{d}{dt}\left(e^{\mathcal{L}t} f\right) = \mathcal{L}\left(e^{\mathcal{L}t} f\right) = \mathcal{L}P_t f = \mathcal{L}u.$$

Furthermore, $u(x, 0) = P_0 f(x) = f(x)$. Consequently, $u(x, t)$ satisfies the initial value problem

$$\frac{\partial u}{\partial t} = \mathcal{L}u, \qquad u(x, 0) = f(x). \qquad (23)$$

When the semigroup $P_t$ is the transition semigroup of a Markov process $X_t$, equation (23) is called the backward Kolmogorov equation. It governs the evolution of an observable

$$u(x, t) = E(f(X_t) \mid X_0 = x).$$

Thus, given the generator $\mathcal{L}$ of a Markov process, we can calculate all the statistics of our process by solving the backward Kolmogorov equation. In the case where the Markov process is the solution of a stochastic differential equation, the generator is a second order elliptic operator and the backward Kolmogorov equation becomes an initial value problem for a parabolic PDE.

The space $C_b(E)$ is natural in a probabilistic context, but other Banach spaces often arise in applications; in particular, when there is a measure $\mu$ on $E$, the spaces $L^p(E; \mu)$ sometimes arise. We will quite often use the space $L^2(E; \mu)$, where $\mu$ is the invariant measure of our Markov process. The generator is frequently taken as the starting point for the definition of a homogeneous Markov process. Conversely, let $P_t$ be a contraction semigroup (let $X$ be a Banach space and $T: X \to X$ a bounded operator; then $T$ is a contraction provided that $\|Tf\|_X \leq \|f\|_X$ for all $f \in X$), with $D(P_t) \subseteq C_b(E)$, closed. Then, under mild technical hypotheses, there is an $E$-valued homogeneous Markov process $\{X_t\}$ associated with $P_t$, defined through

$$E[f(X(t)) \mid \mathcal{F}_s^X] = P_{t-s} f(X(s))$$

for all $t, s \in T$ with $t \geq s$ and $f \in D(P_t)$.

Example The one dimensional Brownian motion is a homogeneous Markov process. The transition function is the Gaussian defined in the example in Lecture 2:

$$P(t, x, dy) = \gamma_{t,x}(y)\, dy, \qquad \gamma_{t,x}(y) = \frac{1}{\sqrt{2\pi t}}\exp\left(-\frac{|x-y|^2}{2t}\right).$$

The semigroup associated to the standard Brownian motion is the heat semigroup $P_t = e^{\frac{t}{2}\frac{d^2}{dx^2}}$. The generator of this Markov process is

$$\mathcal{L} = \frac{1}{2}\frac{d^2}{dx^2}.$$

Notice that the transition probability density $\gamma_{t,x}$ of the one dimensional Brownian motion is the fundamental solution (Green's function) of the heat (diffusion) PDE

$$\frac{\partial \gamma}{\partial t} = \frac{1}{2}\frac{\partial^2 \gamma}{\partial x^2}.$$
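The fundamental-solution claim is easy to verify by finite differences (a sketch; the evaluation point and step size are arbitrary choices): both sides of the heat equation, evaluated at a sample point of the Gaussian transition density, should nearly coincide:

```python
import numpy as np

# Finite-difference check that gamma(y; t) = exp(-(x-y)^2 / 2t) / sqrt(2 pi t)
# satisfies d(gamma)/dt = (1/2) d^2(gamma)/dy^2.
def gamma(y, t, x=0.0):
    return np.exp(-(x - y)**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

y0, t0, h = 0.7, 1.3, 1e-4

dgdt = (gamma(y0, t0 + h) - gamma(y0, t0 - h)) / (2.0 * h)
d2gdy2 = (gamma(y0 + h, t0) - 2.0 * gamma(y0, t0) + gamma(y0 - h, t0)) / h**2

print(dgdt, 0.5 * d2gdy2)   # the two sides of the heat equation, nearly equal
```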

Example The Ornstein-Uhlenbeck process V t = e t W(e 2t ) is a homogeneous Markov process. The transition probability density is the Gaussian p(y, t x, s) := p(v t = y V s = x) ( ) 1 = 2π(1 e 2(t s) ) exp y xe (t s) 2 2(1 e 2(t s). ) The semigroup associated to the Ornstein-Uhlenbeck process is d 2 P t = e t( x d dx + 1 2 dx 2 ). The generator of this Markov process is L = x d dx + 1 d 2 2 dx 2. Notice that the transition probability density p(t, x) of the 1d OU process is the fundamental solution (Green s function) of the heat (diffusion) PDE p t = (xp) x + 1 2 p 2 x 2. Pavliotis (IC) StochProc January 16, 2011 101 / 367