ELEMENTS OF PROBABILITY THEORY
A collection of subsets of a set $\Omega$ is called a $\sigma$-algebra if it contains $\Omega$ and is closed under the operations of taking complements and countable unions of its elements. A sub-$\sigma$-algebra is a collection of subsets of a $\sigma$-algebra which satisfies the axioms of a $\sigma$-algebra. A measurable space is a pair $(\Omega, \mathcal{F})$ where $\Omega$ is a set and $\mathcal{F}$ is a $\sigma$-algebra of subsets of $\Omega$.

Let $(\Omega, \mathcal{F})$ and $(E, \mathcal{G})$ be two measurable spaces. A function $X : \Omega \to E$ such that the event
$$\{\omega \in \Omega : X(\omega) \in A\} =: \{X \in A\}$$
belongs to $\mathcal{F}$ for arbitrary $A \in \mathcal{G}$ is called a measurable function or random variable.
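On a finite set the definitions above can be checked by brute force. The following is a minimal sketch, not part of the original notes: the helper names `is_sigma_algebra` and `is_measurable` and the four-point sample space are illustrative assumptions.

```python
from itertools import combinations

def is_sigma_algebra(omega, family):
    """Check the sigma-algebra axioms for a family of subsets of a finite set."""
    omega = frozenset(omega)
    family = {frozenset(a) for a in family}
    if omega not in family:
        return False
    # Closed under complements.
    if any(omega - a not in family for a in family):
        return False
    # On a finite set, closure under pairwise unions implies closure
    # under countable unions.
    return all(a | b in family for a, b in combinations(family, 2))

def is_measurable(X, omega, F, G):
    """X (a dict omega -> E) is measurable if the preimage of every A in G lies in F."""
    F = {frozenset(a) for a in F}
    return all(frozenset(w for w in omega if X[w] in A) in F for A in G)

omega = {1, 2, 3, 4}
F = [set(), {1, 2}, {3, 4}, omega]     # a sigma-algebra on omega
not_F = [set(), {1}, omega]            # fails: the complement {2, 3, 4} is missing

G = [set(), {0}, {1}, {0, 1}]          # all subsets of E = {0, 1}
X = {1: 0, 2: 0, 3: 1, 4: 1}           # constant on {1, 2} and on {3, 4}
Y = {1: 0, 2: 1, 3: 1, 4: 1}           # the preimage of {0} is {1}, which is not in F

print(is_sigma_algebra(omega, F), is_sigma_algebra(omega, not_F))
print(is_measurable(X, omega, F, G), is_measurable(Y, omega, F, G))
```

Here `X` is a random variable with respect to `F`, while `Y` is not, because the event $\{Y \in \{0\}\}$ does not belong to `F`.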
Let $(\Omega, \mathcal{F})$ be a measurable space. A function $\mu : \mathcal{F} \to [0, 1]$ is called a probability measure if $\mu(\emptyset) = 0$, $\mu(\Omega) = 1$ and
$$\mu\Big(\bigcup_{k=1}^{\infty} A_k\Big) = \sum_{k=1}^{\infty} \mu(A_k)$$
for all sequences of pairwise disjoint sets $\{A_k\}_{k=1}^{\infty} \subset \mathcal{F}$. The triplet $(\Omega, \mathcal{F}, \mu)$ is called a probability space.

Let $X$ be a random variable (measurable function) from $(\Omega, \mathcal{F}, \mu)$ to $(E, \mathcal{G})$. If $E$ is a metric space then we may define expectation with respect to the measure $\mu$ by
$$\mathbb{E}[X] = \int_{\Omega} X(\omega)\, d\mu(\omega).$$
More generally, let $f : E \to \mathbb{R}$ be $\mathcal{G}$-measurable. Then
$$\mathbb{E}[f(X)] = \int_{\Omega} f(X(\omega))\, d\mu(\omega).$$
Let $U$ be a topological space. We will use the notation $\mathcal{B}(U)$ to denote the Borel $\sigma$-algebra of $U$: the smallest $\sigma$-algebra containing all open sets of $U$. Every random variable from a probability space $(\Omega, \mathcal{F}, \mu)$ to a measurable space $(E, \mathcal{B}(E))$ induces a probability measure on $E$:
$$\mu_X(B) = \mathbb{P}X^{-1}(B) = \mu(\omega \in \Omega;\ X(\omega) \in B), \quad B \in \mathcal{B}(E).$$
The measure $\mu_X$ is called the distribution (or sometimes the law) of $X$.

Example 1 Let $I$ denote a subset of the positive integers. A vector $\rho_0 = \{\rho_{0,i},\ i \in I\}$ is a distribution on $I$ if it has nonnegative entries and its total mass equals 1: $\sum_{i \in I} \rho_{0,i} = 1$.
We can use the distribution of a random variable to compute expectations and probabilities:
$$\mathbb{E}[f(X)] = \int_{E} f(x)\, d\mu_X(x)$$
and
$$\mathbb{P}[X \in G] = \int_{G} d\mu_X(x), \quad G \in \mathcal{B}(E).$$
When $E = \mathbb{R}^d$ and we can write $d\mu_X(x) = \rho(x)\, dx$, we refer to $\rho(x)$ as the probability density function (pdf), or density with respect to Lebesgue measure, of $X$.

When $E = \mathbb{R}^d$ then by $L^p(\Omega; \mathbb{R}^d)$, or sometimes $L^p(\Omega; \mu)$ or even simply $L^p(\mu)$, we mean the Banach space of measurable functions on $\Omega$ with norm
$$\|X\|_{L^p} = \big(\mathbb{E}|X|^p\big)^{1/p}.$$
Example 2 i) Consider the random variable $X : \Omega \to \mathbb{R}$ with pdf
$$\gamma_{\sigma,m}(x) := (2\pi\sigma)^{-\frac{1}{2}} \exp\Big(-\frac{(x - m)^2}{2\sigma}\Big).$$
Such an $X$ is termed a Gaussian or normal random variable. The mean is
$$\mathbb{E}X = \int_{\mathbb{R}} x\, \gamma_{\sigma,m}(x)\, dx = m$$
and the variance is
$$\mathbb{E}(X - m)^2 = \int_{\mathbb{R}} (x - m)^2\, \gamma_{\sigma,m}(x)\, dx = \sigma.$$
Since the mean and variance specify completely a Gaussian random variable on $\mathbb{R}$, the Gaussian is commonly denoted by $\mathcal{N}(m, \sigma)$. The standard normal random variable is $\mathcal{N}(0, 1)$.
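The two integrals in Example 2 i) can be checked numerically. The sketch below is an illustration only: the values of `m` and `sigma`, the truncation of the real line to twelve standard deviations either side of the mean, and the midpoint rule are all assumptions made here. Note that `sigma` denotes the variance, as in the notes.

```python
import math

def gaussian_pdf(x, m, sigma):
    """Density of N(m, sigma), where sigma denotes the variance."""
    return math.exp(-(x - m) ** 2 / (2.0 * sigma)) / math.sqrt(2.0 * math.pi * sigma)

def integrate(h, a, b, n=20000):
    """Midpoint rule for the integral of h over [a, b]."""
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

m, sigma = 1.5, 4.0
half_width = 12.0 * math.sqrt(sigma)   # the tails beyond this are negligible
a, b = m - half_width, m + half_width

mass = integrate(lambda x: gaussian_pdf(x, m, sigma), a, b)
mean = integrate(lambda x: x * gaussian_pdf(x, m, sigma), a, b)
var = integrate(lambda x: (x - m) ** 2 * gaussian_pdf(x, m, sigma), a, b)

print(mass, mean, var)   # should be close to 1, m and sigma respectively
```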
ii) Let $m \in \mathbb{R}^d$ and $\Sigma \in \mathbb{R}^{d \times d}$ be symmetric and positive definite. The random variable $X : \Omega \to \mathbb{R}^d$ with pdf
$$\gamma_{\Sigma,m}(x) := \big((2\pi)^d \det\Sigma\big)^{-\frac{1}{2}} \exp\Big(-\frac{1}{2} \big\langle \Sigma^{-1}(x - m), (x - m) \big\rangle\Big)$$
is termed a multivariate Gaussian or normal random variable. The mean is
$$\mathbb{E}(X) = m \qquad (1)$$
and the covariance matrix is
$$\mathbb{E}\big((X - m) \otimes (X - m)\big) = \Sigma. \qquad (2)$$
Since the mean and covariance matrix completely specify a Gaussian random variable on $\mathbb{R}^d$, the Gaussian is commonly denoted by $\mathcal{N}(m, \Sigma)$.
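Example 3 An exponential random variable $T : \Omega \to \mathbb{R}^+$ with rate $\lambda > 0$ satisfies
$$\mathbb{P}(T > t) = e^{-\lambda t}, \quad \forall t \geq 0.$$
We write $T \sim \exp(\lambda)$. The related pdf is
$$f_T(t) = \begin{cases} \lambda e^{-\lambda t}, & t \geq 0, \\ 0, & t < 0. \end{cases} \qquad (3)$$
Notice that
$$\mathbb{E}T = \int_{-\infty}^{\infty} t f_T(t)\, dt = \frac{1}{\lambda} \int_0^{\infty} (\lambda t) e^{-\lambda t}\, d(\lambda t) = \frac{1}{\lambda}.$$
If the times $\tau_n = t_{n+1} - t_n$ are i.i.d. random variables with $\tau_0 \sim \exp(\lambda)$ then, for $t_0 = 0$,
$$t_n = \sum_{k=0}^{n-1} \tau_k$$
and it is possible to show that
$$\mathbb{P}(0 \leq t_k \leq t < t_{k+1}) = \frac{e^{-\lambda t} (\lambda t)^k}{k!}. \qquad (4)$$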
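Formula (4) can be illustrated by simulation: summing i.i.d. exponential waiting times and counting how many arrival times fall in $[0, t]$ should reproduce the probabilities on the right-hand side. The following Monte Carlo sketch uses only Python's standard `random` module; the rate, time horizon, seed and number of trials are arbitrary choices made for illustration.

```python
import math
import random

def count_arrivals(rng, lam, t):
    """Number k of arrival times with t_k <= t < t_{k+1}, where t_0 = 0."""
    clock, k = 0.0, 0
    while True:
        clock += rng.expovariate(lam)   # waiting time tau ~ exp(lam)
        if clock > t:
            return k
        k += 1

def poisson_pmf(k, lam, t):
    """Right-hand side of (4)."""
    return math.exp(-lam * t) * (lam * t) ** k / math.factorial(k)

rng = random.Random(0)
lam, t, trials = 2.0, 1.5, 20000
counts = [count_arrivals(rng, lam, t) for _ in range(trials)]

empirical = {k: counts.count(k) / trials for k in range(8)}
max_gap = max(abs(empirical[k] - poisson_pmf(k, lam, t)) for k in range(8))
mean_count = sum(counts) / trials       # theoretical value: lam * t

print(max_gap, mean_count)
```

The agreement is only statistical: with this many trials the empirical frequencies should match (4) to roughly two decimal places.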
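Assume that $\mathbb{E}|X| < \infty$ and let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. The conditional expectation of $X$ with respect to $\mathcal{G}$ is defined to be the function $\mathbb{E}[X|\mathcal{G}] : \Omega \to E$ which is $\mathcal{G}$-measurable and satisfies
$$\int_G \mathbb{E}[X|\mathcal{G}]\, d\mu = \int_G X\, d\mu \quad \forall G \in \mathcal{G}.$$
We can define $\mathbb{E}[f(X)|\mathcal{G}]$ and the conditional probability $\mathbb{P}[X \in F|\mathcal{G}] = \mathbb{E}[I_F(X)|\mathcal{G}]$, where $I_F$ is the indicator function of $F$, in a similar manner.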
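On a finite probability space the definition becomes concrete: if $\mathcal{G}$ is generated by a partition of $\Omega$, then $\mathbb{E}[X|\mathcal{G}]$ is constant on each block of the partition and equals the $\mu$-weighted average of $X$ over that block. The sketch below assumes a six-point space with the uniform measure and a hypothetical three-block partition, and verifies the defining identity exactly using rational arithmetic.

```python
from fractions import Fraction

omega = range(6)
mu = {w: Fraction(1, 6) for w in omega}   # uniform probability measure
X = {w: w * w for w in omega}             # the random variable X(w) = w^2
blocks = [{0, 1}, {2, 3}, {4, 5}]         # partition generating the sub-sigma-algebra G

def cond_exp(X, mu, blocks):
    """E[X|G] when G is generated by a finite partition: block-wise averages."""
    out = {}
    for B in blocks:
        avg = sum(X[w] * mu[w] for w in B) / sum(mu[w] for w in B)
        for w in B:
            out[w] = avg
    return out

def integral(Y, mu, A):
    """Integral of Y over the set A with respect to mu."""
    return sum(Y[w] * mu[w] for w in A)

Y = cond_exp(X, mu, blocks)

# Defining property: the integrals of E[X|G] and X agree on every set in G.
# Checking the generating blocks suffices, since every set in G is a union of blocks.
for B in blocks:
    print(sorted(B), integral(Y, mu, B), integral(X, mu, B))
```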
ELEMENTS OF THE THEORY OF STOCHASTIC PROCESSES
Definition of a Stochastic Process

Let $T$ be an ordered set. A stochastic process is a collection of random variables $X = \{X_t;\ t \in T\}$ where, for each fixed $t \in T$, $X_t$ is a random variable from $(\Omega, \mathcal{F})$ to $(E, \mathcal{G})$. The measurable space $(\Omega, \mathcal{F})$ is called the sample space. The space $(E, \mathcal{G})$ is called the state space.

In this course we will take the set $T$ to be $[0, +\infty)$. The state space $E$ will usually be $\mathbb{R}^d$ equipped with the $\sigma$-algebra of Borel sets.

A stochastic process $X$ may be viewed as a function of both $t \in T$ and $\omega \in \Omega$. We will sometimes write $X(t)$, $X(t, \omega)$ or $X_t(\omega)$ instead of $X_t$. For a fixed sample point $\omega \in \Omega$, the function $t \mapsto X_t(\omega)$ from $T$ to $E$ is called a sample path (realization, trajectory) of the process $X$.
The finite dimensional distributions (fdd) of a stochastic process are the distributions of the $E^k$-valued random variables $(X(t_1), X(t_2), \dots, X(t_k))$ for arbitrary positive integer $k$ and arbitrary times $t_i \in T$, $i \in \{1, \dots, k\}$. We will say that two processes $X_t$ and $Y_t$ are equivalent if they have the same finite dimensional distributions. From experiments or numerical simulations we can only obtain information about the fdd of a process.
Stationary Processes

A process is called (strictly) stationary if all fdd are invariant under time translation: for any integer $k$ and times $t_i \in T$, the distribution of $(X(t_1), X(t_2), \dots, X(t_k))$ is equal to that of $(X(s + t_1), X(s + t_2), \dots, X(s + t_k))$ for any $s$ such that $s + t_i \in T$ for all $i \in \{1, \dots, k\}$.

Let $X_t$ be a stationary stochastic process with finite second moment (i.e. $X_t \in L^2$). Stationarity implies that
$$\mathbb{E}X_t = \mu, \quad \mathbb{E}\big((X_t - \mu)(X_s - \mu)\big) = C(t - s).$$
The converse is not true.

A stochastic process $X_t \in L^2$ is called second-order stationary (or stationary in the wide sense) if the first moment $\mathbb{E}X_t$ is a constant and the covariance depends only on the difference $t - s$:
$$\mathbb{E}X_t = \mu, \quad \mathbb{E}\big((X_t - \mu)(X_s - \mu)\big) = C(t - s).$$
The function $C(t)$ is called the correlation (or covariance) function of $X_t$.

Let $X_t \in L^2$ be a mean zero second-order stationary process on $\mathbb{R}$ which is mean square continuous, i.e.
$$\lim_{t \to s} \mathbb{E}|X_t - X_s|^2 = 0.$$
Then the correlation function admits the representation
$$C(t) = \int_{-\infty}^{\infty} e^{itx} f(x)\, dx, \quad t \in \mathbb{R}.$$
The function $f(x)$ is called the spectral density of the process $X_t$. In many cases, the experimentally measured quantity is the spectral density (or power spectrum) of the stochastic process.
Given the correlation function of $X_t$, and assuming that $C(t) \in L^1(\mathbb{R})$, we can calculate the spectral density through its Fourier transform:
$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} C(t)\, dt.$$
The correlation function of a second-order stationary process enables us to associate a time scale to $X_t$, the correlation time $\tau_{cor}$:
$$\tau_{cor} = \frac{1}{C(0)} \int_0^{\infty} C(\tau)\, d\tau = \int_0^{\infty} \frac{\mathbb{E}(X_\tau X_0)}{\mathbb{E}(X_0^2)}\, d\tau.$$
The slower the decay of the correlation function, the larger the correlation time is. We have to assume sufficiently fast decay of correlations so that the correlation time is finite.
Example 4 Consider a second-order stationary process with correlation function
$$C(t) = C(0) e^{-\gamma |t|}.$$
The spectral density of this process is
$$f(x) = \frac{1}{2\pi} C(0) \int_{-\infty}^{\infty} e^{-itx} e^{-\gamma |t|}\, dt = C(0) \frac{1}{\pi} \frac{\gamma}{\gamma^2 + x^2}.$$
The correlation time is
$$\tau_{cor} = \int_0^{\infty} e^{-\gamma t}\, dt = \gamma^{-1}.$$
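Both formulas in Example 4 can be checked by numerical quadrature. The sketch below is an illustration under stated assumptions: the values of `gamma` and `C0`, the truncation of the integrals at `T`, and the midpoint rule are choices made here. It uses the fact that $C$ is even, so the Fourier integral reduces to $\frac{1}{\pi} \int_0^{\infty} \cos(tx)\, C(t)\, dt$.

```python
import math

def midpoint(h, a, b, n=100000):
    """Midpoint rule for the integral of h over [a, b]."""
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

gamma, C0 = 0.5, 2.0
T = 60.0 / gamma                       # truncation point: e^{-60} is negligible

def C(t):
    """Correlation function of Example 4."""
    return C0 * math.exp(-gamma * abs(t))

def spectral_density(x):
    """(1/2pi) int e^{-itx} C(t) dt, computed as (1/pi) int_0^T cos(tx) C(t) dt."""
    return midpoint(lambda t: math.cos(t * x) * C(t), 0.0, T) / math.pi

def exact_density(x):
    """Closed form stated in Example 4."""
    return C0 * gamma / (math.pi * (gamma ** 2 + x ** 2))

tau_cor = midpoint(C, 0.0, T) / C(0.0)  # theoretical value: 1 / gamma

for x in (0.0, 0.7, 2.0):
    print(x, spectral_density(x), exact_density(x))
print(tau_cor)
```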
Gaussian Processes

The most important class of stochastic processes is that of Gaussian processes:

Definition 5 A Gaussian process is one for which $E = \mathbb{R}^d$ and all the finite dimensional distributions are Gaussian.

A Gaussian process $x(t)$ is characterized by its mean
$$m(t) := \mathbb{E}x(t)$$
and the covariance function
$$C(t, s) = \mathbb{E}\Big(\big(x(t) - m(t)\big) \otimes \big(x(s) - m(s)\big)\Big).$$
Thus, the first two moments of a Gaussian process are sufficient for a complete characterization of the process. A corollary of this is that a second-order stationary Gaussian process is also a stationary process.
Brownian Motion

The most important continuous time stochastic process is Brownian motion. Brownian motion is a mean zero, continuous (i.e. it has continuous sample paths: for a.e. $\omega \in \Omega$ the function $X_t$ is a continuous function of time) process with independent Gaussian increments.

A process $X_t$ has independent increments if for every sequence $t_0 < t_1 < \dots < t_n$ the random variables
$$X_{t_1} - X_{t_0},\ X_{t_2} - X_{t_1},\ \dots,\ X_{t_n} - X_{t_{n-1}}$$
are independent. If, furthermore, for any $t_1$, $t_2$ and Borel set $B \subset \mathbb{R}$
$$\mathbb{P}(X_{t_2 + s} - X_{t_1 + s} \in B)$$
is independent of $s$, then the process $X_t$ has stationary independent increments.
Definition 6 i) A one dimensional standard Brownian motion $W(t) : \mathbb{R}^+ \to \mathbb{R}$ is a real valued stochastic process with the following properties:
(a) $W(0) = 0$;
(b) $W(t)$ is continuous;
(c) $W(t)$ has independent increments;
(d) for every $t > s \geq 0$, $W(t) - W(s)$ has a Gaussian distribution with mean 0 and variance $t - s$. That is, the density of the random variable $W(t) - W(s)$ is
$$g(x; t, s) = \big(2\pi(t - s)\big)^{-\frac{1}{2}} \exp\Big(-\frac{x^2}{2(t - s)}\Big). \qquad (5)$$
ii) A $d$-dimensional standard Brownian motion $W(t) : \mathbb{R}^+ \to \mathbb{R}^d$ is a collection of $d$ independent one dimensional Brownian motions:
$$W(t) = (W_1(t), \dots, W_d(t)),$$
where $W_i(t)$, $i = 1, \dots, d$ are independent one dimensional Brownian motions. The density of the Gaussian random vector $W(t) - W(s)$ is thus
$$g(x; t, s) = \big(2\pi(t - s)\big)^{-d/2} \exp\Big(-\frac{\|x\|^2}{2(t - s)}\Big).$$
Brownian motion is sometimes referred to as the Wiener process.
[Figure 1: Brownian sample paths; vertical axis $W(t)$, horizontal axis $t$.]
It is possible to prove rigorously the existence of the Wiener process (Brownian motion):

Theorem 1 (Wiener) There exists an almost surely continuous process $W_t$ with independent increments and $W_0 = 0$, such that for each $t$ the random variable $W_t$ is $\mathcal{N}(0, t)$. Furthermore, $W_t$ is almost surely locally Hölder continuous with exponent $\alpha$ for any $\alpha \in (0, \frac{1}{2})$.

Notice that Brownian paths are not differentiable.
Brownian motion is a Gaussian process. For the $d$-dimensional Brownian motion, and for $I$ the $d \times d$ identity matrix, we have (see (1) and (2))
$$\mathbb{E}W(t) = 0 \quad \forall t \geq 0$$
and
$$\mathbb{E}\Big(\big(W(t) - W(s)\big) \otimes \big(W(t) - W(s)\big)\Big) = (t - s) I. \qquad (6)$$
Moreover,
$$\mathbb{E}\big(W(t) \otimes W(s)\big) = \min(t, s) I. \qquad (7)$$
From the formula for the Gaussian density $g(x; t, s)$, eqn. (5), we immediately conclude that $W(t) - W(s)$ and $W(t + u) - W(s + u)$ have the same pdf. Consequently, Brownian motion has stationary increments. Notice, however, that Brownian motion itself is not a stationary process.

Since $W(t) = W(t) - W(0)$, the pdf of $W(t)$ is
$$g(x, t) = \frac{1}{\sqrt{2\pi t}} e^{-x^2/2t}.$$
We can easily calculate all moments of the Brownian motion:
$$\mathbb{E}\big(W(t)^n\big) = \frac{1}{\sqrt{2\pi t}} \int_{-\infty}^{+\infty} x^n e^{-x^2/2t}\, dx = \begin{cases} 1 \cdot 3 \cdots (n - 1)\, t^{n/2}, & n \text{ even}, \\ 0, & n \text{ odd}. \end{cases}$$
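The moment formula and the covariance $\min(t, s)$ from (7) can be illustrated with a small Monte Carlo experiment that builds $W$ from independent Gaussian increments, as in Definition 6. This is a sketch using only the standard library; the helper `brownian_at`, the seed, the sample times and the number of trials are assumptions made for illustration, and the agreement with theory is only statistical.

```python
import math
import random

def brownian_at(rng, times):
    """Sample W at increasing times using independent N(0, t - s) increments."""
    w, prev, values = 0.0, 0.0, []
    for t in times:
        # random.gauss takes a standard deviation, hence the square root.
        w += rng.gauss(0.0, math.sqrt(t - prev))
        prev = t
        values.append(w)
    return values

rng = random.Random(1)
s, t, trials = 0.5, 1.0, 40000
samples = [brownian_at(rng, [s, t]) for _ in range(trials)]

mean = sum(wt for _, wt in samples) / trials                 # theory: 0
second_moment = sum(wt ** 2 for _, wt in samples) / trials   # theory: t
fourth_moment = sum(wt ** 4 for _, wt in samples) / trials   # theory: 1 * 3 * t^2
covariance = sum(ws * wt for ws, wt in samples) / trials     # theory: min(s, t)

print(mean, second_moment, fourth_moment, covariance)
```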
The Poisson Process

Another fundamental continuous time process is the Poisson process:

Definition 7 The Poisson process with intensity $\lambda$, denoted by $N(t)$, is an integer-valued, continuous time, stochastic process with independent increments satisfying
$$\mathbb{P}\big[(N(t) - N(s)) = k\big] = \frac{e^{-\lambda(t - s)} \big(\lambda(t - s)\big)^k}{k!}, \quad t > s \geq 0,\ k \in \mathbb{N}.$$
Notice the connection to exponential random variables via (4).

Both Brownian motion and the Poisson process are homogeneous (or time-homogeneous): the distribution of the increments between successive times $s$ and $t$ depends only on $t - s$.
The Path Space

Let $(\Omega, \mathcal{F}, \mu)$ be a probability space, $(E, \rho)$ a metric space and let $T = [0, \infty)$. Let $\{X_t\}$ be a stochastic process from $(\Omega, \mathcal{F}, \mu)$ to $(E, \rho)$ with continuous sample paths. The above means that for every $\omega \in \Omega$ the sample path $t \mapsto X_t(\omega)$ belongs to $C_E := C([0, \infty); E)$. The space of continuous functions $C_E$ is called the path space of the stochastic process.

We can put a metric on $C_E$ as follows:
$$\rho_E(X^1, X^2) := \sum_{n=1}^{\infty} \frac{1}{2^n} \max_{0 \leq t \leq n} \min\big(\rho(X^1_t, X^2_t), 1\big).$$
We can then define the Borel sets on $C_E$, using the topology induced by this metric, and $\{X_t\}$ can be thought of as a random variable on $(\Omega, \mathcal{F}, \mu)$ with state space $(C_E, \mathcal{B}(C_E))$.
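To get a feel for the metric on the path space, it can be approximated numerically for two given paths. The sketch below is only an approximation under stated assumptions: the series is truncated after `n_terms` terms (the discarded tail is at most $2^{-\texttt{n\_terms}}$), and the maximum over $[0, n]$ is replaced by a maximum over a uniform grid. The function name and parameters are hypothetical.

```python
import math

def path_metric(x1, x2, rho, n_terms=20, steps_per_unit=100):
    """Approximate sum_{n>=1} 2^{-n} max_{0<=t<=n} min(rho(x1(t), x2(t)), 1)."""
    total, running_max = 0.0, 0.0
    for n in range(1, n_terms + 1):
        # Extend the running maximum from [0, n - 1] to [0, n] on a uniform grid.
        for j in range((n - 1) * steps_per_unit, n * steps_per_unit + 1):
            t = j / steps_per_unit
            running_max = max(running_max, min(rho(x1(t), x2(t)), 1.0))
        total += running_max / 2 ** n
    return total

def rho(a, b):
    """Metric on the state space E = R."""
    return abs(a - b)

# Two paths a constant distance 0.25 apart: the metric is 0.25 up to truncation.
d_shift = path_metric(math.sin, lambda t: math.sin(t) + 0.25, rho)
# Paths far apart: the cap at 1 bounds the metric by 1.
d_far = path_metric(lambda t: 0.0, lambda t: 5.0, rho)
# A path is at distance 0 from itself.
d_self = path_metric(math.sin, math.sin, rho)

print(d_shift, d_far, d_self)
```

The cap $\min(\cdot, 1)$ and the weights $2^{-n}$ are what keep the series convergent even when the two paths drift arbitrarily far apart.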
The probability measure $\mathbb{P}X_t^{-1}$ on $(C_E, \mathcal{B}(C_E))$ is called the law of $\{X_t\}$. The law of a stochastic process is a probability measure on its path space.

Example 8 The space of continuous functions $C_E$ is the path space of Brownian motion (the Wiener process). The law of Brownian motion, that is the measure that it induces on $C([0, \infty), \mathbb{R}^d)$, is known as the Wiener measure.