Università di Pavia
Introduction to Stochastic Processes
Eduardo Rossi
Stochastic Process

Stochastic process: a stochastic process is an ordered sequence of random variables defined on a probability space $(\Omega, \mathcal{F}, P)$,
$$\{Y_t(\omega),\ \omega \in \Omega,\ t \in T\},$$
such that for each $t \in T$, $Y_t(\omega)$ is a random variable on the sample space $\Omega$, and for each $\omega \in \Omega$, $Y_t(\omega)$ is a realization of the stochastic process on the index set $T$ (an ordered set of values, each corresponding to a point in time).

Time series: a time series is a set of observations $\{y_t,\ t \in T_0\}$, each one recorded at a specified time $t$. The time series is part of a realization of a stochastic process $\{Y_t,\ t \in T\}$ with $T_0 \subseteq T$. An infinite series is denoted $\{y_t\}_{t=-\infty}^{\infty}$.

Eduardo Rossi © Time series econometrics 2011
Mean, Variance and Autocovariance

The unconditional mean:
$$\mu_t = E[Y_t] = \int y_t f(y_t)\, dy_t \qquad (1)$$

Autocovariance function: the joint distribution of $(Y_t, Y_{t-1}, \ldots, Y_{t-h})$ is usually characterized by the autocovariance function:
$$\gamma_t(h) = \mathrm{Cov}(Y_t, Y_{t-h}) = E[(Y_t - \mu_t)(Y_{t-h} - \mu_{t-h})]$$
$$= \int \cdots \int (y_t - \mu_t)(y_{t-h} - \mu_{t-h})\, f(y_t, \ldots, y_{t-h})\, dy_t \cdots dy_{t-h}$$

The autocorrelation function:
$$\rho_t(h) = \frac{\gamma_t(h)}{\sqrt{\gamma_t(0)\,\gamma_{t-h}(0)}}$$
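The population quantities above have natural sample counterparts. A minimal numpy sketch (assuming stationarity, so a single time average can replace the ensemble mean; the function names are illustrative, not from the slides):

```python
import numpy as np

def sample_autocov(y, h):
    """gamma_hat(h) = (T-h)^{-1} sum_{t=h+1}^{T} (y_t - ybar)(y_{t-h} - ybar)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    ybar = y.mean()
    return ((y[h:] - ybar) * (y[:T - h] - ybar)).sum() / (T - h)

def sample_autocorr(y, h):
    """rho_hat(h) = gamma_hat(h) / gamma_hat(0)."""
    return sample_autocov(y, h) / sample_autocov(y, 0)

rng = np.random.default_rng(0)
y = rng.standard_normal(10000)            # white noise for illustration
print(round(sample_autocorr(y, 1), 2))    # near 0: no serial correlation
```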
Stationarity

Weak (covariance) stationarity: the process $Y_t$ is said to be weakly stationary or covariance stationary if the first and second moments of the process are time invariant:
$$E[Y_t] = \mu < \infty \quad \forall t$$
$$E[(Y_t - \mu)(Y_{t-h} - \mu)] = \gamma(h) < \infty \quad \forall t, h$$

Stationarity implies $\gamma_t(h) = \gamma_t(-h) = \gamma(h)$.
Stationarity

Strict stationarity: the process is said to be strictly stationary if for any values of $h_1, h_2, \ldots, h_n$ the joint distribution of $(Y_t, Y_{t+h_1}, \ldots, Y_{t+h_n})$ depends only on the intervals $h_1, h_2, \ldots, h_n$ but not on the date $t$ itself:
$$f(y_t, y_{t+h_1}, \ldots, y_{t+h_n}) = f(y_\tau, y_{\tau+h_1}, \ldots, y_{\tau+h_n}) \quad \forall t, \tau, h \qquad (2)$$

Strict stationarity implies that all existing moments are time invariant.

Gaussian process: the process $Y_t$ is said to be Gaussian if the joint density of $(Y_t, Y_{t+h_1}, \ldots, Y_{t+h_n})$, $f(y_t, y_{t+h_1}, \ldots, y_{t+h_n})$, is Gaussian for any $h_1, h_2, \ldots, h_n$.
Ergodicity

The ergodic theorem concerns what information can be derived from an average over time about the ensemble average at each point in time. Note that the WLLN does not apply, as the observed time series represents just one realization of the stochastic process.

Ergodic for the mean: let $\{Y_t(\omega),\ \omega \in \Omega,\ t \in T\}$ be a weakly stationary process such that $E[Y_t(\omega)] = \mu < \infty$ and $E[(Y_t(\omega) - \mu)^2] = \sigma^2 < \infty$ for all $t$. Let $\bar{y}_T = T^{-1}\sum_{t=1}^{T} Y_t$ be the time average. If $\bar{y}_T$ converges in probability to $\mu$ as $T \to \infty$, $Y_t$ is said to be ergodic for the mean.
Ergodicity

To be ergodic, the memory of a stochastic process should fade, in the sense that the covariance between increasingly distant observations converges to zero sufficiently rapidly. For stationary processes it can be shown that absolutely summable autocovariances, i.e. $\sum_{h=0}^{\infty} |\gamma(h)| < \infty$, are sufficient to ensure ergodicity for the mean.

Ergodic for the second moments:
$$\hat{\gamma}(h) = (T - h)^{-1} \sum_{t=h+1}^{T} (Y_t - \mu)(Y_{t-h} - \mu) \xrightarrow{P} \gamma(h) \qquad (3)$$

Ergodicity focuses on asymptotic independence, while stationarity focuses on the time invariance of the process.
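As a sketch of ergodicity for the mean: an AR(1) process with $|\phi| < 1$ has absolutely summable autocovariances, so the time average of a single realization converges to $\mu = 0$. The value $\phi = 0.5$ below is purely illustrative:

```python
import numpy as np

# AR(1): Y_t = phi * Y_{t-1} + u_t, with u_t Gaussian white noise.
# Stationary and ergodic for the mean, so ybar_T -> mu = 0.
rng = np.random.default_rng(1)
phi, T = 0.5, 50000
u = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + u[t]
print(round(abs(y.mean()), 2))  # close to 0 for large T
```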
Example

Consider the stochastic process $\{Y_t\}$ defined by
$$Y_t = \begin{cases} u_0 & t = 0, \text{ with } u_0 \sim N(0, \sigma^2) \\ Y_{t-1} & t > 0 \end{cases} \qquad (4)$$

Then $\{Y_t\}$ is strictly stationary but not ergodic.

Proof: obviously we have $Y_t = u_0$ for all $t \geq 0$. Stationarity follows from:
$$E[Y_t] = E[u_0] = 0$$
$$E[Y_t^2] = E[u_0^2] = \sigma^2$$
$$E[Y_t Y_{t-1}] = E[u_0^2] = \sigma^2 \qquad (5)$$
Example

Thus $\mu = 0$, $\gamma(h) = \sigma^2$ and $\rho(h) = 1$ are time invariant. Ergodicity for the mean would require $\bar{y}_T \xrightarrow{P} \mu = 0$, but
$$\bar{y}_T = T^{-1} \sum_{t=1}^{T} Y_t = T^{-1}(T u_0) = u_0,$$
so the time average equals $u_0$ for every $T$ and does not converge to $\mu$.
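A small simulation illustrates the failure of ergodicity: across many realizations, each path's time average is stuck at its own draw of $u_0$, so averaging over more time never recovers the ensemble mean $\mu = 0$ (path counts and horizon are illustrative):

```python
import numpy as np

# Each realization: Y_t = u_0 for all t, with u_0 ~ N(0, sigma^2).
rng = np.random.default_rng(2)
sigma, n_paths, T = 1.0, 2000, 1000
u0 = rng.normal(0.0, sigma, size=n_paths)      # one u_0 per realization
paths = np.repeat(u0[:, None], T, axis=1)      # Y_t = u_0 for every t
time_avg = paths.mean(axis=1)                  # ybar_T, one per path
print(np.allclose(time_avg, u0))               # True: ybar_T = u_0 regardless of T
print(round(time_avg.std(), 1))                # stays near sigma, does not shrink with T
```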
Random walk

$$Y_t = Y_{t-1} + u_t, \qquad u_t \sim WN(0, \sigma^2).$$

By recursive substitution:
$$Y_t = Y_{t-1} + u_t = Y_{t-2} + u_{t-1} + u_t = \cdots = Y_0 + u_t + u_{t-1} + u_{t-2} + \cdots + u_1$$

The mean is time invariant, since with a fixed starting value $Y_0$:
$$\mu = E[Y_t] = E\!\left[Y_0 + \sum_{s=1}^{t} u_s\right] = Y_0 + \sum_{s=1}^{t} E[u_s] = Y_0. \qquad (6)$$
Random walk

But the second moments are diverging. The variance is given by:
$$\gamma_t(0) = E[(Y_t - \mu)^2] = E\!\left[\left(\sum_{s=1}^{t} u_s\right)^2\right] = E\!\left[\sum_{s=1}^{t} u_s^2 + \sum_{s=1}^{t}\sum_{k=1, k \neq s}^{t} u_s u_k\right]$$
$$= \sum_{s=1}^{t} E[u_s^2] + \sum_{s=1}^{t}\sum_{k=1, k \neq s}^{t} E[u_s u_k] = \sum_{s=1}^{t} \sigma^2 = t\sigma^2.$$
Random walk

The autocovariances are:
$$\gamma_t(h) = E[(Y_t - \mu)(Y_{t-h} - \mu)] = E\!\left[\left(\sum_{s=1}^{t} u_s\right)\left(\sum_{k=1}^{t-h} u_k\right)\right] = \sum_{k=1}^{t-h} E[u_k^2] = (t - h)\sigma^2, \quad h > 0. \qquad (7)$$

Finally, the autocorrelation function $\rho_t(h)$ for $h > 0$ is given by:
$$\rho_t^2(h) = \frac{\gamma_t^2(h)}{\gamma_t(0)\,\gamma_{t-h}(0)} = \frac{[(t-h)\sigma^2]^2}{[t\sigma^2][(t-h)\sigma^2]} = 1 - \frac{h}{t}, \quad h > 0. \qquad (8)$$
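A Monte Carlo sketch of these moments (with $Y_0 = 0$, so $\mu = 0$): the cross-sectional variance at date $t$ grows like $t\sigma^2$, and the squared autocorrelation matches $1 - h/t$. Sample sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
sigma, T, n_paths = 1.0, 200, 20000
u = rng.normal(0.0, sigma, size=(n_paths, T))
y = u.cumsum(axis=1)                      # Y_t = u_1 + ... + u_t, Y_0 = 0
var_t = y[:, -1].var()                    # cross-sectional variance at t = T
print(round(var_t / T, 1))                # close to sigma^2 = 1, i.e. Var = t*sigma^2
h = 50
corr = np.corrcoef(y[:, -1], y[:, -1 - h])[0, 1]
print(round(corr**2, 2))                  # close to 1 - h/t = 0.75
```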
White Noise Process

A white-noise process is a weakly stationary process which has zero mean and is uncorrelated over time:
$$u_t \sim WN(0, \sigma^2) \qquad (9)$$

Thus $u_t$ is a WN process if for all $t \in T$:
$$E[u_t] = 0$$
$$E[u_t^2] = \sigma^2 < \infty$$
$$E[u_t u_{t-h}] = 0 \quad \forall h \neq 0,\ t - h \in T \qquad (10)$$

If the assumption of a constant variance is relaxed to $E[u_t^2] < \infty$, $u_t$ is sometimes called a weak WN process. If the white-noise process is normally distributed, it is called a Gaussian white-noise process:
$$u_t \sim NID(0, \sigma^2). \qquad (11)$$
White Noise Process

The assumption of normality implies strict stationarity and serial independence (unpredictability). A generalization of the NID process is the IID process, with constant but unspecified higher moments. A process $u_t$ with independent, identically distributed variates is denoted IID:
$$u_t \sim IID(0, \sigma^2) \qquad (12)$$
Martingale

Martingale: the stochastic process $x_t$ is said to be a martingale with respect to an information set ($\sigma$-field) $I_{t-1}$ of data realized by time $t-1$ if
$$E[|x_t|] < \infty$$
$$E[x_t \mid I_{t-1}] = x_{t-1} \quad \text{a.s.} \qquad (13)$$

Martingale difference sequence: the process $u_t = x_t - x_{t-1}$ with $E[|u_t|] < \infty$ and $E[u_t \mid I_{t-1}] = 0$ for all $t$ is called a martingale difference sequence (MDS).
Martingale difference

Let $\{x_t\}$ be an m.d. sequence and let $g_{t-1}$ be any integrable function measurable with respect to $\sigma(x_{t-1}, x_{t-2}, \ldots)$, i.e. a function of the lagged values of the sequence. Then $x_t g_{t-1}$ is an m.d., and
$$\mathrm{Cov}[g_{t-1}, x_t] = E[g_{t-1} x_t] - E[g_{t-1}]E[x_t] = 0$$

Putting $g_{t-1} = x_{t-j}$, $j > 0$, this implies
$$\mathrm{Cov}[x_t, x_{t-j}] = 0$$

The m.d. property implies uncorrelatedness of the sequence. A white noise process need not be an m.d., because the conditional expectation of an uncorrelated process can be a nonlinear function of its past. An example is the bilinear model (Granger and Newbold, 1986):
$$x_t = \beta x_{t-2}\epsilon_{t-1} + \epsilon_t, \qquad \epsilon_t \sim \text{i.i.d.}(0, 1)$$

The model is white noise when $0 < \beta < \frac{1}{2}$.
Martingale difference

The conditional expectation is the following nonlinear function of past observations:
$$E[x_t \mid I_{t-1}] = \sum_{i=1}^{\infty} (-\beta)^{i+1}\, x_{t-i} \prod_{j=2}^{i+1} x_{t-j}$$

Uncorrelated, zero-mean processes:
- White Noise processes
  1. IID processes
     (a) Gaussian White Noise processes
- Martingale Difference processes
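A simulation sketch of the bilinear example: the process is serially uncorrelated like white noise, yet $E[x_t \mid I_{t-1}] = \beta x_{t-2}\epsilon_{t-1} \neq 0$, so it is not a martingale difference. The value $\beta = 0.4$ is illustrative:

```python
import numpy as np

# Bilinear model: x_t = beta * x_{t-2} * eps_{t-1} + eps_t, eps_t iid N(0,1).
rng = np.random.default_rng(4)
beta, T = 0.4, 200000
eps = rng.standard_normal(T)
x = np.zeros(T)
for t in range(2, T):
    x[t] = beta * x[t - 2] * eps[t - 1] + eps[t]
# Sample autocorrelations look like white noise:
r1 = np.corrcoef(x[1:], x[:-1])[0, 1]
r2 = np.corrcoef(x[2:], x[:-2])[0, 1]
print(round(r1, 2), round(r2, 2))        # both near 0
# ...but x_t is correlated with the nonlinear regressor x_{t-2} * eps_{t-1}:
z = x[:-2] * eps[1:-1]                   # x_{t-2} * eps_{t-1}, aligned with x[2:]
print(np.mean(x[2:] * z) > 0.3)          # True: E[x_t z_t] = beta * E[z^2] > 0
```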
Innovation

Innovation: an innovation $\{u_t\}$ with respect to an information set $I_{t-1}$ is a process whose density $f(u_t \mid I_{t-1})$ does not depend on $I_{t-1}$.

Mean innovation: $\{u_t\}$ is a mean innovation with respect to an information set $I_{t-1}$ if
$$E[u_t \mid I_{t-1}] = 0$$
The Wold Decomposition

If the zero-mean process $Y_t$ is wide-sense stationary (implying $E[Y_t^2] < \infty$), it has the representation
$$Y_t = \sum_{j=0}^{\infty} \psi_j \epsilon_{t-j} + v_t$$
where $\psi_0 = 1$, $\sum_{j=0}^{\infty} \psi_j^2 < \infty$, $\epsilon_t \sim WN(0, \sigma^2)$, $E[v_t \epsilon_{t-j}] = 0$ for all $j$, and there exist constants $\alpha_0, \alpha_1, \alpha_2, \ldots$ such that $\mathrm{Var}\!\left(\sum_{j=0}^{\infty} \alpha_j v_{t-j}\right) = 0$.
The Wold Decomposition

The distribution of $v_t$ is singular, since
$$v_t = -\sum_{j=1}^{\infty} (\alpha_j/\alpha_0)\, v_{t-j}$$
with probability 1, and hence $v_t$ is perfectly predictable one step ahead (a deterministic process). If $v_t = 0$, $Y_t$ is called a purely non-deterministic process.
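As a concrete sketch of the purely non-deterministic case: a stationary AR(1) has Wold weights $\psi_j = \phi^j$ (square-summable for $|\phi| < 1$), and truncating the $MA(\infty)$ sum at $J$ terms reproduces the recursion up to an error of order $\phi^J$. The values $\phi = 0.6$ and $J = 60$ are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
phi, T, J = 0.6, 2000, 60
eps = rng.standard_normal(T)
# AR(1) recursion, started from eps[0]:
y_ar = np.zeros(T)
y_ar[0] = eps[0]
for t in range(1, T):
    y_ar[t] = phi * y_ar[t - 1] + eps[t]
# Truncated Wold / MA(infinity) representation with psi_j = phi^j:
psi = phi ** np.arange(J)
y_ma = np.convolve(eps, psi)[:T]          # sum_j psi_j * eps_{t-j}
print(np.max(np.abs(y_ar - y_ma)) < 1e-8)  # True: truncation error ~ phi^J
```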
Box-Jenkins approach to modelling time series

Approximate the infinite lag polynomial with the ratio of two finite-order polynomials $\theta(L)$ and $\phi(L)$:
$$\psi(L) = \sum_{j=0}^{\infty} \psi_j L^j \approx \frac{\theta(L)}{\phi(L)} = \frac{1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q}{1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p}$$
Box-Jenkins approach to modelling time series

Time series models:

  p      q      Model                                Type
  p > 0  q = 0  $\phi(L) Y_t = \epsilon_t$           AR(p)
  p = 0  q > 0  $Y_t = \theta(L)\epsilon_t$          MA(q)
  p > 0  q > 0  $\phi(L) Y_t = \theta(L)\epsilon_t$  ARMA(p,q)
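The three cases can be sketched with a simulated ARMA(1,1). The comparison below uses the standard closed-form lag-1 autocorrelation of an ARMA(1,1), $\rho(1) = \frac{(1 + \phi_1\theta_1)(\phi_1 + \theta_1)}{1 + \theta_1^2 + 2\phi_1\theta_1}$; the parameter values are illustrative:

```python
import numpy as np

# ARMA(1,1): Y_t = phi1 * Y_{t-1} + eps_t + theta1 * eps_{t-1}
rng = np.random.default_rng(6)
phi1, theta1, T = 0.5, 0.3, 100000
eps = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi1 * y[t - 1] + eps[t] + theta1 * eps[t - 1]
# Theoretical vs sample lag-1 autocorrelation:
rho1_theory = (1 + phi1 * theta1) * (phi1 + theta1) / (1 + theta1**2 + 2 * phi1 * theta1)
rho1_sample = np.corrcoef(y[1:], y[:-1])[0, 1]
print(round(rho1_theory, 2), round(rho1_sample, 2))  # both near 0.66
```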