ECON 5101 ADVANCED ECONOMETRICS TIME SERIES
Lecture note no. 1 (EB)
Erik Biørn, Department of Economics
Version of February 1, 2011

Basic concepts and terminology: AR, MA and ARMA processes

This lecture note is related to Topic 2 in the lecture plan.

1. STOCHASTIC PROCESSES

Consider a time series of a variable Y: {Y} = {..., Y_{-1}, Y_0, Y_1, ..., Y_T, ...}, where Y_t is its value in period t. If the Y_t's are stochastic variables, we call {Y} a stochastic (random) process. A (model of a) univariate stochastic process involves only one variable. A (model of a) multivariate stochastic process involves more than one variable.

2. LAG OPERATOR, LAG POLYNOMIALS, LINEAR DIFFERENCE EQUATIONS

The lag operator, symbolized by L, is defined by

L z_t = z_{t-1}, for any t,

where z_t is an arbitrary time series of a variable (observed or unobserved). The important property of the lag operator is that, within certain limits, it can be treated and operated on like any ordinary algebraic entity. Applying the lag operator twice, we get L^2 z_t = L(L z_t) = L z_{t-1} = z_{t-2}, and in general,

(1) L^n z_t = z_{t-n}, n = 0, ±1, ±2, ....

In particular, L^0 z_t = z_t, while L^{-n} z_t = z_{t+n}. The algebra of the lag operator for linear functions of time series is elaborated in Hamilton, Section 2.1, and exemplified below:

Example set 1:

L(a_0 + a_1 z_t) = L a_0 + L(a_1 z_t) = a_0 + a_1 L z_t = a_0 + a_1 z_{t-1},
(1 + a_1 L + a_2 L^2) z_t = z_t + a_1 L z_t + a_2 L^2 z_t = z_t + a_1 z_{t-1} + a_2 z_{t-2},
(1 + a_1 L)(1 + b_1 L) z_t = [1 + (a_1 + b_1)L + a_1 b_1 L^2] z_t = z_t + (a_1 + b_1) z_{t-1} + a_1 b_1 z_{t-2},

where a_0, a_1, a_2 and b_1 are constants.
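The product of two lag polynomials in Example set 1 corresponds to ordinary polynomial multiplication of the coefficient sequences. A minimal numerical sketch, assuming NumPy is available and using illustrative values for a_1 and b_1:

```python
import numpy as np

# Coefficients of (1 + a1*L) and (1 + b1*L), ordered by ascending powers of L.
a1, b1 = 0.5, 0.3  # illustrative constants, not from the note
p1 = np.array([1.0, a1])
p2 = np.array([1.0, b1])

# Multiplying the coefficient sequences composes the lag polynomials:
# (1 + a1*L)(1 + b1*L) = 1 + (a1 + b1)*L + a1*b1*L^2.
product = np.convolve(p1, p2)
print(product)
```

The resulting coefficient vector is [1, a_1 + b_1, a_1 b_1], matching the expansion in Example set 1.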
Example 2: The summation formula for an infinite geometric series implies, provided that |b| < 1, that 1/(1-b) = 1 + b + b^2 + b^3 + .... A similar property also holds if we multiply b with L, again assuming |b| < 1, i.e.,

a_0/(1-bL) = a_0[1 + bL + (bL)^2 + (bL)^3 + ...]
⟹ [a_0/(1-bL)] z_t = a_0[1 + bL + (bL)^2 + (bL)^3 + ...] z_t = a_0[z_t + b z_{t-1} + b^2 z_{t-2} + b^3 z_{t-3} + ...].

This expresses an infinitely long, geometric lag distribution in z_t, starting at a_0.

Example 3: Assume again |b| < 1. Then

(a_0 + a_1 L)/(1-bL) = a_0 + (a_1 + a_0 b)L/(1-bL) = a_0 + (a_1 + a_0 b)(L + bL^2 + b^2 L^3 + ...)
⟹ [(a_0 + a_1 L)/(1-bL)] z_t = a_0 z_t + (a_1 + a_0 b)(z_{t-1} + b z_{t-2} + b^2 z_{t-3} + ...).

This expresses an infinite lag distribution in z_t with the first coefficient free at a_0 and with geometrically declining coefficients, starting at the value a_1 + a_0 b from the second coefficient onwards.

Example 4: Assume again |b| < 1. Then

(a_0 + a_1 L + a_2 L^2)/(1-bL) = a_0 + [(a_1 + a_0 b)L + (a_2 + a_1 b + a_0 b^2)L^2]/(1-bL)
= a_0 + (a_1 + a_0 b)L + (a_2 + a_1 b + a_0 b^2)(L^2 + bL^3 + b^2 L^4 + ...)
⟹ [(a_0 + a_1 L + a_2 L^2)/(1-bL)] z_t = a_0 z_t + (a_1 + a_0 b) z_{t-1} + (a_2 + a_1 b + a_0 b^2)(z_{t-2} + b z_{t-3} + b^2 z_{t-4} + ...).

This expresses an infinite lag distribution with the first coefficient free at a_0, the second coefficient free at a_1 + a_0 b, and with geometrically declining coefficients starting at the value a_2 + a_1 b + a_0 b^2 from the third coefficient onwards.

Examples 2, 3, 4 can be generalized straightforwardly by extending the polynomial in the numerator of Example 4 with higher-order terms.

We next turn to linear difference equations (with constant coefficients). Consider first a linear first-order difference equation in z_t with constant coefficients:

(2) z_t = b z_{t-1} + f_t,
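The geometric lag distribution of Example 2 can be checked numerically: applying the weights a_0 b^i directly to past z's must reproduce the recursion (1-bL)w_t = a_0 z_t. A sketch with illustrative values (the series z is arbitrary):

```python
b, a0 = 0.6, 2.0                     # illustrative values, |b| < 1
z = [1.0, -0.5, 2.0, 0.3, 1.5]       # an arbitrary short series, z_0, ..., z_4

# Recursion w_t = b*w_{t-1} + a0*z_t, i.e. (1 - bL) w_t = a0*z_t, started at zero.
w = []
prev = 0.0
for zt in z:
    prev = b * prev + a0 * zt
    w.append(prev)

# Direct geometric lag distribution: w_t = a0 * sum_i b^i * z_{t-i}.
direct = [a0 * sum(b**i * z[t - i] for i in range(t + 1)) for t in range(len(z))]
print(w)
print(direct)
```

With the process started at zero, the two computations agree exactly, which is the content of the expansion a_0/(1-bL) = a_0(1 + bL + b^2 L^2 + ...).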
where f_t is a given, arbitrary time function. The corresponding homogeneous difference equation is z_t - b z_{t-1} = 0. By recursive substitution, we find that the time function for z_t which solves (2), starting in period t-n at z_{t-n}, is

z_t = b(b z_{t-2} + f_{t-1}) + f_t = ... = b^n z_{t-n} + Σ_{i=0}^{n-1} b^i f_{t-i},

where z_{t-n} is the initial value of the process. For n = t we get in particular

(3) z_t = b^t z_0 + Σ_{i=0}^{t-1} b^i f_{t-i} = b^t z_0 + [(1 - b^t L^t)/(1-bL)] f_t,

since (1-b^t)/(1-b) = 1 + b + b^2 + ... + b^{t-1} and hence (1-(bL)^t)/(1-bL) = 1 + bL + (bL)^2 + ... + (bL)^{t-1}.

If |b| < 1 and we let n → ∞ in (3), i.e., if we assume that the process had started an infinite number of periods ago, we get

(4) z_t = Σ_{i=0}^{∞} b^i f_{t-i} = [1/(1-bL)] f_t, |b| < 1.

We see that z_t "remembers" the current and past values of the given time function, f_t, f_{t-1}, f_{t-2}, ..., with geometrically declining weights. If |b| ≥ 1, the solution to the difference equation (2) explodes, since the absolute value of b^i grows with i. Then the solution can still be written as (3), but not as (4).

Stability condition, first-order difference equation: The homogeneous part of the difference equation (2), i.e., z_t - b z_{t-1} = 0, has a characteristic polynomial, m^t - b m^{t-1} = m^{t-1}(m - b), whose root m, after elimination of the common factor m^{t-1}, shall be less than one in absolute value to ensure stability of the solution. The only root in this simple case is m = b, and hence |m| < 1 ⟺ |b| < 1.

Consider next a linear pth-order difference equation in z_t with constant coefficients:

(5) z_t = b_1 z_{t-1} + b_2 z_{t-2} + b_3 z_{t-3} + ... + b_p z_{t-p} + f_t,

where f_t is a given, arbitrary time function. The corresponding homogeneous difference equation is z_t - b_1 z_{t-1} - b_2 z_{t-2} - b_3 z_{t-3} - ... - b_p z_{t-p} = 0. We skip the solution of the difference equation (5), but turn to its stability condition.
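The closed-form solution (3) can be verified against the recursion it solves. A small sketch, with arbitrary illustrative values for b, z_0 and the forcing terms f_t:

```python
b = 0.8                              # illustrative coefficient
z0 = 1.5                             # initial value z_0
f = [0.2, -0.4, 1.0, 0.7, -0.1]      # arbitrary forcing terms f_1, ..., f_5

# Recursive solution of z_t = b*z_{t-1} + f_t, starting from z_0.
z = z0
for ft in f:
    z = b * z + ft

# Closed form (3): z_t = b^t * z_0 + sum_{i=0}^{t-1} b^i * f_{t-i}.
t = len(f)
closed = b**t * z0 + sum(b**i * f[t - 1 - i] for i in range(t))
print(z, closed)
```

The two numbers coincide for any choice of b, z_0 and f; stability (|b| < 1) matters only for the limit (4), not for the finite-horizon identity (3).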
Stability condition, pth-order difference equation: The homogeneous part of the difference equation (5) has a characteristic polynomial,

m^t - b_1 m^{t-1} - b_2 m^{t-2} - b_3 m^{t-3} - ... - b_p m^{t-p} = m^{t-p}[m^p - b_1 m^{p-1} - b_2 m^{p-2} - b_3 m^{p-3} - ... - b_{p-1} m - b_p],
where the latter expression, according to the Fundamental Theorem of Algebra, can be written as

m^{t-p}[m^p - b_1 m^{p-1} - b_2 m^{p-2} - b_3 m^{p-3} - ... - b_{p-1} m - b_p] = m^{t-p}(m - r_1)(m - r_2)(m - r_3)···(m - r_p),

where r_1, r_2, r_3, ..., r_p are the p roots of the polynomial after elimination of the common factor m^{t-p}, real or complex (conjugate), multiple roots being counted by their multiplicities. The stability condition generalizing |m| < 1 in the first-order example requires that these p roots (real or complex) all lie inside the unit circle, i.e., the circle going through the points +1, +i, -1, -i, where i = √(-1). See Hamilton, Chapter 1, and RN's note on complex numbers (on the web) for details.

3. STATIONARY AND NON-STATIONARY PROCESSES: GENERALITIES

Definition: A stochastic process {Y} is (strongly) stationary if (Y_t, Y_{t+1}, ..., Y_{t+H}) has the same probability distribution as (Y_{t+T}, Y_{t+T+1}, ..., Y_{t+T+H}) for all t, T and H, where H and T are finite. Intuitively, stationarity requires that the probability distribution does not change its form when we move the window through which we view realizations of the process.

If this condition is NOT satisfied, the stochastic process {Y} is said to be non-stationary. Intuitively, the form of the joint probability distribution of (Y_t, Y_{t+1}, ..., Y_{t+H}) then changes over time.

Implications of (strong) stationarity: All (existing) moments of the distribution of (Y_t, Y_{t+1}, ..., Y_{t+H}) of the first, second, and higher order are t-invariant, e.g.,

(6) E(Y_t) = µ, for all t,
(7) V(Y_t) = E[(Y_t - µ)^2] = σ_{00}, for all t,
(8) C(Y_t, Y_{t+h}) = E[(Y_t - µ)(Y_{t+h} - µ)] = σ_{0h}, for all t, h = 1, ..., H,

but also

E[(Y_t - µ)^p] = τ_{p00}, for all t, p = 3, 4, ...,
E[(Y_t - µ)^q (Y_{t+h} - µ)^r] = τ_{qrh}, for all t, with q = 1, 2, ...; r = 2, 3, ... and q = 2, 3, ...; r = 1, 2, ....

Note: If at least one of conditions (6)-(8) is violated, the process cannot be (strongly) stationary.
Weaker forms of stationarity: If (6), (7) and (8) are satisfied, the process {Y_t} is said to be weakly stationary, or covariance stationary. If (6) is satisfied [but not necessarily (7) and (8)], the process {Y_t} is said to be stationary in the mean.

Many economic variables are non-stationary; they are not even stationary in the mean. In macroeconomics examples abound (GDP, money stock, capital stock, consumer price index, ...). On the other hand, non-stationary variables may be made stationary by transformations, or linear combinations or other functions of non-stationary variables may be stationary. This will be elaborated in the second part of the course. Notable examples are:

Y_t may be non-stationary, while ΔY_t is stationary.
Y_t and X_t may both be non-stationary, while Y_t - aX_t is, for some a, stationary.

4. BASIC UNIVARIATE STOCHASTIC PROCESSES WITH ZERO MEAN

White noise process: {ε_t}_{t=1}^{T} ~ w.n.:

(9) E(ε_t) = 0, E(ε_t ε_s) = σ²_ε if t = s, and 0 if t ≠ s.

Alternatively (and somewhat stronger), ε_t ~ IID(0, σ²_ε), where IID denotes identically, independently distributed. Still stronger, ε_t ~ IIN(0, σ²_ε), where IIN denotes identically, independently, normally distributed. White noise processes are prominent examples of stationary processes.

Random walk process: {u_t} ~ r.w.:

(10) u_t = u_{t-1} + ε_t, where ε_t ~ w.n.

Hence: a random walk process is a process whose first difference is white noise:

(11) {u_t} ~ r.w. ⟺ {Δu_t} ~ w.n. ⟹ u_t = u_0 + Σ_{s=1}^{t} ε_s ⟹ E(u_t|u_0) = u_0, var(u_t|u_0) = tσ²_ε.

Random walk processes are prominent examples of non-stationary processes.

Zero mean autoregressive (AR) process of order 1: {u_t} ~ AR(1):

(12) u_t = ρu_{t-1} + ε_t, where ε_t ~ w.n.
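The cumulative-sum representation in (11) is an exact identity, whatever the realized shocks. A minimal sketch, assuming a draw from Python's standard random module stands in for the white noise:

```python
import random

random.seed(0)                                    # reproducible illustration
u0 = 3.0                                          # initial value u_0
eps = [random.gauss(0.0, 1.0) for _ in range(50)] # one white-noise realization

# Random walk recursion (10): u_t = u_{t-1} + eps_t.
u = u0
for e in eps:
    u += e

# Cumulative-sum form from (11): u_t = u_0 + sum_{s=1}^t eps_s.
print(u, u0 + sum(eps))
```

The two numbers agree for every realization; the non-stationarity shows up in the conditional variance tσ²_ε, which grows without bound as t increases.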
Zero mean autoregressive (AR) process of order p: {u_t} ~ AR(p):

(13) u_t = ρ_1 u_{t-1} + ρ_2 u_{t-2} + ... + ρ_p u_{t-p} + ε_t, where ε_t ~ w.n.

Zero mean moving average (MA) process of order 1: {u_t} ~ MA(1):

(14) u_t = ε_t + θε_{t-1}, where ε_t ~ w.n.

Zero mean moving average (MA) process of order q: {u_t} ~ MA(q):

(15) u_t = ε_t + θ_1 ε_{t-1} + ... + θ_q ε_{t-q}, where ε_t ~ w.n.

Acronyms: AR = Auto-Regressive, MA = Moving Average, ARMA = Auto-Regressive-Moving Average.

Zero mean ARMA(1,1) process: {u_t} ~ ARMA(1,1) means:

(16) u_t = ρu_{t-1} + ε_t + θε_{t-1}, where ε_t ~ w.n.

This process combines AR(1) and MA(1); hence its name.

Zero mean ARMA(p,q) process: {u_t} ~ ARMA(p,q) means:

(17) u_t = ρ_1 u_{t-1} + ρ_2 u_{t-2} + ... + ρ_p u_{t-p} + ε_t + θ_1 ε_{t-1} + ... + θ_q ε_{t-q}, where ε_t ~ w.n.

This process combines AR(p) and MA(q); hence its name.

Special cases:
1. Zero mean AR(p) = zero mean ARMA(p,0),
2. Zero mean MA(q) = zero mean ARMA(0,q),
3. White noise = zero mean AR(0) = zero mean MA(0) = zero mean ARMA(0,0),
4. Random walk = zero mean AR(1) with ρ = 1.

Notice: The zero mean processes discussed above may be of interest in themselves or, often more interestingly, occur as disturbance processes in econometric time series models together with other (observable or unobservable) variables which we as economists/econometricians want to model.

5. AR, MA, AND ARMA PROCESSES WITH NON-ZERO MEAN

We now extend the AR, MA and ARMA processes (12)-(17) by including intercepts and, in doing this, as a step towards including covariates, allow for non-zero means, switching our variable of interest from u_t (unobservable) to Y_t (observable). Along the way, we introduce lag polynomials, drawing on the results in Section 2.
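Special case 4 can be illustrated numerically: an AR(1) recursion with ρ = 1 traces out exactly the cumulative sum of the shocks, i.e., a random walk. A sketch with a fixed illustrative shock sequence:

```python
rho = 1.0                                # AR(1) coefficient at the random-walk boundary
eps = [0.4, -1.1, 0.7, 0.2, -0.3]        # a fixed shock sequence for illustration

# AR(1) recursion u_t = rho*u_{t-1} + eps_t, started at u_0 = 0.
u = [0.0]
for e in eps:
    u.append(rho * u[-1] + e)

# Random walk: u_t = u_0 + cumulative sum of the shocks, cf. (11).
walk = [0.0]
for e in eps:
    walk.append(walk[-1] + e)

print(u)
print(walk)
```

For any ρ with |ρ| < 1 the two paths would diverge, since the AR(1) recursion then discounts old shocks geometrically while the random walk keeps them with full weight.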
Autoregressive (AR) process of order 1:

(18) Y_t = c + φY_{t-1} + ε_t ⟺ (1 - φL)Y_t = c + ε_t, where ε_t ~ w.n.

This exemplifies a first-order linear, stochastic difference equation, where c + ε_t formally corresponds to the extraneously given time function, cf. f_t above.

Autoregressive (AR) process of order p:

(19) Y_t = c + φ_1 Y_{t-1} + φ_2 Y_{t-2} + ... + φ_p Y_{t-p} + ε_t ⟺ (1 - φ_1 L - φ_2 L^2 - ... - φ_p L^p)Y_t = c + ε_t ⟺ [1 - φ(L)]Y_t = c + ε_t, φ(L) = φ_1 L + φ_2 L^2 + ... + φ_p L^p, where ε_t ~ w.n.

This exemplifies a pth-order linear, stochastic difference equation, where c + ε_t formally corresponds to the extraneously given time function, cf. f_t above.

Moving average (MA) process of order 1:

(20) Y_t = k + ε_t + θε_{t-1} ⟺ Y_t = k + (1 + θL)ε_t, where ε_t ~ w.n.

It follows that this MA(1) process is stationary (at least covariance stationary), with

E(Y_t) = k, var(Y_t) = (1 + θ^2)σ²_ε, cov(Y_t, Y_{t+h}) = θσ²_ε for h = ±1, and 0 for h = ±2, ±3, ....

y_t = Y_t - k is zero mean stationary.

Moving average (MA) process of order q:

(21) Y_t = k + ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2} + ... + θ_q ε_{t-q} ⟺ Y_t = k + (1 + θ_1 L + θ_2 L^2 + ... + θ_q L^q)ε_t ⟺ Y_t = k + [1 + θ(L)]ε_t, θ(L) = θ_1 L + θ_2 L^2 + ... + θ_q L^q, where ε_t ~ w.n.

It follows that this MA(q) process is stationary (at least covariance stationary), with

E(Y_t) = k, var(Y_t) = (1 + Σ_{i=1}^{q} θ_i^2)σ²_ε, cov(Y_t, Y_{t+h}) = (θ_h + Σ_{i=1}^{q-h} θ_{h+i}θ_i)σ²_ε for h = 1, ..., q, and 0 for |h| > q.

Its variance is finite provided that Σ_{s=1}^{q} θ_s^2 < ∞. y_t = Y_t - k is zero mean stationary.

ARMA(1,1) process:

(22) Y_t = c + φY_{t-1} + ε_t + θε_{t-1} ⟺ (1 - φL)Y_t = c + (1 + θL)ε_t, where ε_t ~ w.n.

ARMA(p,q) process:

(23) [1 - φ(L)]Y_t = c + [1 + θ(L)]ε_t, where ε_t ~ w.n., φ(L) = φ_1 L + φ_2 L^2 + ... + φ_p L^p, θ(L) = θ_1 L + θ_2 L^2 + ... + θ_q L^q.
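The MA(q) autocovariance formula above is easy to put into a small helper function; `ma_autocov` is a hypothetical name introduced here for illustration, and the MA(1) values (1 + θ²)σ²_ε and θσ²_ε from (20) serve as a cross-check:

```python
def ma_autocov(theta, sigma2, h):
    """Autocovariance cov(Y_t, Y_{t+h}) of an MA(q) process
    Y_t = k + eps_t + theta[0]*eps_{t-1} + ... + theta[q-1]*eps_{t-q},
    where eps_t is white noise with variance sigma2."""
    q = len(theta)
    h = abs(h)
    if h > q:
        return 0.0  # the MA(q) autocovariances cut off beyond lag q
    # Treat the coefficient of eps_t as theta_0 = 1.
    full = [1.0] + list(theta)
    return sigma2 * sum(full[i] * full[i + h] for i in range(q + 1 - h))

theta1, s2 = 0.5, 2.0                 # illustrative MA(1) parameters
print(ma_autocov([theta1], s2, 0))    # (1 + theta^2) * sigma^2
print(ma_autocov([theta1], s2, 1))    # theta * sigma^2
print(ma_autocov([theta1], s2, 2))    # zero beyond lag q = 1
```

The cut-off of the autocovariances beyond lag q is the distinguishing fingerprint of an MA(q) process, in contrast to the geometrically decaying autocovariances of a stationary AR process.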
What about the stationarity properties of AR and ARMA processes? And if stationarity prevails, what about their expectation? To answer these questions, we first have to introduce a new concept, invertibility, and explain its close association with some of the concepts introduced above: lag polynomials, stability conditions of difference equations, their characteristic polynomials and the unit circle.

6. INVERTIBILITY CONDITIONS AND THE UNIT CIRCLE

Invertibility condition, first-order equation: Let L be the lag operator. The difference equation (2) can then be expressed as

(24) z_t = bLz_t + f_t ⟺ (1 - bL)z_t = f_t.

Since |b| < 1, we are allowed to divide in (24) by (1 - bL), to get

(25) (1 - bL)z_t = f_t ⟹ z_t = [1/(1 - bL)]f_t, |b| < 1.

Definition: We call the condition which allows us to derive the implication z_t = [1/(1 - bL)]f_t from (1 - bL)z_t = f_t the invertibility condition of the difference equation (24). The invertibility condition in this case is |b| < 1, which indeed coincides with the stability condition for the difference equation; see Section 2.

Since 1/(1 - bL) = 1 + bL + b^2 L^2 + b^3 L^3 + ..., |b| < 1, we have that

z_t = [1/(1 - bL)]f_t = (1 + bL + b^2 L^2 + b^3 L^3 + ...)f_t = Σ_{i=0}^{∞} b^i f_{t-i}.

This invertibility property shows an easy way of solving the difference equation (24), in the sense of expressing z_t in terms of the past time path of f_t, by using the lag operator and the invertibility condition.

Invertibility condition, first-order equation, alternative formulation: Replace now L in the lag polynomial in (24) by the scalar x. This gives the polynomial 1 - bx. The difference equation is invertible if the root of this polynomial is larger than one in absolute value. The root is x = 1/b, which must be strictly greater than one in absolute value. This coincides with the condition for stability of the first-order difference equation we have found earlier, because |1/b| > 1 ⟺ |b| < 1.

Invertibility condition, pth-order equation: Let L be the lag operator.
The difference equation (5) can then be expressed as

(26) z_t = b(L)z_t + f_t ⟺ [1 - b(L)]z_t = f_t, b(L) = b_1 L + b_2 L^2 + ... + b_p L^p.

Are we allowed to divide in (26) by [1 - b(L)], to get

(27) [1 - b(L)]z_t = f_t ⟹ z_t = {1/[1 - b(L)]}f_t,

and if so, which conditions should be satisfied? As a generalization of the result for the first-order case above we have:
Definition: We call the condition which allows us to derive the implication z_t = {1/[1 - b(L)]}f_t from [1 - b(L)]z_t = f_t the invertibility condition of the difference equation (26). Again, the invertibility conditions involve roots of the characteristic polynomial and coincide with the stability condition for the difference equation; see Section 2.

Invertibility condition, p lags, precise formulation: Replace now L in the lag polynomial in (26) by the scalar x. This gives the polynomial

1 - b(x) = 1 - b_1 x - b_2 x^2 - b_3 x^3 - ... - b_p x^p.

The difference equation is invertible if all roots (real or complex) of this polynomial, i.e., the solution values for x of 1 - b(x) = 0, are outside the unit circle:

All roots of the characteristic polynomial are inside the unit circle ⟺ All roots of the lag polynomial are outside the unit circle.

It is incorrect to say that we want to find the values of the lag operator for which the lag polynomial is zero. The point is that the operator L should be replaced by an ordinary scalar entity, say x, to make this prescription work. The lag operator cannot be a root of anything.

7. STATIONARITY CONDITIONS FOR AR- AND ARMA-PROCESSES

It follows from the results in Sections 5 and 6 that:

The AR(1) process (18) and the ARMA(1,1) process (22) may be either stationary or non-stationary. They are stationary if |φ| < 1. They are non-stationary if |φ| ≥ 1. No restriction is placed on θ in the ARMA case. Under stationarity, we have in both cases

E(Y_t) = µ = c/(1 - φ),

and y_t = Y_t - µ is a zero mean stationary process. See Hamilton, Figure 3.3, for an illustration.

The AR(p) process (19) and the ARMA(p,q) process (23), with p and q finite, may be either stationary or non-stationary. They are stationary if all roots of 1 - φ(x) = 0 are outside the unit circle. They are non-stationary if at least one root of 1 - φ(x) = 0 is on or inside the unit circle. No restriction is placed on the coefficients of θ(L) in the ARMA case.
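The root condition for AR(p) stationarity can be checked numerically. A sketch, assuming NumPy is available; `ar_is_stationary` is a hypothetical helper name, and the coefficient vectors in the calls are illustrative:

```python
import numpy as np

def ar_is_stationary(phi):
    """True if the AR(p) process with coefficients phi = [phi_1, ..., phi_p]
    is stationary, i.e. if all roots of the lag polynomial
    1 - phi_1 x - ... - phi_p x^p lie strictly outside the unit circle."""
    # np.roots expects coefficients in descending powers of x.
    coeffs = [-p for p in phi[::-1]] + [1.0]
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

print(ar_is_stationary([0.5]))        # AR(1) with |phi| < 1: stationary
print(ar_is_stationary([1.0]))        # random walk: root on the unit circle
print(ar_is_stationary([0.5, 0.3]))   # a stationary AR(2)
print(ar_is_stationary([1.2, -0.1]))  # a non-stationary AR(2)
```

Equivalently, one could form the characteristic polynomial m^p - φ_1 m^{p-1} - ... - φ_p and require all its roots to lie inside the unit circle; the two checks are the same condition, since the roots of the two polynomials are reciprocals of each other.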
Under stationarity, we have in both cases

E(Y_t) = µ = c/(1 - Σ_{i=1}^{p} φ_i),

and y_t = Y_t - µ is a zero mean stationary process.
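One way to see the mean formula at work: with the disturbances switched off, a stable AR(p) recursion converges to µ = c/(1 - Σφ_i) from any starting values. A sketch with illustrative numbers:

```python
c = 0.4
phi = [0.5, 0.3]                  # a stationary AR(2), illustrative values
mu = c / (1.0 - sum(phi))         # unconditional mean from the formula above

# Deterministic recursion Y_t = c + phi_1*Y_{t-1} + phi_2*Y_{t-2}
# (all eps_t set to zero), started from arbitrary values.
y = [10.0, -5.0]
for _ in range(200):
    y.append(c + phi[0] * y[-1] + phi[1] * y[-2])

print(mu, y[-1])
```

Under stationarity the stochastic process fluctuates around this same µ; the deterministic skeleton merely makes the fixed point visible.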