Econometrics 2, Fall 2014
Univariate Time Series Analysis; ARIMA Models
Heino Bohn Nielsen

Outline of the Lecture
(1) Introduction to univariate time series analysis.
(2) Stationarity.
(3) Characterizing time dependence: ACF and PACF.
(4) Modelling time dependence: ARMA(p,q).
(5) Lag operators, lag polynomials and invertibility.
(6) Examples: AR(1), AR(2), MA(1).
(7) Model selection.
(8) Estimation.
(9) Forecasting.
Univariate Time Series Analysis
Consider a single time series: y_1, y_2, ..., y_T. We build simple models for y_t as a function of its own past. Univariate models are used for:
Analyzing the properties of a time series: the dynamic adjustment after a shock, transitory or permanent effects, and the presence of unit roots.
Forecasting. A model for E[y_t | x_t] is only useful for forecasting y_{t+1} if we know (or can forecast) x_{t+1}.
Introducing the tools necessary for analyzing more complicated models.

Stationarity
A time series, y_1, y_2, ..., y_t, ..., y_T, is (strictly) stationary if the joint distributions of (y_{t1}, y_{t2}, ..., y_{tn}) and (y_{t1+h}, y_{t2+h}, ..., y_{tn+h}) are the same for all h.
A time series is called weakly stationary or covariance stationary if
  E[y_t] = µ
  V[y_t] = E[(y_t − µ)²] = γ_0
  Cov[y_t, y_{t−k}] = E[(y_t − µ)(y_{t−k} − µ)] = γ_k for k = 1, 2, ...
Often µ and γ_0 are assumed finite. On these slides we consider only stationary processes; later we consider non-stationary processes.
The Autocorrelation Function (ACF)
For a stationary time series we define the autocorrelation function (ACF) as
  ρ_k = Corr(y_t, y_{t−k}) = γ_k / γ_0 = Cov(y_t, y_{t−k}) / sqrt(V(y_t) V(y_{t−k})).
Note that |ρ_k| ≤ 1, ρ_0 = 1, and ρ_k = ρ_{−k}.
Recall that the ACF can (e.g.) be estimated by OLS in the regression model
  y_t = c + ρ_k y_{t−k} + residual.
Under the assumption of white noise, ρ_1 = ρ_2 = ... = 0, it holds that V(ρ̂_k) = T^{−1}, and 95% confidence bands are given by ±2/√T.

The Partial Autocorrelation Function (PACF)
An alternative measure is the partial autocorrelation function (PACF), which is the correlation conditional on the intermediate values, i.e.
  Corr(y_t, y_{t−k} | y_{t−1}, ..., y_{t−k+1}).
The PACF can be estimated as the OLS estimator θ̂_k in the regression
  y_t = c + θ_1 y_{t−1} + ... + θ_k y_{t−k} + residual,
where the intermediate lags are included.
Under the assumption of white noise, θ_1 = θ_2 = ... = 0, it holds that V(θ̂_k) = T^{−1}, and 95% confidence bands are given by ±2/√T.
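Both regression-based estimators can be coded in a few lines. A minimal sketch in Python (the function names and the simulated AR(1) series are illustrative, not part of the slides):

```python
import numpy as np

def acf_ols(y, k):
    """Estimate rho_k by OLS in the regression y_t = c + rho_k * y_{t-k} + residual."""
    Y, X = y[k:], y[:-k]
    A = np.column_stack([np.ones_like(X), X])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef[1]

def pacf_ols(y, k):
    """Estimate theta_k by OLS with all intermediate lags 1..k included."""
    T = len(y)
    X = np.column_stack([np.ones(T - k)] + [y[k - j:T - j] for j in range(1, k + 1)])
    coef, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return coef[-1]  # coefficient on y_{t-k}

# Simulated AR(1) with theta = 0.8: the estimated ACF at lag 1 should be
# near 0.8, while the PACF at lag 2 should be near zero.
rng = np.random.default_rng(42)
y = np.zeros(2000)
for t in range(1, len(y)):
    y[t] = 0.8 * y[t - 1] + rng.standard_normal()

band = 2 / np.sqrt(len(y))  # 95% band under white noise
print(acf_ols(y, 1), pacf_ols(y, 2), band)
```

Under white noise both estimates would fall inside ±2/√T roughly 95% of the time; here only the second one should.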
[Figure: (A) Danish income, log of constant prices; (B) deviation from trend; (C) estimated ACF; (D) estimated PACF.]

The ARMA(p,q) Model
We consider two simple models for y_t: the autoregressive AR(p) model and the moving average MA(q) model.
First define a white noise process, ε_t ~ i.i.d.(0, σ²).
The AR(p) model is defined as
  y_t = θ_1 y_{t−1} + θ_2 y_{t−2} + ... + θ_p y_{t−p} + ε_t.
The systematic part of y_t is a linear function of p lagged values.
The MA(q) model is defined as
  y_t = ε_t + α_1 ε_{t−1} + α_2 ε_{t−2} + ... + α_q ε_{t−q}.
Here y_t is a moving average of past shocks to the process.
They can be combined into the ARMA(p,q) model
  y_t = θ_1 y_{t−1} + ... + θ_p y_{t−p} + ε_t + α_1 ε_{t−1} + ... + α_q ε_{t−q}.
The Lag and Difference Operators
Define the lag operator, L, to have the property that L y_t = y_{t−1}.
Note that L² y_t = L(L y_t) = L y_{t−1} = y_{t−2}.
Also define the first difference operator, Δ = 1 − L, such that
  Δy_t = (1 − L) y_t = y_t − L y_t = y_t − y_{t−1}.
Note that
  Δ²y_t = Δ(Δy_t) = Δ(y_t − y_{t−1}) = Δy_t − Δy_{t−1}
        = (y_t − y_{t−1}) − (y_{t−1} − y_{t−2}) = y_t − 2y_{t−1} + y_{t−2}.

Lag Polynomials
Consider as an example the AR(2) model
  y_t = θ_1 y_{t−1} + θ_2 y_{t−2} + ε_t.
That can be written as
  y_t − θ_1 y_{t−1} − θ_2 y_{t−2} = ε_t
  (1 − θ_1 L − θ_2 L²) y_t = ε_t
  θ(L) y_t = ε_t,
where θ(L) = 1 − θ_1 L − θ_2 L² is a polynomial in L, denoted a lag polynomial.
Standard rules for calculating with polynomials also hold for polynomials in L.
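The second-difference identity above can be checked numerically. A small sketch using numpy's diff (the example series is arbitrary):

```python
import numpy as np

y = np.array([1.0, 3.0, 6.0, 10.0, 15.0, 21.0])  # arbitrary series

d1 = np.diff(y)        # first difference:  y_t - y_{t-1}
d2 = np.diff(y, n=2)   # second difference: the first difference applied twice

# Delta^2 y_t = y_t - 2*y_{t-1} + y_{t-2}
manual = y[2:] - 2 * y[1:-1] + y[:-2]
print(np.allclose(d2, manual))  # True
```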
Characteristic Equations and Roots
For a model y_t − θ_1 y_{t−1} − θ_2 y_{t−2} = θ(L) y_t = ε_t, we define the characteristic equation as
  θ(z) = 1 − θ_1 z − θ_2 z² = 0.
The solutions, z_1 and z_2, are denoted characteristic roots. An AR(p) has p roots. Some of them may be complex values, h ± v·i, where i = √−1.
The roots can be used for factorizing the polynomial
  θ(z) = 1 − θ_1 z − θ_2 z² = (1 − φ_1 z)(1 − φ_2 z),
where φ_1 = z_1^{−1} and φ_2 = z_2^{−1}.

Invertibility of Polynomials
Define the inverse of a polynomial, θ^{−1}(L) of θ(L), so that θ^{−1}(L) θ(L) = 1.
Consider the AR(1) case, θ(L) = 1 − θL, and look at the product
  (1 − θL)(1 + θL + θ²L² + θ³L³ + ... + θ^k L^k)
  = (1 − θL) + (θL − θ²L²) + (θ²L² − θ³L³) + ... + (θ^k L^k − θ^{k+1} L^{k+1})
  = 1 − θ^{k+1} L^{k+1}.
If |θ| < 1, it holds that θ^{k+1} L^{k+1} → 0 as k → ∞, implying that
  θ^{−1}(L) = (1 − θL)^{−1} = 1 + θL + θ²L² + θ³L³ + ... = Σ_{i=0}^∞ θ^i L^i.
If θ(L) is a finite polynomial, the inverse polynomial, θ^{−1}(L), is infinite.
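Both ideas, characteristic roots and the telescoping geometric-series inverse, are easy to verify numerically. A sketch (the AR(2) coefficients 0.5 and 0.3 and the AR(1) coefficient 0.6 are hypothetical):

```python
import numpy as np

# Characteristic roots of theta(z) = 1 - 0.5 z - 0.3 z^2.
# np.roots expects coefficients ordered from the highest power down.
roots = np.roots([-0.3, -0.5, 1.0])
print(np.all(np.abs(roots) > 1))  # roots outside the unit circle

# Truncated inverse of (1 - theta L): coefficients 1, theta, theta^2, ...
theta = 0.6
inv = theta ** np.arange(50)

# The polynomial product (1 - theta L) * (truncated inverse) telescopes to
# 1 - theta^{k+1} L^{k+1}: a leading 1 followed by zeros and a tiny remainder.
prod = np.convolve([1.0, -theta], inv)
print(abs(prod[0] - 1.0) < 1e-12)
print(np.max(np.abs(prod[1:50])) < 1e-12)
print(abs(prod[50]) < 1e-10)  # remainder theta^50 is negligible
```

Polynomial multiplication is exactly a convolution of coefficient vectors, which is why np.convolve does the work here.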
ARMA Models in AR and MA Form
Using lag polynomials we can rewrite the stationary ARMA(p,q) model as
  y_t − θ_1 y_{t−1} − ... − θ_p y_{t−p} = ε_t + α_1 ε_{t−1} + ... + α_q ε_{t−q}
  θ(L) y_t = α(L) ε_t,   (∗)
where θ(L) and α(L) are finite polynomials.
If θ(L) is invertible, (∗) can be written as the infinite MA(∞) model
  y_t = θ^{−1}(L) α(L) ε_t
  y_t = ε_t + γ_1 ε_{t−1} + γ_2 ε_{t−2} + ...
This is called the MA representation.
If α(L) is invertible, (∗) can be written as an infinite AR(∞) model
  α^{−1}(L) θ(L) y_t = ε_t
  y_t − γ_1 y_{t−1} − γ_2 y_{t−2} − ... = ε_t.
This is called the AR representation.

Invertibility and Stationarity
A finite order MA process is stationary by construction: it is a linear combination of stationary white noise terms. Invertibility is sometimes convenient for estimation and prediction.
An infinite MA process is stationary if the coefficients, α_i, converge to zero. We require that Σ_{i=0}^∞ α_i² < ∞.
An AR process is stationary if θ(L) is invertible. This is important for interpretation and inference. In the case of a root at unity, standard results no longer hold. We return to unit roots later.
Consider again the AR(2) model
  θ(z) = 1 − θ_1 z − θ_2 z² = (1 − φ_1 z)(1 − φ_2 z).
The polynomial is invertible if the factors (1 − φ_i L) are invertible, i.e. if |φ_1| < 1 and |φ_2| < 1.
In general a polynomial, θ(L), is invertible if the characteristic roots, z_1, ..., z_p, are larger than one in absolute value. In complex cases, this corresponds to the roots being outside the complex unit circle (modulus larger than one).

ARMA Models and Common Roots
Consider the stationary ARMA(p,q) model
  y_t − θ_1 y_{t−1} − ... − θ_p y_{t−p} = ε_t + α_1 ε_{t−1} + ... + α_q ε_{t−q}
  θ(L) y_t = α(L) ε_t
  (1 − φ_1 L)(1 − φ_2 L)···(1 − φ_p L) y_t = (1 − ξ_1 L)(1 − ξ_2 L)···(1 − ξ_q L) ε_t.
If φ_i = ξ_j for some i, j, they are denoted common roots or canceling roots. The ARMA(p,q) model is then equivalent to an ARMA(p−1,q−1) model.
As an example, consider
  y_t − y_{t−1} + 0.25 y_{t−2} = ε_t − 0.5 ε_{t−1}
  (1 − L + 0.25 L²) y_t = (1 − 0.5 L) ε_t
  (1 − 0.5 L)(1 − 0.5 L) y_t = (1 − 0.5 L) ε_t
  (1 − 0.5 L) y_t = ε_t.
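The factorization in the example can be verified by multiplying the factor back out; the product of two lag polynomials is the convolution of their coefficient vectors:

```python
import numpy as np

# Coefficients of (1 - 0.5L), ordered [L^0, L^1]
factor = [1.0, -0.5]

# (1 - 0.5L)^2 should reproduce the AR polynomial 1 - L + 0.25 L^2
ar_poly = np.convolve(factor, factor)
print(ar_poly)  # coefficients 1, -1, 0.25
```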
Unit Roots and ARIMA Models
A root at one is denoted a unit root and has important consequences for the analysis. We consider tests for unit roots and unit-root econometrics later.
Consider an ARMA(p,q) model, θ(L) y_t = α(L) ε_t. If there is a unit root in the AR polynomial, we can factorize into
  θ(L) = (1 − L)(1 − φ_2 L)···(1 − φ_p L) = (1 − L) θ*(L),
and we can write the model as
  θ*(L)(1 − L) y_t = α(L) ε_t
  θ*(L) Δy_t = α(L) ε_t.
An ARMA(p,q) model for Δ^d y_t is denoted an ARIMA(p,d,q) model for y_t.

Example: Danish Real House Prices
Consider the Danish real house prices in logs, p_t. An AR(2) model yields (t-values in parentheses)
  p_t = 1.55 p_{t−1} − 0.5734 p_{t−2} + 0.3426.
       (2.7)        (−7.56)          (.3)
The lag polynomial is given by
  θ(L) = 1 − 1.55 L + 0.5734 L²,
with inverse roots given by 0.9422 and 0.6086.
One root is close to unity, and we estimate an ARIMA(2,1,0) model for Δp_t:
  Δp_t = .323 Δp_{t−1} − .4853 Δp_{t−2} + .9959.
        (6.6)          (−6.2)           (.333)
The lag polynomial is given by
  θ(L) = 1 − .323 L + .4853 L²,
with complex (inverse) roots given by .664 ± .2879·i, where i = √−1.
AR(1) Model
Consider the AR(1) model
  Y_t = δ + θ Y_{t−1} + ε_t
  (1 − θL) Y_t = δ + ε_t,
where Y_0 is the initial value.
Provided |θ| < 1, the solution for Y_t can be found as
  Y_t = (1 − θL)^{−1} (δ + ε_t)
      = (1 + θL + θ²L² + θ³L³ + ...)(δ + ε_t)
      = (1 + θ + θ² + θ³ + ...) δ + (1 + θL + θ²L² + θ³L³ + ...) ε_t
      = δ/(1 − θ) + ε_t + θ ε_{t−1} + θ² ε_{t−2} + θ³ ε_{t−3} + ...
This is the MA representation. In finite samples there is an effect of the initial value, θ^t Y_0.

We calculate the expectation
  E[Y_t] = E[δ/(1 − θ) + ε_t + θ ε_{t−1} + θ² ε_{t−2} + ...] = δ/(1 − θ) = µ.
The effect of the constant depends on the autoregressive parameter.
Now define the deviation from the mean, y_t = Y_t − µ, so that
  Y_t = δ + θ Y_{t−1} + ε_t
  Y_t = (1 − θ) µ + θ Y_{t−1} + ε_t
  Y_t − µ = θ (Y_{t−1} − µ) + ε_t
  y_t = θ y_{t−1} + ε_t.
Note that γ_0 = V[Y_t] = V[y_t]. Under stationarity,
  V[y_t] = V[θ y_{t−1} + ε_t]
  V[y_t] = θ² V[y_{t−1}] + V[ε_t]
  V[y_t](1 − θ²) = V[ε_t]
  γ_0 = σ²/(1 − θ²).
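The two moments just derived can be checked by simulation. A sketch with arbitrary parameter values (delta = 1, theta = 0.8, sigma = 1, so µ = 5 and γ_0 = 1/0.36 ≈ 2.78):

```python
import numpy as np

# Simulate a long AR(1), Y_t = delta + theta*Y_{t-1} + eps_t, and compare
# the sample mean and variance with mu = delta/(1-theta) and
# gamma_0 = sigma^2/(1-theta^2).
delta, theta, sigma = 1.0, 0.8, 1.0
mu = delta / (1 - theta)              # = 5.0
gamma0 = sigma**2 / (1 - theta**2)    # ~ 2.78

rng = np.random.default_rng(7)
T = 200_000
Y = np.empty(T)
Y[0] = mu                             # start at the mean
for t in range(1, T):
    Y[t] = delta + theta * Y[t - 1] + sigma * rng.standard_normal()

print(Y.mean())  # should be close to mu
print(Y.var())   # should be close to gamma_0
```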
The covariances are given by
  γ_1 = Cov[y_t, y_{t−1}] = E[y_t y_{t−1}] = E[(θ y_{t−1} + ε_t) y_{t−1}] = θ E[y²_{t−1}] + E[y_{t−1} ε_t]
      = θ σ²/(1 − θ²) = θ γ_0
  γ_2 = Cov[y_t, y_{t−2}] = E[y_t y_{t−2}] = E[(θ y_{t−1} + ε_t) y_{t−2}]
      = E[(θ(θ y_{t−2} + ε_{t−1}) + ε_t) y_{t−2}]
      = E[θ² y²_{t−2} + θ y_{t−2} ε_{t−1} + y_{t−2} ε_t]
      = θ² E[y²_{t−2}] + θ E[y_{t−2} ε_{t−1}] + E[y_{t−2} ε_t] = θ² σ²/(1 − θ²) = θ² γ_0
  γ_k = Cov[y_t, y_{t−k}] = θ^k σ²/(1 − θ²) = θ^k γ_0.
The ACF is given by
  ρ_k = γ_k/γ_0 = θ^k γ_0/γ_0 = θ^k.
The PACF is simply the autoregressive coefficient and then zeros: θ, 0, 0, ...

Examples of AR(1) Models
[Figure: simulated AR(1) processes with estimated ACF and PACF: y_t = ε_t; y_t = 0.8 y_{t−1} + ε_t; y_t = −0.8 y_{t−1} + ε_t.]
Examples of AR(1) Models
[Figure: simulated AR(1) processes: y_t = 0.5 y_{t−1} + ε_t; y_t = 0.95 y_{t−1} + ε_t; y_t = y_{t−1} + ε_t (unit root); y_t = 1.05 y_{t−1} + ε_t (explosive).]

AR(2) Model
Consider the AR(2) model given by
  Y_t = δ + θ_1 Y_{t−1} + θ_2 Y_{t−2} + ε_t.
Again we can find the mean:
  E[Y_t] = δ + θ_1 E[Y_{t−1}] + θ_2 E[Y_{t−2}] + E[ε_t]
  E[Y_t] = δ/(1 − θ_1 − θ_2) = µ,
and define the process y_t = Y_t − µ, for which it holds that
  y_t = θ_1 y_{t−1} + θ_2 y_{t−2} + ε_t.
Multiplying both sides with y_t and taking expectations yields
  E[y_t²] = θ_1 E[y_{t−1} y_t] + θ_2 E[y_{t−2} y_t] + E[ε_t y_t]
  γ_0 = θ_1 γ_1 + θ_2 γ_2 + σ².
Multiplying instead with y_{t−1} yields
  E[y_t y_{t−1}] = θ_1 E[y_{t−1} y_{t−1}] + θ_2 E[y_{t−2} y_{t−1}] + E[ε_t y_{t−1}]
  γ_1 = θ_1 γ_0 + θ_2 γ_1.
Multiplying instead with y_{t−2} yields
  E[y_t y_{t−2}] = θ_1 E[y_{t−1} y_{t−2}] + θ_2 E[y_{t−2} y_{t−2}] + E[ε_t y_{t−2}]
  γ_2 = θ_1 γ_1 + θ_2 γ_0.
Multiplying instead with y_{t−3} yields
  E[y_t y_{t−3}] = θ_1 E[y_{t−1} y_{t−3}] + θ_2 E[y_{t−2} y_{t−3}] + E[ε_t y_{t−3}]
  γ_3 = θ_1 γ_2 + θ_2 γ_1.
These are the so-called Yule-Walker equations.

Formulating in terms of the ACF, we get
  ρ_1 = θ_1 + θ_2 ρ_1
  ρ_2 = θ_1 ρ_1 + θ_2
  ρ_k = θ_1 ρ_{k−1} + θ_2 ρ_{k−2}, k ≥ 3,
or alternatively that
  ρ_1 = θ_1/(1 − θ_2)
  ρ_2 = θ_2 + θ_1²/(1 − θ_2)
  ρ_k = θ_1 ρ_{k−1} + θ_2 ρ_{k−2}, k ≥ 3.
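The Yule-Walker equations give the whole ACF of a stationary AR(2) recursively from the first two values. A sketch (the coefficients 0.5 and 0.3 are hypothetical):

```python
import numpy as np

def ar2_acf(theta1, theta2, kmax):
    """ACF of a stationary AR(2) from the Yule-Walker equations."""
    rho = np.empty(kmax + 1)
    rho[0] = 1.0
    rho[1] = theta1 / (1 - theta2)          # from rho_1 = theta_1 + theta_2 * rho_1
    for k in range(2, kmax + 1):
        rho[k] = theta1 * rho[k - 1] + theta2 * rho[k - 2]
    return rho

theta1, theta2 = 0.5, 0.3
rho = ar2_acf(theta1, theta2, 5)
print(rho[1])  # theta_1/(1 - theta_2)
print(rho[2])  # equals theta_2 + theta_1^2/(1 - theta_2)
```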
Examples of AR(2) Models
[Figure: simulated AR(2) processes with estimated ACF and PACF: y_t = 0.5 y_{t−1} + 0.4 y_{t−2} + ε_t; y_t = 0.8 y_{t−2} + ε_t; y_t = .3 y_{t−1} − .8 y_{t−2} + ε_t.]

MA(1) Model
Consider the MA(1) model
  Y_t = µ + ε_t + α ε_{t−1}, t = 2, 3, ..., T.
If the polynomial α(L) = 1 + αL is invertible, the MA(1) model can be written as an infinite AR(∞) model.
The mean and variance are given by
  E[Y_t] = E[µ + ε_t + α ε_{t−1}] = µ
  V[Y_t] = E[(Y_t − µ)²] = E[(ε_t + α ε_{t−1})²]
         = E[ε_t²] + E[α² ε²_{t−1}] + E[2α ε_t ε_{t−1}]
         = (1 + α²) σ².
The covariances are given by
  γ_1 = Cov[Y_t, Y_{t−1}] = E[(Y_t − µ)(Y_{t−1} − µ)]
      = E[(ε_t + α ε_{t−1})(ε_{t−1} + α ε_{t−2})]
      = E[ε_t ε_{t−1} + α ε_t ε_{t−2} + α ε²_{t−1} + α² ε_{t−1} ε_{t−2}]
      = α σ²
  γ_2 = Cov[Y_t, Y_{t−2}] = E[(Y_t − µ)(Y_{t−2} − µ)]
      = E[(ε_t + α ε_{t−1})(ε_{t−2} + α ε_{t−3})]
      = E[ε_t ε_{t−2} + α ε_t ε_{t−3} + α ε_{t−1} ε_{t−2} + α² ε_{t−1} ε_{t−3}]
      = 0
  γ_k = 0 for k = 3, 4, ...
The ACF is given by
  ρ_1 = γ_1/γ_0 = α σ²/((1 + α²) σ²) = α/(1 + α²)
  ρ_k = 0, k ≥ 2.

Examples of MA Models
[Figure: simulated MA processes with estimated ACF and PACF: y_t = ε_t − 0.9 ε_{t−1}; y_t = ε_t + 0.9 ε_{t−1}; y_t = ε_t − 0.6 ε_{t−1} + 0.5 ε_{t−2} + 0.5 ε_{t−3}.]
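The MA(1) ACF derived above, ρ_1 = α/(1 + α²) and ρ_k = 0 for k ≥ 2, can be checked by simulation (alpha = 0.9 is arbitrary; the sample ACF function is a standard moment estimator, not from the slides):

```python
import numpy as np

# MA(1): Y_t = eps_t + alpha*eps_{t-1}.
alpha = 0.9
rng = np.random.default_rng(3)
eps = rng.standard_normal(200_001)
Y = eps[1:] + alpha * eps[:-1]

def sample_acf(y, k):
    """Moment estimator of the autocorrelation at lag k."""
    y = y - y.mean()
    return np.dot(y[k:], y[:-k]) / np.dot(y, y)

print(sample_acf(Y, 1), alpha / (1 + alpha**2))  # both near alpha/(1+alpha^2)
print(sample_acf(Y, 2))                          # near zero: the ACF cuts off
```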
ARIMA(p,d,q) Model Selection
First find a transformation of the process that is stationary, e.g. Δ^d Y_t.
Recall that for the stationary AR(p) model:
  The ACF is infinite but convergent.
  The PACF is zero for lags larger than p.
For the MA(q) model:
  The ACF is zero for lags larger than q.
  The PACF is infinite but convergent.
The ACF and PACF contain information on p and q and can be used to select relevant models.

If alternative models are nested, they can be tested. Model selection can also be based on information criteria of the form
  IC = log σ̂² + penalty(T, #parameters),
where log σ̂² measures the likelihood and the second term is a penalty for the number of parameters. The information criteria should be minimized!
Three important criteria are
  AIC = log σ̂² + 2k/T
  HQ  = log σ̂² + 2k log(log(T))/T
  BIC = log σ̂² + k log(T)/T,
where k is the number of estimated parameters, e.g. k = p + q.
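The three criteria differ only in the penalty term. A sketch computing them for two hypothetical models (the residual variances, T and k below are made up for illustration):

```python
import numpy as np

def info_criteria(sigma2_hat, T, k):
    """AIC, HQ and BIC as defined above: log(sigma^2_hat) plus a penalty in k and T."""
    base = np.log(sigma2_hat)
    return {
        "AIC": base + 2 * k / T,
        "HQ":  base + 2 * k * np.log(np.log(T)) / T,
        "BIC": base + k * np.log(T) / T,
    }

# A larger model with a slightly smaller residual variance need not win
# once the penalty is added.
small = info_criteria(sigma2_hat=1.00, T=100, k=2)
big = info_criteria(sigma2_hat=0.99, T=100, k=6)
print(small["BIC"] < big["BIC"])  # True: BIC prefers the small model here
```

Note that BIC penalizes extra parameters hardest for large T, so it tends to pick smaller models than AIC.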
Example: Consumption-Income Ratio
[Figure: (A) consumption (c) and income (y), logs; (B) consumption-income ratio, logs; (C) ACF for the series in (B); (D) PACF for the series in (B).]

Model       T     #par   log-lik      SC        HQ        AIC
ARMA(2,2)   130    5     301.825     -4.448    -4.563    -4.55
ARMA(2,1)   130    4     301.39537   -4.477    -4.524    -4.5599
ARMA(2,0)   130    3     301.3898    -4.59     -4.5483   -4.5752
ARMA(1,2)   130    4     301.42756   -4.4722   -4.5246   -4.564
ARMA(1,1)   130    3     299.99333   -4.53     -4.5422   -4.569
ARMA(1,0)   130    2     296.7449    -4.486    -4.578    -4.5258
ARMA(0,0)   130    1     249.8264    -3.86     -3.89     -3.828
---- Maximum likelihood estimation of ARFIMA(1,0,1) model ----
The estimation sample is: 1971 (1) - 2003 (2)
The dependent variable is: cy (ConsumptionData.in7)

             Coefficient   Std.Error   t-value   t-prob
AR-1             0.85736      0.0565      15.2    0.000
MA-1            -0.30082     0.09825     -3.06    0.003
Constant        -0.0934      0.009898    -9.44    0.000

log-likelihood     299.993327
sigma              0.0239986
sigma^2            0.000575934

---- Maximum likelihood estimation of ARFIMA(2,0,0) model ----
The estimation sample is: 1971 (1) - 2003 (2)
The dependent variable is: cy (ConsumptionData.in7)

             Coefficient   Std.Error   t-value   t-prob
AR-1             0.536083    0.08428      6.36    0.000
AR-2             0.250548    0.08479      2.95    0.004
Constant        -0.093547    0.00948     -9.87    0.000

log-likelihood     301.38984
sigma              0.0239238
sigma^2            0.000572349

Estimation of ARMA Models
The natural estimator is maximum likelihood. With normal errors
  log L(θ, α, σ²) = −(T/2) log(2πσ²) − Σ_{t=1}^T ε_t²/(2σ²),
where ε_t is the residual.
For an AR(1) model we can write the residual as
  ε_t = Y_t − δ − θ Y_{t−1},
and OLS coincides with ML.
It is usual to condition on the initial values. Alternatively we can postulate a distribution for the first observation, e.g.
  Y_1 ~ N(δ/(1 − θ), σ²/(1 − θ²)),
where the mean and variance are chosen as implied by the model for the rest of the observations. We say that Y_1 is chosen from the invariant distribution.
For the MA(1) model, Y_t = µ + ε_t + α ε_{t−1}, the residuals can be found recursively as a function of the parameters:
  ε_1 = Y_1 − µ
  ε_2 = Y_2 − µ − α ε_1
  ε_3 = Y_3 − µ − α ε_2
  ...
Here, the initial value is ε_0 = 0, but that could be relaxed if required by using the invariant distribution. The likelihood function can then be maximized wrt. α and µ.

Forecasting
It is easy to forecast with ARMA models; the main drawback is that there is no economic insight.
We want to predict y_{T+k} given all information up to time T, i.e. given the information set I_T = {y_1, ..., y_{T−1}, y_T}.
The optimal predictor is the conditional expectation
  y_{T+k|T} = E[y_{T+k} | I_T].
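The MA(1) residual recursion above is easy to code. A sketch that also checks it on simulated data: at the true parameters the recursion essentially recovers the innovations (the parameter values are arbitrary):

```python
import numpy as np

def ma1_residuals(Y, mu, alpha):
    """Residuals of the MA(1) model Y_t = mu + eps_t + alpha*eps_{t-1},
    computed recursively with eps_0 = 0."""
    eps = np.empty(len(Y))
    eps_prev = 0.0
    for t, y in enumerate(Y):
        eps[t] = y - mu - alpha * eps_prev
        eps_prev = eps[t]
    return eps

# Simulate an MA(1) with mu = 2, alpha = 0.5 and eps_0 = 0, then invert it.
rng = np.random.default_rng(1)
true_eps = rng.standard_normal(1000)
Y = 2.0 + true_eps + 0.5 * np.concatenate([[0.0], true_eps[:-1]])
rec = ma1_residuals(Y, mu=2.0, alpha=0.5)
print(np.max(np.abs(rec - true_eps)))  # essentially zero
```

In ML estimation this recursion would be evaluated inside the likelihood for each trial value of µ and α.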
Consider the ARMA(1,1) model
  y_t = θ_1 y_{t−1} + ε_t + α_1 ε_{t−1}, t = 1, 2, ..., T.
To forecast we substitute the estimated parameters for the true ones and use the estimated residuals up to time T; hereafter, the best forecast of the error terms is zero.
The optimal forecasts will be
  y_{T+1|T} = E[θ_1 y_T + ε_{T+1} + α_1 ε_T | I_T] = θ̂_1 y_T + α̂_1 ε̂_T
  y_{T+2|T} = E[θ_1 y_{T+1} + ε_{T+2} + α_1 ε_{T+1} | I_T] = θ̂_1 y_{T+1|T}.

[Figure: actual values and multi-step forecasts for the consumption-income ratio, 1990-2008.]
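The forecast recursion above can be sketched directly: only the one-step forecast uses the last residual, and from then on the forecasts decay geometrically through the AR part. The parameter and end-of-sample values below are hypothetical:

```python
import numpy as np

def arma11_forecast(y_T, eps_T, theta, alpha, horizon):
    """k-step forecasts from an ARMA(1,1): the one-step forecast uses the
    last estimated residual; beyond that only the AR part propagates."""
    fc = np.empty(horizon)
    fc[0] = theta * y_T + alpha * eps_T
    for k in range(1, horizon):
        fc[k] = theta * fc[k - 1]
    return fc

fc = arma11_forecast(y_T=1.0, eps_T=0.2, theta=0.8, alpha=0.4, horizon=3)
print(fc)  # decays geometrically toward the unconditional mean of zero
```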