ARIMA Models
Dan Saunders

I will discuss models with a dependent variable y_t, a potentially endogenous error term ε_t, and an exogenous error term η_t, each with a subscript t denoting time. With just these three objects, we may consider a rich class of models called:

    Autoregressive   Integrated   Moving Averages
        (AR)            (I)           (MA)

Autoregression

To start, consider an AR(1) model:

    y_t = φ y_{t-1} + ε_t

Right away you notice this is no different from any standard regression y_i = β x_i + ε_i. We have simply relabeled the coefficient, β → φ, and the right-hand-side variable is a lag of the dependent variable, x → y_{t-1}. Since it is no different from any other regression, an exogenous error term is enough for OLS to be consistent:

    If E(ε_t) = 0, E(ε_t y_{t-1}) = 0, and E(y_t²) < ∞, then

    φ̂_OLS = [Σ_t y_t y_{t-1}] / [Σ_t y_{t-1}²] = [Σ_t (φ y_{t-1} + ε_t) y_{t-1}] / [Σ_t y_{t-1}²] →_p φ + E(ε_t y_{t-1}) / E(y_{t-1}²) = φ

In words, as long as assumptions 1-3 hold, OLS is consistent.
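To make the consistency claim concrete, here is a minimal simulation sketch in Python with numpy (my substitution, since the handout itself works in Eviews; the sample size and φ = 0.6 are arbitrary choices of mine). With an exogenous error, the OLS slope from regressing y_t on y_{t-1} settles near the true φ:

```python
import numpy as np

# Simulate an AR(1): y_t = phi * y_{t-1} + eps_t, with an exogenous
# (white-noise) error term.
rng = np.random.default_rng(0)
phi, n = 0.6, 10_000
eps = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + eps[t]

# OLS slope of y_t on y_{t-1}: sum(y_t * y_{t-1}) / sum(y_{t-1}^2)
y_lag, y_cur = y[:-1], y[1:]
phi_hat = (y_cur @ y_lag) / (y_lag @ y_lag)
print(phi_hat)  # close to 0.6 for large n
```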

Originally, assumption 4, homoskedasticity, did not affect the unbiasedness or consistency of OLS. With lagged dependent variables, this is no longer true: serial correlation in the error term generates endogeneity bias, a violation of assumption 2, through omitted variables. This is an implicit violation of assumption 1, i.e., we have misspecified the model.

Let's start with some basic intuition. Suppose the serial correlation of the error term is itself AR(1):

    ε_t = ρ ε_{t-1} + η_t

Then it is clear that the error term is correlated with the right-hand-side variable:

    y_t = φ y_{t-1} + ε_t,   where both y_{t-1} and ε_t depend on ε_{t-1}

So, what should we do? If we had a way for Eviews to control for the serial correlation of the error, then the remaining error would be exogenous, and we could perform OLS. Well, that's exactly what the AR(1) function in Eviews does. To clarify, this is a function in Eviews that controls for AR(1) serial correlation in the error term, not in y, regardless of whether that autocorrelation generates bias. Thus, we could use it in any regression, using Cochrane-Orcutt:

    If y_i = β x_i + ε_i, and ε_i = ρ ε_{i-1} + η_i, then:   ls y x AR(1)

If we have a lagged dependent variable, then this solution works as well, using Iterated-Cochrane-Orcutt:

    If y_t = φ y_{t-1} + ε_t, and ε_t = ρ ε_{t-1} + η_t, then:   ls y y(-1) AR(1)

The coefficient on y(-1) will be φ̂ and on AR(1) will be ρ̂. (Or will they? More later...) However, as I said earlier, this is really a violation of assumption 1. To see this, first substitute the AR(1) error equation into the AR(1) main equation:

    y_t = φ y_{t-1} + ρ ε_{t-1} + η_t

Now, substitute the lagged main equation in for ε_{t-1}:

    y_t = φ y_{t-1} + ρ (y_{t-1} − φ y_{t-2}) + η_t

Collecting terms, we can see that the true model (the one we implicitly wrote down) is:

    y_t = (φ + ρ) y_{t-1} − (ρφ) y_{t-2} + η_t

where the error is now exogenous. We may choose to re-write the equation, using different letters to differentiate the true coefficients from the misspecified model:

    y_t = λ_1 y_{t-1} + λ_2 y_{t-2} + η_t

If we try to solve for φ and ρ as functions of λ_1 and λ_2 (they are the two roots of x² − λ_1 x − λ_2 = 0), we find:

    φ = [λ_1 + √(λ_1² + 4λ_2)] / 2,   ρ = [λ_1 − √(λ_1² + 4λ_2)] / 2
    OR
    φ = [λ_1 − √(λ_1² + 4λ_2)] / 2,   ρ = [λ_1 + √(λ_1² + 4λ_2)] / 2

We have no way of knowing which, and for the purposes of forecasting, it doesn't matter: if Eviews sets φ̂ = ρ and ρ̂ = φ, we will have the exact same forecast. Whether Eviews converges to the correct solution, or the reverse, depends upon the initial condition used for Iterated-Cochrane-Orcutt. However, why not simply run OLS on the correctly specified equation, and generate identical forecasts:

    ls y y(-1) y(-2)

The lesson embedded in this problem is important. In general, any AR(p) model for y with an AR(h) error term is, in fact, a misspecified AR(p+h) model of y. Thus, finding the correct specification for any autoregressive process will resolve the autocorrelation in the error term and, hence, remove the bias. This is why we should reject any AR model with serially correlated residuals and try higher-order AR models, rather than try to control for the serial correlation of the error directly.
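A quick way to see this equivalence is to simulate it. The sketch below (Python/numpy again; φ = 0.5 and ρ = 0.3 are my arbitrary picks) generates y with an AR(1) error and then runs OLS on two lags; the estimates land near λ_1 = φ + ρ = 0.8 and λ_2 = −φρ = −0.15:

```python
import numpy as np

rng = np.random.default_rng(1)
phi, rho, n = 0.5, 0.3, 20_000
eta = rng.standard_normal(n)

# Build eps_t = rho * eps_{t-1} + eta_t, then y_t = phi * y_{t-1} + eps_t
eps = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    eps[t] = rho * eps[t - 1] + eta[t]
    y[t] = phi * y[t - 1] + eps[t]

# OLS of y_t on (y_{t-1}, y_{t-2}) -- the implied AR(2)
X = np.column_stack([y[1:-1], y[:-2]])
lam = np.linalg.lstsq(X, y[2:], rcond=None)[0]
print(lam)                    # approx [0.8, -0.15]
print(phi + rho, -phi * rho)  # the true lambda_1, lambda_2
```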

Okay. Lesson learned, let's go run a bunch of AR models. But wait, if the AR(1) command in Eviews refers to the error term, not the dependent variable, then what's the command we want? Well, if we believe y is AR(1) and the error is exogenous, then we run OLS:

    y_t = φ y_{t-1} + ε_t,   ε_t = η_t

    ls y y(-1)

Alternatively, we could regress y on nothing, but assume the error is serially correlated:

    y_t = ε_t,   ε_t = φ ε_{t-1} + η_t

Why is this an equivalent model? Repeat the steps from above. First, substitute the serial correlation equation into the main equation:

    y_t = φ ε_{t-1} + η_t

Second, use the main equation to replace ε_{t-1}:

    y_t = φ y_{t-1} + η_t

Therefore, in Eviews we run the command:

    ls y AR(1)

which literally says "regress y on nothing, but control for an AR(1) serially correlated error term." Yet the result is an estimation of the exact same model. This result is also important because it extends to all cases. Suppose we believe that y is an AR(p) process when the model is correctly specified:

    y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + ⋯ + φ_p y_{t-p} + ε_t,   ε_t = η_t

Then we may estimate this equation in Eviews as:

    y_t = ε_t,   ε_t = φ_1 ε_{t-1} + φ_2 ε_{t-2} + ⋯ + φ_p ε_{t-p} + η_t

    ls y AR(1) AR(2) ... AR(p)

The main difference will now be that Eviews understands you are performing time series analysis and stores the autocorrelation functions for the model, so you should always do it this way.

Okay, so now you understand. Lagged dependent variables with serial correlation in the residuals mean you should try a different AR(p) specification using the AR(1) ... AR(p) commands. Likewise, for a regression without lagged dependent variables, but with serially correlated errors, you may add the AR(1) ... AR(p) commands to remove the serial correlation from the error. Both methods work with the same simple, flexible commands (or at least that's the idea).
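There is a parallel outside Eviews. In Python's statsmodels (my substitution; the parameterization below is statsmodels' convention, not Eviews'), treating y as an AR(1) process and treating y as "nothing plus an AR(1) error" collapse into essentially the same estimate:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
phi, n = 0.7, 5_000
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.standard_normal()

# "ls y y(-1)": plain OLS of y_t on y_{t-1}
phi_ols = (y[1:] @ y[:-1]) / (y[:-1] @ y[:-1])

# "ls y AR(1)": y regressed on nothing with an AR(1) error, which
# statsmodels parameterizes as ARIMA(1, 0, 0) with no constant
phi_arima = ARIMA(y, order=(1, 0, 0), trend="n").fit().params[0]

print(phi_ols, phi_arima)  # both close to 0.7
```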

Moving Average

So what's a moving average? It is simplest to understand with real data and even weighting. Suppose we have any random data over time. We may ask: what's the three-day running average? Of course, we need the first three numbers in order to calculate the first term, so we will have two fewer averages than observations when we're done.

More generally, we may construct a moving average of order q for any data:

    x̄_t(q) = (1/q) Σ_{i=1}^{q} x_{t-i}

We don't even require equal weights:

    x̄_t(q) = Σ_{i=1}^{q} α_i x_{t-i},   where Σ_{i=1}^{q} α_i = 1
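As a concrete illustration, here is a minimal sketch of a three-period running average (Python/numpy; the data are made up):

```python
import numpy as np

x = np.array([4.0, 7.0, 1.0, 3.0, 8.0, 5.0, 2.0])

# Equal-weight moving average of order 3: each output averages three
# consecutive observations, so we end up with two fewer averages
# than observations.
ma3 = np.convolve(x, np.ones(3) / 3, mode="valid")
print(ma3)  # [4.0, 3.6667, 4.0, 5.3333, 5.0]
```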

From this perspective, moving averages seem quite simple. What makes our moving averages difficult is that they are defined for ε, the unobservable error term. Moreover, it is assumed that the error term is exogenous, i.e., the AR process is correctly specified so that ε_t = η_t. Finally, our weights don't sum to one; instead, they satisfy the unit root restriction. Again, in Eviews the MA(1) function is an assumption about the error, so we could estimate an MA(1) as follows:

    y_t = ε_t,   ε_t = θ η_{t-1} + η_t

    ls y MA(1)

Again, this is more familiar if we substitute the error equation into the main equation:

    y_t = θ η_{t-1} + η_t

Likewise, we may imagine any MA(q) model:

    y_t = θ_1 η_{t-1} + θ_2 η_{t-2} + ⋯ + θ_q η_{t-q} + η_t

And we could estimate any such model in Eviews as:

    ls y MA(1) MA(2) ... MA(q)

It is important to note that the moving average is with respect to the exogenous error term. Thus, in order to have any chance at accurately estimating the moving average coefficients, we must first believe that the residuals we observe are not serially correlated.

This takes us back to the principal question: how are we to select an ARMA model? The answer:

1. We must select an AR(p) process that is a plausibly correct specification. One necessary (but not sufficient) condition is that the residuals not be serially correlated. We should add as many terms as necessary but no more.

2. Once we can obtain unbiased residuals, we may use them to estimate a moving average on the exogenous error. We should add as many terms as necessary but no more.

We do all of this simultaneously by running many ARMA(p,q) models (see the sketch below). We must throw out any models with serially correlated residuals. Among the remaining models, we must balance our desire for correct specification with parsimony (simplicity). One method is to select the model with the minimum Akaike Information Criterion or (often preferred) the minimum Schwarz Criterion (minimum means most negative). However, these are by no means the only methods for selecting a model. We may also appeal to graphical arguments (correlograms), test statistics, or forecasting performance when selecting a model.
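Here is a hedged sketch of that search in Python/statsmodels (again my substitution for the Eviews workflow; the grid bounds, the simulated ARMA(1,1) target, and the use of BIC, which is statsmodels' name for the Schwarz Criterion, are all my choices):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(3)
# Simulate an ARMA(1,1) so the search has a known answer.
phi, theta, n = 0.6, 0.4, 2_000
eta = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + theta * eta[t - 1] + eta[t]

# Fit every ARMA(p, q) on a small grid and keep the minimum-BIC model.
best = min(
    ((p, q) for p in range(3) for q in range(3)),
    key=lambda pq: ARIMA(y, order=(pq[0], 0, pq[1]), trend="n").fit().bic,
)
print(best)  # typically (1, 1)
```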

A Technical Note

I have omitted a constant for the usual reason: easier math. However, you may notice that:

    ls y c AR(1) ... AR(p) MA(1) ... MA(q)

AND

    ls y c y(-1) ... y(-p) MA(1) ... MA(q)

produce different estimates of the constant coefficient ĉ. The short answer is: who cares about the constant anyway? It has no economic significance. I don't mean to imply that you should drop the constant, as that could cause omitted variable bias (the bias we just worked so hard to resolve). Rather, subtract the mean of the dependent variable from each observation (y*_t = y_t − ȳ). Then you can drop the constant from the regression, since the process will be mean zero by construction (assuming stationarity).

(From here on out we shall assume that the ARMA model is well specified, so ε is purely exogenous: ε = η. This is called "white noise" in time series econometrics.)

Integrated ARMA

In order for ARMA estimation to work at all, we must believe that the dependent variable is stationary. There are two definitions:

1. Weakly Stationary: the covariance Cov(y_t, y_{t-j}) = σ_j does not change over time.

2. Strictly Stationary: the distribution of y_t does not change over time.

The weak definition is sufficient for ARMA models, although it's easier to imagine the strict definition. In order to transform non-stationary data into something stationary, we will consider taking first and second order differences, a process known as integration.

Consider time-series data with a time trend. One option is to de-trend the data:

    y_t = α + µt + ε_t

In this case, y_t is called trend-stationary, and adding @trend in Eviews is sufficient to restore stationarity. On the other hand, we may have a random walk with drift:

    y_t = µ + y_{t-1} + ε_t

In this case, the model is called difference-stationary, because de-trending solves the non-stationarity of the drift, not the random walk, while first-differencing solves both. We could easily run this model in Eviews using the d() function, which tells the software to calculate the first difference:

    ls d(y) c

Because Eviews understands d(y) to mean the dependent variable is the first difference of y, this syntax is carried through to the AR(1) commands. We may also want to calculate the second order difference, i.e., the difference of the difference:

    (y_t − y_{t-1}) − (y_{t-1} − y_{t-2}) = φ [(y_{t-1} − y_{t-2}) − (y_{t-2} − y_{t-3})] + ε_t

To run this in Eviews we would iterate the differences:

    ls d(d(y)) AR(1)

While it is mathematically straightforward to extend this concept indefinitely, we typically do not go beyond first or second differencing, as it is hard to imagine the applicability. More generally, an ARIMA(p,1,q), a first order integrated ARMA(p,q) model, looks like:

    (y_t − y_{t-1}) = φ_1 (y_{t-1} − y_{t-2}) + ⋯ + φ_p (y_{t-p} − y_{t-p-1}) + θ_1 ε_{t-1} + ⋯ + θ_q ε_{t-q} + ε_t

This could be run in Eviews as:

    ls d(y) AR(1) ... AR(p) MA(1) ... MA(q)
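To close the loop outside Eviews, the same ARIMA(p,1,q) idea in Python/statsmodels (the d=1 in the order tuple plays the role of d(y); the simulated random walk and the (1,1,0) order are my illustrative choices):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(4)
# A random walk whose *differences* follow an AR(1) with phi = 0.5.
phi, n = 0.5, 3_000
dy = np.zeros(n)
for t in range(1, n):
    dy[t] = phi * dy[t - 1] + rng.standard_normal()
y = np.cumsum(dy)  # integrate the stationary differences

# order=(p, d, q); d=1 differences y once, like Eviews' d(y)
res = ARIMA(y, order=(1, 1, 0), trend="n").fit()
print(res.params[0])  # AR coefficient on the differenced series, near 0.5
```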