AR(p) + I(d) + MA(q) = ARIMA(p, d, q)

Arthur Berg

Outline
1. 4.1: Nonstationarity in the Mean
2. ARIMA

Deterministic Trend Models: Polynomial Trend

Consider the time series
$$Z_t = \alpha_0 + \alpha_1 t + a_t.$$
The mean function of this process is
$$\mu_t = \alpha_0 + \alpha_1 t,$$
which depends on $t$, so this process is not stationary. More generally, the time series
$$Z_t = \alpha_0 + \alpha_1 t + \cdots + \alpha_k t^k + a_t$$
has a $k$th-order polynomial mean function
$$\mu_t = \alpha_0 + \alpha_1 t + \cdots + \alpha_k t^k.$$
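
A minimal sketch of the linear-trend case in R. The parameter values (alpha0 = 2, alpha1 = 0.5), the sample size, and the Gaussian noise are illustrative assumptions, not taken from the slides. It shows that the sample mean shifts between the two halves of the series and that least squares recovers the slope of the mean function.

```r
# Simulate Z_t = alpha0 + alpha1 * t + a_t (all values are illustrative).
set.seed(1)
n <- 200
time_idx <- 1:n
alpha0 <- 2
alpha1 <- 0.5
z <- alpha0 + alpha1 * time_idx + rnorm(n)

# The mean is not constant: compare the two halves of the series.
mean_first <- mean(z[1:100])
mean_second <- mean(z[101:200])
mean_shift <- mean_second - mean_first   # close to alpha1 * 100

# Estimate the linear mean function mu_t by least squares.
trend_fit <- lm(z ~ time_idx)
slope_hat <- unname(coef(trend_fit)["time_idx"])
```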

Deterministic Trend Models: Sinusoidal Frequency

Consider the time series
$$Z_t = \nu_0 + \nu\cos(\omega t + \theta) + a_t = \nu_0 + \alpha\cos(\omega t) + \beta\sin(\omega t) + a_t,$$
where
$$\alpha = \sqrt{\alpha^2 + \beta^2}\,\cos(\theta) \quad\text{and}\quad \beta = -\sqrt{\alpha^2 + \beta^2}\,\sin(\theta),$$
so that $\nu = \sqrt{\alpha^2 + \beta^2}$. More generally,
$$Z_t = \nu_0 + \sum_{j=1}^{m} \bigl(\alpha_j\cos(\omega_j t) + \beta_j\sin(\omega_j t)\bigr) + a_t$$
is the model of "hidden periodicities".
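
A short derivation, using only the angle-addition formula for the cosine, of why the amplitude-phase form and the two-coefficient form describe the same signal. Note that the sign of $\beta$ follows from the identity.

```latex
\begin{aligned}
\nu\cos(\omega t + \theta)
  &= \nu\cos\theta\,\cos(\omega t) - \nu\sin\theta\,\sin(\omega t) \\
  &= \alpha\cos(\omega t) + \beta\sin(\omega t),
  \qquad \alpha = \nu\cos\theta,\quad \beta = -\nu\sin\theta, \\
\alpha^2 + \beta^2
  &= \nu^2\bigl(\cos^2\theta + \sin^2\theta\bigr) = \nu^2 .
\end{aligned}
```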

ARIMA: Definition

Recall the difference operator
$$\nabla Z_t = (1 - B)Z_t = Z_t - Z_{t-1},$$
and more generally
$$\nabla^d Z_t = (1 - B)^d Z_t.$$

Definition (ARIMA). A process $Z_t$ is said to be ARIMA(p,d,q) if $\nabla^d Z_t = (1 - B)^d Z_t$ is ARMA(p, q).

Therefore an ARIMA(p,d,q) model (with mean zero) can be written as
$$\phi(B)(1 - B)^d Z_t = \theta(B)a_t.$$
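
A small sketch of the definition in R, assuming a simulated ARMA(1,1) series with illustrative coefficients (0.5 and 0.4 are not from the slides). Summing an ARMA series once gives an ARIMA(1,1,1) series, and applying (1 - B) with diff() recovers the ARMA series.

```r
set.seed(42)
# w_t: a simulated ARMA(1,1) series (coefficients are illustrative).
w <- as.numeric(arima.sim(model = list(ar = 0.5, ma = 0.4), n = 200))

# Integrate once, so that (1 - B) z_t = w_t for t >= 2.
z <- cumsum(w)

# Differencing undoes the integration (the first value is lost).
dz <- diff(z)
recovered <- isTRUE(all.equal(dz, w[-1]))

# (1 - B)^2 is the difference operator applied twice.
d2_direct <- diff(z, differences = 2)
d2_twice <- diff(diff(z))
```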

Return Rate

Suppose $Z_t$ is the value of an investment at time $t$ and $p_t$ is the percentage change (expressed as a fraction) from $t-1$ to $t$, which may be negative. Therefore we have
$$Z_t = (1 + p_t)Z_{t-1}.$$
Taking logs produces
$$\log(Z_t) = \log(1 + p_t) + \log(Z_{t-1}),$$
or equivalently
$$\nabla\log(Z_t) = \log(Z_t) - \log(Z_{t-1}) = \log(1 + p_t) \approx p_t,$$
where the approximation holds when $p_t$ is close to zero. Another representation of $\nabla\log(Z_t)$ is
$$\nabla\log(Z_t) = \log(Z_t) - \log(Z_{t-1}) = \log\!\left(\frac{Z_t}{Z_{t-1}}\right).$$
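
A quick numerical check of the approximation in R. The starting value of 100 and the four per-period changes are hypothetical numbers chosen for illustration.

```r
# Hypothetical per-period changes p_t (as fractions) and starting value.
p <- c(0.01, -0.02, 0.005, 0.03)
z0 <- 100

# Build the investment path Z_t = (1 + p_t) * Z_{t-1}.
z <- z0 * cumprod(c(1, 1 + p))

# Differenced logs equal log(1 + p_t) exactly ...
log_ret <- diff(log(z))
exact <- isTRUE(all.equal(log_ret, log1p(p)))

# ... and are close to p_t itself when p_t is small.
approx_err <- max(abs(log_ret - p))
```

The approximation error grows roughly like $p_t^2/2$, so it is negligible for quarterly growth rates of a percent or so.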

US Gross National Product

We consider the seasonally adjusted quarterly US GNP from 1947(1) to 2002(3), giving a total of n = 223 observations.

http://research.stlouisfed.org/ (Economic Data → FRED® → Gross Domestic Product (GDP) and Components → GDP/GNP → GNP)

    > gnp96 = read.table("mydata/gnp96.dat")
    > gnp = ts(gnp96[,2], start=1947, frequency=4)
    > plot(gnp, lwd=3)

GDP and Housing

US Gross National Product (cont.)

Just for kicks, let's look at the ACF.

    > acf(gnp, 50)

Simple differencing may not be the answer.

    > plot(diff(gnp))

Percentage Quarterly Growth of US GNP

Instead, we consider the growth rate $\nabla\log(Z_t)$.

    > gnpgr = diff(log(gnp))  # growth rate
    > plot.ts(gnpgr)

Modeling Percentage Quarterly Growth of US GNP

The plots of the ACF and PACF of the GNP growth rate indicate two potential models for the log GNP series:

    ARIMA(0,1,2)
    ARIMA(1,1,0)

We fit an AR(1) to the growth rate, i.e. an ARIMA(1,1,0) to log(gnp).

    > (gnpgr.ar = arima(gnpgr, order = c(1, 0, 0)))
    Call:
    arima(x = gnpgr, order = c(1, 0, 0))

    Coefficients:
             ar1  intercept
          0.3467     0.0083
    s.e.  0.0627     0.0010

    sigma^2 estimated as 9.03e-05:  log likelihood = 718.61,  aic = -1431.22

Modeling in R

R says "intercept" but means "mean". Therefore the fitted model (with $Z_t$ here denoting the growth rate) is
$$Z_t - .0083 = .347(Z_{t-1} - .0083) + a_t,$$
or equivalently
$$Z_t = .005 + .347Z_{t-1} + a_t,$$
i.e. if $\alpha$ is the intercept and $\mu$ is the mean, then $\alpha = \mu(1 - \phi)$.
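
The conversion from R's reported mean to the intercept can be checked with a few lines of arithmetic. The two estimates are copied from the printed output; the previous value z_prev is a hypothetical number used only to confirm that both forms give the same one-step prediction.

```r
mu_hat <- 0.0083    # what R labels "intercept" (really the mean)
phi_hat <- 0.3467   # ar1 estimate

# True intercept of the AR(1) written as Z_t = alpha + phi * Z_{t-1} + a_t.
alpha_hat <- mu_hat * (1 - phi_hat)

# Both forms give the same prediction for any previous value.
z_prev <- 0.012     # hypothetical previous growth rate
pred_mean_form <- mu_hat + phi_hat * (z_prev - mu_hat)
pred_intercept_form <- alpha_hat + phi_hat * z_prev
```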

Modeling in R

From the expression $\alpha = \mu(1 - \phi)$, we see $\sigma_\alpha = \sigma_\mu(1 - \phi)$ (treating $\phi$ as fixed). Therefore we can write down the fitted model, incorporating the standard errors of the estimators, as
$$Z_t = .005_{(.0006)} + .347_{(.063)}\,Z_{t-1} + a_t$$
and $\hat\sigma = \sqrt{9.03 \times 10^{-5}} \approx .0095$.

Also, R has an issue with the I part of ARIMA fits: arima() does not include a constant term when d > 0. So first difference the data, then fit an ARMA model.
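
A sketch of the differencing advice on simulated data. The series below is hypothetical (an AR(1) with coefficient 0.35 around a mean of 0.5, then cumulated), standing in for a log-level series with drift. It compares the coefficients that arima() reports when the differencing is done inside the call with those reported when the data are differenced first.

```r
set.seed(123)
# Hypothetical growth series: AR(1) around a nonzero mean of 0.5.
growth <- 0.5 + as.numeric(arima.sim(model = list(ar = 0.35), n = 300))
level <- cumsum(growth)   # the integrated (level) series

# Differencing inside arima(): no mean/constant term is estimated.
fit_integrated <- arima(level, order = c(1, 1, 0))

# Differencing by hand, then fitting an ARMA: the mean is estimated.
fit_differenced <- arima(diff(level), order = c(1, 0, 0))

coef_names_integrated <- names(coef(fit_integrated))
coef_names_differenced <- names(coef(fit_differenced))
```

In this sketch only the second fit carries an "intercept" coefficient, which is why the slides fit `gnpgr` with `order = c(1, 0, 0)` rather than `log(gnp)` with `order = c(1, 1, 0)`.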

Modeling Percentage Quarterly Growth of US GNP

We fit an MA(2) to the growth rate, i.e. an ARIMA(0,1,2) to log(gnp).

    > (gnpgr.ma = arima(gnpgr, order = c(0, 0, 2)))
    Call:
    arima(x = gnpgr, order = c(0, 0, 2))

    Coefficients:
             ma1     ma2  intercept
          0.3028  0.2035     0.0083
    s.e.  0.0654  0.0644     0.0010

    sigma^2 estimated as 8.92e-05:  log likelihood = 719.96,  aic = -1431.93

The R output indicates the model
$$Z_t = .0083_{(.001)} + .303_{(.065)}\,a_{t-1} + .204_{(.064)}\,a_{t-2} + a_t$$
with $\hat\sigma = .0094$.

The Two Models Aren't That Different

The first 10 terms of the MA(∞) representation of the AR(1) model are computed in R as

    > ARMAtoMA(ar=.35, ma=0, 10)  # prints psi-weights
    [1] 3.500000e-01 1.225000e-01 4.287500e-02 1.500625e-02
    [5] 5.252187e-03 1.838266e-03 6.433930e-04 2.251875e-04
    [9] 7.881564e-05 2.758547e-05

So one (rather crude) approximation to the model $Z_t = .35Z_{t-1} + a_t$ is
$$Z_t = .35a_{t-1} + .123a_{t-2} + a_t,$$
which is close to the fitted MA(2) model
$$Z_t = .0083_{(.001)} + .303_{(.065)}\,a_{t-1} + .204_{(.064)}\,a_{t-2} + a_t.$$
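
For an AR(1), the psi-weights have the closed form $\psi_j = \phi^j$, which explains the geometric decay in the output. A minimal check in R, using the same rounded coefficient .35:

```r
phi <- 0.35

# psi-weights of the MA(infinity) representation, lags 1 to 10.
psi <- ARMAtoMA(ar = phi, ma = 0, lag.max = 10)

# For an AR(1) they should equal phi^j.
closed_form <- phi^(1:10)
matches <- isTRUE(all.equal(psi, closed_form))

# Weights kept by the crude two-term MA approximation.
psi_first_two <- psi[1:2]
```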

Diagnostic Checking

Investigate the residuals $Z_t - \hat{Z}_t^{t-1}$ or the standardized residuals
$$e_t = \frac{Z_t - \hat{Z}_t^{t-1}}{\sqrt{\hat{P}_t^{t-1}}},$$
where $\hat{Z}_t^{t-1}$ is the one-step-ahead prediction of $Z_t$ and $\hat{P}_t^{t-1}$ is its estimated error variance. If the model fits well, the standardized residuals should behave like an iid sequence with mean zero and variance one.

Diagnostic Checks
- Check the plot of standardized residuals for patterns and outliers.
- Check the ACF, $\hat\rho_e(h)$, for significant lags.
- Use the Ljung-Box-Pierce Q-statistic to measure collective autocorrelation (not just significance at a single lag).

The Ljung-Box-Pierce Q-statistic is given as
$$Q = n(n + 2)\sum_{h=1}^{H} \frac{\hat\rho_e^2(h)}{n - h}.$$
Under the null of model adequacy, Q has the asymptotic distribution $Q \sim \chi^2_{H-p-q}$.
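
The Q-statistic can be computed by hand and compared with R's Box.test(). In this sketch the residuals are simulated white noise standing in for real model residuals, and H = 20 and p + q = 2 are illustrative choices; none of these come from the GNP fit.

```r
set.seed(7)
res <- rnorm(222)     # stand-in for model residuals (hypothetical data)
n <- length(res)
H <- 20               # number of lags pooled in the test
fitted_params <- 2    # p + q, e.g. for an MA(2)

# Sample autocorrelations at lags 1..H (drop lag 0).
rho <- acf(res, lag.max = H, plot = FALSE)$acf[-1]

# Q = n (n + 2) * sum_{h=1}^{H} rho(h)^2 / (n - h)
Q <- n * (n + 2) * sum(rho^2 / (n - seq_len(H)))

# Reference distribution: chi-squared with H - p - q degrees of freedom.
p_value <- pchisq(Q, df = H - fitted_params, lower.tail = FALSE)

# Same test via the built-in function; fitdf subtracts p + q.
bt <- Box.test(res, lag = H, type = "Ljung-Box", fitdf = fitted_params)
```

A large p-value means no evidence against the hypothesis that the residuals are white noise up to lag H.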

Diagnostic Checking of gnpgr.ma

    > tsdiag(gnpgr.ma, gof.lag=20)

Model Selection in US GNP Series

    n = length(gnpgr)             # sample size
    kma = length(gnpgr.ma$coef)   # number of parameters in ma model
    sma = gnpgr.ma$sigma2         # mle of sigma^2
    kar = length(gnpgr.ar$coef)   # number of parameters in ar model
    sar = gnpgr.ar$sigma2         # mle of sigma^2
    # AIC                           Returned Value
    log(sma) + (n+2*kma)/n        # MA2: -8.298
    log(sar) + (n+2*kar)/n        # AR1: -8.294
    # AICc
    log(sma) + (n+kma)/(n-kma-2)  # MA2: -8.288
    log(sar) + (n+kar)/(n-kar-2)  # AR1: -8.285
    # BIC
    log(sma) + kma*log(n)/n       # MA2: -9.252
    log(sar) + kar*log(n)/n       # AR1: -9.264
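
The same criteria can be reproduced without the data file by plugging in the values printed in the earlier output. This is only a sketch: the sigma^2 values are the rounded ones from the printed fits, and n = 222 is an assumption (223 observations less one lost to differencing).

```r
n <- 222          # assumed: 223 observations minus one lost to differencing
sma <- 8.92e-05   # sigma^2 from the MA(2) output (rounded)
sar <- 9.03e-05   # sigma^2 from the AR(1) output (rounded)
kma <- 3          # ma1, ma2, intercept
kar <- 2          # ar1, intercept

aic  <- c(MA2 = log(sma) + (n + 2 * kma) / n,
          AR1 = log(sar) + (n + 2 * kar) / n)
aicc <- c(MA2 = log(sma) + (n + kma) / (n - kma - 2),
          AR1 = log(sar) + (n + kar) / (n - kar - 2))
bic  <- c(MA2 = log(sma) + kma * log(n) / n,
          AR1 = log(sar) + kar * log(n) / n)

# Model with the smallest value under each criterion.
best <- c(AIC = names(which.min(aic)),
          AICc = names(which.min(aicc)),
          BIC = names(which.min(bic)))
```

With these inputs, AIC and AICc slightly favor the MA(2), while BIC, which penalizes the extra parameter more heavily, favors the AR(1).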