Lab: Box-Jenkins Methodology - US Wholesale Price Indicator


In this lab we explore the Box-Jenkins methodology by applying it to a time-series data set comprising quarterly observations of the US Wholesale Price Index (see chart below).

[Figure: US Wholesale Prices Index (1985 = 100), quarterly, Jan-60 to Jan-92]

In keeping with the principles of the Box-Jenkins method, the analysis follows the usual sequence, illustrated overleaf. The series is clearly non-stationary in the mean and variance, so we must first transform it to achieve stationarity. We do this by taking the natural logarithm and differencing (see chart below).

[Figure: Changes in Log(Wholesale Prices Index)]
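As a quick cross-check outside the spreadsheet, the transformation step can be reproduced in a few lines of Python (a minimal sketch only; the file name wpi_quarterly.csv and the column name WPI are assumed for illustration and are not part of the lab workbook):

    import numpy as np
    import pandas as pd

    # Load the quarterly US Wholesale Price Index (file and column names are assumed)
    wpi = pd.read_csv("wpi_quarterly.csv")["WPI"]

    log_wpi = np.log(wpi)                  # log transform to stabilise the variance
    d_log_wpi = log_wpi.diff().dropna()    # first difference to remove the trend in the mean

    print(d_log_wpi.describe())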

1. Compute the ACF and PACF of the differenced time series and use these to identify appropriate models of the form ARMA(p,q), p = 1, 2; q = 1, 2. Consider how any seasonal effect might be modelled explicitly. (A short Python sketch of this step follows the task list.)

2. Perform an analysis of variance for each model to compute the model and error sums of squares and test the significance of each model and its individual parameters. To test the significance of the model parameters you will need to estimate the parameter standard errors, given by:

   σ̂(a_i) = σ̂ √[ (XᵀX)⁻¹ ]_ii

   where X is the matrix of independent variables used in the regression model and σ̂ is the estimate of the residual standard deviation (the square root of the MSE).

3. Compute the Akaike Information Criterion (AIC) and Bayes Information Criterion (BIC) for each model and use these to estimate the model parameters and determine the model which best fits the data. It may help to lay out the analysis as follows:

   Model        a_1   a_2   b_1   b_4   AIC   BIC
   AR(2)
   ARMA(1,1)
   ...

   For each model, use the Excel SOLVER function to find the coefficient values which minimize the AIC (or BIC). The preferred model will have the overall minimum AIC (or BIC).

4. Check the ACF and PACF of the residuals and perform the Box-Pierce and Ljung-Box portmanteau tests to confirm that the residuals are white noise.
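For those who want to verify the identification step outside Excel, the following sketch uses statsmodels to compute and plot the sample ACF and PACF of the differenced log series with 95% confidence bands (d_log_wpi is the series created in the earlier sketch; the lab itself performs this step in the workbook):

    import matplotlib.pyplot as plt
    from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

    # ACF and PACF of the differenced log WPI series, up to 20 lags as in the lab charts
    fig, axes = plt.subplots(2, 1, figsize=(8, 6))
    plot_acf(d_log_wpi, lags=20, ax=axes[0])
    plot_pacf(d_log_wpi, lags=20, ax=axes[1])
    plt.tight_layout()
    plt.show()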

Box-Jenkins Methodology

Phase I: Identification
- Data preparation: transform the data to stabilize the variance; difference the data to obtain a stationary series.
- Model selection: use the ACF and PACF to identify appropriate models.

Phase II: Estimation and Testing
- Estimation: derive MLE parameter estimates for each model; use model selection criteria to choose the best model.
- Diagnostics: check the ACF/PACF of the residuals; perform portmanteau and other tests of the residuals. If the residuals are not white noise, return to the identification phase.

Phase III: Forecasting
- Use the model to forecast and test the effectiveness of its forecasting ability.

Solution: Box-Jenkins Methodology - US Wholesale Price Indicator

1. The ACF and the PACF of the differenced log WPI series are shown below. The positive, geometrically decaying pattern of the ACF, coupled with the significant PACF coefficients at lags 1 and 2, suggests either an AR process with p = 1 or p = 2, or possibly an ARMA(1,1) process. Note the jump in the ACF at lag 4: since we are using quarterly data, we might want to incorporate a seasonal factor at lag 4.

[Figure: ACF and PACF for D Ln(WPI), lags 1-20, with 95% confidence limits]

2. The class of models we are considering is of the form ARMA[p,(q_1, q_2)], p = 1, 2; q_1 = 1; q_2 = 4:

   y_t = a_0 + a_1 y_{t-1} + a_2 y_{t-2} + ε_t + β_1 ε_{t-1} + β_4 ε_{t-4}

   where, in the case of an AR(1) model, a_2, β_1 and β_4 are zero; for an AR(2) model, β_1 and β_4 are zero; for an ARMA(2,1) model, β_4 is zero; and so on.

   Our forecast values are computed using the formula:

   y'_t = a_0 + a_1 y_{t-1} + a_2 y_{t-2} + β_1 e_{t-1} + β_4 e_{t-4}
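The forecast recursion above is easy to implement directly. The sketch below (a Python illustration, not part of the Excel workbook; the function name arma_fitted is our own) builds the one-step fitted values and residuals sequentially, exactly as the spreadsheet columns do:

    import numpy as np

    def arma_fitted(y, a0, a1, a2=0.0, b1=0.0, b4=0.0):
        """One-step forecasts y'_t = a0 + a1*y[t-1] + a2*y[t-2] + b1*e[t-1] + b4*e[t-4],
        with e[t] = y[t] - y'[t]. Unused coefficients default to zero (e.g. for an AR(1))."""
        n = len(y)
        fitted = np.full(n, np.nan)
        e = np.zeros(n)
        for t in range(2, n):                          # two lags of y are needed before forecasting
            ma4 = b4 * e[t - 4] if t >= 4 else 0.0     # seasonal MA term once e[t-4] exists
            fitted[t] = a0 + a1 * y[t - 1] + a2 * y[t - 2] + b1 * e[t - 1] + ma4
            e[t] = y[t] - fitted[t]
        return fitted, e

    y = d_log_wpi.to_numpy()                           # differenced log series from the first sketch
    fitted, e = arma_fitted(y, a0=0.005, a1=0.58)      # trial AR(1) coefficients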

where the coefficients in the forecast formula are estimates of the corresponding model parameters and e_t is the error term (y_t - y'_t).

Start by entering a dummy value for coefficient a_0 in cell C3 and for a_1 in cell C4. Leave the other coefficients blank for now (so we are testing an AR(1) model). Set y'_4 using the formula =a0+a1_*F19+a2_*F18 in cell I20 and compute e_4 (=F20-I20) in cell J20. Then compute y'_5 using the formula =a0+a1_*F20+a2_*F19+b1_*J20+b4_*J17 in cell I21 and copy it down into the remaining cells in the column. Next, copy the formula for the error term in cell J20 down into the remaining cells in that column.

To prepare the ANOVA, we need to compute the model sum of squares:

   SSM = Σ_t (y'_t - ȳ)²

We calculate the mean ȳ using the Excel formula =AVERAGE(F24:F146) in cell F148 and then calculate the SSM using the formula =SUMPRODUCT(I24:I146-$F$148,I24:I146-$F$148) in cell D12. The error sum of squares, SSE = Σ_t e_t², can be computed directly using the Excel formula =SUMSQ(J24:J146) in cell D13. Add SSM to SSE to compute the total sum of squares SST in cell D14.

Next we compute the model and error mean squares (MSM and MSE) by dividing SSM and SSE by their respective degrees of freedom, m and (n-m-1). Finally we compute the F-statistic as the ratio of the mean squares, F = MSM/MSE. This has an F distribution with m and n-m-1 degrees of freedom. We use the Excel function FDIST to calculate the probability of observing a value of F this large or larger under the hypothesis that the model parameters are zero. A small p-value indicates that the model is useful in explaining some of the variation in the series.

The standard errors of the parameter estimates can now be computed. First we need the matrix XᵀX, which is located in the range N148:S153:

   XᵀX            1           2           3           4           5
   1        123.00000     1.30434     1.29721    -0.00781    -0.02357
   2          1.30434     0.04004     0.02918     0.01623     0.00413
   3          1.29721     0.02918     0.04013     0.00002     0.00468
   4         -0.00648     0.01719     0.00001     0.01625     0.00152
   5         -0.01599     0.00457     0.00598     0.00018     0.01645

Then we calculate the inverse of the (m+1) x (m+1) sub-matrix of XᵀX.
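The same ANOVA quantities can be computed programmatically. The following sketch assumes that y, fitted and e come from the recursion sketched earlier; it simply mirrors the spreadsheet cells described above:

    import numpy as np
    from scipy import stats

    mask = ~np.isnan(fitted)                     # rows for which a forecast exists
    y_used, yhat, resid = y[mask], fitted[mask], e[mask]

    n = int(mask.sum())                          # number of usable observations
    m = 1                                        # estimated slope parameters (AR(1) example)

    ssm = np.sum((yhat - y_used.mean()) ** 2)    # model sum of squares
    sse = np.sum(resid ** 2)                     # error sum of squares
    sst = ssm + sse                              # total sum of squares

    msm, mse = ssm / m, sse / (n - m - 1)        # mean squares
    F = msm / mse
    p_value = stats.f.sf(F, m, n - m - 1)        # P(F >= observed), like Excel's FDIST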

For example, for the AR(1) model we require the inverse of the 2x2 sub-matrix of XᵀX comprising the upper left-hand quadrant (highlighted in yellow in the workbook). We do this using the Excel function MINVERSE, with the following formulas:

   O156 =INDEX(MINVERSE($O$149:$P$150),$N156,O$155)
   P156 =INDEX(MINVERSE($O$149:$P$150),$N156,P$155)
   O157 =INDEX(MINVERSE($O$149:$P$150),$N157,O$155)
   P157 =INDEX(MINVERSE($O$149:$P$150),$N157,P$155)

This gives us the complete 2x2 inverse matrix (XᵀX)⁻¹.

To compute the standard error for the parameter a_0, we use the first diagonal element of the inverse matrix, [(XᵀX)⁻¹]_11 = 0.01242. The standard error of the constant coefficient estimate is therefore (MSE x 0.01242)^(1/2) = 0.0013. This is given by the Excel formula in cell D3:

   D3 =IF(ISBLANK(a0),"",(MSE*O156)^0.5)

To compute the standard error for the parameter a_1, we use the second diagonal element, [(XᵀX)⁻¹]_22. The Excel formula in cell D4 is:

   D4 =IF(ISBLANK(a1_),"",(MSE*P157)^0.5)

N.B. the ISBLANK function ensures that the SE is calculated only if a parameter value has been estimated; otherwise the SE is set to null.

The t-statistic is the ratio of the parameter estimate to its standard error (E4 =C4/D4). The one-sided t-test is performed using the Excel function TDIST in the formula:

   F3 =IF(E3="","",TDIST(E3,$C$13,1))

This tells us the probability of obtaining an estimate of a_0 at least this large if the true value of a_0 is zero.

A typical completed ANOVA table is shown below (for the AR(1) model):

         MLE     SE       t       p
   a_0   0.005   0.0013   3.408   0.04%
   a_1   0.580   0.0738   7.864   0.00%
   a_2   -
   b_1   -
   b_4   -
   m     1
   n     123

   ANOVA    DF    SS       MS        F       p
   Model    1     0.0088   0.00883   61.85   0.000
   Error    121   0.0173   0.00014
   Total    122   0.0261
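Equivalently, the standard errors, t-statistics and one-sided p-values can be obtained with numpy/scipy. This is a sketch for the AR(1) case only; y, mask, mse, n and m come from the earlier sketches, and a0_hat, a1_hat stand for the coefficient values currently in use (the dummy values at first, the SOLVER estimates later):

    import numpy as np
    from scipy import stats

    # Regressor matrix for the AR(1): a column of ones and a column of y_{t-1}
    X = np.column_stack([np.ones(n), y[np.where(mask)[0] - 1]])
    XtX_inv = np.linalg.inv(X.T @ X)

    coef = np.array([a0_hat, a1_hat])
    se = np.sqrt(mse * np.diag(XtX_inv))                    # like (MSE*O156)^0.5 in the workbook
    t_stats = coef / se
    p_one_sided = stats.t.sf(np.abs(t_stats), n - m - 1)    # matches Excel TDIST(t, df, 1)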

3. We can now compute the Akaike Information Criterion using the Excel formula =n*LN(SSE)+2*m in cell K4. For comparison, we compute the Schwarz Bayesian Information Criterion (BIC) in cell K5 using the Excel formula =n*LN(SSE)+m*LN(n).

To assess the explanatory power of the model we compute the coefficient of determination R² using the Excel formula =SSM/SST in cell K7. The adjusted R² is calculated in cell K8 using the Excel formula K8 =(1-(1-K7)*C14/C13).

So far, we have been working with dummy values of the model coefficients. Now that we have computed the formula for the AIC (and BIC) we can proceed to find the maximum likelihood estimates of the coefficients. We do this by using Excel SOLVER to find the coefficient values which minimize the AIC (or BIC). To run SOLVER, go to the Forecasting command bar and choose Solver. The following dialog box appears:

[Screenshot: SOLVER dialog box]
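The SOLVER step simply minimises AIC = n·ln(SSE) + 2m over the free coefficients. A rough Python equivalent, reusing the arma_fitted recursion sketched earlier, might look like this (a sketch only, not the lab's prescribed method):

    import numpy as np
    from scipy.optimize import minimize

    def aic_ar1(params, y):
        """AIC of an AR(1) fit with coefficients params = (a0, a1)."""
        a0, a1 = params
        _, e = arma_fitted(y, a0, a1)
        resid = e[2:]                       # drop start-up observations with no forecast
        n_obs, m_par = len(resid), 1
        return n_obs * np.log(np.sum(resid ** 2)) + 2 * m_par

    result = minimize(aic_ar1, x0=[0.0, 0.5], args=(y,), method="Nelder-Mead")
    a0_hat, a1_hat = result.x
    print(result.fun, a0_hat, a1_hat)       # minimised AIC and the coefficient estimates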

Enter the cell reference of the AIC field (K4), which is the function to be minimized. In the By Changing Cells field, enter the cell reference(s) of the model parameters (C3 and C4). Click the Solve button and SOLVER will find the minimum AIC using a gradient descent search method. We find the following results for the optimal AR(1) model:

         MLE      SE       t       p
   a_0   0.005    0.0013   3.408   0.000
   a_1   0.580    0.0738   7.864   0.000
   a_2   0.0000
   b_1   0.0000
   b_4   0.0000
   m     1
   n     123

   Max Likelihood:  AIC -497.25   BIC -494.44   DW 2.27   R² 33.8%   Adj. R² 33.3%

   ANOVA    DF    SS       MS        F       p
   Model    1     0.0088   0.00883   61.85   0.000
   Error    121   0.0173   0.00014
   Total    122   0.0261

   Portmanteau Tests   Q(20)   p
   Box-Pierce          23.76   0.206
   Ljung-Box           25.59   0.142

Both of the model parameters are highly significant and, as a whole, the model has significant explanatory power (R² = 33.8%). Using a similar technique to estimate the parameters for all the relevant models, and the corresponding AIC (and BIC), we arrive at the results shown in the table below (p-values are shown beneath each coefficient):

   Model           a_0      a_1      a_2       b_1       b_4      AIC      BIC      Adj. R²
   AR(1)           0.0013   0.0738                                -497.3   -494.4   33.3%
                   0.04%    0.00%
   AR(2)           0.0035   0.4423   0.2345                       -502.3   -496.6   36.4%
                   0.52%    0.00%    0.46%
   ARMA(1,(1,4))   0.0025   0.7700             -0.4246   0.3120   -511.0   -502.6   42.7%
                   5.96%    0.03%              3.48%     0.07%
   ARMA(2,(1,4))   0.0025   0.7969   -0.0238   -0.4411   0.3132   -509.0   -497.8   42.3%
                   6.25%    0.02%    43.38%    2.98%     0.06%

Comparing the various models, we can see that the AR(2) model dominates the AR(1) model in that it has lower AIC and BIC and a higher adjusted R². The ARMA(2,(1,4)) seasonal model appears to contain a spurious autoregressive term at lag 2, which is clearly not significant (p = 43.4%). The best model overall appears to be the seasonal ARMA(1,(1,4)) model, which has the lowest AIC (-511.0) and BIC (-502.6) and the highest adjusted R² (42.7%). The form of the model is:

   y_t = 0.0025 + 0.7700 y_{t-1} + ε_t - 0.4246 ε_{t-1} + 0.3120 ε_{t-4}
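As an optional cross-check (not part of the lab), the selected seasonal model can also be fitted with statsmodels, which accepts specific MA lags. The coefficients and information criteria should be broadly comparable to the spreadsheet values, though they will not match exactly because statsmodels uses full state-space maximum likelihood rather than the conditional recursion used here:

    from statsmodels.tsa.statespace.sarimax import SARIMAX

    # ARMA(1,(1,4)) on the differenced log series: AR lag 1, MA lags 1 and 4, plus a constant
    model = SARIMAX(d_log_wpi, order=(1, 0, [1, 4]), trend="c")
    res = model.fit(disp=False)
    print(res.summary())       # compare the coefficients, AIC and BIC with the table above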

A chart of the data series and the forecasts produced by the model is shown below.

[Figure: Changes in Log(Wholesale Prices Index), actual vs. forecast]

4. The ACF and PACF of the residuals of the AR(1) model are shown in the chart below. While the portmanteau tests of the 20 residual autocorrelations of the AR(1) model indicate that, taken as a whole, they are insignificant, the ACF and PACF correlogram shows significant non-zero autocorrelations at lags 4 and 6, probably due to seasonal non-stationarity.

[Figure: Residuals ACF and PACF (AR(1) model), lags 1-20, with 95% confidence limits]

By contrast, the ACF and PACF of the ARMA(1,(1,4)) model residuals indicate that the residuals are white noise, as the correlation coefficients all lie within the 95% confidence limits. The portmanteau tests confirm that the residual autocorrelations are collectively insignificant and therefore that the residuals are white noise.
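The portmanteau statistics can be reproduced with statsmodels' acorr_ljungbox function, which in recent versions returns both the Ljung-Box and (optionally) the Box-Pierce statistics. Here resid stands for the residual series of whichever model is being checked (for example e[2:] from the recursion sketched earlier):

    from statsmodels.stats.diagnostic import acorr_ljungbox

    # Ljung-Box and Box-Pierce tests on the first 20 residual autocorrelations
    lb = acorr_ljungbox(resid, lags=[20], boxpierce=True)
    print(lb)    # DataFrame with columns lb_stat, lb_pvalue, bp_stat, bp_pvalue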