Problem Set 2: Box-Jenkins methodology


1) For an AR(1) process we have:

$\gamma(0) = \dfrac{\sigma^2_\varepsilon}{1-\varphi^2} \;\Longrightarrow\; \dfrac{\sigma^2_\varepsilon}{\gamma(0)} = 1-\varphi^2$

Hence, since the R² of the autoregression satisfies plim R² = 1 − σ²_ε/γ(0),

$\operatorname{plim} R^2 = 1-(1-\varphi^2) = \varphi^2$

For an MA(1) process,

$\gamma(0) = (1+\theta^2)\,\sigma^2_\varepsilon \;\Longrightarrow\; \dfrac{\sigma^2_\varepsilon}{\gamma(0)} = \dfrac{1}{1+\theta^2}$

Therefore,

$\operatorname{plim} R^2 = 1-\dfrac{1}{1+\theta^2} = \dfrac{\theta^2}{1+\theta^2}$

We can see that a consistent estimator of the R² would "approach" different values depending on the true parameters of the process, not on the goodness of fit of the estimation procedure. So R² is no longer an appropriate goodness-of-fit measure; instead, information criteria and the principle of parsimony should be heeded.

2) To identify the series you should plot the ACF and the PACF. In EViews choose Quick, then Series Statistics, then Correlogram. There you specify the series you want to examine and select the number of lags to include (default = 36). The correlogram of the series Return is reported below.
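As a quick numerical check of part 1), the sketch below simulates a long AR(1) sample and compares the R² of the fitted autoregression with φ². It is only an illustration and not part of the original EViews exercise; it assumes numpy and statsmodels are available, and the parameter value and sample size are arbitrary.

    # Check that the R^2 of an AR(1) regression approaches phi^2,
    # i.e. it reflects the true parameters rather than estimation quality.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    phi, n = 0.6, 100_000                 # arbitrary illustration values
    eps = rng.normal(size=n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + eps[t]    # y_t = phi*y_{t-1} + eps_t

    X = sm.add_constant(y[:-1])           # regress y_t on a constant and y_{t-1}
    res = sm.OLS(y[1:], X).fit()
    print(res.rsquared, phi**2)           # both are close to 0.36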

[EViews correlogram of the series Return, lags 1 to 24, with columns AC, PAC, Q-Stat and Prob. The first autocorrelation is roughly 0.147 and the Q-statistic p-value equals 0.000 at every lag.]

We should recall that:
- An AR(p) process has a declining ACF and a PACF that is zero for lags greater than p.
- An MA(q) process has an ACF that is zero for lags greater than q and a PACF that declines exponentially.

Looking at the table, the cumulative correlation described by the Q-statistic is significant at all lags, so we reject the null hypothesis ρ(k) = 0 for each lag. As a result, the series does not seem to be a White Noise process. For example, the Q-statistic for ρ(1) is significant at the 95% level, which implies that the null hypothesis ρ(1) = 0 is rejected.

We will attempt to identify the series following a "from specific to general" approach. Since the estimate of ρ(1) is quite small, it is very difficult (looking at the table above) to tell which type of process best characterizes the series under scrutiny (an MA or an AR process). We first try an AR(1) structure. In EViews go to Quick, Estimate Equation, and then type the dependent and independent variables to be included in the regression:

Returns C AR(1)

We obtain the following results:

Variable             Coefficient   Std. Error   t-Statistic   Prob.
C                    0.00054       0.0007       1.99515       0.0463
AR(1)                0.147150      0.0348       4.530161      0.0000

R-squared            0.01659       Mean dependent var       0.000543
Adjusted R-squared   0.00604       S.D. dependent var       0.00713
S.E. of regression   0.007058      Akaike info criterion    -7.06710
Sum squared resid    0.046181      Schwarz criterion        -7.056713
Log likelihood       384.677       F-statistic              0.536
Durbin-Watson stat   1.989755      Prob(F-statistic)        0.000007

Inverted AR Roots    .15

The series Returns seems to be an AR(1), and it appears to be stationary because the inverted AR root (0.15) lies inside the unit circle (equivalently, 1/0.15 > 1, so the root of the AR polynomial lies outside it). Nevertheless, we can only be satisfied with this representation of the series if the residuals are clean. We therefore need to check the correlogram of the residuals to see whether they are a White Noise process; if they are not, we continue augmenting the model.
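For readers without EViews, a rough Python equivalent of the identification and estimation steps above can be sketched with statsmodels. The file and column names are hypothetical, and ARIMA(1,0,0) with a constant is used as the counterpart of the Returns C AR(1) specification; this is an illustrative sketch, not the procedure used in the solution. The EViews correlogram of the AR(1) residuals used in the solution is reported next.

    # Correlogram, AR(1) fit and residual check, roughly mirroring the EViews steps.
    import pandas as pd
    from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
    from statsmodels.stats.diagnostic import acorr_ljungbox
    from statsmodels.tsa.arima.model import ARIMA

    returns = pd.read_csv("returns.csv")["return"]   # hypothetical file and column

    plot_acf(returns, lags=36)                       # ACF plot (EViews correlogram)
    plot_pacf(returns, lags=36)                      # PACF plot
    print(acorr_ljungbox(returns, lags=24))          # Ljung-Box Q statistics

    ar1 = ARIMA(returns, order=(1, 0, 0)).fit()      # "Returns C AR(1)"
    print(ar1.summary())

    print(acorr_ljungbox(ar1.resid, lags=24))        # are the residuals white noise?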

[EViews correlogram of the AR(1) residuals, lags 1 to 24, with columns AC, PAC, Q-Stat and Prob. The Q-statistic p-value drops below 0.05 at lag 6 and stays near or below that level at most subsequent lags.]

We can see that from the 6th lag onwards the p-values associated with the Q-statistics fall below the 5% level, so the Q-statistic indicates that lag 6 is significant. This suggests that at the sixth lag there is either an AR or an MA component. We estimate the following equation:

$y_t = \varphi_1 y_{t-1} + \varphi_6 y_{t-6} + \varepsilon_t$

To estimate this model, go in the EViews menu to Quick, Estimate Equation and type: Return C AR(1) AR(6), obtaining the following results:

Variable             Coefficient   Std. Error   t-Statistic   Prob.
C                    0.000541      0.00046      .0030         0.080
AR(1)                0.14980       0.03386      4.65564       0.0000
AR(6)                -0.087959     0.03345      -.719380      0.0067

R-squared            0.09410       Mean dependent var       0.000543
Adjusted R-squared   0.07314       S.D. dependent var       0.00713
S.E. of regression   0.007034      Akaike info criterion    -7.0791
Sum squared resid    0.045815      Schwarz criterion        -7.057311
Log likelihood       388.37        F-statistic              14.0948
Durbin-Watson stat   .00063        Prob(F-statistic)        0.000001

Inverted AR Roots    .60-.33i   .60+.33i   .0+.66i   .0-.66i   -.55-.33i   -.55+.33i

The t-statistics for the AR(1) and AR(6) lags are both significant at the 5% level. Looking at the correlogram of the residuals of this regression (reported below), we cannot reject the null hypothesis that the residuals are a White Noise process. We can therefore stop the identification procedure and conclude that the above model characterizes the data correctly.

[EViews correlogram of the residuals of the AR(1), AR(6) model, lags 1 to 24, with columns AC, PAC, Q-Stat and Prob. No Q-statistic p-value falls below 0.05.]

Nevertheless, since we were not sure about the nature of the sixth lag (it could be either an AR(6) or an MA(6) term), we will also estimate the following model:

$y_t = \varphi_1 y_{t-1} + \theta_6 \varepsilon_{t-6} + \varepsilon_t$
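Continuing the Python sketch from above (and still assuming the hypothetical returns series), both candidate specifications can be compared directly: statsmodels accepts lists of lags, so AR lags {1, 6} play the role of Return C AR(1) AR(6) and an MA term at lag 6 plays the role of Return C AR(1) MA(6). This is only an illustration; the EViews output actually used in the solution follows.

    # Compare the AR(1)+AR(6) and AR(1)+MA(6) specifications by AIC/BIC.
    from statsmodels.tsa.arima.model import ARIMA

    m_ar = ARIMA(returns, order=([1, 6], 0, 0)).fit()   # AR terms at lags 1 and 6
    m_ma = ARIMA(returns, order=(1, 0, [6])).fit()      # AR(1) plus an MA term at lag 6

    for name, m in [("AR(1)+AR(6)", m_ar), ("AR(1)+MA(6)", m_ma)]:
        print(name, "AIC:", m.aic, "BIC:", m.bic)       # lower values are preferred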

Variable             Coefficient   Std. Error   t-Statistic   Prob.
C                    0.000539      0.00050      .15596        0.0314
AR(1)                0.145798      0.03547      4.479613      0.0000
MA(6)                -0.075680     0.0384       -.305661      0.013

R-squared            0.07516       Mean dependent var       0.000543
Adjusted R-squared   0.05416       S.D. dependent var       0.00713
S.E. of regression   0.007041      Akaike info criterion    -7.07097
Sum squared resid    0.045904      Schwarz criterion        -7.055361
Log likelihood       387.467       F-statistic              13.10050
Durbin-Watson stat   1.989908      Prob(F-statistic)        0.00000

Inverted AR Roots    .15
Inverted MA Roots    .65   .33+.56i   .33-.56i   -.33-.56i   -.33+.56i   -.65

The residuals of this regression also seem to be a White Noise process, so we can conclude that this model fits the data correctly as well. We are therefore faced with a choice: since both models seem to fit the series, we pick the one with the lower value of the relevant selection criteria. We consider the Akaike information criterion (AIC) and the Schwarz criterion (SC). Both criteria favour the AR(1), AR(6) model for the series Returns.

3) In this exercise we will derive theoretical forecasts of the following stationary AR(1) process:

$y_t = c + \varphi y_{t-1} + \varepsilon_t$

We will not, as in an econometrics course, compute the predictor (the random variable and estimator used to provide information about future or otherwise unavailable values). So when we talk about the forecast error we will not be talking about the prediction error covered in your previous econometrics course: there is no sampling error involved, since our analysis is theoretical and no estimation is undertaken or considered.

i. You have learnt that the minimum mean squared error forecast is the conditional mean. Imagine you have a sample of size T and consider the problem of obtaining the one-step-ahead forecast:

$y_T(1) = E_T(y_{T+1}) = E_T(c + \varphi y_T + \varepsilon_{T+1}) = c + \varphi y_T$

Now consider the two-step-ahead forecast:

$y_T(2) = E_T(y_{T+2}) = E_T[y_{T+1}(1)] = E_T[c + \varphi y_{T+1}] = c + \varphi\, y_T(1) = c(1+\varphi) + \varphi^2 y_T$

In general, the k-step-ahead forecast satisfies:

$y_T(k) = c + \varphi\, y_T(k-1)$

Solving recursively back to k = 1 we get:

$y_T(k) = c\,(1 + \varphi + \varphi^2 + \dots + \varphi^{k-1}) + \varphi^k y_T(0) = c\,(1 + \varphi + \varphi^2 + \dots + \varphi^{k-1}) + \varphi^k y_T$

Under stationarity this can be written as:

$y_T(k) = c\,\dfrac{1-\varphi^k}{1-\varphi} + \varphi^k y_T = (1-\varphi^k)\,\dfrac{c}{1-\varphi} + \varphi^k y_T = (1-\varphi^k)\,\mu + \varphi^k y_T$

We find that the optimal k-step-ahead forecast (the conditional mean) is a convex combination of the last available observation and the unconditional mean.

ii. This is straightforward: simply take limits as k tends to infinity.

$\lim_{k\to+\infty} y_T(k) = \mu \lim_{k\to+\infty}(1-\varphi^k) + y_T \lim_{k\to+\infty}\varphi^k$

Since $|\varphi| < 1$, $\lim_{k\to+\infty}\varphi^k = 0$, and therefore $\lim_{k\to+\infty} y_T(k) = \mu$.

As the forecast horizon lengthens, the importance of the last available observation diminishes.

iii. Let $e_T(k)$ be the k-step-ahead optimal forecast error:

$e_T(k) = y_{T+k} - y_T(k) = \varphi\,\bigl(y_{T+k-1} - y_T(k-1)\bigr) + \varepsilon_{T+k} = \varphi\, e_T(k-1) + \varepsilon_{T+k}$

Solving recursively once again,

$e_T(k) = \varepsilon_{T+k} + \varphi\,\varepsilon_{T+k-1} + \varphi^2\varepsilon_{T+k-2} + \dots + \varphi^{k-1}\varepsilon_{T+1} = \sum_{i=0}^{k-1}\varphi^i\,\varepsilon_{T+k-i}$

$E(e_T(k)) = 0$

$\sigma^2_{e_T}(k) = \sum_{i=0}^{k-1}\varphi^{2i}\,\sigma^2_\varepsilon = \sigma^2_\varepsilon\,\bigl(1 + \varphi^2 + \varphi^4 + \dots + \varphi^{2(k-1)}\bigr) = \sigma^2_\varepsilon\,\dfrac{1-\varphi^{2k}}{1-\varphi^2} = (1-\varphi^{2k})\,\gamma(0)$

iv. Once again, simply take limits, recalling the stationarity hypothesis:

$\lim_{k\to+\infty}\sigma^2_{e_T}(k) = \gamma(0)$

v. If the process is Gaussian, the forecast error is Gaussian too, as it is a linear combination of independent and identically distributed normal random variables. Hence, knowing its mean and variance identifies its whole distribution:

$e_T(k) \sim N\bigl(0,\,(1-\varphi^{2k})\,\gamma(0)\bigr) \qquad \dfrac{e_T(k)}{\sqrt{(1-\varphi^{2k})\,\gamma(0)}} \sim N(0,1)$

$1-\alpha = P\left(-z_{1-\frac{\alpha}{2}} \le \dfrac{e_T(k)}{\sqrt{(1-\varphi^{2k})\,\gamma(0)}} \le z_{1-\frac{\alpha}{2}}\right)$

$1-\alpha = P\left(-z_{1-\frac{\alpha}{2}}\sqrt{(1-\varphi^{2k})\,\gamma(0)} \le e_T(k) \le z_{1-\frac{\alpha}{2}}\sqrt{(1-\varphi^{2k})\,\gamma(0)}\right)$

$1-\alpha = P\left(y_T(k) - z_{1-\frac{\alpha}{2}}\sqrt{(1-\varphi^{2k})\,\gamma(0)} \le y_{T+k} \le y_T(k) + z_{1-\frac{\alpha}{2}}\sqrt{(1-\varphi^{2k})\,\gamma(0)}\right)$

Since we do not actually know the true values of the parameters, we should use consistent estimators. Then an asymptotically valid forecast interval is:

$P\left(\hat{y}_T(k) - z_{1-\frac{\alpha}{2}}\sqrt{(1-\hat{\varphi}^{2k})\,\hat{\gamma}(0)} \le y_{T+k} \le \hat{y}_T(k) + z_{1-\frac{\alpha}{2}}\sqrt{(1-\hat{\varphi}^{2k})\,\hat{\gamma}(0)}\right) = 1-\alpha$

4) Consider now the following invertible MA(1) process:

$y_t = \mu + \varepsilon_t + \theta\varepsilon_{t-1}$

i. For our sample of size T:

$y_T(1) = E_T(y_{T+1}) = E_T(\mu + \varepsilon_{T+1} + \theta\varepsilon_T) = \mu + \theta\varepsilon_T$

Now consider the two-step-ahead forecast:

$y_T(2) = E_T(y_{T+2}) = E_T[y_{T+1}(1)] = E_T[\mu + \theta\varepsilon_{T+1}] = \mu$

So we now have:

$y_T(k) = \begin{cases} \mu + \theta\varepsilon_T & k = 1 \\ \mu & k \ge 2 \end{cases}$

In general, the optimal k-step-ahead forecast returns abruptly to the mean as soon as we pass the moving average order (here, 1). Although the forecast function seems easy to compute, in general this is not the case: the information set does not usually contain the error term, so different approaches are employed to approximate it. If we have the whole history of y (an infinite number of observations) and the MA process is invertible, we can approximate the error term using the AR(∞) representation:

$\varepsilon_T(1+\theta L) = y_T - \mu \;\Longrightarrow\; \varepsilon_T = (y_T-\mu)(1+\theta L)^{-1}$

and

$\varepsilon_T = (y_T-\mu) - \theta(y_{T-1}-\mu) + \theta^2(y_{T-2}-\mu) - \theta^3(y_{T-3}-\mu) + \dots$

If we only have a finite data set there are two approaches. On the one hand, we can approximate the optimal forecast under some assumptions, for example setting the error to zero at some date and for all observations preceding that date. The other approach computes the exact finite-sample forecast. Both approaches are detailed in Hamilton (1994).

ii. This is even more straightforward than before: the forecast already equals μ for every k ≥ 2, so its limit as the horizon lengthens is the unconditional mean μ.

iii. Let $e_T(k)$ be the k-step-ahead optimal forecast error:

$e_T(k) = y_{T+k} - y_T(k) = \begin{cases} \mu + \varepsilon_{T+1} + \theta\varepsilon_T - \mu - \theta\varepsilon_T & k = 1 \\ \mu + \varepsilon_{T+k} + \theta\varepsilon_{T+k-1} - \mu & k \ge 2 \end{cases}$

Then,

$e_T(k) = \begin{cases} \varepsilon_{T+1} & k = 1 \\ \varepsilon_{T+k} + \theta\varepsilon_{T+k-1} & k \ge 2 \end{cases}$

which implies

$E(e_T(k)) = 0 \qquad \sigma^2_{e_T}(k) = \begin{cases} \sigma^2_\varepsilon & k = 1 \\ (1+\theta^2)\,\sigma^2_\varepsilon = \gamma(0) & k \ge 2 \end{cases}$

iv. This is again straightforward: the forecast error variance equals γ(0) for every k ≥ 2, so its limit is γ(0).

v. As before, if the process is Gaussian, the forecast error is Gaussian too. Let us focus on the k ≥ 2 case:

$e_T(k) \sim N(0,\,\gamma(0)) \qquad \dfrac{e_T(k)}{\sqrt{\gamma(0)}} \sim N(0,1)$

$1-\alpha = P\left(-z_{1-\frac{\alpha}{2}} \le \dfrac{e_T(k)}{\sqrt{\gamma(0)}} \le z_{1-\frac{\alpha}{2}}\right)$

$1-\alpha = P\left(-z_{1-\frac{\alpha}{2}}\sqrt{\gamma(0)} \le e_T(k) \le z_{1-\frac{\alpha}{2}}\sqrt{\gamma(0)}\right)$

$1-\alpha = P\left(y_T(k) - z_{1-\frac{\alpha}{2}}\sqrt{\gamma(0)} \le y_{T+k} \le y_T(k) + z_{1-\frac{\alpha}{2}}\sqrt{\gamma(0)}\right)$

The asymptotically valid forecast interval is now:

$P\left(\hat{y}_T(k) - z_{1-\frac{\alpha}{2}}\sqrt{\hat{\gamma}(0)} \le y_{T+k} \le \hat{y}_T(k) + z_{1-\frac{\alpha}{2}}\sqrt{\hat{\gamma}(0)}\right) = 1-\alpha$
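To make the formulas of problem 3 concrete, the sketch below evaluates the AR(1) point forecast (1 − φ^k)μ + φ^k y_T and the interval y_T(k) ± z_{1−α/2} √((1 − φ^{2k}) γ(0)) for arbitrary illustrative parameter values; it is not part of the original problem set and assumes numpy and scipy are available.

    # Closed-form AR(1) k-step-ahead forecast and 95% forecast interval (problem 3).
    import numpy as np
    from scipy.stats import norm

    c, phi, sigma2_eps = 0.5, 0.8, 1.0       # arbitrary example parameters
    y_T = 4.0                                # last available observation
    mu = c / (1 - phi)                       # unconditional mean (= 2.5 here)
    gamma0 = sigma2_eps / (1 - phi**2)       # unconditional variance gamma(0)
    z = norm.ppf(0.975)                      # z_{1-alpha/2} with alpha = 0.05

    for k in (1, 2, 5, 20):
        fcast = (1 - phi**k) * mu + phi**k * y_T       # conditional mean forecast
        se = np.sqrt((1 - phi**(2 * k)) * gamma0)      # forecast standard error
        print(k, fcast, (fcast - z * se, fcast + z * se))

    # As k grows the forecast tends to mu and the interval half-width to z*sqrt(gamma(0)).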