STAT 520 FORECASTING AND TIME SERIES 2013 FALL Homework 05


1. ibm data: The random walk model for the first differences is chosen as the suggested model for the ibm data. That is,

(1 - B)Y_t = e_t,

where e_t is a mean-zero white noise process. For the diagnostics, we first apply a runs test to check whether the first differences are independent. A runs test on the first differences produces the following output:

> runs(diff(ibm))  # runs test
$pvalue
[1] 0.258

$observed.runs
[1] 172

$expected.runs
[1] 183.2391

The p-value = 0.258 is not small, so we do not reject the null hypothesis: the runs test provides no evidence of dependence among the first differences. Second, we apply the Shapiro-Wilk test to check whether the first differences are normally distributed. The Shapiro-Wilk test on the first differences produces the output:

> shapiro.test(diff(ibm))

        Shapiro-Wilk normality test

data:  diff(ibm)
W = 0.9585, p-value = 1.094e-08

The p-value = 1.094e-08 is extremely small, so we reject the null hypothesis: the test provides strong evidence of non-normality of the first differences. Please see Figure 1. The histogram of the first differences is bell-shaped with heavy tails, so a t-distribution could be a possible choice. The QQ plot against a t-distribution with 5 degrees of freedom looks good. The following R commands produce the histogram and QQ plot.
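The expected number of runs printed above can be reproduced from the counts of positive and negative observations. The sketch below (in Python rather than the homework's R `runs` function, and on a made-up +/- pattern) uses the standard Wald-Wolfowitz formula E[R] = 2*n1*n2/n + 1.

```python
# Sketch of the runs-test bookkeeping (not the TSA::runs implementation):
# count observed runs and compare to the expected count under independence.
# The sign pattern below is made up for illustration.

def runs_summary(signs):
    """Return (observed runs, expected runs under independence)."""
    n1 = sum(1 for s in signs if s > 0)
    n2 = sum(1 for s in signs if s <= 0)
    n = n1 + n2
    # observed runs = number of sign changes plus one
    observed = 1 + sum(1 for a, b in zip(signs, signs[1:]) if (a > 0) != (b > 0))
    expected = 2.0 * n1 * n2 / n + 1.0
    return observed, expected

signs = [1, 1, -1, 1, -1, -1, 1, 1]   # hypothetical +/- pattern
obs, exp = runs_summary(signs)
```

An observed count far from the expected count (relative to its sampling variability) is what drives the small or large p-values that `runs` reports.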

Figure 1: Left: Histogram of the first differences of the ibm data. Right: QQ plot for t(5).

par(mfrow=c(1,2))
hist(diff(ibm))
qqplot(qt(ppoints(diff(ibm)), df = 5), diff(ibm), xlab = "Q-Q plot for t(5)")
qqline(diff(ibm))

The R function tsdiag produces the plot in Figure 2. The top plot displays the residuals plotted through time (without connecting lines). The middle plot displays the sample ACF of the residuals. The bottom plot displays the p-values of the modified Ljung-Box test for various values of K. A horizontal line at α = 0.05 is added. For the ibm data, we see in Figure 2 that the modified Ljung-Box test p-values are mostly larger than 0.05 at early K. However, they stay close to α = 0.05 and become significant at the largest values of K, which raises concerns that this model may not be adequate. The residual autocorrelations look acceptable, even though there is a significantly large value at a long lag. We now overfit using an ARIMA(1,1,0) and an ARIMA(0,1,1) model. Here is the R output from all three model fits:

> arima(ibm, order=c(0,1,0), method="ML")
sigma^2 estimated as 52.62: log likelihood = -1251.37, aic = 2502.74

> arima(ibm, order=c(1,1,0), method="ML")
ar1
-0.0869

Figure 2: ibm data. Residual graphics and modified Ljung-Box p-values for the ARIMA(0,1,0) fit. This figure was created using the tsdiag function in R.

s.e. 0.0519
sigma^2 estimated as 52.22: log likelihood = -1249.97, aic = 2501.94

> arima(ibm, order=c(0,1,1), method="ML")
ma1
-0.0864
s.e. 0.0512
sigma^2 estimated as 52.22: log likelihood = -1249.97, aic = 2501.95

In the ARIMA(1,1,0) overfit, a 95 percent confidence interval for ϕ, the additional AR parameter, is

-0.0869 ± 1.96(0.0519) = (-0.189, 0.015),

which does include 0. Therefore, ϕ̂ is not statistically different from zero, which suggests that the ARIMA(1,1,0) model is not necessary. In the ARIMA(0,1,1) overfit, a 95 percent confidence interval for θ, the additional MA parameter, is

-0.0864 ± 1.96(0.0512) = (-0.187, 0.014),

which does include 0. Therefore, θ̂ is not statistically different from zero, which suggests that the ARIMA(0,1,1) model is not necessary. The following table summarizes the output.

Table 1: Overfitting the ARIMA(0,1,0) model

Model         Estimate (s.e.)    Additional estimate  Significant?  σ̂²_e    AIC
ARIMA(0,1,0)  --                 --                   --            52.62   2502.74
ARIMA(1,1,0)  -0.0869 (0.0519)   ϕ̂                    no            52.22   2501.94
ARIMA(0,1,1)  -0.0864 (0.0512)   θ̂                    no            52.22   2501.95

Because the additional estimates in the overfit models are not statistically different from zero, there is no reason to further consider either model. So we still conclude that the final model is (1 - B)Y_t = e_t, where e_t is a mean-zero white noise process with e_t following a t-distribution with 5 degrees of freedom.
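The overfitting decision used above is just interval arithmetic: estimate ± 1.96 × s.e., keeping the smaller model when 0 falls inside. A small helper (a Python sketch, using the estimates printed in the R output as inputs) makes the rule explicit.

```python
# Sketch: 95% Wald confidence interval for an extra ARMA coefficient,
# and the overfitting decision (keep the smaller model if 0 is inside).

def wald_ci(estimate, se, z=1.96):
    """Return (lower, upper) of the approximate 95% confidence interval."""
    return estimate - z * se, estimate + z * se

def extra_term_needed(estimate, se):
    """True if the extra coefficient is significantly different from zero."""
    lo, hi = wald_ci(estimate, se)
    return not (lo <= 0.0 <= hi)

# Values from the ARIMA(1,1,0) overfit of the ibm data:
lo, hi = wald_ci(-0.0869, 0.0519)
```

Applying `extra_term_needed` to each overfit coefficient reproduces the "Significant?" column in the summary tables.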

internet data: The ARIMA(2,2,0) model is chosen as the suggested model for the internet data. That is,

(1 - ϕ₁B - ϕ₂B²)(1 - B)²Y_t = e_t.

Here is the R output from fitting an ARIMA(2,2,0) model via maximum likelihood:

> arima(internet, order=c(2,2,0), method="ML")
ar1      ar2
0.2579   -0.4407
s.e. 0.0915   0.0906
sigma^2 estimated as 10.13: log likelihood = -252.73, aic = 509.46

Therefore, the fitted ARIMA(2,2,0) model is

∇²Y_t = 0.2579 ∇²Y_{t-1} - 0.4407 ∇²Y_{t-2} + e_t.

The white noise variance estimate, from the ML fit, is σ̂²_e ≈ 10.13. Figure 3 displays the time series plot of the standardized residuals (upper right), their histogram (lower left), and their QQ plot (lower right). The histogram and the QQ plot show no gross departures from normality. The time series plot of the standardized residuals displays no noticeable patterns and looks like a stationary random process. The R output for the Shapiro-Wilk and runs tests is given below:

> shapiro.test(rstandard(internet.arima.fit))

        Shapiro-Wilk normality test

data:  rstandard(internet.arima.fit)
W = 0.9896, p-value = 0.6305

> runs.test(rstandard(internet.arima.fit))
Error in runs.test(rstandard(internet.arima.fit)) : x is not a factor

> runs(rstandard(internet.arima.fit))
$pvalue
[1] 0.999

$observed.runs
[1] 50

$expected.runs
[1] 50.5

The Shapiro-Wilk test does not reject normality (p-value = 0.6305). The runs test does not reject independence (p-value = 0.999). For the internet data, the (standardized) residuals from the ARIMA(2,2,0) fit appear to reasonably

Figure 3: internet data. Upper left: internet time series. Upper right: standardized residuals from the ARIMA(2,2,0) fit with zero line added. Lower left: histogram of the standardized residuals. Lower right: QQ plot of the standardized residuals.

Figure 4: internet data. Residual graphics and modified Ljung-Box p-values for the ARIMA(2,2,0) fit. This figure was created using the tsdiag function in R.

satisfy the normality and independence assumptions. The R function tsdiag produces the plot in Figure 4. The top plot displays the residuals plotted through time (without connecting lines). The middle plot displays the sample ACF of the residuals. The bottom plot displays the p-values of the modified Ljung-Box test for various values of K. A horizontal line at α = 0.05 is added. For the internet data, we see in Figure 4 that all of the modified Ljung-Box test p-values are larger than 0.05, lending further support to the ARIMA(2,2,0) model. In other words, the residual output in Figure 4 fully supports the ARIMA(2,2,0) model. Our residual analysis suggests that an ARIMA(2,2,0) model for the internet data is reasonable. We now overfit using an ARIMA(3,2,0) and an ARIMA(2,2,1) model. Here is the R output from all three model fits:

> arima(internet, order=c(2,2,0), method="ML")
ar1      ar2
0.2579   -0.4407
s.e. 0.0915   0.0906
sigma^2 estimated as 10.13: log likelihood = -252.73, aic = 509.46

> arima(internet, order=c(2,2,1), method="ML")  # overfitting diagnostics
ar1      ar2       ma1
0.3512   -0.4572   -0.1161
s.e. 0.2189   0.0937   0.2502
sigma^2 estimated as 10.10: log likelihood = -252.63, aic = 511.26

> arima(internet, order=c(3,2,0), method="ML")  # overfitting diagnostics
ar1      ar2       ar3
0.2390   -0.4298   -0.0430
s.e. 0.1021   0.0943   0.1033
sigma^2 estimated as 10.11: log likelihood = -252.65, aic = 511.29

In the ARIMA(2,2,1) overfit, a 95 percent confidence interval for θ, the additional MA parameter, is

-0.1161 ± 1.96(0.2502) = (-0.606, 0.374),

which does include 0. Therefore, θ̂ is not statistically different from zero, which suggests that the ARIMA(2,2,1) model is not necessary. In the ARIMA(3,2,0) overfit, a 95 percent confidence interval for ϕ₃, the additional AR parameter, is

-0.0430 ± 1.96(0.1033) = (-0.245, 0.159),

which does include 0. Therefore, ϕ̂₃ is not statistically different from zero, which suggests that the ARIMA(3,2,0) model is not necessary. The following table summarizes the output.

Table 2: Overfitting the ARIMA(2,2,0) model

Model         Estimate (s.e.)                     Additional estimate  Significant?  σ̂²_e   AIC
ARIMA(2,2,0)  0.2579 (0.0915), -0.4407 (0.0906)   --                   --            10.13  509.46
ARIMA(2,2,1)  -0.1161 (0.2502)                    θ̂                    no            10.10  511.26
ARIMA(3,2,0)  -0.0430 (0.1033)                    ϕ̂₃                   no            10.11  511.29

Because the additional estimates in the overfit models are not statistically different from zero, there is no reason to further consider either model. Note also how the estimates of ϕ₁ and ϕ₂ become less precise in the two larger models. In the end, the ARIMA(2,2,0) model performs well, and we are satisfied with this choice of model.
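One further check on the chosen model: the AR(2) part of the fit should be stationary, i.e. the roots of 1 - 0.2579x + 0.4407x² (coefficients from the R output, with ϕ₂ = -0.4407) should lie outside the unit circle. A quick Python sketch verifies this.

```python
import cmath

# Sketch: stationarity check for the AR(2) part of the fitted ARIMA(2,2,0).
# Characteristic polynomial: 1 - phi1*x - phi2*x^2 = 0, with the fitted
# coefficients phi1 = 0.2579, phi2 = -0.4407 taken from the R output above.
phi1, phi2 = 0.2579, -0.4407
a, b, c = -phi2, -phi1, 1.0           # a*x^2 + b*x + c = 0
disc = cmath.sqrt(b * b - 4 * a * c)  # discriminant (complex here)
roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
moduli = [abs(r) for r in roots]      # both should exceed 1 for stationarity
```

The roots are a complex-conjugate pair with modulus about 1.51, comfortably outside the unit circle, so the fitted AR(2) part is stationary (after the two differences remove the nonstationarity in the level).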

gasprices data: The ARIMA(0,1,1) model is chosen as the suggested model for the gasprices data. That is,

(1 - B)Y_t = (1 - θB)e_t.

Here is the R output from fitting an ARIMA(0,1,1) model via maximum likelihood:

> gasprices.arima.fit
ma1
0.4418
s.e. 0.0625
sigma^2 estimated as 0.002408: log likelihood = 229.65, aic = -457.3

Therefore, the fitted ARIMA(0,1,1) model is

∇Y_t = e_t + 0.4418 e_{t-1}.

The white noise variance estimate, from the ML fit, is σ̂²_e ≈ 0.002408. Figure 5 displays the time series plot of the standardized residuals (upper right), their histogram (lower left), and their QQ plot (lower right). The histogram shows that the distribution of the standardized residuals is slightly right-skewed, and the QQ plot shows mild departures from the normality assumption. The time series plot of the standardized residuals displays non-constant variance. The R output for the Shapiro-Wilk and runs tests is given below:

> shapiro.test(rstandard(gasprices.arima.fit))

        Shapiro-Wilk normality test

data:  rstandard(gasprices.arima.fit)
W = 0.9777, p-value = 0.01824

> runs(rstandard(gasprices.arima.fit))
$pvalue
[1] 0.759

$observed.runs
[1] 71

$expected.runs
[1] 73.33103

The Shapiro-Wilk test does reject normality (p-value = 0.01824). However, the runs test does not reject independence (p-value = 0.759). For the gasprices data, the (standardized) residuals from the ARIMA(0,1,1) fit satisfy the independence assumption but not normality. The R function tsdiag produces the plot in Figure 6. The top plot displays the residuals plotted through time (without

Figure 5: gasprices data. Upper left: gasprices time series. Upper right: standardized residuals from the ARIMA(0,1,1) fit with zero line added. Lower left: histogram of the standardized residuals. Lower right: QQ plot of the standardized residuals.

Figure 6: gasprices data. Residual graphics and modified Ljung-Box p-values for the ARIMA(0,1,1) fit. This figure was created using the tsdiag function in R.

connecting lines). The middle plot displays the sample ACF of the residuals. The bottom plot displays the p-values of the modified Ljung-Box test for various values of K. A horizontal line at α = 0.05 is added. For the gasprices data, we see in Figure 6 that all of the modified Ljung-Box test p-values are larger than 0.05, lending further support to the ARIMA(0,1,1) model. In other words, the residual output in Figure 6 supports the ARIMA(0,1,1) model. Our residual analysis suggests that an ARIMA(0,1,1) model for the gasprices data is reasonable with respect to independence, though not normality. We now overfit using an ARIMA(1,1,1) and an ARIMA(0,1,2) model. Here is the R output from all three model fits:

> arima(gasprices, order=c(0,1,1), method="ML")
ma1
0.4418
s.e. 0.0625
sigma^2 estimated as 0.002408: log likelihood = 229.65, aic = -457.3

> arima(gasprices, order=c(1,1,1), method="ML")
ar1      ma1
0.3585   0.1713
s.e. 0.1451   0.1479
sigma^2 estimated as 0.002324: log likelihood = 232.18, aic = -460.35

> arima(gasprices, order=c(0,1,2), method="ML")
ma1      ma2
0.5433   0.1946
s.e. 0.0843   0.0779
sigma^2 estimated as 0.002312: log likelihood = 232.53, aic = -461.06

In the ARIMA(1,1,1) overfit, a 95 percent confidence interval for ϕ, the additional AR parameter, is

0.3585 ± 1.96(0.1451) = (0.074, 0.643),

which does not include 0. Therefore, ϕ̂ is statistically different from zero, which suggests that the ARIMA(1,1,1) model is worth considering. In the ARIMA(0,1,2) overfit, a 95 percent confidence interval for θ₂, the additional MA parameter, is

0.1946 ± 1.96(0.0779) = (0.042, 0.347),

which does not include 0. Therefore, θ̂₂ is statistically different from zero, which suggests that the ARIMA(0,1,2) model is worth considering. The following table summarizes the output. Because the additional estimates in the

Table 3: Overfitting the ARIMA(0,1,1) model

Model         Estimate (s.e.)   Additional estimate  Significant?  σ̂²_e      AIC
ARIMA(0,1,1)  0.4418 (0.0625)   --                   --            0.002408  -457.30
ARIMA(1,1,1)  0.3585 (0.1451)   ϕ̂                    yes           0.002324  -460.35
ARIMA(0,1,2)  0.1946 (0.0779)   θ̂₂                   yes           0.002312  -461.06

overfit models are statistically different from zero, it is worth considering these additional models further. In the end, the original model is not fully satisfactory after diagnostics, and a further search for a more appropriate model is needed.
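The AIC values printed by arima are consistent with AIC = -2 log L + 2k. A Python sketch checks this arithmetic against the gasprices fits; note that matching the printed numbers requires taking k as the number of ARMA coefficients only, which is an inference from the output above rather than a documented convention.

```python
# Sketch: verify the reported AIC values satisfy AIC = -2*logL + 2*k,
# where k is taken as the number of ARMA coefficients (an assumption
# inferred from the printed output, not a general R convention).

def aic(log_lik, n_coef):
    return -2.0 * log_lik + 2.0 * n_coef

fits = [
    # (log likelihood, number of ARMA coefficients, reported AIC)
    (229.65, 1, -457.30),  # ARIMA(0,1,1)
    (232.18, 2, -460.35),  # ARIMA(1,1,1)
    (232.53, 2, -461.06),  # ARIMA(0,1,2)
]
checks = [abs(aic(ll, k) - reported) < 0.05 for ll, k, reported in fits]
```

Since all three models are fit to the same (differenced) data, the AIC differences here directly reflect the improvement in log-likelihood net of the extra-parameter penalty, which is why the overfit models win despite the penalty.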

Figure 7: Box-Cox transformation output (log-likelihood versus λ, with 95% confidence interval marked).

2.(a) Because the numbers of home runs are counts, a transformation is suggested. The Box-Cox output in Figure 7 shows that λ = 0.5 lies inside an approximate 95 percent confidence interval for λ. Recall that λ = 0.5 corresponds to the square-root transformation. Taking first differences of the square root of the data is needed because the square root of the home runs data exhibits a trend.

2.(b) The ACF of {Z_t} should decay very slowly because the series has a trend and is not stationary. Compared with the ACF of {Z_t}, we expect the ACF of {∇Z_t} to decay to zero faster.

2.(c) According to the output from R, we have θ̂ = -0.2488 with estimated standard error 0.1072. An approximate 95 percent confidence interval for θ is

-0.2488 ± 1.96(0.1072) = (-0.459, -0.039).

We are 95 percent confident that θ is between -0.459 and -0.039.

2.(d) According to the Shapiro-Wilk test and the runs test on the standardized residuals, we do not have evidence against the residuals being a mean-zero white noise process. However, the modified Ljung-Box test p-values raise serious concerns over the adequacy of the IMA(1,1) fit.
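The Box-Cox family behind Figure 7 is (y^λ - 1)/λ for λ ≠ 0 and log y for λ = 0; at λ = 0.5 this is just a shifted, rescaled square root, which is why part (a) works with √y. A Python sketch (illustrative values only) makes the correspondence concrete.

```python
import math

# Sketch: the Box-Cox power transformation family. At lambda = 0.5 the
# transform equals 2*(sqrt(y) - 1), i.e. a shifted/rescaled square root.
def box_cox(y, lam):
    if lam == 0.0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

# Illustrative values: box_cox(y, 0.5) = 2*(sqrt(y) - 1)
vals = [1.0, 4.0, 9.0]
transformed = [box_cox(y, 0.5) for y in vals]
```

Because the transform at λ = 0.5 is a monotone rescaling of √y, fitting a model to √Y_t or to the Box-Cox-transformed series gives the same residual diagnostics up to a linear change of scale.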

3.

Table 4: Standardized residuals after the IMA(1,1) model fit

Lag      1      2      3      4      5      6      7      8      9     10
ACF    0.882  0.811  0.733  0.690  0.674  0.668  0.661  0.621  0.568  0.519
PACF   0.882  0.144 -0.029  0.111  0.159  0.093  0.050 -0.097 -0.087 -0.015

Table 5: Ljung-Box test and Fisher and Gallagher test

m          2      3      4      5      6      7      8      9     10
Q        0.156  4.539 10.917 13.624 13.941 15.364 17.934 17.984 18.031
p-value  0.693  0.103  0.012  0.009  0.016  0.018  0.012  0.021  0.035
Q_W      0.079  1.566  3.903  5.847  7.196  8.363  9.559 10.495 11.249
p-value  1.000  0.605  0.140  0.053  0.036  0.029  0.023  0.021  0.022
M_W      0.079  1.566  4.068  6.206  7.849  9.023  9.926 10.726 11.437
p-value  1.000  0.605  0.121  0.039  0.022  0.018  0.018  0.018  0.020

According to Table 5, the Fisher and Gallagher tests suggest that the IMA(1,1) model may not be adequate, just as the Ljung-Box test does. The following R commands calculate the ACF, PACF, Q, Q_W, and M_W:

round(acf(homeruns, plot=F, lag.max=10)$acf, 3)   # ACF (TSA's acf drops lag 0)
round(pacf(homeruns, plot=F, lag.max=10)$acf, 3)  # PACF

Q_LjungBox = function(r, m){  # Ljung-Box test statistic
  n = length(r); mn = length(m)
  Q = array(, mn)
  for (i in 1:mn){
    k = 1:m[i]
    acf_r = acf(r, plot=F, lag.max=m[i])$acf
    Q[i] = n*(n+2)*sum((acf_r^2)/(n-k))
  }
  return(Q)
}

p_value_for_q_ljungbox = function(Q, m, p=0, q=1){
  mn = length(m); p_value = array(, mn)
  for (i in 1:mn){
    p_value[i] = 1 - pchisq(Q[i], df = m[i]-p-q)
  }
  return(p_value)
}

m = 2:10
round(Q_LjungBox(rstandard(homerun.fit), m), 3)
round(p_value_for_q_ljungbox(Q_LjungBox(rstandard(homerun.fit), m), m), 3)

QW = function(r, m){  # Q.W.m (Fisher-Gallagher weighted Ljung-Box statistic)
  n = length(r); mn = length(m)
  QW = array(, mn)
  for (i in 1:mn){
    k = 1:m[i]
    acf_r = acf(r, plot=F, lag.max=m[i])$acf
    QW[i] = n*(n+2)*sum((m[i]-k+1)/m[i] * (acf_r^2)/(n-k))
  }
  return(QW)
}

MW = function(r, m){  # M.W.m (weighted statistic based on the PACF)
  n = length(r); mn = length(m)
  MW = array(, mn)
  for (i in 1:mn){
    k = 1:m[i]
    pacf_r = pacf(r, plot=F, lag.max=m[i])$acf
    MW[i] = n*(n+2)*sum((m[i]-k+1)/m[i] * (pacf_r^2)/(n-k))
  }
  return(MW)
}

p_value_for_qm = function(QM, m){
  mn = length(m)
  p_value = array(, mn)
  for (i in 1:mn){
    alpha = (3/4)*(m[i]*(m[i]+1)^2)/(2*m[i]^2+3*m[i]+1-6*m[i])  # Equation (7)
    beta = (2/3)*(2*m[i]^2+3*m[i]+1-6*m[i])/(m[i]*(m[i]+1))     # Equation (8)
    p_value[i] = 1 - pgamma(QM[i], shape=alpha, scale=beta)
  }
  return(p_value)
}

m = 2:10
round(QW(rstandard(homerun.fit), m), 3)
round(p_value_for_qm(QW(rstandard(homerun.fit), m), m), 3)
round(MW(rstandard(homerun.fit), m), 3)
round(p_value_for_qm(MW(rstandard(homerun.fit), m), m), 3)
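The two statistics computed by the R code above can be written compactly as Q = n(n+2) Σ_{k=1}^m ρ̂_k²/(n-k) and Q_W = n(n+2) Σ_{k=1}^m ((m-k+1)/m) ρ̂_k²/(n-k). The Python sketch below mirrors those formulas on made-up autocorrelation values (not the homeruns residuals).

```python
# Sketch: Ljung-Box Q and the Fisher-Gallagher weighted version Q_W,
# computed from sample autocorrelations rho = [rho_1, ..., rho_m].
# The rho values below are made up for illustration only.

def ljung_box(rho, n):
    """Q = n(n+2) * sum_k rho_k^2 / (n - k)."""
    return n * (n + 2) * sum(r**2 / (n - k) for k, r in enumerate(rho, start=1))

def weighted_ljung_box(rho, n):
    """Q_W downweights longer lags by (m - k + 1)/m."""
    m = len(rho)
    return n * (n + 2) * sum(
        (m - k + 1) / m * r**2 / (n - k) for k, r in enumerate(rho, start=1)
    )

rho = [0.10, -0.05, 0.02]  # hypothetical residual autocorrelations
n = 100
q = ljung_box(rho, n)
qw = weighted_ljung_box(rho, n)
```

Because the weights (m-k+1)/m never exceed 1, Q_W ≤ Q for the same autocorrelations; the weighted statistic then gets compared to a gamma rather than a chi-square reference distribution, as in the `p_value_for_qm` function above.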