Information Criteria and Model Selection

Herman J. Bierens
Pennsylvania State University
March 12, 2006

1. Introduction

Let $L_n(k)$ be the maximum likelihood of a model with $k$ parameters based on a sample of size $n$, and let $k_0$ be the correct number of parameters. Suppose that for $k > k_0$ the model with $k_0$ parameters is nested in the model with $k$ parameters, so that $L_n(k_0)$ is obtained by setting $k - k_0$ parameters in the larger model to constants. Without loss of generality we may assume that these constants are zeros. Thus, denoting the likelihood function of the least parsimonious model by $\hat{L}_n(\theta)$, $\theta \in \Theta \subset \mathbb{R}^m$, we have $L_n(k) = \max_{\theta \in \Theta_k} \hat{L}_n(\theta)$, where
$$\Theta_k = \{\theta = (\theta_1', \theta_2')' \in \Theta : \theta_2 = 0 \in \mathbb{R}^{m-k}\} \qquad (1)$$
for $k \le m$. Thus, the models with $k < k_0$ parameters are misspecified, and the models with $k > k_0$ parameters are correctly specified but over-parametrized.

The Akaike (1974, 1976), Hannan-Quinn (1979), and Schwarz (1978) information criteria for selecting the most parsimonious correct model are
$$-2\ln(L_n(k))/n + 2k/n,$$
$$-2\ln(L_n(k))/n + 2k\ln(\ln(n))/n,$$
$$-2\ln(L_n(k))/n + k\ln(n)/n,$$
respectively. Since the Schwarz information criterion is derived using Bayesian arguments, this criterion is also known as the Bayesian Information Criterion (BIC). These criteria take the general form
$$c_n(k) = -2\ln(L_n(k))/n + k\,\varphi(n)/n, \qquad (2)$$
where $\varphi(n) = 2$ in the Akaike case, $\varphi(n) = 2\ln(\ln(n))$ in the Hannan-Quinn case, and $\varphi(n) = \ln(n)$ in the Schwarz case. Using these criteria, the model is selected that corresponds to
$$\hat{k} = \operatorname{argmin}_{k \le m} c_n(k). \qquad (3)$$
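As a quick illustration of the selection rule (3), the following minimal sketch (not part of the original note) computes the three criteria in the general form (2) from a vector of maximized log-likelihoods and picks the minimizing $k$. The sample size and the log-likelihood values are hypothetical inputs.

```python
# Sketch of the selection rule (3): compute c_n(k) of form (2) and minimize over k.
import numpy as np

def information_criterion(loglik, n, phi):
    """c_n(k) = -2 ln(L_n(k))/n + k * phi(n)/n for k = 1,...,m."""
    k = np.arange(1, len(loglik) + 1)          # number of free parameters
    return -2.0 * np.asarray(loglik) / n + k * phi(n) / n

n = 500                                                        # hypothetical sample size
loglik = np.array([-812.3, -760.1, -741.8, -741.2, -740.9])    # hypothetical ln L_n(k), k = 1,...,5

criteria = {
    "Akaike":       lambda n: 2.0,                  # phi(n) = 2
    "Hannan-Quinn": lambda n: 2.0 * np.log(np.log(n)),
    "Schwarz":      lambda n: np.log(n),
}
for name, phi in criteria.items():
    c = information_criterion(loglik, n, phi)
    k_hat = int(np.argmin(c)) + 1                   # k_hat = argmin_{k <= m} c_n(k)
    print(f"{name:12s}: k_hat = {k_hat}")
```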

2. Consistency

If $k < k_0$ then the model with $k$ parameters is misspecified, so that
$$\operatorname{plim}_{n\to\infty} \ln(L_n(k))/n < \operatorname{plim}_{n\to\infty} \ln(L_n(k_0))/n. \qquad (4)$$
Hence, it follows from (2), (4) and $\lim_{n\to\infty}\varphi(n)/n = 0$ that in all three cases
$$\lim_{n\to\infty} P[c_n(k_0) \ge c_n(k)] = \lim_{n\to\infty} P\left[-2\ln(L_n(k_0))/n + k_0\varphi(n)/n \ge -2\ln(L_n(k))/n + k\varphi(n)/n\right] \qquad (5)$$
$$= \lim_{n\to\infty} P\left[\ln(L_n(k_0))/n - \ln(L_n(k))/n \le 0.5(k_0 - k)\varphi(n)/n\right] = 0,$$
so that
$$\lim_{n\to\infty} P[\hat{k} < k_0] \le \lim_{n\to\infty} P[c_n(k_0) \ge c_n(k) \text{ for some } k < k_0] \le \sum_{k < k_0} \lim_{n\to\infty} P[c_n(k_0) \ge c_n(k)] = 0. \qquad (6)$$

For $k > k_0$ it follows from the likelihood ratio test that
$$2\left(\ln(L_n(k)) - \ln(L_n(k_0))\right) \to_d X_{k-k_0} \sim \chi^2_{k-k_0}, \qquad (7)$$
where $\to_d$ indicates convergence in distribution. Then in the Akaike case,
$$n\left(c_n(k_0) - c_n(k)\right) = 2\left(\ln(L_n(k)) - \ln(L_n(k_0))\right) - 2(k - k_0) \to_d X_{k-k_0} - 2(k - k_0),$$
hence
$$\lim_{n\to\infty} P[c_n(k_0) > c_n(k)] = P[X_{k-k_0} > 2(k - k_0)] > 0.$$
Therefore, the Akaike criterion may asymptotically overshoot the correct number of parameters:
$$\lim_{n\to\infty} P[\hat{k} \ge k_0] = 1, \quad \text{but} \quad \lim_{n\to\infty} P[\hat{k} > k_0] > 0.$$

Since in the Hannan-Quinn and Schwarz cases $\lim_{n\to\infty}\varphi(n) = \infty$, (7) implies that in these two cases
$$\operatorname{plim}_{n\to\infty} 2\left(\ln(L_n(k)) - \ln(L_n(k_0))\right)/\varphi(n) = 0,$$
hence
$$\operatorname{plim}_{n\to\infty} n\left(c_n(k_0) - c_n(k)\right)/\varphi(n) = \operatorname{plim}_{n\to\infty} 2\left(\ln(L_n(k)) - \ln(L_n(k_0))\right)/\varphi(n) + k_0 - k = k_0 - k \le -1,$$
so that $\lim_{n\to\infty} P[c_n(k_0) \ge c_n(k)] = 0$. This implies, similar to (6), that $\lim_{n\to\infty} P[\hat{k} > k_0] = 0$. Thus, in the Hannan-Quinn and Schwarz cases,
$$\lim_{n\to\infty} P[\hat{k} = k_0] = 1. \qquad (8)$$
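The contrast between the Akaike criterion and the Hannan-Quinn and Schwarz criteria can be illustrated by simulation. The sketch below is not part of the note: it uses nested Gaussian regression models with $k_0 = 2$ nonzero coefficients out of $m = 6$, and estimates $P[\hat{k} > k_0]$ for the Akaike and Schwarz penalties. All numerical settings (sample size, coefficients, number of replications) are arbitrary illustrations.

```python
# Monte Carlo sketch of Section 2: Akaike overshoots k_0 with positive probability,
# Schwarz selects k_0 with probability approaching one.
import numpy as np

rng = np.random.default_rng(0)
n, m, k0 = 400, 6, 2                   # sample size, largest model, true number of parameters
beta = np.zeros(m); beta[:k0] = 1.0    # last m - k0 coefficients are zero
reps = 1000
counts = {"Akaike": 0, "Schwarz": 0}

for _ in range(reps):
    X = rng.standard_normal((n, m))
    y = X @ beta + rng.standard_normal(n)
    # For nested Gaussian models, -2 ln(L_n(k))/n = ln(sigma^2_k) + 1 + ln(2 pi);
    # the constant is common to all k and cancels in the argmin.
    log_s2 = np.empty(m)
    for k in range(1, m + 1):
        resid = y - X[:, :k] @ np.linalg.lstsq(X[:, :k], y, rcond=None)[0]
        log_s2[k - 1] = np.log(resid @ resid / n)
    k = np.arange(1, m + 1)
    k_aic = np.argmin(log_s2 + 2 * k / n) + 1
    k_bic = np.argmin(log_s2 + k * np.log(n) / n) + 1
    counts["Akaike"] += (k_aic > k0)
    counts["Schwarz"] += (k_bic > k0)

for name, c in counts.items():
    print(f"P[k_hat > k_0] under {name}: about {c / reps:.3f}")
```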

Note that the consistency result (8) holds for any criterion of the type (2) with
$$\lim_{n\to\infty}\varphi(n)/n = 0 \quad \text{and} \quad \lim_{n\to\infty}\varphi(n) = \infty, \qquad (9)$$
for example $\varphi(n) = \sqrt{n}$.

3. Applications

3.1 VAR and AR model selection

Let $L_n(k)$ be the maximum likelihood of a $d$-variate Gaussian VAR($p$) model,
$$Y_t = a_0 + \sum_{j=1}^{p} A_j Y_{t-j} + U_t, \quad U_t \sim \text{i.i.d. } N_d[0, \Sigma],$$
where $Y_t \in \mathbb{R}^d$ is observed for $t = 1-p, \ldots, n$. Then $k = d + d^2 p$ and
$$\ln(L_n(k)) = -\tfrac{1}{2}\, n d - \tfrac{1}{2}\, n d \ln(2\pi) - \tfrac{1}{2}\, n \ln\det(\hat{\Sigma}_p),$$
where $\hat{\Sigma}_p$ is the maximum likelihood estimator of the error variance matrix $\Sigma$. Hence,
$$-2\ln(L_n(k))/n = \ln\det(\hat{\Sigma}_p) + d\left(1 + \ln(2\pi)\right). \qquad (10)$$
The second term does not depend on $p$. Therefore, the model is selected that corresponds to $\hat{p} = \operatorname{argmin}_p c_n(p)$, where
$$c_n(p) = \ln\det(\hat{\Sigma}_p) + 2(d + d^2 p)/n,$$
$$c_n(p) = \ln\det(\hat{\Sigma}_p) + 2(d + d^2 p)\ln(\ln(n))/n,$$
$$c_n(p) = \ln\det(\hat{\Sigma}_p) + (d + d^2 p)\ln(n)/n.$$

Similarly, these criteria can also be used to determine the order $p$ of an AR($p$) model,
$$Y_t = \alpha_0 + \sum_{j=1}^{p} \alpha_j Y_{t-j} + U_t, \quad U_t \sim \text{i.i.d. } N[0, \sigma^2],$$
where again $Y_t \in \mathbb{R}$ is observed for $t = 1-p, \ldots, n$, simply by replacing $d$ with 1 and $\det(\hat{\Sigma}_p)$ with the ML estimator $\hat{\sigma}^2_p$ of the error variance $\sigma^2$:
$$c_n(p) = \ln(\hat{\sigma}^2_p) + 2(1 + p)/n,$$
$$c_n(p) = \ln(\hat{\sigma}^2_p) + 2(1 + p)\ln(\ln(n))/n,$$
$$c_n(p) = \ln(\hat{\sigma}^2_p) + (1 + p)\ln(n)/n.$$
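For concreteness, the following sketch (again not part of the note) applies the AR($p$) criteria above to a simulated AR(2) series. Each candidate order is fitted by OLS conditional on the first $\bar{p}$ observations, so that every candidate model uses the same effective sample size $n$, and $\hat{\sigma}^2_p$ is the residual variance. The AR coefficients, sample size, and maximum order are hypothetical.

```python
# Sketch of AR(p) order selection with the criteria of Section 3.1.
import numpy as np

rng = np.random.default_rng(1)
T, p_max = 600, 6
y = np.zeros(T)
for t in range(2, T):                       # AR(2): y_t = 0.6 y_{t-1} - 0.3 y_{t-2} + u_t
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.standard_normal()

n = T - p_max                               # common effective sample size for all candidate orders
log_s2 = np.empty(p_max + 1)
for p in range(p_max + 1):
    Y = y[p_max:]                           # regressand: y_t for t = p_max,...,T-1
    X = np.column_stack([np.ones(n)] + [y[p_max - j:T - j] for j in range(1, p + 1)])
    resid = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
    log_s2[p] = np.log(resid @ resid / n)   # ln(sigma^2_p)

p = np.arange(p_max + 1)
print("AIC order:", np.argmin(log_s2 + 2 * (1 + p) / n))
print("HQ  order:", np.argmin(log_s2 + 2 * (1 + p) * np.log(np.log(n)) / n))
print("BIC order:", np.argmin(log_s2 + (1 + p) * np.log(n) / n))
```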

3.2 ARMA model specification

Similarly, in the ARMA($p,q$) case
$$Y_t = \alpha_0 + \sum_{j=1}^{p} \alpha_j Y_{t-j} + U_t - \sum_{j=1}^{q} \beta_j U_{t-j}, \quad U_t \sim \text{i.i.d. } N[0, \sigma^2],$$
these criteria become
$$c_n(p,q) = \ln(\hat{\sigma}^2_{p,q}) + 2(1 + p + q)/n,$$
$$c_n(p,q) = \ln(\hat{\sigma}^2_{p,q}) + 2(1 + p + q)\ln(\ln(n))/n,$$
$$c_n(p,q) = \ln(\hat{\sigma}^2_{p,q}) + (1 + p + q)\ln(n)/n,$$
where now $\hat{\sigma}^2_{p,q}$ is the ML estimator of the error variance $\sigma^2$ and $n$ is the number of observations used in the ML estimation. It can be shown [see Hannan (1980)] that even in the case of common roots in the AR and MA polynomials, the Hannan-Quinn and Schwarz criteria still select the correct orders $p$ and $q$ consistently: given upper bounds $\bar{p} \ge p_0$ and $\bar{q} \ge q_0$, where $p_0$ and $q_0$ are the correct orders of the ARMA($p,q$) process, we have
$$\lim_{n\to\infty} P[\hat{p} = p_0, \hat{q} = q_0] = 1, \quad \text{where} \quad (\hat{p}, \hat{q}) = \operatorname{argmin}_{0 \le p \le \bar{p},\, 0 \le q \le \bar{q}} c_n(p,q).$$

3.3 ARCH and GARCH models

If a model is extended to include ARCH or GARCH errors, it is recommended to subtract the term $1 + \ln(2\pi)$ from $-2\ln(L_n(k))/n$ [see (10)] in the formula for the information criteria, in order to make these criteria comparable with those for the model without (G)ARCH errors. Thus,
$$c_n(k) = -2\ln(L_n(k))/n + 2k/n - 1 - \ln(2\pi),$$
$$c_n(k) = -2\ln(L_n(k))/n + 2k\ln(\ln(n))/n - 1 - \ln(2\pi),$$
$$c_n(k) = -2\ln(L_n(k))/n + k\ln(n)/n - 1 - \ln(2\pi),$$
where again $k$ is the number of parameters, including the (G)ARCH parameters.
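The comparability adjustment of Section 3.3 amounts to simple arithmetic. The sketch below (not part of the note) contrasts the criterion computed from $\ln(\hat{\sigma}^2_p)$ for a homoskedastic model with the adjusted criterion computed from the full log-likelihood of a (G)ARCH-extended model; the sample size, the log-likelihood value, the error variance estimate, and the parameter counts are hypothetical placeholders for estimation output.

```python
# Sketch of the comparability adjustment in Section 3.3.
import numpy as np

def criterion_homoskedastic(sigma2_hat, k, n, phi):
    # For a Gaussian homoskedastic model this equals
    # -2 ln(L_n(k))/n + k phi(n)/n - 1 - ln(2 pi), see (10) with d = 1.
    return np.log(sigma2_hat) + k * phi / n

def criterion_garch(loglik, k, n, phi):
    # -2 ln(L_n(k))/n + k phi(n)/n - 1 - ln(2 pi), comparable with the function above.
    return -2.0 * loglik / n + k * phi / n - 1.0 - np.log(2.0 * np.pi)

n = 1000
phi = np.log(n)                                                       # Schwarz penalty; 2 or 2 ln(ln n) likewise
print(criterion_homoskedastic(sigma2_hat=1.21, k=3, n=n, phi=phi))    # e.g. AR(2) with intercept
print(criterion_garch(loglik=-1480.5, k=6, n=n, phi=phi))             # e.g. AR(2) with GARCH(1,1) errors
```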

References

Akaike, H. (1974): "A New Look at the Statistical Model Identification," IEEE Transactions on Automatic Control, AC-19, 716-723.

Akaike, H. (1976): "Canonical Correlation Analysis of Time Series and the Use of an Information Criterion," in R. K. Mehra and D. G. Lainiotis (eds.), System Identification: Advances and Case Studies, Academic Press, New York, 52-107.

Hannan, E. J. (1980): "The Estimation of the Order of an ARMA Process," Annals of Statistics, 8, 1071-1081.

Hannan, E. J., and B. G. Quinn (1979): "The Determination of the Order of an Autoregression," Journal of the Royal Statistical Society, B, 41, 190-195.

Schwarz, G. (1978): "Estimating the Dimension of a Model," Annals of Statistics, 6, 461-464.