
1 Least Squares Model Averaging. Bruce E. Hansen, University of Wisconsin. January 2006; Revised: August 2006.

2 Introduction. This paper develops a model averaging estimator for linear regression. Model averaging is a generalization of model selection. We propose a Mallows criterion to select the weights. These weights are optimal in the sense that they asymptotically minimize the conditional squared error. A technical caveat is that the proof restricts attention to discrete weights, while continuous weights are desired. A substantive caveat is that we restrict attention to homoskedastic errors. (It may be possible to extend the method to heteroskedastic errors by using CV.) A companion paper applies the method to forecast combination.

3 Model Selection. AIC: Akaike (1973); Mallows $C_p$: Mallows (1973); BIC: Schwarz (1978); Cross-Validation: Stone (1974); Generalized Cross-Validation: Craven and Wahba (1979); Focused Information Criterion: Claeskens and Hjort (2003); GMM Selection: Andrews and Lu (2001); EL Selection: Hong, Preston and Shum (2003).

4 Model Averaging. Bayesian Model Averaging: Draper (1995), Raftery, Madigan and Hoeting (1997). Exponential AIC weights: Buckland et al. (1997), Burnham and Anderson (2002). Mixing estimator: Yang (2001). Risk properties: Leung and Barron (2004), Hjort and Claeskens (2003).

5 Selection of terms in series expansion: Andrews (1991a), Newey (1997). Asymptotic optimality of model selection rules: Shibata (1980, 1981, 1983): AIC in Gaussian regression. Lee and Karagrigoriou (2001): AIC in non-Gaussian regression. Li (1987): Mallows, GCV and CV in homoskedastic regression. Andrews (1991b): generalized Mallows, GCV and CV in heteroskedastic regression.

6 Model
$$y_i = \mu_i + e_i, \qquad \mu_i = \sum_{j=1}^{\infty}\theta_j x_{ji}, \qquad E(e_i \mid x_i) = 0, \qquad E(e_i^2 \mid x_i) = \sigma^2.$$
The coefficients and regressors may be terms in a series expansion, or a large number of potential regressors. The goal is to estimate $\mu_i$ using a finite sample. There are an infinite number of coefficients. The regressors are ordered, at least by groups.

7 Approximating Models
The $m$th approximating model uses the first $k_m$ regressors:
$$y_i = \sum_{j=1}^{k_m}\theta_j x_{ji} + b_{mi} + e_i, \qquad \text{approximation error } b_{mi} = \sum_{j=k_m+1}^{\infty}\theta_j x_{ji}.$$
In matrix notation, $Y = X_m\Theta_m + b_m + e$. The least-squares estimate of $\Theta_m$ is
$$\hat\Theta_m = (X_m'X_m)^{-1}X_m'Y = \Theta_m + (X_m'X_m)^{-1}X_m'b_m + (X_m'X_m)^{-1}X_m'e.$$
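
A minimal sketch (not from the paper) of computing these nested least-squares fits, assuming NumPy, a response vector y, and a design matrix X whose columns are already in the intended nesting order:

```python
import numpy as np

def fit_nested_models(y, X, model_sizes):
    """For each k_m in model_sizes, fit LS on the first k_m columns of X;
    return the coefficient vector Theta_hat_m and the fitted values X_m Theta_hat_m."""
    fits = []
    for k in model_sizes:
        Xm = X[:, :k]
        theta_hat, *_ = np.linalg.lstsq(Xm, y, rcond=None)  # (X_m'X_m)^{-1} X_m'Y
        fits.append((theta_hat, Xm @ theta_hat))
    return fits
```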

8 Model Selection
Let
$$\hat\sigma^2(m) = \frac{1}{n}\left(Y - X_m\hat\Theta_m\right)'\left(Y - X_m\hat\Theta_m\right).$$
The BIC, AIC, Mallows, and CV selection rules pick $m$ to minimize
$$BIC(m) = \ln\hat\sigma^2(m) + \frac{\ln(n)\,k_m}{n}, \qquad AIC(m) = \ln\hat\sigma^2(m) + \frac{2k_m}{n},$$
$$C(m) = \hat\sigma^2(m) + \frac{2k_m}{n}\sigma^2, \qquad CV(m) = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat\mu_{i,-i}(m)\right)^2,$$
where $\hat\mu_{i,-i}(m)$ is the leave-one-out estimate of $\mu_i$. If $k_m$ is small relative to $n$, the AIC, Mallows and CV rules are often quite similar.
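
A minimal sketch (not from the paper) of these four criteria, assuming NumPy, nested columns of X, and an estimate sigma2 of the error variance for the Mallows term:

```python
import numpy as np

def selection_criteria(y, X, model_sizes, sigma2):
    """Per-observation BIC, AIC, Mallows C, and leave-one-out CV for each
    nested model, following the definitions on this slide."""
    n = len(y)
    out = {}
    for k in model_sizes:
        Xm = X[:, :k]
        H = Xm @ np.linalg.solve(Xm.T @ Xm, Xm.T)   # projection matrix P_m
        resid = y - H @ y
        s2 = resid @ resid / n                       # sigma_hat^2(m)
        loo = resid / (1.0 - np.diag(H))             # leave-one-out residuals
        out[k] = {"BIC": np.log(s2) + np.log(n) * k / n,
                  "AIC": np.log(s2) + 2 * k / n,
                  "Mallows": s2 + 2 * k * sigma2 / n,
                  "CV": np.mean(loo ** 2)}
    return out
```
Each rule then picks the model size k minimizing the corresponding entry.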

9 Properties of Model Selection Rules
If the true model is finite, BIC is consistent; AIC/Mallows over-select. If the true model is infinite-dimensional, Li (Annals of Statistics, 1987) showed that Mallows and CV selection are asymptotically optimal.
Mallows selected model: $\hat m = \arg\min_m C(m)$.
Average squared error: $L_n(m) = \left(\mu - X_m\hat\Theta_m\right)'\left(\mu - X_m\hat\Theta_m\right)$.
Optimality: as $n \to \infty$,
$$\frac{L_n(\hat m)}{\inf_m L_n(m)} \to_p 1.$$

10 Model Averaging
Let $W = (w_1, \ldots, w_M)'$ be a vector of non-negative weights. It is an element of the unit simplex in $\mathbb{R}^M$:
$$H_n = \left\{ W \in [0,1]^M : \sum_{m=1}^{M} w_m = 1 \right\}.$$
A model average estimator of $\Theta_M$ is
$$\hat\Theta(W) = \sum_{m=1}^{M} w_m\,\hat\Theta_m^0,$$
where $\hat\Theta_m^0$ denotes $\hat\Theta_m$ padded with zeros to the length of the largest model.

11 The model average estimate of $\mu$ is
$$\hat\mu(W) = X_M\hat\Theta(W) = \sum_{m=1}^{M} w_m X_m\hat\Theta_m = \sum_{m=1}^{M} w_m X_m(X_m'X_m)^{-1}X_m'Y = \sum_{m=1}^{M} w_m P_m Y = P(W)Y,$$
where $P(W) = \sum_{m=1}^{M} w_m P_m$.
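
A small illustrative sketch (not from the paper) of the averaged fit $\hat\mu(W) = \sum_m w_m P_m Y$, assuming NumPy, a list X_list of nested design matrices, and a weight vector w on the simplex:

```python
import numpy as np

def model_average_fit(y, X_list, w):
    """Model-average fitted values: the weighted sum over models of w_m * P_m y,
    where P_m y is the least-squares fit from the m-th design matrix."""
    mu_hat = np.zeros(len(y))
    for w_m, Xm in zip(w, X_list):
        theta_m = np.linalg.lstsq(Xm, y, rcond=None)[0]
        mu_hat += w_m * (Xm @ theta_m)               # w_m * P_m y
    return mu_hat
```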

12 Lemma 1
1. $\mathrm{tr}(P(W)) = \sum_{m=1}^{M} w_m k_m \equiv k(W)$.
2. $\mathrm{tr}(P(W)P(W)) = \sum_{m=1}^{M}\sum_{l=1}^{M} w_m w_l \min(k_l, k_m) = W'\Gamma_M W$, where
$$\Gamma_M = \begin{pmatrix} k_1 & k_1 & k_1 & \cdots & k_1 \\ k_1 & k_2 & k_2 & \cdots & k_2 \\ k_1 & k_2 & k_3 & \cdots & k_3 \\ \vdots & & & \ddots & \vdots \\ k_1 & k_2 & k_3 & \cdots & k_M \end{pmatrix}.$$
3. $\lambda_{\max}(P(W)) \le 1$.

13 Lemma 2. Let $L_n(W) = (\hat\mu(W) - \mu)'(\hat\mu(W) - \mu)$. Then
$$R_n(W) = E\left(L_n(W) \mid X\right) = W'\left(A_n + \sigma^2\Gamma_M\right)W,$$
where $\Gamma_M$ is as in Lemma 1,
$$A_n = \begin{pmatrix} a_1 & a_2 & a_3 & \cdots & a_M \\ a_2 & a_2 & a_3 & \cdots & a_M \\ a_3 & a_3 & a_3 & \cdots & a_M \\ \vdots & & & \ddots & \vdots \\ a_M & a_M & a_M & \cdots & a_M \end{pmatrix},$$
and $a_m = b_m'(I - P_m)b_m$ are monotonically decreasing. (Recall that $b_m$ are the approximation errors: $Y = X_m\Theta_m + b_m + e$.)

14 Lemma 2 shows that $R_n(W)$ is quadratic in $W$. Contour sets are ellipses in $\mathbb{R}^M$. The (infeasible) optimal weight vector minimizes $R_n(W)$ over $W \in H_n$. It necessarily puts non-zero weight on at least two models (unless $a_1 = a_m$). To see this, suppose that $M = 2$ and, WLOG, $k_1 = 0$ and $a_2 = 0$. Then
$$R_n(W) = a_1 w_1^2 + \sigma^2 k_2 w_2^2,$$
which is uniquely minimized by
$$w_1 = \frac{\sigma^2 k_2}{a_1 + \sigma^2 k_2}.$$

15 Weight Selection: Existing Approaches
How should the weight vector be selected in practice?
Bayesian Model Averaging (BMA). If $\pi_m$ are the prior probabilities, then
$$w_m \propto \exp\left(-\frac{n}{2}BIC(m)\right)\pi_m.$$
If all models are given equal prior weight, then $w_m \propto \exp\left(-\frac{n}{2}BIC(m)\right)$.
Smoothed AIC (SAIC) was introduced by Buckland et al. (1997) and embraced by Burnham and Anderson (2002) and Hjort and Claeskens (2003):
$$w_m \propto \exp\left(-\frac{n}{2}AIC(m)\right).$$
Hjort and Claeskens (2003) also discuss an Empirical Bayes smoother.
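
A minimal sketch (not from the paper) of these exponential weighting rules, assuming the criterion values are passed on the total (n-scaled) metric so that $w_m \propto \exp(-\mathrm{ic}_m/2)$:

```python
import numpy as np

def smoothed_weights(ic_values):
    """Exponential weights w_m proportional to exp(-ic_m / 2), e.g. applied to
    n*AIC(m) or n*BIC(m); subtracting the minimum keeps the exponentials
    numerically stable without changing the normalized weights."""
    ic = np.asarray(ic_values, dtype=float)
    w = np.exp(-(ic - ic.min()) / 2.0)
    return w / w.sum()
```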

16 Mallows Criterion
Define the residual vector $\hat e(W) = Y - X_M\hat\Theta(W)$ and the average (or effective) number of parameters $k(W) = \sum_{m=1}^{M} w_m k_m$. The Mallows criterion for the model average estimator is
$$C_n(W) = \hat e(W)'\hat e(W) + 2\sigma^2 k(W) = W'\bar e'\bar e W + 2\sigma^2 K'W,$$
where $\bar e = (\hat e_1, \ldots, \hat e_M)$ is the matrix of the individual models' residual vectors and $K = (k_1, \ldots, k_M)'$.

17 The Mallows selected weight vector is
$$\hat W = \arg\min_{W \in H_n} C_n(W).$$
The solution is found numerically. This is classic quadratic programming with boundary constraints, for which algorithms are widely available.
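
A minimal sketch (not from the paper) of this weight-selection step, assuming NumPy/SciPy and a list X_list of candidate design matrices (the first $k_1, \ldots, k_M$ columns); it minimizes $C_n(W)$ over the simplex with a general-purpose constrained optimizer rather than a dedicated QP routine:

```python
import numpy as np
from scipy.optimize import minimize

def mma_weights(y, X_list, sigma2):
    """Mallows model averaging: choose W on the unit simplex to minimize
    C_n(W) = W' (e_bar' e_bar) W + 2 * sigma2 * K'W, where column m of
    e_bar is the residual vector of model m and K holds the k_m."""
    M = len(X_list)
    e_bar = np.column_stack([y - Xm @ np.linalg.lstsq(Xm, y, rcond=None)[0]
                             for Xm in X_list])          # n x M residual matrix
    K = np.array([Xm.shape[1] for Xm in X_list], float)  # parameter counts
    Q = e_bar.T @ e_bar

    def criterion(w):
        return w @ Q @ w + 2.0 * sigma2 * K @ w

    w0 = np.full(M, 1.0 / M)                             # start at equal weights
    res = minimize(criterion, w0, method="SLSQP",
                   bounds=[(0.0, 1.0)] * M,
                   constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
    return res.x
```
Since $\bar e'\bar e$ is positive semi-definite, the criterion is convex in $W$, so a dedicated quadratic-programming solver can be substituted directly.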

18 First Justification
The Mallows criterion is an unbiased estimate of the expected squared error.
Lemma 3. $E\,C_n(W) = E\,L_n(W) + n\sigma^2$.
This is a classic, well-known property of the Mallows criterion.

19 Second Justification
For some finite integer $N$, let the weights $w_m$ be restricted to $\left\{0, \frac{1}{N}, \frac{2}{N}, \ldots, 1\right\}$, and let $H_n(N)$ be the subset of $H_n$ restricted to this set of weights. The Mallows weights restricted to the discrete set $H_n(N)$ are
$$\hat W_N = \arg\min_{W \in H_n(N)} C_n(W).$$
These weights are optimal in the sense of asymptotic minimization of conditional squared error.

20 Theorem 1. As $n \to \infty$, if $\xi_n = \inf_{W \in H_n} R_n(W) \to \infty$ almost surely, and $E\left(|e_i|^{4(N+1)} \mid x_i\right) < \infty$, then
$$\frac{L_n\left(\hat W_N\right)}{\inf_{W \in H_n(N)} L_n(W)} \to_p 1.$$
The proof is an application of Theorem 2.1 of Li (1987). No restriction is placed on $M$, the largest model.

21 Comment on the assumption
$$\xi_n = \inf_{W \in H_n} R_n(W) \to \infty,$$
which is typical in the analysis of non-parametric models. Recall that
$$R_n(W) = E\left(L_n(W) \mid X\right), \qquad L_n(W) = (\hat\mu(W) - \mu)'(\hat\mu(W) - \mu).$$
The assumption means that there is no correct finite-dimensional model. Otherwise, if there is a correct model $m_0$, then $\inf_{W \in H_n} R_n(W) = O(1)$. In this case, the Mallows weights will not be optimal.

22 Technical Difficulty
The proof of the theorem requires that the function $L_n(W)$ be uniformly well behaved as a function of the weight vector $W$. This requires a stochastic equicontinuity property over $W \in H_n$. The difficulty is that the dimension of $H_n$ grows with $M$, which is growing with $n$. This can be made less binding by selecting $N$ large. While $w_m$ is restricted to a set of discrete weights, there is no restriction on $M$, the largest model allowed. It would be desirable to allow for continuous weights $w_m$.

23 Estimation of $\sigma^2$ for the Mallows Criterion
The Mallows criterion depends on the unknown $\sigma^2$, which must be replaced with an estimate. It is desirable for the estimate to have low bias. One choice is
$$\hat\sigma^2_K = \frac{1}{n-K}\left(Y - X_K\hat\Theta_K\right)'\left(Y - X_K\hat\Theta_K\right),$$
where $K$ corresponds to a large approximating model.
Theorem 2. If $K \to \infty$ and $K/n \to 0$ as $n \to \infty$, then $\hat\sigma^2_K \to_p \sigma^2$.
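
A minimal sketch (not from the paper) of this variance estimate, assuming NumPy and that the first K columns of X form the large approximating model:

```python
import numpy as np

def sigma2_large_model(y, X, K):
    """Degrees-of-freedom-corrected residual variance from the model using
    the first K regressors: e'e / (n - K)."""
    XK = X[:, :K]
    resid = y - XK @ np.linalg.lstsq(XK, y, rcond=None)[0]
    return resid @ resid / (len(y) - K)
```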

24 Simulation Experiment
$$y_i = \sum_{j=1}^{\infty}\theta_j x_{ji} + e_i, \qquad \theta_j = c\sqrt{2\alpha}\,j^{-\alpha-1/2}, \qquad e_i \sim N(0,1).$$
$x_{1i} = 1$; the remaining $x_{ji}$ are iid $N(0,1)$. The coefficient $c$ is used to control $R^2 = c^2/(1+c^2)$. $n = 50, 150, 400, 1000$; $\alpha = 0.5$, $1.0$ and $1.5$. Larger $\alpha$ implies that the coefficients $\theta_j$ decline more quickly with $j$. $M = 3n^{1/3}$ (maximum model order).
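
A minimal sketch (not from the paper) of this data-generating process, with the infinite sum truncated at an assumed 1000 regressors for simulation purposes:

```python
import numpy as np

def simulate_dgp(n, alpha, c, n_regressors=1000, seed=None):
    """Draw one sample from y_i = sum_j theta_j x_ji + e_i with
    theta_j = c*sqrt(2*alpha)*j**(-alpha-1/2), x_1i = 1, the remaining
    x_ji iid N(0,1), and e_i iid N(0,1)."""
    rng = np.random.default_rng(seed)
    j = np.arange(1, n_regressors + 1)
    theta = c * np.sqrt(2 * alpha) * j ** (-alpha - 0.5)
    X = rng.standard_normal((n, n_regressors))
    X[:, 0] = 1.0                                # first regressor is a constant
    y = X @ theta + rng.standard_normal(n)
    return y, X
```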

25 Methods
1. AIC model selection (AIC)
2. Mallows model selection (Mallows)
3. Smoothed AIC (S-AIC): $w_m \propto \exp\left(-\frac{1}{2}AIC_m\right)$
4. Smoothed BIC (S-BIC): $w_m \propto \exp\left(-\frac{1}{2}BIC_m\right)$
5. Mallows Model Averaging (MMA)
The estimators are compared by expected squared error (risk), averaged across 100,000 simulations. Risk is normalized by dividing by the risk of the infeasible best-fitting LS model.

26–28 [Figures: simulation risk results comparing the methods; content not preserved in the transcription.]

29 Application: U.S. Unemployment Rate Changes
Monthly (120 observations). AR(m) models, $0 \le m \le 12$. AIC and Mallows select $\hat m = 3$. $\hat W$ sets $\hat w_0 = .10$, $\hat w_1 = .17$, $\hat w_3 = .34$, $\hat w_9 = .39$.

30 [Table: coefficient estimates on $UR_{t-1}$ through $UR_{t-12}$ under OLS, AIC selection, and model averaging (MA); numerical entries not preserved in the transcription.]

31 Least Squares Forecast Averaging. Bruce E. Hansen, University of Wisconsin. March 2006.

32 Forecast Combination
The goal is to construct a point forecast $f_{n+1}$ of a target $y_{n+1}$ given $x_n \in \mathbb{R}^k$. There are $M$ models, each producing a forecast $\hat f_{n+1}(m)$. Model $m$ has $k_m$ free parameters. A forecast combination takes the form
$$\hat f_{n+1}(W) = \sum_{m=1}^{M} w_m \hat f_{n+1}(m),$$
where $W = (w_1, \ldots, w_M)$ are non-negative weights which satisfy $\sum_{m=1}^{M} w_m = 1$. Many methods have been proposed for the selection of weights.

33 Two Recent Successes: (1) Equal Weights
Reference: Stock and Watson (1999, 2004, 2005, 2006). $w_m = 1/M$. Excellent performance in empirical studies, but inherently arbitrary. It depends critically on the class of forecasts $\hat f_{n+1}(m)$: if a terrible model is included in the set, the forecast will suffer. In this sense, equal weights is an incomplete forecast combination strategy.

34 (2) Bayesian Model Averaging (BMA)
Reference: Wright (2003a, b). BMA with uninformative priors approximately sets
$$w_m = \frac{\exp\left(-\frac{1}{2}BIC_m\right)}{\sum_{j=1}^{M}\exp\left(-\frac{1}{2}BIC_j\right)},$$
where the BIC for model $m$ is $BIC_m = n\ln(\hat\sigma^2_m) + k_m\ln n$.
Advantage: excellent performance in empirical studies.
Deficiencies: (a) Bayesian methods rely on priors, and thus are inherently arbitrary. (b) BMA assumes that the truth is parametric; this paradigm is misspecified when models are approximations.

35 Mean-Squared Forecast Error
Define the forecast error $e_{n+1} = y_{n+1} - f_{n+1}$ and its variance $\sigma^2 = E e_{n+1}^2$. The mean-squared forecast error is
$$MSE(W) = E\left(y_{n+1} - \hat f_{n+1}(W)\right)^2 = E e_{n+1}^2 + E\left(\hat f_{n+1}(W) - f_{n+1}\right)^2.$$
Thus minimizing the mean-squared forecast error is the same as minimizing the squared error of the estimated conditional mean. I propose using the MMA weights for forecast combination among linear models.

36 Finite Sample Investigation: Static Model
Infinite-order iid regression:
$$y_i = \sum_{j=1}^{\infty}\theta_j x_{ji} + e_i, \qquad \theta_j = c\sqrt{2\alpha}\,j^{-\alpha-1/2}, \qquad e_i \sim N(0,1).$$
$x_{1i} = 1$; the remaining $x_{ji}$ are iid $N(0,1)$. $R^2 = c^2/(1+c^2)$ is controlled by the parameter $c$. $n = 50, 100$, and $200$; $\alpha \in \{0.5, 1.0, 1.5, 2.0\}$. Larger $\alpha$ implies that the coefficients $\theta_j$ decline more quickly with $j$. $M = 3n^{1/3}$.

37–39 [Figures: finite-sample MSE results for the static-model simulation; content not preserved in the transcription.]

40 Finite Sample Investigation: Dynamic Model
Univariate time series $y_t$, generated by an MA(m) process:
$$y_t = (1 + \psi L)^m e_t, \qquad e_t \sim N(0,1), \qquad \psi \in [0.1, 0.9], \qquad m \in \{1, 2, 3, 4\}.$$
Forecasting models: AR(p), $0 \le p \le M = 2n^{1/3}$. $n = 50, 100$, and $200$.
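
A minimal sketch (not from the paper) of generating this MA(m) process, using the binomial expansion of $(1+\psi L)^m$ for the moving-average weights:

```python
import numpy as np
from math import comb

def simulate_ma(n, psi, m, seed=None):
    """Generate y_t = (1 + psi*L)^m e_t with e_t iid N(0,1), by convolving
    the shocks with the binomial coefficients of (1 + psi*L)^m."""
    rng = np.random.default_rng(seed)
    coefs = np.array([comb(m, j) * psi ** j for j in range(m + 1)])  # MA weights
    e = rng.standard_normal(n + m)        # extra draws supply pre-sample lags
    return np.convolve(e, coefs, mode="full")[m : m + n]
```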

41–43 [Figures: finite-sample MSE results for the dynamic-model simulation; content not preserved in the transcription.]

44 Empirical Illustration
$y_t$ = Unemployment Rate (monthly); $x_t$ = Interest Rate Spread (5-year Treasury minus Fed Funds Rate).
h-step-ahead linear forecasts:
$$y_{t+h-1} = \alpha_0 + \alpha_1 y_{t-1} + \cdots + \alpha_p y_{t-p} + \beta_1 x_{t-1} + \cdots + \beta_q x_{t-q} + e_t.$$
$p \in \{1, 2, 3, 4, 5, 6\}$, $q \in \{0, 1, 2, 3, 4\}$: 30 models. Sample Period: 1959: :02. Recursive out-of-sample forecasts, starting with 120 observations.
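
A minimal sketch (not from the paper) of building the regression for one of these models; the lag alignment and index conventions are illustrative assumptions:

```python
import numpy as np

def direct_forecast_design(y, x, p, q, h):
    """Target and regressor matrix for the direct h-step regression of
    y_{t+h-1} on a constant, y_{t-1},...,y_{t-p}, and x_{t-1},...,x_{t-q}."""
    T = len(y)
    start = max(p, q)                     # first t with all lags available
    targets, rows = [], []
    for t in range(start, T - h + 1):
        ylags = [y[t - j] for j in range(1, p + 1)]
        xlags = [x[t - j] for j in range(1, q + 1)]
        rows.append([1.0] + ylags + xlags)
        targets.append(y[t + h - 1])
    return np.array(targets), np.array(rows)
```
Fitting each of the 30 $(p, q)$ combinations this way yields the residual vectors and parameter counts needed to compute the MMA weights.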

45 [Figure: content not preserved in the transcription.]

46 Out-of-Sample Mean Squared Error for Unemployment Rate Forecasts
[Table: columns Equal, BIC Weights, and MMA Weights; one row per forecast horizon h; numerical entries not preserved in the transcription.]
Note: MSE is scaled relative to BIC selection.

47 MMA Weights: 6-step-ahead Forecast Computed on Full Sample
[Table: weights by model, rows $p = 1, \ldots, 6$ and columns $q = 0, \ldots, 4$; numerical entries not preserved in the transcription.]

48 BIC Weights: 6-step-ahead Forecast Computed on Full Sample
[Table: weights by model, rows $p = 1, \ldots, 6$ and columns $q = 0, \ldots, 4$; numerical entries not preserved in the transcription.]

49 Summary Econometric models are approximations. Model selection should be guided by approximation accuracy. Squared error is a convenient criterion. Averaging estimates across models has the potential to improve estimation efficiency. Mallows Criterion may be used to select the model average weights. It has an asymptotic optimality property.

50 Cautionary Remarks
The Mallows criterion assumes homoskedasticity. A cross-validation criterion may allow for heteroskedasticity (under investigation with Jeff Racine). Inference using model average estimates is difficult: there are no useful standard errors for the model average estimators, the estimates have non-standard distributions and are non-pivotal, and it is unclear how to form confidence intervals from the estimates.
