
ESSAYS ON MODEL AVERAGING

by

Chu-An Liu

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Economics) at the UNIVERSITY OF WISCONSIN-MADISON, 2012.

Date of final oral examination: 5/7/12

The dissertation is approved by the following members of the Final Oral Committee:
Bruce Hansen, Professor, Economics
Jack Porter, Professor, Economics
Xiaoxia Shi, Assistant Professor, Economics
Andres Aradillas-Lopez, Assistant Professor, Economics
Chunming Zhang, Professor, Statistics

© Copyright by Chu-An Liu 2012
All Rights Reserved

ACKNOWLEDGMENTS

I would not have been able to complete this dissertation without the help and support of many people. Words cannot express my gratitude to my advisor, Bruce Hansen, for his guidance, assistance, and encouragement at all stages of this research. He has always given me wise advice and pointed me in the right direction. I am indebted to Jack Porter for his valuable advice and for teaching me the attitude of doing research. I would like to thank Xiaoxia Shi for useful advice and tips on being an effective speaker. I also appreciate the help I got from Yu-Chin Hsu, Biing-Shen Kuo, and Shinn-Juh Lin. Finally, I want to thank Andres Aradillas-Lopez and Chunming Zhang for serving on my oral defense committee.

Thanks to many friends, the past five years in Madison have been a fantastic and unforgettable journey for me. First, I would like to thank Ying-Ying Lee, Hsueh-Hsiang (Cher) Li, and Jen-Che Liao, who started the journey from the very beginning with me and have continued to support me. Next, I would like to thank my reading group partners: Andrew (Drew) Anderson, Laura Dague, Shengjie Hong, SeoJeong (Jay) Lee, Enrique Pinzon Garcia, Nelson Ramirez Rondan, Mai Seki, Naoya Sueishi, Jing Tao, and Jin Yan, for helpful discussions and insightful comments. I also want to thank Chih-Sheng Hsieh, Roger (Zak) Koeckeritz, Ching-Yang (Jim) Lin, Wang-Chang (Vito) Tsao, Hui-Chi (Brooke) Wu, and Cheng-Ying (Anita) Yang for warm support and encouragement. Last, but not least, I am most grateful to my family, especially my mother Mei-Hua Hsu and my father I-Chia Liu, for always being there for me. I could not have started and finished this journey without their love, support, and encouragement.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ABSTRACT

1 A Plug-In Averaging Estimator for Regressions with Heteroskedastic Errors
  1.1 Introduction
  1.2 Model and Estimation
  1.3 Asymptotic Properties
  1.4 Plug-In Averaging Estimator
  1.5 AIC, S-AIC and JMA Estimators
    1.5.1 AIC and Smoothed AIC
    1.5.2 Jackknife Model Averaging Estimator
    1.5.3 Model Averaging for Two Models
  1.6 Simulation Results
    1.6.1 Simulation Setup
    1.6.2 Finite Sample Comparison
    1.6.3 Robust Simulation
    1.6.4 Asymptotic Comparison
  1.7 Confidence Intervals
    1.7.1 Asymptotic Quantiles
    1.7.2 Coverage Probabilities
  1.8 An Empirical Example
  1.9 Conclusion

2 Model Averaging in Predictive Regressions
  2.1 Introduction
  2.2 Model and Estimation
  2.3 MSE and MSFE
  2.4 Weight Selection
  2.5 Finite Sample Investigation
    2.5.1 Simulation Experiment I: Regression Model
    2.5.2 Simulation Experiment II: MAX(1,1) Model
  2.6 Conclusion

3 Averaging Estimators for Kernel Regressions
  3.1 Introduction
  3.2 Model and Estimation
  3.3 Asymptotic Properties
  3.4 Cross-Validation
  3.5 Simulations
  3.6 Conclusion

APPENDICES
  Appendix A: Proofs, Tables, and Figures in Chapter 1
  Appendix B: Proofs and Figures in Chapter 2
  Appendix C: Proofs, Figures, and Tables in Chapter 3

BIBLIOGRAPHY

LIST OF TABLES

Appendix Table
A.1  Maximum risk for DGP 1 and DGP 2
A.2  Maximum regret for DGP 1 and DGP 2
A.3  Maximum risk and maximum regret for DGP 3 and DGP 4
A.4  Coverage probabilities of 90% and 95% confidence intervals
A.5  Coefficient estimates and standard errors for Model Setup A
A.6  Coefficient estimates and standard errors for Model Setup B
A.7  Weights placed on each submodel for Model Setup A
A.8  Weights placed on each submodel for Model Setup B
A.9  Regressor set of the submodel for Model Setup A
A.10 Regressor set of the submodel for Model Setup B
C.1  MSEE results for DGP 1 with x ~ N(0,1) and e ~ N(0,1)
C.2  MSEE results for DGP 2 with x ~ N(0,1) and e ~ N(0,1)
C.3  MSEE results for DGP 3 with x ~ N(0,1) and e ~ N(0,1)
C.4  MSEE results for DGP 1 with x ~ U(0,1) and e ~ N(0,1)
C.5  MSEE results for DGP 2 with x ~ U(0,1) and e ~ N(0,1)
C.6  MSEE results for DGP 3 with x ~ U(0,1) and e ~ N(0,1)
C.7  MSEE results for DGP 1 with x ~ 0.5N(1,1)+0.5N(2.75,0.25) and e ~ N(0,1)
C.8  MSEE results for DGP 2 with x ~ 0.5N(1,1)+0.5N(2.75,0.25) and e ~ N(0,1)
C.9  MSEE results for DGP 3 with x ~ 0.5N(1,1)+0.5N(2.75,0.25) and e ~ N(0,1)
C.10 MSEE results for DGP 1 with x ~ N(0,1) and e ~ N(0, x_i²)
C.11 MSEE results for DGP 2 with x ~ N(0,1) and e ~ N(0, x_i²)
C.12 MSEE results for DGP 3 with x ~ N(0,1) and e ~ N(0, x_i²)

LIST OF FIGURES

Appendix Figure
A.1  Risk functions for DGP 1, σ_i² = 1, ρ_1 = 0.3, ρ_2 = 0.1
A.2  Risk functions for DGP 2, σ_i² = 1, ρ_1 = 0.3, ρ_2 = 0.1
A.3  Risk functions for DGP 1, σ_i² = x_{2i}², ρ_1 = 0.3, ρ_2 = 0.1
A.4  Risk functions for DGP 2, σ_i² = x_{2i}², ρ_1 = 0.3, ρ_2 = 0.1
A.5  Risk functions for DGP 1, σ_i² = 1, ρ_1 = 0.6, ρ_2 = 0.4
A.6  Risk functions for DGP 2, σ_i² = 1, ρ_1 = 0.6, ρ_2 = 0.4
A.7  Risk functions for DGP 1, σ_i² = x_{2i}², ρ_1 = 0.6, ρ_2 = 0.4
A.8  Risk functions for DGP 2, σ_i² = x_{2i}², ρ_1 = 0.6, ρ_2 = 0.4
A.9  Risk functions for DGP 3, σ_i² = 1, ρ_1 = 0.3, ρ_2 = 0.1
A.10 Risk functions for DGP 4, σ_i² = 1, ρ_1 = 0.3, ρ_2 = 0.1
A.11 Asymptotic risk for DGP 1, σ_i² = 1
A.12 Asymptotic risk for DGP 1, σ_i² = x_{2i}²
A.13 Asymptotic quantiles for DGP 1, σ_i² = 1
A.14 Asymptotic quantiles for DGP 1, σ_i² = x_{2i}²
B.1  Relative risk for the regression model, homoskedastic errors, ρ_x =
B.2  Relative risk for the regression model, heteroskedastic errors, ρ_x =
B.3  Relative risk for the regression model, homoskedastic errors, ρ_x =
B.4  Relative risk for the regression model, heteroskedastic errors, ρ_x =
B.5  Relative risk for the regression model, homoskedastic errors, ρ_x = 0.5, small
B.6  Relative risk for the regression model, heteroskedastic errors, ρ_x = 0.5, small
B.7  Relative risk for the MAX(1,1) model, homoskedastic errors
B.8  Relative risk for the MAX(1,1) model, heteroskedastic errors
B.9  Model size for the MAX(1,1) model, homoskedastic errors
B.10 Model size for the MAX(1,1) model, heteroskedastic errors
C.1  Asymptotic comparison
C.2  Asymptotic comparison
C.3  Densities of the cross-validated weights, x ~ N(0,1), e ~ N(0,1)
C.4  Densities of the cross-validated weights, x ~ U(0,1), e ~ N(0,1)
C.5  Densities of the cross-validated weights, x ~ 0.5N(1,1)+0.5N(2.75,0.25), e ~ N(0,1)
C.6  Densities of the cross-validated weights, x ~ N(0,1), e ~ N(0, σ_i²), σ_i² = x_i²

ABSTRACT

This dissertation is a collection of three essays on model averaging, organized in the form of three chapters.

The first chapter proposes a new model averaging estimator for the linear regression model with heteroskedastic errors. We address the issues of how to optimally assign the weights for candidate models and how to make inference based on the averaging estimator. We first derive the asymptotic distribution of the averaging estimator with fixed weights in a local asymptotic framework, which allows us to characterize the optimal weights. The optimal weights are obtained by minimizing the asymptotic mean squared error. Second, we propose a plug-in estimator of the optimal weights and use these estimated weights to construct a plug-in averaging estimator of the parameter of interest, and we derive the asymptotic distribution of the proposed estimator. Third, we show that confidence intervals based on normal approximations lead to distorted inference in this context. We suggest a plug-in method to construct confidence intervals, which have good finite-sample coverage probabilities. Monte Carlo simulations show that the plug-in averaging estimator has much lower expected squared error than existing model selection and model averaging methods, and that it achieves the minimax risk and minimax regret. As an empirical illustration, the proposed methodology is applied to cross-country growth regressions.

The second chapter investigates model combination in a predictive regression. We derive the mean squared forecast error (MSFE) of the model averaging estimator in a local asymptotic framework. We show that the optimal model weights which minimize the MSFE depend on the local parameters and the covariance matrix of the predictive regression. We propose a plug-in estimator of the optimal weights and use these estimated weights to construct the forecast combination. Monte Carlo simulations show that our averaging estimator has much lower MSFE than the weighted AIC, the weighted BIC, and the Jackknife model averaging estimators.

The third chapter proposes a model averaging approach to reduce the mean squared error (MSE) and weighted integrated mean squared error (WIMSE) of kernel estimators of regression functions. At each point of estimation, we construct a weighted average of the local constant and local linear estimators. The optimal local and global weights for averaging are chosen to minimize the MSE and WIMSE of the averaging estimator, respectively. We propose two data-driven approaches for bandwidth and weight selection and derive the rate of convergence of the cross-validated weights to their optimal benchmark values. Monte Carlo simulations show that the proposed estimator can achieve significant efficiency gains over the local constant and local linear estimators.

Chapter 1

A Plug-In Averaging Estimator for Regressions with Heteroskedastic Errors

1.1 Introduction

In recent years, interest has increased in model averaging from the frequentist perspective. Unlike model selection, which picks a single model among the candidate models, model averaging incorporates all available information by averaging over all potential models. Model averaging is more robust than model selection, since the averaging estimator accounts for the uncertainty across different models as well as the model bias from each candidate model. The central questions of concern are how to optimally assign the weights for candidate models and how to make inference based on the averaging estimator. This chapter proposes a plug-in averaging estimator to resolve both of these issues. We first derive the asymptotic distribution of the averaging estimator in a local asymptotic framework, and then choose the optimal fixed weights by minimizing the asymptotic mean squared error (AMSE). The idea of the plug-in averaging estimator is to estimate the infeasible optimal weights by minimizing the sample analog of the AMSE. We show that the plug-in averaging estimator has a nonstandard asymptotic distribution. This chapter also suggests a plug-in method to construct the confidence interval.

Empirical studies often must consider whether additional regressors should be included in the baseline model. Adding more regressors reduces the model bias but increases the variance. To address the trade-off between bias and variance, this chapter studies model averaging in a local asymptotic framework where the regression coefficients are in a local n^{-1/2} neighborhood of zero.

The weak instrument literature also uses the local-to-zero framework (see Staiger and Stock (1997)). For the regression model, the O(n^{-1/2}) framework is canonical in the sense that both squared model biases and estimator variances have the same order O(n^{-1}). Therefore, the optimal model is the one that has the best trade-off between squared model biases and estimator variances. The local-to-zero framework is crucial for analyzing the asymptotic distribution of the averaging estimator. If all regression coefficients are fixed, then the model bias term tends to infinity and dominates the limiting distribution; in such a situation, the model which includes all regressors is the only one we should consider. The local asymptotic framework also implies that all of the candidate models are close to each other as the sample size increases. Hence, it is informative to employ model averaging rather than model selection in this framework.

The empirical literature tends to focus on one particular parameter instead of assessing the overall properties of the model. In contrast to most existing model selection and model averaging methods, our method is tailored to the parameter of interest. The averaging estimator is constructed based on the focus parameter instead of the global fit of the model. The focus parameter is a smooth real-valued function of the regression coefficients. We first consider fixed weights for the candidate models and then derive the asymptotic distribution of the averaging estimator of the focus parameter in a local asymptotic framework, which allows us to characterize the optimal weights. The optimal weights are found by numerical minimization of the AMSE.

We propose a plug-in estimator of the infeasible optimal weights. The optimal weights cannot be estimated consistently because they depend on the local parameters, which cannot be estimated consistently. Estimated weights are asymptotically random, and this must be taken into account in the asymptotic distribution of the plug-in averaging estimator. To address this issue, we first show the joint convergence in distribution of all candidate models. That is, we express all limiting distributions in terms of the same normal random vector W ~ N(0, Ω). Then, we derive the asymptotic distribution of the data-driven weights, which can also be expressed in terms of the same normal random vector W. Finally, we show that the asymptotic distribution of the plug-in estimator is a non-linear function of the normal random vector W.

In addition to the plug-in averaging estimator, we also derive the asymptotic distributions of the Akaike information criterion (AIC) selection estimator (Akaike, 1973), the smoothed AIC (S-AIC) model averaging estimator (Buckland, Burnham, and Augustin, 1997), and the Jackknife Model Averaging (JMA) estimator (Hansen and Racine, 2012) in the local asymptotic framework. Although the asymptotic distribution of the averaging estimator with data-driven weights is non-standard, it can be approximated by simulation. Numerical comparisons show that the plug-in averaging estimator has substantially smaller asymptotic risk than other data-driven averaging estimators in most ranges of the parameter space.

One straightforward way to construct the confidence interval for the focus parameter is to employ the t-statistic. The confidence interval is constructed by inverting the t-statistic based on the parameter of interest. We show that the asymptotic distribution of the model averaging t-statistic depends on unknown local parameters, and thus cannot be directly used for inference. We propose a plug-in method to construct the confidence interval based on a non-standard limiting distribution. The idea is to simulate the limiting distribution of the model averaging t-statistic by replacing the unknown parameters with plug-in estimators. The confidence interval is then constructed from the 1 − α quantile of the simulated distribution. Our simulations show that the coverage probability of the plug-in confidence interval is close to the nominal level, while the confidence interval based on normal approximations leads to distorted inference.

There is a growing body of literature on frequentist model averaging. Buckland, Burnham, and Augustin (1997) suggest selecting the weights using the exponential AIC. Yang (2001) and Yuan and Yang (2005) propose an adaptive regression by mixing models. Hansen (2007) introduces the Mallows Model Averaging estimator for nested and homoskedastic models, where the weights are selected by minimizing the Mallows criterion. Wan, Zhang, and Zou (2010) extend the asymptotic optimality of the Mallows Model Averaging estimator to continuous weights and a non-nested set-up. Hansen and Racine (2012) propose the Jackknife Model Averaging estimator for non-nested and heteroskedastic models, where the weights are chosen by minimizing a leave-one-out cross-validation criterion.

Liang, Zou, Wan, and Zhang (2011) suggest selecting the weights by minimizing the trace of an unbiased estimator of the mean squared error. These papers propose methods of determining the weights without deriving the asymptotic distribution of the proposed estimator, which makes it difficult to conduct inference based on their estimators. In contrast to frequentist model averaging, there is a large body of literature on Bayesian model averaging (see Hoeting, Madigan, Raftery, and Volinsky (1999) for a literature review).

The idea of using the local asymptotic framework to investigate the limiting distributions of model averaging estimators is developed by Hjort and Claeskens (2003) and Claeskens and Hjort (2008). However, their work is limited to the likelihood-based model. Other work on the asymptotic properties of averaging estimators includes Leung and Barron (2006), Pötscher (2006), and Hansen (2009, 2010a). Leung and Barron (2006) study the risk bound of the averaging estimator under a normal error assumption. Pötscher (2006) analyzes the finite sample and asymptotic distributions of the averaging estimator for the two-model case. Hansen (2009) evaluates the AMSE of averaging estimators for the linear regression model with a possible structural break. Hansen (2010a) examines the AMSE and forecast expected squared error of averaging estimators in an autoregressive model with a near unit root in a local-to-unity framework. All of these studies are limited to the homoskedastic framework.

There is a large literature on inference after model selection, including Pötscher (1991), Kabaila (1995, 1998), and Leeb and Pötscher (2003, 2005, 2006, 2008). These papers point out that the coverage probability of the confidence interval based on the model selection estimator is lower than the nominal level. They also argue that the conditional and unconditional distributions of post-model-selection estimators cannot be uniformly consistently estimated. In the model averaging literature, Hjort and Claeskens (2003) and Claeskens and Hjort (2008) show that the traditional confidence interval based on normal approximations leads to distorted inference. Pötscher (2006) argues that the finite-sample distribution of the averaging estimator cannot be uniformly consistently estimated.

There are also alternatives to model selection and model averaging. Tibshirani (1996) introduces the LASSO estimator, a method for simultaneous estimation and variable selection. Zou (2006) proposes the adaptive LASSO approach and presents its oracle properties. White and Lu (2010) propose a new Hausman (1978) type test of robustness for the core regression coefficients; they also provide a feasible optimally combined GLS estimator. Hansen, Lunde, and Nason (2011) propose the model confidence set, which is constructed based on an equivalence test.

The outline of the chapter is as follows. Section 1.2 presents the model and the averaging estimator of the focus parameter. Section 1.3 presents the asymptotic distribution of the averaging estimator with fixed weights in a local asymptotic framework. Section 1.4 introduces the plug-in averaging estimator and derives its limiting distribution. Section 1.5 presents the asymptotic distributions of the AIC, S-AIC, and JMA estimators; the results for the two-model case are also presented. Section 1.6 evaluates the finite sample risk and asymptotic risk of the plug-in averaging estimator and other averaging estimators. Section 1.7 discusses the confidence interval construction. Section 1.8 applies the plug-in averaging estimator to cross-country growth regressions. Section 1.9 concludes. Proofs, figures, and tables are included in the Appendix.

1.2 Model and Estimation

Consider a linear regression model

y_i = x_i'β + z_i'γ + e_i,                          (1.2.1)
E(e_i | x_i, z_i) = 0,                              (1.2.2)
E(e_i² | x_i, z_i) = σ²(x_i, z_i),                  (1.2.3)

where y_i is a scalar dependent variable, x_i = (x_{1i},...,x_{ki})' and z_i = (z_{1i},...,z_{li})' are vectors of regressors, e_i is an unobservable random error, and β (k × 1) and γ (l × 1) are unknown parameter vectors. The error term is allowed to be heteroskedastic, and there is no further assumption on the distribution of the error term.

Here, x_i are the core regressors, which must be included in the model on theoretical grounds, while z_i are the auxiliary regressors, which may or may not be included in the model. Note that x_i may include only a constant term or may even be empty. By distinguishing between the core and auxiliary regressors, we can reduce the total number of candidate models.

In matrix notation, we write the model as

y = Xβ + Zγ + e = Hθ + e,                           (1.2.4)

where H = (X, Z) and θ = (β', γ')'. The parameter of interest is μ = μ(θ) = μ(β, γ), which is a smooth real-valued function. Unlike the traditional model selection and model averaging approaches, which assess the global fit of the model, we evaluate the model based on the focus parameter μ. For example, μ may be an individual coefficient or a ratio of two coefficients of regressors.

Let M be the number of submodels, where each submodel includes all core regressors X and a subset of the auxiliary regressors Z. The m-th submodel has k + l_m regressors. If we consider a sequence of nested models, then M = l + 1. If we consider all possible submodels, then M = 2^l. Let Π_m be the l_m × l selection matrix which selects the included auxiliary regressors. Here, l_m is the number of auxiliary regressors z_i included in submodel m.

The least-squares estimator of θ for the full model, i.e., the model in which all auxiliary regressors are included, is

θ̂ = (β̂', γ̂')' = (H'H)^{-1}H'y,                    (1.2.5)

and the estimator for submodel m is

θ̃_m = (β̃_m', γ̃_m')' = (H_m'H_m)^{-1}H_m'y,        (1.2.6)

where H_m = (X, ZΠ_m') with m = 1,...,M. Let I denote an identity matrix and 0 a zero matrix. If Π_m = I_l, then θ̃_m = (H'H)^{-1}H'y = θ̂, the least-squares estimator for the full model.

If Π_m = 0, then θ̃_m = (X'X)^{-1}X'y, the least-squares estimator for the narrow model, the smallest model among all possible submodels.

We now define the averaging estimator of the focus parameter μ. Let w = (w_1,...,w_M)' be a weight vector with w_m ≥ 0 and Σ_{m=1}^M w_m = 1. That is, the weight vector lies in the unit simplex in R^M:

H_n = { w ∈ [0,1]^M : Σ_{m=1}^M w_m = 1 }.

The weights are required to sum to one; otherwise, the averaging estimator is not consistent. Let μ̃_m = μ(θ̃_m) = μ(β̃_m, γ̃_m) denote the submodel estimates. The averaging estimator of μ is

μ̄(w) = Σ_{m=1}^M w_m μ̃_m.                          (1.2.7)

Here we want to point out that we impose fewer restrictions on the weight function than other existing methods. Leung and Barron (2006), Pötscher (2006), and Liang, Zou, Wan, and Zhang (2011) assume a parametric form for the weight function. Hansen (2007) and Hansen and Racine (2012) restrict the weights to be discrete. In contrast to these works, we allow continuous weights without assuming any parametric form, which is more general and applicable than other approaches.
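As a concrete illustration of (1.2.6)-(1.2.7), the following minimal sketch (Python with NumPy; the simulated data, the nested submodel list, and the choice of focus parameter are all hypothetical) computes the submodel least-squares estimates and a fixed-weight average:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, l = 100, 1, 3
X = np.ones((n, k))                        # core regressor: the intercept
Z = rng.normal(size=(n, l))                # auxiliary regressors z_i
y = 0.5 * X[:, 0] + Z @ np.array([0.3, 0.1, 0.0]) + rng.normal(size=n)

def submodel_fit(selected):
    """Least-squares estimate (1.2.6) for the submodel with core X and Z[:, selected]."""
    H_m = np.hstack([X, Z[:, selected]])
    theta_m, *_ = np.linalg.lstsq(H_m, y, rcond=None)
    return theta_m

# Focus parameter: mu = beta_1, the coefficient on the core regressor.
submodels = [[], [0], [0, 1], [0, 1, 2]]   # a nested sequence, so M = l + 1
mu_tilde = np.array([submodel_fit(s)[0] for s in submodels])

w = np.full(len(submodels), 1 / len(submodels))   # fixed weights on the unit simplex
mu_bar = w @ mu_tilde                             # averaging estimator (1.2.7)
```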

1.3 Asymptotic Properties

To establish the asymptotic distribution of the averaging estimator, we follow Hjort and Claeskens (2003) and use a local-to-zero asymptotic framework in which the auxiliary parameters γ are in a local n^{-1/2} neighborhood of zero. Let h_i = (x_i', z_i')' and Q = E(h_i h_i'), partitioned so that E(x_i x_i') = Q_xx, E(x_i z_i') = Q_xz, and E(z_i z_i') = Q_zz. Let Ω = lim_{n→∞} (1/n) Σ_{i=1}^n Σ_{j=1}^n E(h_i h_j' e_i e_j), partitioned so that lim_{n→∞} (1/n) Σ_{i=1}^n Σ_{j=1}^n E(x_i x_j' e_i e_j) = Ω_xx, lim_{n→∞} (1/n) Σ_{i=1}^n Σ_{j=1}^n E(x_i z_j' e_i e_j) = Ω_xz, and lim_{n→∞} (1/n) Σ_{i=1}^n Σ_{j=1}^n E(z_i z_j' e_i e_j) = Ω_zz. Note that if the error term e_i is serially uncorrelated, Ω simplifies to Ω = E(h_i h_i' e_i²).

Assumption 1.3.1. As n → ∞, n^{1/2}γ = n^{1/2}γ_n → δ ∈ R^l.

Assumption 1.3.2. As n → ∞, n^{-1}H'H →_p Q and n^{-1/2}H'e →_d W ~ N(0, Ω).

Assumption 1.3.1 is the key assumption for developing the asymptotic distribution. It is a common assumption in the weak instrument literature; see Staiger and Stock (1997). This assumption says that the partial correlations between the auxiliary regressors and the dependent variable are weak. It implies that, as the sample size increases, all of the submodels are close to each other. Under this framework, it is informative to ask whether we can do better by averaging the candidate models instead of choosing a single model. Also note that the O(n^{-1/2}) framework gives squared model biases of the same order O(n^{-1}) as estimator variances. Hence, the optimal model is the one that achieves the best trade-off between bias and variance.

Assumption 1.3.2 is a high-level condition which permits the application of cross-section, panel, and time-series data. This condition holds under appropriate primitive assumptions. For example, if the data are stationary and ergodic and h_i e_i is a martingale difference sequence with finite fourth moments, then the condition follows from the weak law of large numbers and the central limit theorem for martingale difference sequences.

Since the selection matrix Π_m is non-random with elements either 0 or 1, for submodel m we have

n^{-1}H_m'H_m →_p Q_m   and   n^{-1/2}H_m'e →_d N(0, Ω_m),

where Q_m is nonsingular with

Q_m = [ Q_xx       Q_xz Π_m'     ]
      [ Π_m Q_zx   Π_m Q_zz Π_m' ]

and

Ω_m = [ Ω_xx       Ω_xz Π_m'     ]
      [ Π_m Ω_zx   Π_m Ω_zz Π_m' ].

Let θ_m = (β', γ_m')' = (β', γ'Π_m')'. In this section, we concentrate on fixed weights; the averaging estimator with data-driven weights is presented in the next section.

The following lemmas describe the asymptotic distributions of the least-squares estimators and the limiting distribution of the focus parameter.

Lemma 1.3.1. Suppose Assumptions 1.3.1 and 1.3.2 hold. As n → ∞, we have

√n(θ̂ − θ) →_d Q^{-1}W ~ N(0, Q^{-1}ΩQ^{-1}),
√n(θ̃_m − θ_m) →_d A_m δ + B_m W ~ N(A_m δ, Q_m^{-1}Ω_m Q_m^{-1}),

where A_m = Q_m^{-1}(Q_zx, Q_zz Π_m')'(I_l − Π_m'Π_m), B_m = Q_m^{-1}S_m', and

S_m = [ I_k       0_{k×l_m} ]
      [ 0_{l×k}   Π_m'      ].

Note that S_m is an extended selection matrix of dimension (k+l) × (k+l_m).

Denote D_{θ_m} = (D_β', D_{γ_m}')', D_β = ∂μ/∂β, and D_{γ_m} = ∂μ/∂γ_m, with the partial derivatives evaluated at the null points (β', 0')'.

Lemma 1.3.2. Suppose Assumptions 1.3.1 and 1.3.2 hold. As n → ∞, we have

√n(μ(θ̃_m) − μ(θ)) →_d Λ_m = a_m'δ + b_m'W ~ N(a_m'δ, D_{θ_m}'Q_m^{-1}Ω_m Q_m^{-1}D_{θ_m}),

where a_m = (I_l − Π_m'Π_m)((Q_zx, Q_zz Π_m')Q_m^{-1}D_{θ_m} − D_γ) and b_m = S_m Q_m^{-1}D_{θ_m}.

The main difference between Lemmas 1.3.1 and 1.3.2 is that the asymptotic distribution of the focus parameter involves the partial derivatives. Note that both A_m δ and a_m'δ represent the bias terms of the submodel estimators. More precisely, the biases come from the omitted auxiliary regressors: (I_l − Π_m'Π_m) is the selection matrix which selects the omitted auxiliary regressors.

Lemmas 1.3.1 and 1.3.2 imply joint convergence in distribution of all submodels, since all asymptotic distributions of the submodels can be expressed in terms of the same normal random vector W.

The following theorem shows the asymptotic normality of the averaging estimator with fixed weights.

Theorem 1.3.1. Suppose Assumptions 1.3.1 and 1.3.2 hold. As n → ∞, we have

√n(μ̄(w) − μ) →_d N(a'δ, V),

where

a = Σ_{m=1}^M w_m a_m,
V = Σ_{m=1}^M w_m² D_{θ_m}'Q_m^{-1}Ω_m Q_m^{-1}D_{θ_m} + 2 Σ_{m<p} w_m w_p D_{θ_m}'Q_m^{-1}Ω_{m,p}Q_p^{-1}D_{θ_p},

Ω_{m,p} = [ Ω_xx       Ω_xz Π_p'     ]
          [ Π_m Ω_zx   Π_m Ω_zz Π_p' ],

and a_m is defined in Lemma 1.3.2.

Following Theorem 1.3.1, we can derive the AMSE of the averaging estimator. Here, we define the AMSE as AMSE(μ̂) = lim_{n→∞} E(n(μ̂ − μ)²). Then the AMSE of the averaging estimator (1.2.7) is

AMSE(μ̄(w)) = w'ζw,                                 (1.3.1)

where ζ is an M × M matrix with (m,p)-th element

ζ_{m,p} = δ'a_m a_p'δ + D_{θ_m}'Q_m^{-1}Ω_{m,p}Q_p^{-1}D_{θ_p},   (1.3.2)

where a_m is defined in Lemma 1.3.2 and Ω_{m,p} is defined in Theorem 1.3.1. The optimal fixed-weight vector is the value which minimizes AMSE(μ̄(w)) over w ∈ H_n:

w_o = argmin_{w ∈ H_n} w'ζw.                        (1.3.3)

Although there is no closed-form solution to (1.3.3) when M > 2, the weight vector can be found numerically via quadratic programming, for which numerical algorithms are available in most programming languages. The minimized AMSE gives a benchmark against which to compare the AMSE and MSE of data-driven averaging estimators.
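Since (1.3.3) is a quadratic program over the unit simplex, w_o can be computed numerically once ζ is available. A minimal sketch using SciPy's general-purpose SLSQP routine (the 3 × 3 matrix ζ below is purely hypothetical; a dedicated QP solver would also work):

```python
import numpy as np
from scipy.optimize import minimize

def simplex_qp(zeta):
    """Minimize w' zeta w over the unit simplex H_n, as in (1.3.3)."""
    M = zeta.shape[0]
    res = minimize(
        lambda w: w @ zeta @ w,
        np.full(M, 1.0 / M),                  # start from equal weights
        jac=lambda w: 2.0 * zeta @ w,
        bounds=[(0.0, 1.0)] * M,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
        method="SLSQP",
    )
    return res.x

zeta = np.array([[2.0, 0.5, 0.3],
                 [0.5, 1.0, 0.4],
                 [0.3, 0.4, 1.5]])
w_o = simplex_qp(zeta)                        # numerical analog of (1.3.3)
```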

1.4 Plug-In Averaging Estimator

The optimal fixed weights derived in the previous section are infeasible, since they depend on the unknown parameters D_{θ_m}, Q_m, Ω_{m,p}, a_m, and δ. Furthermore, the optimal fixed weights cannot be estimated directly, because there is no closed-form expression for them when the number of models is greater than two. A straightforward solution is to estimate the AMSE of the averaging estimator given in (1.3.1) and (1.3.2), and to choose the data-driven weights by minimizing the sample analog of the AMSE.

We first consider the estimator of the second term of ζ_{m,p}. Let D̂_θ = ∂μ(θ̂)/∂θ be an estimator of D_θ, where θ̂ is the estimate from the full model. By Lemma 1.3.1, it follows that D̂_θ is a consistent estimator of D_θ. Let Q̂ = (1/n) Σ_{i=1}^n h_i h_i' be the method of moments estimator for Q. Then by Assumption 1.3.2, it follows that Q̂ →_p Q. Let Ω̂ be the method of moments estimator for Ω. If the error term e_i is serially uncorrelated, then Ω can be estimated consistently by the heteroskedasticity-consistent covariance matrix estimator

Ω̂ = (1/n) Σ_{i=1}^n h_i h_i' ê_i²,                  (1.4.1)

which is proposed by White (1980). Here ê_i = y_i − x_i'β̂ − z_i'γ̂ is the least squares residual from the full model. If the error term e_i is serially correlated, then Ω can be estimated consistently by the heteroskedasticity and autocorrelation consistent covariance matrix estimator

Ω̂ = Σ_{j=-n+1}^{n-1} k(j/S_n) Γ̂(j),                (1.4.2)
Γ̂(j) = (1/n) Σ_{i=1}^{n-j} h_i h_{i+j}' ê_i ê_{i+j}   for j ≥ 0,   (1.4.3)
Γ̂(j) = Γ̂(−j)'   for j < 0,                          (1.4.4)

where k(·) is a kernel function and S_n the bandwidth. Under some regularity conditions, it follows that Ω̂ →_p Ω; for serially uncorrelated errors, see White (1980, 1984), and for serially correlated errors, see Newey and West (1987) and Andrews (1991b). Let D̂_{θ_m} = S_m'D̂_θ, Q̂_m = S_m'Q̂S_m, and Ω̂_{m,p} = S_m'Ω̂S_p, where S_m, defined in Lemma 1.3.1, is the non-random selection matrix.
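The estimators (1.4.1)-(1.4.4) translate directly into NumPy. In the sketch below, the HAC version uses the Bartlett kernel k(u) = 1 − |u| as one concrete choice of k(·); the text itself leaves the kernel unspecified:

```python
import numpy as np

def omega_hc(H, e_hat):
    """HC estimator (1.4.1): (1/n) sum_i h_i h_i' e_hat_i^2."""
    n = H.shape[0]
    return (H * (e_hat ** 2)[:, None]).T @ H / n

def omega_hac(H, e_hat, S_n):
    """HAC estimator (1.4.2)-(1.4.4) with a Bartlett kernel."""
    n = H.shape[0]
    he = H * e_hat[:, None]                    # rows are h_i * e_hat_i
    omega = he.T @ he / n                      # Gamma_hat(0)
    for j in range(1, min(int(S_n), n - 1) + 1):
        gamma_j = he[:-j].T @ he[j:] / n       # Gamma_hat(j), j >= 0
        omega += (1 - j / S_n) * (gamma_j + gamma_j.T)   # adds Gamma_hat(-j) = Gamma_hat(j)'
    return omega
```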

By the continuous mapping theorem, we have D̂_{θ_m}'Q̂_m^{-1}Ω̂_{m,p}Q̂_p^{-1}D̂_{θ_p} →_p D_{θ_m}'Q_m^{-1}Ω_{m,p}Q_p^{-1}D_{θ_p}.

Next, we consider the estimator of the first term of ζ_{m,p}. Define â_m as

â_m = (I_l − Π_m'Π_m)((Q̂_zx, Q̂_zz Π_m')Q̂_m^{-1}D̂_{θ_m} − D̂_γ).   (1.4.5)

Then, by the continuous mapping theorem, it can be shown that â_m →_p a_m.

Unlike D_{θ_m}, Q_m, Ω_{m,p}, and a_m, there is no consistent estimator for the local parameter δ. This implies that the optimal weights cannot be estimated consistently. Here we use √n γ̂ as the estimator of δ, where γ̂ are the estimates from the full model. From Lemma 1.3.1, we have

δ̂ = √n γ̂ →_d W_δ = δ + Π_l Q^{-1}W ~ N(δ, Π_l Q^{-1}ΩQ^{-1}Π_l'),   (1.4.6)

where Π_l = (0_{l×k}, I_l). The limiting distribution of the plug-in estimator δ̂ is W_δ, which is a linear function of the normal random vector W. We use this result to establish the asymptotic distribution of the plug-in averaging estimator.

Note that the first term of ζ_{m,p} can be rewritten as a_m'δδ'a_p. Hence, we can estimate δδ' instead of δ. Since W_δ W_δ' has mean δδ' + Π_l Q^{-1}ΩQ^{-1}Π_l', another possible estimator for δδ' is nγ̂γ̂' − Π_l Q̂^{-1}Ω̂Q̂^{-1}Π_l'. However, it might happen that the resulting estimators of the squared bias terms, the diagonal elements of δδ', are negative. Furthermore, the asymptotic distribution of the squared-bias estimator is more complicated. Therefore, we only consider the estimator √n γ̂.

We now define the plug-in averaging estimator. The plug-in estimator of AMSE(μ̄(w)) is w'ζ̂w, where ζ̂ is the sample analog of ζ with (m,p)-th element

ζ̂_{m,p} = δ̂'â_m â_p'δ̂ + D̂_{θ_m}'Q̂_m^{-1}Ω̂_{m,p}Q̂_p^{-1}D̂_{θ_p}.

The weight vector of the plug-in estimator is defined as

ŵ_pia = argmin_{w ∈ H_n} w'ζ̂w.                      (1.4.7)
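Putting the pieces together, the following sketch assembles ζ̂, assuming D̂_θ (the gradient of μ at θ̂), Q̂, Ω̂, γ̂, and the list of selection matrices Π_m have already been computed, e.g., as in the snippets above:

```python
import numpy as np

def S_matrix(k, l, Pi_m):
    """Extended selection matrix S_m of dimension (k+l) x (k+l_m), as in Lemma 1.3.1."""
    l_m = Pi_m.shape[0]
    return np.block([[np.eye(k), np.zeros((k, l_m))],
                     [np.zeros((l, k)), Pi_m.T]])

def zeta_hat(D_hat, Q_hat, Omega_hat, gamma_hat, Pis, n, k, l):
    """Sample analog zeta_hat of the AMSE matrix used in (1.4.7)."""
    delta_hat = np.sqrt(n) * gamma_hat                   # plug-in estimate of delta
    M = len(Pis)
    Qzx, Qzz = Q_hat[k:, :k], Q_hat[k:, k:]
    S = [S_matrix(k, l, Pi) for Pi in Pis]
    D = [S[m].T @ D_hat for m in range(M)]               # D_hat_{theta_m}
    Qinv = [np.linalg.inv(S[m].T @ Q_hat @ S[m]) for m in range(M)]
    a = [(np.eye(l) - Pi.T @ Pi)
         @ (np.hstack([Qzx, Qzz @ Pi.T]) @ Qinv[m] @ D[m] - D_hat[k:])
         for m, Pi in enumerate(Pis)]                    # a_hat_m as in (1.4.5)
    bias = np.array([ai @ delta_hat for ai in a])        # a_hat_m' delta_hat
    zeta = np.outer(bias, bias)                          # squared-bias part
    for m in range(M):
        for p in range(M):
            Om_mp = S[m].T @ Omega_hat @ S[p]            # Omega_hat_{m,p}
            zeta[m, p] += D[m] @ Qinv[m] @ Om_mp @ Qinv[p] @ D[p]
    return zeta
```

The resulting matrix can then be passed to the simplex quadratic program shown after (1.3.3) to obtain ŵ_pia.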

The plug-in averaging estimator is

μ̄(ŵ_pia) = Σ_{m=1}^M ŵ_{pia,m} μ̃_m.                (1.4.8)

The following assumption is imposed on the estimator of the covariance matrix.

Assumption 1.4.1. There exists Ω̂ such that Ω̂ →_p Ω.

Assumption 1.4.1 is a high-level condition on the estimator of the covariance matrix. Rather than imposing regularity conditions, we assume there exists a consistent estimator for Ω. Consistent estimators for the covariance matrix are given in (1.4.1) and (1.4.2) for serially uncorrelated errors and serially correlated errors, respectively. A sufficient condition for consistency is that e_i is i.i.d. or a martingale difference sequence with finite fourth moments; for serial correlation, that the data are a mean-zero α-mixing or φ-mixing sequence.

Theorem 1.4.1. Suppose Assumptions 1.3.1, 1.3.2, and 1.4.1 hold. As n → ∞, we have

w'ζ̂w →_d w'ζ*w,

where ζ* is an M × M matrix with (m,p)-th element

ζ*_{m,p} = W_δ'a_m a_p'W_δ + D_{θ_m}'Q_m^{-1}Ω_{m,p}Q_p^{-1}D_{θ_p}

and W_δ = δ + Π_l Q^{-1}W. Also, we have

ŵ_pia →_d w*_pia = argmin_{w ∈ H_n} w'ζ*w,           (1.4.9)

and

√n(μ̄(ŵ_pia) − μ) →_d Σ_{m=1}^M w*_{pia,m} Λ_m,      (1.4.10)

where Λ_m = a_m'δ + b_m'W.
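Because the limit in (1.4.10) is non-standard, its distribution is approximated by simulation. A sketch, under the assumption that the ingredients of the limit experiment are supplied: the bias vectors a_m (rows of a), the vectors b_m (rows of b), the constant variance part V of ζ*, Ω, Q^{-1}, and a value of δ (in practice, plug-in estimates replace them):

```python
import numpy as np
from scipy.optimize import minimize

def plugin_limit_draws(a, b, V, Omega, Q_inv, delta, k, n_sim=2000, seed=0):
    """Monte Carlo draws from sum_m w*_m Lambda_m in (1.4.10)."""
    rng = np.random.default_rng(seed)
    M, l = a.shape
    L = np.linalg.cholesky(Omega)              # Omega assumed positive definite
    cons = [{"type": "eq", "fun": lambda w: w.sum() - 1.0}]
    draws = np.empty(n_sim)
    for s in range(n_sim):
        W = L @ rng.normal(size=k + l)         # W ~ N(0, Omega)
        W_delta = delta + (Q_inv @ W)[k:]      # W_delta = delta + Pi_l Q^{-1} W
        bias = a @ W_delta                     # a_m' W_delta for each m
        zeta_star = np.outer(bias, bias) + V   # zeta*_{m,p}
        w = minimize(lambda w: w @ zeta_star @ w, np.full(M, 1.0 / M),
                     bounds=[(0.0, 1.0)] * M, constraints=cons,
                     method="SLSQP").x          # w*_pia for this draw
        draws[s] = w @ (a @ delta + b @ W)     # sum_m w*_m Lambda_m
    return draws
```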

Theorem 1.4.1 shows that the estimated weights are asymptotically random. In order to derive the asymptotic distribution of the plug-in averaging estimator, we show that there is joint convergence in distribution of all submodel estimators μ̃_m and the estimated weights ŵ_pia. The joint convergence in distribution comes from the fact that both Λ_m and w*_{pia,m} can be expressed in terms of the normal random vector W. It turns out that the limiting distribution of the plug-in averaging estimator is not normal; instead, it is a non-linear function of the normal random vector W.

The non-normal nature of the limiting distribution of the averaging estimator with data-driven weights is also pointed out by Hjort and Claeskens (2003) and Claeskens and Hjort (2008). Here, we use this result to compare the AMSE of the plug-in averaging estimator with those of other data-driven averaging estimators. Besides the numerical comparison, the result is also useful for constructing the confidence interval.

1.5 AIC, S-AIC and JMA Estimators

In this section, we present the asymptotic distributions of the AIC model selection estimator, the S-AIC model averaging estimator, and the Jackknife Model Averaging estimator. The limiting distributions of the AIC, S-AIC, and JMA estimators are non-standard in the local asymptotic framework. We also present the results for the two-model case.

1.5.1 AIC and Smoothed AIC

The model selection estimator based on information criteria is a special case of the model averaging estimator: model selection puts all the weight on the model with the smallest value of the information criterion and gives the other models zero weight. Hence, the weight function of the model selection estimator can be described by an indicator function. The AIC for the linear regression model (1.2.1) is

AIC_m = n log(σ̃_m²) + 2(k + l_m),   m = 1, 2, ..., M,

where σ̃_m² = (1/n) Σ_{i=1}^n ẽ_{mi}² and ẽ_{mi} are the least squares residuals from submodel m, that is, ẽ_{mi} = y_i − x_i'β̃_m − z_{mi}'γ̃_m with z_{mi} = Π_m z_i. The AIC model selection estimator is thus

μ̄(ŵ_aic) = Σ_{m=1}^M ŵ_{aic,m} μ̃_m,
ŵ_{aic,m} = 1{AIC_m = min(AIC_1, AIC_2, ..., AIC_M)}.

Instead of estimating the regression function based on a single model, the S-AIC model averaging estimator proposed by Buckland, Burnham, and Augustin (1997) assigns weights to the candidate models using the exponential Akaike information criterion. The weight of each submodel is proportional to the exponential of the penalized log-likelihood of the model. The S-AIC model averaging estimator is defined as

μ̄(ŵ_saic) = Σ_{m=1}^M ŵ_{saic,m} μ̃_m,               (1.5.1)
ŵ_{saic,m} = exp(−AIC_m/2) / Σ_{m=1}^M exp(−AIC_m/2).   (1.5.2)

The S-AIC weight is similar to the smoothed Bayesian information criterion (S-BIC) model averaging weight, where the weights are chosen by using the exponential Bayesian information criterion. The S-BIC weight is exp(−BIC_m/2)/Σ_{m=1}^M exp(−BIC_m/2), where BIC_m = n log(σ̃_m²) + log(n)(k + l_m). The weights of Bayesian model averaging are interpreted as the posterior model probabilities; therefore, the S-BIC weight may be interpreted as the model probability.

The S-AIC model averaging estimator is appealing because of its simplicity. Also, there is a closed-form expression for the S-AIC weights for any number of submodels. However, both AIC and S-AIC are not robust for heteroskedastic regressions. The misspecification-robust version of the AIC is the Takeuchi information criterion; see Burnham and Anderson (2002). Furthermore, the S-AIC weights ignore the covariances between the submodel estimators and are formed based on the global fit of the model. Therefore, the S-AIC weight of each submodel does not adjust according to the parameter of interest.
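The AIC values and the two weighting schemes take only a few lines of code. A sketch, taking as given the residual variances σ̃_m² and the regressor counts k + l_m of the submodels:

```python
import numpy as np

def aic_weights(sigma2_tilde, n, dims):
    """AIC selection weights and S-AIC weights, as in (1.5.1)-(1.5.2).

    sigma2_tilde : residual variance estimate of each submodel
    dims         : k + l_m for each submodel
    """
    aic = n * np.log(np.asarray(sigma2_tilde)) + 2 * np.asarray(dims)
    w_aic = (aic == aic.min()).astype(float)      # indicator weights (AIC selection)
    w_aic /= w_aic.sum()                          # guard against exact ties
    x = -0.5 * (aic - aic.min())                  # shift: the ratios are unchanged
    w_saic = np.exp(x) / np.exp(x).sum()          # exp(-AIC_m/2), normalized
    return w_aic, w_saic
```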

Hjort and Claeskens (2003) and Claeskens and Hjort (2008) show the limiting distributions of the AIC model selection estimator and the S-AIC model averaging estimator in the likelihood framework. Let AIC_1 be the AIC for the narrow model. Following Theorem 5.4 of Claeskens and Hjort (2008), we can show that

AIC_1 − AIC_m →_d W_δ'Σ_m W_δ − 2(k + l_m),          (1.5.3)

where Σ_m = V_δ^{-1}Π_m'(Π_m V_δ^{-1}Π_m')^{-1}Π_m V_δ^{-1} and V_δ = Π_l Q^{-1}ΩQ^{-1}Π_l'. Note that (1.5.3) can be expressed as R'Ψ_m R − 2(k + l_m), where R ~ N(V_δ^{-1/2}δ, I_l) and Ψ_m = V_δ^{-1/2}Π_m'(Π_m V_δ^{-1}Π_m')^{-1}Π_m V_δ^{-1/2}. Here R'Ψ_m R has a noncentral chi-squared distribution with l_m degrees of freedom and non-centrality parameter λ_m = δ'V_δ^{-1/2}Ψ_m V_δ^{-1/2}δ. Similar to the plug-in averaging estimator, the asymptotic distributions of the AIC model selection estimator and the S-AIC model averaging estimator can be expressed as non-linear functions of the normal random vector W.

Theorem 1.5.1. Suppose Assumptions 1.3.1 and 1.3.2 hold. As n → ∞, the asymptotic distribution of the S-AIC model averaging estimator is

√n(μ̄(ŵ_saic) − μ) →_d Σ_{m=1}^M w*_{saic,m} Λ_m,

where

w*_{saic,m} = exp(½W_δ'Σ_m W_δ − (k + l_m)) / Σ_{p=1}^M exp(½W_δ'Σ_p W_δ − (k + l_p))

and Λ_m = a_m'δ + b_m'W.
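For a given draw of W_δ, the limit S-AIC weights in Theorem 1.5.1 are immediate to compute. A sketch, with the matrices Σ_m and the dimensions k + l_m assumed supplied:

```python
import numpy as np

def saic_limit_weights(W_delta, Sigmas, dims):
    """Limit S-AIC weights w*_saic,m from Theorem 1.5.1 for one draw of W_delta."""
    x = np.array([0.5 * W_delta @ S @ W_delta - d
                  for S, d in zip(Sigmas, dims)])
    x -= x.max()                      # stabilize the exponentials
    w = np.exp(x)
    return w / w.sum()
```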

1.5.2 Jackknife Model Averaging Estimator

The Jackknife Model Averaging estimator is proposed by Hansen and Racine (2012), who suggest selecting the weights by minimizing a leave-one-out cross-validation criterion. They show the asymptotic optimality of the JMA estimator: the average squared error of the JMA estimator is asymptotically equivalent to the lowest expected squared error. The asymptotic optimality of the cross-validation criterion is first established by Li (1987) for model selection in homoskedastic regression with an infinite number of regressors. Following Li (1987), Andrews (1991a) shows the asymptotic optimality of the cross-validation criterion for model selection in heteroskedastic regressions. Hansen and Racine (2012) extend the asymptotic optimality from model selection to model averaging. However, the optimality result of the theorem in Hansen and Racine (2012) requires the condition that there is no submodel m for which the bias is zero. Therefore, it does not apply to the context of the linear regression model with a finite number of regressors; in other words, the JMA estimator is not asymptotically optimal in our framework. A similar model averaging estimator with the asymptotic optimality property, but not robust to heteroskedasticity, is the Mallows Model Averaging estimator proposed by Hansen (2007).

Define the leave-one-out cross-validation criterion for the averaging estimator for the linear regression model (1.2.1) as follows:

CV_n(w) = (1/n) w'ẽ'ẽw,                              (1.5.4)

where ẽ = (ẽ_{1,-i}, ..., ẽ_{M,-i}) is the n × M matrix of leave-one-out least-squares residuals, and ẽ_{m,-i} are the residuals of submodel m obtained by least-squares estimation without the i-th observation. The weight vector of the JMA estimator is the value which minimizes CV_n(w).

By adding and subtracting the sum of squared residuals of the full model, (1/n)ê'ê, we can rewrite (1.5.4) as

CV_n(w) = (1/n) w'ξ_n w + (1/n) ê'ê,                 (1.5.5)

where ξ_n is an M × M matrix with (m,p)-th element

ξ_{m,p} = ẽ_{m,-i}'ẽ_{p,-i} − ê'ê.                   (1.5.6)

Note that minimizing CV_n(w) over w = (w_1,...,w_M)' is equivalent to minimizing w'ξ_n w, since (1/n)ê'ê does not depend on the weight vector w.
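For least squares, the leave-one-out residuals in (1.5.4) do not require n separate regressions: the standard identity ẽ_i = ê_i/(1 − h_ii), where h_ii is the i-th diagonal element of the hat matrix, gives them in one pass. A sketch:

```python
import numpy as np

def loo_residuals(H_m, y):
    """Leave-one-out residuals of one submodel via e_tilde_i = e_hat_i / (1 - h_ii)."""
    P = H_m @ np.linalg.solve(H_m.T @ H_m, H_m.T)   # hat (projection) matrix
    e_hat = y - P @ y
    return e_hat / (1.0 - np.diag(P))

def cv_criterion(w, E_tilde):
    """CV_n(w) in (1.5.4); E_tilde is the n x M matrix of leave-one-out residuals."""
    n = E_tilde.shape[0]
    return w @ (E_tilde.T @ E_tilde) @ w / n
```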

In the following theorem, we show that ξ_{m,p} converges to a non-linear function of the normal random vector W. The JMA estimator can be represented as

μ̄(ŵ_jma) = Σ_{m=1}^M ŵ_{jma,m} μ̃_m,                (1.5.7)
ŵ_jma = argmin_{w ∈ H_n} w'ξ_n w.                    (1.5.8)

Here, the weight vector is defined as the minimizer of a quadratic function of w, which can be found by quadratic programming just like the optimal fixed-weight vector and the plug-in weight vector. However, unlike the plug-in averaging estimator, where the weights are tailored to the parameter of interest, the JMA estimator selects the weights based on the conditional mean function. One disadvantage of the JMA estimator is the computational burden, which is substantial when both the sample size and the number of regressors are large.

The following assumption is imposed on the data generating process.

Assumption 1.5.1. (a) {(y_i, x_i, z_i) : i = 1,...,n} are i.i.d. (b) E(e_i⁴) < ∞, E(x_{ji}⁴) < ∞ for j = 1,...,k, and E(z_{ji}⁴) < ∞ for j = 1,...,l.

Condition (a) in Assumption 1.5.1 is the i.i.d. assumption, which is also made in Hansen and Racine (2012). The result in Theorem 1.5.2 can be extended to the stationary case. Condition (b) is a standard assumption for the linear regression model. Note that Assumption 1.5.1 implies Assumption 1.3.2. Therefore, the results in Lemmas 1.3.1 and 1.3.2 and Theorem 1.4.1 hold under Assumptions 1.3.1 and 1.5.1.

Theorem 1.5.2. Suppose Assumptions 1.3.1 and 1.5.1 hold. As n → ∞, we have

w'ξ_n w →_d w'ξ*w,

where ξ* is an M × M matrix with (m,p)-th element

ξ*_{m,p} = Ẅ_m'QẄ_p + tr(Q_m^{-1}Ω_m) + tr(Q_p^{-1}Ω_p),   (1.5.9)

where Ẅ_m = Ä_m δ + B̈_m W, with

Ä_m = (Π_l' − S_m Q_m^{-1}(Q_zx, Q_zz Π_m')')(I_l − Π_m'Π_m),
B̈_m = Q^{-1} − S_m Q_m^{-1}S_m'.

Also, we have

ŵ_jma →_d w*_jma = argmin_{w ∈ H_n} w'ξ*w,           (1.5.10)

and

√n(μ̄(ŵ_jma) − μ) →_d Σ_{m=1}^M w*_{jma,m} Λ_m,      (1.5.11)

where Λ_m = a_m'δ + b_m'W.

1.5.3 Model Averaging for Two Models

In this section, we concentrate on the special case with only two candidate models in the linear regression framework. As mentioned in the previous section, when the total number of models equals two, we have a closed-form solution for the weight vector. Pötscher (2006) also analyzes the asymptotic distribution for the two-model case, but assumes the error term is normally distributed. Here, we generalize his results by relaxing the assumption on the error term and also considering the case of two non-nested candidate models.

Suppose the auxiliary regressors are partitioned as Z = (ZΠ_1', ZΠ_2') = (Z_1, Z_2), where Π_1 = (I_{l_1}, 0_{l_1×l_2}) and Π_2 = (0_{l_2×l_1}, I_{l_2}). Then the regression model (1.2.1) can be rewritten as

y = Xβ + Z_1γ_1 + Z_2γ_2 + e,                        (1.5.12)

where γ_1 is l_1 × 1, γ_2 is l_2 × 1, and l_1 + l_2 = l. Model 1 includes the regressors X and Z_1, while Model 2 includes the regressors X and Z_2. If l_2 = l, then Model 1 is the restricted model and Model 2 is the unrestricted model, which is the framework of Pötscher (2006). If l_1 > 0 and l_2 > 0, then Models 1 and 2 are two non-nested models. We denote the estimators of the focus parameter for the two candidate models by μ̃_1 = μ(θ̃_1) = μ(β̃_1, γ̃_1, 0) and μ̃_2 = μ(θ̃_2) = μ(β̃_2, 0, γ̃_2), respectively. Let w be the weight for μ̃_1 and 1 − w the weight for μ̃_2. The averaging estimator for two models is

μ̄(w) = wμ̃_1 + (1 − w)μ̃_2.                          (1.5.13)

Let w_o be the infeasible optimal fixed weight. The following corollary describes the AMSE of the averaging estimator with the infeasible optimal fixed weight.

Corollary 1.5.1. Suppose Assumptions 1.3.1 and 1.3.2 hold. Then the AMSE of the averaging estimator for two models is

AMSE(μ̄(w)) = w²ζ_{1,1} + (1−w)²ζ_{2,2} + 2w(1−w)ζ_{1,2},

where ζ_{m,p} is defined in (1.3.2). The weight which minimizes AMSE(μ̄(w)) is

w_o = (ζ_{2,2} − ζ_{1,2}) / (ζ_{1,1} − 2ζ_{1,2} + ζ_{2,2})   if ζ_{1,2} < min{ζ_{1,1}, ζ_{2,2}},
      1                                                       if ζ_{1,1} ≤ ζ_{1,2} < ζ_{2,2},
      0                                                       if ζ_{2,2} ≤ ζ_{1,2} < ζ_{1,1},

and the minimized AMSE is

AMSE(μ̄(w_o)) = (ζ_{1,1}ζ_{2,2} − ζ_{1,2}²) / (ζ_{1,1} − 2ζ_{1,2} + ζ_{2,2})   if ζ_{1,2} < min{ζ_{1,1}, ζ_{2,2}},
               ζ_{1,1}                                                          if ζ_{1,1} ≤ ζ_{1,2} < ζ_{2,2},
               ζ_{2,2}                                                          if ζ_{2,2} ≤ ζ_{1,2} < ζ_{1,1}.

The values ζ_{1,1} and ζ_{2,2} in Corollary 1.5.1 represent the AMSE of Model 1 and Model 2, respectively. As long as ζ_{1,2} < min{ζ_{1,1}, ζ_{2,2}}, the AMSE of the averaging estimator with the optimal fixed weight is strictly less than the AMSE of both Model 1 and Model 2.
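The closed-form weight and minimized AMSE in Corollary 1.5.1 are trivial to code; a sketch:

```python
def two_model_weight(z11, z22, z12):
    """Optimal weight w_o on Model 1 from Corollary 1.5.1."""
    if z12 < min(z11, z22):
        return (z22 - z12) / (z11 - 2.0 * z12 + z22)
    return 1.0 if z11 <= z12 else 0.0

def two_model_amse(z11, z22, z12):
    """Minimized AMSE of the two-model averaging estimator."""
    if z12 < min(z11, z22):
        return (z11 * z22 - z12 ** 2) / (z11 - 2.0 * z12 + z22)
    return min(z11, z22)
```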

We now consider the averaging estimator with data-driven weights when there are only two candidate models. Let ŵ_saic, ŵ_pia, and ŵ_jma be the weights chosen by the S-AIC model averaging estimator, the plug-in averaging estimator, and the JMA estimator, respectively. From Theorem 1.5.1, it can be shown that the AMSE of the S-AIC model averaging estimator μ̄(ŵ_saic) is

AMSE(μ̄(ŵ_saic)) = E( w*_saic² ζ_{1,1} + (1 − w*_saic)² ζ_{2,2} + 2w*_saic(1 − w*_saic) ζ_{1,2} ),

where w*_saic = exp(½W_δ'Σ_1 W_δ − (k + l_1)) / Σ_{m=1}^{2} exp(½W_δ'Σ_m W_δ − (k + l_m)). The following corollary presents the AMSE of the plug-in averaging estimator and the JMA estimator.

Corollary 1.5.2. (a) Suppose Assumptions 1.3.1, 1.3.2, and 1.4.1 hold. Then the AMSE of the plug-in averaging estimator for two models is

AMSE(μ̄(ŵ_pia)) = E( w*_pia² ζ*_{1,1} + (1 − w*_pia)² ζ*_{2,2} + 2w*_pia(1 − w*_pia) ζ*_{1,2} ),

where

w*_pia = (ζ*_{2,2} − ζ*_{1,2}) / (ζ*_{1,1} − 2ζ*_{1,2} + ζ*_{2,2})   if ζ*_{1,2} < min{ζ*_{1,1}, ζ*_{2,2}},
         1                                                            if ζ*_{1,1} ≤ ζ*_{1,2} < ζ*_{2,2},
         0                                                            if ζ*_{2,2} ≤ ζ*_{1,2} < ζ*_{1,1},

and ζ*_{m,p} is defined in Theorem 1.4.1.

(b) Suppose Assumptions 1.3.1 and 1.5.1 hold. Then the AMSE of the Jackknife Model Averaging estimator for two models is

AMSE(μ̄(ŵ_jma)) = E( w*_jma² ξ*_{1,1} + (1 − w*_jma)² ξ*_{2,2} + 2w*_jma(1 − w*_jma) ξ*_{1,2} ),

where

w*_jma = (ξ*_{2,2} − ξ*_{1,2}) / (ξ*_{1,1} − 2ξ*_{1,2} + ξ*_{2,2})   if ξ*_{1,2} < min{ξ*_{1,1}, ξ*_{2,2}},
         1                                                            if ξ*_{1,1} ≤ ξ*_{1,2} < ξ*_{2,2},
         0                                                            if ξ*_{2,2} ≤ ξ*_{1,2} < ξ*_{1,1},

and ξ*_{m,p} is defined in Theorem 1.5.2.

Note that w_o, w*_pia, and w*_jma have similar forms but different interpretations. w_o is non-random, since ζ_{1,1}, ζ_{2,2}, and ζ_{1,2} are constants. Both w*_pia and w*_jma are random

because ζ*_{m,p} and ξ*_{m,p} are non-linear functions of the normal random vector W. These results also imply the non-standard limiting distributions of the data-driven estimators even in the simple two-model case.

1.6 Simulation Results

In this section, we investigate the finite sample and asymptotic mean squared error of the plug-in averaging estimator via Monte Carlo experiments.

1.6.1 Simulation Setup

We consider a linear regression model with a finite number of regressors,

y_i = Σ_{j=1}^J θ_j x_{ji} + e_i,   i = 1,...,n.     (1.6.1)

We let x_{1i} and x_{2i} be the core regressors, and the remaining x_{ji} are the auxiliary regressors. We set x_{1i} = 1 to be the intercept. The random variables (x_{2i},...,x_{Ji}) are generated from a joint normal distribution N(0, Σ), where the diagonal elements of Σ are 1, E(x_{2i}x_{ji}) = ρ_1 for j ≥ 3, and E(x_{ji}x_{ki}) = ρ_2 for j, k ≥ 3 and j ≠ k. The error term e_i is independent of x_{ji} and is generated from a normal distribution N(0, σ_i²), where σ_i² = 1 for the homoskedastic simulation and σ_i² = x_{2i}² for the heteroskedastic simulation. The parameters are determined by the following two rules:

DGP 1: θ = (1/8, 1/8, (c/√n)(l/l, (l−1)/l, ..., 1/l))',   (1.6.2)
DGP 2: θ = (1/8, 1/8, (c/√n)(1/l, 2/l, ..., l/l))',       (1.6.3)

where l = J − 2. The parameter c is selected to control the population R² = θ'Σθ/(1 + θ'Σθ), where θ = (θ_2,...,θ_J)', and R² varies on a grid between 0.1 and 0.9. The local parameters are determined by δ_j = √n θ_j = c(l − j + 3)/l for j ≥ 3. The number of regressors is varied among J = 3, 5, 7, and 9. We consider all possible submodels; that is, the number of models is M = 2^{J−2}.
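One draw from this simulation design can be generated as below; a sketch assuming the declining-coefficient pattern of DGP 1 and the 1/8 value of the two core coefficients, both taken from the reconstruction of (1.6.2) above:

```python
import numpy as np

def simulate_dgp1(n, J, c, rho1, rho2, heteroskedastic=False, rng=None):
    """One sample from (1.6.1) under DGP 1 (assumed declining coefficients)."""
    if rng is None:
        rng = np.random.default_rng()
    l = J - 2
    # Correlation matrix of (x_2, ..., x_J): unit variances, E(x_2 x_j) = rho1,
    # E(x_j x_k) = rho2 for j, k >= 3.
    Sigma = np.full((J - 1, J - 1), rho2)
    Sigma[0, :] = Sigma[:, 0] = rho1
    np.fill_diagonal(Sigma, 1.0)
    X = rng.multivariate_normal(np.zeros(J - 1), Sigma, size=n)
    X = np.hstack([np.ones((n, 1)), X])              # x_1 = 1 is the intercept
    theta = np.empty(J)
    theta[:2] = 1.0 / 8.0                            # core coefficients (assumed)
    j = np.arange(3, J + 1)
    theta[2:] = c * (l - j + 3) / (l * np.sqrt(n))   # delta_j = c (l - j + 3) / l
    sigma = np.abs(X[:, 1]) if heteroskedastic else 1.0   # sigma_i^2 = x_{2i}^2 or 1
    y = X @ theta + sigma * rng.normal(size=n)
    return y, X, theta
```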

1.6.2 Finite Sample Comparison

We consider six estimators: (1) the AIC model selection estimator (labeled AIC), (2) the BIC model selection estimator (labeled BIC), (3) the S-AIC model averaging estimator (labeled S-AIC), (4) the S-BIC model averaging estimator (labeled S-BIC), (5) the Jackknife Model Averaging estimator (labeled JMA), and (6) the plug-in averaging estimator (labeled Plug-In). The parameter of interest is μ = θ_2. To evaluate the finite sample behavior of the averaging estimators, we compute the risk based on the quadratic loss function, i.e., E(n(θ̂_2 − θ_2)²). The risk (expected squared error) is calculated by averaging across 5,000 random samples. We normalize the risk by dividing by the optimal asymptotic risk. The optimal asymptotic risk is defined as w_o'ζw_o, where ζ and w_o are defined in (1.3.1) and (1.3.3). The sample sizes are 50, 100, 150, and 200 for M = 2, 8, 32, and 128.

The risk functions are displayed in Figures A.1-A.4 for (ρ_1, ρ_2) = (0.3, 0.1), and in Figures A.5-A.8 for (ρ_1, ρ_2) = (0.6, 0.4). In each figure, the risk is displayed for M = 2, 8, 32, and 128, respectively. The dotted line represents the AIC model selection estimator, the solid line with asterisks the BIC model selection estimator, the dash-dotted line the S-AIC model averaging estimator, the dashed line with circles the S-BIC model averaging estimator, the dashed line the JMA estimator, and the solid line the plug-in averaging estimator.

There are several remarks about the simulation results. First, the risk of all estimators increases as the number of models increases. When we only consider the restricted and unrestricted models, i.e., M = 2, all estimators have similar risk. Second, the plug-in averaging estimator dominates the other estimators over most of the range of the population R². The JMA estimator has smaller risk than the S-AIC estimator for DGP 1, but S-AIC achieves lower risk when M and R² are larger for DGP 2. The S-BIC estimator and the BIC model selection estimator have poor performance relative to the other methods in most cases. Also note that the model-averaging-type estimators have lower risk than their model-selection-type counterparts. Third, all estimators have smaller normalized risk under heteroskedastic errors, but the ranking of the estimators in the heteroskedastic


More information

Identification and Estimation of Partially Linear Censored Regression Models with Unknown Heteroscedasticity

Identification and Estimation of Partially Linear Censored Regression Models with Unknown Heteroscedasticity Identification and Estimation of Partially Linear Censored Regression Models with Unknown Heteroscedasticity Zhengyu Zhang School of Economics Shanghai University of Finance and Economics zy.zhang@mail.shufe.edu.cn

More information

A Course on Advanced Econometrics

A Course on Advanced Econometrics A Course on Advanced Econometrics Yongmiao Hong The Ernest S. Liu Professor of Economics & International Studies Cornell University Course Introduction: Modern economies are full of uncertainties and risk.

More information

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract Journal of Data Science,17(1). P. 145-160,2019 DOI:10.6339/JDS.201901_17(1).0007 WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION Wei Xiong *, Maozai Tian 2 1 School of Statistics, University of

More information

GMM Estimation and Testing

GMM Estimation and Testing GMM Estimation and Testing Whitney Newey July 2007 Idea: Estimate parameters by setting sample moments to be close to population counterpart. Definitions: β : p 1 parameter vector, with true value β 0.

More information

Linear Instrumental Variables Model Averaging Estimation

Linear Instrumental Variables Model Averaging Estimation Linear Instrumental Variables Model Averaging Estimation Luis F. Martins Department of Quantitative Methods, ISCE-LUI, Portugal Centre for International Macroeconomic Studies CIMS, UK luis.martins@iscte.pt

More information

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication G. S. Maddala Kajal Lahiri WILEY A John Wiley and Sons, Ltd., Publication TEMT Foreword Preface to the Fourth Edition xvii xix Part I Introduction and the Linear Regression Model 1 CHAPTER 1 What is Econometrics?

More information

Lecture 11 Weak IV. Econ 715

Lecture 11 Weak IV. Econ 715 Lecture 11 Weak IV Instrument exogeneity and instrument relevance are two crucial requirements in empirical analysis using GMM. It now appears that in many applications of GMM and IV regressions, instruments

More information

The Generalized Cochrane-Orcutt Transformation Estimation For Spurious and Fractional Spurious Regressions

The Generalized Cochrane-Orcutt Transformation Estimation For Spurious and Fractional Spurious Regressions The Generalized Cochrane-Orcutt Transformation Estimation For Spurious and Fractional Spurious Regressions Shin-Huei Wang and Cheng Hsiao Jan 31, 2010 Abstract This paper proposes a highly consistent estimation,

More information

Model averaging, asymptotic risk, and regressor groups

Model averaging, asymptotic risk, and regressor groups Quantitative Economics 5 2014), 495 530 1759-7331/20140495 Model averaging, asymptotic risk, and regressor groups Bruce E. Hansen University of Wisconsin This paper examines the asymptotic risk of nested

More information

Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity

Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity John C. Chao, Department of Economics, University of Maryland, chao@econ.umd.edu. Jerry A. Hausman, Department of Economics,

More information

Journal of Econometrics

Journal of Econometrics Journal of Econometrics 46 008 34 350 Contents lists available at ScienceDirect Journal of Econometrics journal homepage: www.elsevier.com/locate/jeconom Least-squares forecast averaging Bruce E. Hansen

More information

Nonconcave Penalized Likelihood with A Diverging Number of Parameters

Nonconcave Penalized Likelihood with A Diverging Number of Parameters Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized

More information

the error term could vary over the observations, in ways that are related

the error term could vary over the observations, in ways that are related Heteroskedasticity We now consider the implications of relaxing the assumption that the conditional variance Var(u i x i ) = σ 2 is common to all observations i = 1,..., n In many applications, we may

More information

Additional Topics on Linear Regression

Additional Topics on Linear Regression Additional Topics on Linear Regression Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Additional Topics 1 / 49 1 Tests for Functional Form Misspecification 2 Nonlinear

More information

Comparing Forecast Accuracy of Different Models for Prices of Metal Commodities

Comparing Forecast Accuracy of Different Models for Prices of Metal Commodities Comparing Forecast Accuracy of Different Models for Prices of Metal Commodities João Victor Issler (FGV) and Claudia F. Rodrigues (VALE) August, 2012 J.V. Issler and C.F. Rodrigues () Forecast Models for

More information

GMM, HAC estimators, & Standard Errors for Business Cycle Statistics

GMM, HAC estimators, & Standard Errors for Business Cycle Statistics GMM, HAC estimators, & Standard Errors for Business Cycle Statistics Wouter J. Den Haan London School of Economics c Wouter J. Den Haan Overview Generic GMM problem Estimation Heteroskedastic and Autocorrelation

More information

ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS

ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, N.C.,

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis

Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis Sílvia Gonçalves and Benoit Perron Département de sciences économiques,

More information

LECTURE ON HAC COVARIANCE MATRIX ESTIMATION AND THE KVB APPROACH

LECTURE ON HAC COVARIANCE MATRIX ESTIMATION AND THE KVB APPROACH LECURE ON HAC COVARIANCE MARIX ESIMAION AND HE KVB APPROACH CHUNG-MING KUAN Institute of Economics Academia Sinica October 20, 2006 ckuan@econ.sinica.edu.tw www.sinica.edu.tw/ ckuan Outline C.-M. Kuan,

More information

Heteroskedasticity. We now consider the implications of relaxing the assumption that the conditional

Heteroskedasticity. We now consider the implications of relaxing the assumption that the conditional Heteroskedasticity We now consider the implications of relaxing the assumption that the conditional variance V (u i x i ) = σ 2 is common to all observations i = 1,..., In many applications, we may suspect

More information

Approximate Distributions of the Likelihood Ratio Statistic in a Structural Equation with Many Instruments

Approximate Distributions of the Likelihood Ratio Statistic in a Structural Equation with Many Instruments CIRJE-F-466 Approximate Distributions of the Likelihood Ratio Statistic in a Structural Equation with Many Instruments Yukitoshi Matsushita CIRJE, Faculty of Economics, University of Tokyo February 2007

More information

Model Selection and Geometry

Model Selection and Geometry Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model

More information

ESSAYS ON INSTRUMENTAL VARIABLES. Enrique Pinzón García. A dissertation submitted in partial fulfillment of. the requirements for the degree of

ESSAYS ON INSTRUMENTAL VARIABLES. Enrique Pinzón García. A dissertation submitted in partial fulfillment of. the requirements for the degree of ESSAYS ON INSTRUMENTAL VARIABLES By Enrique Pinzón García A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Economics) at the UNIVERSITY OF WISCONSIN-MADISON

More information

Introductory Econometrics

Introductory Econometrics Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna November 23, 2013 Outline Introduction

More information

Note: The primary reference for these notes is Enders (2004). An alternative and more technical treatment can be found in Hamilton (1994).

Note: The primary reference for these notes is Enders (2004). An alternative and more technical treatment can be found in Hamilton (1994). Chapter 4 Analysis of a Single Time Series Note: The primary reference for these notes is Enders (4). An alternative and more technical treatment can be found in Hamilton (994). Most data used in financial

More information

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation Inference about Clustering and Parametric Assumptions in Covariance Matrix Estimation Mikko Packalen y Tony Wirjanto z 26 November 2010 Abstract Selecting an estimator for the variance covariance matrix

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

LECTURE 2 LINEAR REGRESSION MODEL AND OLS

LECTURE 2 LINEAR REGRESSION MODEL AND OLS SEPTEMBER 29, 2014 LECTURE 2 LINEAR REGRESSION MODEL AND OLS Definitions A common question in econometrics is to study the effect of one group of variables X i, usually called the regressors, on another

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

Statistica Sinica Preprint No: SS R2

Statistica Sinica Preprint No: SS R2 Statistica Sinica Preprint No: SS-2017-0034.R2 Title OPTIMAL MODEL AVERAGING OF VARYING COEFFICIENT MODELS Manuscript ID SS-2017-0034.R2 URL http://www.stat.sinica.edu.tw/statistica/ DOI 10.5705/ss.202017.0034

More information

GMM-based Model Averaging

GMM-based Model Averaging GMM-based Model Averaging Luis F. Martins Department of Quantitative Methods, ISCTE-LUI, Portugal Centre for International Macroeconomic Studies (CIMS), UK (luis.martins@iscte.pt) Vasco J. Gabriel CIMS,

More information

Robust Inference. Bruce E. Hansen University of Wisconsin. This draft: September 2014 Preliminary. Do not cite. Abstract

Robust Inference. Bruce E. Hansen University of Wisconsin. This draft: September 2014 Preliminary. Do not cite. Abstract Robust Inference Bruce E. Hansen University of Wisconsin This draft: September 2014 Preliminary. Do not cite. Abstract This paper examines inference in sieve nonparametric regression allowing for asymptotic

More information

Panel Threshold Regression Models with Endogenous Threshold Variables

Panel Threshold Regression Models with Endogenous Threshold Variables Panel Threshold Regression Models with Endogenous Threshold Variables Chien-Ho Wang National Taipei University Eric S. Lin National Tsing Hua University This Version: June 29, 2010 Abstract This paper

More information

Quick Review on Linear Multiple Regression

Quick Review on Linear Multiple Regression Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,

More information

Statistics 910, #5 1. Regression Methods

Statistics 910, #5 1. Regression Methods Statistics 910, #5 1 Overview Regression Methods 1. Idea: effects of dependence 2. Examples of estimation (in R) 3. Review of regression 4. Comparisons and relative efficiencies Idea Decomposition Well-known

More information

Asymptotic distribution of GMM Estimator

Asymptotic distribution of GMM Estimator Asymptotic distribution of GMM Estimator Eduardo Rossi University of Pavia Econometria finanziaria 2010 Rossi (2010) GMM 2010 1 / 45 Outline 1 Asymptotic Normality of the GMM Estimator 2 Long Run Covariance

More information

Multiple Equation GMM with Common Coefficients: Panel Data

Multiple Equation GMM with Common Coefficients: Panel Data Multiple Equation GMM with Common Coefficients: Panel Data Eric Zivot Winter 2013 Multi-equation GMM with common coefficients Example (panel wage equation) 69 = + 69 + + 69 + 1 80 = + 80 + + 80 + 2 Note:

More information

10. Time series regression and forecasting

10. Time series regression and forecasting 10. Time series regression and forecasting Key feature of this section: Analysis of data on a single entity observed at multiple points in time (time series data) Typical research questions: What is the

More information

A COMPARISON OF HETEROSCEDASTICITY ROBUST STANDARD ERRORS AND NONPARAMETRIC GENERALIZED LEAST SQUARES

A COMPARISON OF HETEROSCEDASTICITY ROBUST STANDARD ERRORS AND NONPARAMETRIC GENERALIZED LEAST SQUARES A COMPARISON OF HETEROSCEDASTICITY ROBUST STANDARD ERRORS AND NONPARAMETRIC GENERALIZED LEAST SQUARES MICHAEL O HARA AND CHRISTOPHER F. PARMETER Abstract. This paper presents a Monte Carlo comparison of

More information

Variable Selection in Predictive Regressions

Variable Selection in Predictive Regressions Variable Selection in Predictive Regressions Alessandro Stringhi Advanced Financial Econometrics III Winter/Spring 2018 Overview This chapter considers linear models for explaining a scalar variable when

More information

Weighted-Average Least Squares Prediction

Weighted-Average Least Squares Prediction Econometric Reviews ISSN: 0747-4938 (Print) 1532-4168 (Online) Journal homepage: http://www.tandfonline.com/loi/lecr20 Weighted-Average Least Squares Prediction Jan R. Magnus, Wendun Wang & Xinyu Zhang

More information

Regression I: Mean Squared Error and Measuring Quality of Fit

Regression I: Mean Squared Error and Measuring Quality of Fit Regression I: Mean Squared Error and Measuring Quality of Fit -Applied Multivariate Analysis- Lecturer: Darren Homrighausen, PhD 1 The Setup Suppose there is a scientific problem we are interested in solving

More information

ISyE 691 Data mining and analytics

ISyE 691 Data mining and analytics ISyE 691 Data mining and analytics Regression Instructor: Prof. Kaibo Liu Department of Industrial and Systems Engineering UW-Madison Email: kliu8@wisc.edu Office: Room 3017 (Mechanical Engineering Building)

More information

Model Selection in the Presence of Incidental Parameters

Model Selection in the Presence of Incidental Parameters Model Selection in the Presence of Incidental Parameters Yoonseok Lee University of Michigan July 2012 Abstract This paper considers model selection of nonlinear panel data models in the presence of incidental

More information

Discrepancy-Based Model Selection Criteria Using Cross Validation

Discrepancy-Based Model Selection Criteria Using Cross Validation 33 Discrepancy-Based Model Selection Criteria Using Cross Validation Joseph E. Cavanaugh, Simon L. Davies, and Andrew A. Neath Department of Biostatistics, The University of Iowa Pfizer Global Research

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Econ 582 Nonparametric Regression

Econ 582 Nonparametric Regression Econ 582 Nonparametric Regression Eric Zivot May 28, 2013 Nonparametric Regression Sofarwehaveonlyconsideredlinearregressionmodels = x 0 β + [ x ]=0 [ x = x] =x 0 β = [ x = x] [ x = x] x = β The assume

More information

Econ 583 Final Exam Fall 2008

Econ 583 Final Exam Fall 2008 Econ 583 Final Exam Fall 2008 Eric Zivot December 11, 2008 Exam is due at 9:00 am in my office on Friday, December 12. 1 Maximum Likelihood Estimation and Asymptotic Theory Let X 1,...,X n be iid random

More information

A better way to bootstrap pairs

A better way to bootstrap pairs A better way to bootstrap pairs Emmanuel Flachaire GREQAM - Université de la Méditerranée CORE - Université Catholique de Louvain April 999 Abstract In this paper we are interested in heteroskedastic regression

More information

Econometrics II - EXAM Outline Solutions All questions have 25pts Answer each question in separate sheets

Econometrics II - EXAM Outline Solutions All questions have 25pts Answer each question in separate sheets Econometrics II - EXAM Outline Solutions All questions hae 5pts Answer each question in separate sheets. Consider the two linear simultaneous equations G with two exogeneous ariables K, y γ + y γ + x δ

More information

Covariance function estimation in Gaussian process regression

Covariance function estimation in Gaussian process regression Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian

More information

The risk of machine learning

The risk of machine learning / 33 The risk of machine learning Alberto Abadie Maximilian Kasy July 27, 27 2 / 33 Two key features of machine learning procedures Regularization / shrinkage: Improve prediction or estimation performance

More information

Linear Regression Models

Linear Regression Models Linear Regression Models Model Description and Model Parameters Modelling is a central theme in these notes. The idea is to develop and continuously improve a library of predictive models for hazards,

More information

Threshold Autoregressions and NonLinear Autoregressions

Threshold Autoregressions and NonLinear Autoregressions Threshold Autoregressions and NonLinear Autoregressions Original Presentation: Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Threshold Regression 1 / 47 Threshold Models

More information

Appendix A: The time series behavior of employment growth

Appendix A: The time series behavior of employment growth Unpublished appendices from The Relationship between Firm Size and Firm Growth in the U.S. Manufacturing Sector Bronwyn H. Hall Journal of Industrial Economics 35 (June 987): 583-606. Appendix A: The time

More information

This chapter reviews properties of regression estimators and test statistics based on

This chapter reviews properties of regression estimators and test statistics based on Chapter 12 COINTEGRATING AND SPURIOUS REGRESSIONS This chapter reviews properties of regression estimators and test statistics based on the estimators when the regressors and regressant are difference

More information

Heteroskedasticity and Autocorrelation Consistent Standard Errors

Heteroskedasticity and Autocorrelation Consistent Standard Errors NBER Summer Institute Minicourse What s New in Econometrics: ime Series Lecture 9 July 6, 008 Heteroskedasticity and Autocorrelation Consistent Standard Errors Lecture 9, July, 008 Outline. What are HAC

More information

ECONOMETRICS MODEL SELECTION: THEORY AND APPLICATIONS. A Dissertation WEI LONG

ECONOMETRICS MODEL SELECTION: THEORY AND APPLICATIONS. A Dissertation WEI LONG ECONOMETRICS MODEL SELECTION: THEORY AND APPLICATIONS A Dissertation by WEI LONG Submitted to the Office of Graduate and Professional Studies of Texas A&M University in partial fulfillment of the requirements

More information

On Uniform Asymptotic Risk of Averaging GMM Estimators

On Uniform Asymptotic Risk of Averaging GMM Estimators On Uniform Asymptotic Risk of Averaging GMM Estimators Xu Cheng Zhipeng Liao Ruoyao Shi This Version: August, 28 Abstract This paper studies the averaging GMM estimator that combines a conservative GMM

More information

Estimation of Time-invariant Effects in Static Panel Data Models

Estimation of Time-invariant Effects in Static Panel Data Models Estimation of Time-invariant Effects in Static Panel Data Models M. Hashem Pesaran University of Southern California, and Trinity College, Cambridge Qiankun Zhou University of Southern California September

More information

Instrumental Variables Estimation and Weak-Identification-Robust. Inference Based on a Conditional Quantile Restriction

Instrumental Variables Estimation and Weak-Identification-Robust. Inference Based on a Conditional Quantile Restriction Instrumental Variables Estimation and Weak-Identification-Robust Inference Based on a Conditional Quantile Restriction Vadim Marmer Department of Economics University of British Columbia vadim.marmer@gmail.com

More information

Asymmetric least squares estimation and testing

Asymmetric least squares estimation and testing Asymmetric least squares estimation and testing Whitney Newey and James Powell Princeton University and University of Wisconsin-Madison January 27, 2012 Outline ALS estimators Large sample properties Asymptotic

More information

An Encompassing Test for Non-Nested Quantile Regression Models

An Encompassing Test for Non-Nested Quantile Regression Models An Encompassing Test for Non-Nested Quantile Regression Models Chung-Ming Kuan Department of Finance National Taiwan University Hsin-Yi Lin Department of Economics National Chengchi University Abstract

More information

Single Equation Linear GMM with Serially Correlated Moment Conditions

Single Equation Linear GMM with Serially Correlated Moment Conditions Single Equation Linear GMM with Serially Correlated Moment Conditions Eric Zivot November 2, 2011 Univariate Time Series Let {y t } be an ergodic-stationary time series with E[y t ]=μ and var(y t )

More information

Department of Economics Seminar Series. Yoonseok Lee University of Michigan. Model Selection in the Presence of Incidental Parameters

Department of Economics Seminar Series. Yoonseok Lee University of Michigan. Model Selection in the Presence of Incidental Parameters Department of Economics Seminar Series Yoonseok Lee University of Michigan Model Selection in the Presence of Incidental Parameters Friday, January 25, 2013 9:30 a.m. S110 Memorial Union Model Selection

More information