JACKKNIFE MODEL AVERAGING


BRUCE E. HANSEN AND JEFFREY S. RACINE

Abstract. We consider the problem of obtaining appropriate weights for averaging $M$ approximate (misspecified) models for improved estimation of an unknown conditional mean in the face of model uncertainty in heteroskedastic error settings. We propose a jackknife model averaging (JMA) estimator which selects the weights by minimizing a cross-validation criterion. In models that are linear in the parameters, this criterion is quadratic in the weights, so computation of the weights is a simple application of quadratic programming. We show that our estimator is asymptotically optimal in the sense of achieving the lowest possible expected squared error. Monte Carlo simulations and an illustrative application show that JMA can achieve significant efficiency gains over existing model selection and averaging methods in the presence of heteroskedasticity.

1. Introduction

Confronting parametric model uncertainty has a rich heritage in the statistics and econometrics literature. The two most popular approaches to dealing with model uncertainty are model selection and model averaging, and both Bayesian and non-Bayesian approaches have been proposed.

Model selection remains perhaps the most popular approach for dealing with model uncertainty. This method has the user first adopt an estimation criterion such as the Akaike information criterion (AIC; Akaike (1970)) and then select, from among a set of candidate models, the one which scores most highly on this criterion. Unfortunately, a variety of such criteria have been proposed in the literature, different criteria favor different models from within a given set of candidate models, and seasoned practitioners know that some criteria will favor more parsimonious models (e.g., the Schwarz-Bayes information criterion (BIC); Schwarz (1978)) while others will favor more heavily parameterized models (e.g., AIC).

Model averaging, on the other hand, deals with model uncertainty not by having the user select one model from among a set of candidate models according to a criterion such as AIC or BIC, but rather by averaging over the set of candidate models in a particular manner. Many readers will no doubt be familiar with the Bayesian model averaging literature, and we direct the interested reader to Hoeting, Madigan, Raftery & Volinsky (1999) for a comprehensive review of this literature. There also exist frequentist methods for model averaging. See, for instance, Buckland, Burnham & Augustin (1997), who suggested a method based upon exponential weights, and Hansen (2007), who suggested a Mallows model averaging (MMA) method based on a Mallows criterion.

Date: July 13,

Hansen would like to gratefully acknowledge support from the National Science Foundation. Racine would like to gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada (NSERC), the Social Sciences and Humanities Research Council of Canada (SSHRC), and the Shared Hierarchical Academic Research Computing Network (SHARCNET).

Existing frequentist approaches, however, exclude heteroskedasticity, which limits their applicability.

In this paper we propose a new frequentist model averaging method, which we term jackknife model averaging (hereafter JMA), that selects the weights by minimizing a cross-validation criterion. In models that are linear in the parameters the cross-validation criterion is a simple quadratic function of the weights, so the solution is found through a standard application of numerical quadratic programming. Applying the theory developed in Li (1987), Andrews (1991) and Hansen (2007), we provide an asymptotic optimality justification for the JMA estimator which permits heteroskedasticity of unknown form, unlike existing model averaging methods. We show that the JMA estimator is asymptotically optimal in the sense of achieving the lowest possible expected squared error over the class of linear estimators constructed from a countable set of weights. The class of linear estimators includes, but is not limited to, linear least squares, ridge regression, Nadaraya-Watson and local polynomial kernel regression with fixed bandwidths, nearest neighbor estimators, series estimators, estimators of additive interaction models, and spline estimators. In particular, our analysis of nested linear models shows that JMA achieves asymptotically efficient estimation quite generally.

We demonstrate the potential efficiency gains through a simple Monte Carlo experiment. In a classic regression setting, we show that JMA achieves lower mean squared error than competing methods. In the presence of homoskedastic errors, JMA and MMA are nearly equivalent, but when the errors are heteroskedastic, JMA has significantly lower MSE. We also illustrate the method with an empirical application to cross-section earnings forecasting. We find that JMA achieves out-of-sample forecast squared error which is either equivalent to or lower than that achieved by all other methods considered.

The remainder of the paper is organized as follows. Section 2 presents the jackknife weighting method. Section 3 provides an asymptotic optimality theory under high-level conditions for a broad class of linear estimators. Section 4 presents the optimality theory for nested linear regression models. Section 5 reports the Monte Carlo simulation experiment, and Section 6 the application to forecasting earnings. Section 7 concludes, and the mathematical proofs are presented in the appendix.

2. Jackknife Weighting

Consider a sample of independent observations that satisfy $y_i = \mu_i + e_i$ for $i = 1, \ldots, n$, where $\mu_i$ is the unobserved mean of $y_i$, which depends on the regressor vector $x_i$. The error $e_i$ is mean zero with heteroskedastic variance $\sigma_i^2$. Define the vectors $y = (y_1, \ldots, y_n)'$, $\mu = (\mu_1, \ldots, \mu_n)'$ and $e = (e_1, \ldots, e_n)'$.

Suppose that to estimate $\mu$ there is a set of $M$ estimators $\{\hat{\mu}^1, \ldots, \hat{\mu}^M\}$, where $M$ may depend on $n$. For example, in the case of least-squares estimation, $\hat{\mu}^m = P^m y$, where $P^m = X^m (X^{m\prime} X^m)^{-1} X^{m\prime}$ and the $i$th row of $X^m$ is $x_i^m$, a function of $x_i$.

Let $w_m$ be the weight assigned to estimator $\hat{\mu}^m$ and define the weight vector $w = (w_1, \ldots, w_M)'$. Typically we will restrict the weights to be non-negative, no greater than one, and summing to one, in which case $w$ lies on the $\mathbb{R}^M$ unit simplex

(1) $H_n = \left\{ w \in [0,1]^M : \sum_{m=1}^M w_m = 1 \right\}$.

An averaging estimator for $\mu$ takes the form

$\hat{\mu}(w) = \sum_{m=1}^M w_m \hat{\mu}^m$.

Note that we can write $\hat{\mu}(w) = \hat{\mu} w$, where $\hat{\mu} = (\hat{\mu}^1, \ldots, \hat{\mu}^M)$ is the $n \times M$ matrix of estimates.

In this paper we consider jackknife selection of $w$ (also known as leave-one-out cross-validation). The jackknife estimate of the mean is

$\tilde{\mu}(w) = \sum_{m=1}^M w_m \tilde{\mu}^m$,

where $\tilde{\mu}^m = (\tilde{\mu}^m_1, \ldots, \tilde{\mu}^m_n)'$ and $\tilde{\mu}^m_i$ is a version of $\hat{\mu}^m_i$ computed with the $i$th observation deleted. In the least-squares example discussed above, $\tilde{\mu}^m_i = x^{m\prime}_i \bigl( X^{m\prime}_{(-i)} X^m_{(-i)} \bigr)^{-1} X^{m\prime}_{(-i)} y_{(-i)}$, where $X^m_{(-i)}$ and $y_{(-i)}$ denote the matrix $X^m$ and vector $y$ with the $i$th row deleted. We can alternatively write $\tilde{\mu}(w) = \tilde{\mu} w$, where $\tilde{\mu} = (\tilde{\mu}^1, \ldots, \tilde{\mu}^M)$. Thus the jackknife residual vector is $\tilde{e}(w) = y - \tilde{\mu}(w) = \tilde{e} w$, where $\tilde{e} = (\tilde{e}^1, \ldots, \tilde{e}^M)$ and $\tilde{e}^m = y - \tilde{\mu}^m$.

This estimator is particularly convenient in the case of linear regression estimates, for it is well known that $\tilde{e}^m = D^m \hat{e}^m$, where $\hat{e}^m = y - P^m y$ is the least-squares residual vector and $D^m$ is the $n \times n$ diagonal matrix whose $i$th diagonal element equals $(1 - h^m_{ii})^{-1}$, $h^m_{ii}$ being the $i$th diagonal element of $P^m$ (see, for example, equation (1.4) of Li (1987), or the generalization by Racine (1997)). Using this representation, the jackknife residual vector $\tilde{e}^m$ can be computed with a simple linear operation and does not require $n$ separate regressions.

The jackknife estimate of expected apparent error¹ is

$E_n(w) = \frac{1}{n} \|\tilde{e}(w)\|^2 = w' S_n w$,

where $\|\cdot\|$ denotes the Euclidean norm and $S_n = \frac{1}{n}\tilde{e}'\tilde{e}$ is $M \times M$. The jackknife (or cross-validation) choice of $w$ is the value which minimizes $E_n(w)$:

(2) $\hat{w} = \operatorname{argmin}_{w \in H_n} E_n(w)$.

The JMA estimator of $\mu$ is $\hat{\mu}(\hat{w})$.

Even though $E_n(w)$ is a quadratic function of $w$, the solution to (2) is not typically available in closed form due to the inequality constraints on $w$. The minimization problem (2) falls in the class of quadratic programming, for which numerical solutions have been thoroughly studied and algorithms are widely available. For example, in the R language (R Development Core Team (2007)) it is solved using the quadprog package, in GAUSS using the qprog command, and in MATLAB using the quadprog command. Even when $M$ is quite large, the solution to (2) is computationally fast using any of these packages.

¹For a detailed overview of expected apparent and excess error, we direct the reader to Efron (1982, Chapter 7).
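To fix ideas, here is a minimal R sketch of the weight computation; it is an illustration, not the authors' implementation. It simulates a small hypothetical data set, forms the leave-one-out residuals for each of a set of nested polynomial models via the shortcut $\tilde{e}^m = D^m \hat{e}^m$, and solves (2) with the quadprog package. The data-generating process and model set are illustrative assumptions of ours.

    # Minimal JMA sketch (illustrative data and models, not the authors' code).
    library(quadprog)

    set.seed(1)
    n <- 100
    x <- rnorm(n)
    y <- 1 + x - 0.5 * x^2 + rnorm(n, sd = abs(x))  # heteroskedastic errors

    M <- 4                               # nested polynomial candidate models
    etilde <- matrix(0, n, M)            # jackknife (leave-one-out) residuals
    for (m in 1:M) {
      X <- outer(x, 0:m, `^`)            # model m: polynomial of degree m
      H <- X %*% solve(crossprod(X), t(X))   # projection matrix P^m
      ehat <- y - H %*% y                    # least-squares residuals
      etilde[, m] <- ehat / (1 - diag(H))    # D^m shortcut: no n extra fits
    }
    Sn <- crossprod(etilde) / n          # S_n = (1/n) etilde' etilde

    # Solve (2): min_w w'S_n w subject to sum(w) = 1, w >= 0. solve.QP
    # minimizes (1/2) b'D b - d'b subject to A'b >= b0, with the first meq
    # constraints treated as equalities, so set D = 2*S_n and d = 0.
    # (A small ridge could be added to S_n if it is numerically singular.)
    qp <- solve.QP(Dmat = 2 * Sn, dvec = rep(0, M),
                   Amat = cbind(1, diag(M)), bvec = c(1, rep(0, M)), meq = 1)
    w_hat <- qp$solution                 # estimated simplex weights

The fitted JMA estimate is then the corresponding combination of the full-sample fits, $\hat{\mu}(\hat{w}) = \sum_{m} \hat{w}_m \hat{\mu}^m$.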

Algebraically, (2) is a constrained least-squares problem. The vector $\hat{w}$ is the $M \times 1$ coefficient vector obtained by the constrained regression of $y$ on the $n \times M$ matrix $\tilde{\mu}$. We can write this regression problem as $y = \tilde{\mu}\hat{w} + \tilde{e}$. The jackknife weight vector $\hat{w}$ gives the linear combination of the different estimators which yields the lowest leave-one-out squared error. This is a generalization of jackknife model selection, which picks the individual model which minimizes the leave-one-out squared error.

Also, it is useful to note that the unrestricted solution to (2) is simply $\hat{w} = (\tilde{\mu}'\tilde{\mu})^{-1}\tilde{\mu}' y$. These are the jackknife weights obtained if the restrictions (1) are not imposed. While it may appear intuitive to use these unrestricted weights, the optimality theory suggests otherwise, and we therefore recommend that the weights always be restricted to the unit simplex (1).

3. Asymptotic Optimality

Define the expected squared error

(3) $R_n(w) = E\, \frac{1}{n} \|\mu - \hat{\mu}(w)\|^2$.

We seek conditions under which the jackknife-selected weight vector $\hat{w}$ is optimal in the sense of making $R_n(w)$ as small as possible. Ideally, we wish to show that $R_n(\hat{w}) / \inf_{w \in H_n} R_n(w) \to_p 1$. For technical reasons we follow Hansen (2007) and establish a more limited result by replacing $H_n$ with a countably discrete subset $H_n^*$. Let $\hat{w}$ be the weight vector selected under this restriction:

$\hat{w} = \operatorname{argmin}_{w \in H_n^*} E_n(w)$.

Our optimality theory also restricts attention to linear estimators, which take the form $\hat{\mu}^m = P^m y$ where $P^m$ is not a function of $y$. As mentioned in the introduction, this class of estimators includes linear least squares, ridge regression, Nadaraya-Watson and local polynomial kernel regression with fixed bandwidths, nearest neighbor estimators, series estimators, estimators of additive interaction models, and spline estimators. When the class of estimators is linear, the averaging estimator is also linear. That is, we can write $\hat{\mu}(w) = P(w) y$, where

(4) $P(w) = \sum_{m=1}^M w_m P^m$

is a linear operator indexed by $w$.

Similarly, the jackknife estimates take the linear form $\tilde{\mu}^m = \tilde{P}^m y$ and $\tilde{\mu}(w) = \tilde{P}(w) y$, where $\tilde{P}(w) = \sum_{m=1}^M w_m \tilde{P}^m$, and the matrices $\tilde{P}^m$ and $\tilde{P}(w)$ have zeros on the diagonal.

In this section we establish optimality for the JMA estimator under a set of high-level conditions. Define the jackknife average squared error

$\tilde{R}_n(w) = E\, \frac{1}{n} \|\mu - \tilde{\mu}(w)\|^2$.

Let $\lambda(A)$ denote the largest eigenvalue of the matrix $A$. For some integer $N \ge 1$, assume the following conditions hold:

(A.1) $0 < \inf_i \sigma_i^2 \le \sup_i \sigma_i^2 < \infty$,
(A.2) $\sup_i E\, e_i^{4N} < \infty$,
(A.3) $\limsup_{n \to \infty} \sup_{w \in H_n} \lambda(P(w)) < \infty$,
(A.4) $\sum_{w \in H_n^*} (n R_n(w))^{-N} \to 0$,
(A.5) $\limsup_{n \to \infty} \sup_{w \in H_n} \lambda(\tilde{P}(w)) < \infty$,
(A.6) $\sup_{w \in H_n} \left| \tilde{R}_n(w)/R_n(w) - 1 \right| \to 0$.

Theorem 1. If (A.1)-(A.6) hold, then $\hat{w}$ is asymptotically optimal in the sense that

(5) $R_n(\hat{w}) \big/ \inf_{w \in H_n^*} R_n(w) \to_p 1$.

Condition (A.1) excludes unbounded heteroskedasticity and (A.2) is a moment bound. Condition (A.3) is quite mild; typically $\lambda(P(w)) \le 1$. To explain condition (A.4), note that in most applications the expected squared error $R_n(w)$ is of order $n^{-1+\delta}$ for some $0 \le \delta < 1$. If the cardinality of $H_n^*$ is of polynomial order and $\delta > 0$, then (A.4) follows for sufficiently large $N$. Condition (A.5) is an analogue of (A.3) for the leave-one-out estimator. Condition (A.6) says that as $n$ gets large, the fits of the regular and leave-one-out estimators become equivalent, uniformly over the class of averaging estimators. This excludes the possibility of extremely unbalanced designs.

Theorem 1 is an application of results in Li (1987) and Andrews (1991). The theorem shows that the average squared error of the jackknife estimator is asymptotically as small as that of the infeasible best possible choice in the index set. It is similar to Theorem 5.1 of Li (1987) and Theorem 4.1* of Andrews (1991), which also give high-level conditions for the asymptotic optimality of cross-validation, but makes two different assumptions. One difference is that our condition (A.6) is somewhat stronger than theirs, as (A.6) specifies uniform convergence over the index set $H_n$, while Li and Andrews only require such convergence for sequences $w_n$ for which $R_n(w_n) \to 0$ or $\tilde{R}_n(w_n) \to 0$. This distinction does not appear to exclude relevant applications, however. The other difference is that Li and Andrews assume that $\inf_{w \in H_n} R_n(w) \to 0$, while this assumption is not required in our Theorem 1. Their assumption means that as $n \to \infty$ the optimal expected squared error declines to zero. In practice this means that the index set $H_n$ needs to expand with $n$ in a way such that the estimate $P(w) y$ gets close to the unknown mean $\mu$. Our Theorem 1 does not require this extra assumption.
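The role of the zero-diagonal property of $\tilde{P}(w)$, which the appendix exploits, can be made explicit with a short calculation; this is a standard cross-validation identity, spelled out here for convenience. Since $y - \tilde{\mu}(w) = (\mu - \tilde{\mu}(w)) + e$,

$E_n(w) = \frac{1}{n}\|\mu - \tilde{\mu}(w)\|^2 + \frac{2}{n}\, e'\bigl(\mu - \tilde{\mu}(w)\bigr) + \frac{1}{n}\|e\|^2$.

Because $E\, e = 0$ and $E\, e'\tilde{P}(w) e = \operatorname{tr}(\tilde{P}(w)\Omega) = 0$, where $\Omega = \operatorname{diag}\{\sigma_1^2, \ldots, \sigma_n^2\}$ (the diagonal of $\tilde{P}(w)$ is zero), the middle term has mean zero, so $E\, E_n(w) = \tilde{R}_n(w) + n^{-1}\sum_{i=1}^n \sigma_i^2$. That is, the jackknife criterion is an unbiased estimate of $\tilde{R}_n(w)$ up to a constant that does not depend on $w$, which is why minimizing $E_n(w)$ targets $\tilde{R}_n(w)$, and, via (A.6), targets $R_n(w)$.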

4. Nested Linear Models

For the remainder of the paper we focus attention on nested linear regression estimates. These are estimators which take the form $\hat{\mu}^m = P^m y$, where $P^m = X^m (X^{m\prime} X^m)^{-1} X^{m\prime}$ and $X^m$ is an $n \times k_m$ matrix of regressors. Since the models are nested, $k_m$ is an increasing sequence and $\operatorname{span}(X^m) \subset \operatorname{span}(X^{m+1})$.

To apply Theorem 1 we need to construct a countably discrete subset $H_n^*$ of $H_n$. For $N$ as defined in conditions (A.2) and (A.4), let the weights $w_m$ be restricted to the set $\{0, \frac{1}{N}, \frac{2}{N}, \ldots, 1\}$. Let $H_n^*$ be the subset of $H_n$ restricted to this set of weights; a sketch enumerating this set appears at the end of this section. Let $h^m_{ii}$ denote the $i$th diagonal element of $P^m$.

Theorem 2. Suppose that conditions (A.1) and (A.2) hold,

(A.7) $\max_{1 \le m \le M} \max_{1 \le i \le n} h^m_{ii} \to 0$,

and

(A.8) $\xi_n = \inf_{w \in H_n} n R_n(w) \to \infty$

as $n \to \infty$. Then conditions (A.3)-(A.6) hold, and hence $\hat{w}$ is asymptotically optimal in the sense of (5).

Condition (A.7) requires that the self-weights $h^m_{ii}$ be asymptotically negligible for all models considered. This is a reasonable restriction, as it excludes only extremely unbalanced designs. Instead of (A.7), Li (1987) and Andrews (1991) assume the uniform bound $h^m_{ii} \le \Lambda k_m / n$ for some $\Lambda < \infty$. If $k_M / n \to 0$ this implies (A.7), but in general neither condition is strictly stronger than the other. Condition (A.8) is identical to the condition in Hansen (2007) for Mallows weight selection, and is quite similar to the conditions of Li (1987) and Andrews (1991). It requires that all finite-dimensional models are approximations.

Theorem 2 is our main result. It shows that under quite minimal primitive conditions, the JMA estimator is asymptotically optimal in the class of weighted averages of linear estimators. The most significant limitation is the restriction to the countable weight set $H_n^*$. However, as $N$ can be selected to be arbitrarily large, this restriction need not be viewed as substantive. In any event, we view this limitation as a technical artifact of the proof technique. For empirical practice we ignore the restriction to $H_n^*$ and use the unrestricted weights $\hat{w}$ as defined in (2).
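For concreteness, the discrete set $H_n^*$ can be enumerated by listing all nonnegative integer vectors $(k_1, \ldots, k_M)$ with $k_1 + \cdots + k_M = N$ and setting $w = k/N$. The following R snippet does this for the illustrative (hypothetical) values $N = 4$ and $M = 3$:

    # Enumerate H_n*: all w with entries in {0, 1/N, ..., 1} summing to 1.
    compositions <- function(N, M) {
      if (M == 1) return(matrix(N, 1, 1))
      do.call(rbind,
              lapply(0:N, function(k) cbind(k, compositions(N - k, M - 1))))
    }
    H_star <- compositions(4, 3) / 4   # N = 4 grid, M = 3 models
    nrow(H_star)                       # choose(4 + 3 - 1, 3 - 1) = 15 points

The cardinality $\binom{N+M-1}{M-1}$ is polynomial in $M$ for fixed $N$, which is the sense in which $H_n^*$ is a countable (indeed finite) index set of polynomial order.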

5. Finite-Sample Performance

In this section we investigate the finite-sample mean squared error of the JMA estimator via a modest Monte Carlo experiment. We follow Hansen (2007) and consider an infinite-order regression model of the form $y_i = \sum_{j=1}^{\infty} \theta_j x_{ji} + \epsilon_i$, $i = 1, \ldots, n$. We set $x_{1i} = 1$ to be the intercept, while the remaining $x_{ji}$ are independent and identically distributed $N(0,1)$ random variates. The error $\epsilon_i$ is also normal, with variance $\sigma_i^2$, and is independent of the $x_{ji}$. For the homoskedastic simulation we set $\sigma_i = 1$ for all $i$, while for the heteroskedastic simulation we set $\sigma_i = x_{2i}^2$, which has a mean value of 1. (Other specifications for the heteroskedasticity were considered and produced similar qualitative results.) The parameters are determined by the rule $\theta_j = c \sqrt{2\alpha}\, j^{-\alpha - 1/2}$; the population $R^2 = c^2/(1 + c^2)$ is controlled by the parameter $c$. The sample size is varied among $n = 25$, 50, 75, and 100. The parameter $\alpha$ is varied among 1/2, 1, and 3/2;² larger values of $\alpha$ imply that the coefficients $\theta_j$ decline more quickly with $j$. The number of models $M$ is determined by the rule $M = 3 n^{1/3}$ (so $M = 9$, 11, 13, and 14 for the four sample sizes considered herein). The coefficient $c$ was selected so that the population $R^2$ varies on a grid between 0.1 and 0.8. A short simulation sketch of this design appears at the end of this section.

We consider five estimators: (1) AIC model selection, (2) BIC model selection, (3) Mallows $C_p$ model selection, (4) jackknife model averaging (JMA), and (5) Mallows model averaging (MMA; Hansen (2007)). To evaluate the estimators, we compute the risk (expected squared error) by averaging across 10,000 simulation draws. For each parameterization, we normalize the risk by dividing by the risk of the infeasible optimal least-squares estimator (the risk of the best-fitting model $m$, averaged over all replications).

5.1. Homoskedastic Error Processes. We first compare risk under homoskedastic errors. The risk calculations are summarized in Figure 1. The four panels display results for the four sample sizes. In each panel, risk (expected squared error) is displayed on the y axis and the population $R^2$ on the x axis. It can be seen from Figure 1 that for the homoskedastic data-generating process, the proposed JMA method dominates its peers for all sample sizes and ranges of the population $R^2$ considered, though there is only a small gain relative to the MMA method, which performs very well overall and becomes indistinguishable from JMA as $n$ increases beyond some modest level.

5.2. Heteroskedastic Error Processes. Next, we compare risk under heteroskedastic errors; the risk calculations are summarized in Figure 2. It can be seen from Figure 2 that for the heteroskedastic data-generating process considered herein, the proposed JMA method dominates its peers for all sample sizes and ranges of goodness of fit considered. Furthermore, the gain over the MMA approach is substantially larger than for the homoskedastic case summarized in Figure 1, especially for small values of $R^2$.

²We report results for $\alpha = 1/2$ only, for space considerations. All results are available on request from the authors.
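As a concrete illustration, the following R sketch draws one sample from the heteroskedastic design just described; truncating the infinite sum at $J = 500$ terms, like the other parameter values shown, is our simplification for simulation purposes:

    # One draw from the Section 5 design (a sketch; the truncation J = 500
    # and the values of n, alpha, and c below are illustrative choices).
    set.seed(42)
    n <- 100; alpha <- 1/2; c0 <- 1         # population R^2 = c0^2/(1+c0^2)
    J <- 500
    theta <- c0 * sqrt(2 * alpha) * (1:J)^(-alpha - 1/2)
    X <- cbind(1, matrix(rnorm(n * (J - 1)), n, J - 1))  # x_{1i} = 1
    sigma <- X[, 2]^2                       # heteroskedastic: sigma_i = x_{2i}^2
    y <- drop(X %*% theta) + rnorm(n, sd = sigma)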

[Figure 1 about here: finite-sample performance, homoskedastic errors, $\alpha = 1/2$, $\sigma_i = 1$; one panel for each of $n = 25$, 50, 75, and 100.]

6. Forecasting Earnings

We employ Wooldridge's (2003, pg. 226) popular wage1 dataset, a cross-section consisting of a random sample taken from the U.S. Current Population Survey for the year 1976. There are 526 observations in total. The dependent variable is the log of average hourly earnings (lwage), while the explanatory variables include the dummy variables nonwhite, female, married, numdep, smsa, northcen, south, west, construc, ndurman, trcommpu, trade, services, profserv, profocc, clerocc, and servocc, the non-dummy variables educ, exper, and tenure, and the interaction variables nonwhite×educ, nonwhite×exper,

nonwhite×tenure, female×educ, female×exper, female×tenure, married×educ, married×exper, and married×tenure.³

[Figure 2 about here: finite-sample performance, heteroskedastic errors, $\alpha = 1/2$, $\sigma_i = x_{2i}^2$; one panel for each of $n = 25$, 50, 75, and 100.]

We presume there exists uncertainty about the appropriate model but need to forecast earnings for a set of hold-out data. We consider two cases: (i) a set of thirty models ranging from the unconditional mean (k = 1) through a model that includes all of the variables listed above (k = 30), and (ii) a set of eleven models that use the dummy variables female and married and the non-dummy variables educ, exper, and tenure, ranging from simple bivariate models having each non-dummy regressor as covariate (k = 2) through a model that contains third-order polynomials in all non-dummy regressors and allows for interactions among all variables. A sketch of how such a dataset might be loaded appears below.

³See Wooldridge (2003) for a full description of the data.
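To make the data setup concrete, a minimal R sketch follows; it assumes the wage1 data are available through the third-party wooldridge package, which is our assumption rather than the authors' stated workflow:

    # Load wage1 and build one illustrative candidate design for case (ii).
    library(wooldridge)   # assumption: CRAN package providing wage1
    data("wage1")
    y <- wage1$lwage      # log average hourly earnings, 526 observations
    X_small <- model.matrix(~ educ + exper + tenure + female + married,
                            data = wage1)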

Next, we shuffle the sample into a training set of size $n_1$ and an evaluation set of size $n_2 = n - n_1$, select models via model selection based on AIC, BIC, and $C_p$, and then consider model-average-based models using JMA and MMA. Finally, we evaluate the models on the independent hold-out data by computing their average squared prediction error (ASPE). We repeat this procedure 100 times and report the median ASPE over the 100 splits. We vary $n_1$, considering $n_1 = 100$, 200, 300, 400, and 500. Tables 1 and 2 report the median ASPE for this experiment, expressed relative to the JMA method; entries greater than one indicate inferior performance relative to JMA. A sketch of this evaluation loop appears at the end of this section.

[Table 1 about here: case (i), relative out-of-sample predictive efficiency of the model-selection and MMA methods by training-set size $n_1$; entries greater than one indicate inferior performance relative to the JMA method.]

[Table 2 about here: case (ii), relative out-of-sample predictive efficiency, in the same format as Table 1.]

Table 1 reveals that the proposed JMA method delivers models that are no worse than existing model-selection-based or model-average-based models across the range of sample sizes considered, often delivering an improved model. The MMA method works very well for case (i), yielding models comparable to those selected by JMA. However, this is not the situation for case (ii), as Table 2 reveals. It would appear that the MMA method is somewhat sensitive to the estimate of $\sigma^2$ needed for its computation, and relying on a large approximating model (Hansen (2007)) may not be sufficient to deliver optimal results. This issue deserves further study.

This application suggests that the proposed JMA method could be of interest to those who forecast using model-selection criteria. These results, particularly in light of the simulation evidence, suggest that the JMA method could be recommended for general use, though it would of course be prudent to base this recommendation on more extensive simulation evidence than that outlined in Section 5.
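The split-sample exercise is straightforward to mimic. The sketch below is ours, not the authors' code: it uses an illustrative set of four nested models (not the paper's model sets), computes JMA weights on each training sample via the hat-diagonal shortcut and quadprog, and reports the median ASPE over 100 random splits.

    # Median JMA ASPE over random splits (a sketch with illustrative models;
    # assumes the 'quadprog' and 'wooldridge' packages).
    library(quadprog); library(wooldridge)
    data("wage1")

    aspe_split <- function(n1) {
      idx   <- sample(nrow(wage1), n1)
      train <- wage1[idx, ]; test <- wage1[-idx, ]
      forms <- list(lwage ~ educ,
                    lwage ~ educ + exper,
                    lwage ~ educ + exper + tenure,
                    lwage ~ poly(educ, 2) + poly(exper, 2) + tenure)
      M    <- length(forms)
      fits <- lapply(forms, lm, data = train)
      # leave-one-out residuals via etilde = ehat / (1 - h_ii)
      etilde <- sapply(fits, function(f) residuals(f) / (1 - hatvalues(f)))
      Sn <- crossprod(etilde) / n1
      qp <- solve.QP(2 * Sn, rep(0, M), cbind(1, diag(M)),
                     c(1, rep(0, M)), meq = 1)
      preds <- sapply(fits, predict, newdata = test)   # n2 x M fitted values
      mean((test$lwage - preds %*% qp$solution)^2)     # JMA ASPE on hold-out
    }

    set.seed(7)
    median(replicate(100, aspe_split(300)))            # e.g. n1 = 300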

7. Conclusion

We propose a frequentist method for model averaging that does not preclude heteroskedastic settings, unlike many of its peers. The method is computationally tractable, is asymptotically optimal, and exhibits finite-sample performance better than that of its peers. An application to forecasting wages demonstrates performance no worse than the other model selection and averaging methods considered by way of comparison, and often better, over the range of sample sizes investigated.

References

Akaike, H. (1970), Statistical predictor identification, Annals of the Institute of Statistical Mathematics 22.

Andrews, D. W. K. (1991), Asymptotic optimality of generalized $C_L$, cross-validation, and generalized cross-validation in regression with heteroskedastic errors, Journal of Econometrics 47.

Buckland, S. T., Burnham, K. P. & Augustin, N. H. (1997), Model selection: An integral part of inference, Biometrics 53.

Efron, B. (1982), The Jackknife, the Bootstrap, and Other Resampling Plans, Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania.

Hansen, B. E. (2007), Least squares model averaging, Econometrica 75.

Hoeting, J. A., Madigan, D., Raftery, A. E. & Volinsky, C. T. (1999), Bayesian model averaging: A tutorial, Statistical Science 14.

Li, K.-C. (1987), Asymptotic optimality for $C_p$, $C_L$, cross-validation and generalized cross-validation: Discrete index set, The Annals of Statistics 15.

R Development Core Team (2007), R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria.

Racine, J. (1997), Feasible cross-validatory model selection for general stationary processes, Journal of Applied Econometrics 12.

Schwarz, G. (1978), Estimating the dimension of a model, The Annals of Statistics 6.

Wooldridge, J. M. (2003), Introductory Econometrics, Thomson South-Western.

Appendix A. Mathematical Proofs

Proof of Theorem 1. Observe that

(6) $\sum_{w \in H_n^*} \bigl( n \tilde{R}_n(w) \bigr)^{-N} = \sum_{w \in H_n^*} \bigl( n R_n(w) \bigr)^{-N} \left( \frac{R_n(w)}{\tilde{R}_n(w)} \right)^N \le \left( \sum_{w \in H_n^*} \bigl( n R_n(w) \bigr)^{-N} \right) \sup_{w \in H_n^*} \left( \frac{R_n(w)}{\tilde{R}_n(w)} \right)^N \to 0$

by (A.4) and (A.6). Set $\Omega = \operatorname{diag}\{\sigma_1^2, \ldots, \sigma_n^2\}$ and note that $E\, e'\tilde{P}(w) e = \operatorname{tr}(\tilde{P}(w)\Omega) = 0$, since the diagonal elements of $\tilde{P}(w)$ are zero. Then, by the argument given in the proof of Theorem 2.1 of Li (1987), as extended by Andrews (1991) to the heteroskedastic case, conditions (A.1), (A.2), (A.5) and (6) are sufficient to imply that

$\tilde{R}_n(\hat{w}) \big/ \inf_{w \in H_n^*} \tilde{R}_n(w) \to_p 1$.

Combined with condition (A.6) we obtain (5), as desired. ∎

Proof of Theorem 2. Hansen (2007, Lemma 1) shows that (A.3) holds and, in the proof of his Theorem 1, shows that (A.4) holds under homoskedasticity, (A.2) and (A.8). However, if $\sigma^2$ in that proof is replaced with $\inf_i \sigma_i^2$, the argument is unchanged and (A.4) is established.

Now, as shown in equation (1.4) of Li (1987),

(7) $\tilde{P}^m = D^m (P^m - I) + I$,

where $D^m$ is the $n \times n$ diagonal matrix with $i$th diagonal element equal to $(1 - h^m_{ii})^{-1}$. Thus

$\tilde{P}(w) = \sum_{m=1}^M w_m \tilde{P}^m = \sum_{m=1}^M w_m \bigl( D^m (P^m - I) + I \bigr)$.

Let $\bar{\lambda}_m$ and $\underline{\lambda}_m$ denote the largest and smallest diagonal elements of $D^m$. Condition (A.7) implies that

(8) $\bar{\lambda}_m = 1 + o(1) = \underline{\lambda}_m$

uniformly in $m \le M$. Hence

$\lambda\bigl( \tilde{P}(w) \bigr) \le 1 + \sum_{m=1}^M w_m \lambda(D^m) = 2\,(1 + o(1)),$

and thus (A.5) holds.

It remains to establish (A.6). We calculate that

$n R_n(w) = \mu' (I - P(w))' (I - P(w)) \mu + \operatorname{tr}\bigl( P(w)' P(w) \Omega \bigr) = \sum_{m=1}^M \sum_{l=1}^M w_m w_l \bigl[ \mu' (I - P^m)(I - P^l) \mu + \operatorname{tr}( P^m P^l \Omega ) \bigr]$,

where $\Omega = \operatorname{diag}\{\sigma_1^2, \ldots, \sigma_n^2\}$. The first equality is from Andrews (1991) and the second uses (4). Similarly,

$n \tilde{R}_n(w) = \sum_{m=1}^M \sum_{l=1}^M w_m w_l \bigl[ \mu' (I - \tilde{P}^m)' (I - \tilde{P}^l) \mu + \operatorname{tr}( \tilde{P}^{m\prime} \tilde{P}^l \Omega ) \bigr]$.

It is sufficient to establish that

(9) $\operatorname{tr}( \tilde{P}^{m\prime} \tilde{P}^l \Omega ) = \operatorname{tr}( P^m P^l \Omega ) (1 + o(1))$

and

(10) $\mu' (I - \tilde{P}^m)' (I - \tilde{P}^l) \mu = \mu' (I - P^m)(I - P^l) \mu \, (1 + o(1))$,

where the $o(1)$ terms are uniform in $m \le M$ and $l \le M$.

First take (9). Using (7) and the fact that the diagonal elements of $\tilde{P}^m$ are zero, then (8), again (7), and finally (8),

$\operatorname{tr}( \tilde{P}^{m\prime} \tilde{P}^l \Omega ) = \operatorname{tr}( \tilde{P}^{m\prime} D^l P^l \Omega ) - \operatorname{tr}( \tilde{P}^{m\prime} D^l \Omega ) + \operatorname{tr}( \tilde{P}^{m\prime} \Omega )$
$= \operatorname{tr}( \tilde{P}^{m\prime} D^l P^l \Omega )$
$= \operatorname{tr}( \tilde{P}^{m\prime} P^l \Omega ) (1 + o(1))$
$= \bigl( \operatorname{tr}( (P^m - I) D^m P^l \Omega ) + \operatorname{tr}( P^l \Omega ) \bigr) (1 + o(1))$
$= \bigl( \operatorname{tr}( (P^m - I) P^l \Omega ) + \operatorname{tr}( P^l \Omega ) \bigr) (1 + o(1))$
$= \operatorname{tr}( P^m P^l \Omega ) (1 + o(1)),$

establishing (9). Finally, (7) implies $I - \tilde{P}^m = D^m (I - P^m)$, which combined with (8) directly implies (10). We have established (9) and (10), which are sufficient for (A.6). Conditions (A.3)-(A.6) have been verified, so an application of Theorem 1 yields (5), as stated. ∎

Bruce E. Hansen: Department of Economics, Social Science Building, University of Wisconsin, Madison, WI, USA, bhansen@ssc.wisc.edu.

Jeffrey S. Racine: Department of Economics, Kenneth Taylor Hall, McMaster University, Hamilton, ON, Canada L8S 4M4, racinej@mcmaster.ca.
