Linear Regression for Panel with Unknown Number of Factors as Interactive Fixed Effects


Hyungsik Roger Moon    Martin Weidner

February 27, 2012

Abstract

In this paper we study the Gaussian quasi maximum likelihood estimator (QMLE) in a linear panel regression model with interactive fixed effects for asymptotics where both the number of time periods and the number of cross-sectional units go to infinity. Under appropriate assumptions we show that the limiting distribution of the QMLE for the regression coefficients is independent of the number of interactive fixed effects used in the estimation, as long as this number does not fall below the true number of interactive fixed effects present in the data. The important practical implication of this result is that for inference on the regression coefficients one does not need to estimate the number of interactive effects consistently, but can simply rely on any known upper bound of this number to calculate the QMLE.

Keywords: Panel data, interactive fixed effects, factor models, likelihood expansion, quasi-MLE, perturbation theory of linear operators, random matrix theory. JEL-Classification: C23, C33.

(We thank the participants of the Cowles Summer Conference "Handling Dependence: Temporal, Cross-sectional, and Spatial" at Yale University for interesting comments. Moon acknowledges the NSF for financial support via SES. Moon: Department of Economics, University of Southern California, KAP 300, Los Angeles, CA; moonr@usc.edu. Weidner: Department of Economics, University College London, Gower Street, London WC1E 6BT, U.K., and CeMMAP; m.weidner@ucl.ac.uk.)

1 Introduction

Panel data models typically incorporate individual and time effects to control for heterogeneity in cross-section and across time-periods. While often these individual and time effects enter the model additively, they can also be interacted multiplicatively, thus giving rise to so-called interactive effects, which we also refer to as a factor structure. The multiplicative form captures the heterogeneity in the data more flexibly, since it allows for common time-varying shocks (factors) to affect the cross-sectional units with individual specific sensitivities (factor loadings). It is this flexibility that motivated the discussion of interactive effects in the econometrics literature, e.g. Holtz-Eakin, Newey and Rosen (1988),

Ahn, Lee, Schmidt (2001; 2007), Pesaran (2006), Bai (2009b; 2009a), Zaffaroni (2009), Moon and Weidner (2010). Analogous to the analysis of individual specific effects, one can either choose to model the interactive effects as random (random effects/correlated effects) or as fixed (fixed effects), with each option having its specific merits and drawbacks that have to be weighed in each empirical application separately. In this paper, we consider the interactive fixed effects specification, i.e. we treat the interactive effects as nuisance parameters, which are estimated jointly with the parameters of interest.¹ The advantages of the fixed effects approach are, for instance, that it is semi-parametric, since no assumption on the distribution of the interactive effects needs to be made, and that the regressors can be arbitrarily correlated with the interactive effect parameters.

Let $R^0$ be the true number of interactive effects (number of factors) in the data, and let $R$ be the number of interactive effects used by the econometrician in the data analysis. A key restriction in the existing literature on interactive fixed effects is that $R^0$ is assumed to be known,² i.e. $R = R^0$. This is true both for the quasi-differencing analysis in Holtz-Eakin, Newey and Rosen (1988)³ and for the least squares analysis of Bai (2009b). Assuming $R^0$ to be known could be quite restrictive, since in many empirical applications there is no consensus about the exact number of factors in the data or in the relevant economic model, so that an estimator which is not robust towards some degree of misspecification of $R$ should not be used. The goal of the present paper is to overcome this problem.

For a linear panel regression model with interactive fixed effects we consider the Gaussian quasi maximum likelihood estimator (QMLE),⁴ which jointly minimizes the sum of squared residuals over the regression parameters and the interactive fixed effects parameters (see Kiefer (1980), Bai (2009b), and Moon and Weidner (2010)). We employ an asymptotic where both the cross-sectional and the time-serial dimension become large, while the number of interactive effects ($R^0$ and also $R$) is constant. The main finding of the paper is that under appropriate assumptions the QMLE of the regression parameters has the same limiting distribution for all $R \geq R^0$. Thus, the QMLE is robust towards inclusion of extra interactive effects in the model, and within the QMLE framework there is no asymptotic efficiency loss from choosing $R$ larger than $R^0$. This result is surprising because the conjecture in the literature is that the QMLE with $R > R^0$ might be consistent but could be less efficient than the QMLE with $R^0$ (e.g., see Bai (2009b)).⁵

The important empirical implication of our result is that as long as a valid upper bound on the number of factors is known, one can use this upper bound to construct the QMLE, and need not worry about consistent estimation of the number of factors.

1. Note that Ahn, Lee, Schmidt (2001; 2007) take a hybrid approach in that they treat the factors as nonrandom, but the factor loadings as random. The common correlated effects estimator of Pesaran (2006) was introduced in a context where both the factor loadings and the factors follow certain probability laws, but it also exhibits some properties of a fixed effects estimator. When we refer to interactive fixed effects we mean that both factors and factor loadings are treated as non-random parameters.

2. In the literature, consistent estimation procedures for $R^0$ are established only for pure factor models, not for the model with regressors.
3. Holtz-Eakin, Newey and Rosen (1988) assume just one interactive effect, but their approach could easily be generalized to multiple interactive effects, as long as their number is known.

4. The QMLE is sometimes called the concentrated least squares estimator in the literature.

5. For $R < R^0$ the QMLE could be inconsistent, since then there are interactive fixed effects in the residuals of the model which can be correlated with the regressors but are not controlled for in the estimation.

Since the limiting distribution of the QMLE with $R > R^0$ is identical to the one with $R = R^0$, the results of Bai (2009b) and Moon and Weidner (2010) regarding inference on the regression parameters become applicable.

In order to derive the asymptotic theory of the QMLE with $R \geq R^0$ we study the properties of the profile likelihood function, which is the quasi likelihood function after integrating out the interactive fixed effect parameters. Concretely, we derive an approximate quadratic expansion of this profile likelihood in the regression parameters. This expansion is difficult to perform, since integrating out the interactive fixed effects results in an eigenvalue problem in the formulation of the profile likelihood. For $R = R^0$ we show how to overcome this difficulty by performing a joint expansion of the profile likelihood in the regression parameters and in the idiosyncratic error terms. Using the perturbation theory of linear operators we prove that the profile quasi likelihood function is analytic in a neighborhood of the true parameter, and we obtain explicit formulas for the expansion coefficients, in particular analytic expressions for the approximated score and the approximated Hessian for $R = R^0$.⁶ To generalize the result to $R > R^0$ we then show that the difference between the profile likelihood for $R = R^0$ and for $R > R^0$ is just a constant term plus a term whose dependence on the regression parameters is sufficiently small to be irrelevant for the asymptotic distribution of the QMLE. Due to the eigenvalue problem in the likelihood function, the derivation of this last result requires some very specific knowledge about the eigenvectors and eigenvalues of the random covariance matrix of the idiosyncratic error matrix. We provide high-level assumptions under which the results hold, and we show that these high-level assumptions are satisfied when the idiosyncratic errors of the model are independent and identically normally distributed. As we explain in Section 4, the justification of our high-level assumptions for more general distributions of the idiosyncratic errors requires some further progress in the random matrix theory of real random covariance matrices, both regarding the properties of their eigenvalues and of their eigenvectors (see Bai (1999) for a review of this literature).

The paper is organized as follows. In Section 2 we introduce the interactive fixed effects model, its Gaussian quasi likelihood function, and the corresponding QMLE, and we also discuss consistency of the QMLE. The asymptotic profile likelihood expansion is derived in Section 3. Section 4 provides a justification for the high-level assumptions that we impose, and discusses the relation of these assumptions to the random matrix theory literature. Monte Carlo results which illustrate the validity of our conclusion in finite samples are presented in Section 5, and the conclusions of the paper are drawn in Section 6.

A few words on notation. The transpose of a matrix $A$ is denoted by $A'$. For a column vector $v$ its Euclidean norm is defined by $\|v\| = \sqrt{v'v}$. For the $n$-th largest eigenvalue (counting multiple eigenvalues multiple times) of a symmetric matrix $B$ we write $\mu_n(B)$. For an $m \times n$ matrix $A$ the Frobenius (or Hilbert-Schmidt) norm is $\|A\|_{HS} = \sqrt{\mathrm{Tr}(AA')}$, and the operator (or spectral) norm is $\|A\| = \max_{0 \neq v \in \mathbb{R}^n} \|Av\|/\|v\|$, or equivalently $\|A\| = \sqrt{\mu_1(A'A)}$. Furthermore, we use $P_A = A(A'A)^{-1}A'$ and $M_A = \mathbb{1} - A(A'A)^{-1}A'$, where $\mathbb{1}$ is the $m \times m$ identity matrix, and $(A'A)^{-1}$ denotes some generalized inverse if $A$ is not of full column rank.
For square matrices $B$, $C$, we use $B > C$ (or $B \geq C$) to indicate that $B - C$ is positive (semi-)definite. We use "wpa1" for "with probability approaching one", and $A =_d B$ to indicate that the random variables $A$ and $B$ have the same probability distribution.

6. The likelihood expansion for $R = R^0$ was first presented in Moon and Weidner (2009). We separate and extend the expansion result from the 2009 working paper and present it in this paper. The remaining application results of Moon and Weidner (2009) are now in Moon and Weidner (2010).
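As a small aside (ours, not part of the paper), this notation translates directly into NumPy; the pseudo-inverse covers the rank-deficient case just mentioned:

```python
import numpy as np

def spectral_norm(A):
    # ||A|| = sqrt(mu_1(A'A)): the largest singular value of A
    return np.linalg.norm(A, ord=2)

def frobenius_norm(A):
    # ||A||_HS = sqrt(Tr(A A'))
    return np.sqrt(np.trace(A @ A.T))

def P(A):
    # P_A = A (A'A)^- A': orthogonal projector onto the column space of A
    return A @ np.linalg.pinv(A.T @ A) @ A.T

def M(A):
    # M_A = 1 - P_A: projector onto the orthogonal complement
    return np.eye(A.shape[0]) - P(A)
```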

2 Model, QMLE and Consistency

A linear panel regression model with cross-sectional dimension $N$, time-serial dimension $T$, and interactive fixed effects of dimension $R^0$ is given by

$$Y = \sum_{k=1}^{K} \beta_k^0 X_k + \varepsilon, \qquad \varepsilon = \lambda^0 f^{0\prime} + e, \qquad (2.1)$$

where $Y$, $X_k$, $\varepsilon$ and $e$ are $N \times T$ matrices, $\lambda^0$ is an $N \times R^0$ matrix, $f^0$ is a $T \times R^0$ matrix, and the regression parameters $\beta_k^0$ are scalars (the superscript zero indicates the true value of the parameters). We write $\beta$ for the $K$-vector of regression parameters, and introduce the notation $\beta \cdot X = \sum_{k=1}^{K} \beta_k X_k$. All matrices, vectors and scalars in this paper are real valued.

A choice for the number of interactive effects $R$ used in the estimation needs to be made, and we may have $R \neq R^0$, since the true number of factors $R^0$ may not be known accurately. Given the choice $R$, the quasi maximum likelihood estimator (QMLE) for the parameters $\beta$, $\lambda$ and $f$ is given by⁷

$$\left( \hat\beta_R, \hat\Lambda_R, \hat F_R \right) = \operatorname*{argmin}_{\{\beta \in \mathbb{R}^K,\ \Lambda \in \mathbb{R}^{N \times R},\ F \in \mathbb{R}^{T \times R}\}} \left\| Y - \beta \cdot X - \Lambda F' \right\|_{HS}^2. \qquad (2.2)$$

The square of the Hilbert-Schmidt norm is simply the sum of the squared elements of the argument matrix, i.e. the QMLE is defined by minimizing the sum of squared residuals, which is equivalent to minimizing the likelihood function for iid normal idiosyncratic errors. The estimator is a quasi MLE since the idiosyncratic errors need not be iid normal and since $R$ might not equal $R^0$. The QMLE for $\beta$ can equivalently be defined by minimizing the profile quasi likelihood function, namely

$$\hat\beta_R = \operatorname*{argmin}_{\beta \in \mathbb{R}^K} \mathcal{L}_R(\beta), \qquad (2.3)$$

where

$$\mathcal{L}_R(\beta) = \min_{\{\Lambda \in \mathbb{R}^{N \times R},\ F \in \mathbb{R}^{T \times R}\}} \frac{1}{NT} \left\| Y - \beta \cdot X - \Lambda F' \right\|_{HS}^2 = \min_{F \in \mathbb{R}^{T \times R}} \frac{1}{NT} \mathrm{Tr}\left[ (Y - \beta \cdot X)\, M_F\, (Y - \beta \cdot X)' \right] = \frac{1}{NT} \sum_{t=R+1}^{T} \mu_t\left[ (Y - \beta \cdot X)' (Y - \beta \cdot X) \right]. \qquad (2.4)$$

Here, we first concentrated out $\Lambda$ by use of its own first order condition. The resulting optimization problem for $F$ is a principal components problem, so that the optimal $F$ is given by the $R$ largest principal components of the $T \times T$ matrix $(Y - \beta \cdot X)'(Y - \beta \cdot X)$.

7. The optimal $\hat\Lambda_R$ and $\hat F_R$ in (2.2) are not unique, since the objective function is invariant under right-multiplication of $\Lambda$ with a non-degenerate $R \times R$ matrix $S$, and simultaneous right-multiplication of $F$ with $(S^{-1})'$. However, the column spaces of $\hat\Lambda_R$ and $\hat F_R$ are uniquely determined.

At the optimum the projector $M_F$ therefore exactly projects out the $R$ largest eigenvalues of this matrix, which gives rise to the final formulation of the profile likelihood function as the sum over its $T - R$ smallest eigenvalues.⁸ This last formulation of $\mathcal{L}_R(\beta)$ is very convenient since it does not involve any explicit optimization over nuisance parameters. Numerical calculation of eigenvalues is very fast, so that the numerical evaluation of $\mathcal{L}_R(\beta)$ is unproblematic for moderately large values of $T$ (a numerical sketch follows the assumptions below). The function $\mathcal{L}_R(\beta)$ is not convex in $\beta$ and might have multiple local minima, which have to be accounted for in the numerical calculation of $\hat\beta_R$. We write $\mathcal{L}^0(\beta)$ for $\mathcal{L}_{R^0}(\beta)$, which is the profile likelihood obtained from the true number of factors.

In order to show consistency of $\hat\beta_R$ we impose the following assumptions.

Assumption 1. (i) $\|X_k\| = O_p(\sqrt{NT})$, $k = 1, \ldots, K$; (ii) $\|e\| = O_p(\sqrt{\max(N,T)})$.

One can justify Assumption 1(i) by use of the norm inequality $\|X_k\| \leq \|X_k\|_{HS}$ and the fact that $\|X_k\|_{HS}^2 = \sum_{i,t} X_{k,it}^2 = O_p(NT)$, where $i = 1, \ldots, N$ and $t = 1, \ldots, T$; the last step follows e.g. if $X_{k,it}$ has a uniformly bounded second moment. Assumption 1(ii) is a condition on the largest eigenvalue of the random covariance matrix $ee'$, which is often studied in the literature on random matrix theory, e.g. Geman (1980), Bai, Silverstein, Yin (1988), Yin, Bai, and Krishnaiah (1988), Silverstein (1989). The results in Latala (2005) show that $\|e\| = O_p(\sqrt{\max(N,T)})$ if $e$ has independent entries with mean zero and uniformly bounded fourth moment. Some weak dependence of the entries $e_{it}$ across $i$ and $t$ is also permissible (see, e.g., Moon and Weidner (2010)).

Assumption 2. (i) $\mathrm{Tr}(X_k e') = O_p(\sqrt{NT})$, $k = 1, \ldots, K$. (ii) Consider linear combinations $X_\alpha = \sum_{k=1}^{K} \alpha_k X_k$ of the regressors $X_k$ with $K$-vector $\alpha$ such that $\|\alpha\| = 1$. We assume that there exists a constant $b > 0$ such that

$$\min_{\{\alpha \in \mathbb{R}^K,\ \|\alpha\| = 1\}} \frac{1}{NT} \sum_{t = R + R^0 + 1}^{T} \mu_t\left( X_\alpha' X_\alpha \right) \geq b, \quad \text{wpa1}.$$

Assumption 2(i) requires weak exogeneity of the regressors $X_k$. Assumption 2(ii) is a generalization of the usual non-collinearity condition on the regressors. It requires $X_\alpha' X_\alpha$ to be non-degenerate even after elimination of the largest $R + R^0$ eigenvalues (the sum in the assumption only runs over the smallest $T - R - R^0$ eigenvalues of this matrix, while running over all eigenvalues would give the trace operator, and thus the usual non-collinearity condition). In particular, this assumption is violated if there exists a linear combination of the regressors with $\|\alpha\| = 1$ and $\mathrm{rank}(X_\alpha) \leq R + R^0$, i.e. the assumption rules out low-rank regressors like time-invariant regressors or cross-sectionally invariant regressors.

8. Since the model is symmetric under $N \leftrightarrow T$, $\Lambda \leftrightarrow F$, $Y \leftrightarrow Y'$, $X_k \leftrightarrow X_k'$, there also exists a dual formulation of $\mathcal{L}_R(\beta)$ that involves solving an eigenvalue problem for an $N \times N$ matrix.
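To illustrate the eigenvalue formulation (2.4), here is a minimal sketch (ours, not the authors' code), under the assumption that the regressors are stacked in a $K \times N \times T$ array; the multi-start loop is one simple way to guard against the local minima just mentioned:

```python
import numpy as np
from scipy.optimize import minimize

def L_R(beta, Y, X, R):
    """Profile likelihood (2.4): sum of the T - R smallest eigenvalues of
    (Y - beta.X)'(Y - beta.X), divided by NT."""
    N, T = Y.shape
    resid = Y - np.tensordot(beta, X, axes=(0, 0))  # beta.X = sum_k beta_k X_k
    eigs = np.linalg.eigvalsh(resid.T @ resid)      # eigenvalues, ascending
    return eigs[: T - R].sum() / (N * T)            # drop the R largest

def qmle(Y, X, R, starts):
    # minimize L_R over beta from several starting values (L_R is not convex)
    fits = [minimize(L_R, x0, args=(Y, X, R), method="Nelder-Mead") for x0 in starts]
    return min(fits, key=lambda f: f.fun).x
```

The eigenvalue route avoids optimizing over the $N \times R$ and $T \times R$ nuisance parameters entirely, which is exactly the computational convenience emphasized above.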

These low-rank regressors require a special treatment in the interactive fixed effects model (see Bai (2009b) and Moon and Weidner (2010)), and we ignore them in the present paper. If one is not interested explicitly in their regression coefficients, one can always eliminate the low-rank regressors by an appropriate projection of the data; e.g. subtraction of the time or cross-sectional means from the data eliminates all time-invariant or cross-sectionally invariant regressors.

Theorem 2.1. Let Assumptions 1 and 2 be satisfied and let $R \geq R^0$. For $N, T \to \infty$ we then have $\sqrt{\min(N,T)}\, (\hat\beta_R - \beta^0) = O_p(1)$.

Remarks. (i) The theorem guarantees consistency of $\hat\beta_R$, $R \geq R^0$, in an arbitrary limit $N, T \to \infty$. In the rest of this paper we consider asymptotics where $N$ and $T$ grow at the same rate, i.e. $N/T \to \kappa^2$, for some positive constant $\kappa$. For these restricted asymptotics the theorem already guarantees $\sqrt{N}$ (or equivalently $\sqrt{T}$) consistency of $\hat\beta_R$, which is a useful intermediate result.

(ii) The $\sqrt{\min(N,T)}$ convergence rate in Theorem 2.1 can be generalized further. If we generalize Assumption 1(ii) and Assumption 2(i) to Assumption 1(ii)*: $\|e\| = O_p(\sqrt{NT}/\xi_{NT})$, and Assumption 2(i)*: $\mathrm{Tr}(X_k e') = O_p(NT/\xi_{NT}^2)$, $k = 1, \ldots, K$, where $\xi_{NT} \to \infty$, then it is possible to establish that $\xi_{NT}\, (\hat\beta_R - \beta^0) = O_p(1)$.

The proof of Theorem 2.1 is presented in the appendix. The theorem imposes no restriction at all on $f^0$ and $\lambda^0$, apart from the condition $R \geq R^0$. To derive the results in the rest of the paper we do however make the following strong factor assumption.⁹

Assumption 3. (i) $0 < \mathrm{plim}_{N,T \to \infty} \frac{1}{N} \lambda^{0\prime} \lambda^0 < \infty$; (ii) $0 < \mathrm{plim}_{N,T \to \infty} \frac{1}{T} f^{0\prime} f^0 < \infty$.

The main result of this paper is that the inclusion of unnecessary factors in the estimation does not change the asymptotic distribution of the QMLE for $\beta$. Before deriving this result rigorously, we want to provide an intuitive explanation for it. As already mentioned above, the estimator $\hat F_R$ is given by the first $R$ principal components of the matrix $(Y - \hat\beta_R \cdot X)'(Y - \hat\beta_R \cdot X)$. We have

$$Y - \hat\beta_R \cdot X = \lambda^0 f^{0\prime} + e - (\hat\beta_R - \beta^0) \cdot X. \qquad (2.5)$$

For asymptotics where $N$ and $T$ grow at the same rate, we find that Assumption 1 and the result of Theorem 2.1 guarantee that $\|e - (\hat\beta_R - \beta^0) \cdot X\| = O_p(\sqrt{N})$. The strong factor assumption implies that the norms of the columns of $\lambda^0$ and $f^0$ each grow at a rate of $\sqrt{N}$ (or equivalently $\sqrt{T}$), so that the spectral norm of $\lambda^0 f^{0\prime}$ grows at the rate $\sqrt{NT}$. The strong factor assumption therefore guarantees that $\lambda^0 f^{0\prime}$ is the dominant component of $Y - \hat\beta_R \cdot X$.

9. The strong factor assumption is regularly imposed in the literature on large $N$ and $T$ factor models, including Bai and Ng (2002), Stock and Watson (2002) and Bai (2009b). Onatski (2006) discussed an alternative weak factor assumption for the purpose of estimating the number of factors in a pure factor model, and a more general discussion of strong and weak factors is given in Chudik, Pesaran and Tosetti (2009).

This implies that the first $R^0$ principal components of $(Y - \hat\beta_R \cdot X)'(Y - \hat\beta_R \cdot X)$ are close to $f^0$, i.e. the true factors are correctly picked up by the principal components estimator. The additional $R - R^0$ principal components that are estimated for $R > R^0$ cannot pick up any more true factors and are thus mostly determined by the remaining term $e - (\hat\beta_R - \beta^0) \cdot X$. Our results below show that $\hat\beta_R$ is not only $\sqrt{N}$-consistent but actually $\sqrt{NT}$-consistent, so that $\|(\hat\beta_R - \beta^0) \cdot X\| = O_p(1)$, which makes the idiosyncratic error matrix $e$ the dominant part of $e - (\hat\beta_R - \beta^0) \cdot X$; i.e. the $R - R^0$ additional principal components in $\hat F_R$ are mostly determined by $e$, and more precisely are close to the first $R - R^0$ principal components of $e'e$. This means that they are essentially random and close to uncorrelated with the regressors $X_k$. Including unnecessary factors in the QMLE calculation is therefore analogous to including irrelevant regressors in a linear regression which are uncorrelated with the relevant regressors $X_k$. From the second line in equation (2.4) we see that these additional random components of $\hat F_R$ project out the corresponding $R - R^0$ dimensional subspace of the $T$-dimensional space spanned by the observations over time, thus effectively reducing the number of time dimensions by $R - R^0$. This usually results in a somewhat increased finite sample variance of the QMLE, but has no influence asymptotically as $T$ goes to infinity, so that the asymptotic distributions of $\hat\beta_R$ and $\hat\beta_{R^0}$ are identical for $R \geq R^0$.
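This intuition is easy to check numerically. The following sketch is our own illustration (all sizes are arbitrary, and there are no regressors, so $\beta$ plays no role): generate data with $R^0 = 2$ strong factors, extract $R = 4$ principal components, and verify that the two extra components essentially span the leading eigenvectors of $e'e$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, R0, R = 100, 100, 2, 4
lam = rng.normal(size=(N, R0))       # strong factors: entries O(1), so
f = rng.normal(size=(T, R0))         # ||lam f'|| grows like sqrt(NT)
e = rng.normal(size=(N, T))
Y = lam @ f.T + e

# first R principal components of Y'Y (columns = estimated factors F_hat);
# eigh sorts eigenvalues ascending, so the top R eigenvectors come last
_, vecs = np.linalg.eigh(Y.T @ Y)
F_hat = vecs[:, -R:]
extra = F_hat[:, : R - R0]           # the R - R0 "extra" components have the
                                     # smaller eigenvalues within the top R

# leading R - R0 eigenvectors of e'e
_, vecs_e = np.linalg.eigh(e.T @ e)
W = vecs_e[:, -(R - R0):]

# canonical correlations between the two subspaces: typically close to one,
# i.e. the extra components are essentially principal components of e'e
print(np.linalg.svd(W.T @ extra, compute_uv=False))
```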

3 Asymptotic Profile Likelihood Expansion

To derive the asymptotics of $\hat\beta_R$, we study the asymptotic properties of the profile likelihood function $\mathcal{L}_R(\beta)$ around $\beta^0$. First we notice that the expression cannot easily be discussed by analytic means, since there is no explicit formula for the eigenvalues of a matrix. In particular, a standard Taylor expansion of $\mathcal{L}^0(\beta)$ around $\beta^0$ cannot easily be derived. In Section 3.1 we show how to overcome this problem when the true number of factors is known, i.e. $R = R^0$, and in Section 3.2 we generalize the results to $R > R^0$.

When the true $R^0$ is known, the approach we choose is to perform a joint expansion in the regression parameters and in the idiosyncratic error terms. To perform this joint expansion we apply the perturbation theory of linear operators (e.g., Kato (1980)). We thereby obtain an approximate quadratic expansion of $\mathcal{L}^0(\beta)$ in $\beta$, which can be used to derive the first order asymptotic theory of the QMLE $\hat\beta_{R^0}$. To carry the results for $R = R^0$ over to $R > R^0$, we first note that equation (2.4) implies that

$$\mathcal{L}_R(\beta) = \mathcal{L}^0(\beta) - \frac{1}{NT} \sum_{t = R^0 + 1}^{R} \mu_t\left[ (Y - \beta \cdot X)'(Y - \beta \cdot X) \right]. \qquad (3.1)$$

The extra term $\sum_{t=R^0+1}^{R} \mu_t\left[ (Y - \beta \cdot X)'(Y - \beta \cdot X) \right]$ is due to overfitting on the extra factors. We show that the $\beta$-dependence of this term is sufficiently small, so that apart from a constant the approximate quadratic expansions of $\mathcal{L}_R(\beta)$ and $\mathcal{L}^0(\beta)$ around $\beta^0$ are identical. To obtain this result we first strengthen Theorem 2.1 and show that $\hat\beta_R$ converges to $\beta^0$ at a rate of at least $N^{3/4}$, so that we only have to discuss the $\beta$-dependence of the extra term in $\mathcal{L}_R(\beta)$ within an $N^{-3/4}$ shrinking neighborhood of $\beta^0$.

From the analysis of $\mathcal{L}_R(\beta)$, we can then deduce the main result of the paper, namely

$$\sqrt{NT}\, (\hat\beta_R - \beta^0) = \sqrt{NT}\, (\hat\beta_{R^0} - \beta^0) + o_p(1). \qquad (3.2)$$

This implies that the limiting distributions of $\hat\beta_R$ and $\hat\beta_{R^0}$ are identical, and that overestimating the number of factors results in no efficiency loss in terms of the asymptotic variance of the QMLE.

3.1 When R = R⁰

We want to expand the profile likelihood $\mathcal{L}^0(\beta)$ simultaneously in $\beta$ and in the spectral norm of $e$. Let the $K+1$ expansion parameters be defined by $\epsilon_0 = \|e\|/\sqrt{NT}$ and $\epsilon_k = \beta_k^0 - \beta_k$, $k = 1, \ldots, K$, and define the $N \times T$ matrix $X_0 = \sqrt{NT}\, e/\|e\|$. With these definitions we obtain

$$\frac{1}{\sqrt{NT}} (Y - \beta \cdot X) = \frac{1}{\sqrt{NT}} \left[ \lambda^0 f^{0\prime} + (\beta^0 - \beta) \cdot X + e \right] = \frac{1}{\sqrt{NT}}\, \lambda^0 f^{0\prime} + \frac{1}{\sqrt{NT}} \sum_{k=0}^{K} \epsilon_k X_k. \qquad (3.3)$$

According to equation (2.4), the profile likelihood $\mathcal{L}^0(\beta)$ can be written as the sum over the $T - R^0$ smallest eigenvalues of the matrix in (3.3) multiplied by its transpose. We consider $\frac{1}{\sqrt{NT}} \sum_{k=0}^{K} \epsilon_k X_k$ as a small perturbation of the unperturbed matrix $\lambda^0 f^{0\prime}/\sqrt{NT}$, and thus expand $\mathcal{L}^0(\beta)$ in the perturbation parameters $\epsilon = (\epsilon_0, \epsilon_1, \ldots, \epsilon_K)$ around $\epsilon = 0$, namely

$$\mathcal{L}^0(\beta) = \sum_{g=1}^{\infty} \sum_{k_1, \ldots, k_g = 0}^{K} \epsilon_{k_1} \epsilon_{k_2} \cdots \epsilon_{k_g}\, L^{(g)}(\lambda^0, f^0, X_{k_1}, X_{k_2}, \ldots, X_{k_g}), \qquad (3.4)$$

where $L^{(g)} = L^{(g)}(\lambda^0, f^0, X_{k_1}, X_{k_2}, \ldots, X_{k_g})$ are the expansion coefficients. The unperturbed matrix $\lambda^0 f^{0\prime}/\sqrt{NT}$ has rank $R^0$, so that the $T - R^0$ smallest eigenvalues of the unperturbed $T \times T$ matrix $f^0 \lambda^{0\prime} \lambda^0 f^{0\prime}/NT$ are all zero, i.e. $\mathcal{L}^0(\beta) = 0$ for $\epsilon = 0$ and thus $L^{(0)}(\lambda^0, f^0) = 0$. Due to Assumption 3, the $R^0$ non-zero eigenvalues of the unperturbed $T \times T$ matrix $f^0 \lambda^{0\prime} \lambda^0 f^{0\prime}/NT$ converge to positive constants as $N, T \to \infty$. This means that the separating distance of the $T - R^0$ zero-eigenvalues of the unperturbed $T \times T$ matrix $f^0 \lambda^{0\prime} \lambda^0 f^{0\prime}/NT$ converges to a positive constant, i.e. the next largest eigenvalue is well separated. This is exactly the technical condition under which the perturbation theory of linear operators guarantees that the above expansion of $\mathcal{L}^0$ in $\epsilon$ exists and is convergent as long as the spectral norm of the perturbation $\frac{1}{\sqrt{NT}} \sum_{k=0}^{K} \epsilon_k X_k$ is smaller than a particular convergence radius $r^0(\lambda^0, f^0)$, which is closely related to the separating distance of the zero-eigenvalues. For details see Kato (1980) and Appendix A.2, where we define $r^0(\lambda^0, f^0)$ and show that it converges to a positive constant as $N, T \to \infty$. Note that for the expansion (3.4) it is crucial that we have $R = R^0$, since the perturbation theory of linear operators describes the perturbation of the sum of all zero-eigenvalues of the unperturbed matrix $f^0 \lambda^{0\prime} \lambda^0 f^{0\prime}/NT$. For $R > R^0$ the sum in $\mathcal{L}_R(\beta)$ leaves out the $R - R^0$ largest of these perturbed zero-eigenvalues, which results in a much more complicated mathematical problem, since the structure of and ranking among these perturbed zero-eigenvalues needs to be discussed.

The above expansion of $\mathcal{L}^0(\beta)$ is applicable whenever the operator norm of the perturbation matrix $\frac{1}{\sqrt{NT}} \sum_{k=0}^{K} \epsilon_k X_k$ is smaller than $r^0(\lambda^0, f^0)$. Since our assumptions guarantee that $\|X_k\|/\sqrt{NT} = O_p(1)$, for $k = 0, \ldots, K$, and $\epsilon_0 = O_p(\min(N,T)^{-1/2}) = o_p(1)$, we have $\left\| \frac{1}{\sqrt{NT}} \sum_{k=0}^{K} \epsilon_k X_k \right\| = O_p(\|\beta - \beta^0\|) + o_p(1)$, i.e. the above expansion is always applicable asymptotically within a shrinking neighborhood of $\beta^0$ (which is sufficient, since we already know that $\hat\beta_R$ is consistent for $R \geq R^0$). In addition to guaranteeing convergence of the series expansion, the perturbation theory of linear operators also provides explicit formulas for the expansion coefficients $L^{(g)}$; namely, for $g = 1, 2, 3$ we have

$$L^{(1)}(\lambda^0, f^0, X_{k_1}) = 0,$$
$$L^{(2)}(\lambda^0, f^0, X_{k_1}, X_{k_2}) = \frac{1}{NT} \mathrm{Tr}\left( M_{\lambda^0} X_{k_1} M_{f^0} X_{k_2}' \right),$$
$$L^{(3)}(\lambda^0, f^0, X_{k_1}, X_{k_2}, X_{k_3}) = -\frac{1}{3NT} \left[ \mathrm{Tr}\left( M_{\lambda^0} X_{k_1} M_{f^0} X_{k_2}' \lambda^0 (\lambda^{0\prime}\lambda^0)^{-1} (f^{0\prime}f^0)^{-1} f^{0\prime} X_{k_3}' \right) + \ldots \right],$$

where the dots refer to 5 additional terms obtained from the first one by permutation of $k_1$, $k_2$ and $k_3$, so that the expression becomes totally symmetric in these indices. A general expression for the coefficients for all orders in $g$ is given in Lemma A.1 in the appendix. One can show that for $g \geq 3$ the coefficients $L^{(g)}$ are bounded as follows:

$$\left| L^{(g)}(\lambda^0, f^0, X_{k_1}, X_{k_2}, \ldots, X_{k_g}) \right| \leq a\, b^g\, \frac{\|X_{k_1}\|}{\sqrt{NT}} \frac{\|X_{k_2}\|}{\sqrt{NT}} \cdots \frac{\|X_{k_g}\|}{\sqrt{NT}}, \qquad (3.5)$$

where $a$ and $b$ are functions of $\lambda^0$ and $f^0$ that converge to finite positive constants in probability. This bound on the coefficients $L^{(g)}$ allows us to derive a bound on the remainder term when the profile likelihood expansion is truncated at a particular order.

The likelihood expansion can be applied under more general asymptotics, but here we only consider the limit $N, T \to \infty$ with $N/T \to \kappa^2$, $0 < \kappa < \infty$, i.e. $N$ and $T$ grow at the same rate. Then, the relevant coefficients of the expansion, which are not treated as part of the remainder term, are

$$\mathcal{L}^0(\beta^0) = \sum_{g=2}^{\infty} \epsilon_0^g\, L^{(g)}(\lambda^0, f^0, X_0, X_0, \ldots, X_0) = \sum_{g=2}^{\infty} L^{(g)}(\lambda^0, f^0, e, e, \ldots, e),$$
$$W_{k_1 k_2} = L^{(2)}(\lambda^0, f^0, X_{k_1}, X_{k_2}) = \frac{1}{NT} \mathrm{Tr}\left( M_{\lambda^0} X_{k_1} M_{f^0} X_{k_2}' \right),$$
$$C^{(1)}_k = \sqrt{NT}\, \epsilon_0\, L^{(2)}(\lambda^0, f^0, X_k, X_0) = \frac{1}{\sqrt{NT}} \mathrm{Tr}\left( M_{\lambda^0} X_k M_{f^0} e' \right),$$
$$C^{(2)}_k = \frac{3}{2} \sqrt{NT}\, \epsilon_0^2\, L^{(3)}(\lambda^0, f^0, X_k, X_0, X_0) = -\frac{1}{\sqrt{NT}} \left[ \mathrm{Tr}\left( e M_{f^0} e' M_{\lambda^0} X_k f^0 (f^{0\prime}f^0)^{-1} (\lambda^{0\prime}\lambda^0)^{-1} \lambda^{0\prime} \right) + \mathrm{Tr}\left( e' M_{\lambda^0} e M_{f^0} X_k' \lambda^0 (\lambda^{0\prime}\lambda^0)^{-1} (f^{0\prime}f^0)^{-1} f^{0\prime} \right) + \mathrm{Tr}\left( e' M_{\lambda^0} X_k M_{f^0} e' \lambda^0 (\lambda^{0\prime}\lambda^0)^{-1} (f^{0\prime}f^0)^{-1} f^{0\prime} \right) \right]. \qquad (3.6)$$

In the first line above we used the fact that $L^{(g)}(\lambda^0, f^0, X_{k_1}, X_{k_2}, \ldots, X_{k_g})$ is linear in the arguments $X_{k_1}$ to $X_{k_g}$ and that $\epsilon_0 X_0 = e$. The $K \times K$ matrix $W$ with elements $W_{k_1 k_2}$ is the approximated Hessian of the profile likelihood function $\mathcal{L}^0(\beta)$. The $K$-vectors $C^{(1)}$ and $C^{(2)}$ with elements $C^{(1)}_k$ and $C^{(2)}_k$ constitute the approximated score of $\mathcal{L}^0(\beta)$.
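Since $W$, $C^{(1)}$ and $C^{(2)}$ in (3.6) are explicit matrix expressions, they can be evaluated directly whenever $\lambda^0$, $f^0$ and $e$ are available, i.e. in simulations. A minimal sketch (ours, not the paper's code; $X$ stacked as a $K \times N \times T$ array):

```python
import numpy as np

def approx_hessian_and_score(lam, f, X, e):
    """W, C1, C2 as in (3.6); lam, f, e are the true loadings, factors and
    errors, which are observable in a simulation."""
    N, T = e.shape
    rt = np.sqrt(N * T)
    M_lam = np.eye(N) - lam @ np.linalg.inv(lam.T @ lam) @ lam.T
    M_f = np.eye(T) - f @ np.linalg.inv(f.T @ f) @ f.T
    # T x N "inverse gram" factor appearing in C2:
    # A = f (f'f)^{-1} (lam'lam)^{-1} lam',  so  A' = lam (lam'lam)^{-1} (f'f)^{-1} f'
    A = f @ np.linalg.inv(f.T @ f) @ np.linalg.inv(lam.T @ lam) @ lam.T
    K = X.shape[0]
    W, C1, C2 = np.empty((K, K)), np.empty(K), np.empty(K)
    for k1 in range(K):
        for k2 in range(K):
            W[k1, k2] = np.trace(M_lam @ X[k1] @ M_f @ X[k2].T) / (N * T)
        C1[k1] = np.trace(M_lam @ X[k1] @ M_f @ e.T) / rt
        C2[k1] = -(np.trace(e @ M_f @ e.T @ M_lam @ X[k1] @ A)
                   + np.trace(e.T @ M_lam @ e @ M_f @ X[k1].T @ A.T)
                   + np.trace(e.T @ M_lam @ X[k1] @ M_f @ e.T @ A.T)) / rt
    return W, C1, C2
```

In a simulation, comparing $\sqrt{NT}(\hat\beta_R - \beta^0)$ with $W^{-1}(C^{(1)} + C^{(2)})$ (Corollaries 3.2 and 3.5 below) gives a direct numerical check of the expansion.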

From the expansion (3.4) and the bound (3.5) we obtain the following theorem, whose proof is provided in the appendix.

Theorem 3.1. Let Assumptions 1 and 3 be satisfied. Suppose that $N, T \to \infty$ with $N/T \to \kappa^2$, $0 < \kappa < \infty$. Then we have

$$\mathcal{L}^0(\beta) = \mathcal{L}^0(\beta^0) - \frac{2}{\sqrt{NT}}\, (\beta - \beta^0)' \left( C^{(1)} + C^{(2)} \right) + (\beta - \beta^0)'\, W\, (\beta - \beta^0) + \mathcal{L}^0_{rem}(\beta),$$

where the remainder term $\mathcal{L}^0_{rem}(\beta)$ satisfies for any sequence $c_{NT} \to 0$

$$\sup_{\{\beta:\ \|\beta - \beta^0\| \leq c_{NT}\}} \frac{NT\, \left| \mathcal{L}^0_{rem}(\beta) \right|}{\left( 1 + \sqrt{NT}\, \|\beta - \beta^0\| \right)^2} = o_p(1).$$

Corollary 3.2. Let Assumptions 1, 2 and 3 be satisfied. Furthermore, assume that $C^{(1)} = O_p(1)$. In the limit $N, T \to \infty$ with $N/T \to \kappa^2$, $0 < \kappa < \infty$, we then have

$$\sqrt{NT}\, (\hat\beta_{R^0} - \beta^0) = W^{-1} \left( C^{(1)} + C^{(2)} \right) + o_p(1) = O_p(1).$$

Since the estimator $\hat\beta_{R^0}$ minimizes $\mathcal{L}^0(\beta)$, it must in particular satisfy $\mathcal{L}^0(\hat\beta_{R^0}) \leq \mathcal{L}^0\left( \beta^0 + W^{-1}(C^{(1)} + C^{(2)})/\sqrt{NT} \right)$. The corollary follows from applying Theorem 3.1 to this inequality and using the consistency of $\hat\beta_{R^0}$. Details are given in the appendix. Using Theorem 3.1, the corollary is also directly obtained from the results in Andrews (1999). Our assumptions already guarantee $C^{(2)} = O_p(1)$ and $W = O_p(1)$, so that only $C^{(1)} = O_p(1)$ needs to be assumed explicitly in the corollary. Corollary 3.2 allows us to replicate the result in Bai (2009b). Furthermore, the assumptions in the corollary do not restrict the regressors to be strictly exogenous, and the techniques developed here are applied in Moon and Weidner (2010) to discuss pre-determined regressors in the linear factor regression model with $R = R^0$, in which case the score term $C^{(2)}$ contributes an additional incidental parameter bias to the asymptotic distribution of $\hat\beta_{R^0}$.

Remark. If we weaken Assumption 1(ii) to $\|e\| = o_p(N^{2/3})$, then Theorem 3.1 still continues to hold. If we in addition assume that $C^{(2)} = O_p(1)$, then Corollary 3.2 also holds under this weaker condition on $\|e\|$.

3.2 When R > R⁰

We now extend the likelihood expansion to the case $R > R^0$. Let $\hat\lambda(\beta)$ and $\hat f(\beta)$ be the minimizing parameters in the first line of equation (2.4) for $R = R^0$. These are given by the first $R^0$ principal components of $(Y - \beta \cdot X)(Y - \beta \cdot X)'$ and $(Y - \beta \cdot X)'(Y - \beta \cdot X)$, respectively. For the corresponding orthogonal projectors we use the notation $M_{\hat\lambda}(\beta) \equiv M_{\hat\lambda(\beta)}$ and $M_{\hat f}(\beta) \equiv M_{\hat f(\beta)}$. For the residuals after taking out these first $R^0$ principal components we write $\hat e(\beta) \equiv Y - \beta \cdot X - \hat\lambda(\beta) \hat f'(\beta)$.

Analogous to the expansion of $\mathcal{L}^0(\beta)$, the perturbation theory of linear operators also provides an expansion for $M_{\hat\lambda}(\beta)$, $M_{\hat f}(\beta)$ and $\hat e(\beta)$ in $\beta^0 - \beta$ and $e$; i.e. in addition to describing the sum of the perturbed eigenvalues $\mathcal{L}^0(\beta)$, it also describes the structure of the corresponding perturbed eigenvectors. For example, we have $\hat e(\beta) = M_{\lambda^0}\, e\, M_{f^0} - M_{\lambda^0} \left[ (\beta - \beta^0) \cdot X \right] M_{f^0} + \text{(higher order terms)}$. The details of these expansions are presented in Lemmas A.1 and A.2 in the appendix. These expansions are crucial when generalizing the likelihood expansion to $R > R^0$. Equation (3.1) can equivalently be written as

$$\mathcal{L}_R(\beta) = \mathcal{L}^0(\beta) - \frac{1}{NT} \sum_{t=1}^{R - R^0} \mu_t\left[ \hat e'(\beta)\, \hat e(\beta) \right]. \qquad (3.7)$$

Here we used that $\hat e'(\beta)\hat e(\beta)$ is the residual of $(Y - \beta \cdot X)'(Y - \beta \cdot X)$ after subtracting the first $R^0$ principal components, which implies that the eigenvalues of these two matrices are the same, except for the $R^0$ largest ones, which are missing in $\hat e'(\beta)\hat e(\beta)$. By applying the expansion of $\hat e(\beta)$ to this expression for $\mathcal{L}_R(\beta)$ one obtains the following.

Theorem 3.3. Under Assumptions 1 and 3 and for $R > R^0$ we have

(i) $\mathcal{L}_R(\beta) = \mathcal{L}^0(\beta) - \frac{1}{NT} \sum_{t=1}^{R-R^0} \mu_t\left[ A(\beta) \right] + \mathcal{L}_{R,rem,1}(\beta)$,

where $A(\beta) = M_{f^0} \left[ e - (\beta - \beta^0) \cdot X \right]' M_{\lambda^0} \left[ e - (\beta - \beta^0) \cdot X \right] M_{f^0}$, and for any constant $c > 0$

$$\sup_{\{\beta:\ \sqrt{N}\|\beta - \beta^0\| \leq c\}} \frac{NT\, \left| \mathcal{L}_{R,rem,1}(\beta) \right|}{\sqrt{N} \left( 1 + \sqrt{NT}\, \|\beta - \beta^0\| \right)} = O_p(1).$$

(ii) $\mathcal{L}_R(\beta) = \mathcal{L}^0(\beta) - \frac{1}{NT} \sum_{t=1}^{R-R^0} \mu_t\left[ B(\beta) + B'(\beta) \right] + \mathcal{L}_{R,rem,2}(\beta)$,

where

$$B(\beta) = \frac{1}{2} A(\beta) - M_{f^0}\, e' M_{\lambda^0}\, e\, M_{f^0}\, e'\, \lambda^0 (\lambda^{0\prime}\lambda^0)^{-1} (f^{0\prime}f^0)^{-1} f^{0\prime}$$
$$\quad + M_{f^0} \left[ (\beta - \beta^0) \cdot X \right]' M_{\lambda^0}\, e\, f^0 (f^{0\prime}f^0)^{-1} (\lambda^{0\prime}\lambda^0)^{-1} \lambda^{0\prime}\, e\, M_{f^0}$$
$$\quad + M_{f^0}\, e' M_{\lambda^0} \left[ (\beta - \beta^0) \cdot X \right] f^0 (f^{0\prime}f^0)^{-1} (\lambda^{0\prime}\lambda^0)^{-1} \lambda^{0\prime}\, e\, M_{f^0}$$
$$\quad + M_{f^0}\, e' M_{\lambda^0}\, e\, f^0 (f^{0\prime}f^0)^{-1} (\lambda^{0\prime}\lambda^0)^{-1} \lambda^{0\prime} \left[ (\beta - \beta^0) \cdot X \right] M_{f^0}$$
$$\quad + B_{eeee} + M_{f^0}\, B_{rem,1}(\beta)\, P_{f^0} + P_{f^0}\, B_{rem,2}\, P_{f^0},$$

and

$$B_{eeee} = M_{f^0}\, e' M_{\lambda^0}\, e\, M_{f^0}\, e'\, \lambda^0 (\lambda^{0\prime}\lambda^0)^{-1} (f^{0\prime}f^0)^{-1} (\lambda^{0\prime}\lambda^0)^{-1} \lambda^{0\prime}\, e\, M_{f^0}$$
$$\quad + M_{f^0}\, e' M_{\lambda^0}\, e\, f^0 (f^{0\prime}f^0)^{-1} (\lambda^{0\prime}\lambda^0)^{-1} \lambda^{0\prime}\, e\, f^0 (f^{0\prime}f^0)^{-1} (\lambda^{0\prime}\lambda^0)^{-1} \lambda^{0\prime}\, e\, M_{f^0}$$
$$\quad - 2\, M_{f^0}\, e' M_{\lambda^0}\, e\, f^0 (f^{0\prime}f^0)^{-1} (\lambda^{0\prime}\lambda^0)^{-1} (f^{0\prime}f^0)^{-1} f^{0\prime}\, e' M_{\lambda^0}\, e\, M_{f^0}$$
$$\quad + 2\, M_{f^0}\, e'\, \lambda^0 (\lambda^{0\prime}\lambda^0)^{-1} (f^{0\prime}f^0)^{-1} f^{0\prime}\, e' M_{\lambda^0}\, e\, f^0 (f^{0\prime}f^0)^{-1} (\lambda^{0\prime}\lambda^0)^{-1} \lambda^{0\prime}\, e\, M_{f^0}.$$

Here, $B_{rem,1}(\beta)$ and $B_{rem,2}$ are $T \times T$ matrices; $B_{rem,2}$ is independent of $\beta$ and satisfies $\|B_{rem,2}\| = O_p(1)$, and for any constant $c > 0$

$$\sup_{\{\beta:\ \sqrt{N}\|\beta - \beta^0\| \leq c\}} \frac{\|B_{rem,1}(\beta)\|}{1 + \sqrt{NT}\, \|\beta - \beta^0\|} = O_p(1), \qquad \sup_{\{\beta:\ \sqrt{N}\|\beta - \beta^0\| \leq c\}} \frac{NT\, \left| \mathcal{L}_{R,rem,2}(\beta) \right|}{\left( 1 + \sqrt{NT}\, \|\beta - \beta^0\| \right)^2} = o_p(1).$$

Here, the remainder terms $\mathcal{L}_{R,rem,1}(\beta)$ and $\mathcal{L}_{R,rem,2}(\beta)$ stem from terms in $\hat e'(\beta)\hat e(\beta)$ whose spectral norm, within a shrinking neighborhood of $\beta^0$ and after dividing by $\sqrt{N}\,(1 + \sqrt{NT}\|\beta - \beta^0\|)$ and $(1 + \sqrt{NT}\|\beta - \beta^0\|)^2$, respectively, is bounded and vanishing, respectively. Using Weyl's inequality, those terms can be separated from the eigenvalues $\mu_t[\hat e'(\beta)\hat e(\beta)]$.

The expression for $B(\beta)$ looks complicated, in particular the terms in $B_{eeee}$. Note, however, that $B_{eeee}$ is $\beta$-independent and satisfies $\|B_{eeee}\| = O_p(1)$ under our assumptions, so that it is relatively easy to deal with these terms. Note furthermore that the structure of $B(\beta)$ is closely related to the expansion of $\mathcal{L}^0(\beta)$, since by definition we have $\mathcal{L}^0(\beta) = \frac{1}{NT} \mathrm{Tr}\left( \hat e'(\beta)\hat e(\beta) \right)$, which can be approximated by $\frac{1}{NT} \mathrm{Tr}\left( B(\beta) + B'(\beta) \right)$. Plugging the definition of $B(\beta)$ into $\mathrm{Tr}(B(\beta) + B'(\beta))$, one indeed recovers the terms of the approximated Hessian and score provided by Theorem 3.1, which is a convenient consistency check. We do not give explicit formulas for $B_{rem,1}(\beta)$ and $B_{rem,2}$, because those terms enter $B(\beta)$ projected by $P_{f^0}$, which makes them orthogonal to the leading term $A(\beta)$, so that they can only have limited influence on the eigenvalues of $B(\beta) + B'(\beta)$. The bounds on the norms of $B_{rem,1}(\beta)$ and $B_{rem,2}$ provided in the theorem are sufficient for all conclusions on the properties of $\mu_t[B(\beta) + B'(\beta)]$ below. The proof of the theorem can be found in the appendix.

The first part of Theorem 3.3 is useful to show that $\hat\beta_R$ converges to $\beta^0$ at a rate of at least $N^{3/4}$. The purpose of the second part is to show that $\hat\beta_R$ has the same limiting distribution as $\hat\beta_{R^0}$. To actually obtain these two results, one requires further conditions on the $\beta$-dependence of the largest few eigenvalues of $A(\beta)$ and $B(\beta) + B'(\beta)$.

Assumption 4. For all constants $c > 0$

$$\sup_{\{\beta:\ \sqrt{N}\|\beta - \beta^0\| \leq c\}} \frac{\left| \sum_{t=1}^{R-R^0} \left\{ \mu_t[A(\beta)] - \mu_t[A(\beta^0)] - \mu_t[\tilde A(\beta)] \right\} \right|}{\sqrt{N} + N^{5/4}\, \|\beta - \beta^0\| + N^2\, \|\beta - \beta^0\|^2 / \log N} = O_p(1),$$

where $\tilde A(\beta) = M_{f^0} \left[ (\beta - \beta^0) \cdot X \right]' M_{\lambda^0} \left[ (\beta - \beta^0) \cdot X \right] M_{f^0}$.

Corollary 3.4. Let $R > R^0$, let Assumptions 1, 2, 3 and 4 be satisfied, and furthermore assume that $C^{(1)} = O_p(1)$. In the limit $N, T \to \infty$ with $N/T \to \kappa^2$, $0 < \kappa < \infty$, we then have $N^{3/4}\, (\hat\beta_R - \beta^0) = O_p(1)$.

The corollary follows from the inequality $\mathcal{L}_R(\hat\beta_R) \leq \mathcal{L}_R(\beta^0)$ by applying the first part of Theorem 3.3, Assumption 4, and our expansion of $\mathcal{L}^0(\beta)$. The justification of Assumption 4 is discussed in the next section.

Knowing that $\hat\beta_R$ converges to $\beta^0$ at a rate of at least $N^{3/4}$ is a convenient intermediate result. It implies that we only have to study the properties of $\mathcal{L}_R(\beta)$ within an $N^{-3/4}$ shrinking neighborhood of $\beta^0$, which is reflected in the formulation of the following assumption.

Assumption 5. For all constants $c > 0$

$$\sup_{\{\beta:\ N^{3/4}\|\beta - \beta^0\| \leq c\}} \frac{\left| \sum_{t=1}^{R-R^0} \left\{ \mu_t\left[ B(\beta) + B'(\beta) \right] - \mu_t\left[ B(\beta^0) + B'(\beta^0) \right] \right\} \right|}{\left( 1 + \sqrt{NT}\, \|\beta - \beta^0\| \right)^2} = o_p(1).$$

Combining the second part of Theorem 3.3, Assumption 5, and Theorem 3.1, we find that the profile likelihood for $R > R^0$ can be written as

$$\mathcal{L}_R(\beta) = \mathcal{L}_R(\beta^0) - \frac{2}{\sqrt{NT}}\, (\beta - \beta^0)' \left( C^{(1)} + C^{(2)} \right) + (\beta - \beta^0)'\, W\, (\beta - \beta^0) + \mathcal{L}_{R,rem}(\beta),$$

with a remainder term that satisfies for all constants $c > 0$

$$\sup_{\{\beta:\ N^{3/4}\|\beta - \beta^0\| \leq c\}} \frac{NT\, \left| \mathcal{L}_{R,rem}(\beta) \right|}{\left( 1 + \sqrt{NT}\, \|\beta - \beta^0\| \right)^2} = o_p(1).$$

This result, together with $N^{3/4}$-consistency of $\hat\beta_R$, gives rise to the following corollary.

Corollary 3.5. Let $R > R^0$, let Assumptions 1, 2, 3, 4 and 5 be satisfied, and furthermore assume that $C^{(1)} = O_p(1)$. In the limit $N, T \to \infty$ with $N/T \to \kappa^2$, $0 < \kappa < \infty$, we then have

$$\sqrt{NT}\, (\hat\beta_R - \beta^0) = W^{-1} \left( C^{(1)} + C^{(2)} \right) + o_p(1) = O_p(1).$$

The proof of Corollary 3.5 is analogous to that of Corollary 3.2. The combination of both corollaries shows that our main result in equation (3.2) holds, i.e. the limiting distributions of $\hat\beta_R$ and $\hat\beta_{R^0}$ are indeed identical. What is left to do is to justify the high-level Assumptions 4 and 5.

4 Justification of Assumptions 4 and 5

We start with the justification of Assumption 4. We have $A(\beta) = A(\beta^0) + \tilde A(\beta) - A_{mixed}(\beta)$, where $A_{mixed}(\beta) = M_{f^0}\, e' M_{\lambda^0} \left[ (\beta - \beta^0) \cdot X \right] M_{f^0} + \text{(the same term transposed)}$.

By applying Weyl's inequality¹⁰ we thus find

$$\sum_{t=1}^{R-R^0} \left| \mu_t[A(\beta)] - \mu_t[A(\beta^0)] - \mu_t[\tilde A(\beta)] \right| \leq 2 (R - R^0)\, \|A_{mixed}(\beta)\| \leq 2 (R - R^0)\, K\, \|e\|\, \|\beta - \beta^0\| \max_k \left\| M_{\lambda^0} X_k M_{f^0} \right\|. \qquad (4.1)$$

For asymptotics with $N$ and $T$ growing at the same rate, Assumption 1(ii) guarantees $\|e\| = O_p(\sqrt{N})$. Using this and inequality (4.1), we find that $\max_k \|M_{\lambda^0} X_k M_{f^0}\| = O_p(N^{3/4})$ is a sufficient condition for Assumption 4. This condition can be justified by assuming that $X_k = \Gamma_k f^{0\prime} + \widetilde X_k$, where $\Gamma_k$ is an $N \times R^0$ matrix and $\widetilde X_k$ is an $N \times T$ matrix with $\|\widetilde X_k\| = O_p(N^{3/4})$, i.e. $X_k$ has an approximate factor structure with the same factors $f^0$ that enter into the equation for $Y$, plus an idiosyncratic component $\widetilde X_k$. Analogous to our discussion of Assumption 1(ii), we can obtain the bound on the norm of $\widetilde X_k$ by assuming that its entries $\widetilde X_{k,it}$ are mean zero, have bounded fourth moment, and are only weakly correlated across $i$ and $t$. We have thus provided a way to justify Assumption 4 without imposing any additional condition on the error matrix $e$, but by restricting the data generating process for the regressors $X_k$. Alternatively, one can derive the statement in the assumption by imposing weaker restrictions on $X_k$, but making further assumptions on the error matrix $e$. An example of this is provided by Theorem 4.2 below, where we only assume that $X_k = \bar X_k + \widetilde X_k$, with $\mathrm{rank}(\bar X_k)$ being bounded, but without assuming that $\bar X_k$ is generated by the factors $f^0$.

The discussion of Assumption 5 is more complicated. By Weyl's inequality we know that the absolute value of $\mu_t[B(\beta) + B'(\beta)] - \mu_t[B(\beta^0) + B'(\beta^0)]$ is bounded by the spectral norm of $B(\beta) + B'(\beta) - B(\beta^0) - B'(\beta^0)$, which is of order $O_p(N^{3/2}\, \|\beta - \beta^0\|) + O_p(N^2\, \|\beta - \beta^0\|^2)$. This bound is obviously too crude to justify the assumption. What we need here is a bound that takes into account not only the spectral norm of the difference between $B(\beta) + B'(\beta)$ and $B(\beta^0) + B'(\beta^0)$, but also the structure of the eigenvectors of the various matrices involved.

The assumption only restricts the properties of $B(\beta)$ in an $N^{-3/4}$ shrinking neighborhood of $\beta^0$. In this shrinking neighborhood the dominant term in $B(\beta) + B'(\beta)$ is $M_{f^0}\, e' M_{\lambda^0}\, e\, M_{f^0}$, since its spectral norm is of order $N$, while the spectral norm of the remaining terms, e.g. $A_{mixed}(\beta)$ above, is at most of order $N^{3/4}$. Our goal is to show that the largest few eigenvalues of $B(\beta) + B'(\beta)$ only differ by $o_p(1)$ from those of the leading term $M_{f^0}\, e' M_{\lambda^0}\, e\, M_{f^0}$, within the shrinking neighborhood of $\beta^0$. To do so, we first need to introduce some notation. Let $w_t \in \mathbb{R}^T$, $t = 1, \ldots, T - R^0$, be the normalized eigenvectors of $M_{f^0}\, e' M_{\lambda^0}\, e\, M_{f^0}$ with the constraint $f^{0\prime} w_t = 0$, and let $\rho_t$, $t = 1, \ldots, T - R^0$, be the corresponding eigenvalues. Let $v_i \in \mathbb{R}^N$, $i = 1, \ldots, N - R^0$, be the normalized eigenvectors of $M_{\lambda^0}\, e\, M_{f^0}\, e' M_{\lambda^0}$ with the constraint $\lambda^{0\prime} v_i = 0$, and let $\tilde\rho_i$, $i = 1, \ldots, N - R^0$, be the corresponding eigenvalues.¹¹

10. Weyl's inequality says $\mu_m(G + H) \leq \mu_m(G) + \mu_1(H)$ for arbitrary symmetric $n \times n$ matrices $G$ and $H$ and $m \leq n$. Here, we refer to a generalization of this, which reads $\sum_{t=1}^{m} \mu_t(G + H) \leq \sum_{t=1}^{m} \mu_t(G) + \sum_{t=1}^{m} \mu_t(H)$. These inequalities are standard results in linear algebra and are readily derived from the Courant-Fischer-Weyl min-max principle.

11. For $T < N$ the vectors $v_i$, $i = T - R^0 + 1, \ldots, N - R^0$, correspond to null-eigenvalues, and if there are multiple null-eigenvalues those $v_i$ are not uniquely defined. In that case we assume that those $v_i$ are drawn randomly from the Haar measure on the unit sphere of the corresponding null-eigenspace. For $T > N$ we assume the same for $w_t$, $t = N - R^0 + 1, \ldots, T - R^0$. This specification avoids correlation between $X_k$ and those $v_i$ and $w_t$ being caused by a particular choice of the eigenvectors that correspond to degenerate null-eigenvalues.

We assume that eigenvalues are sorted in decreasing order, i.e. $\rho_1 \geq \rho_2 \geq \ldots$. Note that the eigenvalues $\rho_t$ and $\tilde\rho_i$ are identical for $t = i$. Let

$$d^{(1)} = \max_{i,t,k} \left| v_i' X_k w_t \right|, \qquad d^{(2)} = \max_i \left\| v_i'\, e\, P_{f^0} \right\|, \qquad d^{(3)} = \max_t \left\| w_t'\, e'\, P_{\lambda^0} \right\|,$$
$$d^{(4)} = N^{-3/4} \max_{i,k} \left\| v_i' X_k P_{f^0} \right\|, \qquad d^{(5)} = N^{-3/4} \max_{t,k} \left\| w_t' X_k' P_{\lambda^0} \right\|,$$

where $i = 1, \ldots, N - R^0$, $t = 1, \ldots, T - R^0$ and $k = 1, \ldots, K$. Furthermore, define $d = \max\left( 1, d^{(1)}, d^{(2)}, d^{(3)}, d^{(4)}, d^{(5)} \right)$.

Theorem 4.1. Let Assumptions 1 and 3 hold, let $R > R^0$, and consider a limit $N, T \to \infty$ with $N/T \to \kappa^2$, $0 < \kappa < \infty$. Assume that $\rho_{R - R^0} > a\, N$, wpa1, for some constant $a > 0$. Furthermore, let there exist a sequence of integers $q = q_{NT} > R - R^0$ such that

$$\sqrt{q}\, d = o_p\left( N^{1/4} \right), \qquad \sum_{t=q}^{T - R^0} \left( \rho_{R - R^0} - \rho_t \right)^{-1} = O_p(1).$$

Then, for all constants $c > 0$ and $t = 1, \ldots, R - R^0$, we have

$$\sup_{\{\beta:\ N^{3/4}\|\beta - \beta^0\| \leq c\}} \left| \mu_t\left[ B(\beta) + B'(\beta) \right] - \rho_t \right| = o_p(1),$$

which implies that Assumption 5 is satisfied.

We can now justify Assumption 5 by showing that the conditions of Theorem 4.1 are satisfied. The following discussion is largely heuristic. Since $v_i$ and $w_t$ are the normalized eigenvectors of $M_{\lambda^0}\, e\, M_{f^0}\, e' M_{\lambda^0}$ and $M_{f^0}\, e' M_{\lambda^0}\, e\, M_{f^0}$, we expect them to be essentially uncorrelated with $X_k$ and $e$, and therefore we expect $|v_i' X_k w_t| = O_p(1)$, $\|v_i'\, e\, P_{f^0}\| = O_p(1)$, $\|w_t'\, e'\, P_{\lambda^0}\| = O_p(1)$. We also expect $\|v_i' X_k P_{f^0}\| = O_p(\sqrt{T})$ and $\|w_t' X_k' P_{\lambda^0}\| = O_p(\sqrt{N})$, which is different from the preceding terms with $e$, since $X_k$ can be correlated with $f^0$ and $\lambda^0$. In the definition of $d$ the maxima over these terms are taken over $i$, $t$ and $k$, so that we anticipate some weak dependence of $d$ on $N$ (or equivalently $T$). Note that we need $d = o_p(N^{1/4})$, since otherwise $q$ does not exist. The key to making this discussion rigorous, and to showing that indeed $d = o_p(N^{1/4})$ or smaller, is a good knowledge of the properties of the eigenvectors $v_i$ and $w_t$. If the entries $e_{it}$ are iid normal, then the matrix of $v_i$'s and $w_t$'s is Haar-distributed on the $N - R^0$ and $T - R^0$ dimensional subspaces spanned by $M_{\lambda^0}$ and $M_{f^0}$. In that case the formalization of the above discussion becomes relatively easy, and the result is summarized in Theorem 4.2 below. The conjecture in the random matrix theory literature is that the limiting distribution of the eigenvectors of a random covariance matrix is distribution free, i.e. is independent of the particular distribution of $e_{it}$ (see, e.g., Silverstein (1990), Bai (1999)). However, we are not aware of a formulation and corresponding proof of this conjecture that is sufficient for our purposes.

The second condition in Theorem 4.1 is on the eigenvalues $\rho_t$ of the random covariance matrix $M_{f^0}\, e' M_{\lambda^0}\, e\, M_{f^0}$. Eigenvalues are studied more intensely than eigenvectors in the random matrix theory literature, and it is well known that the properly normalized empirical distribution of the eigenvalues (the so-called empirical spectral distribution) of an iid sample covariance matrix converges to the Marčenko-Pastur law (Marčenko and Pastur (1967)) for asymptotics where $N$ and $T$ grow at the same rate. This means that the sum over the eigenvalues $\rho_t$ in Theorem 4.1 asymptotically becomes an integral over the Marčenko-Pastur limiting spectral distribution.¹² To derive a bound on this sum, one furthermore needs to know the asymptotic properties of $\rho_{R-R^0}$. For random covariance matrices from iid normal errors, it is known from Johnstone (2001) and Soshnikov (2002) that the properly normalized few largest eigenvalues converge to the Tracy-Widom law.¹³

An additional subtlety in the discussion of the eigenvalues and eigenvectors of the random covariance matrix $M_{f^0}\, e' M_{\lambda^0}\, e\, M_{f^0}$ are the projections with $M_{f^0}$ and $M_{\lambda^0}$, which stem from integrating out the true factors and factor loadings of the model. Those projectors are not normally present in the literature on large dimensional random covariance matrices. If the idiosyncratic error distribution is iid normal, these projections are unproblematic, since the distribution of $e$ is rotationally invariant in that case, i.e. the projections are mathematically equivalent to a reduction of the sample size by $R^0$ in both directions. Thus, if the $e_{it}$ are iid normal, then we can show that the conditions of Theorem 4.1 are satisfied, and we can therefore verify that the high-level assumptions of the last section hold. This result is summarized in the following theorem.

Theorem 4.2. Let $R > R^0$, let Assumption 3 hold, and consider a limit $N, T \to \infty$ with $N/T \to \kappa^2$, $0 < \kappa < \infty$. Furthermore, assume that

(i) For all $k = 1, \ldots, K$ we can decompose $X_k = \bar X_k + \widetilde X_k$, such that $\|\widetilde X_k\| = O_p(N^{3/4})$, $\|\widetilde X_k\|_{HS} = O_p(\sqrt{NT})$, $\|\bar X_k\|_{HS} = O_p(\sqrt{NT})$, and $\mathrm{rank}(\bar X_k) \leq Q$, where $Q$ is independent of $N$ and $T$. For the $K \times K$ matrix $\widetilde W$ defined by $\widetilde W_{k_1 k_2} = \frac{1}{NT} \mathrm{Tr}\left( \widetilde X_{k_1} \widetilde X_{k_2}' \right)$ we assume that $\mathrm{plim}_{N,T \to \infty} \widetilde W > 0$. In addition, we assume that $\mathbb{E}\left| (M_{\lambda^0} \widetilde X_k M_{f^0})_{it} \right|^{24+\epsilon}$, $\mathbb{E}\left| (M_{\lambda^0} \widetilde X_k)_{it} \right|^{6+\epsilon}$ and $\mathbb{E}\left| (\widetilde X_k M_{f^0})_{it} \right|^{6+\epsilon}$ are bounded uniformly across $i$, $t$, $N$ and $T$, for some $\epsilon > 0$.

(ii) The error matrix $e$ is independent of $\lambda^0$, $f^0$ and $X_k$, $k = 1, \ldots, K$, and its elements $e_{it}$ are distributed as iid $\mathcal{N}(0, \sigma^2)$.

Then Assumptions 1, 2, 4 and 5 are satisfied and we have $C^{(1)} = O_p(1)$. By Corollaries 3.2 and 3.5 we can therefore conclude

$$\sqrt{NT}\, (\hat\beta_R - \beta^0) = \sqrt{NT}\, (\hat\beta_{R^0} - \beta^0) + o_p(1).$$

The proofs of Theorem 4.1 and Theorem 4.2 are provided in the supplementary material to this paper. It seems to be quite challenging to extend Theorem 4.2 to non-iid-normal $e_{it}$, given the present status of the literature on eigenvalues and eigenvectors of large dimensional random covariance matrices, and we would like to leave this as a future research topic.

12. To make this argument mathematically rigorous one needs to know the convergence rate of the empirical spectral distribution to its limit law, which is an ongoing research subject in the literature, e.g. Bai (1993), Bai, Miao and Yao (2003), Götze and Tikhomirov (2010).

13. To our knowledge this result is not established for error distributions that are not normal. Soshnikov (2002) has a result under non-normality, but only for asymptotics with $N/T \to 1$.
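These random matrix facts are easy to illustrate numerically. A self-contained sketch (ours; the sizes and the cutoff q are arbitrary choices, and iid standard normal errors are assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, sigma = 300, 300, 1.0
e = sigma * rng.normal(size=(N, T))

rho = np.linalg.eigvalsh(e.T @ e)[::-1]          # eigenvalues, descending

# the largest eigenvalue concentrates near the Marcenko-Pastur upper edge
edge = sigma**2 * (np.sqrt(N) + np.sqrt(T))**2
print(rho[0] / edge)                              # -> close to 1

# rho_{R - R0} > a N wpa1: the top eigenvalues are all of order N
print(rho[:5] / N)

# the eigenvalue sum in Theorem 4.1 stays bounded once the top q
# eigenvalues are excluded (here R - R0 = 2 and q = 10)
R_minus_R0, q = 2, 10
print(np.sum(1.0 / (rho[R_minus_R0 - 1] - rho[q:])))
```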

5 Monte Carlo Simulations

Here we consider a panel model with one regressor ($K = 1$), two factors ($R^0 = 2$), and the following data generating process (DGP):

$$Y_{it} = \beta^0 X_{it} + \sum_{r=1}^{2} \lambda_{ir} f_{tr} + e_{it}, \qquad X_{it} = 1 + \mathring X_{it} + \sum_{r=1}^{2} \left( \lambda_{ir} + \chi_{ir} \right) f_{tr}, \qquad (5.1)$$

where $i = 1, \ldots, N$ and $t = 1, \ldots, T$. The random variables $\mathring X_{it}$, $\lambda_{ir}$, $f_{tr}$, $\chi_{ir}$ and $e_{it}$ are mutually independent; $\mathring X_{it}$ is distributed as iid $\mathcal{N}(1,1)$, and $\lambda_{ir}$, $f_{tr}$ and $\chi_{ir}$ are all distributed as iid $\mathcal{N}(0,1)$. For $e_{it}$ we also assume that it is iid across $i$ and $t$, but we consider two different specifications for the marginal distribution, namely either $\mathcal{N}(0,1)$ or a Student's t-distribution with 5 degrees of freedom. We choose $\beta^0 = 1$ and use 1,000 repetitions in our simulation. For each draw of $Y$ and $X$ we compute the QMLE $\hat\beta_R$ according to equation (2.3) for different values of $R$.

[Table 1 near here; its numeric entries are not recoverable from this transcription.]

Table 1: Simulation results for the bias and standard error (std) of the QMLE $\hat\beta_R$ for different values of $R$, two different sample sizes ($N = T = 50$ and $N = T = 100$), and the two different specifications for $e_{it}$ ($\mathcal{N}(0,1)$ and Student's t with 5 degrees of freedom). The data generating process is described in the main text; in particular, the true number of factors here is $R^0 = 2$. We used 1,000 repetitions in the simulation.

Table 1 reports the bias and standard error of $\hat\beta_R$ for sample sizes $N = T = 50$ and $N = T = 100$. For $R = 0$ (the OLS estimator) and $R = 1$ we have $R < R^0$, i.e. fewer factors are used in the estimation than are present in the DGP. As a result, the QMLE is heavily biased for these values of $R$, since the factor structure in the DGP is correlated with the regressors but is not controlled for in the estimation. In contrast, for all values $R \geq R^0$ the bias of the QMLE is negligible compared to its standard error. Furthermore, the standard error remains almost constant as $R$ increases beyond $R = R^0$; concretely, from $R = 2$ to $R = 5$ it increases only by about 7% for $N = T = 50$ and only by 5% for $N = T = 100$.

Table 2 reports quantiles of the appropriately normalized QMLE for $R \geq R^0$ and $N = T = 100$. One finds that the quantiles remain almost constant as $R$ increases.
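Equation (5.1) pins down the DGP completely, so the experiment can be re-run in outline. The sketch below is ours, not the authors' code: it uses the scalar profile likelihood (2.4), Brent minimization, a smaller sample size and fewer repetitions than the paper for speed, and the distributional choices as reconstructed above ($\mathring X_{it} \sim \mathcal{N}(1,1)$ is our reading of the garbled transcription). For $R \geq 2$ the bias should come out negligible relative to the standard error, as in Table 1:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def simulate(N, T, beta0=1.0, dist="normal", rng=None):
    # DGP (5.1): two factors, regressor correlated with lambda and f
    rng = rng or np.random.default_rng()
    lam = rng.normal(size=(N, 2))
    f = rng.normal(size=(T, 2))
    chi = rng.normal(size=(N, 2))
    e = rng.normal(size=(N, T)) if dist == "normal" else rng.standard_t(5, size=(N, T))
    X = 1.0 + rng.normal(loc=1.0, size=(N, T)) + (lam + chi) @ f.T
    Y = beta0 * X + lam @ f.T + e
    return Y, X

def qmle_1d(Y, X, R):
    # scalar-beta version of the profile likelihood (2.4)
    N, T = Y.shape
    def L(b):
        resid = Y - b * X
        return np.linalg.eigvalsh(resid.T @ resid)[: T - R].sum() / (N * T)
    return minimize_scalar(L, bracket=(0.0, 2.0)).x

rng = np.random.default_rng(0)
draws = [simulate(50, 50, rng=rng) for _ in range(200)]   # 200 reps for speed
for R in range(6):
    b = np.array([qmle_1d(Y, X, R) for Y, X in draws])
    print(R, f"bias={b.mean() - 1.0:+.4f}", f"std={b.std():.4f}")
```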

In particular, the differences in the quantiles across different values of $R$ are small relative to the differences between the quantiles themselves, so that the size of a test statistic based on $\hat\beta_R$ is essentially independent of the choice of $R \geq R^0$.

[Table 2 near here; its numeric entries are not recoverable from this transcription.]

Table 2: Simulation results for the quantiles (2.5%, 5%, 10%, 25%, 50%, 75%, 90%, 95%, 97.5%) of $\sqrt{NT}(\hat\beta_R - \beta^0)$ for $N = T = 100$, the two different specifications of $e_{it}$, different values of $R$ ($R = 2, 3, 4, 5$), and the data generating process as described in the main text (with $R^0 = 2$). We used 1,000 repetitions in the simulation.

The findings of the Monte Carlo simulations described in the last two paragraphs hold just as well for the specification with normally distributed $e_{it}$ as for the specification where $e_{it}$ has a Student's t-distribution. From this finding one may conjecture that Theorem 4.2 also holds for more general error distributions.

6 Conclusions

In this paper we showed that under certain regularity conditions the limiting distribution of the QMLE of a linear panel regression with interactive fixed effects does not change when we include redundant factors in the estimation. The important empirical implication of this result is that one can use an upper bound of the number of factors in the estimation without asymptotic efficiency loss. For inference on the regression coefficients one thus need not worry about consistent estimation of the number of factors in the model. As regularity conditions we mostly impose high-level assumptions, and we verify that these hold under iid normal errors. Our simulation results suggest that normality of the error distribution is not necessary. Along the lines of the arguments presented in Section 4, we expect that progress in the literature on large dimensional random covariance matrices will allow verification of our high-level assumptions under more general error distributions. This is a vital and interesting topic for future research.

A Appendix

A.1 Proof of Consistency

Proof of Theorem 2.1. We first establish a lower bound on $\mathcal{L}_R(\beta)$. Consider the last expression for $\mathcal{L}_R(\beta)$ in equation (2.4), plug in $Y = \beta^0 \cdot X + \lambda^0 f^{0\prime} + e$, and then replace

$\lambda^0 f^{0\prime}$ by a general $\lambda f'$, minimizing over the $N \times R$ matrix $\lambda$ and the $T \times R$ matrix $f$. This gives

$$\mathcal{L}_R(\beta) \geq \min_{F \in \mathbb{R}^{T \times (R + R^0)}} \frac{1}{NT} \mathrm{Tr}\left[ \left( (\beta^0 - \beta) \cdot X + e \right) M_F \left( (\beta^0 - \beta) \cdot X + e \right)' \right]$$
$$\geq b\, \|\beta^0 - \beta\|^2 - O_p\!\left( (\min(N,T))^{-1/2} \right) \|\beta^0 - \beta\| + \frac{1}{NT} \mathrm{Tr}(ee') - O_p\!\left( (\min(N,T))^{-1} \right), \qquad (A.1)$$

where in the first line we minimize over all $T \times (R + R^0)$ matrices $F$, and to arrive at the second line we decomposed the expression into the components quadratic in $\beta^0 - \beta$, linear in $\beta^0 - \beta$, and independent of $\beta^0 - \beta$, and applied Assumptions 1 and 2. Next, we establish an upper bound on $\mathcal{L}_R(\beta^0)$. We have

$$\mathcal{L}_R(\beta^0) = \frac{1}{NT} \sum_{t=R+1}^{T} \mu_t\left[ \left( \lambda^0 f^{0\prime} + e \right)' \left( \lambda^0 f^{0\prime} + e \right) \right] \leq \frac{1}{NT} \mathrm{Tr}\left( e' M_{\lambda^0} e \right) \leq \frac{1}{NT} \mathrm{Tr}(ee'). \qquad (A.2)$$

Further details regarding the derivation of the bounds (A.1) and (A.2) are presented in the supplementary material. Since we could choose $\beta = \beta^0$ in the minimization (2.3), the optimal $\hat\beta_R$ needs to satisfy $\mathcal{L}_R(\hat\beta_R) \leq \mathcal{L}_R(\beta^0)$. With the above results we thus find

$$b\, \|\hat\beta_R - \beta^0\|^2 \leq O_p\!\left( (\min(N,T))^{-1/2} \right) \|\hat\beta_R - \beta^0\| + O_p\!\left( (\min(N,T))^{-1} \right). \qquad (A.3)$$

From this it follows that $\|\hat\beta_R - \beta^0\| = O_p\!\left( (\min(N,T))^{-1/2} \right)$, which is what we wanted to show.

A.2 Proof of Likelihood Expansion

Definition. For the $N \times R^0$ matrix $\lambda$ and the $T \times R^0$ matrix $f$ we define

$$d_{max}(\lambda, f) = \frac{1}{\sqrt{NT}}\, \|\lambda f'\| = \sqrt{\mu_1\!\left( \frac{\lambda f' f \lambda'}{NT} \right)}, \qquad d_{min}(\lambda, f) = \sqrt{\mu_{R^0}\!\left( \frac{\lambda f' f \lambda'}{NT} \right)}, \qquad (A.4)$$

i.e. $d_{max}(\lambda, f)$ and $d_{min}(\lambda, f)$ are the square roots of the maximal and the minimal eigenvalue of $\lambda f' f \lambda'/NT$. Furthermore, the convergence radius $r^0(\lambda, f)$ is defined by

$$r^0(\lambda, f) = \left( \frac{4\, d_{max}(\lambda, f)}{d_{min}^2(\lambda, f)} + \frac{2}{d_{max}(\lambda, f)} \right)^{-1}. \qquad (A.5)$$
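The quantities in (A.4)-(A.5) are directly computable. A short numerical check (ours, using the radius as reconstructed in (A.5)) that $r^0$ stays bounded away from zero under the strong factor assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, R0 = 200, 200, 2
lam, f = rng.normal(size=(N, R0)), rng.normal(size=(T, R0))

mu = np.linalg.eigvalsh(lam @ f.T @ f @ lam.T / (N * T))  # ascending eigenvalues
d_max, d_min = np.sqrt(mu[-1]), np.sqrt(mu[-R0])          # square roots of extremes
r0 = 1.0 / (4 * d_max / d_min**2 + 2 / d_max)             # convergence radius (A.5)
print(d_max, d_min, r0)
```

With iid standard normal loadings and factors, $d_{max}$ and $d_{min}$ are both close to one, so $r^0$ is close to $1/6$, consistent with the claim that $r^0(\lambda^0, f^0)$ converges to a positive constant.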

Lemma A.1. If the following condition is satisfied:

$$\sum_{k=1}^{K} \left| \beta_k^0 - \beta_k \right| \frac{\|X_k\|}{\sqrt{NT}} + \frac{\|e\|}{\sqrt{NT}} < r^0(\lambda^0, f^0), \qquad (A.6)$$

then

(i) the profile quasi likelihood function can be written as a power series in the $K + 1$ parameters $\epsilon_0 = \|e\|/\sqrt{NT}$ and $\epsilon_k = \beta_k^0 - \beta_k$, namely

$$\mathcal{L}^0(\beta) = \sum_{g=2}^{\infty} \sum_{k_1=0}^{K} \sum_{k_2=0}^{K} \cdots \sum_{k_g=0}^{K} \epsilon_{k_1} \epsilon_{k_2} \cdots \epsilon_{k_g}\, L^{(g)}(\lambda^0, f^0, X_{k_1}, X_{k_2}, \ldots, X_{k_g}),$$

where the expansion coefficients are given by¹⁴

$$L^{(g)}(\lambda^0, f^0, X_{k_1}, X_{k_2}, \ldots, X_{k_g}) = \tilde L^{(g)}(\lambda^0, f^0, X_{(k_1}, X_{k_2}, \ldots, X_{k_g)}) = \frac{1}{g!} \left[ \tilde L^{(g)}(\lambda^0, f^0, X_{k_1}, X_{k_2}, \ldots, X_{k_g}) + \text{all permutations of } k_1, \ldots, k_g \right],$$

i.e. $L^{(g)}$ is obtained by total symmetrization of the last $g$ arguments of¹⁵

$$\tilde L^{(g)}(\lambda^0, f^0, X_{k_1}, \ldots, X_{k_g}) = \sum_{p=1}^{g} \frac{(-1)^{p+1}}{p} \sum_{\substack{\nu_1 + \ldots + \nu_p = g \\ m_1 + \ldots + m_{p+1} = p - 1 \\ 1 \leq \nu_j \leq 2,\ 0 \leq m_j}} \mathrm{Tr}\left( S^{(m_1)}\, T^{(\nu_1)}\, S^{(m_2)}\, T^{(\nu_2)} \cdots S^{(m_p)}\, T^{(\nu_p)}\, S^{(m_{p+1})} \right),$$

where the indices $k_1, \ldots, k_g$ are distributed in this order over the factors $T^{(\nu_1)}, \ldots, T^{(\nu_p)}$ (each $T^{(1)}$ carries one index and each $T^{(2)}$ carries two), and

$$S^{(0)} = M_{\lambda^0}, \qquad S^{(m)} = \left[ NT\, \lambda^0 (\lambda^{0\prime}\lambda^0)^{-1} (f^{0\prime}f^0)^{-1} (\lambda^{0\prime}\lambda^0)^{-1} \lambda^{0\prime} \right]^m \text{ for } m \geq 1,$$
$$T^{(1)}_{k} = \frac{1}{NT} \left( \lambda^0 f^{0\prime} X_k' + X_k f^0 \lambda^{0\prime} \right), \qquad T^{(2)}_{k_1 k_2} = \frac{1}{NT}\, X_{k_1} X_{k_2}',$$

for $k, k_1, k_2 = 0, 1, \ldots, K$, where $X_0 = \sqrt{NT}\, e / \|e\|$ and $X_k$, for $k = 1, \ldots, K$, are the regressors.

(ii) the projector $M_{\hat\lambda}(\beta)$ can be written as a power series in the same parameters $\epsilon_k$, $k = 0, \ldots, K$, namely

$$M_{\hat\lambda}(\beta) = \sum_{g=0}^{\infty} \sum_{k_1=0}^{K} \sum_{k_2=0}^{K} \cdots \sum_{k_g=0}^{K} \epsilon_{k_1} \epsilon_{k_2} \cdots \epsilon_{k_g}\, M^{(g)}(\lambda^0, f^0, X_{k_1}, X_{k_2}, \ldots, X_{k_g}),$$

where the expansion coefficients are given by $M^{(0)}(\lambda^0, f^0) = M_{\lambda^0}$, and for $g \geq 1$

$$M^{(g)}(\lambda^0, f^0, X_{k_1}, X_{k_2}, \ldots, X_{k_g}) = \tilde M^{(g)}(\lambda^0, f^0, X_{(k_1}, X_{k_2}, \ldots, X_{k_g)}) = \frac{1}{g!} \left[ \tilde M^{(g)}(\lambda^0, f^0, X_{k_1}, \ldots, X_{k_g}) + \text{all permutations of } k_1, \ldots, k_g \right],$$

14. Here we use the round bracket notation $(k_1, k_2, \ldots, k_g)$ for total symmetrization of these indices.

15. One finds $L^{(1)}(\lambda^0, f^0, X_{k_1}) = 0$, which is why the sum in the power series of $\mathcal{L}^0$ starts from $g = 2$ instead of $g = 1$.

i.e. $M^{(g)}$ is obtained by total symmetrization of the last $g$ arguments of

$$\tilde M^{(g)}(\lambda^0, f^0, X_{k_1}, \ldots, X_{k_g}) = \sum_{p=1}^{g} (-1)^{p+1} \sum_{\substack{\nu_1 + \ldots + \nu_p = g \\ m_1 + \ldots + m_{p+1} = p \\ 1 \leq \nu_j \leq 2,\ 0 \leq m_j}} S^{(m_1)}\, T^{(\nu_1)}\, S^{(m_2)}\, T^{(\nu_2)} \cdots S^{(m_p)}\, T^{(\nu_p)}\, S^{(m_{p+1})},$$

where $S^{(m)}$, $T^{(1)}_k$, $T^{(2)}_{k_1 k_2}$, and $X_k$ are given above.

(iii) For $g \geq 3$ the coefficients $L^{(g)}$ in the series expansion of $\mathcal{L}^0(\beta)$ are bounded as follows:

$$\left| L^{(g)}(\lambda, f, X_{k_1}, X_{k_2}, \ldots, X_{k_g}) \right| \leq \frac{R^0\, g\, d_{min}^2(\lambda, f)}{2} \left( \frac{6\, d_{max}(\lambda, f)}{d_{min}^2(\lambda, f)} \right)^g \frac{\|X_{k_1}\|}{\sqrt{NT}} \frac{\|X_{k_2}\|}{\sqrt{NT}} \cdots \frac{\|X_{k_g}\|}{\sqrt{NT}}.$$

Under the stronger condition

$$\sum_{k=1}^{K} \left| \beta_k^0 - \beta_k \right| \frac{\|X_k\|}{\sqrt{NT}} + \frac{\|e\|}{\sqrt{NT}} < \frac{d_{min}^2(\lambda^0, f^0)}{6\, d_{max}(\lambda^0, f^0)}, \qquad (A.7)$$

we therefore have the following bound on the remainder when the series expansion for $\mathcal{L}^0(\beta)$ is truncated at order $G \geq 2$:

$$\left| \mathcal{L}^0(\beta) - \sum_{g=2}^{G} \sum_{k_1, \ldots, k_g = 0}^{K} \epsilon_{k_1} \cdots \epsilon_{k_g}\, L^{(g)}(\lambda^0, f^0, X_{k_1}, \ldots, X_{k_g}) \right| \leq \frac{R^0\, d_{min}^2(\lambda^0, f^0)}{2}\, \frac{(G+1)\, \alpha^{G+1}}{(1 - \alpha)^2},$$

where

$$\alpha = \frac{6\, d_{max}(\lambda^0, f^0)}{d_{min}^2(\lambda^0, f^0)} \left( \sum_{k=1}^{K} \left| \beta_k^0 - \beta_k \right| \frac{\|X_k\|}{\sqrt{NT}} + \frac{\|e\|}{\sqrt{NT}} \right) < 1.$$

(iv) The operator norm of the coefficient $M^{(g)}$ in the series expansion of $M_{\hat\lambda}(\beta)$ is bounded as follows, for $g \geq 1$:

$$\left\| M^{(g)}(\lambda, f, X_{k_1}, X_{k_2}, \ldots, X_{k_g}) \right\| \leq \frac{g}{2} \left( \frac{6\, d_{max}(\lambda, f)}{d_{min}^2(\lambda, f)} \right)^g \frac{\|X_{k_1}\|}{\sqrt{NT}} \frac{\|X_{k_2}\|}{\sqrt{NT}} \cdots \frac{\|X_{k_g}\|}{\sqrt{NT}}.$$

Under the condition (A.7) we therefore have the following bound on the operator norm


Cross-fitting and fast remainder rates for semiparametric estimation Cross-fitting and fast remainder rates for semiparametric estimation Whitney K. Newey James M. Robins The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP41/17 Cross-Fitting

More information

Specification Test for Instrumental Variables Regression with Many Instruments

Specification Test for Instrumental Variables Regression with Many Instruments Specification Test for Instrumental Variables Regression with Many Instruments Yoonseok Lee and Ryo Okui April 009 Preliminary; comments are welcome Abstract This paper considers specification testing

More information

Non-linear panel data modeling

Non-linear panel data modeling Non-linear panel data modeling Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini May 2010 Laura Magazzini (@univr.it) Non-linear panel data modeling May 2010 1

More information

LECTURE ON HAC COVARIANCE MATRIX ESTIMATION AND THE KVB APPROACH

LECTURE ON HAC COVARIANCE MATRIX ESTIMATION AND THE KVB APPROACH LECURE ON HAC COVARIANCE MARIX ESIMAION AND HE KVB APPROACH CHUNG-MING KUAN Institute of Economics Academia Sinica October 20, 2006 ckuan@econ.sinica.edu.tw www.sinica.edu.tw/ ckuan Outline C.-M. Kuan,

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis

Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis Sílvia Gonçalves and Benoit Perron Département de sciences économiques,

More information

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Matthew Harding and Carlos Lamarche January 12, 2011 Abstract We propose a method for estimating

More information

Applied Microeconometrics (L5): Panel Data-Basics

Applied Microeconometrics (L5): Panel Data-Basics Applied Microeconometrics (L5): Panel Data-Basics Nicholas Giannakopoulos University of Patras Department of Economics ngias@upatras.gr November 10, 2015 Nicholas Giannakopoulos (UPatras) MSc Applied Economics

More information

PCA with random noise. Van Ha Vu. Department of Mathematics Yale University

PCA with random noise. Van Ha Vu. Department of Mathematics Yale University PCA with random noise Van Ha Vu Department of Mathematics Yale University An important problem that appears in various areas of applied mathematics (in particular statistics, computer science and numerical

More information

The exact bias of S 2 in linear panel regressions with spatial autocorrelation SFB 823. Discussion Paper. Christoph Hanck, Walter Krämer

The exact bias of S 2 in linear panel regressions with spatial autocorrelation SFB 823. Discussion Paper. Christoph Hanck, Walter Krämer SFB 83 The exact bias of S in linear panel regressions with spatial autocorrelation Discussion Paper Christoph Hanck, Walter Krämer Nr. 8/00 The exact bias of S in linear panel regressions with spatial

More information

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis. Vector spaces DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Vector space Consists of: A set V A scalar

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

GMM estimation of spatial panels

GMM estimation of spatial panels MRA Munich ersonal ReEc Archive GMM estimation of spatial panels Francesco Moscone and Elisa Tosetti Brunel University 7. April 009 Online at http://mpra.ub.uni-muenchen.de/637/ MRA aper No. 637, posted

More information

Chapter 3 Transformations

Chapter 3 Transformations Chapter 3 Transformations An Introduction to Optimization Spring, 2014 Wei-Ta Chu 1 Linear Transformations A function is called a linear transformation if 1. for every and 2. for every If we fix the bases

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

Multivariate Statistics Random Projections and Johnson-Lindenstrauss Lemma

Multivariate Statistics Random Projections and Johnson-Lindenstrauss Lemma Multivariate Statistics Random Projections and Johnson-Lindenstrauss Lemma Suppose again we have n sample points x,..., x n R p. The data-point x i R p can be thought of as the i-th row X i of an n p-dimensional

More information

Model Selection and Geometry

Model Selection and Geometry Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model

More information

15 Singular Value Decomposition

15 Singular Value Decomposition 15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information

Bootstrap prediction intervals for factor models

Bootstrap prediction intervals for factor models Bootstrap prediction intervals for factor models Sílvia Gonçalves and Benoit Perron Département de sciences économiques, CIREQ and CIRAO, Université de Montréal April, 3 Abstract We propose bootstrap prediction

More information

Robust Backtesting Tests for Value-at-Risk Models

Robust Backtesting Tests for Value-at-Risk Models Robust Backtesting Tests for Value-at-Risk Models Jose Olmo City University London (joint work with Juan Carlos Escanciano, Indiana University) Far East and South Asia Meeting of the Econometric Society

More information

MATH 205C: STATIONARY PHASE LEMMA

MATH 205C: STATIONARY PHASE LEMMA MATH 205C: STATIONARY PHASE LEMMA For ω, consider an integral of the form I(ω) = e iωf(x) u(x) dx, where u Cc (R n ) complex valued, with support in a compact set K, and f C (R n ) real valued. Thus, I(ω)

More information

Random matrices: Distribution of the least singular value (via Property Testing)

Random matrices: Distribution of the least singular value (via Property Testing) Random matrices: Distribution of the least singular value (via Property Testing) Van H. Vu Department of Mathematics Rutgers vanvu@math.rutgers.edu (joint work with T. Tao, UCLA) 1 Let ξ be a real or complex-valued

More information

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,

More information

On singular values distribution of a matrix large auto-covariance in the ultra-dimensional regime. Title

On singular values distribution of a matrix large auto-covariance in the ultra-dimensional regime. Title itle On singular values distribution of a matrix large auto-covariance in the ultra-dimensional regime Authors Wang, Q; Yao, JJ Citation Random Matrices: heory and Applications, 205, v. 4, p. article no.

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

Multiple integrals: Sufficient conditions for a local minimum, Jacobi and Weierstrass-type conditions

Multiple integrals: Sufficient conditions for a local minimum, Jacobi and Weierstrass-type conditions Multiple integrals: Sufficient conditions for a local minimum, Jacobi and Weierstrass-type conditions March 6, 2013 Contents 1 Wea second variation 2 1.1 Formulas for variation........................

More information

UTILIZING PRIOR KNOWLEDGE IN ROBUST OPTIMAL EXPERIMENT DESIGN. EE & CS, The University of Newcastle, Australia EE, Technion, Israel.

UTILIZING PRIOR KNOWLEDGE IN ROBUST OPTIMAL EXPERIMENT DESIGN. EE & CS, The University of Newcastle, Australia EE, Technion, Israel. UTILIZING PRIOR KNOWLEDGE IN ROBUST OPTIMAL EXPERIMENT DESIGN Graham C. Goodwin James S. Welsh Arie Feuer Milan Depich EE & CS, The University of Newcastle, Australia 38. EE, Technion, Israel. Abstract:

More information

8. Instrumental variables regression

8. Instrumental variables regression 8. Instrumental variables regression Recall: In Section 5 we analyzed five sources of estimation bias arising because the regressor is correlated with the error term Violation of the first OLS assumption

More information

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator by Emmanuel Flachaire Eurequa, University Paris I Panthéon-Sorbonne December 2001 Abstract Recent results of Cribari-Neto and Zarkos

More information

Chapter 2. Dynamic panel data models

Chapter 2. Dynamic panel data models Chapter 2. Dynamic panel data models School of Economics and Management - University of Geneva Christophe Hurlin, Université of Orléans University of Orléans April 2018 C. Hurlin (University of Orléans)

More information

A Factor Analytical Method to Interactive Effects Dynamic Panel Models with or without Unit Root

A Factor Analytical Method to Interactive Effects Dynamic Panel Models with or without Unit Root A Factor Analytical Method to Interactive Effects Dynamic Panel Models with or without Unit Root Joakim Westerlund Deakin University Australia March 19, 2014 Westerlund (Deakin) Factor Analytical Method

More information

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation Inference about Clustering and Parametric Assumptions in Covariance Matrix Estimation Mikko Packalen y Tony Wirjanto z 26 November 2010 Abstract Selecting an estimator for the variance covariance matrix

More information

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data Sri Lankan Journal of Applied Statistics (Special Issue) Modern Statistical Methodologies in the Cutting Edge of Science Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations

More information

Estimation of random coefficients logit demand models with interactive fixed effects

Estimation of random coefficients logit demand models with interactive fixed effects Estimation of random coefficients logit demand models with interactive fixed effects Hyungsik Roger Moon Matthew Shum Martin Weidner The Institute for Fiscal Studies Department of Economics, UCL cemmap

More information

A Practitioner s Guide to Cluster-Robust Inference

A Practitioner s Guide to Cluster-Robust Inference A Practitioner s Guide to Cluster-Robust Inference A. C. Cameron and D. L. Miller presented by Federico Curci March 4, 2015 Cameron Miller Cluster Clinic II March 4, 2015 1 / 20 In the previous episode

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

12.4 Known Channel (Water-Filling Solution)

12.4 Known Channel (Water-Filling Solution) ECEn 665: Antennas and Propagation for Wireless Communications 54 2.4 Known Channel (Water-Filling Solution) The channel scenarios we have looed at above represent special cases for which the capacity

More information

Economics 583: Econometric Theory I A Primer on Asymptotics

Economics 583: Econometric Theory I A Primer on Asymptotics Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:

More information

The Linear Regression Model

The Linear Regression Model The Linear Regression Model Carlo Favero Favero () The Linear Regression Model 1 / 67 OLS To illustrate how estimation can be performed to derive conditional expectations, consider the following general

More information

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9. Section 7 Model Assessment This section is based on Stock and Watson s Chapter 9. Internal vs. external validity Internal validity refers to whether the analysis is valid for the population and sample

More information

Porcupine Neural Networks: (Almost) All Local Optima are Global

Porcupine Neural Networks: (Almost) All Local Optima are Global Porcupine Neural Networs: (Almost) All Local Optima are Global Soheil Feizi, Hamid Javadi, Jesse Zhang and David Tse arxiv:1710.0196v1 [stat.ml] 5 Oct 017 Stanford University Abstract Neural networs have

More information

Latent Variable Models and EM algorithm

Latent Variable Models and EM algorithm Latent Variable Models and EM algorithm SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic 3.1 Clustering and Mixture Modelling K-means and hierarchical clustering are non-probabilistic

More information

Stat 159/259: Linear Algebra Notes

Stat 159/259: Linear Algebra Notes Stat 159/259: Linear Algebra Notes Jarrod Millman November 16, 2015 Abstract These notes assume you ve taken a semester of undergraduate linear algebra. In particular, I assume you are familiar with the

More information

Linear Models in Econometrics

Linear Models in Econometrics Linear Models in Econometrics Nicky Grant At the most fundamental level econometrics is the development of statistical techniques suited primarily to answering economic questions and testing economic theories.

More information

Econ 582 Fixed Effects Estimation of Panel Data

Econ 582 Fixed Effects Estimation of Panel Data Econ 582 Fixed Effects Estimation of Panel Data Eric Zivot May 28, 2012 Panel Data Framework = x 0 β + = 1 (individuals); =1 (time periods) y 1 = X β ( ) ( 1) + ε Main question: Is x uncorrelated with?

More information

Tests of the Co-integration Rank in VAR Models in the Presence of a Possible Break in Trend at an Unknown Point

Tests of the Co-integration Rank in VAR Models in the Presence of a Possible Break in Trend at an Unknown Point Tests of the Co-integration Rank in VAR Models in the Presence of a Possible Break in Trend at an Unknown Point David Harris, Steve Leybourne, Robert Taylor Monash U., U. of Nottingam, U. of Essex Economics

More information

The regression model with one stochastic regressor (part II)

The regression model with one stochastic regressor (part II) The regression model with one stochastic regressor (part II) 3150/4150 Lecture 7 Ragnar Nymoen 6 Feb 2012 We will finish Lecture topic 4: The regression model with stochastic regressor We will first look

More information

CS281 Section 4: Factor Analysis and PCA

CS281 Section 4: Factor Analysis and PCA CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we

More information

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure A Robust Approach to Estimating Production Functions: Replication of the ACF procedure Kyoo il Kim Michigan State University Yao Luo University of Toronto Yingjun Su IESR, Jinan University August 2018

More information

Large Sample Properties of Estimators in the Classical Linear Regression Model

Large Sample Properties of Estimators in the Classical Linear Regression Model Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in

More information

Convergence of Eigenspaces in Kernel Principal Component Analysis

Convergence of Eigenspaces in Kernel Principal Component Analysis Convergence of Eigenspaces in Kernel Principal Component Analysis Shixin Wang Advanced machine learning April 19, 2016 Shixin Wang Convergence of Eigenspaces April 19, 2016 1 / 18 Outline 1 Motivation

More information

Estimating Gaussian Mixture Densities with EM A Tutorial

Estimating Gaussian Mixture Densities with EM A Tutorial Estimating Gaussian Mixture Densities with EM A Tutorial Carlo Tomasi Due University Expectation Maximization (EM) [4, 3, 6] is a numerical algorithm for the maximization of functions of several variables

More information

Estimation, Inference, and Hypothesis Testing

Estimation, Inference, and Hypothesis Testing Chapter 2 Estimation, Inference, and Hypothesis Testing Note: The primary reference for these notes is Ch. 7 and 8 of Casella & Berger 2. This text may be challenging if new to this topic and Ch. 7 of

More information

Missing dependent variables in panel data models

Missing dependent variables in panel data models Missing dependent variables in panel data models Jason Abrevaya Abstract This paper considers estimation of a fixed-effects model in which the dependent variable may be missing. For cross-sectional units

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 4 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 23 Recommended Reading For the today Serial correlation and heteroskedasticity in

More information

Lecture 7: Dynamic panel models 2

Lecture 7: Dynamic panel models 2 Lecture 7: Dynamic panel models 2 Ragnar Nymoen Department of Economics, UiO 25 February 2010 Main issues and references The Arellano and Bond method for GMM estimation of dynamic panel data models A stepwise

More information

Estimation of the Conditional Variance in Paired Experiments

Estimation of the Conditional Variance in Paired Experiments Estimation of the Conditional Variance in Paired Experiments Alberto Abadie & Guido W. Imbens Harvard University and BER June 008 Abstract In paired randomized experiments units are grouped in pairs, often

More information

Optimal spectral shrinkage and PCA with heteroscedastic noise

Optimal spectral shrinkage and PCA with heteroscedastic noise Optimal spectral shrinage and PCA with heteroscedastic noise William Leeb and Elad Romanov Abstract This paper studies the related problems of denoising, covariance estimation, and principal component

More information

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 7: Cluster Sampling Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of roups and

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

Eigenvalues, random walks and Ramanujan graphs

Eigenvalues, random walks and Ramanujan graphs Eigenvalues, random walks and Ramanujan graphs David Ellis 1 The Expander Mixing lemma We have seen that a bounded-degree graph is a good edge-expander if and only if if has large spectral gap If G = (V,

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Obtaining Critical Values for Test of Markov Regime Switching

Obtaining Critical Values for Test of Markov Regime Switching University of California, Santa Barbara From the SelectedWorks of Douglas G. Steigerwald November 1, 01 Obtaining Critical Values for Test of Markov Regime Switching Douglas G Steigerwald, University of

More information

A NOTE ON SPURIOUS BREAK

A NOTE ON SPURIOUS BREAK Econometric heory, 14, 1998, 663 669+ Printed in the United States of America+ A NOE ON SPURIOUS BREAK JUSHAN BAI Massachusetts Institute of echnology When the disturbances of a regression model follow

More information

The propensity score with continuous treatments

The propensity score with continuous treatments 7 The propensity score with continuous treatments Keisuke Hirano and Guido W. Imbens 1 7.1 Introduction Much of the work on propensity score analysis has focused on the case in which the treatment is binary.

More information

arxiv: v5 [math.na] 16 Nov 2017

arxiv: v5 [math.na] 16 Nov 2017 RANDOM PERTURBATION OF LOW RANK MATRICES: IMPROVING CLASSICAL BOUNDS arxiv:3.657v5 [math.na] 6 Nov 07 SEAN O ROURKE, VAN VU, AND KE WANG Abstract. Matrix perturbation inequalities, such as Weyl s theorem

More information

Small Ball Probability, Arithmetic Structure and Random Matrices

Small Ball Probability, Arithmetic Structure and Random Matrices Small Ball Probability, Arithmetic Structure and Random Matrices Roman Vershynin University of California, Davis April 23, 2008 Distance Problems How far is a random vector X from a given subspace H in

More information

ECON 4160, Autumn term Lecture 1

ECON 4160, Autumn term Lecture 1 ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information

Singular Value Decomposition

Singular Value Decomposition Chapter 6 Singular Value Decomposition In Chapter 5, we derived a number of algorithms for computing the eigenvalues and eigenvectors of matrices A R n n. Having developed this machinery, we complete our

More information

Interpreting Regression Results

Interpreting Regression Results Interpreting Regression Results Carlo Favero Favero () Interpreting Regression Results 1 / 42 Interpreting Regression Results Interpreting regression results is not a simple exercise. We propose to split

More information

ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS

ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS Olivier Scaillet a * This draft: July 2016. Abstract This note shows that adding monotonicity or convexity

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information