Asymptotic Distributions of Instrumental Variables Statistics with Many Instruments

Similar documents
Testing for Weak Instruments in Linear IV Regression

Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity

Approximate Distributions of the Likelihood Ratio Statistic in a Structural Equation with Many Instruments

Specification Test for Instrumental Variables Regression with Many Instruments

Likelihood Ratio Based Test for the Exogeneity and the Relevance of Instrumental Variables

Department of Econometrics and Business Statistics

Likelihood Ratio based Joint Test for the Exogeneity and. the Relevance of Instrumental Variables

Lecture 11 Weak IV. Econ 715

Imbens/Wooldridge, Lecture Notes 13, Summer 07 1

GMM, Weak Instruments, and Weak Identification

Instrumental Variable Estimation with Heteroskedasticity and Many Instruments

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

What s New in Econometrics. Lecture 13

A CONDITIONAL LIKELIHOOD RATIO TEST FOR STRUCTURAL MODELS. By Marcelo J. Moreira 1

Location Properties of Point Estimators in Linear Instrumental Variables and Related Models

Symmetric Jackknife Instrumental Variable Estimation

ORTHOGONALITY TESTS IN LINEAR MODELS SEUNG CHAN AHN ARIZONA STATE UNIVERSITY ABSTRACT

CIRJE-F-577 On Finite Sample Properties of Alternative Estimators of Coefficients in a Structural Equation with Many Instruments

Identification and Inference with Many Invalid Instruments

Online Appendix. j=1. φ T (ω j ) vec (EI T (ω j ) f θ0 (ω j )). vec (EI T (ω) f θ0 (ω)) = O T β+1/2) = o(1), M 1. M T (s) exp ( isω)

Comparing Forecast Accuracy of Different Models for Prices of Metal Commodities

AN EFFICIENT GLS ESTIMATOR OF TRIANGULAR MODELS WITH COVARIANCE RESTRICTIONS*

Economics 583: Econometric Theory I A Primer on Asymptotics

Understanding Regressions with Observations Collected at High Frequency over Long Span

Weak Instruments and the First-Stage Robust F-statistic in an IV regression with heteroskedastic errors

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

Increasing the Power of Specification Tests. November 18, 2018

A Robust Test for Weak Instruments in Stata

Exogeneity tests and weak identification

Simultaneous Equations and Weak Instruments under Conditionally Heteroscedastic Disturbances

Inference with Weak Instruments

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

Alternative Estimation Methods of Simultaneous Equations and Microeconometric Models

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

STAT 200C: High-dimensional Statistics

1 Appendix A: Matrix Algebra

Instrumental Variables Estimation in Stata

STAT 200C: High-dimensional Statistics

Instrumental Variables Regression, GMM, and Weak Instruments in Time Series

Estimation and Inference with Weak Identi cation

ECON Introductory Econometrics. Lecture 16: Instrumental variables

1 Motivation for Instrumental Variable (IV) Regression

Regression. Saraswata Chaudhuri, Thomas Richardson, James Robins and Eric Zivot. Working Paper no. 73. Center for Statistics and the Social Sciences

Title. Description. var intro Introduction to vector autoregressive models

The Estimation of Simultaneous Equation Models under Conditional Heteroscedasticity

The Case Against JIVE

Birkbeck Working Papers in Economics & Finance

The LIML Estimator Has Finite Moments! T. W. Anderson. Department of Economics and Department of Statistics. Stanford University, Stanford, CA 94305

Large Sample Properties of Estimators in the Classical Linear Regression Model

The properties of L p -GMM estimators

A more powerful subvector Anderson and Rubin test in linear instrumental variables regression. Patrik Guggenberger Pennsylvania State University

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Weak Identification in Maximum Likelihood: A Question of Information

A Primer on Asymptotics

Economics 241B Review of Limit Theorems for Sequences of Random Variables

A New Paradigm: A Joint Test of Structural and Correlation Parameters in Instrumental Variables Regression When Perfect Exogeneity is Violated

Missing dependent variables in panel data models

ESSAYS ON INSTRUMENTAL VARIABLES. Enrique Pinzón García. A dissertation submitted in partial fulfillment of. the requirements for the degree of

Partial Identification and Confidence Intervals

Econometric Methods. Prediction / Violation of A-Assumptions. Burcu Erdogan. Universität Trier WS 2011/2012

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

17/003. Alternative moment conditions and an efficient GMM estimator for dynamic panel data models. January, 2017

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

OPTIMAL TWO-SIDED INVARIANT SIMILAR TESTS FOR INSTRUMENTAL VARIABLES REGRESSION DONALD W. K. ANDREWS, MARCELO J. MOREIRA AND JAMES H.

Cointegrating Regressions with Messy Regressors: J. Isaac Miller

LECTURE 2 LINEAR REGRESSION MODEL AND OLS

Instrumental Variables Estimation and Weak-Identification-Robust. Inference Based on a Conditional Quantile Restriction

Moreover, the second term is derived from: 1 T ) 2 1

EFFICIENT ESTIMATION USING PANEL DATA 1. INTRODUCTION

On the Sensitivity of Return to Schooling Estimates to Estimation Methods, Model Specification, and Influential Outliers If Identification Is Weak

Performance of conditional Wald tests in IV regression with weak instruments

Econometrics II - EXAM Answer each question in separate sheets in three hours

Split-Sample Score Tests in Linear Instrumental Variables Regression

Cross-fitting and fast remainder rates for semiparametric estimation

DYNAMIC PANEL GMM WITH NEAR UNITY. Peter C. B. Phillips. December 2014 COWLES FOUNDATION DISCUSSION PAPER NO. 1962

Nonstationary Panels

Improved Inference in Weakly Identified Instrumental Variables Regression

NUCLEAR NORM PENALIZED ESTIMATION OF INTERACTIVE FIXED EFFECT MODELS. Incomplete and Work in Progress. 1. Introduction

Advanced Econometrics

Comprehensive Examination Quantitative Methods Spring, 2018

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

11. Bootstrap Methods

Finite Sample Bias Corrected IV Estimation for Weak and Many Instruments

Lecture 3 Stationary Processes and the Ergodic LLN (Reference Section 2.2, Hayashi)

IV and IV-GMM. Christopher F Baum. EC 823: Applied Econometrics. Boston College, Spring 2014

QED. Queen s Economics Department Working Paper No Moments of IV and JIVE Estimators

(Part 1) High-dimensional statistics May / 41

Almost Unbiased Estimation in Simultaneous Equations Models with Strong and / or Weak Instruments

Nonlinear minimization estimators in the presence of cointegrating relations

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

Unbiased Instrumental Variables Estimation Under Known First-Stage Sign

Simple Estimators for Monotone Index Models

Our point of departure, as in Chapter 2, will once more be the outcome equation:

Inference For High Dimensional M-estimates. Fixed Design Results

Exogeneity tests and weak-identification

Additional Material for Estimating the Technology of Cognitive and Noncognitive Skill Formation (Cuttings from the Web Appendix)

Jackknife Instrumental Variable Estimation with Heteroskedasticity

Instrumental Variables

Estimation of Dynamic Regression Models

Transcription:

CHAPTER 6 Asymptotic Distributions of Instrumental Variables Statistics with Many Instruments James H. Stock and Motohiro Yogo ABSTRACT This paper extends Staiger and Stock s (1997) weak instrument asymptotic approximations to the case of many weak instruments by modeling the number of instruments as increasing slowly with the number of observations. It is shown that the resulting many weak instrument approximations can be calculated sequentially by letting first the sample size, and then the number of instruments, tend to infinity. The resulting distributions are given for k-class estimators and test statistics. 1. INTRODUCTION Most of the literature on the distribution of statistics in instrumental variables (IV) regression assumes, either implicitly or explicitly, that the number of instruments (K ) is small relative to the number of observations (T ); see Rothenberg s (1984) survey of Edgeworth approximations to the distributions of IV statistics. In some applications, however, the number of instruments can be large; for example, Angrist and Krueger (1991) had 178 instruments in one of their specifications. Sargan (1975), Kunitomo (1980), and Morimune (1983) provided early asymptotic treatments of many instruments. More recently, Bekker (1994) obtained first-order distributions of various IV estimators under the assumptions that K, T, and K /T c, 0 c < 1, when the so-called concentration parameter (µ ) is proportional to the sample size and the errors are Gaussian. Chao and Swanson (00) have explored the consistency of IV estimators with weak instruments when the number of instruments is large, in the sense that K is also modeled as increasing to infinity, but more slowly than T. This paper continues this line of research on the asymptotic distribution of IV estimators when there are many instruments. Our focus is on the case of many weak instruments, that is, when there are many instruments that are, on average, only weakly correlated with the included endogenous regressors. Specifically, we extend the weak instrument asymptotics developed in Staiger and Stock (1997) to the case of many instruments. The key technical device of the Staiger Stock (1997) weak instrument asymptotics is fixing the expected value of the concentration parameter, along with the number of instruments,

110 Stock and Yogo as the sample size increases. Here, we extend this to the case that the expected value of the concentration parameter is proportional to the number of instruments, and the number of instruments is allowed to increase slowly with the sample size, specifically, as T, K, E(µ )/K Λ (a fixed matrix), and K 4 /T 0. We refer to asymptotic limits taken under sequences satisfying these conditions as many weak instrument limits. (The term many should not be overinterpreted because while the number of instruments is allowed to tend to infinity, the condition K 4 /T 0 requires it to do so very slowly relative to the sample size.) Under these conditions, and some additional technical conditions stated in Section (including i.i.d. sampling and existence of fourth moments), it is shown that the limits of k-class IV statistics as K and T jointly tend to infinity can in general be computed using sequential asymptotic limits. Under sequential asymptotics, the fixed-k weak instrument limit is obtained first, then the limit of that distribution is taken as K. The advantage of this first T then K approach is that the sequential calculations are simpler than the calculations that arise along the joint sequence of (K, T ). A potential disadvantage of this approach is that this simplicity comes at the cost of a stronger rate condition than might be obtained along the joint sequence. We begin in Section by specifying the model, the k-class IV statistics of interest, and our assumptions. Section 3 justifies the sequential asymptotics by showing that, under these assumptions, a key uniform convergence condition holds. In Section 4, we derive the many weak instrument limits of k-class estimators and test statistics using sequential asymptotics. These many weak instrument limits are used in Stock and Yogo (004) to develop tests for weak instruments when the number of instruments is moderate. Some of these results might be of more general interest, however; for example, Chao and Swanson (00) show that LIML is consistent under these conditions, and in this paper we provide its K -limiting distribution. Section 5 provides some concluding remarks.. THE MODEL, STATISTICS, AND ASSUMPTIONS.1. Model and Notation We consider the IV regression model with n included endogenous regressors: y = Yβ + u, Y = ZΠ + V, (.1) (.) where y is the T 1 vector of T observations on the dependent variable, Y is the T n matrix of n included endogenous variables, Z is the T K matrix of K excluded exogenous variables to be used as instruments, and u and V are a T 1 vector and T n matrix of disturbances, respectively. The n 1

IV Statistics with Many Instruments 111 vector β and K n matrix Π are unknown parameters. Throughout this paper we exclusively consider inference about β. It is useful to introduce some additional notation. Let Z t = (Z 1t Z K t), V t = (V 1t V nt ), Y = [yy], Q ZZ = E(Z t Z t ), [( ) ut ( ) ] [ ] σ Σ = E ut V V t = uu Σ uv, (.3) t Σ Vu Σ ρ = Σ 1/ ΣVu σ uu 1/, (.4) C = T Π, and (.5) Λ K = T Σ 1/ Π Q ZZ ΠΣ 1/ The n n matrix Λ K / K = Σ 1/ C Q ZZ CΣ 1/ / K. (.6) is the expected value of the concentration parameter, divided by the number of instruments, K. Note that ρ ρ 1... k-class Statistics The k-class estimator of β is ˆβ(k) = [Y (I km Z )Y] 1 [Y (I km Z )y], (.7) where M Z = I Z(Z Z) 1 Z and k is a scalar. The Wald statistic, based on the k-class estimator, testing the null hypothesis β = β 0 is W (k) = [ ˆβ(k) β 0 ] [Y (I km Z )Y][ ˆβ(k) β 0 ], (.8) n ˆσ uu (k) where ˆσ uu (k) = û(k) û(k)/(t n) and û(k) = y Y ˆβ(k). Specific k-class estimators of interest include two-stage least squares (TSLS), the limited information maximum likelihood (LIML) estimator, Fuller s (1977) k-class estimator, and bias-adjusted TSLS (BTSLS; Nagar 1959; Rothenberg 1984). The values of k for these estimators are (cf. Donald and Newey 001): TSLS: k = 1, (.9) LIML: k = ˆk LIML is the smallest root of det (Y Y ky M Z Y) = 0, (.10) Fuller-k: k = ˆk LIML c/(t K ), where c is a positive constant, (.11) BTSLS: k = T/(T K + ), (.1) where det(a) is the determinant of matrix A.

11 Stock and Yogo.3. Assumptions We assume that the random variables are i.i.d. with four moments, the instruments are not multicollinear, and the errors are homoskedastic; that is, we assume: Assumption A (a) There exists a constant D 1 > 0 such that mineval(z Z/T ) D 1 a.s. for all K and for all T greater than some T 0. (b) Z t is i.i.d. with EZ t Z t = Q ZZ, where Q ZZ is positive definite, and EZ 4 it D <, where i = 1,...,K. (c) η t = [u t V t ] is i.i.d. with E(η t Z t ) = 0, E(η t η t Z t) = Σ, which is positive definite, and E( η it η jt η kt η lt Z t ) = E( η it η jt η kt η lt ) D 3 <, where i, j, k, l = 1,...,n+ 1. The next assumption is that the instruments are weak in the sense that the amount of information per instrument does not increase with the sample size, that is, the concentration parameter is proportional to the number of instruments. For fixed K, this assumption is achieved by considering the sequence of models in which C = Π/ T is fixed, so that Π is modeled as local to zero (Staiger and Stock 1997). We adopt this nesting here, specifically: Assumption B. max i, j C i, j D 4 <, where D 4 does not depend on T or K, and C C/K H as T, where H is a fixed n n matrix. Assumption B implies that Λ K Λ as T, where Λ is a fixed matrix with maxeval(λ ) <. When the number of instruments is fixed, this assumption is equivalent to the weak-instrument Assumption L in Staiger and Stock (1997). Our analysis focuses on sequences of K that, if they increase, do so slower than T. Specifically, we assume: Assumption C. K 4 /T 0asT. Note that Assumption C does not require K to increase, but it limits the rate at which it can increase. 3. UNIFORM CONVERGENCE RESULT This section provides the uniform convergence result (Theorem 3.1) that justifies the use of sequential asymptotics to compute the many weak instrument limiting representations. We adopt Phillips and Moon s (1999) notation in which (T, K ) seq denotes the sequential limit in which first T, then K ; the notation (K, T ) denotes the joint limit in which K is implicitly indexed by T.

IV Statistics with Many Instruments 113 Lemma 6 of Phillips and Moon (1999) provides general conditions under which sequential convergence implies joint convergence. Phillips and Moon (1999), Lemma 6 (a) Suppose there exist random vectors X K and X on the same probability space as X K,T satisfying, for all K, X K,T X K as T p and X K p X as K. Then X K,T p X as (K, T ) if and only if lim sup K,T Pr [ X K,T X K >ε] = 0 for all ε>0. (3.1) (b) Suppose there exist random vectors X K such that, for any fixed K, d d d X K,T X K as T and X K X as K. Then X K,T X as (K, T ) if and only if, for all bounded continuous functions f, lim sup K,T E[ f (X K,T )] E[ f (X K )] =0. (3.) Note that condition (3.) is equivalent to the requirement lim sup K,T sup x F X K,T (x) F X K (x) =0, (3.3) where F X K,T is the c.d.f. of X K,T and F X K is the c.d.f. of X K. The rest of this section is devoted to showing that the conditions of this lemma, that is, (3.1) and (3.3), hold under assumptions A, B, and C for the statistics that enter the k-class estimators and test statistics. To do so, we use the following Berry Esseen bound proven by Bertkus (1986): Berry Esseen Bound (Bertkus 1986). Let {X 1,...,X T } be an i.i.d. sequence in R K with zero means, a nonsingular second moment matrix, and finite absolute third moments. Let P T be the probability measure associated with T 1/ T t=1 X t, and let P be the limiting Gaussian measure. Then for each T, sup A C K P T (A) P(A) const (K/T ) 1/ E X 3 = O ( [K 4 /T ] 1/ ) (3.4) where C K is the class of all measurable convex sets in R K. We now turn to k-class statistics. First note that, for fixed K, under Assumptions A and B, the weak law of large numbers and the central limit theorem imply that the following limits hold jointly for fixed K : (T 1 u u, T 1 V u, T 1 V p V) (σ uu, Σ Vu, Σ ), (3.5) Π Z ZΠ p C Q ZZ C, (3.6) (Π Z u, Π Z d V) (C Ψ Zu, C Ψ ZV ), (3.7) (u P Z u, V P Z u, V d P Z V) ( Zu Q 1 ZZ Zu, ZV Q 1 ZZ Zu, ZV Q 1 ZZ ZV), (3.8)

114 Stock and Yogo where Ψ Zu and Ψ ZV are, respectively, K 1 and K n random variables and Ψ [Ψ Zu, vec( ZV) ] is distributed N(0, Q ZZ ). The following theorem shows that the limits in (3.5) (3.8) and related limits hold uniformly in K under the sampling assumption (Assumption A), the weak instrument assumption (Assumption B), and the rate condition (Assumption C). Let A =[tr(a A)] 1/ denote the norm of the matrix A and, as in (3.3), let F X denote the c.d.f. of the random variable X (etc.). Theorem 3.1. Under Assumptions A, B, and C, (a) lim sup K,T Pr[ (u u/t, V u/t, V V/T ) (σ uu, Σ Vu, Σ ) > ε] = 0 ε>0, (b) lim sup K,T Pr[ Π Z ZΠ /K C Q ZZ C/K >ε] = 0 ε>0, (c) lim sup K,T sup x F Π Z u(x) F C Zu (x) =0, (d) lim sup K,T sup x F Π Z V(x) F C ZV (x) =0, (e) lim sup K,T sup x F u P Z u(x) F Zu Q 1 ZZ Zu (f) lim sup K,T sup x F V P Z u(x) F ZV Q 1 ZZ Zu (g) lim sup K,T sup x F V P Z V(x) F ZV Q 1 ZZ ZV (x) =0, (x) =0, (x) =0. The proof of Theorem 3.1 is contained in the Appendix. Theorem 3.1 verifies the conditions (3.1) and (3.3) of Phillips and Moon s (1999) Lemma 6 for statistics that enter the k-class estimator and Wald statistic. Some of these objects converge in probability uniformly under the stated assumptions (parts (a) and (b)), while others converge in distribution uniformly (parts (c) (g)). It follows from the continuous mapping theorem that continuous functions of these objects also converge in probability (and/or distribution) uniformly under the stated assumptions. Because the k-class estimator ˆβ(k) and Wald statistic W (k) are continuous functions of these statistics (after centering and scaling as needed), it follows that the (K, T ) joint limit of these k-class statistics can be computed as the sequential limit (T, K ) seq. 4. MANY WEAK INSTRUMENT ASYMPTOTIC LIMITS This section collects calculations of the many weak instrument asymptotic limits of k-class estimators and Wald statistics. These calculations are done using sequential asymptotics (justified by Theorem 3.1), in which the fixed-k weak instrument asymptotic limits of Staiger and Stock (1997, Theorem 1) are analyzed as K. The limiting distributions differ depending on the limiting behavior of k. The main results are collected in Theorem 4.1, which is proven in the Appendix. Theorem 4.1. Suppose that Assumptions A, B, and C hold, and that K. Let x be an n-dimensional standard normal random variable. Then the following

IV Statistics with Many Instruments 115 limits hold as (K, T ): (a) TSLS: If T (k 1)/K 0, then ˆβ(k) β p σ 1/ uu Σ 1/ ( + I n ) 1 ρ and (4.1) p ρ (Λ + I n ) 1 ρ W (k)/k n[1 ρ (Λ + I n ) 1 ρ + ρ (Λ + I n ) ρ]. (4.) (b) BTSLS: If K [T (k 1)/K 1] 0 and mineval (Λ ) > 0, then K ( ˆβ(k) β) d N(0,σ uu 1/ Λ 1 (Λ + I n + ρρ )Λ 1 ) and (4.3) 1/ W (k) d x (Λ + I n + ρρ ) 1/ Λ 1 (Λ + I n + ρρ ) 1/ x/n. (4.4) (b) LIML, Fuller-k: If T (k k LIML )/ K 0 and mineval(λ ) > 0, then K [T (k 1)/K 1] d N(0, ), (4.5) K ( ˆβ(k) β) d N(0,σ uu 1/ Λ 1 (Λ + I n ρρ ) Λ 1 ) and (4.6) 1/ W (k) d x (Λ + I n ρρ ) 1/ 1 (Λ + I n ρρ ) 1/ x/n. (4.7) 5. DISCUSSION To simplify the proofs we have assumed i.i.d. sampling. Götze (1991) provides a Berry Esseen bound for i.n.i.d. sampling. The bound in the i.n.i.d. case is const (K1 /T )E X 3 = O([K 5/T ]1/ ), so the rate in Assumption C would be slower, K 5 /T 0. With this slower rate, the results in Section 3 would extend to the case where the errors and instruments are independently but not necessarily identically distributed. The many weak instrument representations in Theorem 4.1 for BTSLS, LIML, and the Fuller-k estimator rule out the partially identified and unidentified cases, for which mineval(λ ) = 0. This suggests that the approximations in Theorem 4.1, parts (b) and (c), might become inaccurate as Λ K becomes nearly singular. The behavior of the many weak instrument approximations in partially identified and unidentified cases remain to be explored. ACKNOWLEDGMENTS We thank an anonymous referee for helpful suggestions that spurred this research, and Whitney Newey for pointing out an error in an earlier draft. This work was supported by NSF grant SBR-014131.

116 Stock and Yogo APPENDIX This appendix contains the proofs of Theorems 3.1 and 4.1. The proof of Theorem 3.1 uses the following lemma. Lemma A.1. Let T = (Z Z/T ) 1 Q 1 ZZ. Under Assumptions A and C, (a) lim sup K,T Pr[ T 1 u Z T Z u >ε] = 0 ε>0, (b) lim sup K,T Pr[ T 1 V Z T Z u >ε] = 0 ε>0, (c) lim sup K,T Pr[ T 1 V Z T Z V >ε] = 0 ε>0. Proof of Lemma A.1. The strategy for proving each part is first to show that the relevant quadratic form (for example, in (a), the quadratic form T 1 u Z T Z u) has expected mean square that is bounded by const (K /T ), and then to apply Chebychev s inequality and the condition in Assumption C that K /T 0. The details of these calculations are tedious and are omitted; they can be found in an earlier working paper (Stock and Yogo 00, Lemma A.). Proof of Theorem 3.1. (a) This follows from the weak law of large numbers because (u u/t, V u/t, V V/T ) do not depend on K. (b) Note that E[Π Z ZΠ/K C Q ZZ C/K ] = 0. The (1,1) element of this matrix is (Π Z ZΠ C Q ZZ C) 1,1 /K T K K = (TK ) 1 C i1 C j1 (Z it Z jt q ij ), t=1 i=1 j=1 where q ij is the (i, j) element of Q ZZ. Because Z t is i.i.d. (Assumption A(b)) and the elements of C are bounded (Assumption B), the expected value of the square of this element is E{[( Z Z C Q ZZ C) 1,1 /K ] } [ 1 T K K = E C i1 C j1 (Z it Z jt q ij ) TK = 1 TK i=1 t=1 i=1 j=1 K K const K T K K C i1 C j1 C k1 C l1 E[(Z it Z jt q ij )(Z kt Z lt q kl )] j=1 k=1 l=1 ( 1 K K i=1 ] ) 4 C i1 const K T. By the same argument applied to the (1,1) element, the remaining elements of Z Z /K C Q ZZ C/K are also bounded in mean square by const (K /T ). The matrix Π Z ZΠ/K is n n and so the number of elements does not depend on K, and the result (b) follows by Chebychev s inequality and noting that, under Assumption C, K /T 0.

IV Statistics with Many Instruments 117 (c) Under Assumption B, Π Z u = T 1/ C Z u = C (T 1/ T t=1 Z tu t ). Let P T denote the probability measure associated with T 1/ Z u and let P denote the limiting probability measure associated with Ψ Zu. Define the convex set A(x) = {y R K : C y x}, so that P T (A(x)) = F Π Z u(x) and P(A(x)) = F C Ψ Zu (x). By Assumption A, Z t u t is an i.i.d., mean zero K - dimensional random variable with finite third moments, so the Berry Esseen bound (3.4) applies and sup x F Π Z u(x) F C Ψ Z u(x) const K 4 /T. The result (c) follows from Assumption C. We note that this line of argument is used in Jensen and Mayer (1975). (d) The proof is the same as for (c). (e) Write u P Z u = (T 1/ u Z)(T 1 Z Z)(T 1/ Z u) = ξ 1 + ξ, where ξ 1 = (T 1/ u Z)Q 1 ZZ (T 1/ Z u) and ξ = (T 1/ u Z) T (T 1/ Z u). As in the proof of (c), let P T denote the probability measure associated with T 1/ Z u and let P denote the limiting probability measure of Ψ Zu. Let B(x) be the convex set, B(x) = {y R K : y Q 1 ZZ y x}, so that P T (B(x)) = F ξ 1 (x) and P(B(x)) = F Zu Q 1 ZZ (x). It follows from (3.4) that sup Zu x F ξ 1 (x) F Zu Q 1 ZZ (x) Zu const K 4/T. By Lemma A.1(a), ξ p 0 uniformly as (K, T ), and the result (e) follows. (f) and (g). The dimensions of V P Z u and V P Z V do not depend on K, and the proofs of (f) and (g) are similar to that of (e). Proof of Theorem 4.1. We first state the fixed-k weak instrument asymptotic representations of the k-class estimators. Define the K 1 and K n random variables z u = Q 1/ ZZ Zu σ uu 1/ and z V = Q 1/ ZZ ZV 1/ ( Zu and ZV are defined following (3.8)), so that ( ) [ ] zu 1 ρ N(0, Σ I vec(z V ) K ), where Σ =. (A.1) ρ I n Also let ν 1 = (λ + z V ) (λ + z V ) and ν = (λ + z V ) z u, where λ = Q 1/ ZZ C 1/. Then under Assumptions A and B, with fixed K, (A.) (A.3) ˆβ(k) β d σ 1/ uu 1/ (ν 1 κi n ) 1 (ν κρ) and (A.4) W (k) d (ν κρ) (ν 1 κi n ) 1 (ν κρ) n[1 ρ (ν 1 κi n ) 1 (ν κρ) + (ν κρ) (ν 1 κi n ) (ν κρ)], (A.5) where (A.5) holds under the null hypothesis β = β 0. The representations (A.4) and (A.5) follow from Staiger and Stock (1997, Theorem 1) because Assumptions A and B imply Staiger and Stock s Assumptions M and L when K is fixed.

118 Stock and Yogo The following limits hold jointly as K : p ν 1 /K Λ + I n, (A.6) p ν /K ρ, (A.7) z u z u K K λ z 0 ρ u d N(0, B), whereb = 0 Λ 0, K z V z ρ 0 I n + ρρ u K ρ (A.8) K (ν K ρ)/ K N(0, Λ + I n + ρρ ). (A.9) The results (A.6) (A.9) follow by straightforward calculations using the central limit theorem, the weak law of large numbers, and the joint normal distribution of z u and z V in (A.1). We now turn to the proof of Theorem 4.1. (a) From (A.4), the fixed-k weak instrument approximation to the distribution of the TSLS estimator is ˆβ TSLS β σ uu 1/ 1/ ν 1 1 ν = σ uu 1/ 1/ (ν 1 /K ) 1 (ν /K ). The limit stated in the theorem for the estimator follows by substituting (A.6) and (A.7) into this expression. The many weak instrument limit for the TSLS Wald statistic follows by rewriting (A.5) as W TSLS (ν /K ) (ν 1 /K ) 1 (ν /K ) /K n[1 ρ (ν 1 /K ) 1 (ν /K ) + (ν /K ) (ν 1 /K ) (ν /K )] and applying (A.6) and (A.7). (b) The fixed-k weak instrument approximation to the distribution of a k-class estimator, given in (A.4), in general can be written as [ K [ ˆβ(k) β] σ 1/ ν1 K I n uu Σ 1/ [ ν K ρ K 1 ( ) ] κ 1 K I n K K K ( ) ] κ K ρ, (A.10) K where T (k 1) d κ for fixed K. The assumption K [T (k 1)/K 1] 0 implies that (κ K )/ K 0, so by (A.6) and (A.9) we have, as K, ν 1 K I n 1 ( ) κ K p I n Λ and K K K ν K ρ K ( κ K K ) ρ d N(0, Λ + I n + ρρ ), and the result (4.3) follows. The assumption mineval( ) > 0 is used to ensure the invertibility of Λ. The distribution of the Wald statistic follows.

IV Statistics with Many Instruments 119 (c) For fixed K, T (k LIML 1) d κ. We show below that, as K, κ K = z u z u K + o p (1). (A.11) K K The result (4.5) follows from (A.11) and (A.8). Moreover, applying (A.6), (A.8), (A.9), and (A.11) yields ( ν 1 K I n κ ) K p Λ and K K 1 K ν K ρ K I n ( κ ) K ρ = λ z u + z V z u K ρ K K K ( z ) u z u K ρ + o p (1) d N(0, + I n ρρ ), K where is invertible by the assumption mineval(λ ) > 0. The result (4.6) follows, as does the distribution of the Wald statistic. It remains to show (A.11). From (.11), κ is the smallest root of [( ) ( )] z 0 = det u z u ν κ 1 ρ. (A.1) ν ν 1 ρ I n Let φ = (κ K )/ K, a = (z u z u K )/ K, b = (ν K ρ)/ K, and L = (ν 1 K I n )/K. Then (A.1) can be rewritten so that φ is the smallest root of [ ] a φ (b φρ) 0 = det b φρ. (A.13) K L φi n We first show that K 1/4 φ p 0. Let φ = K 1/4 φ. By (A.6), (A.8), and (A.9), K 1/4 a p 0, K 1/4 b p 0, and L p Λ. By the continuity of the determinant, it follows that in the limit K, φ is the smallest root of the equation [ φ φρ ] 0 = det φρ φi n + O p (K 1/4, (A.14) ) from which it follows that φ = K 1/4 φ p 0. To obtain (A.11), write the determinantal equation (A.13) as 0 = [(a φ) (b φρ) (K 1/ L φi n ) 1 (b φρ)] det(k 1/ L φi n ) = K n/ {(a φ) [K 1/4 (b φρ)] (L K 1/ φi n ) 1 [K 1/4 (b φρ)] det(l K 1/ φi n ) = K n/ {[(a φ)] det(λ ) + o p (1)}, (A.15) where the final equality follows from K 1/4 b p 0, L p Λ, K 1/4 φ p 0, and det(λ ) > 0. By the continuity of the solution to (A.13), it follows that φ = a + o p (1), which, in the original notation, is (A.11).

10 Stock and Yogo References Angrist, J. D., and A. B. Krueger (1991), Does Compulsory School Attendance Affect Schooling and Earnings, Quarterly Journal of Economics, 106, 979 1014. Bekker, P. A. (1994), Alternative Approximations to the Distributions of Instrumental Variables Estimators, Econometrica, 6, 657 81. Bertkus, V. Y. (1986), Dependence of the Berry Esseen Estimate on the Dimension, Litovsk. Mat. Sb., 6, 05 10. Chao, J. C., and N. R. Swanson (00), Consistent Estimation with a Large Number of Weak Instruments, unpublished manuscript, University of Maryland. Donald, S. G., and W. K. Newey (001), Choosing the Number of Instruments, Econometrica, 69, 1161 91. Fuller, W. A. (1977), Some Properties of a Modification of the Limited Information Estimator, Econometrica, 45, 939 53. Götze, F. (1991), On the Rate of Convergence in the Multivariate CLT, Annals of Probability, 19, 74 39. Jensen, D. R., and L. S. Mayer (1975), Normal-Theory Approximations to Tests of Linear Hypotheses, Annals of Statistics, 3, 49 44. Kunitomo, N. (1980), Asymptotic Expansions of the Distributions of Estimators in a Linear Functional Relationship and Simultaneous Equations, Journal of the American Statistical Association, 75, 693 700. Morimune, K. (1983), Approximate Distributions of k-class Estimators when the Degree of Overidentifiability is Large Compared with the Sample Size, Econometrica, 51, 81 41. Nagar, A. L. (1959), The Bias and Moment Matrix of the General k-class Estimators of the Parameters in Simultaneous Equations, Econometrica, 7, 575 95. Phillips, P. C. B., and H. R. Moon (1999), Linear Regression Limit Theory for Nonstationary Panel Data, Econometrica, 67, 1057 111. Rothenberg, T. J. (1984), Approximating the Distributions of Econometric Estimators and Test Statistics, Chapter 15 in Handbook of Econometrics, Vol. II (ed. by Z. Griliches and M. D. Intriligator), Amsterdam: North Holland, pp. 881 935. Sargan, D. (1975), Asymptotic Theory and Large Models, International Economic Review, 16, 75 91. Staiger, D., and J. H. Stock (1997), Instrumental Variables Regression with Weak Instruments, Econometrica, 65, 557 86. Stock, J. H., and M. Yogo (00), Testing for Weak Instruments in Linear IV Regression, NBER Technical Working Paper 84. Stock, J. H., and M. Yogo (004), Testing for Weak Instruments in Linear IV Regression, Chapter 5 in this volume.