On the asymptotic sizes of subset Anderson-Rubin and Lagrange multiplier tests in linear instrumental variables regression

Patrik Guggenberger*, Frank Kleibergen†, Sophocles Mavroeidis‡, Linchun Chen§

June 2012

* Department of Economics, UCSD, 9500 Gilman Dr., La Jolla, CA 92093-0508. Email: guggenberger@ucsd.edu.

† Department of Economics, Box B, Brown University, Providence, RI 02912. Email: Frank_Kleibergen@brown.edu. Homepage: http://www.econ.brown.edu/fac/frank_kleibergen.

‡ Department of Economics, Oxford University, Manor Road, Oxford OX1 3UQ, United Kingdom. Email: sophocles.mavroeidis@economics.ox.ac.uk. Homepage: https://sites.google.com/site/sophoclesmavroeidis.

§ Department of Economics, UCSD, 9500 Gilman Dr., La Jolla, CA 92093-0508. Email: lic26@ucsd.edu.

Guggenberger would like to thank the NSF for research support under grant SES-1021101. Mavroeidis would like to thank the European Commission for research support under a FP7 Marie Curie Fellowship CIG 293675. We would like to thank Jim Stock for valuable advice.

Abstract

We consider tests of a simple null hypothesis on a subset of the coefficients of the exogenous and endogenous regressors in a single-equation linear instrumental variables regression model with potentially weak identification. Existing methods of subset inference (i) rely on the assumption that the parameters not under test are strongly identified or (ii) are based on projection-type arguments. We show that, under homoskedasticity, the subset Anderson and Rubin (1949) test that replaces unknown parameters by LIML estimates has correct asymptotic size without imposing additional identification assumptions, but that the corresponding subset Lagrange multiplier test is size distorted asymptotically.

Keywords: Asymptotic size, linear IV model, size distortion, subset inference, weak instruments.

JEL Classification Numbers: C01, C12, C21.

1 Introduction

In the last decade we have witnessed an increase in the literature dealing with inference on the structural parameters in the linear instrumental variables (IVs) regression model. Its objective is to develop powerful tests whose asymptotic null rejection probability is controlled uniformly over a parameter space that allows for weak instruments. For a simple full vector hypothesis, satisfactory progress has been made and several robust procedures were introduced, most notably the AR test by Anderson and Rubin (1949), the Lagrange multiplier (LM) test of Kleibergen (2002), and the conditional likelihood ratio (CLR) test of Moreira (2003).$^{1}$

An applied researcher is, however, typically not interested in simultaneous inference on all structural parameters, but in inference on a subset, like one component, of the structural parameter vector. Tests of a subset hypothesis are substantially more complicated than tests of a joint hypothesis since the unrestricted structural parameters enter the testing problem as additional nuisance parameters.$^{2}$

Under the assumption that the unrestricted structural parameters are strongly identified, the above robust full vector procedures can be adapted by replacing the unrestricted structural parameters by consistently estimated counterparts, see Stock and Wright (2000), Kleibergen (2004, 2005), Guggenberger and Smith (2005), Otsu (2006), and Guggenberger, Ramalho, and Smith (2013), among others, for such adaptations of the AR, LM, and CLR tests to subset testing. Under the assumption of strong identification of the unrestricted structural parameters, the resulting subset tests were proven to be asymptotically robust with respect to the potential weakness of identification of the hypothesized structural parameters and, trivially, have non-worse power properties than projection type tests. However, a long-standing question concerns the asymptotic size properties of these tests without any identification assumption imposed on the unrestricted structural parameters.

$^{1}$ The latter test was shown to essentially achieve optimal power properties in a class of tests restricted by a similarity condition and certain invariance properties, see Andrews, Moreira, and Stock (2006).

$^{2}$ A general method to do subset inference is to apply projection techniques to the full vector tests. The resulting subset tests have asymptotic size smaller than or equal to the nominal size. But a severe drawback is that they are usually very conservative, especially if many dimensions of the structural parameter vector are projected out. Typically, this leads to suboptimal power properties. In the linear IV model, a projected version of the AR test has been discussed in Dufour and Taamouti (2005). A refinement that improves on the power properties of the latter test is given in Chaudhuri and Zivot (2011).

The current paper provides an answer to that question. We consider a linear IV regression model with a parameter space that does not restrict the reduced form coefficient matrix and thus allows for weak instruments. The parameter space imposes a Kronecker product structure on a certain covariance matrix, a restriction that is implied, for example, by conditional homoskedasticity. We study the asymptotic size of subset AR and LM tests when the unrestricted structural parameters are replaced by the limited information maximum likelihood (LIML) estimator. The null hypothesis allows for simultaneous tests on subsets of the slope parameters of the exogenous and endogenous regressors.

As the main result of the paper, we prove that the subset AR test has correct asymptotic size. In contrast, we show that the asymptotic size of the subset LM test is distorted. We document this by deriving the asymptotic null rejection probability of the subset LM test under certain weak IV drifting parameter sequences. The probability can be substantially larger than the nominal size when the number of instruments is large. For example, for nominal size $\alpha = 5\%$ and two right hand side endogenous variables, we obtain asymptotic null rejection probabilities under certain parameter sequences of 9.6, 15.5, and 19.5% when the number of instruments equals 10, 20, and 30, respectively. Given that the LM statistic appears as a main element in the subset CLR test, these findings indicate that the latter test also is asymptotically size distorted.

The paper is structured as follows. Section 2 introduces the model and discusses the asymptotic size properties of the subset AR test. Section 3 discusses the asymptotic size distortion of the subset LM test for the case with two endogenous regressors. An Appendix provides the proof of the main theoretical result and some additional technicalities.

We use the following notation. For a full column rank matrix $A$ with $n$ rows let $P_A = A(A'A)^{-1}A'$ and $M_A = I_n - P_A$, where $I_n$ denotes the $n \times n$ identity matrix. If $A$ has zero columns, then we set $M_A = I_n$. The chi square distribution with $k$ degrees of freedom and its $1-\alpha$-quantile are written as $\chi^2_k$ and $\chi^2_{k,1-\alpha}$.

2 Asymptotic size of the subset AR test

We consider the linear IV model
\[ y = Y\beta + W\gamma + \varepsilon, \qquad (Y : W) = Z(\Pi_Y : \Pi_W) + (V_Y : V_W), \tag{1} \]
where $y \in R^{n}$ and $W \in R^{n \times m_W}$ are endogenous variables, $Y \in R^{n \times m_Y}$ consists of endogenous and/or exogenous variables, $Z \in R^{n \times k}$ are instrumental variables, $\varepsilon \in R^{n}$, $V_Y \in R^{n \times m_Y}$ and $V_W \in R^{n \times m_W}$ are unobserved disturbances, $V = [V_Y : V_W]$, and $\beta \in R^{m_Y}$, $\gamma \in R^{m_W}$, $\Pi_Y \in R^{k \times m_Y}$ and $\Pi_W \in R^{k \times m_W}$, with $m = m_Y + m_W$, are unknown parameters and $k \geq m$. We are interested in testing the subset null hypothesis
\[ H_0: \beta = \beta_0 \quad \text{versus} \quad H_1: \beta \neq \beta_0. \tag{2} \]
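The model (1) and the projection notation $P_A$, $M_A$ can be made concrete with a small simulation. The following is only a sketch under assumptions that are not in the paper: normal instruments and errors, unit error variances, and a common correlation `rho` between $\varepsilon$ and each column of $V$. The function names `proj`, `annihilator`, and `simulate_iv` are hypothetical.

```python
import numpy as np

def proj(A):
    """Projection matrix P_A = A (A'A)^{-1} A' for a full column rank A."""
    return A @ np.linalg.solve(A.T @ A, A.T)

def annihilator(A):
    """M_A = I_n - P_A, with M_A = I_n when A has zero columns."""
    n = A.shape[0]
    if A.shape[1] == 0:
        return np.eye(n)
    return np.eye(n) - proj(A)

def simulate_iv(n, Pi_Y, Pi_W, beta, gamma, rho=0.5, seed=0):
    """Draw (y, Y, W, Z) from model (1) with homoskedastic normal errors.

    Illustrative assumption: eps has correlation rho with every column of
    V = [V_Y : V_W]; this needs (m_Y + m_W) * rho**2 < 1 so that the error
    covariance matrix is positive definite.
    """
    rng = np.random.default_rng(seed)
    k, m_Y = Pi_Y.shape
    m_W = Pi_W.shape[1]
    m = m_Y + m_W
    Z = rng.standard_normal((n, k))
    Sigma = np.eye(1 + m)
    Sigma[0, 1:] = rho
    Sigma[1:, 0] = rho
    errors = rng.standard_normal((n, 1 + m)) @ np.linalg.cholesky(Sigma).T
    eps, V_Y, V_W = errors[:, 0], errors[:, 1:1 + m_Y], errors[:, 1 + m_Y:]
    Y = Z @ Pi_Y + V_Y          # first stage for the tested regressors
    W = Z @ Pi_W + V_W          # first stage for the unrestricted regressors
    y = Y @ beta + W @ gamma + eps
    return y, Y, W, Z

# Example: k = 5 instruments, strong identification of beta, weak of gamma.
k = 5
y, Y, W, Z = simulate_iv(200, np.full((k, 1), 0.5), np.full((k, 1), 0.01),
                         beta=np.array([1.0]), gamma=np.array([0.5]))
```

Setting the entries of `Pi_W` close to zero mimics weak identification of $\gamma$, which is the case the parameter space of the paper is designed to allow.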

This setup also covers general linear restrictions on the coefficients of the structural equation, since these can be expressed as (2) by appropriate reparametrization. Since the variables in $Y$ can consist of endogenous or exogenous variables, we allow for simultaneous tests on elements of the slope parameters of the exogenous and endogenous regressors. For those variables in $Y$ which are exogenous and are therefore part of the instrumental variables $Z$, the disturbances in their first stage equation are all identical to zero, which we allow for.

To keep the exposition simple, we omit from the model stated in equation (1) any exogenous regressors whose coefficients remain unrestricted by the null hypothesis (2). When such exogenous regressors are present in the model, our results remain valid if we replace the variables that are currently in (1) in the definition of the various statistics by the residuals that result from a regression of those variables on the included exogenous variables.$^{3}$

Denote by $Z_i$ the $i$-th row of $Z$ written as a column vector and analogously for other variables. We assume that the realizations $(\varepsilon_i, V_i', Z_i')'$, $i = 1, \ldots, n$, are i.i.d. with distribution $F$. The distribution $F$ may depend on $n$ but for the most part we write $F$ rather than $F_n$ to simplify notation. Furthermore, $E_F(Z_i(\varepsilon_i, V_i')) = 0$, and by $E_F$ we denote expectation when the distribution of $(\varepsilon_i, V_i', Z_i')'$ is $F$. As we make explicit below, we also assume homoskedasticity.

The Anderson-Rubin (AR) statistic (times $k$), see Anderson and Rubin (1949), for testing the joint hypothesis
\[ H^{*}: \beta = \beta_0, \ \gamma = \gamma_0, \tag{3} \]
is defined as $AR_n(\beta_0, \gamma_0)$, where
\[ AR_n(\beta, \gamma) = \frac{1}{\hat\sigma_{\varepsilon\varepsilon}(\beta,\gamma)} (y - Y\beta - W\gamma)' P_Z (y - Y\beta - W\gamma), \]
\[ \hat\sigma_{\varepsilon\varepsilon}(\beta, \gamma) = (1, -\beta', -\gamma')\, \hat\Omega\, (1, -\beta', -\gamma')', \quad \text{and} \quad \hat\Omega = \frac{1}{n-k} (y : Y : W)' M_Z (y : Y : W). \tag{4} \]
With slight abuse of notation, we define the subset AR statistic for testing $H_0$ as
\[ AR_n(\beta_0) = \min_{\gamma \in R^{m_W}} AR_n(\beta_0, \gamma). \tag{5} \]
For $\tilde\gamma = \arg\min_{\gamma} AR_n(\beta_0, \gamma)$, the subset AR statistic is then identical to
\[ AR_n(\beta_0) = \frac{1}{\hat\sigma_{\varepsilon\varepsilon}(\beta_0, \tilde\gamma)} (y - Y\beta_0 - W\tilde\gamma)' P_Z (y - Y\beta_0 - W\tilde\gamma). \tag{6} \]

$^{3}$ In particular, suppose the structural equation is $y = Y\beta + W\gamma + X\delta + \varepsilon$, where $X \in R^{n \times q}$ denotes the matrix of included exogenous regressors. Then, we need to replace $(y : Y : W : Z)$ in the definitions (4), (5), (6), (10) and (17) by $M_X (y : Y : W : Z)$.

The joint AR statistic in (4) is a monotonic transformation of the concentrated log-likelihood of $(\beta, \gamma)$ under i.i.d. normal errors, see, e.g., Moreira (2003). Minimizing the AR statistic with respect to $\gamma$ is therefore identical to maximizing the log-likelihood, so $\tilde\gamma$ is the constrained limited information maximum likelihood (LIML) estimator of $\gamma$ under the null hypothesis (2). The k-class formulation of the LIML estimator reads, see Hausman (1983):$^{4}$
\[ \tilde\gamma = \left[ W'\left(P_Z - \tfrac{\kappa_{\min}}{n-k} M_Z\right) W \right]^{-1} W'\left(P_Z - \tfrac{\kappa_{\min}}{n-k} M_Z\right)(y - Y\beta_0), \tag{7} \]
where $\kappa_{\min}$ equals the smallest root of the characteristic polynomial:
\[ \left| \kappa \hat\Omega_W - (y - Y\beta_0 : W)' P_Z (y - Y\beta_0 : W) \right| = 0, \tag{8} \]
with
\[ \hat\Omega_W = \begin{pmatrix} 1 & 0 \\ -\beta_0 & 0 \\ 0 & I_{m_W} \end{pmatrix}' \hat\Omega \begin{pmatrix} 1 & 0 \\ -\beta_0 & 0 \\ 0 & I_{m_W} \end{pmatrix}. \tag{9} \]
If we substitute the k-class formulation of the LIML estimator (7) into the expression of the subset AR statistic (6), we obtain that the subset AR statistic equals the smallest root of the characteristic polynomial in (8):
\[ AR_n(\beta_0) = \kappa_{\min}. \tag{10} \]
It is well known, see e.g. Stock and Wright (2000) and Startz et al. (2006), that when the unrestricted structural parameters are strongly identified, $AR_n(\beta_0)$ has a $\chi^2_{k-m_W}$ limiting distribution. This finding motivates the choice of the critical value for the subset AR test. The nominal size $\alpha$ subset AR test rejects the null in (2) if
\[ AR_n(\beta_0) > \chi^2_{k-m_W, 1-\alpha}. \tag{11} \]
We next define the parameter space $\Lambda$ for $(\gamma, \Pi_W, \Pi_Y, F)$ under the null hypothesis in (2).

$^{4}$ For expository purposes, we slightly altered the usual expression of the k-class estimator, which has $P_Z$ replaced by the identity matrix and uses the smallest root of the characteristic polynomial in (8) with $\hat\Omega_W$ replaced by $(y - Y\beta_0 : W)'(y - Y\beta_0 : W)$. We use the notation of the k-class estimator in (7) because usage of its expression directly shows the equality of the subset AR statistic and the smallest characteristic root of (8) stated in (10).
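The equality in (10) can be checked numerically. The sketch below computes $\kappa_{\min}$ from (8), the k-class LIML estimator from (7), and then evaluates the joint AR statistic (4) at that estimator. The function names `joint_ar` and `subset_ar`, the simulated design, and all parameter values are illustrative assumptions and not part of the paper.

```python
import numpy as np

def joint_ar(y, Y, W, Z, beta, gamma):
    """Joint AR statistic (4): (n - k) e'P_Z e / e'M_Z e, e = y - Y beta - W gamma."""
    n, k = Z.shape
    e = y - Y @ beta - W @ gamma
    Pe = Z @ np.linalg.solve(Z.T @ Z, Z.T @ e)   # P_Z e
    return (n - k) * (e @ Pe) / (e @ (e - Pe))

def subset_ar(y, Y, W, Z, beta0):
    """Return (kappa_min, gamma_tilde) from (8) and (7)."""
    n, k = Z.shape
    PZ = Z @ np.linalg.solve(Z.T @ Z, Z.T)
    MZ = np.eye(n) - PZ
    X = np.column_stack([y - Y @ beta0, W])      # (y - Y beta0 : W)
    Omega_W = X.T @ MZ @ X / (n - k)             # matrix in (9)
    A = X.T @ PZ @ X
    # roots of |kappa * Omega_W - A| = 0 are the eigenvalues of Omega_W^{-1} A
    kappa_min = np.linalg.eigvals(np.linalg.solve(Omega_W, A)).real.min()
    B = PZ - kappa_min / (n - k) * MZ
    gamma_tilde = np.linalg.solve(W.T @ B @ W, W.T @ B @ (y - Y @ beta0))
    return kappa_min, gamma_tilde

# Illustrative data (assumed design): one tested and one unrestricted regressor.
rng = np.random.default_rng(0)
n, k = 200, 5
Z = rng.standard_normal((n, k))
eps = rng.standard_normal(n)
V = 0.5 * eps[:, None] + rng.standard_normal((n, 2))   # endogeneity
Y = Z @ np.full((k, 1), 0.5) + V[:, :1]
W = Z @ np.full((k, 1), 0.05) + V[:, 1:]               # weakly identified gamma
beta0, gamma0 = np.array([1.0]), np.array([0.5])
y = Y @ beta0 + W @ gamma0 + eps

kappa_min, gamma_tilde = subset_ar(y, Y, W, Z, beta0)
ar_at_liml = joint_ar(y, Y, W, Z, beta0, gamma_tilde)
```

In this sketch `ar_at_liml` and `kappa_min` agree up to numerical error, and evaluating `joint_ar` at any other value of `gamma` gives a value that is at least as large. The test in (11) would compare `kappa_min` with the $1-\alpha$ quantile of a $\chi^2_{k-m_W}$ distribution.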

For $U_i = (\varepsilon_i, V_{W,i}')'$,$^{5}$
\[ \begin{aligned} \Lambda = \{ & \lambda = (\gamma, \Pi_W, \Pi_Y, F) : \gamma \in R^{m_W},\ \Pi_W \in R^{k \times m_W},\ \Pi_Y \in R^{k \times m_Y}, \\ & E_F(\|T_i\|^{2+\delta}) \leq M, \text{ for } T_i \in \{ Z_i\varepsilon_i,\ \mathrm{vec}(Z_i V_{W,i}'),\ \varepsilon_i V_{W,i},\ \varepsilon_i,\ V_{W,i},\ Z_i \}, \\ & E_F(Z_i(\varepsilon_i, V_i')) = 0, \quad E_F\big(\mathrm{vec}(Z_i U_i')(\mathrm{vec}(Z_i U_i'))'\big) = \big(E_F(U_i U_i') \otimes E_F(Z_i Z_i')\big), \\ & \kappa_{\min}(A) \geq \delta \text{ for } A \in \{ E_F(Z_i Z_i'),\ E_F(U_i U_i') \} \} \end{aligned} \tag{12} \]
for some $\delta > 0$ and $M < \infty$, where $\kappa_{\min}(\cdot)$ denotes the smallest eigenvalue of a matrix, $\otimes$ the Kronecker product of two matrices, and $\mathrm{vec}(\cdot)$ the column vectorization of a matrix. The parameter space $\Lambda$ does not place any restrictions on the parameter $\Pi_W$ and thus allows for weak identification. Appropriate moment restrictions are imposed that allow for the application of Lyapunov central limit theorems (CLTs) and weak laws of large numbers (WLLNs). As in Staiger and Stock (1997), it is assumed that the covariance matrix $E_F(\mathrm{vec}(Z_i U_i')(\mathrm{vec}(Z_i U_i'))')$ factors into the Kronecker product $(E_F(U_i U_i') \otimes E_F(Z_i Z_i'))$, which holds, for example, under conditional homoskedasticity. Note that $U_i = (\varepsilon_i, V_{W,i}')'$ does not include the reduced form error $V_{Y,i}$, for which no assumptions need to be imposed for the subset AR test. This also explains why $V_{Y,i}$ can be identical to zero, which is the case when $Y$ is exogenous and an element of $Z$.

$^{5}$ Regarding the notation $(\gamma, \Pi_W, \Pi_Y, F)$ and elsewhere, note that we allow as components of a vector column vectors, matrices (of different dimensions), and distributions.

The asymptotic size of the subset AR test is defined as
\[ \mathrm{AsySz}_{AR,\alpha} = \limsup_{n \to \infty} \sup_{\lambda \in \Lambda} P_\lambda\big(AR_n(\beta_0) > \chi^2_{k-m_W, 1-\alpha}\big), \tag{13} \]
where $P_\lambda$ denotes probability of an event when the null data generating process is pinned down by $\lambda \in \Lambda$. The main result of the paper can now be formulated as follows.

Theorem 1 Let $0 < \alpha < 1$. Then the asymptotic size of the subset AR test equals $\alpha$: $\mathrm{AsySz}_{AR,\alpha} = \alpha$.

By definition, the nominal size $\alpha$ projected AR test, see e.g. Dufour and Taamouti (2005), rejects the null in (2) if the joint AR statistic $AR_n(\beta_0, \gamma)$ in (4) exceeds $\chi^2_{k, 1-\alpha}$ for all $\gamma \in R^{m_W}$. This is equivalent to $AR_n(\beta_0) > \chi^2_{k, 1-\alpha}$, where $AR_n(\beta_0)$ is the subset AR statistic (5). Therefore, the nominal size $\alpha$ subset AR and projected AR tests are based on the same test statistic, but the former test uses a strictly smaller critical value if $m_W > 0$. We therefore have

the following corollary.

Corollary 2 The nominal size $\alpha$ projected AR test has asymptotic size strictly smaller than $\alpha$. It is strictly less powerful than the nominal size $\alpha$ subset AR test in (11).

Comments.

1. Theorem 1 and Corollary 2 combined imply that the subset AR test controls the asymptotic size and provides power improvements over the projected AR test.

2. Theorem 1 implies, in particular, that the limiting distribution of $AR_n(\beta_0)$ under strong IV asymptotics provides a stochastic bound on its limiting distribution under weak IV asymptotics.

3. The results in Theorem 1 are specific to using the LIML estimator to estimate the unrestricted structural parameters. When we use another estimator to estimate them, like, for example, 2SLS, Theorem 1 no longer holds and the resulting subset AR test may be asymptotically size distorted.$^{6}$

$^{6}$ It can be shown that the subset AR test that is based on the 2SLS estimator of $\gamma$ is asymptotically size distorted.

4. When $m_Y = 0$, so that $m_W = m$, $AR_n(\beta_0)$ equals a version of the J statistic that is based on the LIML estimator, see e.g. Sargan (1958) and Hansen (1982). Theorem 1 implies that asymptotically the J statistic is bounded by a $\chi^2(k - m)$ distribution and that the resulting J test has correct asymptotic size irrespective of the degree of identification. Again, this robustness property does not hold if the J statistic is evaluated at the 2SLS rather than the LIML estimator.

5. The proof of Theorem 1 involves a number of steps. Some of these steps are discussed in Lemmas 3 and 4, which are in the Appendix. First, in Lemma 3, we construct an upper bound on the subset AR statistic. This upper bound is a finite sample one, so it holds for every $n$. The conceptual idea behind the proof is that if the asymptotic size of an $\alpha$-level test based on this upper bound statistic using the $\chi^2_{k-m_W, 1-\alpha}$ critical value is equal to $\alpha$, and the upper bound is sharp for some drifting sequences of the parameter $\Pi_W$, then the asymptotic size of the subset AR statistic is equal to $\alpha$ as well. We therefore proceed, in Lemma 4, by constructing the asymptotic behavior of the upper bound of the subset Anderson-Rubin statistic. This upper bound equals a ratio, so we separately derive the asymptotic behavior of the numerator and denominator. With respect to the numerator, we show that its asymptotic behavior for a given drifting sequence of $\Pi_W$ is $\chi^2_{k-m_W}$. For the denominator, we show that its asymptotic behavior is such that it is greater than or equal to one. Combining, we obtain that the upper bound for a given drifting sequence of $\Pi_W$ is bounded by a $\chi^2_{k-m_W}$ random variable. The next (main) technical hurdle that is addressed in the proof of Theorem 1 is that this $\chi^2_{k-m_W}$ bound

applies over all possible drifting sequences of $\Pi_W$. The bound therefore even applies for drifting sequences which are such that the asymptotic distribution of the subset AR statistic does not exist. The size of the subset AR statistic in such sequences is, however, still controlled because the finite sample bound on the subset AR statistic still applies and we have shown that its maximal rejection frequency over all possible drifting sequences of $\Pi_W$ is controlled.

6. In linear IV, it is, for expository purposes, common to analyze the case of fixed instruments, normal errors and a known covariance matrix, see e.g. Moreira (2003, 2009) and Andrews et al. (2006). In that case, the bound on the subset AR statistic simplifies as well:
\[ AR(\beta_0) \leq \frac{ z_\varepsilon' M_{(\Theta_W + z_{V_W})} z_\varepsilon }{ 1 + \varphi' \left[ (\Theta_W + z_{V_W})'(\Theta_W + z_{V_W}) \right]^{-1} \varphi } \leq z_\varepsilon' M_{(\Theta_W + z_{V_W})} z_\varepsilon \sim \chi^2_{k-m_W}, \tag{14} \]
with $z_\varepsilon$, $z_{V_W}$ and $\varphi$ independent standard normal distributed $k \times 1$, $k \times m_W$ and $m_W \times 1$ dimensional random vectors/matrices and $\Theta_W = (Z'Z)^{1/2} \Pi_W \Sigma_{WW.\varepsilon}^{-1/2}$, with $\Sigma_{WW.\varepsilon} = \Sigma_{WW} - \sigma_{W\varepsilon} \sigma_{\varepsilon\varepsilon}^{-1} \sigma_{\varepsilon W}$, for
\[ \Sigma = E(U_i U_i') = \begin{pmatrix} \sigma_{\varepsilon\varepsilon} & \sigma_{\varepsilon W} \\ \sigma_{W\varepsilon} & \Sigma_{WW} \end{pmatrix}. \]
When $m_W = 1$ and the length of $\Theta_W$ goes to infinity, the distribution of the subset AR statistic is $\chi^2_{k-m_W}$, which coincides with the bound in (14) for a (very) large value of the length of $\Theta_W$.

7. To gain some further intuition for the result in Theorem 1, we note that the subset AR statistic is identical to Anderson's (1951) canonical correlation statistic, which tests if a matrix is of reduced rank. A test of $H_0: \beta = \beta_0$ using the subset AR statistic is therefore identical to a test of the hypothesis $\mathrm{rank}(\Phi) = m_W$ using Anderson's (1951) canonical correlation statistic in the model
\[ (y - Y\beta_0 : W) = Z\Phi + (u : V_W), \tag{15} \]
with $u = \varepsilon + V_W\gamma$ and $\Phi \in R^{k \times (m_W + 1)}$. The value for $\Phi$ implied under $H_0$ and (1) is
\[ \Phi = \Pi_W (\gamma : I_{m_W}), \tag{16} \]
which is a $k \times (m_W + 1)$ dimensional matrix of rank $m_W$. The expression of the upper bound in the known covariance matrix case in (14) shows that the distribution of the subset AR statistic is non-decreasing in the length of the normalized expression of $\Pi_W$, $\Theta_W$, when $m_W = 1$. The length of $\Theta_W$ reflects the strength of identification, so the distribution of the subset AR statistic is non-decreasing in the strength of identification. This property can be understood using the analogy with the statistic testing the rank of $\Phi$ discussed above. When the length of $\Theta_W$ is large, the smallest value of the rank statistic is attained at the reduced rank structure of $\Phi$ shown in (16). When the length of $\Theta_W$ is small, the

smallest value of the rank statistic can be attained at a reduced rank value of $\Phi$ which results from a reduced rank structure in $\Pi_W$. This implies that this value of the rank statistic is less than the value attained at the reduced rank structure corresponding with (16). In the latter case, the rank statistic has a $\chi^2(k - m_W)$ distribution, so for small values of the length of $\Theta_W$, the distribution of the rank statistic is dominated by the $\chi^2(k - m_W)$ distribution.

3 Size distortion of the subset LM test

The joint AR test is known to have relatively poor power properties when the degree of overidentification is large. Recently, other tests were introduced that improve on the power properties, in particular, the LM test, Kleibergen (2002), and the CLR test, Moreira (2003). The purpose of this section is to show that the subset version of the LM test, Kleibergen (2004), suffers from asymptotic size distortion. Because the LM statistic is an integral part of the CLR statistic, the subset CLR test quite certainly also suffers from asymptotic size distortion. Therefore, given the results in this section, if one attempts to improve further on the power properties of the subset AR test, the subset LM and CLR tests offer no easy solution.

To document the asymptotic size distortion, it is enough to show asymptotic overrejection of the null hypothesis under certain parameter sequences $\lambda_n = (\gamma_n, \Pi_{W,n}, \Pi_{Y,n}, F_n)$. Overrejection of the null of the subset LM test is pervasive under weak IV sequences and we focus on just one particular choice below. For simplicity, we consider only the case where $m_Y = m_W = 1$, i.e. (2) tests a hypothesis on the scalar coefficient of the endogenous variable $Y$. In that case the subset LM test statistic is given by
\[ LM_n(\beta_0) = \frac{1}{\hat\sigma_{\varepsilon\varepsilon}(\beta_0, \tilde\gamma)} (y - Y\beta_0 - W\tilde\gamma)' P_{Z\tilde\Pi(\beta_0)} (y - Y\beta_0 - W\tilde\gamma), \tag{17} \]
where
\[ \tilde\Pi(\beta_0) = (Z'Z)^{-1} Z' \left[ (Y : W) - (y - Y\beta_0 - W\tilde\gamma) \frac{ (1, -\beta_0', -\tilde\gamma')\, \hat\Omega \begin{pmatrix} 0 \\ I_m \end{pmatrix} }{ \hat\sigma_{\varepsilon\varepsilon}(\beta_0, \tilde\gamma) } \right]. \tag{18} \]
When $m_Y = m_W = 1$, the nominal size $\alpha$ subset LM test rejects the null in (2) if
\[ LM_n(\beta_0) > \chi^2_{1, 1-\alpha}. \tag{19} \]
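The construction in (17) and (18) can be traced step by step in code. The sketch below computes the LIML estimator under the null, the matrix $\tilde\Pi(\beta_0)$, and the resulting subset LM statistic for $m_Y = m_W = 1$. The function name `subset_lm` and the simulated design are hypothetical choices for illustration; the code uses the identity $(1, -\beta_0', -\tilde\gamma')\hat\Omega (0 : I_m)' = e' M_Z (Y : W)/(n-k)$ for the residual $e = y - Y\beta_0 - W\tilde\gamma$, which follows from the definition of $\hat\Omega$ in (4).

```python
import numpy as np

def subset_lm(y, Y, W, Z, beta0):
    """Return (LM, AR, kappa_min) for the subset hypothesis beta = beta0."""
    n, k = Z.shape
    PZ = Z @ np.linalg.solve(Z.T @ Z, Z.T)
    MZ = np.eye(n) - PZ
    # LIML estimator of gamma under the null via (7) and (8)
    X = np.column_stack([y - Y @ beta0, W])
    Omega_W = X.T @ MZ @ X / (n - k)
    A = X.T @ PZ @ X
    kappa_min = np.linalg.eigvals(np.linalg.solve(Omega_W, A)).real.min()
    B = PZ - kappa_min / (n - k) * MZ
    gamma_tilde = np.linalg.solve(W.T @ B @ W, W.T @ B @ (y - Y @ beta0))
    # residual and its estimated variance sigma_hat_ee(beta0, gamma_tilde)
    e = y - Y @ beta0 - W @ gamma_tilde
    sigma_ee = e @ MZ @ e / (n - k)
    # estimated covariance of the residual with the first stage errors
    YW = np.column_stack([Y, W])
    sigma_eV = e @ MZ @ YW / (n - k)
    # Pi_tilde(beta0) from (18)
    Pi_tilde = np.linalg.solve(Z.T @ Z, Z.T @ (YW - np.outer(e, sigma_eV) / sigma_ee))
    ZPi = Z @ Pi_tilde
    P_ZPi = ZPi @ np.linalg.solve(ZPi.T @ ZPi, ZPi.T)
    lm = e @ P_ZPi @ e / sigma_ee      # statistic (17)
    ar = e @ PZ @ e / sigma_ee         # subset AR statistic (6)
    return lm, ar, kappa_min

# Illustrative data (assumed design) with weak instruments for both regressors.
rng = np.random.default_rng(1)
n, k = 200, 10
Z = rng.standard_normal((n, k))
eps = rng.standard_normal(n)
V = 0.5 * eps[:, None] + rng.standard_normal((n, 2))
Y = Z @ np.full((k, 1), 0.1) + V[:, :1]
W = Z @ np.full((k, 1), 0.02) + V[:, 1:]
beta0 = np.array([1.0])
y = Y @ beta0 + W @ np.array([0.5]) + eps

lm, ar, kappa_min = subset_lm(y, Y, W, Z, beta0)
```

Because the columns of $Z\tilde\Pi(\beta_0)$ lie in the column space of $Z$, the LM statistic never exceeds the subset AR statistic in a given sample. The size distortion discussed in this section comes from the smaller critical value in (19), not from the statistic being larger.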

The parameter space $\Lambda$ is defined in this section as in (12) with $U_i$ replaced by $(\varepsilon_i, V_i')'$ and with the additional restrictions $E_F(\|T_i\|^{2+\delta}) \leq M$, for $T_i \in \{ Z_i V_{Y,i},\ \varepsilon_i V_{Y,i},\ V_{Y,i} \}$. These restrictions are needed for the subset LM test for the application of WLLNs and CLTs when constructing its limiting distribution. To document asymptotic overrejection of the test in (19), we focus on parameter sequences $\lambda_n = \lambda_{n,h}$ that are such that
\[ \begin{aligned} & n^{1/2} Q^{1/2} \Pi_{Y,n} / \sqrt{\sigma_{YY}} \to h_{11} \in R^{k}, \qquad n^{1/2} Q^{1/2} \Pi_{W,n} / \sqrt{\sigma_{WW}} \to h_{12} \in R^{k}, \\ & \left( \frac{E_F(\varepsilon_i V_{Y,i})}{\sqrt{\sigma_{\varepsilon\varepsilon}\sigma_{YY}}},\ \frac{E_F(\varepsilon_i V_{W,i})}{\sqrt{\sigma_{\varepsilon\varepsilon}\sigma_{WW}}},\ \frac{E_F(V_{Y,i} V_{W,i})}{\sqrt{\sigma_{WW}\sigma_{YY}}} \right)' \to h_2 \in [-1, 1]^3, \end{aligned} \tag{20} \]
where $Q = E(Z_i Z_i')$, $\sigma_{YY} = E_F(V_{Y,i}^2)$, $\sigma_{WW} = E_F(V_{W,i}^2)$, and $h = (h_{11}', h_{12}', h_2')'$. The Appendix derives the limiting distribution $LM_h(\beta_0)$ of $LM_n(\beta_0)$ under such sequences $\lambda_n$, see (66), when IVs are weak, i.e. $\|h_{11}\| < \infty$ and $\|h_{12}\| < \infty$. The limiting distribution only depends on the parameters $h_1 = (h_{11}', h_{12}')'$ and $h_2$. In fact, it only depends on $h_1$ through $\|h_{11}\|$, $\|h_{12}\|$, and $h_{11}'h_{12}$. For example, when $k = 5, 10, 15, 20, 25,$ and $30$, then under $\lambda_{n,h}$ with, for example, $\|h_{11}\| = 100$, $\|h_{12}\| = 1$, $h_{11}'h_{12} = 95$, $h_{21} = 0$, $h_{22} = .95$, and $h_{23} = .3$, the asymptotic null rejection probability is 5.7, 9.6, 12.9, 15.5, 17.7, 19.5%, respectively, for nominal size $\alpha = 5\%$. These probabilities are obtained by simulation using 500,000 simulation repetitions. They provide a lower bound for the asymptotic size of the subset LM test. The test is therefore size distorted and the distortion can be substantial when the number of instruments $k$ is large.

Appendix

The Appendix provides the proof of Theorem 1 and the derivation of the limiting distribution of the subset LM statistic. We first state two lemmas that are helpful to prove Theorem 1. Their proofs are given after the proof of Theorem 1 below.

Lemma 3 Under the null (2) we have with probability approaching 1 (wpa 1)
\[ AR_n(\beta_0) = \min_{d \in R^{1+m_W}} \frac{ d' (\Sigma^{1/2}\hat\Sigma^{-1/2})' N_n' L_n N_n (\Sigma^{1/2}\hat\Sigma^{-1/2}) d }{ d'd }, \tag{21} \]
and $AR_n(\beta_0)$ can be bounded by
\[ AR_n(\beta_0) \leq \frac{1}{\rho_n}\, z_{\varepsilon,n}' M_{\xi_n} z_{\varepsilon,n}, \tag{22} \]

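where
\[ \rho_n = \big(1, -\nu_n'(\xi_n'\xi_n)^{-1/2}\big) \big(\Sigma^{-1/2\prime} \hat\Sigma\, \Sigma^{-1/2}\big) \big(1, -\nu_n'(\xi_n'\xi_n)^{-1/2}\big)', \tag{23} \]
where
\[ N_n = \begin{pmatrix} 1 & 0 \\ (\xi_n'\xi_n)^{-1/2}\nu_n & I_{m_W} \end{pmatrix}, \qquad L_n = \begin{pmatrix} z_{\varepsilon,n}' M_{\xi_n} z_{\varepsilon,n} & 0 \\ 0 & \xi_n'\xi_n \end{pmatrix}, \tag{24} \]
with
\[ \xi_n = \Theta_n + z_{V_W,n} \in R^{k \times m_W}, \tag{25} \]
\[ \nu_n = (\xi_n'\xi_n)^{-1/2} \xi_n' z_{\varepsilon,n} \in R^{m_W}, \tag{26} \]
and
\[ \begin{aligned} z_{\varepsilon,n} &= (Z'Z)^{-1/2} Z'\varepsilon\, \sigma_{\varepsilon\varepsilon}^{-1/2} \in R^{k}, \\ z_{V_W,n} &= (Z'Z)^{-1/2} Z'\Big(V_W - \varepsilon \frac{\sigma_{\varepsilon W}}{\sigma_{\varepsilon\varepsilon}}\Big) \Sigma_{WW.\varepsilon}^{-1/2} \in R^{k \times m_W}, \\ \Theta_n &= (Z'Z)^{1/2} \Pi_W \Sigma_{WW.\varepsilon}^{-1/2} \in R^{k \times m_W}. \end{aligned} \tag{27} \]

The next lemma derives limiting expressions for $\rho_n$ and $z_{\varepsilon,n}' M_{\xi_n} z_{\varepsilon,n}$ under sequences $\lambda_n = (\gamma_n, \Pi_{W,n}, \Pi_{Y,n}, F_n)$ of null data generating processes in $\Lambda$ such that the factors of a singular value decomposition of
\[ \Theta(n) = Q^{1/2} n^{1/2} \Pi_{W,n} \Sigma_{WW.\varepsilon}^{-1/2} \in R^{k \times m_W} \tag{28} \]
with $Q = E(Z_i Z_i')$, $\Sigma_{WW.\varepsilon} = \Sigma_{WW} - \sigma_{W\varepsilon} \sigma_{\varepsilon\varepsilon}^{-1} \sigma_{\varepsilon W}$, for
\[ \Sigma = E(U_i U_i') = \begin{pmatrix} \sigma_{\varepsilon\varepsilon} & \sigma_{\varepsilon W} \\ \sigma_{W\varepsilon} & \Sigma_{WW} \end{pmatrix}, \]
converge. More precisely, by the singular value decomposition theorem, see e.g. Golub and Van Loan (1989), $\Theta(n)$ can be decomposed into a product
\[ \Theta(n) = G_n D_n R_n', \tag{29} \]
where $G_n$ and $R_n$ are $k \times k$ and $m_W \times m_W$ dimensional real orthonormal matrices, respectively, and $D_n$ is a $k \times m_W$ dimensional rectangular real diagonal matrix with nonnegative elements. The latter matrix is unique up to ordering of the diagonal elements. Let $\bar R = R \cup \{+\infty\}$.

Lemma 4 Let $\lambda_n = (\gamma_n, \Pi_{W,n}, \Pi_{Y,n}, F_n)$ be a sequence of null data generating processes in $\Lambda$ and $\omega_n$ a subsequence of $n$ and $G_{\omega_n} D_{\omega_n} R_{\omega_n}'$ a singular value decomposition of $\Theta(\omega_n)$. Assume $G_{\omega_n} \to G$ for an orthonormal matrix $G \in R^{k \times k}$ and $D_{\omega_n} \to D$ for a diagonal matrix $D \in \bar R^{k \times m_W}$. Then, under $\lambda_n$ we have (i) $\rho_{\omega_n} - (1 + \zeta_{\omega_n}) = o_p(1)$ for some sequence of random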

variables n that satisfy n with robability and (ii) z "; n M n z ";n d 2 k m W. Proof of Theorem. Lemma 3 shows that AR n ( ) only deends on n and wa is bounded by There exists a worst case sequence n rocesses such that AsySz AR; = lim su n AR n ( ) z ";nm n z ";n n : (3) = lim su n = ( n ; W;n ; Y;n ; F n ) 2 of null data generating su P (AR n ( ) > 2 k m W ; ) 2 P n (AR n ( ) > 2 k m W ; ) lim sup n ( z ";nm n z ";n > 2 k m n W ; ); (3) n where the rst equality in (3) holds by de nition of AsySz AR; (3), the second equality by the choice of the sequence n ; n ; and the inequality holds by (3). Furthermore, one can always nd a subsequence n of n such that along n we have G n G for an orthonormal matrix G 2 R kk and D n D for a diagonal matrix D 2 R km W and lim sup n ( z ";nm n z ";n > 2 k m n W ; ) = lim sup n ( z "; n M n z ";n > 2 k m n n W ; ); (32) n where G n D n R n is a singular value decomosition of ( n ): But, under any sequence of null data generating rocesses n = ( 2n ; W;n ; Y;n ; F n ) in and subsequence n of n such that D n D and G n G under n ; we have as shown in Lemma 4(i) and (ii), z "; n M n z ";n n z "; n M n z ";n + o () d 2 k 2 : (33) This together with (3) and (32) shows that AsySz AR; : Under strong IV sequences, the asymtotic null rejection robability of the subset AR test equals ; see Stock and Wright (2). Thus, AsySz AR; = : Proof of Lemma 3. The subset AR statistic AR n ( ); equals the smallest root of the

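characteristic polynomial (8). From (1), we have that
\[ P_Z (y - Y\beta_0 : W) = P_Z \left[ Z\Pi_W (\gamma : I_{m_W}) + (\varepsilon : V_W) \begin{pmatrix} 1 & 0 \\ \gamma & I_{m_W} \end{pmatrix} \right]. \tag{34} \]
Substituting this in (8), pre-multiplying by $\begin{pmatrix} 1 & 0 \\ -\gamma & I_{m_W} \end{pmatrix}'$ and post-multiplying by $\begin{pmatrix} 1 & 0 \\ -\gamma & I_{m_W} \end{pmatrix}$ yields:
\[ \left| \kappa \hat\Sigma - (\varepsilon : Z\Pi_W + V_W)' P_Z (\varepsilon : Z\Pi_W + V_W) \right| = 0, \tag{35} \]
with
\[ \hat\Sigma = \begin{pmatrix} 1 & 0 \\ -\gamma & I_{m_W} \end{pmatrix}' \hat\Omega_W \begin{pmatrix} 1 & 0 \\ -\gamma & I_{m_W} \end{pmatrix} = \begin{pmatrix} \hat\sigma_{\varepsilon\varepsilon} & \hat\sigma_{\varepsilon W} \\ \hat\sigma_{W\varepsilon} & \hat\Sigma_{WW} \end{pmatrix}. \]
Now, specify $\Sigma = E(U_i U_i') = \begin{pmatrix} \sigma_{\varepsilon\varepsilon} & \sigma_{\varepsilon W} \\ \sigma_{W\varepsilon} & \Sigma_{WW} \end{pmatrix}$ and use
\[ \Sigma^{-1/2} = \begin{pmatrix} \sigma_{\varepsilon\varepsilon}^{-1/2} & -\sigma_{\varepsilon\varepsilon}^{-1}\sigma_{\varepsilon W}\Sigma_{WW.\varepsilon}^{-1/2} \\ 0 & \Sigma_{WW.\varepsilon}^{-1/2} \end{pmatrix}, \qquad \hat\Sigma^{-1/2} = \begin{pmatrix} \hat\sigma_{\varepsilon\varepsilon}^{-1/2} & -\hat\sigma_{\varepsilon\varepsilon}^{-1}\hat\sigma_{\varepsilon W}\hat\Sigma_{WW.\varepsilon}^{-1/2} \\ 0 & \hat\Sigma_{WW.\varepsilon}^{-1/2} \end{pmatrix}, \tag{36} \]
with $\Sigma_{WW.\varepsilon} = \Sigma_{WW} - \sigma_{W\varepsilon}\sigma_{\varepsilon\varepsilon}^{-1}\sigma_{\varepsilon W}$, $\hat\Sigma_{WW.\varepsilon} = \hat\Sigma_{WW} - \hat\sigma_{W\varepsilon}\hat\sigma_{\varepsilon\varepsilon}^{-1}\hat\sigma_{\varepsilon W}$, and the elements of $\hat\Sigma^{-1/2}$ exist with probability approaching 1 (wpa 1). We pre- and post-multiply (35) by $\Sigma^{-1/2\prime}$ and $\Sigma^{-1/2}$, respectively, to get
\[ \left| \kappa\, \Sigma^{-1/2\prime} \hat\Sigma\, \Sigma^{-1/2} - \Sigma^{-1/2\prime} (\varepsilon : Z\Pi_W + V_W)' P_Z (\varepsilon : Z\Pi_W + V_W) \Sigma^{-1/2} \right| = 0 \tag{37} \]
or
\[ \left| \kappa\, \Sigma^{-1/2\prime} \hat\Sigma\, \Sigma^{-1/2} - (z_{\varepsilon,n} : \Theta_n + z_{V_W,n})' (z_{\varepsilon,n} : \Theta_n + z_{V_W,n}) \right| = 0 \tag{38} \]
for
\[ \begin{aligned} z_{\varepsilon,n} &= (Z'Z)^{-1/2} Z'\varepsilon\, \sigma_{\varepsilon\varepsilon}^{-1/2} \in R^{k}, \\ z_{V_W,n} &= (Z'Z)^{-1/2} Z'\Big(V_W - \varepsilon \frac{\sigma_{\varepsilon W}}{\sigma_{\varepsilon\varepsilon}}\Big) \Sigma_{WW.\varepsilon}^{-1/2} \in R^{k \times m_W}, \\ \Theta_n &= (Z'Z)^{1/2} \Pi_W \Sigma_{WW.\varepsilon}^{-1/2} \in R^{k \times m_W}. \end{aligned} \tag{39} \]
Using the moment restrictions in (12), an application of Lyapunov CLTs and WLLNs implies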

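that under any drifting parameter sequence $\lambda_n = (\gamma_n, \Pi_{W,n}, \Pi_{Y,n}, F_n)$
\[ (z_{\varepsilon,n}', \mathrm{vec}(z_{V_W,n})')' \to_d N(0, I_{k(1+m_W)}), \qquad \Sigma^{-1/2\prime}\hat\Sigma\,\Sigma^{-1/2} \to_p I_{1+m_W}, \qquad Q^{-1}(n^{-1}Z'Z) \to_p I_k, \tag{40} \]
for
\[ Q = E_F(Z_i Z_i'). \tag{41} \]
Therefore, $z_{\varepsilon,n}$ and $z_{V_W,n}$ are asymptotically independent. We now use that
\[ (z_{\varepsilon,n} : \Theta_n + z_{V_W,n})'(z_{\varepsilon,n} : \Theta_n + z_{V_W,n}) = \begin{pmatrix} z_{\varepsilon,n}'z_{\varepsilon,n} & z_{\varepsilon,n}'(\Theta_n + z_{V_W,n}) \\ (\Theta_n + z_{V_W,n})'z_{\varepsilon,n} & (\Theta_n + z_{V_W,n})'(\Theta_n + z_{V_W,n}) \end{pmatrix} = N_n' L_n N_n \tag{42} \]
for
\[ N_n = \begin{pmatrix} 1 & 0 \\ (\xi_n'\xi_n)^{-1/2}\nu_n & I_{m_W} \end{pmatrix}, \qquad L_n = \begin{pmatrix} z_{\varepsilon,n}' M_{\xi_n} z_{\varepsilon,n} & 0 \\ 0 & \xi_n'\xi_n \end{pmatrix}, \tag{43} \]
with$^{7}$
\[ \xi_n = \Theta_n + z_{V_W,n} \in R^{k \times m_W}, \qquad \nu_n = (\xi_n'\xi_n)^{-1/2}\xi_n' z_{\varepsilon,n} \in R^{m_W}, \tag{44} \]
to pre- and post-multiply the elements in the characteristic polynomial in (38) by $(\Sigma^{1/2}\hat\Sigma^{-1/2})'$ and $\Sigma^{1/2}\hat\Sigma^{-1/2}$, which exists wpa 1:
\[ \left| \kappa I_{m_W+1} - (\Sigma^{1/2}\hat\Sigma^{-1/2})' N_n' L_n N_n (\Sigma^{1/2}\hat\Sigma^{-1/2}) \right| = 0. \tag{45} \]
The smallest root of the characteristic polynomial in (45) is with probability one equal to
\[ \min_{d \in R^{1+m_W}} \frac{ d' (\Sigma^{1/2}\hat\Sigma^{-1/2})' N_n' L_n N_n (\Sigma^{1/2}\hat\Sigma^{-1/2}) d }{ d'd }. \tag{46} \]

$^{7}$ We do not index $\Sigma$, $\sigma_{\varepsilon\varepsilon}$, etc. by $F$ to simplify notation. Likewise for $Q$.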

If we now use a value of d such that d = (^ =2 2 ) ; (47) ( n n ) =2 n the bottom m W rows of N n cancel out in the numerator and we obtain a bound on the subset AR statistic: with n = AR n ( ) n z ";nm n z ";n ; (48) ( n n) =2 n ( =2 ^ =2 ( n n) =2 n : Proof of Lemma 4. For ease of resentation, we assume n = n: Assume wlog that the j-th diagonal element D j of D is nite for j and D j = for j > for some m W : De ne a full rank diagonal matrix B n 2 R m W m W j and equal to D nj B n are bounded by. (i) We can write with j-th diagonal element equal to for otherwise for j >. Note that for all large enough n; the elements of n = (n Z Z) =2 Q =2 (n) = (n Z Z) =2 Q =2 G n D n R n: (49) Then, noting that (n Z Z) =2 Q =2 I k under n ; we have n R n B n GD; where D 2 R km W is a rectangular diagonal matrix with diagonal elements D j = D j < for j and D j = for j >. Noting that =2 b =2 = I + + o () we have n = (; n( n n ) =2 )( =2 b =2 )(; n( n n ) =2 ) = + n( n n ) n + (; n( n n ) =2 ) =2 ( b ) =2 )(; n( n n ) =2 ) (5) = + n( n n ) n + (; e n )o ()(; e n ) (5) for e n = z ";n( n R n B n )(( n R n B n ) ( n R n B n )) (R n B n ) (52) Note that n R n B n = n R n B n +z VW ;nr n B n ; n R n B n GD and, using (4) and D nj for j > ; z VW ;nr n B n converges in distribution to a random matrix, z VW say, whose j-th columns for j > are zero and the rst columns are random and linearly indeendent of z ";n with robability. This and the restrictions in (2) then imly that (( n R n B n ) ( n R n B n )) = O () and given that R n B n = O() we have e n = O (): This and (5) then roves the claim with n = n( n n ) n : 4

(ii) Note that because R n B n 2 R m W m W has full rank, we have M n = M nr nb n : As established in (i), we have n R n B n d GD + z VW ; where by (4), this limit is indeendent of the limit distribution, z " N(; I k ) say, of z ";n : Therefore, z ";nm n z ";n d z "M GD+zVW z " under n : Given indeendence of z " and z VW ; it follows that conditional on z VW we have z "M GD+zVW z " 2 k m W whenever GD + z VW has full column rank. Therefore, also unconditionally, z "M GD+zVW z " 2 k m W : Limiting Distribution of the Subset LM Statistic We next derive the limiting distribution of the subset LM statistic under the drifting sequence n = n;h in (2) in the weak IV case, where jjh jj < and jjh 2 jj < : Recall that by WLLNs and CLTs we have under n for ^Q = n Z Z B @ ( ^Q) =2 n =2 Z "= "" ( ^Q) =2 n =2 Z V Y = Y Y ( ^Q) =2 n =2 Z V W = W W n ( " " ; V "" Y V Y Y Y ; V W V W W W ; C A d " V Y "" Y Y ; B @ z ";h z VY ;h z VW ;h " V W "" W W ; C A N B B @; @ h 2 h 22 h 2 h 23 h 22 h 23 C C A I k A ; V Y V W Y Y W W ) (; ; ; h 2 ; h 22 ; h 23 ); Q ^Q I k ; n Z [" : V ] ; and (53) where z ";h ; z VY ;h; z VW ;h 2 R k. De ne v ;h v 2;h = W W W P Z W ""W W W P Z " = (z V W ;h + h 2 ) (z VW ;h + h 2 ) (z VW ;h + h 2 ) z ";h (54) It is easily shown that (v ;h ; v 2;h ) only deends on h 2h 2 and h 22 and not on the other elements in h. By Theorem (a) and Theorem 2 in Staiger and Stock (997) we have W W "" 2 (~ ) d h = v 2;h h h 22 ;h h ; (55) where for LIML, h is the smallest root of the characteristic olynomial j(z ";h ; z VW ;h + h 2 ) (z ";h ; z VW ;h + h 2 ) h j = (56) in and h 2 R 22 with diagonal elements and o diagonal elements h 22 : By Theorem (b) 8 8 Note that it does not change the asymtotic results if one de nes ^ "" ( ; ~) with M Z relaced by I n as in Staiger and Stock (997). 5

in Staiger and Stock (997) we have ^ "" ( ; ~)= "" d 2 "h = 2h 22 h + 2 h: (57) We have from (53) (n Z Z) =2 n =2 Z Y= Y Y d z VY ;h + h ; (58) (n Z Z) =2 n =2 Z W= W W d z VW ;h + h 2 ; Combining (55)-(58), we obtain bs = (n Z Z) =2 n =2 Z (y Y W ~)= "" d s h = (z VW ;h + h 2 ) h + z ";h : (59) By (53) we have b "Y =( "" Y Y ) = n k (y Y W ~) M Z Y=( "" Y Y ) = (n k) (W ( ~) + ") M Z Y=( "" Y Y ) = (n k) (V W ( ~) + ") M Z Y=( "" Y Y ) W 2 W = ( ~)(n k) VY V W + (n k) " V Y + o (): W W Y Y "" Y Y "" (6) b "W =( "" W W ) = = n k (y Y W ~) M Z W=( "" Y Y ) 2 ( ~)(n k) V W V W + (n k) " V W + o (): W W "" W W W W "" (6) Therefore, by (55) b "Y =( "" Y Y ) d h h 23 + h 2 and b "W =( "" W W ) d h + h 22 : (62) Next secify ~ ( ) = (e Y. e W ) and consider b Y = (Z Z) =2 e Y = Y Y 2 R k ; b W = (Z Z) =2 e W = W W 2 6

R k : That is, b Y = (n Z Z) =2 n =2 Z [Y (y Y W ~) ) ^ "" ( ; ~) ]= Y Y = (n Z Z) =2 n =2 Z Y= Y Y bs b "Y =( "" Y Y ) ^ "" ( ; ~)= "" b "Y 2 R k b W = (n Z Z) =2 n =2 Z Y= W W bs b "W =( "" W W ) ^ "" ( ; ~)= "" 2 R k (63) Using (57), (58), (59), and (62) we have b Y d Y;h = z VY ;h + h s h h h 23 + h 2 2 "h and By simle calculations 9, h + h 22 b W W;h = z VW ;h + h 2 s h : (64) d 2 "h ^"" ( LM n ( ) = ; ~) bs P (by ;bw )bs (65) "" and therefore by the continuous maing theorem LM n ( ) d LM h = s hp (Y;h ; W;h )s h = "h : (66) References Anderson, T.W. (95): Estimating linear restrictions on regression coe cients for multivariate normal distributions, The Annals of Mathematical Statistics, 22, 327 35. Anderson, T.W. and H. Rubin (949): Estimation of the Parameters of a Single Equation in a Comlete Set of Stochastic Equations, The Annals of Mathematical Statistics, 2, 46 63. 9 Note that the numerical value of LM n ( ) is not a ected if one relaces ~ ( ) by ~ ( )T for any invertible matrix T 2 R 22 : Here we take T as a diagonal matrix with diagonal elements 2 Y Y ; 2 W W : 7

Andrews, D.W.K., M. Moreira, and J.H. Stock (2006): Optimal Invariant Similar Tests for Instrumental Variables Regression, Econometrica, 74, 715-752.

Chaudhuri, S. and E. Zivot (2011): A New Method of Projection-Based Inference in GMM With Weakly Identified Nuisance Parameters, Journal of Econometrics, 164, 239-251.

Dufour, J.-M. and M. Taamouti (2005): Projection-Based Statistical Inference in Linear Structural Models With Possibly Weak Instruments, Econometrica, 73, 1351-1365.

Golub, G.H. and C.F. van Loan (1989): Matrix Computations. The Johns Hopkins University Press (Baltimore).

Guggenberger, P., J.J.S. Ramalho, and R.J. Smith (2013): GEL statistics under weak identification, forthcoming Journal of Econometrics.

Guggenberger, P. and R.J. Smith (2005): Generalized Empirical Likelihood Estimators and Tests Under Partial, Weak and Strong Identification, Econometric Theory, 21, 667-709.

Hansen, L. (1982): Large sample properties of generalized method of moments estimators, Econometrica, 50, 1029-1054.

Hausman, J.A. (1983): Specification and estimation of simultaneous equations systems. In Z. Griliches and M.D. Intriligator, editors, Handbook of Econometrics, Volume 1. Elsevier Science (Amsterdam).

Hausman, J., R. Lewis, K. Menzel, and W. Newey (2011): Properties of the CUE estimator and a modification with moments, Journal of Econometrics, 165, 45-57.

Kleibergen, F. (2002): Pivotal Statistics for Testing Structural Parameters in Instrumental Variables Regression, Econometrica, 70, 1781-1803.

Kleibergen, F. (2004): Testing Subsets of Structural Parameters in the IV Regression Model, Review of Economics and Statistics, 86, 418-423.

Kleibergen, F. (2005): Testing Parameters in GMM Without Assuming That They Are Identified, Econometrica, 73, 1103-1123.

Moreira, M.J. (2003): A Conditional Likelihood Ratio Test for Structural Models, Econometrica, 71, 1027-1048.

Moreira, M.J. (2009): Tests With Correct Size When Instruments Can Be Arbitrarily Weak, Journal of Econometrics, 152, 131-140.

Otsu, T. (2006): Generalized Empirical Likelihood Inference for Nonlinear and Time Series Models under Weak Identification, Econometric Theory, 22, 513-527.

Sargan, J. (1958): The estimation of economic relationships using instrumental variables, Econometrica, 26, 393-415.

Staiger, D. and J.H. Stock (1997): Instrumental Variables Regression With Weak Instruments, Econometrica, 65, 557-586.

Startz, R., C. Nelson and E. Zivot (2006): Improved inference in weakly identified instrumental variables regression, Frontiers in Analysis and Applied Research: Essays in Honor of P.C.B. Phillips. Cambridge University Press.

Stock, J.H. and J.H. Wright (2000): GMM with weak identification, Econometrica, 68, 1055-1096.