Instrumental Variables Estimation and Weak-Identification-Robust Inference Based on a Conditional Quantile Restriction
Vadim Marmer
Department of Economics, University of British Columbia
vadim.marmer@gmail.com

Shinichi Sakata
Department of Economics, University of Southern California
shinichi.sakata@gmail.com

August 17, 2011

Extending the L1-IV approach proposed by Sakata (1997, 2007), we develop a new method, named the ρτ-IV estimation, to estimate structural equations based on a conditional quantile restriction imposed on the error terms. We study the asymptotic behavior of the proposed estimator and show how to make statistical inferences on the regression parameters. Given the practical importance of weak identification, a highlight of the paper is a test robust to weak identification. The statistic used in our method can be viewed as a natural counterpart of Anderson and Rubin's (1949) statistic in the ρτ-IV estimation.
1 Introduction

In this paper, we develop a new method, named the ρτ-IV estimation, to estimate structural equations based on a conditional quantile restriction imposed on the error terms, extending the L1-IV approach proposed by Sakata (1997, 2007). We study the large-sample behavior of the new estimator and show how to make statistical inferences on the regression parameters. In particular, we pay attention to statistical inference under weak identification, as weak identification is as important a possibility in regression based on a conditional quantile restriction as in regression based on the conditional mean restriction. We propose a weak-identification-robust test that can be viewed as a natural counterpart of Anderson and Rubin's (1949) statistic in the ρτ-IV estimation.

The conventional instrumental variables (IV) estimator is based on identification of the structural parameters through the conditional mean restriction that the mean of the structural error term conditional on a set of instrumental variables is zero. The conditional mean restriction may look appealing because, unlike independence between the error term and the instruments, it does not restrict other features of the conditional distribution of the error term, such as its variance. Nevertheless, the conditional mean restriction is considered unsuitable in some applications. The conditional mean of a random variable critically depends on the tails of its conditional distribution: a small change in the tails can cause a large change in the conditional mean. In many applications, on the other hand, we know little about the part of the population distribution that corresponds to the tails of the error distribution. This often makes it difficult to justify the conditional mean restriction. The conditional mean restriction is not, however, the only natural way to identify the parameters of structural equations.
In many applications, the conditional mean restriction comes from an informal intuition that the location of the conditional distribution of the error term given a suitably chosen set of instruments should be constant. Faced with the above-mentioned concern about the conditional mean restriction, one would desire to capture the location of the conditional distribution of the error term by a measure that does not depend on the tails. The conditional quantiles of the error term are examples of such location measures. Sakata (1997, 2007) proposes identifying and estimating the regression parameters based on conditional median restrictions. Chernozhukov and Hansen (2001, 2006) also consider identification of the regression parameters based on conditional quantile restrictions and propose an estimation method, taking an approach related to but different from Sakata's.

In the current paper, we extend the estimator of Sakata (1997, 2007) to propose a method, called the ρτ-IV method, to estimate regression models under the conditional τ-quantile restriction. Being based on the same identification condition, our estimator is closely related to Chernozhukov and Hansen's estimator. The computational burdens of the two estimators are also comparable, as should be clear from the discussion in Section 3 of the current paper. A benefit of our approach is that the objective function maximized in the ρτ-IV estimation takes a form similar to the variance ratio in the (normal) limited information maximum likelihood (LIML) estimation. This allows us to formulate a statistic analogous to the Anderson-Rubin (AR) statistic, with which we can make weak-identification-robust inference on the regression parameters of interest.

In the IV regression literature, many researchers have paid attention to possible identification issues. Sargan (1983) points out that near violation of identifiability is problematic. The analysis of Phillips (1984, 1985) on the exact finite-sample distribution of LIML clearly shows that lack of identifiability in structural equation estimation keeps the LIML estimator from consistently estimating the coefficients of the structural equation. Hillier (1990) shows analogous results in considering directional estimation of the coefficients of structural equations. Choi and Phillips (1992) further explore the behavior of the IV estimator under lack of identifiability. When instrumental variables are poorly correlated with endogenous explanatory variables in linear regression, the asymptotic distribution of the IV estimator is quite different from what the standard large-sample theory suggests, as demonstrated by Nelson and Startz (1990a, 1990b) and Bound, Jaeger, and Baker (1995).
Staiger and Stock (1997) propose an alternative way to approximate the distribution of the IV estimator with weak instruments. Stock and Wright (2000) then establish a way to approximate the distribution of generalized method of moments (GMM) estimators under weak identification. The proposed approximation methods are useful for theoretically studying the nature of the IV and GMM estimators under weak identification. Nevertheless, involving unidentifiable nuisance parameters, they do not offer a way to approximate the distribution of the estimators based on data. Given the absence of a convenient and reliable approximation to the distribution of the IV estimator with weak instruments, it is difficult to test hypotheses on regression parameters in the usual style (i.e., the t-test, the Wald test, etc.). On the other hand, the AR test originally proposed in Anderson and Rubin (1949) is not affected by weakness of the instruments. For this reason, Staiger and Stock (1997) and Dufour (1997) recommend the use of the AR test. The AR test even has nice power properties when the number of instruments is equal to the number of endogenous explanatory variables (Moreira 2001; Andrews, Moreira, and Stock 2004).

Weak identification is also an important possibility in regression based on a conditional quantile restriction. We therefore propose a test that has asymptotically correct size regardless of whether identification is strong. The hypothesis we consider is that some regression parameters are equal to prespecified values. If we apply the ρτ-IV method imposing the constraints of the null hypothesis, the objective function in the ρτ-IV estimation maximized subject to those constraints tends to be close to one under the null. The constrained maximum of the objective function is similar to the Anderson and Rubin (1949) statistic in the sense that it captures how much of the fitted structural error can be explained by the instruments. Ranging between zero and one, it is close to one if the fitted structural residuals cannot be fitted by the instruments in the sample. A value far from one is thus taken as evidence against the null in our test. If the conditions in the null hypothesis include the coefficients of all regressors potentially weakly related to the instruments excluded from the regression function, then the proposed test involves no weak identification problem, so that our test is robust to weak identification. Our test is closely related to that of Chernozhukov and Hansen (2008). They formulate a test convenient in the estimation framework of Chernozhukov and Hansen (2001, 2006), while we propose a test convenient in the ρτ-IV estimation. Another related paper is Jun (2008), which formulates a test adapting the approach of Kleibergen (2005).
The rest of the paper is organized as follows. We first describe the basic setup and define the ρτ-IV estimator in Section 2. Then, after briefly discussing the computation of the ρτ-IV estimator in Section 3, we establish the consistency and asymptotic normality of the ρτ-IV estimator and explain how to consistently estimate its asymptotic covariance matrix in Section 4. In Section 5, we develop a weak-identification-robust method to test hypotheses on the regression parameters. Throughout the paper, $\|\cdot\|$ denotes the Euclidean norm for vectors and the Frobenius norm for matrices, and limits are taken along the sequence of sample sizes growing to infinity, unless otherwise indicated.
2 ρτ-IV Estimator

Assumption 1: Let $(\Omega, \mathcal{F}, P)$ be a probability space. The data are a realization of an independently and identically distributed stochastic process $\{X_t \equiv (y_t, Y_t', Z_t')' : \Omega \to \mathbb{R} \times \mathbb{R}^g \times \mathbb{R}^k\}_{t \in \mathbb{N}}$ such that $E[\|X_1\|] < \infty$ and, for each $c \in \mathbb{R}^{1+g+k} \setminus \{0\}$, $P[c'X_t = 0] < 1$.

Partition $Z_t$ as $Z_t = (Z_{t,1}', Z_{t,2}')'$, where $Z_{t,1}$ is $k_1 \times 1$ and $Z_{t,2}$ is $k_2 \times 1$ (so that $k_1 + k_2 = k$). The parameters of interest are the coefficients in the regression of $y_t$ on $Y_t$ and $Z_{t,1}$ described in the next assumption.

Assumption 2: The subset $B$ of $\mathbb{R}^g$ is nonempty and compact. There exists a unique $\theta_0 \equiv (\beta_0', \alpha_0')' \in B \times \mathbb{R}^{k_1}$ such that the conditional $\tau$-quantile of $U_1 \equiv y_1 - Y_1'\beta_0 - Z_{1,1}'\alpha_0$ given $Z_1$ is zero, where $\tau$ is a known real constant in $(0, 1)$.

If instead the conditional $\tau$-quantile of $U_1$ given $Y_1$ and $Z_{1,1}$ were known to be zero, $\beta_0$ and $\alpha_0$ could be consistently estimated by the estimator of Koenker and Bassett (1978). In our current setup, Koenker and Bassett's estimator is inconsistent in general. We here propose an estimator of the structural regression coefficients, following the approach described in Section 11 of Sakata (2007). Define $\rho_\tau : \mathbb{R} \to \mathbb{R}$ by
$$\rho_\tau(v) \equiv (\tau - 1(v < 0))\, v, \quad v \in \mathbb{R},$$
where $1(A)$ is the indicator function that equals one if and only if the condition $A$ is true. Also define functions $R : B \times \mathbb{R}^{k_1} \times \mathbb{R}^k \to \mathbb{R}$ and $Q : B \times \mathbb{R}^{k_1} \to \mathbb{R}$ by
$$R(\beta, \alpha, \gamma) \equiv E[\rho_\tau(y_1 - Y_1'\beta - Z_{1,1}'\alpha - Z_1'\gamma)], \quad (\beta, \alpha, \gamma) \in B \times \mathbb{R}^{k_1} \times \mathbb{R}^k,$$
and
$$Q(\beta, \alpha) \equiv \frac{\inf_{\gamma \in \mathbb{R}^k} R(\beta, \alpha, \gamma)}{R(\beta, \alpha, 0)}, \quad (\beta, \alpha) \in B \times \mathbb{R}^{k_1},$$
where $R(\beta, \alpha, 0) > 0$ by the linear independence of the elements of $X_t = (y_t, Y_t', Z_t')'$ required by Assumption 1. Because of the conditional $\tau$-quantile restriction imposed on $U_1$ in Assumption 2, we have that $Q(\beta_0, \alpha_0) = 1$, so that for each $(\beta, \alpha) \in B \times \mathbb{R}^{k_1}$,
$$0 \le Q(\beta, \alpha) \le Q(\beta_0, \alpha_0) = 1. \quad (1)$$
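As a quick illustration of the objects just defined, the check function $\rho_\tau$ and the sample analogue of $R$ amount to a few lines of NumPy. This is a minimal sketch; the function and variable names are ours, not the paper's:

```python
import numpy as np

def rho_tau(tau, v):
    """Koenker-Bassett check function: rho_tau(v) = (tau - 1(v < 0)) * v."""
    v = np.asarray(v, dtype=float)
    return (tau - (v < 0)) * v

def R_hat(tau, y, Ymat, Z1, Z, beta, alpha, gamma):
    """Sample analogue of R(beta, alpha, gamma): the average check-function
    loss of the residual y_t - Y_t'beta - Z_{t,1}'alpha - Z_t'gamma."""
    resid = y - Ymat @ beta - Z1 @ alpha - Z @ gamma
    return rho_tau(tau, resid).mean()
```

For $\tau = 0.5$ the check loss is half the absolute deviation, which recovers the L1-IV case of Sakata (1997, 2007).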
It follows that $\theta_0 \equiv (\beta_0', \alpha_0')'$ is the maximizer of $Q$ over $\Theta \equiv B \times \mathbb{R}^{k_1}$. Our estimator is the maximizer of the sample counterpart of $Q$, given by the sequence of random functions $\{\hat{Q}_n : B \times \mathbb{R}^{k_1} \times \Omega \to \mathbb{R}\}_{n \in \mathbb{N}}$ defined by
$$\hat{Q}_n(\beta, \alpha, \omega) \equiv \begin{cases} \dfrac{\inf_{\gamma \in \mathbb{R}^k} \hat{R}_n(\beta, \alpha, \gamma, \omega)}{\hat{R}_n(\beta, \alpha, 0, \omega)}, & \text{if } \inf_{(b, a) \in B \times \mathbb{R}^{k_1}} \hat{R}_n(b, a, 0, \omega) > 0, \\ 1, & \text{otherwise}, \end{cases}$$
$(\beta, \alpha) \in B \times \mathbb{R}^{k_1}$, $\omega \in \Omega$, $n \in \mathbb{N}$, where
$$\hat{R}_n(\beta, \alpha, \gamma, \omega) \equiv n^{-1} \sum_{t=1}^n \rho_\tau\big(y_t(\omega) - Y_t(\omega)'\beta - Z_{t,1}(\omega)'\alpha - Z_t(\omega)'\gamma\big), \quad (\beta, \alpha, \gamma) \in B \times \mathbb{R}^{k_1} \times \mathbb{R}^k.$$

We now define our estimator.

Definition 1 (The ρτ-IV estimator): Given Assumption 1, a sequence of random vectors $\{\hat\theta_n \equiv (\hat\beta_n', \hat\alpha_n')' : \Omega \to B \times \mathbb{R}^{k_1}\}_{n \in \mathbb{N}}$ is called the ρτ-IV estimator if, for each $n \in \mathbb{N}$, $\hat{Q}_n(\hat\beta_n, \hat\alpha_n, \cdot) = \sup_{(\beta, \alpha) \in B \times \mathbb{R}^{k_1}} \hat{Q}_n(\beta, \alpha, \cdot)$.

For each $(\beta, \alpha) \in B \times \mathbb{R}^{k_1}$, we have that
$$\inf_{\gamma \in \mathbb{R}^k} \hat{R}_n(\beta, \alpha, \gamma, \cdot) = \inf_{(\gamma_1, \gamma_2) \in \mathbb{R}^{k_1} \times \mathbb{R}^{k_2}} n^{-1} \sum_{t=1}^n \rho_\tau\big(y_t - Y_t'\beta - Z_{t,1}'(\alpha + \gamma_1) - Z_{t,2}'\gamma_2\big) = \inf_{(\gamma_1, \gamma_2) \in \mathbb{R}^{k_1} \times \mathbb{R}^{k_2}} n^{-1} \sum_{t=1}^n \rho_\tau\big(y_t - Y_t'\beta - Z_{t,1}'\gamma_1 - Z_{t,2}'\gamma_2\big) = \inf_{\gamma \in \mathbb{R}^k} \hat{R}_n(\beta, 0, \gamma, \cdot).$$
Given this fact, it holds that, whenever $\hat{R}_n(\beta, \alpha, 0, \cdot) > 0$ for every $(\beta, \alpha) \in B \times \mathbb{R}^{k_1}$,
$$\sup_{\alpha \in \mathbb{R}^{k_1}} \hat{Q}_n(\beta, \alpha, \cdot) = \sup_{\alpha \in \mathbb{R}^{k_1}} \frac{\inf_{\gamma \in \mathbb{R}^k} \hat{R}_n(\beta, 0, \gamma, \cdot)}{\hat{R}_n(\beta, \alpha, 0, \cdot)} = \frac{\inf_{\gamma \in \mathbb{R}^k} \hat{R}_n(\beta, 0, \gamma, \cdot)}{\inf_{\alpha \in \mathbb{R}^{k_1}} \hat{R}_n(\beta, \alpha, 0, \cdot)}, \quad \beta \in B. \quad (2)$$
Because the numerator and denominator of the ratio on the right-hand side of (2) are continuous in $\beta$, $\sup_{\alpha \in \mathbb{R}^{k_1}} \hat{Q}_n(\beta, \alpha, \cdot)$ is continuous in $\beta$ in all realizations whenever $\hat{R}_n(\beta, \alpha, 0, \cdot) > 0$ for every $(\beta, \alpha) \in B \times \mathbb{R}^{k_1}$. The continuity of $\sup_{\alpha \in \mathbb{R}^{k_1}} \hat{Q}_n(\beta, \alpha, \cdot)$ in $\beta$ is also satisfied when $\hat{R}_n(\beta, \alpha, 0, \cdot)$ can touch zero, because $\hat{Q}_n(\beta, \alpha, \cdot) = 1$ in that case. Thus, given the compactness of $B$, the ρτ-IV estimator $\hat\beta_n$ of $\beta_0$ exists by a standard result on the existence of extremum estimators such as Gallant and White (1988, Theorem 2.2).
Further, $\hat\alpha_n$ is the solution of
$$\inf_{\alpha \in \mathbb{R}^{k_1}} \hat{R}_n(\hat\beta_n, \alpha, 0, \cdot) = \inf_{\alpha \in \mathbb{R}^{k_1}} n^{-1} \sum_{t=1}^n \rho_\tau\big(y_t - Y_t'\hat\beta_n - Z_{t,1}'\alpha\big).$$
That is, it is Koenker and Bassett's (1978) quantile regression estimator taking $y_t - Y_t'\hat\beta_n$ for the dependent variable and $Z_{t,1}$ for the regressors, which surely exists.

Theorem 2.1: Given Assumption 1, the ρτ-IV estimator exists.

Remark. We could avoid the compactness requirement on $B$ by first defining the ρτ-IV directional estimator, as Sakata (2007) does, and then deriving the slope estimator in Definition 1 from it. We, however, directly define the slope estimator by imposing compactness on $B$ to save space in this paper.

3 Computation of the ρτ-IV Estimator

We could calculate the ρτ-IV estimator by adapting the algorithm described in Sakata (2007) for the case $\tau = 0.5$ in a straightforward manner. Sakata's algorithm is, however, slow if $k_1$ is large, because it uses a global search algorithm to maximize $\hat{Q}_n$ over $B \times \mathbb{R}^{k_1}$. Given $\beta$, however, the ratio on the right-hand side of (2) can be quickly calculated, because the minimization problems appearing in both the numerator and denominator of the ratio can be rewritten as linear programming problems, as Koenker and Bassett (1978) explain. Thus, the ρτ-IV estimator can be calculated by maximizing the ratio in terms of $\beta$ over $B$. Because the ratio may have local maxima, it is advisable to use a global search algorithm such as simulated annealing in calculating $\hat\beta_n$, while $\hat\alpha_n$ is the solution of the minimization problem in the denominator calculated with $\hat\beta_n$.

4 Large Sample Properties of the ρτ-IV Estimator

In investigating the consistency of the ρτ-IV estimator, it is convenient to consider the population counterpart of (2), i.e.,
$$\sup_{\alpha \in \mathbb{R}^{k_1}} Q(\beta, \alpha) = \sup_{\alpha \in \mathbb{R}^{k_1}} \frac{\inf_{\gamma \in \mathbb{R}^k} R(\beta, 0, \gamma)}{R(\beta, \alpha, 0)} = \frac{\inf_{\gamma \in \mathbb{R}^k} R(\beta, 0, \gamma)}{\inf_{\alpha \in \mathbb{R}^{k_1}} R(\beta, \alpha, 0)}, \quad \beta \in B. \quad (3)$$
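The profiling strategy just described can be sketched as follows. Each inner minimization is an ordinary quantile regression, solvable as a linear program, and the slope estimate is found by searching over $\beta$. This is illustrative code of our own (not the authors' implementation), using SciPy's LP solver:

```python
import numpy as np
from scipy.optimize import linprog

def qreg_min_check(tau, y, X):
    """Minimized average check loss min_b n^-1 sum rho_tau(y_t - X_t'b),
    computed via the standard LP formulation of quantile regression."""
    n, p = X.shape
    # Variables: [b+, b-, u+, u-], all nonnegative; y - X(b+ - b-) = u+ - u-.
    c = np.concatenate([np.zeros(2 * p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    return res.fun / n

def Q_hat_profiled(tau, y, Ymat, Z1, Z, beta):
    """Right-hand side of (2): numerator regresses y - Y'beta on all
    instruments Z; denominator regresses it on the included Z1 only."""
    e = y - Ymat @ beta
    return qreg_min_check(tau, e, Z) / qreg_min_check(tau, e, Z1)

# The slope estimate maximizes Q_hat_profiled over B, e.g. by a grid search
# or a global optimizer such as simulated annealing.
```

Since the columns of `Z1` are a subset of those of `Z`, the numerator never exceeds the denominator, so the profiled objective lies in $[0, 1]$ as in (1).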
By Assumption 2, $\beta \mapsto \sup_{\alpha \in \mathbb{R}^{k_1}} Q(\beta, \alpha) : B \to \mathbb{R}$ is a continuous function uniquely maximized at $\beta_0$. We can also show that $\{\sup_{\alpha \in \mathbb{R}^{k_1}} \hat{Q}_n(\beta, \alpha, \cdot)\}_{n \in \mathbb{N}}$ converges to $\sup_{\alpha \in \mathbb{R}^{k_1}} Q(\beta, \alpha)$ uniformly in $\beta \in B$ a.s.-$P$ (Lemma A.3). By a standard result on consistency of extremum estimators (e.g., Pötscher and Prucha 1991, Lemma 4.2), we can establish the consistency of $\{\hat\beta_n\}_{n \in \mathbb{N}}$ for $\beta_0$. The estimator $\hat\alpha_n$, on the other hand, minimizes $\hat{R}_n(\hat\beta_n, \alpha, 0, \cdot)$ with respect to $\alpha$ over $\mathbb{R}^{k_1}$. Given the strong consistency of $\{\hat\beta_n\}$ for $\beta_0$, we can verify the a.s.-$P$ convergence of $\hat{R}_n(\hat\beta_n, \alpha, 0, \cdot)$ to $R(\beta_0, \alpha, 0)$ for each $\alpha \in \mathbb{R}^{k_1}$ and utilize the convexity of $\hat{R}_n(\hat\beta_n, \alpha, 0, \cdot)$ in $\alpha$ to establish the strong consistency of $\hat\alpha_n$ for $\alpha_0$.

Theorem 4.1: Under Assumptions 1 and 2, $\{\hat\theta_n = (\hat\beta_n', \hat\alpha_n')'\}_{n \in \mathbb{N}}$ converges to $\theta_0 = (\beta_0', \alpha_0')'$ a.s.-$P$.

In establishing the asymptotic normality of the ρτ-IV estimator, we impose the additional conditions stated in the next assumption.

Assumption 3: (a) The minimizer of $R(\beta_0, \alpha_0, \cdot) : \mathbb{R}^k \to \mathbb{R}$ over $\mathbb{R}^k$ is unique (hence it is uniquely minimized at the origin). (b) The vector $\beta_0$ is interior to $B$. Also, a neighborhood $B_0 \subset B$ of $\beta_0$, a neighborhood $A_0 \subset \mathbb{R}^{k_1}$ of $\alpha_0$, and a neighborhood $\Gamma_{2,0} \subset \mathbb{R}^{k_2}$ of the origin satisfy the following conditions: (i) The conditional distribution of $y_1$ given $Y_1$ and $Z_1$ has a continuous probability density function (pdf) $f(\cdot \mid Y_1, Z_1)$ at $Y_1'\beta + Z_{1,1}'\alpha + Z_{1,2}'\gamma_2$ for each $(\beta, \alpha, \gamma_2) \in B_0 \times A_0 \times \Gamma_{2,0}$ a.s.-$P$. (ii) There exists a random variable $D : \Omega \to \mathbb{R}$ with a finite absolute moment such that for each $\beta \in B_0$, each $\alpha \in A_0$, and each $\gamma_2 \in \Gamma_{2,0}$,
$$f(Y_1'\beta + Z_{1,1}'\alpha + Z_{1,2}'\gamma_2 \mid Y_1, Z_1)\, \|(Y_1', Z_1')'\|^2 < D. \quad (4)$$
(c) Let $J$ be the Hessian of $R$ at $(\beta_0', \alpha_0', 0_{k \times 1}')'$ and partition it as
$$J \equiv \begin{pmatrix} J_{\beta\beta} & J_{\beta\alpha} & J_{\beta\gamma} \\ J_{\alpha\beta} & J_{\alpha\alpha} & J_{\alpha\gamma} \\ J_{\gamma\beta} & J_{\gamma\alpha} & J_{\gamma\gamma} \end{pmatrix},$$
where $J_{\beta\beta}$ is $g \times g$, $J_{\alpha\alpha}$ is $k_1 \times k_1$, and $J_{\gamma\gamma}$ is $k \times k$. Then the matrix
$$J_{\theta\theta} \equiv \begin{pmatrix} J_{\beta\beta} & J_{\beta\alpha} \\ J_{\alpha\beta} & J_{\alpha\alpha} \end{pmatrix}$$
is positive definite, and $J_{\gamma\theta} \equiv (J_{\gamma\beta}, J_{\gamma\alpha})$ is of full column rank. (d) $E[\|(Y_1', Z_1')'\|^2] < \infty$.

Assumption 3(b) ensures the twice continuous differentiability of $R$ in a neighborhood of $(\beta_0', \alpha_0', 0')'$ in $\mathbb{R}^g \times \mathbb{R}^{k_1} \times \mathbb{R}^k$, which then implies the twice continuous differentiability of $Q$ in a neighborhood of $\theta_0 = (\beta_0', \alpha_0')'$. Under Assumption 3(c), the Hessian $J_{\gamma\gamma}$ of $R(\beta_0, \alpha_0, \cdot) : \mathbb{R}^k \to \mathbb{R}$ at its minimum is positive definite. Under these conditions, the Hessian of $(\beta, \alpha) \mapsto \log Q(\beta, \alpha) : B \times \mathbb{R}^{k_1} \to \mathbb{R}$ at $(\beta_0', \alpha_0')'$ is guaranteed to be negative definite, being equal to $-K$, where
$$K \equiv R(\beta_0, \alpha_0, 0)^{-1} J_{\theta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\theta}.$$
The full column rank of $J_{\gamma\theta}$ means that, within a neighborhood of $(\beta_0', \alpha_0')'$, moving $(\beta', \alpha')'$ away from $(\beta_0', \alpha_0')'$ causes the gradient of $R(\beta, \alpha, \cdot) : \mathbb{R}^k \to \mathbb{R}$ to be bounded away from zero uniformly in all directions, so that we can choose $\gamma$ to make $R(\beta, \alpha, \gamma)$ smaller than $R(\beta, \alpha, 0)$ once $(\beta', \alpha')'$ deviates from $(\beta_0', \alpha_0')'$. Assumption 3(d) ensures that the Lindeberg-Levy central limit theorem (Rao 1973, p. 127) applies to the generalized score of the ρτ-IV estimator. The moment requirements in Assumptions 3(b) and 3(d) are mild. If $f(Y_1'\beta + Z_{1,1}'\alpha + Z_1'\gamma \mid Y_1, Z_1)$ is bounded, they merely require that each element of $Y_1$ and $Z_1$ have a finite second moment, while the asymptotic normality of the conventional IV estimator is typically established under the assumption that the fourth moments of the dependent variable, the regressors, and the instruments are finite.

Lemma 4.2: Suppose that Assumptions 1 and 2 hold. Let $\{b_n\}_{n \in \mathbb{N}}$ and $\{a_n\}_{n \in \mathbb{N}}$ be sequences of $B$- and $\mathbb{R}^{k_1}$-valued random vectors, respectively. Then there exists a sequence of $k \times 1$ random vectors $c_n$ such that for each $n \in \mathbb{N}$, $\hat{R}_n(b_n, a_n, c_n, \cdot) = \inf_{\gamma \in \mathbb{R}^k} \hat{R}_n(b_n, a_n, \gamma, \cdot)$. If, in addition, Assumption 3 holds, and $b_n \to \beta_0$ and $a_n \to \alpha_0$ in probability-$P$, then
$$n^{1/2} c_n = C\, n^{1/2} \begin{pmatrix} b_n - \beta_0 \\ a_n - \alpha_0 \end{pmatrix} + J_{\gamma\gamma}^{-1}\, n^{-1/2} \sum_{t=1}^n (\tau - 1(U_t < 0))\, Z_t + o_P\big( n^{1/2}\|b_n - \beta_0\| + n^{1/2}\|a_n - \alpha_0\| + 1 \big),$$
where $C \equiv -J_{\gamma\gamma}^{-1} J_{\gamma\theta}$.

Using this lemma, we can now approximate $\log \hat{Q}_n$.

Lemma 4.3: Suppose that Assumptions 1-3 hold, and let $\{b_n\}_{n \in \mathbb{N}}$ and $\{a_n\}_{n \in \mathbb{N}}$ be sequences of $B$- and $\mathbb{R}^{k_1}$-valued random vectors, respectively, that converge to $\beta_0$ and $\alpha_0$ in probability-$P$. Write $\theta_n \equiv (b_n', a_n')'$ and $S_n \equiv n^{-1/2} \sum_{t=1}^n (\tau - 1(U_t < 0))\, Z_t$. Then
$$n \log \hat{Q}_n(b_n, a_n, \cdot) = -\frac{S_n' J_{\gamma\gamma}^{-1} S_n}{2 R(\beta_0, \alpha_0, 0)} - \frac{S_n' C\, n^{1/2}(\theta_n - \theta_0)}{R(\beta_0, \alpha_0, 0)} - \frac{1}{2}\, n^{1/2}(\theta_n - \theta_0)'\, K\, n^{1/2}(\theta_n - \theta_0) + o_P\big( (n^{1/2}\|b_n - \beta_0\| + n^{1/2}\|a_n - \alpha_0\| + 1)^2 \big). \quad (5)$$

Given this lemma, it is natural to expect that the maximizer of the second and third terms on the right-hand side of (5) approximates $\hat\theta_n = (\hat\beta_n', \hat\alpha_n')'$. The next theorem confirms that this approximation bears an $o_P(1)$ approximation error and derives the asymptotic distribution of $\{\hat\theta_n\}_{n \in \mathbb{N}}$ based on the approximation.

Theorem 4.4: Suppose that Assumptions 1-3 hold. Then
$$n^{1/2}(\hat\theta_n - \theta_0) = R(\beta_0, \alpha_0, 0)^{-1} K^{-1} C'\, n^{-1/2} \sum_{t=1}^n (1(U_t < 0) - \tau)\, Z_t + o_P(1)$$
and
$$D^{-1/2} n^{1/2} (\hat\theta_n - \theta_0) \xrightarrow{A} N(0, I_{g+k_1}),$$
where $D \equiv K^{-1} C' V C K^{-1}$, $K = R(\beta_0, \alpha_0, 0)^{-1} J_{\theta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\theta}$ (as introduced earlier), and $V \equiv \tau(1-\tau)\, R(\beta_0, \alpha_0, 0)^{-2}\, E[Z_1 Z_1']$.

To estimate the asymptotic covariance matrix $D$ consistently, we need to estimate $V$, $K$, and $C$ consistently. For consistent estimation of $V$, we can use its sample analogue,
$$\hat{V}_n \equiv \tau(1-\tau)\, \hat{R}_n(\hat\beta_n, \hat\alpha_n, 0, \cdot)^{-2}\, n^{-1} \sum_{t=1}^n Z_t Z_t'.$$
The matrices $K$ and $C$ are more complicated, depending on $J$, the Hessian of $R$. The Hessian of $\hat{R}_n$ is zero at each point of $B \times \mathbb{R}^{k_1} \times \mathbb{R}^k$ at which it is differentiable. This rules out estimating $J$ by the Hessian of $\hat{R}_n$. A way to overcome this difficulty is to employ the numerical differentiation approach described in Newey and McFadden (1994, Section 7.3). Because $K$ is $-1$ times the Hessian of $(\beta, \alpha) \mapsto \log Q(\beta, \alpha) : B \times \mathbb{R}^{k_1} \to \mathbb{R}$ at $(\beta_0', \alpha_0')'$, $-1$ times a second-order numerical derivative of $\log \hat{Q}_n(\beta, \alpha, \cdot)$ at $(\hat\beta_n', \hat\alpha_n')'$ is our estimator of $K$. Let $e_i^m$ denote the unit vector along the $i$th axis of the Cartesian coordinate system in $\mathbb{R}^m$, and assume:
Assumption 4: The sequence $\{h_n\}_{n \in \mathbb{N}}$ consists of positive (possibly random) numbers such that $h_n \to 0$ and $n^{1/2} h_n \to \infty$ in probability-$P$.

Then our estimator of $K$ is $\hat{K}_n$, whose $(i,j)$-element is given by
$$\hat{K}_{n,ij} \equiv -\frac{1}{4 h_n^2} \Big( \log \hat{Q}_n(\hat\theta_n + h_n e_i + h_n e_j, \cdot) - \log \hat{Q}_n(\hat\theta_n - h_n e_i + h_n e_j, \cdot) - \log \hat{Q}_n(\hat\theta_n + h_n e_i - h_n e_j, \cdot) + \log \hat{Q}_n(\hat\theta_n - h_n e_i - h_n e_j, \cdot) \Big),$$
$(i, j) \in \{1, \dots, g + k_1\}^2$, $n \in \mathbb{N}$, where $e_i$ abbreviates $e_i^{g+k_1}$.

For $C$, we utilize the result of Lemma 4.2, which suggests that a perturbation in $\hat\theta_n = (\hat\beta_n', \hat\alpha_n')'$ would change $\hat\gamma_n$ approximately by $C$ times the change in $\hat\theta_n$. Let $\hat\gamma_{n,i}(\theta)$ denote the $i$th element of the usual quantile regression estimator in the regression of $y_t - (Y_t', Z_{t,1}')\theta$ on $Z_t$ ($i \in \{1, 2, \dots, k\}$). Then our estimator of $C$ is $\hat{C}_n$, whose $(i,j)$-element is given by
$$\hat{C}_{n,ij} \equiv \frac{1}{2 h_n} \Big( \hat\gamma_{n,i}(\hat\theta_n + h_n e_j) - \hat\gamma_{n,i}(\hat\theta_n - h_n e_j) \Big), \quad i \in \{1, \dots, k\},\ j \in \{1, \dots, g + k_1\}.$$

Given the estimators of $K$ and $C$, we estimate $D$ by
$$\hat{D}_n \equiv \hat{K}_n^+ \hat{C}_n' \hat{V}_n \hat{C}_n \hat{K}_n^+, \quad n \in \mathbb{N},$$
where $\hat{K}_n^+$ is the Moore-Penrose (MP) inverse of $\hat{K}_n$ (we use the MP inverse instead of the regular inverse to ensure that this estimator is well defined in every realization).

Theorem 4.5: Suppose that Assumptions 1-4 hold. Then: (a) $\{\hat{K}_n\}_{n \in \mathbb{N}}$ is weakly consistent for $K$. (b) $\{\hat{C}_n\}_{n \in \mathbb{N}}$ is weakly consistent for $C$. (c) $\{\hat{V}_n\}_{n \in \mathbb{N}}$ is weakly consistent for $V$. (d) $\{\hat{D}_n\}_{n \in \mathbb{N}}$ is weakly consistent for $D$.

Remark. The same step size $h_n$ is used for each element of $\hat{K}_n$ and $\hat{C}_n$ just for simplicity. One could use a different step size for each element of $\hat{K}_n$ and $\hat{C}_n$ without affecting the consistency results in Theorem 4.5, as long as each step size satisfies the requirements in Assumption 4.
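The numerical-differentiation step behind $\hat{K}_n$ is the standard second-order central cross-difference. A minimal sketch of that scheme for a generic smooth objective (our illustrative code, not the authors'):

```python
import numpy as np

def hessian_central(f, theta, h):
    """Second-order central-difference Hessian of f at theta, using the
    same (i, j) cross-difference with step h as the K-hat estimator."""
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    H = np.empty((p, p))
    for i in range(p):
        for j in range(p):
            ei = np.zeros(p); ei[i] = h
            ej = np.zeros(p); ej[j] = h
            H[i, j] = (f(theta + ei + ej) - f(theta - ei + ej)
                       - f(theta + ei - ej) + f(theta - ei - ej)) / (4 * h * h)
    return H
```

Applied to $-\log \hat{Q}_n$ at $\hat\theta_n$, this returns $\hat{K}_n$. Assumption 4 requires the step to shrink, but more slowly than the $n^{-1/2}$ sampling noise, which is why $n^{1/2} h_n \to \infty$ is imposed.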
5 Testing Hypotheses on the Regression Coefficients under Possible Weak Identification

When $(\beta, \alpha) \mapsto \log Q(\beta, \alpha)$ is flat in some directions from $(\beta_0', \alpha_0')'$, compared with the size of the error in approximating $\log Q$ by $\log \hat{Q}_n$, the large-sample distribution of the ρτ-IV estimator established in Section 4 can be unreliable, because the estimator can easily go astray. In other words, we may experience the so-called weak identification problem in the ρτ-IV estimation. The flatness of $(\beta, \alpha) \mapsto \log Q(\beta, \alpha)$ described above implies near singularity of $K$, which is $-1$ times the Hessian of $\log Q(\beta, \alpha)$. Because the large-sample analysis in Section 4 involves the inverse of $K$, near singularity of $K$ makes the results in Section 4 unreliable unless the sample size is extremely large.

To verify that a nearly singular $K$ can arise in practice, suppose that $Y_t$ is related to $Z_t$ through $Y_t = \Pi_0 Z_t + V_t$, $t \in \mathbb{N}$, where $\Pi_0$ is a $g \times k$ constant matrix, and $V_t$ is a $g \times 1$ zero-mean random vector independent of $Z_t$. Let $f_U(\cdot \mid Z_1)$ denote the conditional pdf of $U_1$ given $Z_1$. Then, under our current assumptions, we have that
$$J_{\theta\gamma} = 2 E\left[ f_U(0 \mid Z_1) \begin{pmatrix} Y_1 \\ Z_{1,1} \end{pmatrix} Z_1' \right] = 2 E\left[ f_U(0 \mid Z_1) \begin{pmatrix} \Pi_0 Z_1 \\ Z_{1,1} \end{pmatrix} Z_1' \right].$$
If the last $k_2$ columns of $\Pi_0$ are close to zero, each of the first $g$ rows of $J_{\theta\gamma}$ can be well approximated by a linear combination of the last $k_1$ rows of $J_{\theta\gamma}$; i.e., the rows of $J_{\theta\gamma}$ become nearly linearly dependent. This causes $K = R(\beta_0, \alpha_0, 0)^{-1} J_{\theta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\theta}$ to be nearly singular and raises concerns about inference on $\beta_0$ and $\alpha_0$ that relies on the asymptotics in Section 4.

Suppose that we are interested in the hypothesis $H_0 : \beta_0 = \bar\beta$, where $\bar\beta$ is a known $g \times 1$ constant vector. In the usual IV regression based on the zero conditional mean restriction imposed on the error term, the AR test is known to be robust to weakness of the instruments.
Given the structural equation estimated under the constraint of the null hypothesis, the AR test regresses the null-restricted fitted structural error term on all instruments and checks whether the $R^2$ is close to zero. If the $R^2$ is high enough, it rejects the null hypothesis. Because the AR test rejects the null when $1 - R^2$ is close to zero, we can view the AR test as rejecting the null hypothesis when the null-restricted fitted structural error term can be well explained by the instruments. Note that $1 - R^2$ is equal to the ratio of two sample second moments. The denominator in the ratio
is the sample second moment of the fitted structural error term, while the numerator is the sample second moment of the residuals in the regression of the structural error term on the instruments. This view gives us a way to adapt Anderson and Rubin's (1949) approach to our problem setup. Namely, we replace the sample second moments in $1 - R^2$ with the corresponding average check functions. The resulting statistic is $\hat{Q}_n(\bar\beta, \hat\alpha_n^0, \cdot)$, where $\hat\alpha_n^0$ is the ρτ-IV estimator obtained by imposing the constraint of $H_0$, which is exactly equal to Koenker and Bassett's (1978) estimator in the regression of $y_t - Y_t'\bar\beta$ on $Z_{t,1}$. For convenience, we take the logarithm of it and multiply it by $-2n$ to define a test statistic $J_n$:
$$J_n \equiv -2n \log \hat{Q}_n(\bar\beta, \hat\alpha_n^0, \cdot) = -2n \log \frac{\inf_{\gamma \in \mathbb{R}^k} \hat{R}_n(\bar\beta, 0, \gamma, \cdot)}{\inf_{\alpha \in \mathbb{R}^{k_1}} \hat{R}_n(\bar\beta, \alpha, 0, \cdot)}, \quad n \in \mathbb{N}. \quad (6)$$
Let $\bar\alpha$ be a $k_1 \times 1$ vector such that $Z_{1,1}'\bar\alpha$ is the ρτ-metric projection of $y_1 - Y_1'\bar\beta$ on the linear space spanned by the elements of $Z_{1,1}$. Then the standard large-sample analysis of extremum estimation shows that
$$n^{-1} J_n \to -2 \sup_{\alpha \in \mathbb{R}^{k_1}} \log Q(\bar\beta, \alpha) = -2 \log Q(\bar\beta, \bar\alpha) \quad \text{in probability-}P.$$
Under $H_0$, the right-hand side of this equality is zero, because
$$\sup_{\alpha \in \mathbb{R}^{k_1}} Q(\bar\beta, \alpha) = \sup_{\alpha \in \mathbb{R}^{k_1}} Q(\beta_0, \alpha) = Q(\beta_0, \alpha_0) = 1.$$
Under the alternative, on the other hand, the limit of $\{n^{-1} J_n\}_{n \in \mathbb{N}}$ is strictly positive, because $Q(\bar\beta, \bar\alpha) < Q(\beta_0, \alpha_0) = 1$. Thus, a test based on $J_n$ should reject $H_0$ if $J_n$ exceeds a suitably chosen critical value. We discuss how to find the critical value below. Define $C_0 \equiv -J_{\gamma\gamma}^{-1} J_{\gamma\alpha}$ and $L \equiv R(\beta_0, \alpha_0, 0)^{-1} J_{\gamma\gamma}$.

Lemma 5.1: Suppose that Assumptions 1-3 hold. If in addition $H_0$ is true, then
$$J_n \xrightarrow{A} \eta' \big( L^{-1} - C_0 (C_0' L C_0)^{-1} C_0' \big) \eta,$$
where $\eta$ is a $k \times 1$ random vector distributed $N(0, V)$.

Thus, $\{J_n\}_{n \in \mathbb{N}}$ has a non-degenerate limiting distribution, though it is not asymptotically pivotal. Among the unknown parameters in the formula for the asymptotic distribution of $\{J_n\}_{n \in \mathbb{N}}$, $C_0$ can be consistently
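For concreteness, here is a sketch of how the statistic $J_n$ in (6) might be computed. The code is our own illustration (the LP-based quantile regression stands in for any quantile regression solver):

```python
import numpy as np
from scipy.optimize import linprog

def min_check_loss(tau, y, X):
    """min over b of n^-1 sum rho_tau(y_t - X_t'b), via the quantile
    regression linear program."""
    n, p = X.shape
    c = np.concatenate([np.zeros(2 * p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    return res.fun / n

def J_n(tau, y, Ymat, Z1, Z, beta_bar):
    """AR-type statistic (6): -2n times the log-ratio of restricted check
    losses; all instruments Z in the numerator, included Z1 in denominator."""
    n = len(y)
    e = y - Ymat @ beta_bar              # null-restricted residual
    num = min_check_loss(tau, e, Z)      # project on all instruments
    den = min_check_loss(tau, e, Z1)     # project on included exogenous vars
    return -2.0 * n * np.log(num / den)
```

Because the numerator's regressor set contains the denominator's, the ratio is at most one, so $J_n \ge 0$, with large values counting as evidence against $H_0$.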
estimated by applying $\{\hat{C}_n\}_{n \in \mathbb{N}}$ under the null (Theorem 4.5). Write $\hat\theta_n^0 \equiv (\bar\beta', \hat\alpha_n^{0\prime})'$. Then our estimator of $C_0$ is $\hat{C}_n^0$, whose $(i,j)$-element is equal to
$$\hat{C}^0_{n,ij} \equiv \frac{1}{2 h_n} \Big( \hat\gamma_{n,i}(\hat\theta_n^0 + h_n e_{g+j}) - \hat\gamma_{n,i}(\hat\theta_n^0 - h_n e_{g+j}) \Big), \quad i \in \{1, \dots, k\},\ j \in \{1, \dots, k_1\}.$$
Analogously, $V$ can be estimated by
$$\hat{V}_n^0 \equiv \tau(1-\tau)\, \hat{R}_n(\bar\beta, \hat\alpha_n^0, 0, \cdot)^{-2}\, n^{-1} \sum_{t=1}^n Z_t Z_t'.$$
The matrix $L$ is the Hessian of $\gamma \mapsto \log R(\beta_0, \alpha_0, \gamma) : \mathbb{R}^k \to \mathbb{R}$ at the origin. We take a second-order numerical derivative of the sample counterpart of this function to estimate $L$. The resulting estimator $\hat{L}_n^0$ of $L$ is the $k \times k$ matrix with $(i,j)$-element equal to
$$\hat{L}^0_{n,ij} \equiv \frac{1}{4 h_n^2} \Big( \log \hat{R}_n(\bar\beta, \hat\alpha_n^0, \hat\gamma_n^0 + h_n e_i + h_n e_j, \cdot) - \log \hat{R}_n(\bar\beta, \hat\alpha_n^0, \hat\gamma_n^0 - h_n e_i + h_n e_j, \cdot) - \log \hat{R}_n(\bar\beta, \hat\alpha_n^0, \hat\gamma_n^0 + h_n e_i - h_n e_j, \cdot) + \log \hat{R}_n(\bar\beta, \hat\alpha_n^0, \hat\gamma_n^0 - h_n e_i - h_n e_j, \cdot) \Big),$$
$i, j = 1, 2, \dots, k$, where $\hat\gamma_n^0$ is the τ-quantile regression estimator in the regression of $y_t - Y_t'\bar\beta - Z_{t,1}'\hat\alpha_n^0$ on $Z_t$.

Lemma 5.2: Suppose that Assumptions 1-3 and 4 hold. If in addition $H_0$ holds, $\{\hat{L}_n^0\}_{n \in \mathbb{N}}$ is consistent for $L$.

The limiting distribution of $\{J_n\}_{n \in \mathbb{N}}$ is that of a positive random variable whose distribution function is positively sloped at each positive point. For each $k \times k_1$ matrix $\tilde{C}$, each $k \times k$ symmetric matrix $\tilde{L}$, each $k \times k$ symmetric matrix $\tilde{V}$, and each $p \in (0, 1)$, let $c(p, \tilde{C}, \tilde{L}, \tilde{V})$ denote the $(1-p)$-quantile of $\eta'(\tilde{L}^+ - \tilde{C}(\tilde{C}'\tilde{L}\tilde{C})^+\tilde{C}')\eta$, where $\eta$ is a $k \times 1$ random vector distributed $N(0, \tilde{V})$, and $(a, b)$ denotes the open interval between real numbers $a$ and $b$. We here propose a test that rejects $H_0$ if and only if $J_n$ exceeds $c(p, \hat{C}_n^0, \hat{L}_n^0, \hat{V}_n^0)$, where $p$ is the desired size of the test. This test has correct asymptotic size and is consistent, as stated in the next theorem.

Theorem 5.3: Suppose that Assumptions 1-4 hold. Then: (a) If in addition $H_0$ holds, for each $p \in (0, 1)$, $P[J_n > c(p, \hat{C}_n^0, \hat{L}_n^0, \hat{V}_n^0)] \to p$. (b) Suppose instead that $H_0$ is violated, that $R(\bar\beta, \cdot, 0) : \mathbb{R}^{k_1} \to \mathbb{R}$ has a unique minimizer on $\mathbb{R}^{k_1}$, and that $R(\bar\beta, 0, \cdot) : \mathbb{R}^k \to \mathbb{R}$ has a unique minimizer on $\mathbb{R}^k$.
Then for each $p \in (0, 1)$, $P[J_n > c(p, \hat{C}_n^0, \hat{L}_n^0, \hat{V}_n^0)] \to 1$.
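The critical value $c(p, \tilde{C}, \tilde{L}, \tilde{V})$ can also be approximated by direct Monte Carlo simulation of the limiting quadratic form in normals, a simple alternative to series-expansion algorithms such as Farebrother's (1984). A sketch (our illustrative code, not from the paper):

```python
import numpy as np

def crit_value(p, C, L, V, draws=200_000, seed=0):
    """Approximate the (1 - p)-quantile of eta'(L^+ - C (C'LC)^+ C') eta
    with eta ~ N(0, V), by Monte Carlo simulation."""
    rng = np.random.default_rng(seed)
    k = V.shape[0]
    # Weighting matrix of the limiting quadratic form.
    A = np.linalg.pinv(L) - C @ np.linalg.pinv(C.T @ L @ C) @ C.T
    eta = rng.multivariate_normal(np.zeros(k), V, size=draws)
    q = np.einsum("ij,jk,ik->i", eta, A, eta)
    return np.quantile(q, 1.0 - p)
```

As a sanity check, with $\tilde{L} = \tilde{V} = I_k$ and $\tilde{C} = 0$ the quadratic form reduces to $\eta'\eta \sim \chi^2_k$, so the simulated quantile should match the $\chi^2_k$ quantile.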
Because each quadratic form of normal random variables can easily be rewritten as a linear combination of $\chi^2$ random variables using the eigenvalue decomposition, $c(p, \hat{C}_n^0, \hat{L}_n^0, \hat{V}_n^0)$ is the $(1-p)$-quantile of a linear combination of $\chi^2$ random variables. To compute $c(p, \hat{C}_n^0, \hat{L}_n^0, \hat{V}_n^0)$, we can numerically find the $(1-p)$-quantile of the distribution of the linear combination, evaluating the distribution function by Farebrother's (1984) algorithm.

6 Power under Weak Instruments

According to Theorem 5.3(b), the test proposed in the previous section is consistent in the regular asymptotic framework with strong instruments. In this section, we discuss the power properties of the test when the instruments are weak. For this purpose, we need a model describing how weak instruments arise in our problem. Before formalizing the notion of weak instruments in our problem setup, we first review the concept of weak instruments in the conventional IV regression. Staiger and Stock (1997) introduce weak instruments in a thought experiment in which the correlation between the endogenous regressors and the instruments becomes weaker as the sample size grows. More concretely, they relate the $k \times 1$ instrument vector $Z_t$ to the $g \times 1$ endogenous regressor vector $Y_t^{(n)}$ through
$$Y_t^{(n)} = n^{-1/2} \Lambda Z_t + V_t, \quad t \in \{1, 2, \dots, n\},\ n \in \mathbb{N},$$
where $\Lambda$ is a $g \times k$ constant matrix, and $V_t$ is an unobservable $g \times 1$ random vector such that $Z_t$ is exogenous to $V_t$. The superscript $(n)$ in $Y_t^{(n)}$ indicates the dependence of $Y_t^{(n)}$ on $n$. The structural equation in the thought experiment is then
$$y_t^{(n)} = Y_t^{(n)\prime} \beta_0 + Z_{t,1}' \alpha_0 + U_t, \quad t \in \{1, 2, \dots, n\},\ n \in \mathbb{N},$$
where $Z_{t,1}$ is a $k_1 \times 1$ subvector of $Z_t$, and the regression error $U_t$ is orthogonal to $Z_t$. In this setup, Staiger and Stock investigate the asymptotic behavior of tests of the hypothesis $H_0 : \beta_0 = \bar\beta$, where $\bar\beta$ is a known constant vector in $\mathbb{R}^g$. Define $W_t \equiv U_t - V_t'(\bar\beta - \beta_0)$ ($t \in \mathbb{N}$). Then it is straightforward to verify that the null-restricted residual, i.e., the residual evaluated with the coefficients $(\bar\beta', \alpha_0')'$, is equal to
$$y_t^{(n)} - Y_t^{(n)\prime} \bar\beta - Z_{t,1}' \alpha_0 = W_t - Z_t'\, n^{-1/2} \Lambda' (\bar\beta - \beta_0). \quad (7)$$
Because
$$E[Z_t W_t] = 0, \quad (8)$$
it follows that $E[Z_t (y_t^{(n)} - Y_t^{(n)\prime} \bar\beta - Z_{t,1}' \alpha_0)] = -E[Z_t Z_t']\, n^{-1/2} \Lambda' (\bar\beta - \beta_0)$. Thus, the null-restricted residual violates the moment condition underlying the conventional IV estimator, but only at the order of $n^{-1/2}$. This is the essential feature of the setup that Staiger and Stock used to demonstrate that the behavior of the conventional tests of $H_0$ may be very different from what the conventional asymptotic analysis indicates, and why the AR test can be a better choice. Note that, while the fact that the null-restricted residual violates the moment condition at the order of $n^{-1/2}$ hinges on (7) and (8), it does not matter for this what $W_t$ is or where $\Lambda$ comes from. Also, note that there is no natural, universally agreeable reduced-form equation in our setup, unlike in the conventional IV regression setup. In analyzing our test of $H_0$ with weak instruments, we therefore take as our basis (7) and (8), suitably modified to construct an environment with weak instruments in our setup, as found in the next assumption.

Assumption 5: The triangular array $\{X_t^{(n)} \equiv (y_t^{(n)}, Y_t^{(n)\prime}, Z_{t,1}', Z_{t,2}')' : t \in \{1, 2, \dots, n\},\ n \in \mathbb{N}\}$ consists of random vectors on a probability space $(\Omega, \mathcal{F}, P)$, where $y_t^{(n)}$, $Y_t^{(n)}$, $Z_{t,1}$, and $Z_{t,2}$ are $1 \times 1$, $g \times 1$, $k_1 \times 1$, and $k_2 \times 1$, respectively; $\bar\beta$ is a constant vector in $B$, which is a nonempty and compact subset of $\mathbb{R}^g$; and $\tau$ is a known constant in $(0, 1)$. There exist $\beta_0 \in B$, a $g \times k$ matrix $\Lambda$, $\bar\alpha \in \mathbb{R}^{k_1}$, and a sequence of random variables $\{W_t\}_{t \in \mathbb{N}}$ such that
$$y_t^{(n)} - Y_t^{(n)\prime} \bar\beta - Z_{t,1}' \bar\alpha = W_t - Z_t'\, n^{-1/2} \Lambda' (\bar\beta - \beta_0), \quad t \in \{1, 2, \dots, n\},\ n \in \mathbb{N}, \quad (9)$$
and, for each $t \in \mathbb{N}$, the conditional $\tau$-quantile of $W_t$ given $Z_t$ is zero.

In Assumption 5, $\beta_0$ appears as some vector satisfying the required condition rather than as the true coefficient of $Y_t^{(n)}$, because our mathematical results do not depend on what $\beta_0$ is. Of course, our results are most useful when Assumption 5 holds with $\beta_0$ set equal to the true coefficient of $Y_t^{(n)}$.
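The $n^{-1/2}$ decay of the moment violation in (8) is easy to verify numerically: the deterministic part of $E[Z_t(y_t^{(n)} - Y_t^{(n)\prime}\bar\beta - Z_{t,1}'\alpha_0)]$ is $-E[Z_t Z_t']\, n^{-1/2}\Lambda'(\bar\beta - \beta_0)$, so quadrupling $n$ halves its norm. A sketch with illustrative numbers of our own choosing:

```python
import numpy as np

def moment_violation(n, EZZ, Lam, beta_bar, beta0):
    """Deterministic moment violation -E[ZZ'] n^{-1/2} Lambda'(beta_bar - beta0)
    of the null-restricted residual under the weak-instrument drift."""
    return -EZZ @ (Lam.T @ (beta_bar - beta0)) / np.sqrt(n)

EZZ = np.eye(2)                      # assumed E[Z Z'] (illustrative)
Lam = np.array([[1.0, 0.5]])         # g = 1 endogenous regressor, k = 2 instruments
beta_bar, beta0 = np.array([1.0]), np.array([0.0])

v1 = moment_violation(100, EZZ, Lam, beta_bar, beta0)
v2 = moment_violation(400, EZZ, Lam, beta_bar, beta0)
# Quadrupling n halves the violation: ||v1|| / ||v2|| = 2.
```

This local-to-zero drift is exactly what keeps conventional Wald-type tests from being asymptotically valid while leaving AR-type statistics well behaved.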
If $\beta_0 = \beta$, the conditional quantile restriction imposed upon $\{W_t\}_{t \in \mathbb{N}}$ is essentially the same as Assumption 2. The equivalence of the two conditions can be achieved by setting $\bar\alpha = \alpha_0$ and $W_1 = U_1$, in particular when we require that $\{(Z_t', W_t)'\}_{t \in \mathbb{N}}$ is i.i.d., as we will do below. When $\beta_0 \neq \beta$, the assumption implies that the conditional $\tau$-quantile of the null-restricted residual given $Z_1$ is local to zero. In general, the distribution of $W_t$ depends on $\beta - \beta_0$. Assumption 5 is clearly satisfied if $(U_t, V_t)$ is independent of $Z_1$ in the setup of Staiger and Stock (1997) discussed above. The matrix $\Lambda$ captures the strength of the instruments; for example, the instruments are irrelevant when $\Lambda = 0$.

In addition to Assumption 5, we impose the following conditions, similar to Assumption 3:

Assumption 6: a) $E\rho_\tau(W_1 - Z_1'\gamma)$ is uniquely minimized at $\gamma = 0_{k \times 1}$. b) A neighborhood $\Gamma_0 \subset \mathbb{R}^k$ of the origin satisfies the following conditions: i) The conditional distribution of $W_1$ given $Z_1$, denoted $F(\cdot \mid Z_1)$, has a pdf $f(\cdot \mid Z_1)$ at $Z_1'\gamma$ for each $\gamma \in \Gamma_0$ a.s.-$P$. ii) There exists a random variable $D : \Omega \to \mathbb{R}$ with a finite second moment such that for each $\gamma \in \Gamma_0$, $f(Z_1'\gamma \mid Z_1)\,\|Z_1\|^2 < D$ a.s.-$P$. c) $J_{\gamma\gamma} \equiv \partial^2 R(\beta, \bar\alpha, 0_{k \times 1})/\partial\gamma\,\partial\gamma'$ is positive definite, and $J_{\gamma\alpha} \equiv \partial^2 R(\beta, \bar\alpha, 0_{k \times 1})/\partial\gamma\,\partial\alpha'$ is of full column rank. d) $E[\|Z_1\|^2] < \infty$. e) $\{(W_t, Z_t')' : t = 1, \dots, n\}$ are independent and identically distributed.

The following theorem describes the asymptotic distribution of $J_n$ under fixed alternatives ($\beta - \beta_0$ is a fixed vector) and the weak-IV design of Assumption 5.

Theorem 6.1: Suppose that Assumptions 5 and 6 hold. Then
$$J_n \xrightarrow{A} \big(E\rho_\tau(W_1)\big)^{-1}\big(\eta + J_{\gamma\gamma}\Lambda'(\beta - \beta_0)\big)'\big(J_{\gamma\gamma}^{-1} - C_0(C_0'J_{\gamma\gamma}C_0)^{-1}C_0'\big)\big(\eta + J_{\gamma\gamma}\Lambda'(\beta - \beta_0)\big),$$
where $\eta$ is a $k \times 1$ random vector distributed as $N\big(0, \tau(1-\tau)E[Z_1 Z_1']\big)$.

In the case of weak instruments and under fixed alternatives, the asymptotic distribution of $J_n$ is a noncentral mixed-$\chi^2$ random variable. The power of the test that rejects $H_0 : \beta = \beta_0$ when $J_n > c(\alpha, \hat C_n^0, \hat L_n^0, \hat V_n^0)$
depends on the magnitude of the non-centrality parameter given by
$$(\beta - \beta_0)'\Lambda\big(J_{\gamma\gamma} - J_{\gamma\gamma}C_0(C_0'J_{\gamma\gamma}C_0)^{-1}C_0'J_{\gamma\gamma}\big)\Lambda'(\beta - \beta_0),$$
where $J_{\gamma\gamma} - J_{\gamma\gamma}C_0(C_0'J_{\gamma\gamma}C_0)^{-1}C_0'J_{\gamma\gamma}$ is a positive definite matrix by Assumption 6c). Under $H_0$, $\beta - \beta_0 = 0$, and the test rejects asymptotically with probability $\alpha$. Thus, the test has correct size regardless of the strength of the instruments. Under fixed alternatives, the asymptotic rejection probability depends on the distance between $\beta$ and $\beta_0$ and on the strength of the instruments $\Lambda$. For example, the test has no power when the instruments are irrelevant and $\Lambda = 0$. The test also lacks power in certain directions if $\Lambda \neq 0$ but its rank is less than $g$.

Appendix A Mathematical Proofs

Given Assumption 1, write $\|\xi\|_{\rho_\tau} \equiv E[\rho_\tau(\xi)]$ for each $\xi \in L_1(\Omega, \mathcal{F}, P)$. Then $\|\cdot\|_{\rho_\tau}$ is a pseudo-norm on $L_1(\Omega, \mathcal{F}, P)$. Using $\|\cdot\|_{\rho_\tau}$, $R$ can be written as
$$R(\beta, \alpha, \gamma) = \|y_1 - Y_1'\beta - Z_{1,1}'\alpha - Z_1'\gamma\|_{\rho_\tau}, \quad (\beta', \alpha', \gamma')' \in B \times \mathbb{R}^{k_1} \times \mathbb{R}^k.$$
It follows that the minimization in the numerator of the ratio on the right-hand side of (3) is the $\|\cdot\|_{\rho_\tau}$-metric projection of $y_1 - Y_1'\beta$ on $Z_1$, while the minimization in the denominator is the $\|\cdot\|_{\rho_\tau}$-metric projection of $y_1 - Y_1'\beta$ on $Z_{1,1}$. The norm $\|\cdot\|_{\rho_\tau}$ is closely related to the $L_1$ norm $\|\cdot\|_1$. They actually generate equivalent topologies, because
$$\|\xi\|_{\rho_\tau} \leq \|\xi\|_1 \leq \frac{1}{\min\{\tau, 1-\tau\}}\,\|\xi\|_{\rho_\tau}.$$
An important implication of the equivalence is that $\|\xi\|_{\rho_\tau} = 0$ if and only if $\|\xi\|_1 = 0$. Our analysis uses the equivalence of the two norms, mostly without mentioning it explicitly.

We show below that $\{\sup_{\alpha \in \mathbb{R}^{k_1}} \hat Q_n(\beta, \alpha, \cdot)\}_{n \in \mathbb{N}}$ converges to $\sup_{\alpha \in \mathbb{R}^{k_1}} Q(\beta, \alpha)$ uniformly in $\beta$ on the compact set $B$. We can then conclude that $\{\hat\beta_n\}_{n \in \mathbb{N}}$ is consistent for $\beta_0$, because $\beta_0$ is the unique maximizer of $\beta \mapsto \sup_{\alpha \in \mathbb{R}^{k_1}} Q(\beta, \alpha)$ on $B$.
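The norm inequality relating $\|\cdot\|_{\rho_\tau}$ to the $L_1$ norm is easy to verify numerically, because it holds pointwise: $\rho_\tau(u)$ equals $\tau|u|$ for $u \geq 0$ and $(1-\tau)|u|$ for $u < 0$, so $\min\{\tau, 1-\tau\}|u| \leq \rho_\tau(u) \leq |u|$ for every $u$, and taking expectations gives the displayed bounds. A small NumPy check (the heavy-tailed $t$ sample is an arbitrary choice of ours):

```python
import numpy as np

def rho_tau(u, tau):
    """Koenker-Bassett check function: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

rng = np.random.default_rng(1)
xi = rng.standard_t(df=3, size=100_000)  # arbitrary heavy-tailed sample

for tau in (0.1, 0.25, 0.5, 0.9):
    norm_rho = np.mean(rho_tau(xi, tau))  # sample analogue of ||xi||_rho_tau
    norm_l1 = np.mean(np.abs(xi))         # sample analogue of ||xi||_1
    # ||xi||_rho_tau <= ||xi||_1 <= ||xi||_rho_tau / min(tau, 1 - tau)
    assert norm_rho <= norm_l1 + 1e-12
    assert norm_l1 <= norm_rho / min(tau, 1 - tau) + 1e-12
```

Because the inequality is pointwise, the sample analogues satisfy it for every sample, not just in the limit.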
Once the consistency of $\hat\beta_n$ is established, we can also prove that $\{\hat\alpha_n\}_{n \in \mathbb{N}}$ converges a.s.-$P$ to $\alpha_0$, at which $R(\beta_0, \cdot, 0) : \mathbb{R}^{k_1} \to \mathbb{R}$ is minimized, by utilizing the convexity of $\hat R_n(\hat\beta_n, \alpha, 0, \cdot)$ in $\alpha$ and the pointwise convergence of $\{\hat R_n(\hat\beta_n, \alpha, 0, \cdot)\}_{n \in \mathbb{N}}$ to $R(\beta_0, \alpha, 0)$ for each $\alpha$.

We first establish a few lemmas. For later convenience, some lemmas have more generality than we need for proving Theorem 4.1. The generality will be useful in our proof of
Lemma A.1: Suppose that Assumption 1 holds. Then for each $\beta \in B$,
$$\inf_{\gamma \in \mathbb{R}^k} \hat R_n(\beta, 0, \gamma, \cdot) - \inf_{\gamma \in \mathbb{R}^k} R(\beta, 0, \gamma) \to 0 \quad \text{a.s.-}P,$$
and
$$\inf_{\alpha \in \mathbb{R}^{k_1}} \hat R_n(\beta, \alpha, 0, \cdot) - \inf_{\alpha \in \mathbb{R}^{k_1}} R(\beta, \alpha, 0) \to 0 \quad \text{a.s.-}P.$$

Proof of Lemma A.1: The two convergence results can be proved in similar manners; we only prove the first one. Let $\beta$ be an arbitrary point in $B$. The $\|\cdot\|_{\rho_\tau}$-metric projection of $y_1 - Y_1'\beta$ on the linear subspace spanned by $Z_1$ exists and is in general a compact set. By the linear independence of $Z_1$ (Assumption 1), this further means that $\Gamma_1 \equiv \arg\min_{\gamma \in \mathbb{R}^k} R(\beta, 0, \gamma)$ is compact. It follows that there exists a closed ball $\Gamma_2$ containing $\Gamma_1$ in its interior. Now fix a point $\gamma_1$ in $\Gamma_1$. By the Kolmogorov law of large numbers (Rao 1973, p. 115), $\{\hat R_n(\beta, 0, \gamma_1, \cdot)\}_{n \in \mathbb{N}}$ converges to $R(\beta, 0, \gamma_1)$ a.s.-$P$. Also, by Jennrich's uniform law of large numbers (Jennrich 1969, Theorem 2), $\{\hat R_n(\beta, 0, \gamma, \cdot) - R(\beta, 0, \gamma)\}_{n \in \mathbb{N}}$ converges to zero uniformly in $\gamma$ on the boundary $\partial\Gamma_2$ of $\Gamma_2$ a.s.-$P$. Because $\hat R_n(\beta, 0, \gamma, \cdot)$ is convex as a function of $\gamma$, and
$$R(\beta, 0, \gamma_1) < \inf_{\gamma \in \partial\Gamma_2} R(\beta, 0, \gamma),$$
it follows from the above-mentioned facts that
$$\hat R_n(\beta, 0, \gamma_1, \cdot) < \inf_{\gamma \in \mathbb{R}^k \setminus \Gamma_2} \hat R_n(\beta, 0, \gamma, \cdot)$$
for almost all $n \in \mathbb{N}$ a.s.-$P$. On the other hand, by Jennrich's uniform law of large numbers, $\{\hat R_n(\beta, 0, \gamma, \cdot) - R(\beta, 0, \gamma)\}_{n \in \mathbb{N}}$ converges to zero uniformly in $\gamma \in \Gamma_2$ a.s.-$P$, so that
$$\inf_{\gamma \in \Gamma_2} \hat R_n(\beta, 0, \gamma, \cdot) \to R(\beta, 0, \gamma_1) \quad \text{a.s.-}P.$$
The desired result therefore follows.

Lemma A.2: Suppose that Assumption 1 holds. Then
$$\sup_{\beta \in B} \Big| \inf_{\gamma \in \mathbb{R}^k} \hat R_n(\beta, 0, \gamma, \cdot) - \inf_{\gamma \in \mathbb{R}^k} R(\beta, 0, \gamma) \Big| \to 0 \quad \text{a.s.-}P,$$
and
$$\sup_{\beta \in B} \Big| \inf_{\alpha \in \mathbb{R}^{k_1}} \hat R_n(\beta, \alpha, 0, \cdot) - \inf_{\alpha \in \mathbb{R}^{k_1}} R(\beta, \alpha, 0) \Big| \to 0 \quad \text{a.s.-}P.$$
Proof of Lemma A.2: We only prove the first convergence result, as the second one can be shown in an analogous manner. Because Lemma A.1 has shown the corresponding pointwise a.s. convergence, and $B$ is compact, it suffices to show that the sequence in question is strongly stochastically equicontinuous (Andrews 1992, Theorem 2). Let $\beta_1$ and $\beta_2$ be arbitrary points in $B$. Also, let $g_{nj}$ be Koenker and Bassett's (1978) estimator in the $\tau$-quantile regression of $y_t - Y_t'\beta_j$ on $Z_t$, i.e., $\hat R_n(\beta_j, 0, g_{nj}, \cdot) = \inf_{\gamma \in \mathbb{R}^k} \hat R_n(\beta_j, 0, \gamma, \cdot)$, for $j = 1, 2$. Then we have that for each $n \in \mathbb{N}$,
$$\hat R_n(\beta_1, 0, g_{n1}, \cdot) - \hat R_n(\beta_2, 0, g_{n2}, \cdot) = \big(\hat R_n(\beta_1, 0, g_{n1}, \cdot) - \hat R_n(\beta_1, 0, g_{n2}, \cdot)\big) + \big(\hat R_n(\beta_1, 0, g_{n2}, \cdot) - \hat R_n(\beta_2, 0, g_{n2}, \cdot)\big) \leq \hat R_n(\beta_1, 0, g_{n2}, \cdot) - \hat R_n(\beta_2, 0, g_{n2}, \cdot),$$
where the inequality holds because $\hat R_n(\beta_1, 0, g_{n1}, \cdot) \leq \hat R_n(\beta_1, 0, g_{n2}, \cdot)$ for each $n \in \mathbb{N}$. We further have that
$$\hat R_n(\beta_1, 0, g_{n2}, \cdot) - \hat R_n(\beta_2, 0, g_{n2}, \cdot) = n^{-1}\sum_{t=1}^n \big(\rho_\tau(y_t - Y_t'\beta_1 - Z_t'g_{n2}) - \rho_\tau(y_t - Y_t'\beta_2 - Z_t'g_{n2})\big) \leq n^{-1}\sum_{t=1}^n |Y_t'\beta_1 - Y_t'\beta_2| \leq \|\beta_1 - \beta_2\|\, n^{-1}\sum_{t=1}^n \|Y_t\|.$$
It follows that for each $n \in \mathbb{N}$,
$$\hat R_n(\beta_1, 0, g_{n1}, \cdot) - \hat R_n(\beta_2, 0, g_{n2}, \cdot) \leq \|\beta_1 - \beta_2\|\, n^{-1}\sum_{t=1}^n \|Y_t\|.$$
Analogously, we can also show that for each $n \in \mathbb{N}$,
$$\hat R_n(\beta_2, 0, g_{n2}, \cdot) - \hat R_n(\beta_1, 0, g_{n1}, \cdot) \leq \|\beta_1 - \beta_2\|\, n^{-1}\sum_{t=1}^n \|Y_t\|.$$
Thus, it holds that for each $n \in \mathbb{N}$,
$$\Big| \inf_{\gamma \in \mathbb{R}^k} \hat R_n(\beta_2, 0, \gamma, \cdot) - \inf_{\gamma \in \mathbb{R}^k} \hat R_n(\beta_1, 0, \gamma, \cdot) \Big| = \big| \hat R_n(\beta_2, 0, g_{n2}, \cdot) - \hat R_n(\beta_1, 0, g_{n1}, \cdot) \big| \leq \|\beta_1 - \beta_2\|\, n^{-1}\sum_{t=1}^n \|Y_t\|.$$
Because $\{n^{-1}\sum_{t=1}^n \|Y_t\|\}_{n \in \mathbb{N}}$ converges to $E[\|Y_1\|]$ a.s.-$P$ by the Kolmogorov strong law of large numbers, the desired result follows by Andrews (1992, Lemma 2).
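The Lipschitz bound driving the proof of Lemma A.2 — the concentrated objective moves by at most $\|\beta_1 - \beta_2\| \cdot n^{-1}\sum_t \|Y_t\|$ — can be illustrated numerically. The sketch below is our own toy design, not the paper's; the minimization over $\gamma$ is done on a fixed grid, which preserves the bound exactly because the pointwise Lipschitz inequality for $\rho_\tau$ (slopes $\tau$ and $\tau - 1$, both at most $1$ in absolute value) holds at every grid point.

```python
import numpy as np

rng = np.random.default_rng(2)
n, tau = 500, 0.5
Z = rng.normal(size=n)
Y = 0.5 * Z + rng.normal(size=n)            # toy endogenous regressor
y = 1.0 * Y + rng.standard_t(df=4, size=n)  # toy outcome

def rho_tau(u):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def conc_obj(beta, grid):
    """Grid analogue of inf_gamma R_n(beta, 0, gamma):
    minimize mean rho_tau(y - Y*beta - Z*gamma) over the gamma grid."""
    resid = y - Y * beta
    return min(np.mean(rho_tau(resid - Z * g)) for g in grid)

grid = np.linspace(-5.0, 5.0, 2001)
b1, b2 = 0.8, 1.3
gap = abs(conc_obj(b1, grid) - conc_obj(b2, grid))
bound = abs(b1 - b2) * np.mean(np.abs(Y))   # Lipschitz bound from Lemma A.2
assert gap <= bound + 1e-12
```

Any pair of $\beta$ values satisfies the same bound, which is what delivers the stochastic equicontinuity used in the proof.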
Lemma A.3: Suppose that Assumptions 1 and 2 hold. Then $\{\inf_{\alpha \in \mathbb{R}^{k_1}} \hat Q_n(\beta, \alpha, \cdot)\}_{n \in \mathbb{N}}$ converges to $\inf_{\alpha \in \mathbb{R}^{k_1}} Q(\beta, \alpha)$ uniformly in $\beta \in B$ a.s.-$P$.

Proof of Lemma A.3: The linear independence of the elements of $X_1 = (y_1, Y_1', Z_1')'$ in Assumption 1 implies that for each $\beta \in B$, the distance between $y_1 - Y_1'\beta$ and its $\|\cdot\|_{\rho_\tau}$-metric projection on the subspace spanned by $Z_{1,1}$ is positive, i.e., $\inf_{\alpha \in \mathbb{R}^{k_1}} R(\beta, \alpha, 0) > 0$. Because $\beta \mapsto \inf_{\alpha \in \mathbb{R}^{k_1}} R(\beta, \alpha, 0) : B \to \mathbb{R}$ is continuous, it is bounded away from zero on $B$. The desired result follows from this fact and Lemma A.2, because $(r_1, r_2) \mapsto r_1/r_2 : \mathbb{R} \times (a, \infty) \to \mathbb{R}$ is a Lipschitz function if $a > 0$.

Lemma A.4: Suppose that Assumptions 1 and 2 hold. Let $\{b_n\}_{n \in \mathbb{N}}$ be a sequence of $B$-valued random vectors on $(\Omega, \mathcal{F}, P)$ converging to $\beta_0$ a.s.-$P$ (in probability-$P$). Let $\{a_n\}_{n \in \mathbb{N}}$ be a sequence of $k_1 \times 1$ random vectors on $(\Omega, \mathcal{F}, P)$ satisfying that for each $n \in \mathbb{N}$, $\hat R_n(b_n, a_n, 0, \cdot) = \inf_{\alpha \in \mathbb{R}^{k_1}} \hat R_n(b_n, \alpha, 0, \cdot)$. Then:
a) $a_n \to \alpha_0$ a.s.-$P$ (in probability-$P$).
b) Let $\{c_n\}_{n \in \mathbb{N}}$ be a sequence of $k \times 1$ random vectors on $(\Omega, \mathcal{F}, P)$ satisfying that for each $n \in \mathbb{N}$, $\hat R_n(b_n, a_n, c_n, \cdot) = \inf_{\gamma \in \mathbb{R}^k} \hat R_n(b_n, a_n, \gamma, \cdot)$. Then $c_n \to 0$ a.s.-$P$ (in probability-$P$), provided that the minimizer of $R(\beta_0, \alpha_0, \cdot) : \mathbb{R}^k \to \mathbb{R}$ over $\mathbb{R}^k$ is unique.

Proof of Lemma A.4: We only prove the result for $\{a_n\}_{n \in \mathbb{N}}$; the result for $\{c_n\}_{n \in \mathbb{N}}$ can be established in an analogous way. Suppose that $b_n \to \beta_0$ a.s.-$P$. Then for each $\alpha \in \mathbb{R}^{k_1}$, $\{\hat R_n(b_n, \alpha, 0, \cdot)\}_{n \in \mathbb{N}}$ converges to $R(\beta_0, \alpha, 0)$ a.s.-$P$, because for each $\alpha \in \mathbb{R}^{k_1}$, $\{\hat R_n(\beta, \alpha, 0, \cdot)\}_{n \in \mathbb{N}}$ converges to $R(\beta, \alpha, 0)$ uniformly in $\beta \in B$ by Jennrich's uniform law of large numbers (Jennrich 1969, Theorem 2). Further, we can apply Rockafellar (1970, Theorem 10.8) to show that the convergence is uniform in $\alpha$ over any compact subset of $\mathbb{R}^{k_1}$, because for each $n \in \mathbb{N}$, $\hat R_n(b_n, \alpha, 0, \cdot)$ is convex in $\alpha$ over $\mathbb{R}^{k_1}$. Take an arbitrary compact subset $A_1$ of $\mathbb{R}^{k_1}$ that contains $\alpha_0$ in its interior.
Then $\{\hat R_n(b_n, \alpha_0, 0, \cdot)\}$ converges to $R(\beta_0, \alpha_0, 0)$ a.s.-$P$; $\{\hat R_n(b_n, \alpha, 0, \cdot)\}$ converges to $R(\beta_0, \alpha, 0)$ uniformly in $\alpha \in A_1$ a.s.-$P$; and $R(\beta_0, \alpha_0, 0) < \inf_{\alpha \in \partial A_1} R(\beta_0, \alpha, 0)$, because $\alpha_0$ is the unique minimizer of $R(\beta_0, \cdot, 0)$ on $\mathbb{R}^{k_1}$ by Assumption 2. Because $\hat R_n(b_n, \alpha, 0, \cdot)$ is convex in $\alpha$, it follows that
$$\hat R_n(b_n, \alpha_0, 0, \cdot) < \inf_{\alpha \in \mathbb{R}^{k_1} \setminus A_1} \hat R_n(b_n, \alpha, 0, \cdot)$$
for almost all $n \in \mathbb{N}$ a.s.-$P$. That is, $a_n \in A_1$ for almost all $n \in \mathbb{N}$ a.s.-$P$. Because $A_1$ is an arbitrary compact subset containing $\alpha_0$ in its interior, this establishes the a.s.-$P$ convergence of $\{a_n\}_{n \in \mathbb{N}}$ to $\alpha_0$. The convergence of $\{a_n\}_{n \in \mathbb{N}}$ in probability in the current lemma immediately follows from the a.s. convergence of $\{a_n\}_{n \in \mathbb{N}}$ by the subsequence theorem.

Proof of Theorem 4.1: By Assumption 2, $\beta \mapsto \sup_{\alpha \in \mathbb{R}^{k_1}} Q(\beta, \alpha) : B \to \mathbb{R}$ is uniquely maximized at $\beta_0$. Because $\hat\beta_n$ maximizes $\sup_{\alpha \in \mathbb{R}^{k_1}} \hat Q_n(\beta, \alpha, \cdot)$ with respect to $\beta$ over the compact subset $B$, and $\{\sup_{\alpha \in \mathbb{R}^{k_1}} \hat Q_n(\beta, \alpha, \cdot)\}_{n \in \mathbb{N}}$ converges to $\sup_{\alpha \in \mathbb{R}^{k_1}} Q(\beta, \alpha)$ uniformly in $\beta \in B$ a.s.-$P$, it follows by Pötscher and Prucha (1991, Lemma 4.2) that $\{\hat\beta_n\}_{n \in \mathbb{N}}$ converges to $\beta_0$ a.s.-$P$. Further, applying Lemma A.4a) with $b_n = \hat\beta_n$ and $a_n = \hat\alpha_n$ establishes the strong consistency of $\hat\alpha_n$ for $\alpha_0$. The result therefore follows.

In proving Lemmas 4.2 and 4.3 and Theorem 4.4, we use the following lemma.

Lemma A.5: Suppose that Assumptions 1–3 hold, and let $\{d_{nj} \equiv (b_{nj}', a_{nj}', g_{nj}')' : \Omega \to B \times \mathbb{R}^{k_1} \times \mathbb{R}^k\}_{n \in \mathbb{N}}$, $j = 1, 2$, be sequences of random vectors that converge in probability-$P$ to $d_0 \equiv (\beta_0', \alpha_0', 0_{1 \times k})'$. Then
$$\hat R_n(d_{n2}, \cdot) - \hat R_n(d_{n1}, \cdot) = -n^{-1}\sum_{t=1}^n (\tau - 1(U_t < 0))\, X_t'(d_{n2} - d_{n1}) + \tfrac{1}{2}(d_{n2} - d_0)'J(d_{n2} - d_0) - \tfrac{1}{2}(d_{n1} - d_0)'J(d_{n1} - d_0) + o_P\big(n^{-1/2}\|d_{n2} - d_{n1}\| + \|d_{n1} - d_0\|^2 + \|d_{n2} - d_0\|^2\big), \qquad (10)$$
where $X_t \equiv (Y_t', Z_{t,1}', Z_t')'$, $t \in \mathbb{N}$.

Proof of Lemma A.5: Define $r : \mathbb{R} \times \mathbb{R}^{g+k_1+k} \times \mathbb{R}^{g+k_1+k} \times \mathbb{R}^{g+k_1+k} \to \mathbb{R}$ by
$$r(y, x, d_1, d_2) \equiv \frac{1}{\|d_2 - d_1\|}\big(\rho_\tau(y - x'd_2) - \rho_\tau(y - x'd_1) + (\tau - 1(y - x'd_0 < 0))\, x'(d_2 - d_1)\big),$$
$(y, x, d_1, d_2) \in \mathbb{R} \times \mathbb{R}^{g+k_1+k} \times \mathbb{R}^{g+k_1+k} \times \mathbb{R}^{g+k_1+k}$, with the rule that division by zero yields zero. Also, following Pollard (1985), let $\nu_n$ denote the standardized sample-average operator such that for each function $f : \mathbb{R} \times \mathbb{R}^{g+k_1+k} \to \mathbb{R}$ with $E[|f(Y_1, X_1)|] < \infty$,
$$\nu_n(f, \cdot) = n^{-1/2}\sum_{t=1}^n \big(f(y_t, X_t) - E[f(Y_1, X_1)]\big), \quad n \in \mathbb{N}.$$
By the definition of $r$, we obtain that
$$\hat R_n(d_2, \cdot) - \hat R_n(d_1, \cdot) = R(d_2) - R(d_1) - \Big(\ell + n^{-1}\sum_{t=1}^n (\tau - 1(U_t < 0)) X_t\Big)'(d_2 - d_1) + n^{-1/2}\|d_2 - d_1\|\,\nu_n r(\cdot, \cdot, d_1, d_2)$$
for each $(d_1, d_2) \in \mathbb{R}^{g+k_1+k} \times \mathbb{R}^{g+k_1+k}$, where $\ell$ is the gradient of $R$ at $(\beta_0', \alpha_0', 0_{1 \times k})'$, which is equal to $-E[(\tau - 1(U_1 < 0)) X_1]$. Taking the second-order Taylor expansion of $R(d_1)$ and $R(d_2)$ about $d_0$ on the right-hand side of this equality and replacing $d_1$ with $d_{n1}$ and $d_2$ with $d_{n2}$ in the resulting equality yields the desired result, provided $\{\nu_n r(\cdot, \cdot, d_{n1}, d_{n2})\}_{n \in \mathbb{N}}$ converges to zero in probability-$P$. It thus suffices to show the convergence of $\{\nu_n r(\cdot, \cdot, d_{n1}, d_{n2})\}$ to zero in probability-$P$.

It is straightforward to verify that $|r(Y_1, X_1, d_1, d_2)| \leq 2\|X_1\|$, from which it follows that
$$E\Big[\sup_{(d_1, d_2) \in \mathbb{R}^{g+k_1+k} \times \mathbb{R}^{g+k_1+k}} r(Y_1, X_1, d_1, d_2)^2\Big] \leq 4E[\|X_1\|^2] < \infty.$$
Also, $\{r(\cdot, \cdot, d_1, d_2) : (d_1, d_2) \in \mathbb{R}^{g+k_1+k} \times \mathbb{R}^{g+k_1+k}\}$ can be expressed as a sum of a fixed number of functions from a polynomial class. These facts imply that $\{\nu_n r(\cdot, \cdot, d_1, d_2)\}_{n \in \mathbb{N}}$ is stochastically equicontinuous at $(d_0, d_0)$ (Pollard 1985). Further, $r(Y_1, X_1, d_1, d_2)^2$ converges to zero as $(d_1, d_2) \to (d_0, d_0)$ a.s.-$P$, and $r(Y_1, X_1, d_1, d_2)^2$ is dominated by $4\|X_1\|^2$, which has a finite mean. It follows by the dominated convergence theorem that $E[r(Y_1, X_1, d_1, d_2)^2] \to 0$ as $(d_1, d_2) \to (d_0, d_0)$. Now let $\{U_n \subset \mathbb{R}^{g+k_1+k} \times \mathbb{R}^{g+k_1+k}\}_{n \in \mathbb{N}}$ be an arbitrary sequence of balls centered at $(d_0', d_0')'$ that shrinks down to $(d_0', d_0')'$. Then, as Pollard (1985, p. 309) explains, it follows from the above-mentioned facts that $\sup_{(d_1, d_2) \in U_n} |\nu_n r(\cdot, \cdot, d_1, d_2)| \to 0$ in probability-$P$. Thus, $\{\nu_n r(\cdot, \cdot, d_{n1}, d_{n2})\}$ converges to zero in probability-$P$, given that $\{d_{nj}\}_{n \in \mathbb{N}}$ converges to $d_0$ in probability-$P$, $j = 1, 2$.

Lemma A.6: Let $(\Omega, \mathcal{F}, P)$ be a probability space.
Suppose that a sequence of random vectors $\{\eta_n : \Omega \to \mathbb{R}^m\}_{n \in \mathbb{N}}$ and a sequence of random variables $\{\xi_n : \Omega \to \mathbb{R}\}_{n \in \mathbb{N}}$ satisfy $\eta_n' A \eta_n + \xi_n \leq 0$ for each $n \in \mathbb{N}$, where $A$ is a positive definite $m \times m$ symmetric matrix. Also, let $\{\zeta_n : \Omega \to \mathbb{R}\}_{n \in \mathbb{N}}$ be a sequence of random variables. Suppose that $\xi_n = o_P(\|\eta_n\| + \|\eta_n\|^2 + \zeta_n)$ as $n \to \infty$. Then $\eta_n = o_P(\zeta_n^{1/2} + 1)$ as $n \to \infty$.

We now prove Lemma 4.2.

Proof of Lemma 4.2: The existence of $\{c_n\}$ follows immediately from the fact that the minimization of $\hat R_n(b_n, a_n, \gamma, \cdot)$ with respect to $\gamma$ is the $\rho_\tau$-metric projection of $(y_1 - Y_1'b_n - Z_{1,1}'a_n,\ y_2 - Y_2'b_n - Z_{2,1}'a_n,\ \dots,\ y_n - Y_n'b_n - Z_{n,1}'a_n)'$ on the space spanned by the rows of $(Z_1, Z_2, \dots, Z_n)$.

To prove the second result, we first show that $\{c_n\}$ converges to $0$ in probability-$P$, and then apply Lemmas A.5 and A.6. For each fixed $\gamma \in \mathbb{R}^k$, $\hat R_n(\beta, \alpha, \gamma, \cdot)$ is convex in $\beta$ and $\alpha$. By the Kolmogorov strong law of large numbers and Hjort and Pollard (1993, Lemma 1), $\{\hat R_n(\beta, \alpha, \gamma, \cdot)\}_{n \in \mathbb{N}}$ converges to $R(\beta, \alpha, \gamma)$ uniformly in $(\beta', \alpha')'$ in each neighborhood of $(\beta_0', \alpha_0')'$ in probability-$P$. Because $\{(b_n', a_n')'\}_{n \in \mathbb{N}}$ converges to $(\beta_0', \alpha_0')'$ in probability-$P$ by assumption, it follows that $\{\hat R_n(b_n, a_n, \gamma, \cdot)\}_{n \in \mathbb{N}}$ converges to $R(\beta_0, \alpha_0, \gamma)$ in probability-$P$ for each $\gamma \in \mathbb{R}^k$. Under Assumptions 1–3c), this fact implies by Hjort and Pollard (1993, Lemma 2) that $\{c_n\}$ converges to $0$ in probability-$P$.

We now set $b_n$ to both $b_{n1}$ and $b_{n2}$, $a_n$ to both $a_{n1}$ and $a_{n2}$, $c_n$ to $g_{n1}$, and
$$g_n \equiv C\begin{pmatrix} b_n - \beta_0 \\ a_n - \alpha_0 \end{pmatrix} + J_{\gamma\gamma}^{-1}\, n^{-1}\sum_{t=1}^n (\tau - 1(U_t < 0)) Z_t$$
to $g_{n2}$ in (10), and multiply the resulting equality by $n$ to obtain that
$$0 \geq n\big(\hat R_n(b_n, a_n, c_n, \cdot) - \hat R_n(b_n, a_n, g_n, \cdot)\big) = \tfrac{1}{2} n^{1/2}(c_n - g_n)' J_{\gamma\gamma}\, n^{1/2}(c_n - g_n) + o_P\big(n^{1/2}\|c_n - g_n\| + n\|b_n - \beta_0\|^2 + n\|a_n - \alpha_0\|^2 + n\|c_n\|^2 + n\|g_n\|^2\big)$$
$$= \tfrac{1}{2} n^{1/2}(c_n - g_n)' J_{\gamma\gamma}\, n^{1/2}(c_n - g_n) + o_P\big(n^{1/2}\|c_n - g_n\| + n\|c_n - g_n\|^2 + n\|b_n - \beta_0\|^2 + n\|a_n - \alpha_0\|^2 + 1\big),$$
where the second equality holds because $g_n = O_P(\|b_n - \beta_0\| + \|a_n - \alpha_0\| + n^{-1/2})$ and $c_n = O_P(\|c_n - g_n\| + \|g_n\|)$. The result follows from this inequality by Lemma A.6.
Proof of Lemma 4.3: Let $\{c_n\}_{n \in \mathbb{N}}$ be as in Lemma 4.2. Note that the difference between $\{\hat R_n(b_n, a_n, c_n, \cdot)\}_{n \in \mathbb{N}}$ and $\{\hat R_n(b_n, a_n, 0, \cdot)\}_{n \in \mathbb{N}}$ converges to zero in probability-$P$. Applying the delta method with this fact, we obtain that
$$n \log \hat Q_n(b_n, a_n, \cdot) = n\big(\log \hat R_n(b_n, a_n, c_n, \cdot) - \log \hat R_n(b_n, a_n, 0, \cdot)\big) = \frac{1}{R(\beta_0, \alpha_0, 0)}\, n\big(\hat R_n(b_n, a_n, c_n, \cdot) - \hat R_n(b_n, a_n, 0, \cdot)\big) \qquad (11)$$
$$- \frac{1}{2R(\beta_0, \alpha_0, 0)^2}\big[n^{1/2}\big(\hat R_n(b_n, a_n, c_n, \cdot) - R(\beta_0, \alpha_0, 0)\big)\big]^2 + \frac{1}{2R(\beta_0, \alpha_0, 0)^2}\big[n^{1/2}\big(\hat R_n(b_n, a_n, 0, \cdot) - R(\beta_0, \alpha_0, 0)\big)\big]^2 + o_P\big(n(\hat R_n(b_n, a_n, c_n, \cdot) - R(\beta_0, \alpha_0, 0))^2 + n(\hat R_n(b_n, a_n, 0, \cdot) - R(\beta_0, \alpha_0, 0))^2\big).$$
We apply Lemma A.5 to each of the non-remainder terms on the right-hand side of this equality:
$$n\big(\hat R_n(b_n, a_n, c_n, \cdot) - \hat R_n(b_n, a_n, 0, \cdot)\big) = \tfrac{1}{2} n^{1/2} c_n' J_{\gamma\gamma}\, n^{1/2} c_n + o_P\big(n^{1/2}\|c_n\| + n\|b_n - \beta_0\|^2 + n\|a_n - \alpha_0\|^2 + n\|c_n\|^2\big) = \tfrac{1}{2} n^{1/2} c_n' J_{\gamma\gamma}\, n^{1/2} c_n + o_P\big(n^{1/2}\|b_n - \beta_0\| + n^{1/2}\|a_n - \alpha_0\| + n\|b_n - \beta_0\|^2 + n\|a_n - \alpha_0\|^2 + 1\big),$$
and, writing $\theta_n \equiv (b_n', a_n')'$,
$$n^{1/2}\big(\hat R_n(b_n, a_n, c_n, \cdot) - R(\beta_0, \alpha_0, 0)\big) = n^{1/2}\big(\hat R_n(b_n, a_n, c_n, \cdot) - \hat R_n(\beta_0, \alpha_0, 0, \cdot)\big) + n^{1/2}\big(\hat R_n(\beta_0, \alpha_0, 0, \cdot) - R(\beta_0, \alpha_0, 0)\big) = -\Big(n^{-1}\sum_{t=1}^n (\tau - 1(U_t < 0))(Y_t', Z_{t,1}')'\Big)' n^{1/2}(\theta_n - \theta_0) + n^{1/2}\big(\hat R_n(\beta_0, \alpha_0, 0, \cdot) - R(\beta_0, \alpha_0, 0)\big) + o_P\big(n^{1/2}\|b_n - \beta_0\| + n^{1/2}\|a_n - \alpha_0\| + n\|b_n - \beta_0\|^2 + n\|a_n - \alpha_0\|^2 + 1\big),$$
$$n^{1/2}\big(\hat R_n(b_n, a_n, 0, \cdot) - R(\beta_0, \alpha_0, 0)\big) = n^{1/2}\big(\hat R_n(b_n, a_n, 0, \cdot) - \hat R_n(\beta_0, \alpha_0, 0, \cdot)\big) + n^{1/2}\big(\hat R_n(\beta_0, \alpha_0, 0, \cdot) - R(\beta_0, \alpha_0, 0)\big) = -\Big(n^{-1}\sum_{t=1}^n (\tau - 1(U_t < 0))(Y_t', Z_{t,1}')'\Big)' n^{1/2}(\theta_n - \theta_0) + n^{1/2}\big(\hat R_n(\beta_0, \alpha_0, 0, \cdot) - R(\beta_0, \alpha_0, 0)\big) + o_P\big(n^{1/2}\|b_n - \beta_0\| + n^{1/2}\|a_n - \alpha_0\| + n\|b_n - \beta_0\|^2 + n\|a_n - \alpha_0\|^2 + 1\big).$$
Substituting these into (11) and applying Lemma 4.2 yields the desired result.

Proof of Theorem 4.4: Let
$$\tilde\theta_n \equiv \theta_0 + \frac{1}{R(\beta_0, \alpha_0, 0)} K^{-1} C'\, n^{-1}\sum_{t=1}^n (\tau - 1(U_t < 0)) Z_t, \quad n \in \mathbb{N},$$
and let $\tilde b_n$ and $\tilde a_n$ denote the vectors containing the first $g$ elements and the remaining elements of $\tilde\theta_n$, respectively. Then, by Lemma 4.3, we have that
$$0 \leq n \log \hat Q_n(\hat\beta_n, \hat\alpha_n, \cdot) - n \log \hat Q_n(\tilde b_n, \tilde a_n, \cdot) = -\tfrac{1}{2} n^{1/2}(\hat\theta_n - \tilde\theta_n)' K\, n^{1/2}(\hat\theta_n - \tilde\theta_n) + o_P\big(n^{1/2}\|\hat\beta_n - \beta_0\| + n^{1/2}\|\hat\alpha_n - \alpha_0\| + n\|\hat\beta_n - \beta_0\|^2 + n\|\hat\alpha_n - \alpha_0\|^2 + 1\big).$$
The first result follows from this equality by Lemma A.6. For the second result, apply the central limit theorem (CLT) for i.i.d. random vectors (Rao 1973, p. 128) to show that $\{n^{-1/2}\sum_{t=1}^n (\tau - 1(U_t < 0)) Z_t\}_{n \in \mathbb{N}}$ is asymptotically distributed as $N(0, R(\beta_0, \alpha_0, 0)^2 V)$, and then apply the continuous mapping theorem.

Proof of Theorem 4.5: To prove a), let $\{\tilde\theta_n\}_{n \in \mathbb{N}}$ be as in the proof of Theorem 4.4 and $\{\delta_n\}_{n \in \mathbb{N}}$ an arbitrary sequence of $(g + k_1) \times 1$ random vectors that converges to the origin in probability-$P$. Recall that the expression consisting of the second and third terms on the right-hand side of (5) is minimized when $(b_n', a_n')' = \tilde\theta_n$, and that $\{n^{1/2}(\hat\theta_n - \tilde\theta_n)\}_{n \in \mathbb{N}}$ converges to zero in probability-$P$ by Theorem 4.4. Using these facts with Lemma 4.3, we can show that $n \log \hat Q_n(\hat\theta_n, \cdot) - n \log \hat Q_n(\tilde\theta_n, \cdot) = o_P(1)$ and
$$n \log \hat Q_n(\hat\theta_n + \delta_n, \cdot) - n \log \hat Q_n(\tilde\theta_n, \cdot) = -\tfrac{1}{2} n^{1/2}(\hat\theta_n - \tilde\theta_n + \delta_n)' K\, n^{1/2}(\hat\theta_n - \tilde\theta_n + \delta_n) + o_P\big(n^{1/2}\|\delta_n\| + n\|\delta_n\|^2 + 1\big) = -\tfrac{1}{2} n^{1/2} \delta_n' K\, n^{1/2} \delta_n + o_P\big(n^{1/2}\|\delta_n\| + n\|\delta_n\|^2 + 1\big). \qquad (12)$$
By taking each of $\tau_n e_i + \tau_n e_j$, $-\tau_n e_i + \tau_n e_j$, $\tau_n e_i - \tau_n e_j$, and $-\tau_n e_i - \tau_n e_j$ for $\delta_n$ in this equality and using the resulting equalities in the definition of $\hat K_{nij}$ ($i, j = 1, 2, \dots, g + k_1$), we obtain that
$$4 n \tau_n^2 \hat K_{nij} = 4 \tau_n^2 n K_{ij} + o_P\big(n^{1/2}\tau_n + n\tau_n^2 + 1\big).$$
Dividing both sides of this equality by $4 n \tau_n^2$ and applying Assumption 4 yields the desired result.

To prove b), let $\delta$ be an arbitrary $(g + k_1) \times 1$ vector. By Lemma 4.2, we have that for each $i = 1, 2, \dots, k$,
$$\hat\gamma_{ni}(\hat\theta_n + \tau_n \delta) - \hat\gamma_{ni}(\hat\theta_n - \tau_n \delta) = 2\tau_n (C\delta)_i + o_P(\tau_n).$$
It follows that
$$\frac{1}{2\tau_n}\big(\hat\gamma_n(\hat\theta_n + \tau_n \delta) - \hat\gamma_n(\hat\theta_n - \tau_n \delta)\big) = C\delta + o_P(1).$$
Taking $e_j$ for $\delta$ for each $j = 1, 2, \dots, g + k_1$ in this equality completes the proof.
ROBUST CONFIDENCE SETS IN THE PRESENCE OF WEAK INSTRUMENTS By Anna Mikusheva 1, MIT, Department of Economics Abstract This paper considers instrumental variable regression with a single endogenous variable
More informationLecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016
Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 1 Entropy Since this course is about entropy maximization,
More informationNonconcave Penalized Likelihood with A Diverging Number of Parameters
Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized
More informationExogeneity tests and weak-identification
Exogeneity tests and weak-identification Firmin Doko Université de Montréal Jean-Marie Dufour McGill University First version: September 2007 Revised: October 2007 his version: February 2007 Compiled:
More informationUniformity and the delta method
Uniformity and the delta method Maximilian Kasy January, 208 Abstract When are asymptotic approximations using the delta-method uniformly valid? We provide sufficient conditions as well as closely related
More informationComparison of inferential methods in partially identified models in terms of error in coverage probability
Comparison of inferential methods in partially identified models in terms of error in coverage probability Federico A. Bugni Department of Economics Duke University federico.bugni@duke.edu. September 22,
More informationLinear Regression and Its Applications
Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start
More informationTesting for Weak Identification in Possibly Nonlinear Models
Testing for Weak Identification in Possibly Nonlinear Models Atsushi Inoue NCSU Barbara Rossi Duke University December 24, 2010 Abstract In this paper we propose a chi-square test for identification. Our
More informationFinite Sample Performance of A Minimum Distance Estimator Under Weak Instruments
Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments Tak Wai Chau February 20, 2014 Abstract This paper investigates the nite sample performance of a minimum distance estimator
More informationGARCH Models Estimation and Inference
Università di Pavia GARCH Models Estimation and Inference Eduardo Rossi Likelihood function The procedure most often used in estimating θ 0 in ARCH models involves the maximization of a likelihood function
More informationUnderstanding Regressions with Observations Collected at High Frequency over Long Span
Understanding Regressions with Observations Collected at High Frequency over Long Span Yoosoon Chang Department of Economics, Indiana University Joon Y. Park Department of Economics, Indiana University
More informationInstrumental Variables Estimation in Stata
Christopher F Baum 1 Faculty Micro Resource Center Boston College March 2007 1 Thanks to Austin Nichols for the use of his material on weak instruments and Mark Schaffer for helpful comments. The standard
More informationChapter 1. GMM: Basic Concepts
Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating
More informationAnalysis of least absolute deviation
Analysis of least absolute deviation By KANI CHEN Department of Mathematics, HKUST, Kowloon, Hong Kong makchen@ust.hk ZHILIANG YING Department of Statistics, Columbia University, NY, NY, 10027, U.S.A.
More informationDA Freedman Notes on the MLE Fall 2003
DA Freedman Notes on the MLE Fall 2003 The object here is to provide a sketch of the theory of the MLE. Rigorous presentations can be found in the references cited below. Calculus. Let f be a smooth, scalar
More informationSection 9: Generalized method of moments
1 Section 9: Generalized method of moments In this section, we revisit unbiased estimating functions to study a more general framework for estimating parameters. Let X n =(X 1,...,X n ), where the X i
More informationA Robust Test for Weak Instruments in Stata
A Robust Test for Weak Instruments in Stata José Luis Montiel Olea, Carolin Pflueger, and Su Wang 1 First draft: July 2013 This draft: November 2013 Abstract We introduce and describe a Stata routine ivrobust
More informationSpring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM
University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle
More informationThe Influence Function of Semiparametric Estimators
The Influence Function of Semiparametric Estimators Hidehiko Ichimura University of Tokyo Whitney K. Newey MIT July 2015 Revised January 2017 Abstract There are many economic parameters that depend on
More informationFlexible Estimation of Treatment Effect Parameters
Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both
More informationConditional Inference With a Functional Nuisance Parameter
Conditional Inference With a Functional Nuisance Parameter The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published Publisher
More informationPanel Data Models. James L. Powell Department of Economics University of California, Berkeley
Panel Data Models James L. Powell Department of Economics University of California, Berkeley Overview Like Zellner s seemingly unrelated regression models, the dependent and explanatory variables for panel
More informationInference for identifiable parameters in partially identified econometric models
Journal of Statistical Planning and Inference 138 (2008) 2786 2807 www.elsevier.com/locate/jspi Inference for identifiable parameters in partially identified econometric models Joseph P. Romano a,b,, Azeem
More informationNonlinear minimization estimators in the presence of cointegrating relations
Nonlinear minimization estimators in the presence of cointegrating relations Robert M. de Jong January 31, 2000 Abstract In this paper, we consider estimation of a long-run and a short-run parameter jointly
More informationMaximum Likelihood (ML) Estimation
Econometrics 2 Fall 2004 Maximum Likelihood (ML) Estimation Heino Bohn Nielsen 1of32 Outline of the Lecture (1) Introduction. (2) ML estimation defined. (3) ExampleI:Binomialtrials. (4) Example II: Linear
More informationTesting Overidentifying Restrictions with Many Instruments and Heteroskedasticity
Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity John C. Chao, Department of Economics, University of Maryland, chao@econ.umd.edu. Jerry A. Hausman, Department of Economics,
More informationY t = ΦD t + Π 1 Y t Π p Y t p + ε t, D t = deterministic terms
VAR Models and Cointegration The Granger representation theorem links cointegration to error correction models. In a series of important papers and in a marvelous textbook, Soren Johansen firmly roots
More informationStatistics and econometrics
1 / 36 Slides for the course Statistics and econometrics Part 10: Asymptotic hypothesis testing European University Institute Andrea Ichino September 8, 2014 2 / 36 Outline Why do we need large sample
More informationExogeneity tests and weak identification
Cireq, Cirano, Départ. Sc. Economiques Université de Montréal Jean-Marie Dufour Cireq, Cirano, William Dow Professor of Economics Department of Economics Mcgill University June 20, 2008 Main Contributions
More informationEconomics 582 Random Effects Estimation
Economics 582 Random Effects Estimation Eric Zivot May 29, 2013 Random Effects Model Hence, the model can be re-written as = x 0 β + + [x ] = 0 (no endogeneity) [ x ] = = + x 0 β + + [x ] = 0 [ x ] = 0
More informationEmpirical Processes: General Weak Convergence Theory
Empirical Processes: General Weak Convergence Theory Moulinath Banerjee May 18, 2010 1 Extended Weak Convergence The lack of measurability of the empirical process with respect to the sigma-field generated
More informationSubsampling Tests of Parameter Hypotheses and Overidentifying Restrictions with Possible Failure of Identification
Subsampling Tests of Parameter Hypotheses and Overidentifying Restrictions with Possible Failure of Identification Patrik Guggenberger Department of Economics U.C.L.A. Michael Wolf Department of Economics
More informationSplit-Sample Score Tests in Linear Instrumental Variables Regression
Split-Sample Score Tests in Linear Instrumental Variables Regression Saraswata Chaudhuri, Thomas Richardson, James Robins and Eric Zivot Working Paper no. 73 Center for Statistics and the Social Sciences
More informationEconomic modelling and forecasting
Economic modelling and forecasting 2-6 February 2015 Bank of England he generalised method of moments Ole Rummel Adviser, CCBS at the Bank of England ole.rummel@bankofengland.co.uk Outline Classical estimation
More informationThe Numerical Delta Method and Bootstrap
The Numerical Delta Method and Bootstrap Han Hong and Jessie Li Stanford University and UCSC 1 / 41 Motivation Recent developments in econometrics have given empirical researchers access to estimators
More information1 Appendix A: Matrix Algebra
Appendix A: Matrix Algebra. Definitions Matrix A =[ ]=[A] Symmetric matrix: = for all and Diagonal matrix: 6=0if = but =0if 6= Scalar matrix: the diagonal matrix of = Identity matrix: the scalar matrix
More information1 Introduction. Conditional Inference with a Functional Nuisance Parameter. By Isaiah Andrews 1 and Anna Mikusheva 2 Abstract
1 Conditional Inference with a Functional Nuisance Parameter By Isaiah Andrews 1 and Anna Mikusheva 2 Abstract his paper shows that the problem of testing hypotheses in moment condition models without
More informationMeasuring the Sensitivity of Parameter Estimates to Estimation Moments
Measuring the Sensitivity of Parameter Estimates to Estimation Moments Isaiah Andrews MIT and NBER Matthew Gentzkow Stanford and NBER Jesse M. Shapiro Brown and NBER May 2017 Online Appendix Contents 1
More informationA note on profile likelihood for exponential tilt mixture models
Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential
More informationCan we do statistical inference in a non-asymptotic way? 1
Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.
More informationImbens/Wooldridge, Lecture Notes 13, Summer 07 1
Imbens/Wooldridge, Lecture Notes 13, Summer 07 1 What s New in Econometrics NBER, Summer 2007 Lecture 13, Wednesday, Aug 1st, 2.00-3.00pm Weak Instruments and Many Instruments 1. Introduction In recent
More informationQuantile Processes for Semi and Nonparametric Regression
Quantile Processes for Semi and Nonparametric Regression Shih-Kang Chao Department of Statistics Purdue University IMS-APRM 2016 A joint work with Stanislav Volgushev and Guang Cheng Quantile Response
More informationCENTER FOR LAW, ECONOMICS AND ORGANIZATION RESEARCH PAPER SERIES
Maximum Score Estimation of a Nonstationary Binary Choice Model Hyungsik Roger Moon USC Center for Law, Economics & Organization Research Paper No. C3-15 CENTER FOR LAW, ECONOMICS AND ORGANIZATION RESEARCH
More information1 Motivation for Instrumental Variable (IV) Regression
ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data
More informationStatistical Properties of Numerical Derivatives
Statistical Properties of Numerical Derivatives Han Hong, Aprajit Mahajan, and Denis Nekipelov Stanford University and UC Berkeley November 2010 1 / 63 Motivation Introduction Many models have objective
More informationStatistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation
Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider
More informationMissing dependent variables in panel data models
Missing dependent variables in panel data models Jason Abrevaya Abstract This paper considers estimation of a fixed-effects model in which the dependent variable may be missing. For cross-sectional units
More informationQuick Review on Linear Multiple Regression
Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,
More informationQuantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be
Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F
More information