
Instrumental Variable Estimation Based on Mean Absolute Deviation

Shinichi Sakata
University of Michigan, Department of Economics
240 Lorch Hall, 611 Tappan Street, Ann Arbor, MI, U.S.A.

February 4, 2001

We propose a general estimation principle based on the assumption that instrumental variables (IV) do not explain the error term in a structural equation. The estimators based on this principle are independent of the normalization constraint, unlike standard IV estimators such as the two-stage least squares estimator. Using the new principle, we propose the L1 IV estimator, which is an IV estimation counterpart of the least absolute deviation estimator. We investigate the asymptotic properties of this estimator, and propose a consistent estimator of its asymptotic covariance matrix and a consistent specification test based on the L1 IV estimator. We also discuss identifiability in L1 IV estimation.

Keywords: IV estimation, mean absolute deviation, least absolute deviation estimation

The author is grateful to Professor Curt T. McMullen, who kindly provided a sketch of the proof of Lemma 4.1 in personal communication.

1 Introduction

Let Y be a (1+l)×1 random vector and Z a k×1 random vector (l, k ∈ N, the set of all natural numbers). Some random variables in Y and Z may be the same random variables. Suppose that an economic theory predicts that a certain (unknown) linear combination of the random variables in Y is unrelated to Z. We naturally want to know which linear combination it is. A statistical interpretation of the theory's prediction is that, given a prediction accuracy measure, there exists a (1+l)×1 constant vector γ0 such that the best linear predictor of γ0′Y based on Z is zero. The mean square error (MSE) is often employed as the accuracy measure. In terms of the MSE, a linear predictor π′Z of γ0′Y is best if and only if it satisfies the orthogonality condition that E[(γ0′Y − π′Z)Z] = 0. By taking this orthogonality condition as the moment conditions in the generalized method-of-moments (GMM) estimation approach and imposing the normalization restriction that one of the elements of γ be one, the instrumental variable estimators, such as the two-stage least squares (2SLS) estimator, are formed. The IV estimators are widely used in econometric applications.

Nevertheless, the IV estimators have a well-known problem. An IV estimator can deliver very different estimated equations depending on the normalization constraint when k > l (i.e., when the equation is "overidentified"). It is certainly annoying to see a big difference between the equation estimated by setting the first coefficient to one and that estimated by setting the second coefficient to one.

We here propose an alternative approach, which yields estimators without this dependence on the normalization constraint. Choose a dispersion measure for univariate distributions and take the ratio

Q(γ) = inf_{π ∈ R^k} [dispersion of (γ′Y − π′Z) about the origin] / [dispersion of γ′Y about the origin],

where R^k is the k-dimensional real space. This ratio ranges between zero and one. The higher this ratio is, the less γ′Y is related to Z, because a high ratio means that the performance of the linear predictor based on Z is close to that of the predictor constantly equal to zero. If the theory's claim that γ0′Y is unrelated to Z is correct, we have that Q(γ0) = 1. The parameter value γ0 is thus a maximizer of Q. Even if the claim is not correct, the maximizer of Q can be viewed as representing the linear relationship among the variables in Y that is closest to the theory's claim. (Although the letter Y often denotes the vector containing the endogenous variables in the literature, Y may contain both endogenous and exogenous variables in this paper.)
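The normalization dependence of the conventional IV estimators described above is easy to see numerically. The following is a minimal sketch (hypothetical data-generating process and variable names; numpy assumed available) that simulates an overidentified two-variable system and compares the linear relationship implied by 2SLS under the two normalizations; the implied coefficient ratios generally differ in finite samples.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# k = 3 instruments for a single right-hand variable: overidentified (k > l)
Z = rng.normal(size=(n, 3))
u = rng.normal(size=n)                                  # structural error
y2 = Z @ np.array([1.0, 0.5, -0.8]) + 0.6 * u + rng.normal(size=n)
y1 = 2.0 * y2 + u                                       # gamma0 proportional to (1, -2)

def tsls_slope(y, x, Z):
    """2SLS slope coefficient of y on a single regressor x with instruments Z."""
    Pzx = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x)          # first-stage fitted values
    return float(Pzx @ y / (Pzx @ x))

b12 = tsls_slope(y1, y2, Z)   # normalization: coefficient on y1 set to one
b21 = tsls_slope(y2, y1, Z)   # normalization: coefficient on y2 set to one

# Both normalizations target the same line y1 = 2*y2, so 1/b21 "should"
# equal b12; with k > l the two normalizations disagree in any finite sample.
print(b12, 1.0 / b21)
```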

An estimator based on this approach can be defined as the maximizer of the sample analogue of Q. We can form various estimators by taking various dispersion measures in Q. A leading example of a dispersion measure is the standard deviation. The function Q with the standard deviation is

Q_2(γ) ≜ inf_{π ∈ R^k} E[(γ′Y − π′Z)²]^{1/2} / E[(γ′Y)²]^{1/2}, γ ∈ R^{1+l}\{0}, (1)

provided that both Y and Z have finite second moments, where ≜ means "is defined to be". By using the standard results on the best linear predictor in terms of the mean square error (MSE), we can show that

inf_{π ∈ R^k} E[(γ′Y − π′Z)²] = γ′(M_YY − M_YZ M_ZZ^{-1} M_ZY)γ,

where M_YY ≜ E[YY′], M_YZ ≜ E[YZ′], M_ZZ ≜ E[ZZ′], and M_ZY ≜ M_YZ′. We thus have that

Q_2(γ)² = 1 − (γ′ M_YZ M_ZZ^{-1} M_ZY γ) / (γ′ M_YY γ), γ ∈ R^{1+l}\{0}.

The eigenvector corresponding to the minimum eigenvalue of M_YZ M_ZZ^{-1} M_ZY in the metric of M_YY maximizes Q_2, provided that M_YY is nonsingular. (The eigenvalues of M_YZ M_ZZ^{-1} M_ZY in the metric of M_YY are the values of λ that satisfy det{M_YZ M_ZZ^{-1} M_ZY − λ M_YY} = 0.)

The function Q_2 can be viewed as a utility function representing our preferences over the values of γ, which represent the linear relationships among the random variables in Y. Because Q_2 is homogeneous of degree zero, γ is judged to be as good as cγ for each γ ∈ R^{1+l} and each nonzero real number c. This feature of Q_2 is consistent with the view that if a structural equation is obtained by multiplying another structural equation by a scalar constant, the two equations represent the same relationship. To avoid having multiple values of γ that represent the same linear relationship, it is convenient to normalize γ. A commonly employed normalization is to set an element of γ to one.

Let (Y_1, Z_1), …, (Y_n, Z_n) be a random sample drawn from the distribution of (Y, Z). Then the sample analogue of Q_2(γ) is Q̂_2n defined by

Q̂_2n(γ)² ≜ 1 − (γ′ M̂ⁿ_YZ (M̂ⁿ_ZZ)⁺ M̂ⁿ_ZY γ) / (γ′ M̂ⁿ_YY γ),

where M̂ⁿ_YZ ≜ n^{-1} Σ_{t=1}^n Y_t Z_t′, M̂ⁿ_ZY ≜ (M̂ⁿ_YZ)′, M̂ⁿ_ZZ ≜ n^{-1} Σ_{t=1}^n Z_t Z_t′, M̂ⁿ_YY ≜ n^{-1} Σ_{t=1}^n Y_t Y_t′,

and A⁺ denotes the Moore-Penrose (MP) generalized inverse of a matrix A. The estimator based on Q_2 is defined to be the maximizer of Q̂_2n under the normalization restriction that sets one of the elements of γ to one. Interestingly, this estimator is the same as the IV estimator obtained by the canonical correlation approach in Sargan (1958). As Sargan shows, this estimator is also the limited information maximum likelihood (LIML) estimator employing the joint normal distribution for the reduced-form disturbances when all explanatory variables in the structural equation are included in the instrument vector Z. Further, it can be viewed as an estimator based on the least variance ratio principle (see Kmenta (1986, p. 690) and Schmidt (1976, pp. 170–177)), and it coincides with a special case of the GMM estimator with continuously updated weights considered in Hansen, Heaton, and Yaron (1996).

The homogeneity of Q is not limited to the implementation that uses the standard deviation as the dispersion measure. Given any dispersion measure, the dispersion of cU is equal to |c| times the dispersion of U for each random variable U and real constant c. It follows that Q is always homogeneous of degree zero. A reasonably chosen sample analogue of Q inherits this homogeneity property. The estimators generated by our approach are thus normalization-invariant in the sense that two different normalization restrictions result in the same estimated linear relationship. This makes our approach attractive in practice.

Another feature of our approach is that the estimator based on the proposed approach is meaningful even under misspecification. The maximizer γ* of Q represents the relationship among the variables in Y that is least related to Z in terms of Q. In this sense, γ* can be regarded as the most favorable parameter value for the theory's claim. It is thus sensible to estimate γ* even under misspecification.
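The maximizer of Q̂_2n described above can be computed directly as a generalized eigenvector. A minimal numerical sketch (hypothetical data-generating process; numpy and scipy assumed available), using sample moment matrices in place of population ones:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
n = 500
Z = rng.normal(size=(n, 2))
u = rng.normal(size=n)
y2 = Z @ np.array([1.0, -1.0]) + 0.5 * u + rng.normal(size=n)
y1 = 2.0 * y2 + u                       # gamma0 proportional to (1, -2)
Y = np.column_stack([y1, y2])           # (1 + l) columns, here l = 1

M_YY = Y.T @ Y / n
M_YZ = Y.T @ Z / n
M_ZZ = Z.T @ Z / n
A = M_YZ @ np.linalg.solve(M_ZZ, M_YZ.T)    # M_YZ M_ZZ^{-1} M_ZY

# Q2(gamma)^2 = 1 - (gamma' A gamma)/(gamma' M_YY gamma) is maximized by the
# eigenvector of A in the metric of M_YY with the smallest generalized eigenvalue.
eigvals, eigvecs = eigh(A, M_YY)            # eigenvalues in ascending order
gamma = eigvecs[:, 0]
gamma = gamma / gamma[0]                    # normalize the first coefficient to one
print(gamma)                                # close to (1, -2)
```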
The estimator generated by our strategy would be consistent for this γ* under mild conditions.

Given these attractive features of the proposed approach, a natural question arises: what dispersion measure should we use in Q? In choosing a dispersion measure, one may want to take into account the outlier-robustness of the resulting estimator. The sample standard deviation is known to be highly sensitive to outlying values. This means that the estimator generated by taking the standard deviation in Q can be highly sensitive to outlying observations on Y and Z. If this property could cause problems in a project, one can instead use the mean absolute deviation (MAD), which is known to be less sensitive to outlying observations than the standard deviation. We call the resulting estimator the L1 IV estimator, because the MAD about the origin of a random variable is the same as its L1 norm. In the rest of this paper, we

focus on this estimator. If one prefers more outlier-robustness in our estimation strategy, one can employ a high-breakdown-point dispersion measure, such as the S-dispersion measure of Rousseeuw and Yohai (1984) or the τ-dispersion measure of Yohai and Zamar (1988), in place of the MAD, at possibly higher computational cost.

Attempts to robustify the IV estimation method have been made in different ways in the past. Krasker and Welsch (1985) propose the use of reweighted IVs to reduce the effect of outliers in IV estimation. Krasker (1986) instead considers an estimator similar to the conventional 2SLS that uses a bounded-influence estimator in each stage instead of the OLS estimator. Also, Glahe and Hunt (1970), Amemiya (1982), and Powell (1983) investigate two-stage LAD estimators that use the LAD estimator instead of the OLS estimator in both stages or in one stage of 2SLS estimation. The estimators in these attempts inherit the property of the conventional IV estimator that the estimator is not normalization-invariant.

In the context of estimating one or more structural equations in a simultaneous equation system, one may have a reduced-form system. The structural equations can be viewed as imposing constraints on the parameters of the reduced-form system. A natural way to form a robust estimator in this context is to incorporate these constraints in a robust estimation method for the reduced-form system. Prucha and Kelejian's (1984) full information maximum likelihood (FIML) estimator employing the generalized t-distribution for the reduced-form disturbances is an example of this. Krishnakumar and Ronchetti (1997) also take a similar approach, using an estimator of the reduced-form system with a bounded influence function. Because the parameter space for the reduced-form system under the constraints given by the structural equations is invariant to normalization of the structural equations, these estimators are normalization-invariant. A possible drawback of this approach is that we have to specify the reduced-form system even if we are interested in only a single structural equation. Unlike the previously developed estimators mentioned above, the L1 IV estimator is invariant to normalization and requires no reduced-form specification.

In what follows, we analyze the properties of the L1 IV estimator and consider some related topics. After giving a formal definition of the L1 IV estimator and establishing its existence (Section 2), we investigate consistency (Section 3), identifiability (Section 4), and asymptotic normality (Section 5) in L1 IV estimation. We also propose a consistent estimator of the asymptotic covariance matrix of the L1 IV estimator (Section 6), and a consistent specification test based on the L1 IV estimator (Section 7). We conclude the paper with a summary and remarks (Section 8). All formal assumptions are collected in Appendix A, and the proofs of theorems and lemmas are

given in Appendix B.

2 Definition

We assume that the data are a realization of an i.i.d. sequence of q×1 random vectors X_t, t ∈ N, drawn from the distribution of a random vector X (Assumption A.1). X_t is partitioned as X_t ≜ (Y_t′, Z_t′)′, where Y_t is (1+l)×1 and Z_t is k×1, t = 1, 2, … (l, k ∈ N, q = 1 + l + k). X is similarly partitioned as X ≜ (Y′, Z′)′. To ensure that each linear combination of the elements of X has a finite absolute moment, we assume that each element of X has a finite absolute moment (Assumption A.2).

If γ′Y = 0 almost surely (a.s.) for some γ ∈ R^{1+l}\{0}, this value of γ would represent the linear relationship we are trying to estimate, and we could find this exact value of γ from the observed values of Y a.s. This, however, seldom occurs in practice. We assume that the random variables in Y are linearly independent in the sense that for any γ ∈ R^{1+l}\{0}, P[γ′Y = 0] < 1, and that the random variables in Z are linearly independent in the same sense (Assumption A.3).

Let ‖·‖_1 denote the L1 norm (i.e., ‖U‖_1 = E[|U|] for each random variable U). Then the function Q based on the MAD is given by

Q(γ) ≜ inf_{π ∈ R^k} ‖γ′Y − π′Z‖_1 / ‖γ′Y‖_1, γ ∈ R^{1+l}\{0}. (2)

By using basic results about normed linear spaces, we can show:

Lemma 2.1: Suppose that Assumptions A.1 and A.2 hold. Then:
(a) For each γ ∈ R^{1+l} there exists π_γ ∈ R^k such that ‖γ′Y − π_γ′Z‖_1 = inf_{π ∈ R^k} ‖γ′Y − π′Z‖_1.
(b) The mapping γ ↦ inf_{π ∈ R^k} ‖γ′Y − π′Z‖_1 from R^{1+l} to R is continuous.
(c) Q is continuous on R^{1+l}\{0}.

The function Q is homogeneous of degree zero. It follows that

sup_{γ ∈ R^{1+l}\{0}} Q(γ) = sup_{γ ∈ ∂B^{1+l}} Q(γ),

where B^{1+l} is the unit ball centered at the origin in the (1+l)-dimensional Euclidean space and ∂B^{1+l} is its boundary. Because ∂B^{1+l} is compact, it follows by the theorem of the maximum that Q attains its supremum on ∂B^{1+l}. Thus:

Lemma 2.2: Suppose that Assumptions A.1–A.3 hold. Then there exists γ* ∈ R^{1+l}\{0} such that Q(γ*) = sup_{γ ∈ R^{1+l}\{0}} Q(γ).

Our estimation problem is to reveal the γ* of Lemma 2.2. We define our estimator as the maximizer of the sample analogue of Q under a normalization restriction. With no loss of generality, we set the first element of γ to one.

Definition (L1 IV estimator): For each sample size n ∈ N define Q̂_n : (R^{1+l}\{0}) × Ω → R by

Q̂_n(γ, ω) ≜ [ inf_{π ∈ R^k} n^{-1} Σ_{t=1}^n |γ′Y_t(ω) − π′Z_t(ω)| ] / [ n^{-1} Σ_{t=1}^n |γ′Y_t(ω)| ] if (Y_1(ω), …, Y_n(ω)) is of full row rank, and Q̂_n(γ, ω) ≜ 1 otherwise,

for each (γ, ω) ∈ (R^{1+l}\{0}) × Ω, where |·| denotes the Euclidean norm. Let B be a nonempty subset of R^l. If there exists an l×1 random vector β̂_n such that

Q̂_n((1, −β̂_n′)′, ·) = sup_{β ∈ B} Q̂_n((1, −β′)′, ·) a.s.-P,

we call β̂_n the L1 IV estimator associated with parameter space B, or simply the L1 IV estimator.

The (a.s.) continuity of the objective function is crucial in ensuring the existence of this estimator. When (Y_1, …, Y_n) is not of full row rank, the objective function would be discontinuous at some points in the parameter space, due to division by zero, without the "otherwise" part of the definition of Q̂_n. The event that (Y_1, …, Y_n) is not of full row rank is usually rare if the sample size is reasonably large. The "otherwise" part thus has no impact in practice, while it makes our theoretical analysis simpler. To establish the existence of the L1 IV estimator, we use the next lemma.

Lemma 2.3:
(a) For each n ∈ N, the mapping

(γ, xⁿ) ↦ inf_{π ∈ R^k} n^{-1} Σ_{t=1}^n |γ′y_t − π′z_t|

from R^{1+l} × R^{nq} to R is continuous, where xⁿ is partitioned as xⁿ ≜ (x_1′, …, x_n′)′, x_t is further partitioned as x_t ≜ (y_t′, z_t′)′ for t ∈ N, and y_t ∈ R^{1+l} and z_t ∈ R^k for each t ∈ N.
(b) Given Assumption A.1, Q̂_n is measurable-(B^{1+l} ⊗ F)/B¹ for each n ∈ N, and Q̂_n(·, ω) is continuous on R^{1+l}\{0} for each ω ∈ Ω, where B^m denotes the Borel σ-field of R^m and ⊗ denotes the operator for the product σ-field.

It thus follows that (ω, β) ↦ Q̂_n((1, −β′)′, ω) : Ω × B → R is a measurable function such that for any ω ∈ Ω, β ↦ Q̂_n((1, −β′)′, ω) : B → R is continuous. By assuming that B is compact (Assumption A.4), we can apply a standard result on the existence of extremum estimators, such as Gallant and White (1988, Theorem 2.2), to establish the existence of the L1 IV estimator.

Theorem 2.4: Suppose that Assumptions A.1 and A.4 hold. Then there exists an L1 IV estimator β̂_n associated with parameter space B.

3 Consistency

In this section, we establish the consistency of the L1 IV estimator β̂_n in the broad sense that β̂_n tends to be close to the set of maximizers of the function β ↦ Q((1, −β′)′) : B → R when the sample size is large. The key step is the proof of the convergence of {Q̂_n((1, −β′)′, ·)}_{n∈N} to Q((1, −β′)′) uniformly in β ∈ B. We consider the behavior of the numerator and denominator of Q̂_n separately. First we establish the uniform convergence of the average of |γ′Y_t − π′Z_t|.

Lemma 3.1: Suppose that Assumptions A.1 and A.2 hold. Then:
(a) sup_{(γ,π) ∈ (R^{1+l}×R^k)\{(0,0)}} | n^{-1} Σ_{t=1}^n |γ′Y_t − π′Z_t| − ‖γ′Y − π′Z‖_1 | / (|γ| + |π|) → 0 as n → ∞ a.s.-P.
(b) For any nonempty, bounded subset Γ of R^{1+l} × R^k, sup_{(γ,π) ∈ Γ} | n^{-1} Σ_{t=1}^n |γ′Y_t − π′Z_t| − ‖γ′Y − π′Z‖_1 | → 0 as n → ∞ a.s.-P.

We can next show the uniform convergence of the numerator of Q̂_n.

Lemma 3.2: Suppose that Assumptions A.1 and A.2 hold. Then:
(a) sup_{γ ∈ R^{1+l}\{0}} | inf_{π ∈ R^k} n^{-1} Σ_{t=1}^n |γ′Y_t − π′Z_t| − inf_{π ∈ R^k} ‖γ′Y − π′Z‖_1 | / |γ| → 0 as n → ∞ a.s.-P.
(b) For any nonempty, bounded subset A of R^{1+l}\{0}, sup_{γ ∈ A} | inf_{π ∈ R^k} n^{-1} Σ_{t=1}^n |γ′Y_t − π′Z_t| − inf_{π ∈ R^k} ‖γ′Y − π′Z‖_1 | → 0 as n → ∞ a.s.-P.

We now combine the results of Lemmas 3.1(a) and 3.2(a) to obtain the uniform convergence of Q̂_n.

Lemma 3.3: Suppose that Assumptions A.1–A.3 hold. Then

sup_{γ ∈ R^{1+l}\{0}} |Q̂_n(γ, ·) − Q(γ)| → 0 as n → ∞ a.s.-P.

It immediately follows from Lemma 3.3 that

sup_{β ∈ B} |Q̂_n((1, −β′)′, ·) − Q((1, −β′)′)| → 0 as n → ∞ a.s.-P. (3)

Thus, Lemma 4.2 of Pötscher and Prucha (1991) applies to our estimation problem.

Theorem 3.4: Suppose that Assumptions A.1–A.4 hold. Then the L1 IV estimator {β̂_n}_{n∈N} is strongly consistent for B* in the sense that d_2(β̂_n, B*) → 0 as n → ∞ a.s.-P, where

B* ≜ { β* ∈ B : Q((1, −β*′)′) = sup_{β ∈ B} Q((1, −β′)′) },

and d_2 is the Euclidean metric, so that d_2(β, B*) = inf_{β* ∈ B*} |β − β*|.
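To make the definition of the L1 IV estimator and the consistency result concrete, here is a minimal computational sketch for the case l = 1 (hypothetical data-generating process and names; numpy and scipy assumed available). The inner infimum over π is a least absolute deviation fit of γ′Y on Z, solved here as a linear program; the outer maximization over β uses a crude grid search in place of a proper optimizer, which suffices for illustration:

```python
import numpy as np
from scipy.optimize import linprog

def lad_mad(y, Z):
    """inf over pi of mean |y_t - pi'Z_t|, via the standard LP:
    min 1'(u + v)/n  s.t.  Z pi + u - v = y,  u, v >= 0."""
    n, k = Z.shape
    c = np.concatenate([np.zeros(k), np.ones(2 * n) / n])
    A_eq = np.hstack([Z, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * k + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.fun

def Q_hat(beta, Y, Z):
    """Sample analogue of Q at gamma = (1, -beta')'."""
    g = Y[:, 0] - Y[:, 1:] @ np.atleast_1d(beta)
    return lad_mad(g, Z) / np.mean(np.abs(g))

rng = np.random.default_rng(2)
n = 300
Z = rng.normal(size=(n, 2))
u = rng.standard_t(df=3, size=n)        # heavy-tailed structural error
y2 = Z @ np.array([1.0, -1.0]) + 0.5 * u + rng.normal(size=n)
y1 = 2.0 * y2 + u                       # true beta* = 2
Y = np.column_stack([y1, y2])

grid = np.linspace(0.0, 4.0, 81)
beta_hat = grid[np.argmax([Q_hat(b, Y, Z) for b in grid])]
print(beta_hat)
```

Since π = 0 is feasible in the inner problem, Q_hat is bounded by one, mirroring the population ratio.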

4 Identifiable Uniqueness

In establishing the asymptotic normality of the L1 IV estimator in the next section, we assume that B* is a singleton, i.e., that the maximizer of the function β ↦ Q((1, −β′)′) : B → R is unique. In this section, we examine when this uniqueness holds. The uniqueness of the maximizer is sometimes called identifiability in the literature. Nevertheless, the consistency of β̂_n established in Section 3 means that the maximizers of Q((1, −β′)′) on B are identifiable in a sense. So, we here use the term "identifiable uniqueness," or simply uniqueness, instead of identifiability.

In LIML and 2SLS estimation, the order condition that l ≤ k is known to be necessary for uniqueness. Is a similar result available for the L1 IV estimator? To answer this question, we rewrite Q as

Q(γ) = inf_{π ∈ R^k} ‖γ′Y − π′Z‖_1 / ‖γ′Y‖_1 = inf_{π ∈ R^k} ‖ γ′Y/‖γ′Y‖_1 − π′Z ‖_1, γ ∈ R^{1+l}\{0}.

From this expression, we see that Q(γ) is the distance between the unit-length vector γ′Y/‖γ′Y‖_1 and the space spanned by the elements of Z. A maximizer of Q on R^{1+l} thus gives the point on the unit sphere in the linear subspace spanned by Y that is furthest from the linear subspace spanned by Z. The following lemma gives a useful insight into this geometry.

Lemma 4.1: Suppose that (E, ‖·‖) is a finite-dimensional smooth normed linear space (see Beauzamy (1985, Part III, Chapter 1) for the smoothness of normed spaces) and that M and N are linear subspaces of E with dimensions m and n, respectively, such that m > n > 0. Also let S(M) be the unit sphere in M. Then there exists a point x_0 in S(M) such that inf_{ζ ∈ N} ‖x_0 − ζ‖ = ‖x_0‖, so that the distance between x_0 and N is one.

Suppose that

γ′Y − π′Z ≠ 0 a.s.-P for each (γ, π) ∈ (R^{1+l} × R^k)\{0}. (4)

Under Assumption A.2, this implies that ‖γ′Y − π′Z‖_1 is differentiable with respect to γ and π at each (γ, π) ≠ 0. Under this differentiability, the space spanned by Y and Z with norm ‖·‖_1 is a smooth normed space. An implication of Lemma 4.1 is that if (1+l) > k, there exists γ_0 ∈ R^{1+l}\{0} such that the

distance between γ_0′Y/‖γ_0′Y‖_1 and the linear subspace spanned by Z is one. That is, if l ≥ k, there exists γ_0 ∈ R^{1+l}\{0} such that Q(γ_0) = 1. Further, when l > k, Q attains one even if we drop a variable from Y, because the dimension of the linear subspace spanned by the remaining variables in Y is l, which still exceeds k. Note that which variable is dropped from Y does not matter for this result. So, there are nonzero (1+l)×1 vectors γ_0^(1), …, γ_0^(1+l) such that Q(γ_0^(j)) = 1 and the j-th element of γ_0^(j) is zero, j = 1, 2, …, 1+l. Because γ_0^(1) is nonzero, at least one of its second through last elements is nonzero. Let the i-th element of γ_0^(1) be nonzero (i > 1). Then γ_0^(1) and γ_0^(i) are linearly independent. Thus, Q attains one on R^{1+l}\{0} if l ≥ k, and there are two or more linearly independent vectors in R^{1+l} that yield one for Q if l > k.

Condition (4) is of course quite restrictive. It is violated if Y shares one or more variables with Z. It also fails if a variable contained in Y or Z has a discrete distribution. Fortunately, the same result holds without condition (4), as is shown in the next theorem.

Theorem 4.2: Suppose that Assumptions A.1–A.3 hold. Then:
(a) If l ≥ k, then there exists γ_0 in R^{1+l}\{0} such that Q(γ_0) = 1.
(b) If further l > k, then there exist two linearly independent vectors γ_0^(1) and γ_0^(2) in R^{1+l}\{0} that satisfy Q(γ_0^(1)) = Q(γ_0^(2)) = 1.

The uniqueness of the optimum parameter value thus requires that the number of instrumental variables be no less than the number of dependent variables minus one, as in LIML and 2SLS estimation. We now assume, for simplicity, that the elements of Z are linearly independent and that l ≤ k (Assumption A.5), to avoid violation of this "order condition". As in LIML and 2SLS estimation, the condition that l ≤ k does not imply that the optimum parameter value is unique up to scale.

It seems difficult to give an intuitively appealing sufficient condition for uniqueness without imposing further structure on X. The next example demonstrates a situation in which uniqueness holds.

Example 4.1: In addition to Assumptions A.1–A.3 and A.5, suppose that there exists a (1+l)×k constant matrix Π_0 that satisfies the following conditions:
(a) Letting ε ≜ Y − Π_0 Z, for each γ ∈ R^{1+l}\{0}, the conditional median of γ′ε given Z is zero, and the conditional distribution function of γ′ε is strictly increasing at the origin.

(b) The rank of Π_0 is l, so that the dimension of the left null space of Π_0 is one.

Then Q(γ) = 1 if and only if γ is a nonzero vector that belongs to the left null space of Π_0; so, the maximizer of Q on R^{1+l}\{0} is unique up to scalar multiples.

To verify this fact, let γ be an arbitrary vector in R^{1+l}. An implication of condition (a) is that the constant zero minimizes the mean absolute prediction error (MAPE) in predicting γ′ε based on Z, and any predictor different from it with positive probability yields a larger MAPE. Because

γ′Y = γ′(Π_0 Z + ε) = γ′Π_0 Z + γ′ε,

it follows that γ′Π_0 Z is the best-MAPE predictor of γ′Y, and any predictor that is different from γ′Π_0 Z with nonzero probability yields a larger MAPE. We thus have that for any γ ∈ R^{1+l}\{0},

‖γ′Y‖_1 ≥ ‖γ′Y − γ′Π_0 Z‖_1,

where the equality holds if and only if γ′Π_0 Z = 0 a.s.-P. By Assumption A.3, it also holds that γ′Π_0 Z = 0 a.s.-P if and only if γ′Π_0 = 0. The latter condition holds if and only if γ belongs to the left null space of Π_0.

5 Asymptotic Normality

We now investigate the asymptotic distribution of the L1 IV estimator, assuming that Q has a unique maximizer (i.e., B* = {β*}) interior to B (Assumption A.6(a)). By Theorem 4.2, this requires that k ≥ l (Assumption A.5). In investigating the asymptotic distribution of an extremum estimator, one typically uses the stochastic mean value theorem (Jennrich (1969), Lemma 3) to approximate the appropriately standardized estimator by a linear transformation of a random vector that is asymptotically normally distributed. Nevertheless, the absolute value function in the expression for Q̂_n((1, −β′)′, ·) creates non-differentiable points on the parameter space B for typical realizations. This feature makes it impossible to use the stochastic mean value theorem in our problem. An approach alternative to the linearization of the first-order condition is to approximate the objective function by a quadratic function of the parameters in a neighborhood of β* in an appropriate sense. Huber (1967), Pollard (1985), and Pötscher and Prucha (1997, Chapter 9) describe this approach. In the L1 IV estimation, the objective function is Q̂_n((1, −β′)′, ·), or equivalently, log Q̂_n((1, −β′)′, ·). We approximate

log Q̂_n((1, −β′)′, ·) by a quadratic function in our investigation. Note that Q̂_n is not an average of random functions, and that the numerator of Q̂_n is the infimum of an average of random functions. We need to take several steps to reach our goal because of these features of our problem.

In addition to Assumptions A.1–A.6(a), we impose a few mild conditions. For convenience, let V denote the first element of Y and W the second through last elements of Y. Also let ∇ and ∇′ denote the gradient operator and the Jacobian operator, respectively, and let ∇² ≜ ∇∇′.

First, we require that R(β*, ·) have a unique minimizer, denoted π*, where

R(β, π) ≜ ‖V − W′β − Z′π‖_1, (β, π) ∈ R^l × R^k. (5)

Because ‖·‖_1 is convex, R is a convex function on R^l × R^k; so R(β*, ·) : R^k → R is also convex. Our requirement is thus that R(β*, ·) be strictly convex at π* (Assumption A.6(b)). The next lemma gives a convenient result for investigating the behavior of Q around β*.

Lemma 5.1: Suppose that Assumptions A.1–A.5 hold. Then there exists a compact subset Γ̃ of R^k such that π* is interior to Γ̃,

inf_{π ∈ Γ̃} R(β, π) = inf_{π ∈ R^k} R(β, π), β ∈ B, (6)

and

sup_{β ∈ B} | inf_{π ∈ Γ̃} R̂_n(β, π, ·) − inf_{π ∈ R^k} R̂_n(β, π, ·) | = 0 for a.a. n ∈ N a.s.-P, (7)

where

R̂_n(β, π, ω) ≜ n^{-1} Σ_{t=1}^n |V_t(ω) − W_t(ω)′β − Z_t(ω)′π|, (β, π, ω) ∈ R^l × R^k × Ω, n ∈ N,

and a.a. stands for "almost all," so that "for a.a. n ∈ N" means "for all n ∈ N with at most finitely many exceptions."

It follows from this lemma that for each β ∈ B, there exists π_β ∈ Γ̃ such that R(β, π_β) = inf_{π ∈ R^k} R(β, π). Further, we have:

Lemma 5.2: Suppose that Assumptions A.1–A.6 hold. Let {β_j}_{j∈N} be an arbitrary sequence in B converging to β*, Γ̃ a compact subset of R^k as in Lemma 5.1, and {π_j}_{j∈N} a sequence in Γ̃ such that R(β_j, π_j) = inf_{π ∈ R^k} R(β_j, π), j ∈ N. Then {π_j} converges to π*.

This result is used below to establish the quadratic approximation to inf_{π ∈ R^k} R(β, π). Second, we require that R be twice continuously differentiable on an open neighborhood of (β*′, π*′)′ and on an open neighborhood of (β*′, 0′)′ ∈ R^l × R^k (Assumption A.7(a)). Despite the fact that R̂_n(·, ·, ω) is non-differentiable at some points for typical realizations, the differentiability requirement on R is not restrictive. As is easily verified, a sufficient condition for this requirement is that V have a continuous conditional density function f(·|W, Z) given W and Z (with respect to the Lebesgue measure), and that the random variables in W and Z have finite second moments. Under this condition, we have that for each (β′, π′)′ ∈ R^l × R^k,

∇_β R(β, π) = E[(2F(W′β + Z′π | W, Z) − 1) W],
∇_π R(β, π) = E[(2F(W′β + Z′π | W, Z) − 1) Z],
∇²_ββ R(β, π) = 2 E[f(W′β + Z′π | W, Z) W W′],
∇²_βπ R(β, π) = 2 E[f(W′β + Z′π | W, Z) W Z′], (8)

and

∇²_ππ R(β, π) = 2 E[f(W′β + Z′π | W, Z) Z Z′], (9)

where F(·|W, Z) is the conditional distribution function of V given W and Z. Given the differentiability of R, let

l_β ≜ ∇_β R(β*, π*), l_π ≜ ∇_π R(β*, π*), J_ββ ≜ ∇²_ββ R(β*, π*), J_βπ ≜ ∇²_βπ R(β*, π*), J_πβ ≜ J_βπ′, J_ππ ≜ ∇²_ππ R(β*, π*), l_0 ≜ ∇_β R(β*, 0), J_0 ≜ ∇²_ββ R(β*, 0). (10)

Because the differentiable function R(β*, ·) : R^k → R is minimized at π*, l_π is a zero vector. Using this fact with the second-order Taylor series expansion, we obtain that

R(β, π) = R(β*, π*) + l_β′(β − β*) + (1/2)(β − β*)′J_ββ(β − β*) + (β − β*)′J_βπ(π − π*) + (1/2)(π − π*)′J_ππ(π − π*) + o(|β − β*|² + |π − π*|²) as (β, π) → (β*, π*) (11)

and that

R(β, 0) = R(β*, 0) + l_0′(β − β*) + (1/2)(β − β*)′J_0(β − β*) + o(|β − β*|²) as β → β*. (12)

By imposing the condition that J_ππ is positive definite (Assumption A.7(b)), we obtain:

Lemma 5.3: Suppose that Assumptions A.1–A.6 and A.7(a)(b) hold. Then:

inf_{π ∈ R^k} R(β, π) = R(β*, π*) + l_β′(β − β*) + (1/2)(β − β*)′(J_ββ − J_βπ J_ππ^{-1} J_πβ)(β − β*) + o(|β − β*|²) as β → β*.

We now combine Lemma 5.3 with (12) to obtain the quadratic approximation to Q.

Lemma 5.4: Suppose that Assumptions A.1–A.6 and A.7(a)(b) hold. Then:
(a) l_β / R(β*, π*) − l_0 / R(β*, 0) = 0.
(b) log Q((1, −β′)′) = log [ R(β*, π*) / R(β*, 0) ] − (1/2)(β − β*)′K(β − β*) + o(|β − β*|²) as β → β*, where

K ≜ −( (J_ββ − J_βπ J_ππ^{-1} J_πβ) / R(β*, π*) − J_0 / R(β*, 0) − l_β l_β′ / R(β*, π*)² + l_0 l_0′ / R(β*, 0)² ). (13)

Because Q is uniquely maximized at β*, K is positive semi-definite. We here assume that K is positive definite (Assumption A.7(c)), so that the second-order term does not vanish in any direction.

The assumptions made so far guarantee that Q is well approximated by a quadratic function in a neighborhood of β*. We next impose mild conditions to ensure that we can capture the random error in approximating Q by Q̂_n. This requirement is important, because the random error is the source of the randomness of the estimator. Our new requirements are that each element of X have a finite second moment (Assumption A.8), that

P[V − W′β* − Z′π* = 0] = 0, (14)

and that

P[V − W′β* = 0] = 0 (15)

(Assumption A.9). In investigations of asymptotic normality, it is often assumed that the fourth moment of each random variable in X is finite. Our moment condition is substantially weaker. Also note that we do NOT require that (14) and (15) hold for arbitrary values of β and π. If V has a continuous conditional density given W and Z, (14) and (15) obviously hold. Even if V − W′β − Z′π = 0 with positive probability for some β and π, (14) and (15) may still hold.

Lemma 5.5: Suppose that Assumptions A.1–A.9 hold, and let θ* ≜ (β*′, π*′)′ and Θ ≜ B × Γ̃.

(a) Define r : R^q × Θ → R by r(x, θ) ≜ 0 if θ = θ*, and otherwise

r(x, θ) ≜ |θ − θ*|^{-1} ( |v − w′β − z′π| − |v − w′β* − z′π*| + sgn(v − w′β* − z′π*)(w′(β − β*) + z′(π − π*)) ),

where x = (v, w′, z′)′ ∈ R × R^l × R^k, θ = (β′, π′)′ ∈ R^l × R^k, and

sgn(a) = 1 if a > 0; 0 if a = 0; −1 if a < 0.

Also, for each n ∈ N define ζ_n : R^l × R^k × Ω → R by

ζ_n(β, π, ω) ≜ n^{-1} Σ_{t=1}^n { r(X_t(ω), β, π) − E[r(X, β, π)] }, (β, π, ω) ∈ R^l × R^k × Ω, n ∈ N.

Then for each sequence of Euclidean balls {B_n}_{n∈N} in Θ that shrinks down to (β*, π*),

sup_{(β,π) ∈ B_n} |n^{1/2} ζ_n(β, π, ·)| → 0 as n → ∞ prob-P. (16)

(b) Define r_0 : R^q × B → R by r_0(x, β) ≜ 0 if β = β*, and otherwise

r_0(x, β) ≜ |β − β*|^{-1} ( |v − w′β| − |v − w′β*| + sgn(v − w′β*) w′(β − β*) ),

where v and w are the components of x as in part (a). Also, for each n ∈ N define ζ⁰_n : R^l × Ω → R by

ζ⁰_n(β, ω) ≜ n^{-1} Σ_{t=1}^n { r_0(X_t(ω), β) − E[r_0(X, β)] }, (β, ω) ∈ R^l × Ω, n ∈ N.

Then for each sequence of Euclidean balls {B̃_n}_{n∈N} in B that shrinks down to β*,

sup_{β ∈ B̃_n} |n^{1/2} ζ⁰_n(β, ·)| → 0 as n → ∞ prob-P. (17)

An implication of Assumption A.9 is:

Lemma 5.6: Suppose that Assumptions A.1, A.2, and A.9 hold. Then

l_β = E[−sgn(V − W′β* − Z′π*) W], l_π = E[−sgn(V − W′β* − Z′π*) Z], and l_0 = E[−sgn(V − W′β*) W].

Define

ψ_β(x) ≜ −sgn(v − w′β* − z′π*) w − l_β, x = (v, w′, z′)′ ∈ R × R^l × R^k, (18)

ψ_π(x) ≜ −sgn(v − w′β* − z′π*) z − l_π (19)
= −sgn(v − w′β* − z′π*) z, x = (v, w′, z′)′ ∈ R × R^l × R^k, (20)

and

ψ_0(x) ≜ −sgn(v − w′β*) w − l_0, x = (v, w′, z′)′ ∈ R × R^l × R^k. (21)

Then ψ_β(X), ψ_π(X), and ψ_0(X) all have zero mean vectors. By the definition of {ζ_n}_{n∈N} in Lemma 5.5, we have that for each n ∈ N,

R̂_n(β, π, ·) − R̂_n(β*, π*, ·) = {R(β, π) − R(β*, π*)} + n^{-1} Σ_{t=1}^n ψ_β(X_t)′(β − β*) + n^{-1} Σ_{t=1}^n ψ_π(X_t)′(π − π*) + n^{-1/2} |θ − θ*| n^{1/2} ζ_n(β, π, ·). (22)
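The population scores in Lemma 5.6 have obvious sample analogues, which are the kind of building block used later for the covariance matrix estimator of Section 6. A minimal sketch (hypothetical names and data-generating process; numpy assumed), evaluating the sample scores at candidate values b and g for β* and π*; at the true values and with a conditionally symmetric error, all three sample means should be close to zero:

```python
import numpy as np

def sample_scores(V, W, Z, b, g):
    """Sample analogues of l_beta, l_pi, l_0 from Lemma 5.6, e.g.
    l_beta = E[-sgn(V - W'beta - Z'pi) W], estimated by sample means."""
    s = np.sign(V - W @ b - Z @ g)       # sgn(V - W'beta - Z'pi), elementwise
    s0 = np.sign(V - W @ b)              # sgn(V - W'beta)
    l_beta = -(s[:, None] * W).mean(axis=0)
    l_pi = -(s[:, None] * Z).mean(axis=0)
    l_0 = -(s0[:, None] * W).mean(axis=0)
    return l_beta, l_pi, l_0

rng = np.random.default_rng(3)
n, l, k = 400, 1, 2
Z = rng.normal(size=(n, k))
W = rng.normal(size=(n, l))
V = W[:, 0] + rng.normal(size=n)         # median of V - W'beta* is 0 at beta* = 1

lb, lp, l0 = sample_scores(V, W, Z, np.array([1.0]), np.zeros(k))
print(lb, lp, l0)                        # all near zero vectors
```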

We also obtain from the definition of {ζ⁰_n}_{n∈N} that for each n ∈ N,

R̂_n(β, 0, ·) − R̂_n(β*, 0, ·) = {R(β, 0) − R(β*, 0)} + n^{-1} Σ_{t=1}^n ψ_0(X_t)′(β − β*) + n^{-1/2} |β − β*| n^{1/2} ζ⁰_n(β, ·). (23)

We combine (11) with (22) and (12) with (23) to derive:

Lemma 5.7: Suppose that Assumptions A.1–A.9 hold. Then:

(a) For any sequence of random vectors {(b_n, g_n) : Ω → R^l × R^k}_{n∈N} that converges to (β*, π*) prob-P,

R̂_n(b_n, g_n, ·) − R̂_n(β*, π*, ·) = l_β′(b_n − β*) + (1/2)(b_n − β*)′J_ββ(b_n − β*) + (b_n − β*)′J_βπ(g_n − π*) + (1/2)(g_n − π*)′J_ππ(g_n − π*) + n^{-1} Σ_{t=1}^n ψ_β(X_t)′(b_n − β*) + n^{-1} Σ_{t=1}^n ψ_π(X_t)′(g_n − π*) + o_p(|b_n − β*|²) + o_p(|g_n − π*|²) + o_p(n^{-1}) as n → ∞.

(b) For any sequence of random vectors {b_n : Ω → R^l}_{n∈N} that converges to β* a.s.-P,

R̂_n(b_n, 0, ·) − R̂_n(β*, 0, ·) = l_0′(b_n − β*) + (1/2)(b_n − β*)′J_0(b_n − β*) + n^{-1} Σ_{t=1}^n ψ_0(X_t)′(b_n − β*) + o_p(|b_n − β*|²) + o_p(n^{-1}) as n → ∞.

To relate Lemma 5.7 to {inf_{π ∈ R^k} R̂_n(b_n, π, ·)}_{n∈N}, the following result is convenient.

Lemma 5.8: Suppose that Assumptions A.1–A.6 hold. Let Γ be a compact subset of R^k to which π* is interior. Also let {b_n : Ω → B}_{n∈N} be a sequence of l×1 random vectors that converges to β* a.s.-P. Then there exists a sequence of k×1 random vectors {g_n : Ω → Γ}_{n∈N} such that

R̂_n(b_n, g_n, ·) = inf_{π ∈ R^k} R̂_n(b_n, π, ·) for a.a. n ∈ N a.s.-P. (24)

Further, let

g*_n ≜ π* − J_ππ^{-1} ( J_πβ(b_n − β*) + n^{-1} Σ_{t=1}^n ψ_π(X_t) ), n ∈ N.

The sequence $\{(g_n, g_n^*)\}_{n \in \mathbf{N}}$ satisfies
\[ |g_n - g_n^*| = o_p(n^{-1/2}) + o_p(|b_n - \beta^*|) \text{ as } n \to \infty, \tag{25} \]
and
\[ \inf_{\gamma \in \mathbf{R}^k} \hat{R}_n(b_n, \gamma, \cdot) = \hat{R}_n(b_n, g_n^*, \cdot) + o_p(n^{-1}) + o_p(|b_n - \beta^*|^2) \text{ as } n \to \infty. \tag{26} \]
Lemma 5.8 suggests that we can obtain a quadratic approximation to $\inf_{\gamma \in \mathbf{R}^k} \hat{R}_n(b_n, \gamma, \cdot)$ by deriving a quadratic approximation to $\hat{R}_n(b_n, g_n^*, \cdot)$. We decompose the difference between $\hat{R}_n(b_n, g_n^*, \cdot)$ and $R(\beta^*, \gamma^*)$ as
\[ \hat{R}_n(b_n, g_n^*, \cdot) - R(\beta^*, \gamma^*) = (\hat{R}_n(b_n, g_n^*, \cdot) - \hat{R}_n(\beta^*, \gamma^*, \cdot)) + (\hat{R}_n(\beta^*, \gamma^*, \cdot) - R(\beta^*, \gamma^*)), \quad n \in \mathbf{N}. \tag{27} \]
Then we apply Lemma 5.7(a) to the first term on the right-hand side of this equality and substitute the definition of $\{g_n^*\}_{n \in \mathbf{N}}$ into the resulting expression. By combining the result with (26) of Lemma 5.8, we obtain:

Lemma 5.9: Suppose that Assumptions A.1–A.6 hold and let $\{b_n : \Omega \to B\}_{n \in \mathbf{N}}$ be a sequence of $l \times 1$ random vectors that converges to $\beta^*$ a.s.-$P$. Then
\[ \inf_{\gamma \in \mathbf{R}^k} \hat{R}_n(b_n, \gamma, \cdot) - R(\beta^*, \gamma^*) = -\tfrac{1}{2} \Big( n^{-1} \sum_{t=1}^n \psi_\gamma(X_t) \Big)' J_{\gamma\gamma}^{-1} \Big( n^{-1} \sum_{t=1}^n \psi_\gamma(X_t) \Big) + \Big( \ell_\beta^* + n^{-1} \sum_{t=1}^n \{ \psi_\beta(X_t) - J_{\beta\gamma} J_{\gamma\gamma}^{-1} \psi_\gamma(X_t) \} \Big)' (b_n - \beta^*) + \hat{R}_n(\beta^*, \gamma^*, \cdot) - R(\beta^*, \gamma^*) + \tfrac{1}{2}(b_n - \beta^*)' (J_{\beta\beta} - J_{\beta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\beta}) (b_n - \beta^*) + o_p(n^{-1}) + o_p(|b_n - \beta^*|^2) \text{ as } n \to \infty. \tag{28} \]

The objective function in the $L_1$ IV estimation can be written as
\[ \log \hat{Q}_n((1, -b_n')', \cdot) = \log Q((1, -\beta^{*\prime})') + \{ \log \hat{Q}_n((1, -b_n')', \cdot) - \log Q((1, -\beta^{*\prime})') \} = \log Q((1, -\beta^{*\prime})') + \big\{ \log \inf_{\gamma \in \mathbf{R}^k} \hat{R}_n(b_n, \gamma, \cdot) - \log R(\beta^*, \gamma^*) \big\} - \big\{ \log \hat{R}_n(b_n, 0, \cdot) - \log R(\beta^*, 0) \big\}, \quad n \in \mathbf{N}. \tag{29} \]

We now apply the second-order Taylor expansion of the logarithmic function to the second and third terms on the right-hand side of this equality and apply Lemmas 5.7 and 5.9 to obtain:

Lemma 5.10: Suppose that Assumptions A.1–A.6 hold and let $\{b_n : \Omega \to B\}_{n \in \mathbf{N}}$ be a sequence of $l \times 1$ random vectors that converges to $\beta^*$ a.s.-$P$. Then
\[ \log \hat{Q}_n((1, -b_n')', \cdot) = \alpha_n + \delta_n'(b_n - \beta^*) - \tfrac{1}{2}(b_n - \beta^*)' K (b_n - \beta^*) + o_p(n^{-1}) + o_p(|b_n - \beta^*|^2) \text{ as } n \to \infty, \tag{30} \]
where
\[ \alpha_n \equiv \log Q((1, -\beta^{*\prime})') - \frac{1}{2 R(\beta^*, \gamma^*)} \Big( n^{-1} \sum_{t=1}^n \psi_\gamma(X_t) \Big)' J_{\gamma\gamma}^{-1} \Big( n^{-1} \sum_{t=1}^n \psi_\gamma(X_t) \Big) + \frac{1}{R(\beta^*, \gamma^*)} \{ \hat{R}_n(\beta^*, \gamma^*, \cdot) - R(\beta^*, \gamma^*) \} - \frac{1}{R(\beta^*, 0)} \{ \hat{R}_n(\beta^*, 0, \cdot) - R(\beta^*, 0) \} - \frac{1}{2 R(\beta^*, \gamma^*)^2} \{ \hat{R}_n(\beta^*, \gamma^*, \cdot) - R(\beta^*, \gamma^*) \}^2 + \frac{1}{2 R(\beta^*, 0)^2} \{ \hat{R}_n(\beta^*, 0, \cdot) - R(\beta^*, 0) \}^2, \quad n \in \mathbf{N}, \]
\[ \delta_n \equiv n^{-1} \sum_{t=1}^n \Big\{ \frac{\psi_\beta(X_t) - J_{\beta\gamma} J_{\gamma\gamma}^{-1} \psi_\gamma(X_t)}{R(\beta^*, \gamma^*)} - \frac{\psi^0(X_t)}{R(\beta^*, 0)} \Big\} - \frac{\{ \hat{R}_n(\beta^*, \gamma^*, \cdot) - R(\beta^*, \gamma^*) \}\, \ell_\beta^*}{R(\beta^*, \gamma^*)^2} + \frac{\{ \hat{R}_n(\beta^*, 0, \cdot) - R(\beta^*, 0) \}\, \ell^{0*}}{R(\beta^*, 0)^2}, \quad n \in \mathbf{N}, \]
and $K$ is defined by (13).

Ignoring the $o_p(n^{-1}) + o_p(|b_n - \beta^*|^2)$ terms, the right-hand side of (30) is maximized by setting $b_n$ to $\beta^* + K^{-1} \delta_n$. This leads to the conjecture that $\beta^* + K^{-1} \delta_n$ should accurately approximate $\hat{\beta}_n$, and this conjecture is indeed correct.

Lemma 5.11: Suppose that Assumptions A.1–A.6 hold. Then
\[ \hat{\beta}_n = \beta^* + K^{-1} \delta_n + o_p(n^{-1/2}) \text{ as } n \to \infty, \]
where $K$ and $\{\delta_n\}_{n \in \mathbf{N}}$ are as in Lemma 5.10.

By the asymptotic equivalence lemma (Rao (1973), pp. 122–123), it follows from Lemma 5.11 that $\{n^{1/2}(\hat{\beta}_n - \beta^*)\}_{n \in \mathbf{N}}$ has the same asymptotic distribution as $\{n^{1/2} K^{-1} \delta_n\}_{n \in \mathbf{N}}$, provided that the latter converges in

distribution. Let
\[ \eta(x) \equiv \frac{\psi_\beta(x) - J_{\beta\gamma} J_{\gamma\gamma}^{-1} \psi_\gamma(x)}{R(\beta^*, \gamma^*)} - \frac{\psi^0(x)}{R(\beta^*, 0)} - \frac{\{ |v - w'\beta^* - z'\gamma^*| - R(\beta^*, \gamma^*) \}\, \ell_\beta^*}{R(\beta^*, \gamma^*)^2} + \frac{\{ |v - w'\beta^*| - R(\beta^*, 0) \}\, \ell^{0*}}{R(\beta^*, 0)^2}, \quad x = (v, w', z')' \in \mathbf{R} \times \mathbf{R}^l \times \mathbf{R}^k. \tag{31} \]
Then we can easily verify that $E[\eta(X)] = 0$ and $E[\eta(X)'\eta(X)] < \infty$ (see Assumption A.8). Because $\delta_n$ can be written as
\[ \delta_n = n^{-1} \sum_{t=1}^n \eta(X_t), \quad n \in \mathbf{N}, \]
$\delta_n$ is the sample average of zero-mean random vectors with finite second moments. We assume that $\Sigma^* \equiv E[\eta(X)\eta(X)']$ is nonsingular so that the central limit theorem (CLT) for i.i.d. random vectors (Rao (1973), p. 128) applies to $n^{1/2} \delta_n$. The asymptotic distribution of $K^{-1} n^{1/2} \delta_n$ is thus normal, and the next theorem follows.

Theorem 5.12: Suppose that Assumptions A.1–A.10 hold. Then
\[ D^{*-1/2} n^{1/2} (\hat{\beta}_n - \beta^*) \stackrel{A}{\sim} N(0, I_l) \text{ as } n \to \infty, \]
where $D^{*-1/2}$ is the inverse of the square-root matrix of $D^* \equiv K^{-1} \Sigma^* K^{-1}$, and $I_l$ is the $l \times l$ identity matrix.

Suppose that the specification is correct in that for some $\beta_0 \in \mathbf{R}^l$, $Q((1, -\beta_0')') = 1$, and that we take a sufficiently large compact set for $B$ so that $\beta_0 \in B$. Then we have that $(\beta^*, \gamma^*) = (\beta_0, 0)$. This fact makes the asymptotic covariance matrix simpler under correct specification.

Corollary 5.13: Suppose that Assumptions A.1–A.10 hold, and that there exists $\beta_0 \in B$ such that $Q((1, -\beta_0')') = 1$. Then it holds that $\gamma^* = 0$, and
\[ D_0^{-1/2} n^{1/2} (\hat{\beta}_n - \beta_0) \stackrel{A}{\sim} N(0, I_l) \text{ as } n \to \infty, \]
where
\[ D_0 \equiv (J_{\beta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\beta})^{-1} J_{\beta\gamma} J_{\gamma\gamma}^{-1} E[ZZ'] J_{\gamma\gamma}^{-1} J_{\gamma\beta} (J_{\beta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\beta})^{-1}. \]
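As a purely illustrative companion to these asymptotic results, the sample dispersion ratio $\hat{Q}_n$ and its inner minimization over $\gamma$ can be sketched numerically. This is a minimal sketch, not the paper's implementation: the function names `r_hat` and `q_hat`, the Nelder–Mead inner search, and the simulated design in the usage below are all assumptions made here for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def r_hat(beta, gamma, V, W, Z):
    # Sample mean absolute deviation R-hat_n(beta, gamma) = n^{-1} sum |V_t - W_t'beta - Z_t'gamma|.
    return np.mean(np.abs(V - W @ beta - Z @ gamma))

def q_hat(beta, V, W, Z):
    # Q-hat_n(beta): dispersion of the part of the error the IVs cannot explain,
    # relative to the dispersion of the error itself; always in (0, 1].
    k = Z.shape[1]
    inner = minimize(lambda g: r_hat(beta, g, V, W, Z),
                     np.zeros(k), method="Nelder-Mead")
    return inner.fun / r_hat(beta, np.zeros(k), V, W, Z)
```

Maximizing `q_hat` over $\beta$ then yields the $L_1$ IV estimate; note that no normalization constraint on the remaining coefficients enters the ratio.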

The IV estimation might be motivated by a stronger notion of correctness: with the "true" parameter value $\beta_0 \in B$, $V - W'\beta_0$ and $Z$ are statistically independent. The asymptotic covariance matrix is further simplified given the strongly correct specification.

Corollary 5.14: Suppose that Assumptions A.1–A.10 hold, and that there exists $\beta_0 \in B$ such that $U \equiv V - W'\beta_0$ and $Z$ are statistically independent, and the median of $U$ is zero. Also suppose that the distribution of $U$ has a density $f_U$ (with respect to the Lebesgue measure) that is continuous and strictly positive at zero. Then
\[ n^{1/2} (\hat{\beta}_n - \beta_0) \stackrel{A}{\sim} N(0, \Delta_0) \text{ as } n \to \infty, \]
where
\[ \Delta_0 \equiv \frac{1}{4 f_U(0)^2} \big( E[WZ'] \, E[ZZ']^{-1} \, E[ZW'] \big)^{-1}. \]

When $Z = W$, we have that
\[ \hat{R}_n(\beta, \gamma, \cdot) = n^{-1} \sum_{t=1}^n |V_t - W_t'(\beta + \gamma)|. \]
For each realization, the objective function attains one if we take the LAD estimate in regressing $V_t$ on $W_t$ for $\beta$. Thus, the $L_1$ IV estimator is simply the LAD estimator in this special case, and Corollary 5.13 applies because $l = k$ implies that the given specification is correct in our sense. Because we have that $J^0 = J_{\beta\beta} = J_{\beta\gamma} = J_{\gamma\gamma}$ and that $E[WW'] = E[ZZ']$, the asymptotic covariance matrix of $\{n^{1/2}(\hat{\beta}_n - \beta^*)\}_{n \in \mathbf{N}}$ is $J^{0-1} E[WW'] J^{0-1}$, where $J^0$ is the Hessian matrix of $E[|V - W'\beta|]$ with respect to $\beta$. Corollary 5.14 also applies to the LAD estimator under the strongly correct specification. The resulting asymptotic covariance matrix of the LAD estimator (with stochastic regressors) is $(4 f_U(0)^2)^{-1} E[WW']^{-1}$, which takes essentially the same form as that provided by Bassett and Koenker (1978) and Pollard (1991).

6 Estimation of Asymptotic Covariance Matrix

In statistical inference on $\beta^*$, we need to consistently estimate the covariance matrix $D^*$ of Theorem 5.12. This section proposes a consistent estimator for $D^*$. Because $D^* = K^{-1} \Sigma^* K^{-1}$, it suffices to develop consistent

estimators for $\Sigma^*$ and $K$. To see what is involved in the estimation of $\Sigma^*$, we rewrite $\eta$ as
\[ \eta(x) = \frac{-\mathrm{sgn}(v - w'\beta^* - z'\gamma^*)\,w + J_{\beta\gamma} J_{\gamma\gamma}^{-1} \mathrm{sgn}(v - w'\beta^* - z'\gamma^*)\,z}{R(\beta^*, \gamma^*)} - \frac{-\mathrm{sgn}(v - w'\beta^*)\,w}{R(\beta^*, 0)} - \frac{\{ |v - w'\beta^* - z'\gamma^*| - R(\beta^*, \gamma^*) \}\, \ell_\beta^*}{R(\beta^*, \gamma^*)^2} + \frac{\{ |v - w'\beta^*| - R(\beta^*, 0) \}\, \ell^{0*}}{R(\beta^*, 0)^2} = -H^* e(x), \quad x = (v, w', z')' \in \mathbf{R} \times \mathbf{R}^l \times \mathbf{R}^k, \]
where the first equality holds by Lemma 5.4(a), $H^*$ is the $l \times (l + k + l + 2)$ matrix defined by
\[ H^* \equiv \Big( \frac{I_l}{R(\beta^*, \gamma^*)}, \ \frac{C^{*\prime}}{R(\beta^*, \gamma^*)}, \ -\frac{I_l}{R(\beta^*, 0)}, \ \frac{\ell_\beta^*}{R(\beta^*, \gamma^*)^2}, \ -\frac{\ell^{0*}}{R(\beta^*, 0)^2} \Big), \]
$e : \mathbf{R} \times \mathbf{R}^l \times \mathbf{R}^k \to \mathbf{R}^{l+k+l+2}$ is defined by
\[ e(x) \equiv \begin{pmatrix} \mathrm{sgn}(v - w'\beta^* - z'\gamma^*)\,w \\ \mathrm{sgn}(v - w'\beta^* - z'\gamma^*)\,z \\ \mathrm{sgn}(v - w'\beta^*)\,w \\ |v - w'\beta^* - z'\gamma^*| - R(\beta^*, \gamma^*) \\ |v - w'\beta^*| - R(\beta^*, 0) \end{pmatrix}, \quad x = (v, w', z')' \in \mathbf{R} \times \mathbf{R}^l \times \mathbf{R}^k, \]
and
\[ C^* \equiv -J_{\gamma\gamma}^{-1} J_{\gamma\beta}. \]
It follows that $\Sigma^*$ can be written as $\Sigma^* = H^* \Omega^* H^{*\prime}$, where
\[ \Omega^* \equiv E[e(X) e(X)']. \tag{32} \]
We can thus estimate $\Sigma^*$ consistently if we can consistently estimate $R(\beta^*, \gamma^*)$, $R(\beta^*, 0)$, $\ell_\beta^*$, $\ell^{0*}$, $C^*$, and $\Omega^*$. By Lemmas 3.1(b) and 3.2(b), we can easily show that
\[ \inf_{\gamma \in \mathbf{R}^k} \hat{R}_n(\hat{\beta}_n, \gamma, \cdot) \to R(\beta^*, \gamma^*) \text{ as } n \to \infty \text{ a.s.-}P \tag{33} \]
and that
\[ \hat{R}_n(\hat{\beta}_n, 0, \cdot) \to R(\beta^*, 0) \text{ as } n \to \infty \text{ a.s.-}P. \tag{34} \]
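For concreteness, assembling the plug-in counterpart $\hat{H}_n$ of $H^*$ from these ingredients is a mechanical block-matrix construction; a minimal sketch (the name `h_hat` and the argument layout are assumptions of this sketch, not notation from the paper):

```python
import numpy as np

def h_hat(R_full, R_short, C_hat, l_beta, l_zero):
    # Plug-in version of H*: an l x (l + k + l + 2) block matrix
    #   ( I_l/R_full , C_hat'/R_full , -I_l/R_short ,
    #     l_beta/R_full^2 , -l_zero/R_short^2 ),
    # where R_full estimates R(beta*, gamma*) and R_short estimates R(beta*, 0).
    l = C_hat.shape[1]
    return np.hstack([np.eye(l) / R_full,
                      C_hat.T / R_full,
                      -np.eye(l) / R_short,
                      (l_beta / R_full**2).reshape(l, 1),
                      -(l_zero / R_short**2).reshape(l, 1)])
```

The block widths $l$, $k$, $l$, $1$, $1$ match the components of $e(x)$, so `h_hat(...) @ Omega_hat @ h_hat(...).T` gives the plug-in $\hat{\Sigma}_n$.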

We consider how to estimate $\ell_\beta^*$, $\ell^{0*}$, $C^*$, $\Omega^*$, and $K$ in what follows. A natural estimator of $\ell_\beta^*$ is
\[ \hat{\ell}_{\beta,n} \equiv -n^{-1} \sum_{t=1}^n \mathrm{sgn}(V_t - W_t'\hat{\beta}_n - Z_t'\hat{\gamma}_n)\,W_t, \quad n \in \mathbf{N}, \tag{35} \]
where $\{\hat{\gamma}_n : \Omega \to \Gamma\}_{n \in \mathbf{N}}$ is a sequence of random vectors such that
\[ \hat{R}_n(\hat{\beta}_n, \hat{\gamma}_n, \cdot) = \inf_{\gamma \in \Gamma} \hat{R}_n(\hat{\beta}_n, \gamma, \cdot) \text{ a.s.-}P, \quad n \in \mathbf{N}, \]
and $\Gamma$ is a compact subset of $\mathbf{R}^k$ we pick. We assume that $\gamma^*$ is interior to $\Gamma$ (Assumption A.11). By the Kolmogorov strong law of large numbers (Rao (1973), p. 115), it follows from Assumptions A.1 and A.2 that
\[ -n^{-1} \sum_{t=1}^n \mathrm{sgn}(V_t - W_t'\beta^* - Z_t'\gamma^*)\,W_t \to \ell_\beta^* \text{ as } n \to \infty \text{ a.s.-}P. \]
We can also show that
\[ n^{-1} \sum_{t=1}^n \mathrm{sgn}(V_t - W_t'\hat{\beta}_n - Z_t'\hat{\gamma}_n)\,W_t - n^{-1} \sum_{t=1}^n \mathrm{sgn}(V_t - W_t'\beta^* - Z_t'\gamma^*)\,W_t \to 0 \text{ as } n \to \infty \text{ prob-}P \tag{36} \]
(see the proof of Lemma 6.1). The estimator $\{\hat{\ell}_{\beta,n}\}_{n \in \mathbf{N}}$ is thus consistent for $\ell_\beta^*$. Analogously, $\{\hat{\ell}^0_n\}_{n \in \mathbf{N}}$ defined by
\[ \hat{\ell}^0_n \equiv -n^{-1} \sum_{t=1}^n \mathrm{sgn}(V_t - W_t'\hat{\beta}_n)\,W_t, \quad n \in \mathbf{N}, \tag{37} \]
is consistent for $\ell^{0*}$.

Lemma 6.1: Suppose that Assumptions A.1–A.11 hold. Let $\{\hat{\ell}_{\beta,n}\}_{n \in \mathbf{N}}$ and $\{\hat{\ell}^0_n\}_{n \in \mathbf{N}}$ be defined by (35) and (37), respectively. Then $\{\hat{\ell}_{\beta,n}\}$ converges to $\ell_\beta^*$ prob-$P$, and $\{\hat{\ell}^0_n\}$ converges to $\ell^{0*}$ prob-$P$.

A natural estimator for $\Omega^*$ is its sample analogue. Define
\[ \hat{e}_{nt} \equiv \begin{pmatrix} \mathrm{sgn}(V_t - W_t'\hat{\beta}_n - Z_t'\hat{\gamma}_n)\,W_t \\ \mathrm{sgn}(V_t - W_t'\hat{\beta}_n - Z_t'\hat{\gamma}_n)\,Z_t \\ \mathrm{sgn}(V_t - W_t'\hat{\beta}_n)\,W_t \\ |V_t - W_t'\hat{\beta}_n - Z_t'\hat{\gamma}_n| - \hat{R}_n(\hat{\beta}_n, \hat{\gamma}_n) \\ |V_t - W_t'\hat{\beta}_n| - \hat{R}_n(\hat{\beta}_n, 0) \end{pmatrix}, \quad t = 1, \dots, n, \ n \in \mathbf{N}. \]
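The sign-average estimators (35) and (37) are one-liners in practice; a minimal numerical sketch (the function name `l_hats` is an assumption of this sketch):

```python
import numpy as np

def l_hats(V, W, Z, beta_hat, gamma_hat):
    # Sample-analogue estimators (35) and (37):
    #   l_beta_hat = -n^{-1} sum sgn(V_t - W_t'beta_hat - Z_t'gamma_hat) W_t
    #   l_zero_hat = -n^{-1} sum sgn(V_t - W_t'beta_hat) W_t
    s_full = np.sign(V - W @ beta_hat - Z @ gamma_hat)
    s_short = np.sign(V - W @ beta_hat)
    return (-np.mean(s_full[:, None] * W, axis=0),
            -np.mean(s_short[:, None] * W, axis=0))
```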

Then the sample analogue of $\Omega^*$ is
\[ \hat{\Omega}_n \equiv n^{-1} \sum_{t=1}^n \hat{e}_{nt} \hat{e}_{nt}', \quad n \in \mathbf{N}. \tag{38} \]

Lemma 6.2: Suppose that Assumptions A.1–A.9 hold and let $\{\hat{\Omega}_n\}_{n \in \mathbf{N}}$ and $\Omega^*$ be defined by (38) and (32), respectively. Then $\{\hat{\Omega}_n\}$ converges to $\Omega^*$ prob-$P$.

We next consider estimation of $C^* = -J_{\gamma\gamma}^{-1} J_{\gamma\beta}$. $J_{\gamma\gamma}$ and $J_{\gamma\beta}$ are parts of the Hessian matrix of the function $R$ at $(\beta^*, \gamma^*)$. A natural approach would be to use the Hessian matrix of $\hat{R}_n$ as an estimator for the Hessian of $R$, because $\hat{R}_n$ approximates $R$. Nevertheless, this approach is not suitable in our current situation, because $\hat{R}_n$ is not smooth. We thus need to take an alternative approach.

Our approach is based on Lemma 5.8. Let $\Gamma$ be a compact set to which $\gamma^*$ is interior (Assumption A.11). Let $\{b_n\}_{n \in \mathbf{N}}$ be a sequence of $l \times 1$ random vectors that converges to $\beta^*$ prob-$P$. As shown in Lemma 5.8, there exists a sequence of $k \times 1$ random vectors $\{g_n\}$ that minimizes $\hat{R}_n(b_n, \gamma, \cdot)$ with respect to $\gamma$ on $\Gamma$, and $\{g_n\}$ is related to $\{b_n\}$ through
\[ g_n = \gamma^* + C^*(b_n - \beta^*) - J_{\gamma\gamma}^{-1} n^{-1} \sum_{t=1}^n \psi_\gamma(X_t) + o_p(n^{-1/2}) + o_p(|b_n - \beta^*|) \text{ as } n \to \infty. \tag{39} \]
This suggests that a slight change in $b_n$ would yield a change in $g_n$ that is approximately equal to the linear transformation of the change in $b_n$ by $C^*$. A possible way to proceed is to use this observation with the perturbation method, which is often used in the statistical literature (see Meng and Rubin (1991) for example).

Take a sequence of random variables $\{\tau_n\}_{n \in \mathbf{N}}$ that converges to zero prob-$P$ satisfying $n^{-1/2}/\tau_n = O_p(1)$ (Assumption A.12). This $\tau_n$ represents the small change we make in $b_n$. Though $\tau_n$ does not have to be random, the randomness assumed here may be useful, allowing for data-based choices of the value of $\tau_n$. Let $e_i$ be the $l \times 1$ vector that is obtained by replacing the $i$th element of the $l \times 1$ zero vector with one. Let $\hat{\gamma}_n$ be the (a.s.) minimizer of $\hat{R}_n(\hat{\beta}_n, \gamma, \cdot)$ with respect to $\gamma$ on $\Gamma$ and $\tilde{\gamma}^{(i)}_n$ the (a.s.) minimizer of $\hat{R}_n(\hat{\beta}_n + \tau_n e_i, \gamma, \cdot)$ with respect to $\gamma$ on $\Gamma$, $i = 1, 2, \dots, l$. Then $\tau_n^{-1}(\tilde{\gamma}^{(i)}_n - \hat{\gamma}_n)$ is consistent for the $i$th column of $-J_{\gamma\gamma}^{-1} J_{\gamma\beta}$, as formally shown in the proof of Lemma 6.3. We thus define an estimator $\hat{C}_n$ for $C^*$ as
\[ \hat{C}_n \equiv \big( \tau_n^{-1}(\tilde{\gamma}^{(1)}_n - \hat{\gamma}_n), \ \tau_n^{-1}(\tilde{\gamma}^{(2)}_n - \hat{\gamma}_n), \ \dots, \ \tau_n^{-1}(\tilde{\gamma}^{(l)}_n - \hat{\gamma}_n) \big), \quad n \in \mathbf{N}. \tag{40} \]
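The perturbation recipe behind (40) re-solves the inner minimization at slightly shifted values of $\hat{\beta}_n$ and differences the resulting inner minimizers; a minimal sketch, under the assumption that a derivative-free solver stands in for the exact inner minimizer (the name `c_hat` and the solver settings are illustrative):

```python
import numpy as np
from scipy.optimize import minimize

def c_hat(V, W, Z, beta_hat, tau):
    # Perturbation estimator (40): the i-th column of C-hat_n is
    # tau^{-1} (gamma-tilde^{(i)} - gamma-hat), where gamma-tilde^{(i)}
    # re-solves the inner MAD minimization at beta_hat + tau * e_i.
    k, l = Z.shape[1], W.shape[1]

    def inner_argmin(beta):
        obj = lambda g: np.mean(np.abs(V - W @ beta - Z @ g))
        return minimize(obj, np.zeros(k), method="Nelder-Mead",
                        options={"xatol": 1e-10, "fatol": 1e-12}).x

    g_base = inner_argmin(beta_hat)
    cols = []
    for i in range(l):
        e = np.zeros(l)
        e[i] = 1.0
        cols.append((inner_argmin(beta_hat + tau * e) - g_base) / tau)
    return np.column_stack(cols)
```

As the text notes, each column may even use its own step size without affecting consistency.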

Lemma 6.3: Suppose that Assumptions A.1–A.12 hold. Then $\{\hat{C}_n\}_{n \in \mathbf{N}}$ defined by (40) converges to $C^* = -J_{\gamma\gamma}^{-1} J_{\gamma\beta}$ prob-$P$.

It is clear from its proof that the result of Lemma 6.3 is valid even if each column of $\hat{C}_n$ uses a different series for $\{\tau_n\}$.

To estimate $K$, we use the property given in Lemma 5.10. We again employ a sequence of random variables $\{\tau_n\}_{n \in \mathbf{N}}$ that converges to zero prob-$P$ satisfying $n^{-1/2}/\tau_n = O_p(1)$ (Assumption A.12). With some algebra, we can derive from Lemma 5.10 that for each $i = 1, 2, \dots, l$,
\[ -2\tau_n^{-2} \big( \log \hat{Q}_n((1, -(\hat{\beta}_n + \tau_n e_i)')', \cdot) - \log \hat{Q}_n((1, -\hat{\beta}_n')', \cdot) \big) = e_i' K e_i + o_p(1) = K_{(ii)} + o_p(1) \text{ as } n \to \infty, \tag{41} \]
where $K_{(ij)}$ denotes the $(i,j)$-element of $K$, $i, j = 1, 2, \dots, l$. We thus define
\[ \hat{K}_{n(ii)} \equiv -2\tau_n^{-2} \big( \log \hat{Q}_n((1, -(\hat{\beta}_n + \tau_n e_i)')', \cdot) - \log \hat{Q}_n((1, -\hat{\beta}_n')', \cdot) \big), \quad i = 1, 2, \dots, l, \ n \in \mathbf{N}. \tag{42} \]
We can also derive that for each pair of distinct integers $(i, j)$ between 1 and $l$,
\[ -\tau_n^{-2} \big( \log \hat{Q}_n((1, -(\hat{\beta}_n + \tau_n (e_i + e_j))')', \cdot) - \log \hat{Q}_n((1, -\hat{\beta}_n')', \cdot) \big) = \tfrac{1}{2}(e_i + e_j)' K (e_i + e_j) + o_p(1) = K_{(ij)} + \tfrac{1}{2} K_{(ii)} + \tfrac{1}{2} K_{(jj)} + o_p(1) \text{ as } n \to \infty. \tag{43} \]
We thus propose an estimator for $K_{(ij)}$:
\[ \hat{K}_{n(ij)} \equiv -\tau_n^{-2} \big( \log \hat{Q}_n((1, -(\hat{\beta}_n + \tau_n (e_i + e_j))')', \cdot) - \log \hat{Q}_n((1, -\hat{\beta}_n')', \cdot) \big) - \tfrac{1}{2} \hat{K}_{n(ii)} - \tfrac{1}{2} \hat{K}_{n(jj)}, \quad i \neq j, \ i, j = 1, 2, \dots, l, \ n \in \mathbf{N}. \tag{44} \]

Lemma 6.4: Suppose that Assumptions A.1–A.10 and A.12 hold. Define
\[ \hat{K}_n \equiv \begin{pmatrix} \hat{K}_{n(11)} & \hat{K}_{n(12)} & \cdots & \hat{K}_{n(1l)} \\ \hat{K}_{n(21)} & \hat{K}_{n(22)} & \cdots & \hat{K}_{n(2l)} \\ \vdots & \vdots & & \vdots \\ \hat{K}_{n(l1)} & \hat{K}_{n(l2)} & \cdots & \hat{K}_{n(ll)} \end{pmatrix}, \quad n \in \mathbf{N}, \]
where $\hat{K}_{n(ij)}$ are defined by (42) and (44). Then $\{\hat{K}_n\}_{n \in \mathbf{N}}$ converges to $K$ defined by (13) prob-$P$.

Combining (33), (34), and Lemmas 6.1–6.4 yields the consistent estimator for $D^*$.
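Both the second-difference recipe in (42) and (44) and the sandwich form $\hat{D}_n = \hat{K}_n^{-1} \hat{\Sigma}_n \hat{K}_n^{-1}$ can be sketched compactly. The test exercises `k_hat` on a smooth quadratic surrogate for $\log \hat{Q}_n$, for which the finite differences are exact up to rounding; the function names and the surrogate are assumptions of this sketch.

```python
import numpy as np

def k_hat(log_q, beta_hat, tau):
    # Finite-difference estimator of K from (42) and (44): diagonal entries
    # from single perturbations, off-diagonal entries from joint
    # perturbations, all through the scalar map log_q(beta).
    l = beta_hat.size
    base = log_q(beta_hat)
    E = np.eye(l)
    K = np.zeros((l, l))
    for i in range(l):
        K[i, i] = -2.0 * (log_q(beta_hat + tau * E[i]) - base) / tau**2
    for i in range(l):
        for j in range(i + 1, l):
            kij = (-(log_q(beta_hat + tau * (E[i] + E[j])) - base) / tau**2
                   - 0.5 * K[i, i] - 0.5 * K[j, j])
            K[i, j] = K[j, i] = kij
    return K

def d_hat(K_hat_mat, Sigma_hat):
    # Sandwich form D-hat_n = K-hat_n^{-1} Sigma-hat_n K-hat_n^{-1}.
    Kinv = np.linalg.inv(K_hat_mat)
    return Kinv @ Sigma_hat @ Kinv
```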

Theorem 6.5: Suppose that Assumptions A.1–A.12 hold. Define
\[ \hat{D}_n \equiv \hat{K}_n^{-1} \hat{\Sigma}_n \hat{K}_n^{-1}, \quad \hat{\Sigma}_n \equiv \hat{H}_n \hat{\Omega}_n \hat{H}_n', \quad n \in \mathbf{N}, \]
where
\[ \hat{H}_n \equiv \Big( \frac{I_l}{\hat{R}_n(\hat{\beta}_n, \hat{\gamma}_n)}, \ \frac{\hat{C}_n'}{\hat{R}_n(\hat{\beta}_n, \hat{\gamma}_n)}, \ -\frac{I_l}{\hat{R}_n(\hat{\beta}_n, 0)}, \ \frac{\hat{\ell}_{\beta,n}}{\hat{R}_n(\hat{\beta}_n, \hat{\gamma}_n)^2}, \ -\frac{\hat{\ell}^0_n}{\hat{R}_n(\hat{\beta}_n, 0)^2} \Big), \quad n \in \mathbf{N}, \]
and $\{\hat{\Omega}_n\}_{n \in \mathbf{N}}$ is defined by (38). Then $\{\hat{D}_n\}_{n \in \mathbf{N}}$ converges to $D^*$ defined in Theorem 5.12 prob-$P$.

7 Specification Test

If $Q((1, -\beta')') = 1$ for some $\beta \in B$, our model is correctly specified. In this section, we consider how to test this specification correctness. The proposed test corresponds to the over-identification test in the GMM estimation framework.

A necessary and sufficient condition for the correct specification is that $Q((1, -\beta^{*\prime})') = 1$. If the model is misspecified, we have that $Q((1, -\beta^{*\prime})') < 1$. Because we can consistently estimate $Q((1, -\beta^{*\prime})')$ by $\hat{Q}_n((1, -\hat{\beta}_n')', \cdot)$, one might think that a test based on the value of $\hat{Q}_n((1, -\hat{\beta}_n')', \cdot)$ may be appropriate. Nevertheless, the limiting distribution of $\hat{Q}_n((1, -\hat{\beta}_n')', \cdot)$ under the null, which we can derive from Lemmas 5.10 and 5.11, is that of a quadratic form of independently and normally distributed random variables with an unknown weighting matrix, as is the case for many generalized likelihood ratios. This approach is thus quite inconvenient.

Alternatively, we here employ another condition, $\gamma^* = 0$, which is necessary and sufficient for the correct specification. Let $\hat{\gamma}_n$ be the minimizer of $\hat{R}_n(\hat{\beta}_n, \gamma, \cdot)$ with respect to $\gamma$ on some compact set $\Gamma \subset \mathbf{R}^k$. Then $\hat{\gamma}_n$ consistently estimates $\gamma^*$. Below, we show that a certain quadratic form of $\hat{\gamma}_n$ is distributed with a $\chi^2$-distribution under the null and show that the same quadratic form diverges to infinity prob-$P$ under the alternative. We can thus form a consistent specification test based on this statistic.

We first derive a convenient approximation form for $\hat{\gamma}_n$ under the null. Note that
\[ K = J_{\beta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\beta} / R(\beta^*, 0), \quad \text{and} \quad \delta_n = -\frac{1}{R(\beta^*, 0)} J_{\beta\gamma} J_{\gamma\gamma}^{-1} n^{-1} \sum_{t=1}^n \psi_\gamma(X_t), \quad n \in \mathbf{N}, \]

under the null. By using Lemmas 5.8 and 5.11 with this fact, it is straightforward to derive that
\[ \hat{\gamma}_n = -J_{\gamma\gamma}^{-1/2} M^* J_{\gamma\gamma}^{-1/2} n^{-1} \sum_{t=1}^n \psi_\gamma(X_t) + o_p(n^{-1/2}) \text{ as } n \to \infty, \tag{45} \]
where
\[ M^* \equiv I_k - J_{\gamma\gamma}^{-1/2} J_{\gamma\beta} (J_{\beta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\beta})^{-1} J_{\beta\gamma} J_{\gamma\gamma}^{-1/2}. \]
Because $M^*$ is symmetric and idempotent with rank $k - l$, there exists a $k \times (k - l)$ matrix $T_1$ such that $M^* = T_1 T_1'$ and $T_1' T_1 = I_{k-l}$. Define
\[ T_2 \equiv J_{\gamma\gamma}^{-1/2} J_{\gamma\beta} (J_{\beta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\beta})^{-1/2}. \]
Then we have that $T_2' T_2 = I_l$ and that $M^* T_2 = 0$. The columns of $T_2$ are thus orthonormal vectors that span the null space of $M^*$. Letting
\[ T \equiv (T_1, T_2), \]
$T$ is an orthogonal matrix. Substituting $T_1 T_1'$ for $M^*$ in (45) and premultiplying both sides of the resulting equation by $n^{1/2} T' J_{\gamma\gamma}^{1/2}$, we obtain
\[ n^{1/2} T' J_{\gamma\gamma}^{1/2} \hat{\gamma}_n = -\begin{pmatrix} \nu_n \\ 0_l \end{pmatrix} + o_p(1) \text{ as } n \to \infty, \tag{46} \]
where $0_l$ denotes the $l \times 1$ zero vector, and
\[ \nu_n \equiv T_1' J_{\gamma\gamma}^{-1/2} n^{-1/2} \sum_{t=1}^n \psi_\gamma(X_t), \quad n \in \mathbf{N}. \]
We have that
\[ E[T_1' J_{\gamma\gamma}^{-1/2} \psi_\gamma(X)] = 0 \]
and that
\[ \mathrm{var}[T_1' J_{\gamma\gamma}^{-1/2} \psi_\gamma(X)] = T_1' J_{\gamma\gamma}^{-1/2} E[ZZ'] J_{\gamma\gamma}^{-1/2} T_1. \]
It follows by the CLT for i.i.d. random vectors (Rao (1973), p. 128) that
\[ (T_1' J_{\gamma\gamma}^{-1/2} E[ZZ'] J_{\gamma\gamma}^{-1/2} T_1)^{-1/2} \nu_n \stackrel{A}{\sim} N(0, I_{k-l}) \text{ as } n \to \infty. \]
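The projection matrix $M^*$ and its rank property are easy to verify numerically; a minimal sketch (`m_star` is a name chosen here, and the matrix square root is taken via the eigendecomposition of the symmetric positive definite $J_{\gamma\gamma}$):

```python
import numpy as np

def m_star(J_gg, J_gb):
    # M* = I_k - J_gg^{-1/2} J_gb (J_gb' J_gg^{-1} J_gb)^{-1} J_gb' J_gg^{-1/2}:
    # the orthogonal projector onto the complement of col(J_gg^{-1/2} J_gb).
    w, U = np.linalg.eigh(J_gg)                 # J_gg assumed SPD
    J_inv_half = U @ np.diag(w ** -0.5) @ U.T   # symmetric inverse square root
    B = J_inv_half @ J_gb                       # k x l
    P = B @ np.linalg.solve(B.T @ B, B.T)       # projector onto col(B)
    return np.eye(J_gg.shape[0]) - P
```

Symmetry, idempotency, and trace $k - l$ are exactly the properties that license the decomposition $M^* = T_1 T_1'$ above.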

Now let $A$ be a symmetric positive definite matrix whose upper left $(k-l) \times (k-l)$ block $A_{11}$ is equal to
\[ A_{11} \equiv (T_1' J_{\gamma\gamma}^{-1/2} E[ZZ'] J_{\gamma\gamma}^{-1/2} T_1)^{-1}. \tag{47} \]
By using (46), we have that
\[ n \hat{\gamma}_n' J_{\gamma\gamma}^{1/2} T A T' J_{\gamma\gamma}^{1/2} \hat{\gamma}_n = \nu_n' A_{11} \nu_n + o_p(1) \text{ as } n \to \infty. \]
By the asymptotic equivalence lemma (Rao (1973), pp. 122–123), it follows that
\[ n \hat{\gamma}_n' J_{\gamma\gamma}^{1/2} T A T' J_{\gamma\gamma}^{1/2} \hat{\gamma}_n \stackrel{A}{\sim} \chi^2(k - l) \text{ as } n \to \infty. \]
In addition, we have that under the alternative hypothesis that $\gamma^* \neq 0$,
\[ \hat{\gamma}_n' J_{\gamma\gamma}^{1/2} T A T' J_{\gamma\gamma}^{1/2} \hat{\gamma}_n \to \gamma^{*\prime} J_{\gamma\gamma}^{1/2} T A T' J_{\gamma\gamma}^{1/2} \gamma^* > 0 \text{ as } n \to \infty \text{ prob-}P. \]
This means that if we have a consistent estimator for $J_{\gamma\gamma}^{1/2} T A T' J_{\gamma\gamma}^{1/2}$, we can form a consistent specification test based on $n \hat{\gamma}_n' J_{\gamma\gamma}^{1/2} T A T' J_{\gamma\gamma}^{1/2} \hat{\gamma}_n$.

A convenient choice for $A$ is
\[ A \equiv \begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} \end{pmatrix}, \quad \text{where} \quad A_{22} \equiv (J_{\beta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\beta})^{-1}. \]
For this particular choice, we have that
\[ J_{\gamma\gamma}^{1/2} T A T' J_{\gamma\gamma}^{1/2} = \big\{ J_{\gamma\gamma}^{-1/2} T A^{-1} T' J_{\gamma\gamma}^{-1/2} \big\}^{-1} = \Big\{ J_{\gamma\gamma}^{-1/2} T \begin{pmatrix} T_1' J_{\gamma\gamma}^{-1/2} E[ZZ'] J_{\gamma\gamma}^{-1/2} T_1 & 0 \\ 0 & A_{22}^{-1} \end{pmatrix} T' J_{\gamma\gamma}^{-1/2} \Big\}^{-1} = \big\{ J_{\gamma\gamma}^{-1/2} T_1 T_1' J_{\gamma\gamma}^{-1/2} E[ZZ'] J_{\gamma\gamma}^{-1/2} T_1 T_1' J_{\gamma\gamma}^{-1/2} + J_{\gamma\gamma}^{-1/2} T_2 A_{22}^{-1} T_2' J_{\gamma\gamma}^{-1/2} \big\}^{-1} = \big\{ J_{\gamma\gamma}^{-1/2} M^* J_{\gamma\gamma}^{-1/2} E[ZZ'] J_{\gamma\gamma}^{-1/2} M^* J_{\gamma\gamma}^{-1/2} + J_{\gamma\gamma}^{-1/2} T_2 A_{22}^{-1} T_2' J_{\gamma\gamma}^{-1/2} \big\}^{-1}. \]
Note that
\[ J_{\gamma\gamma}^{-1/2} M^* J_{\gamma\gamma}^{-1/2} = J_{\gamma\gamma}^{-1} - J_{\gamma\gamma}^{-1} J_{\gamma\beta} (J_{\beta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\beta})^{-1} J_{\beta\gamma} J_{\gamma\gamma}^{-1} = J_{\gamma\gamma}^{-1} - C^* (R(\beta^*, \gamma^*) K)^{-1} C^{*\prime}, \]

where the second equality holds under the null. Also note that
\[ J_{\gamma\gamma}^{-1/2} T_2 A_{22}^{-1} T_2' J_{\gamma\gamma}^{-1/2} = J_{\gamma\gamma}^{-1} J_{\gamma\beta} (J_{\beta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\beta})^{-1/2} (J_{\beta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\beta}) (J_{\beta\gamma} J_{\gamma\gamma}^{-1} J_{\gamma\beta})^{-1/2} J_{\beta\gamma} J_{\gamma\gamma}^{-1} = J_{\gamma\gamma}^{-1} J_{\gamma\beta} J_{\beta\gamma} J_{\gamma\gamma}^{-1} = C^* C^{*\prime}. \]
It follows that under the null,
\[ J_{\gamma\gamma}^{1/2} T A T' J_{\gamma\gamma}^{1/2} = \big\{ (J_{\gamma\gamma}^{-1} - C^* (R(\beta^*, \gamma^*) K)^{-1} C^{*\prime}) E[ZZ'] (J_{\gamma\gamma}^{-1} - C^* (R(\beta^*, \gamma^*) K)^{-1} C^{*\prime}) + C^* C^{*\prime} \big\}^{-1}. \]
We thus have that under the null,
\[ n \hat{\gamma}_n' \big\{ (J_{\gamma\gamma}^{-1} - C^* (R(\beta^*, \gamma^*) K)^{-1} C^{*\prime}) E[ZZ'] (J_{\gamma\gamma}^{-1} - C^* (R(\beta^*, \gamma^*) K)^{-1} C^{*\prime}) + C^* C^{*\prime} \big\}^{-1} \hat{\gamma}_n \stackrel{A}{\sim} \chi^2(k - l) \text{ as } n \to \infty. \tag{48} \]

To use (48) to form a test, we need to consistently estimate the unknown constants in it. Lemmas 6.3 and 6.4 provide consistent estimators for $K$ and $C^*$. The matrix $J_{\gamma\gamma}$ can be consistently estimated in an analogous manner. For each $n \in \mathbf{N}$, define
\[ \hat{J}_{\gamma\gamma,n(ii)} \equiv 2\tau_n^{-2} \big( \log \hat{R}_n(\hat{\beta}_n, \hat{\gamma}_n + \tau_n e_i, \cdot) - \log \hat{R}_n(\hat{\beta}_n, \hat{\gamma}_n, \cdot) \big), \quad i = 1, 2, \dots, k, \tag{49} \]
and
\[ \hat{J}_{\gamma\gamma,n(ij)} \equiv \tau_n^{-2} \big( \log \hat{R}_n(\hat{\beta}_n, \hat{\gamma}_n + \tau_n (e_i + e_j), \cdot) - \log \hat{R}_n(\hat{\beta}_n, \hat{\gamma}_n, \cdot) \big) \tag{50} \]
\[ \phantom{\hat{J}_{\gamma\gamma,n(ij)} \equiv} - \tfrac{1}{2} \hat{J}_{\gamma\gamma,n(ii)} - \tfrac{1}{2} \hat{J}_{\gamma\gamma,n(jj)}, \quad i \neq j, \ i, j = 1, 2, \dots, k, \tag{51} \]
where $e_i$ here denotes the $i$th column of $I_k$.

Lemma 7.1: Suppose that Assumptions A.1–A.10 and A.12 hold. Define
\[ \hat{J}_{\gamma\gamma,n} \equiv \begin{pmatrix} \hat{J}_{\gamma\gamma,n(11)} & \cdots & \hat{J}_{\gamma\gamma,n(1k)} \\ \vdots & & \vdots \\ \hat{J}_{\gamma\gamma,n(k1)} & \cdots & \hat{J}_{\gamma\gamma,n(kk)} \end{pmatrix}, \quad n \in \mathbf{N}, \]
where $\hat{J}_{\gamma\gamma,n(ij)}$ are defined by (49) and (51). Then $\{\hat{J}_{\gamma\gamma,n}\}_{n \in \mathbf{N}}$ converges to $J_{\gamma\gamma}$ defined by (10) prob-$P$.

We now define a test statistic $T_n$ by
\[ T_n \equiv n \hat{\gamma}_n' \Big\{ \big( \hat{J}_{\gamma\gamma,n}^{-1} - \hat{C}_n (\hat{R}_n(\hat{\beta}_n, \hat{\gamma}_n, \cdot) \hat{K}_n)^{-1} \hat{C}_n' \big) \Big( n^{-1} \sum_{t=1}^n Z_t Z_t' \Big) \big( \hat{J}_{\gamma\gamma,n}^{-1} - \hat{C}_n (\hat{R}_n(\hat{\beta}_n, \hat{\gamma}_n, \cdot) \hat{K}_n)^{-1} \hat{C}_n' \big) + \hat{C}_n \hat{C}_n' \Big\}^{-1} \hat{\gamma}_n, \quad n \in \mathbf{N}. \tag{52} \]
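Given the statistic, the size-$p$ decision rule of the test is a one-line comparison against a $\chi^2(k-l)$ quantile; a minimal sketch (the wrapper `specification_test` and its return layout are assumptions of this sketch):

```python
from scipy.stats import chi2

def specification_test(T_n, k, l, p=0.05):
    # Theorem 7.2 decision rule: reject correct specification at size p
    # iff T_n exceeds the upper 100p percentile of chi-square(k - l).
    df = k - l
    crit = chi2.ppf(1.0 - p, df)
    return {"stat": T_n,
            "critical_value": crit,
            "p_value": 1.0 - chi2.cdf(T_n, df),
            "reject": bool(T_n > crit)}
```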

It is straightforward to verify that under the null,
\[ T_n - n \hat{\gamma}_n' \big\{ (J_{\gamma\gamma}^{-1} - C^* (R(\beta^*, \gamma^*) K)^{-1} C^{*\prime}) E[ZZ'] (J_{\gamma\gamma}^{-1} - C^* (R(\beta^*, \gamma^*) K)^{-1} C^{*\prime}) + C^* C^{*\prime} \big\}^{-1} \hat{\gamma}_n = o_p(1). \]
By the asymptotic equivalence lemma (Rao (1973), pp. 122–123), it follows that $T_n$ is asymptotically distributed with $\chi^2(k - l)$ under the null. We can also easily verify that the asymptotic power of the test based on $T_n$ is one.

Theorem 7.2: Suppose that Assumptions A.1–A.12 hold. Also let $T_n$ be the statistic defined in (52).
(a) If $\gamma^* = 0$, then $T_n \stackrel{A}{\sim} \chi^2(k - l)$ as $n \to \infty$.
(b) If $\gamma^* \neq 0$, then for any real constant $c$, $P[T_n > c] \to 1$ as $n \to \infty$.

The specification test with size $p$ based on the statistic $T_n$ rejects the null hypothesis when the observed value of $T_n$ is greater than the upper $100p$ percentile of the $\chi^2$-distribution with $k - l$ degrees of freedom, and accepts the null hypothesis otherwise.

8 Concluding Remarks

The error term in a structural equation should not be explained by instrumental variables. If we judge whether the instrumental variables explain the error by the orthogonality condition, the IV estimators such as the two-stage least squares estimator are generated. In this paper, we propose an alternative approach. Choose a dispersion measure for univariate distributions. Under the population distribution, consider, for each parameter value, the ratio of the dispersion of the part of the error term the IVs cannot explain to the dispersion of the error term itself. This ratio ranges between zero and one. The higher this ratio is, the less the error is related to the instruments. If the model assumption is correct, the ratio attains one for some parameter value, and the maximizer gives the desired parameter value; otherwise, the maximizer gives a model "closest" to our assumption. Our estimator is defined to be the maximizer of the sample analogue of the ratio of the dispersions. The estimators derived in this manner are invariant to the normalization constraint and require no reduced-form specification.


More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

More information

Introduction to Real Analysis Alternative Chapter 1

Introduction to Real Analysis Alternative Chapter 1 Christopher Heil Introduction to Real Analysis Alternative Chapter 1 A Primer on Norms and Banach Spaces Last Updated: March 10, 2018 c 2018 by Christopher Heil Chapter 1 A Primer on Norms and Banach Spaces

More information

368 XUMING HE AND GANG WANG of convergence for the MVE estimator is n ;1=3. We establish strong consistency and functional continuity of the MVE estim

368 XUMING HE AND GANG WANG of convergence for the MVE estimator is n ;1=3. We establish strong consistency and functional continuity of the MVE estim Statistica Sinica 6(1996), 367-374 CROSS-CHECKING USING THE MINIMUM VOLUME ELLIPSOID ESTIMATOR Xuming He and Gang Wang University of Illinois and Depaul University Abstract: We show that for a wide class

More information

Maximum Likelihood (ML) Estimation

Maximum Likelihood (ML) Estimation Econometrics 2 Fall 2004 Maximum Likelihood (ML) Estimation Heino Bohn Nielsen 1of32 Outline of the Lecture (1) Introduction. (2) ML estimation defined. (3) ExampleI:Binomialtrials. (4) Example II: Linear

More information

only nite eigenvalues. This is an extension of earlier results from [2]. Then we concentrate on the Riccati equation appearing in H 2 and linear quadr

only nite eigenvalues. This is an extension of earlier results from [2]. Then we concentrate on the Riccati equation appearing in H 2 and linear quadr The discrete algebraic Riccati equation and linear matrix inequality nton. Stoorvogel y Department of Mathematics and Computing Science Eindhoven Univ. of Technology P.O. ox 53, 56 M Eindhoven The Netherlands

More information

Notes on the matrix exponential

Notes on the matrix exponential Notes on the matrix exponential Erik Wahlén erik.wahlen@math.lu.se February 14, 212 1 Introduction The purpose of these notes is to describe how one can compute the matrix exponential e A when A is not

More information

Contents. 2.1 Vectors in R n. Linear Algebra (part 2) : Vector Spaces (by Evan Dummit, 2017, v. 2.50) 2 Vector Spaces

Contents. 2.1 Vectors in R n. Linear Algebra (part 2) : Vector Spaces (by Evan Dummit, 2017, v. 2.50) 2 Vector Spaces Linear Algebra (part 2) : Vector Spaces (by Evan Dummit, 2017, v 250) Contents 2 Vector Spaces 1 21 Vectors in R n 1 22 The Formal Denition of a Vector Space 4 23 Subspaces 6 24 Linear Combinations and

More information

Tests for Neglected Heterogeneity in Moment Condition Models

Tests for Neglected Heterogeneity in Moment Condition Models Tests for Neglected Heterogeneity in Moment Condition Models Jinyong Hahn Department of Economics U.C.L.A. Whitney K. Newey Department of Economics M.I.T. Richard J. Smith cemmap, U.C.L. and I.F.S., Faculty

More information

Missing dependent variables in panel data models

Missing dependent variables in panel data models Missing dependent variables in panel data models Jason Abrevaya Abstract This paper considers estimation of a fixed-effects model in which the dependent variable may be missing. For cross-sectional units

More information

LARGE DEVIATION PROBABILITIES FOR SUMS OF HEAVY-TAILED DEPENDENT RANDOM VECTORS*

LARGE DEVIATION PROBABILITIES FOR SUMS OF HEAVY-TAILED DEPENDENT RANDOM VECTORS* LARGE EVIATION PROBABILITIES FOR SUMS OF HEAVY-TAILE EPENENT RANOM VECTORS* Adam Jakubowski Alexander V. Nagaev Alexander Zaigraev Nicholas Copernicus University Faculty of Mathematics and Computer Science

More information

Chapter 2. Dynamic panel data models

Chapter 2. Dynamic panel data models Chapter 2. Dynamic panel data models School of Economics and Management - University of Geneva Christophe Hurlin, Université of Orléans University of Orléans April 2018 C. Hurlin (University of Orléans)

More information

Converse Lyapunov Functions for Inclusions 2 Basic denitions Given a set A, A stands for the closure of A, A stands for the interior set of A, coa sta

Converse Lyapunov Functions for Inclusions 2 Basic denitions Given a set A, A stands for the closure of A, A stands for the interior set of A, coa sta A smooth Lyapunov function from a class-kl estimate involving two positive semidenite functions Andrew R. Teel y ECE Dept. University of California Santa Barbara, CA 93106 teel@ece.ucsb.edu Laurent Praly

More information

Analogy Principle. Asymptotic Theory Part II. James J. Heckman University of Chicago. Econ 312 This draft, April 5, 2006

Analogy Principle. Asymptotic Theory Part II. James J. Heckman University of Chicago. Econ 312 This draft, April 5, 2006 Analogy Principle Asymptotic Theory Part II James J. Heckman University of Chicago Econ 312 This draft, April 5, 2006 Consider four methods: 1. Maximum Likelihood Estimation (MLE) 2. (Nonlinear) Least

More information

290 J.M. Carnicer, J.M. Pe~na basis (u 1 ; : : : ; u n ) consisting of minimally supported elements, yet also has a basis (v 1 ; : : : ; v n ) which f

290 J.M. Carnicer, J.M. Pe~na basis (u 1 ; : : : ; u n ) consisting of minimally supported elements, yet also has a basis (v 1 ; : : : ; v n ) which f Numer. Math. 67: 289{301 (1994) Numerische Mathematik c Springer-Verlag 1994 Electronic Edition Least supported bases and local linear independence J.M. Carnicer, J.M. Pe~na? Departamento de Matematica

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Projektpartner. Sonderforschungsbereich 386, Paper 163 (1999) Online unter:

Projektpartner. Sonderforschungsbereich 386, Paper 163 (1999) Online unter: Toutenburg, Shalabh: Estimation of Regression Coefficients Subject to Exact Linear Restrictions when some Observations are Missing and Balanced Loss Function is Used Sonderforschungsbereich 386, Paper

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 6 Jakub Mućk Econometrics of Panel Data Meeting # 6 1 / 36 Outline 1 The First-Difference (FD) estimator 2 Dynamic panel data models 3 The Anderson and Hsiao

More information

Economics 241B Estimation with Instruments

Economics 241B Estimation with Instruments Economics 241B Estimation with Instruments Measurement Error Measurement error is de ned as the error resulting from the measurement of a variable. At some level, every variable is measured with error.

More information

A Course on Advanced Econometrics

A Course on Advanced Econometrics A Course on Advanced Econometrics Yongmiao Hong The Ernest S. Liu Professor of Economics & International Studies Cornell University Course Introduction: Modern economies are full of uncertainties and risk.

More information

7. Dimension and Structure.

7. Dimension and Structure. 7. Dimension and Structure 7.1. Basis and Dimension Bases for Subspaces Example 2 The standard unit vectors e 1, e 2,, e n are linearly independent, for if we write (2) in component form, then we obtain

More information

10. Smooth Varieties. 82 Andreas Gathmann

10. Smooth Varieties. 82 Andreas Gathmann 82 Andreas Gathmann 10. Smooth Varieties Let a be a point on a variety X. In the last chapter we have introduced the tangent cone C a X as a way to study X locally around a (see Construction 9.20). It

More information

PARAMETER IDENTIFICATION IN THE FREQUENCY DOMAIN. H.T. Banks and Yun Wang. Center for Research in Scientic Computation

PARAMETER IDENTIFICATION IN THE FREQUENCY DOMAIN. H.T. Banks and Yun Wang. Center for Research in Scientic Computation PARAMETER IDENTIFICATION IN THE FREQUENCY DOMAIN H.T. Banks and Yun Wang Center for Research in Scientic Computation North Carolina State University Raleigh, NC 7695-805 Revised: March 1993 Abstract In

More information

GMM estimation of spatial panels

GMM estimation of spatial panels MRA Munich ersonal ReEc Archive GMM estimation of spatial panels Francesco Moscone and Elisa Tosetti Brunel University 7. April 009 Online at http://mpra.ub.uni-muenchen.de/637/ MRA aper No. 637, posted

More information

An Averaging GMM Estimator Robust to Misspecication

An Averaging GMM Estimator Robust to Misspecication An Averaging GMM Estimator Robust to Misspecication Xu Cheng y Zhipeng Liao z Ruoyao Shi x This Version: January, 28 Abstract This paper studies the averaging GMM estimator that combines a conservative

More information

The Equality of OLS and GLS Estimators in the

The Equality of OLS and GLS Estimators in the The Equality of OLS and GLS Estimators in the Linear Regression Model When the Disturbances are Spatially Correlated Butte Gotu 1 Department of Statistics, University of Dortmund Vogelpothsweg 87, 44221

More information

Exercise Solutions to Functional Analysis

Exercise Solutions to Functional Analysis Exercise Solutions to Functional Analysis Note: References refer to M. Schechter, Principles of Functional Analysis Exersize that. Let φ,..., φ n be an orthonormal set in a Hilbert space H. Show n f n

More information

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes Maximum Likelihood Estimation Econometrics II Department of Economics Universidad Carlos III de Madrid Máster Universitario en Desarrollo y Crecimiento Económico Outline 1 3 4 General Approaches to Parameter

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor guirregabiria SOLUTION TO FINL EXM Monday, pril 14, 2014. From 9:00am-12:00pm (3 hours) INSTRUCTIONS:

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

QUASI-UNIFORMLY POSITIVE OPERATORS IN KREIN SPACE. Denitizable operators in Krein spaces have spectral properties similar to those

QUASI-UNIFORMLY POSITIVE OPERATORS IN KREIN SPACE. Denitizable operators in Krein spaces have spectral properties similar to those QUASI-UNIFORMLY POSITIVE OPERATORS IN KREIN SPACE BRANKO CURGUS and BRANKO NAJMAN Denitizable operators in Krein spaces have spectral properties similar to those of selfadjoint operators in Hilbert spaces.

More information

Mathematical Institute, University of Utrecht. The problem of estimating the mean of an observed Gaussian innite-dimensional vector

Mathematical Institute, University of Utrecht. The problem of estimating the mean of an observed Gaussian innite-dimensional vector On Minimax Filtering over Ellipsoids Eduard N. Belitser and Boris Y. Levit Mathematical Institute, University of Utrecht Budapestlaan 6, 3584 CD Utrecht, The Netherlands The problem of estimating the mean

More information

Supplemental Material 1 for On Optimal Inference in the Linear IV Model

Supplemental Material 1 for On Optimal Inference in the Linear IV Model Supplemental Material 1 for On Optimal Inference in the Linear IV Model Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Vadim Marmer Vancouver School of Economics University

More information

Interpreting Regression Results

Interpreting Regression Results Interpreting Regression Results Carlo Favero Favero () Interpreting Regression Results 1 / 42 Interpreting Regression Results Interpreting regression results is not a simple exercise. We propose to split

More information

Economics 620, Lecture 20: Generalized Method of Moment (GMM)

Economics 620, Lecture 20: Generalized Method of Moment (GMM) Economics 620, Lecture 20: Generalized Method of Moment (GMM) Nicholas M. Kiefer Cornell University Professor N. M. Kiefer (Cornell University) Lecture 20: GMM 1 / 16 Key: Set sample moments equal to theoretical

More information

Optimality Conditions

Optimality Conditions Chapter 2 Optimality Conditions 2.1 Global and Local Minima for Unconstrained Problems When a minimization problem does not have any constraints, the problem is to find the minimum of the objective function.

More information

University of Pavia. M Estimators. Eduardo Rossi

University of Pavia. M Estimators. Eduardo Rossi University of Pavia M Estimators Eduardo Rossi Criterion Function A basic unifying notion is that most econometric estimators are defined as the minimizers of certain functions constructed from the sample

More information

Quantum logics with given centres and variable state spaces Mirko Navara 1, Pavel Ptak 2 Abstract We ask which logics with a given centre allow for en

Quantum logics with given centres and variable state spaces Mirko Navara 1, Pavel Ptak 2 Abstract We ask which logics with a given centre allow for en Quantum logics with given centres and variable state spaces Mirko Navara 1, Pavel Ptak 2 Abstract We ask which logics with a given centre allow for enlargements with an arbitrary state space. We show in

More information

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1 PANEL DATA RANDOM AND FIXED EFFECTS MODEL Professor Menelaos Karanasos December 2011 PANEL DATA Notation y it is the value of the dependent variable for cross-section unit i at time t where i = 1,...,

More information

2 Section 2 However, in order to apply the above idea, we will need to allow non standard intervals ('; ) in the proof. More precisely, ' and may gene

2 Section 2 However, in order to apply the above idea, we will need to allow non standard intervals ('; ) in the proof. More precisely, ' and may gene Introduction 1 A dierential intermediate value theorem by Joris van der Hoeven D pt. de Math matiques (B t. 425) Universit Paris-Sud 91405 Orsay Cedex France June 2000 Abstract Let T be the eld of grid-based

More information

LINEAR EQUATIONS WITH UNKNOWNS FROM A MULTIPLICATIVE GROUP IN A FUNCTION FIELD. To Professor Wolfgang Schmidt on his 75th birthday

LINEAR EQUATIONS WITH UNKNOWNS FROM A MULTIPLICATIVE GROUP IN A FUNCTION FIELD. To Professor Wolfgang Schmidt on his 75th birthday LINEAR EQUATIONS WITH UNKNOWNS FROM A MULTIPLICATIVE GROUP IN A FUNCTION FIELD JAN-HENDRIK EVERTSE AND UMBERTO ZANNIER To Professor Wolfgang Schmidt on his 75th birthday 1. Introduction Let K be a field

More information

Average Reward Parameters

Average Reward Parameters Simulation-Based Optimization of Markov Reward Processes: Implementation Issues Peter Marbach 2 John N. Tsitsiklis 3 Abstract We consider discrete time, nite state space Markov reward processes which depend

More information

Economics 620, Lecture 18: Nonlinear Models

Economics 620, Lecture 18: Nonlinear Models Economics 620, Lecture 18: Nonlinear Models Nicholas M. Kiefer Cornell University Professor N. M. Kiefer (Cornell University) Lecture 18: Nonlinear Models 1 / 18 The basic point is that smooth nonlinear

More information

When is it really justifiable to ignore explanatory variable endogeneity in a regression model?

When is it really justifiable to ignore explanatory variable endogeneity in a regression model? Discussion Paper: 2015/05 When is it really justifiable to ignore explanatory variable endogeneity in a regression model? Jan F. Kiviet www.ase.uva.nl/uva-econometrics Amsterdam School of Economics Roetersstraat

More information

and the polynomial-time Turing p reduction from approximate CVP to SVP given in [10], the present authors obtained a n=2-approximation algorithm that

and the polynomial-time Turing p reduction from approximate CVP to SVP given in [10], the present authors obtained a n=2-approximation algorithm that Sampling short lattice vectors and the closest lattice vector problem Miklos Ajtai Ravi Kumar D. Sivakumar IBM Almaden Research Center 650 Harry Road, San Jose, CA 95120. fajtai, ravi, sivag@almaden.ibm.com

More information

ε ε

ε ε The 8th International Conference on Computer Vision, July, Vancouver, Canada, Vol., pp. 86{9. Motion Segmentation by Subspace Separation and Model Selection Kenichi Kanatani Department of Information Technology,

More information

ECONOMETRICS. Bruce E. Hansen. c2000, 2001, 2002, 2003, University of Wisconsin

ECONOMETRICS. Bruce E. Hansen. c2000, 2001, 2002, 2003, University of Wisconsin ECONOMETRICS Bruce E. Hansen c2000, 200, 2002, 2003, 2004 University of Wisconsin www.ssc.wisc.edu/~bhansen Revised: January 2004 Comments Welcome This manuscript may be printed and reproduced for individual

More information

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F

More information

Comprehensive Definitions of Breakdown-Points for Independent and Dependent Observations

Comprehensive Definitions of Breakdown-Points for Independent and Dependent Observations TI 2000-40/2 Tinbergen Institute Discussion Paper Comprehensive Definitions of Breakdown-Points for Independent and Dependent Observations Marc G. Genton André Lucas Tinbergen Institute The Tinbergen Institute

More information

On the asymptotic distribution of the Moran I test statistic withapplications

On the asymptotic distribution of the Moran I test statistic withapplications Journal of Econometrics 104 (2001) 219 257 www.elsevier.com/locate/econbase On the asymptotic distribution of the Moran I test statistic withapplications Harry H. Kelejian, Ingmar R. Prucha Department

More information

ELEMENTARY LINEAR ALGEBRA WITH APPLICATIONS. 1. Linear Equations and Matrices

ELEMENTARY LINEAR ALGEBRA WITH APPLICATIONS. 1. Linear Equations and Matrices ELEMENTARY LINEAR ALGEBRA WITH APPLICATIONS KOLMAN & HILL NOTES BY OTTO MUTZBAUER 11 Systems of Linear Equations 1 Linear Equations and Matrices Numbers in our context are either real numbers or complex

More information

1 Lyapunov theory of stability

1 Lyapunov theory of stability M.Kawski, APM 581 Diff Equns Intro to Lyapunov theory. November 15, 29 1 1 Lyapunov theory of stability Introduction. Lyapunov s second (or direct) method provides tools for studying (asymptotic) stability

More information

Economics 241B Review of Limit Theorems for Sequences of Random Variables

Economics 241B Review of Limit Theorems for Sequences of Random Variables Economics 241B Review of Limit Theorems for Sequences of Random Variables Convergence in Distribution The previous de nitions of convergence focus on the outcome sequences of a random variable. Convergence

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information

Generalized Method of Moments (GMM) Estimation

Generalized Method of Moments (GMM) Estimation Econometrics 2 Fall 2004 Generalized Method of Moments (GMM) Estimation Heino Bohn Nielsen of29 Outline of the Lecture () Introduction. (2) Moment conditions and methods of moments (MM) estimation. Ordinary

More information

Greene, Econometric Analysis (7th ed, 2012)

Greene, Econometric Analysis (7th ed, 2012) EC771: Econometrics, Spring 2012 Greene, Econometric Analysis (7th ed, 2012) Chapters 2 3: Classical Linear Regression The classical linear regression model is the single most useful tool in econometrics.

More information

Stochastic dominance with imprecise information

Stochastic dominance with imprecise information Stochastic dominance with imprecise information Ignacio Montes, Enrique Miranda, Susana Montes University of Oviedo, Dep. of Statistics and Operations Research. Abstract Stochastic dominance, which is

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

Specification testing in panel data models estimated by fixed effects with instrumental variables

Specification testing in panel data models estimated by fixed effects with instrumental variables Specification testing in panel data models estimated by fixed effects wh instrumental variables Carrie Falls Department of Economics Michigan State Universy Abstract I show that a handful of the regressions

More information

Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles

Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles Weihua Zhou 1 University of North Carolina at Charlotte and Robert Serfling 2 University of Texas at Dallas Final revision for

More information

On Coarse Geometry and Coarse Embeddability

On Coarse Geometry and Coarse Embeddability On Coarse Geometry and Coarse Embeddability Ilmari Kangasniemi August 10, 2016 Master's Thesis University of Helsinki Faculty of Science Department of Mathematics and Statistics Supervised by Erik Elfving

More information

ROYAL INSTITUTE OF TECHNOLOGY KUNGL TEKNISKA HÖGSKOLAN. Department of Signals, Sensors & Systems

ROYAL INSTITUTE OF TECHNOLOGY KUNGL TEKNISKA HÖGSKOLAN. Department of Signals, Sensors & Systems The Evil of Supereciency P. Stoica B. Ottersten To appear as a Fast Communication in Signal Processing IR-S3-SB-9633 ROYAL INSTITUTE OF TECHNOLOGY Department of Signals, Sensors & Systems Signal Processing

More information