Nonparametric Identification and Estimation of Nonclassical Errors-in-Variables Models Without Additional Information


Nonparametric Identification and Estimation of Nonclassical Errors-in-Variables Models Without Additional Information

Xiaohong Chen (Yale University), Yingyao Hu (Johns Hopkins University), Arthur Lewbel (Boston College)

First version: November 2006; revised July 2007.

Abstract. This paper considers identification and estimation of a nonparametric regression model with an unobserved discrete covariate. The sample consists of a dependent variable and a set of covariates, one of which is discrete and arbitrarily correlated with the unobserved covariate. The observed discrete covariate has the same support as the unobserved covariate, and can be interpreted as a proxy or mismeasure of the unobserved one, but with a nonclassical measurement error that has an unknown distribution. We obtain nonparametric identification of the model given monotonicity of the regression function and a rank condition that is directly testable given the data. Our identification strategy does not require additional sample information, such as instrumental variables or a secondary sample. We then estimate the model via the method of sieve maximum likelihood, and provide root-n asymptotic normality and semiparametric efficiency of smooth functionals of interest. Two small simulations are presented to illustrate the identification and the estimation results.

Keywords: Errors-in-variables (EIV); identification; nonclassical measurement error; nonparametric regression; sieve maximum likelihood.

We thank participants at the June 2007 North American Summer Meetings of the Econometric Society at Duke for helpful comments. Chen acknowledges support from the NSF. Author contacts: Department of Economics, Yale University, New Haven, CT, USA, xiaohong.chen@yale.edu; Department of Economics, Johns Hopkins University, Mergenthaler Hall, N. Charles Street, Baltimore, MD, USA, yhu@jhu.edu; Department of Economics, Boston College, Commonwealth Avenue, Chestnut Hill, MA, USA, lewbel@bc.edu.

1 INTRODUCTION

We consider identification and estimation of the nonparametric regression model

$$Y = m(X^*) + \varepsilon, \qquad E[\varepsilon \mid X^*] = 0, \qquad (1.1)$$

where $Y$ and $X^*$ are scalars and $X^*$ is not observed. We assume $X^*$ is discretely distributed, so for example $X^*$ could be categorical, qualitative, or count data. We observe a random sample of $Y$ and a scalar $X$, where $X$ could be arbitrarily correlated with the unobserved $X^*$, and $\varepsilon$ is independent of $X$ and $X^*$. The extension to $Y = m(X^*, W) + \varepsilon$, $E[\varepsilon \mid X^*, W] = 0$, where $W$ is an additional vector of observed error-free covariates, is immediate (and is included in the estimation section) because our assumptions and identification results for model (1.1) can all be restated as conditional upon $W$.

For convenience we will interpret $X$ as a measure of $X^*$ that is contaminated by measurement or misclassification error, but more generally $X^*$ could represent some latent, unobserved quantifiable discrete variable such as a health status or life expectancy quantile, and $X$ could be some observed proxy such as a body mass index quantile or the response to a health-related categorical survey question. We will require that $X$ and $X^*$ have the same support. Equation (1.1) can be interpreted as a latent factor model $Y = m^* + \varepsilon$, with two unobserved independent factors $m^*$ and $\varepsilon$, with identification based on observing the proxy $X$ and on existence of a measurable function $m(\cdot)$ such that $m^* = m(X^*)$. Regardless of whether $X$ is literally a flawed measure of $X^*$ or

just some related covariate, we may still define $X - X^*$ as a measurement error, though in general this error will be nonclassical, that is, it will have nonzero mean and will not be independent of $X^*$. Numerous data sets possess nonclassical measurement errors (see, e.g., Bound, Brown and Mathiowetz (2001) for a review); however, to the best of our knowledge, there is no published work that allows for nonparametric identification and estimation of nonparametric regression models with nonclassical measurement errors without parametric restrictions or additional sample information such as instrumental variables, repeated measurements, or validation data, which our paper provides. In short, we nonparametrically recover the conditional density $f_{Y|X^*}$ (or equivalently, the function $m$ and the density of $\varepsilon$) just from the observed joint distribution $f_{Y,X}$, while imposing minimal restrictions on the joint distribution $f_{X,X^*}$. The relationship between the latent model $f_{Y|X^*}$ and the observed density $f_{Y,X}$ is

$$f_{Y,X}(y, x) = \int f_{Y|X^*}(y|x^*) f_{X,X^*}(x, x^*)\, dx^*. \qquad (1.2)$$

All the existing published papers for identifying the latent model $f_{Y|X^*}$ use one of three methods: i) assuming the measurement error structure $f_{X|X^*}$ belongs to a parametric family (see, e.g., Fuller (1987), Bickel and Ritov (1987), Hsiao (1991), Murphy and Van der Vaart (1996), Wang, Lin, Gutierrez and Carroll (1998), Liang, Hardle and Carroll (1999), Taupin (2001), Hong and Tamer (2003), Liang and Wang (2005)); ii) assuming there exists an additional exogenous variable $Z$ in the sample (such as an instrument or a repeated measure) that does not enter the latent model $f_{Y|X^*}$, and exploiting assumed restrictions on $f_{Y|X^*,Z}$ and $f_{X,X^*,Z}$ to identify $f_{Y|X^*}$ given the joint distribution of $\{Y, X, Z\}$ (see, e.g., Hausman,

Ichimura, Newey and Powell (1991), Li and Vuong (1998), Li (2002), Wang (2004), Schennach (2004), Carroll, Ruppert, Crainiceanu, Tosteson and Karagas (2004), Mahajan (2006), Lewbel (2007), Hu (2006) and Hu and Schennach (2006)); or iii) assuming a secondary sample that provides information on $f_{X,X^*}$ to permit recovery of $f_{Y|X^*}$ from the observed $f_{Y,X}$ in the primary sample (see, e.g., Carroll and Stefanski (1990), Lee and Sepanski (1995), Chen, Hong, and Tamer (2005), Chen, Hong, and Tarozzi (2007), Hu and Ridder (2006)). Detailed reviews of existing approaches and results can be found in several recent books and surveys on measurement error models; see, e.g., Carroll, Ruppert, Stefanski and Crainiceanu (2006), Chen, Hong and Nekipelov (2007), Bound, Brown and Mathiowetz (2001), Wansbeek and Meijer (2000), and Cheng and Van Ness (1999).

In this paper, we obtain identification by exploiting nonparametric features of the latent model $f_{Y|X^*}$, such as independence of the regression error term $\varepsilon$. Our results are useful because many applications specify the latent model of interest $f_{Y|X^*}$, while often little is known about $f_{X|X^*}$, that is, about the nature of the measurement error or the exact relationship between the unobserved latent $X^*$ and a proxy $X$. In addition, our key rank condition for identification is directly testable from the data.

Our identification method is based on characteristic functions. Suppose $X$ and $X^*$ have support $\mathcal{X} = \{1, 2, \ldots, J\}$. Then by equation (1.1), $\exp(itY) = \exp(it\varepsilon) \sum_{j=1}^{J} 1(X^* = j) \exp[im(j)t]$ for any given constant $t$, where $1(\cdot)$ is the indicator function that equals one if its argument is true and zero otherwise. This equation and independence of $\varepsilon$ yield the moments

$$E[\exp(itY) \mid X = x] = E[\exp(it\varepsilon)] \sum_{x^*=1}^{J} f_{X^*|X}(x^*|x) \exp[im(x^*)t]. \qquad (1.3)$$
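To fix ideas, the left-hand side of (1.3) is just the empirical conditional characteristic function of $Y$ given each proxy value. Below is a minimal numpy sketch of that sample moment; the arrays `y` and `x` and the grid `ts` are hypothetical placeholders, not objects from the paper.

```python
import numpy as np

def ecf_given_x(y, x, ts, J):
    """Empirical counterpart of E[exp(itY) | X = j] for each t in ts and j = 1..J."""
    out = np.empty((len(ts), J), dtype=complex)   # rows: t values, columns: proxy values
    for col, j in enumerate(range(1, J + 1)):
        yj = y[x == j]                            # observations with proxy value j
        for row, t in enumerate(ts):
            out[row, col] = np.mean(np.exp(1j * t * yj))
    return out

# hypothetical usage, with (y, x) an observed sample and a 4-point latent support:
# moments = ecf_given_x(y, x, ts=np.linspace(0.1, 1.0, 10), J=4)
```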

Evaluating equation (1.3) for $t \in \{t_1, \ldots, t_K\}$ and $x \in \{1, 2, \ldots, J\}$ provides $KJ$ equations in $J^2 + J + K$ unknown constants. These unknown constants are the values of $f_{X^*|X}(x^*|x)$, $m(x^*)$, and $E[\exp(it\varepsilon)]$ for $t \in \{t_1, \ldots, t_K\}$, $x^* \in \{1, \ldots, J\}$, and $x \in \{1, \ldots, J\}$. Given a large enough value of $K$, these moments provide more equations than unknowns. We provide sufficient regularity assumptions to ensure existence of some set of constants $\{t_1, \ldots, t_K\}$ such that these equations do not have multiple solutions, and the resulting unique solution to these equations provides identification of $m(\cdot)$ and of $f_{X,X^*}$, and hence of $f_{Y|X^*}$.

As equation (1.3) shows, our identification results depend on many moments of $Y$, or equivalently of $\varepsilon$, rather than just on the conditional mean restriction $E[\varepsilon|X^*] = 0$ that would suffice for identification if $X^*$ were observed. Previous results exist that obtain identification based on higher moments as we do (without instruments, repeated measures, or validation data), for example Reiersol (1950), Kendall and Stuart (1979), and Lewbel (1997), but all of these previous results assume either classical measurement error and/or parametric restrictions. Equation (1.3) implies that independence of the regression error $\varepsilon$ is actually stronger than necessary for identification, since, e.g., we would obtain the same equations used for identification if we only had $\exp(it\varepsilon) \perp X \mid X^*$ for $t \in \{t_1, \ldots, t_K\}$. To illustrate this point further, we provide an alternative identification result for the dichotomous $X^*$ case without the independence assumption, and in this case we can solve for all the unknowns in closed form.

Estimation could be based directly on equation (1.3) using, for example, Hansen's (1982) Generalized Method of Moments (GMM). However, this would require knowing or choosing the constants $t_1, \ldots, t_K$. Moreover, under the independence assumption of $\varepsilon$ and $X^*$, we

have potentially infinitely many constants $t$ that solve equation (1.3); hence GMM estimation using finitely many such $t$'s will not be efficient in general. One could apply the infinite-dimensional Z-estimation described in Van der Vaart and Wellner (1996); here we instead apply the method of sieve Maximum Likelihood (ML) of Grenander (1981) and Geman and Hwang (1982), which does not require knowing or choosing the constants $t_1, \ldots, t_K$, and easily allows for an additional vector of error-free covariates $W$. The sieve ML estimator essentially replaces the unknown functions $f_\varepsilon$, $m$, and $f_{X^*|X,W}$ with polynomials, Fourier series, splines, wavelets, or other sieve approximators, and estimates the parameters of these approximations by maximum likelihood. By simple applications of the general theory on sieve MLE developed in Wong and Shen (1995), Shen (1997), Van de Geer (1993, 2000) and others, we provide consistency and a convergence rate of the sieve MLE, and root-n asymptotic normality and semiparametric efficiency of smooth functionals of interest, such as weighted average derivatives of the latent nonparametric regression function $m(X^*, W)$, or the finite-dimensional parameters $\theta$ in a semiparametric specification $m(X^*, W; \theta)$.

The rest of this paper is organized as follows. Section 2 provides the identification results. Section 3 describes the sieve ML estimator and presents its large sample properties. Section 4 provides two small simulation studies. Section 5 briefly concludes, and all the proofs are in the appendix.

2 NONPARAMETRIC IDENTIFICATION

Our basic nonparametric regression model is equation (1.1) with scalar $Y$ and $X^* \in \mathcal{X}^* = \{1, 2, \ldots, J\}$. We observe a random sample of $(X, Y) \in \mathcal{X} \times \mathcal{Y}$, where $X$ is a proxy of $X^*$. The

goal is to consider restrictions on the latent model $f_{Y|X^*}$ that suffice to nonparametrically identify $f_{Y|X^*}$ and $f_{X^*|X}$ from $f_{Y|X}$.

Assumption 2.1. $X \perp \varepsilon \mid X^*$.

This assumption implies that the measurement error $X - X^*$ is independent of the dependent variable $Y$ conditional on the true value $X^*$. In other words, we have $f_{Y|X^*,X}(y|x^*,x) = f_{Y|X^*}(y|x^*)$ for all $(x, x^*, y) \in \mathcal{X} \times \mathcal{X}^* \times \mathcal{Y}$.

Assumption 2.2. $X^* \perp \varepsilon$.

This assumption implies that the regression error $\varepsilon$ is independent of the regressor $X^*$, so $f_{Y|X^*}(y|x^*) = f_\varepsilon(y - m(x^*))$. The relationship between the observed density and the latent ones is then

$$f_{Y,X}(y, x) = \sum_{x^*=1}^{J} f_\varepsilon(y - m(x^*)) f_{X,X^*}(x, x^*). \qquad (2.1)$$

Assumption 2.2 rules out heteroskedasticity or other heterogeneity of the regression error $\varepsilon$, but allows its density $f_\varepsilon$ to be completely unknown and nonparametric. Let $\phi$ denote a characteristic function (ch.f.). Equation (2.1) is equivalent to

$$\phi_{Y,X=x}(t) = \phi_\varepsilon(t) \sum_{x^*=1}^{J} \exp(itm(x^*)) f_{X,X^*}(x, x^*), \qquad (2.2)$$

for all real-valued $t$, where $\phi_{Y,X=x}(t) = \int \exp(ity) f_{Y,X}(y, x)\, dy$ and $x \in \mathcal{X}$. Since $\varepsilon$ may not be symmetric, $\phi_\varepsilon(t) = \int \exp(it\varepsilon) f_\varepsilon(\varepsilon)\, d\varepsilon$ need not be real-valued. We therefore write

$$\phi_\varepsilon(t) \equiv \nu_\varepsilon(t) \exp(i a_\varepsilon(t)),$$

where

$$\nu_\varepsilon(t) \equiv \sqrt{\mathrm{Re}\{\phi_\varepsilon(t)\}^2 + \mathrm{Im}\{\phi_\varepsilon(t)\}^2}, \qquad a_\varepsilon(t) \equiv \arccos\left(\frac{\mathrm{Re}\{\phi_\varepsilon(t)\}}{\nu_\varepsilon(t)}\right).$$

We then have, for any real-valued scalar $t$,

$$\phi_{Y,X=x}(t) = \nu_\varepsilon(t) \sum_{x^*=1}^{J} \exp(itm(x^*) + i a_\varepsilon(t)) f_{X,X^*}(x, x^*). \qquad (2.3)$$

Define

$$F_{X,X^*} = \begin{pmatrix} f_{X,X^*}(1,1) & f_{X,X^*}(1,2) & \cdots & f_{X,X^*}(1,J) \\ f_{X,X^*}(2,1) & f_{X,X^*}(2,2) & \cdots & f_{X,X^*}(2,J) \\ \vdots & \vdots & & \vdots \\ f_{X,X^*}(J,1) & f_{X,X^*}(J,2) & \cdots & f_{X,X^*}(J,J) \end{pmatrix},$$

and, for any real-valued vector $t = (0, t_2, \ldots, t_J)$, define

$$\Phi_{Y,X}(t) = \begin{pmatrix} f_X(1) & \phi_{Y,X=1}(t_2) & \cdots & \phi_{Y,X=1}(t_J) \\ f_X(2) & \phi_{Y,X=2}(t_2) & \cdots & \phi_{Y,X=2}(t_J) \\ \vdots & \vdots & & \vdots \\ f_X(J) & \phi_{Y,X=J}(t_2) & \cdots & \phi_{Y,X=J}(t_J) \end{pmatrix}, \qquad D_{|\phi_\varepsilon|}(t) = \mathrm{Diag}\big(1, \nu_\varepsilon(t_2), \ldots, \nu_\varepsilon(t_J)\big),$$

and define $m_j = m(j)$ for $j = 1, 2, \ldots, J$, and

$$\Phi_{m,a}(t) = \begin{pmatrix} 1 & \exp(it_2 m_1 + i a_\varepsilon(t_2)) & \cdots & \exp(it_J m_1 + i a_\varepsilon(t_J)) \\ 1 & \exp(it_2 m_2 + i a_\varepsilon(t_2)) & \cdots & \exp(it_J m_2 + i a_\varepsilon(t_J)) \\ \vdots & \vdots & & \vdots \\ 1 & \exp(it_2 m_J + i a_\varepsilon(t_2)) & \cdots & \exp(it_J m_J + i a_\varepsilon(t_J)) \end{pmatrix}.$$

With this matrix notation, for any real-valued vector $t$ equation (2.3) is equivalent to

$$\Phi_{Y,X}(t) = F_{X,X^*} \Phi_{m,a}(t) D_{|\phi_\varepsilon|}(t). \qquad (2.4)$$

Equation (2.4) relates the known parameters $\Phi_{Y,X}(t)$ (which may be interpreted as reduced form parameters of the model) to the unknown structural parameters $F_{X,X^*}$, $\Phi_{m,a}(t)$, and $D_{|\phi_\varepsilon|}(t)$. Equation (2.4) provides a sufficient number of equality constraints to identify the structural parameters given the reduced form parameters, so what is required are sufficient invertibility or rank restrictions to rule out multiple solutions of these equations. To provide these conditions, consider both the real and imaginary parts of $\Phi_{Y,X}(t)$. Since $D_{|\phi_\varepsilon|}(t)$ is real by definition, we have

$$\mathrm{Re}\{\Phi_{Y,X}(t)\} = F_{X,X^*} \mathrm{Re}\{\Phi_{m,a}(t)\} D_{|\phi_\varepsilon|}(t), \qquad (2.5)$$

and

$$\mathrm{Im}\{\Phi_{Y,X}(t)\} = F_{X,X^*} \mathrm{Im}\{\Phi_{m,a}(t)\} D_{|\phi_\varepsilon|}(t). \qquad (2.6)$$

The matrices $\mathrm{Im}\{\Phi_{Y,X}(t)\}$ and $\mathrm{Im}\{\Phi_{m,a}(t)\}$ are not invertible because their first columns are zeros, so we replace equation (2.6) with

$$\big(\mathrm{Im}\{\Phi_{Y,X}(t)\} + \Delta_X\big) = F_{X,X^*} \big(\mathrm{Im}\{\Phi_{m,a}(t)\} + \Delta_1\big) D_{|\phi_\varepsilon|}(t), \qquad (2.7)$$

where

$$\Delta_X = \begin{pmatrix} f_X(1) & 0 & \cdots & 0 \\ f_X(2) & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ f_X(J) & 0 & \cdots & 0 \end{pmatrix} \qquad \text{and} \qquad \Delta_1 = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 0 \end{pmatrix}.$$

Equation (2.7) holds because $F_{X,X^*} \Delta_1 = \Delta_X$ and $\Delta_1 D_{|\phi_\varepsilon|}(t) = \Delta_1$. Let $C_t \equiv (\mathrm{Re}\{\Phi_{Y,X}(t)\})^{-1} (\mathrm{Im}\{\Phi_{Y,X}(t)\} + \Delta_X)$.

Assumption 2.3 (rank). There is a real-valued vector $t = (0, t_2, \ldots, t_J)$ such that: (i) $\mathrm{Re}\{\Phi_{Y,X}(t)\}$ and $(\mathrm{Im}\{\Phi_{Y,X}(t)\} + \Delta_X)$ are invertible; (ii) for any real-valued $J \times J$ diagonal matrices $D_k = \mathrm{Diag}(0, d_{k,2}, \ldots, d_{k,J})$, $k = 1, 2$, if $D_1 + C_t D_1 C_t + D_2 C_t - C_t D_2 = 0$ then $D_k = 0$ for $k = 1, 2$.

We call Assumption 2.3 the rank condition, because it is analogous to the rank condition for identification in linear models; in particular it implies identification of the two diagonal matrices

$$D_{\ln|\phi_\varepsilon|}(t) = \mathrm{Diag}\left(0, \frac{\partial}{\partial t} \ln \nu_\varepsilon(t)\Big|_{t_2}, \ldots, \frac{\partial}{\partial t} \ln \nu_\varepsilon(t)\Big|_{t_J}\right) \quad \text{and} \quad D_a(t) = \mathrm{Diag}\left(0, \frac{\partial}{\partial t} a_\varepsilon(t)\Big|_{t_2}, \ldots, \frac{\partial}{\partial t} a_\varepsilon(t)\Big|_{t_J}\right).$$

Assumption 2.3(ii) is rather complicated, but it can be replaced by some simpler sufficient alternatives, which we describe later. Note also that the rank condition, Assumption 2.3, is testable, since it is expressed entirely in terms of $f_X$ and the matrix $\Phi_{Y,X}(t)$, which, given a vector $t$, can be directly estimated from the data.
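Because Assumption 2.3 involves only $f_X$ and $\Phi_{Y,X}(t)$, its part (i) can be checked on a sample. The sketch below forms the sample analogue of $\Phi_{Y,X}(t)$ and $C_t$ for a candidate vector `t_vec` (with first entry zero); the condition-number threshold is an arbitrary illustrative choice, not a rule from the paper.

```python
import numpy as np

def phi_matrix(y, x, t_vec, J):
    """Sample analogue of Phi_{Y,X}(t): first column is f_X(j) (since t_1 = 0),
    remaining columns are the joint empirical c.f. phi_{Y,X=j}(t_k)."""
    Phi = np.empty((J, J), dtype=complex)
    for j in range(1, J + 1):
        mask = (x == j)
        Phi[j - 1, 0] = mask.mean()                 # f_X(j) = phi_{Y,X=j}(0)
        for k, t in enumerate(t_vec[1:], start=1):
            Phi[j - 1, k] = np.mean(np.exp(1j * t * y) * mask)
    return Phi

def check_rank_condition_i(Phi):
    """Assumption 2.3(i): Re(Phi) and Im(Phi) + Delta_X invertible; also returns C_t."""
    J = Phi.shape[0]
    DeltaX = np.zeros((J, J))
    DeltaX[:, 0] = Phi[:, 0].real                   # first column of Phi is f_X
    R, I = Phi.real, Phi.imag + DeltaX
    ok = (np.linalg.cond(R) < 1e8) and (np.linalg.cond(I) < 1e8)
    C_t = np.linalg.solve(R, I)
    return ok, C_t
```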

In the appendix, we show that

$$\big(\mathrm{Re}\,\Phi_{Y,X}(t)\big) A_t \big(\mathrm{Re}\,\Phi_{Y,X}(t)\big)^{-1} = F_{X|X^*} D_m F_{X|X^*}^{-1}, \qquad (2.8)$$

where $A_t$ on the left-hand side is identified once $D_{\ln|\phi_\varepsilon|}(t)$ and $D_a(t)$ are identified, $D_m = \mathrm{Diag}(m(1), \ldots, m(J))$, and

$$F_{X|X^*} = \begin{pmatrix} f_{X|X^*}(1|1) & f_{X|X^*}(1|2) & \cdots & f_{X|X^*}(1|J) \\ f_{X|X^*}(2|1) & f_{X|X^*}(2|2) & \cdots & f_{X|X^*}(2|J) \\ \vdots & \vdots & & \vdots \\ f_{X|X^*}(J|1) & f_{X|X^*}(J|2) & \cdots & f_{X|X^*}(J|J) \end{pmatrix}.$$

Equation (2.8) implies that $f_{X|X^*}(\cdot|x^*)$ and $m(x^*)$ are eigenfunctions and eigenvalues of an identified $J \times J$ matrix on the left-hand side. We may then identify $f_{X|X^*}(\cdot|x^*)$ and $m(x^*)$ under the following assumption:

Assumption 2.4. (i) $m(x^*) < \infty$ and $m(x^*) \neq 0$ for all $x^* \in \mathcal{X}^*$; (ii) $m(x^*)$ is strictly increasing in $x^* \in \mathcal{X}^*$.

Assumption 2.4(i) implies that each possible value of $X^*$ is relevant for $Y$, and the monotonicity assumption 2.4(ii) allows us to assign each eigenvalue $m(x^*)$ to its corresponding value $x^*$. If we only wish to identify the support of the latent factor $m^* = m(X^*)$, and not the regression function $m(\cdot)$ itself, then this monotonicity assumption can be dropped. Assumption 2.4 could be replaced by restrictions on $f_{X,X^*}$ instead (e.g., by exploiting knowledge about the eigenfunctions rather than the eigenvalues to properly assign each $m(x^*)$

to its corresponding value $x^*$), but assumption 2.4 is more in line with our other assumptions, which assume that we have information about our regression model but know very little about the relationship of the unobserved $X^*$ to the proxy $X$.

Theorem 2.1. Suppose that assumptions 2.1, 2.2, 2.3, and 2.4 hold in equation (1.1). Then the density $f_{Y,X}$ uniquely determines $f_{Y|X^*}$ and $f_{X,X^*}$.

Given our model, defined by assumptions 2.1 and 2.2, Theorem 2.1 shows that assumptions 2.3 and 2.4 guarantee that the sample of $(Y, X)$ is informative enough to identify $\phi_\varepsilon$, $m(x^*)$, and $f_{X,X^*}$ in equation (2.2), meaning that all the unknowns in equation (2.1) are nonparametrically identified. This identification is obtained without additional sample information such as an instrumental variable or a secondary sample. Of course, if we do have additional covariates such as instruments or repeated measures, they could be exploited along with Theorem 2.1. Our results can also be immediately applied if we observe an additional covariate vector $W$ that appears in the regression function, so $Y = m(X^*, W) + \varepsilon$, since our assumptions and results can all be restated as conditioned upon $W$.

Now consider some simpler sufficient conditions for assumption 2.3(ii) in Theorem 2.1. Denote $C_t^T$ as the transpose of $C_t$. Let the notation "$\circ$" stand for the Hadamard product, i.e., the element-wise product of two matrices.

Assumption 2.5. The real-valued vector $t = (0, t_2, \ldots, t_J)$ satisfying assumption 2.3(i) also satisfies: (i) $C_t \circ C_t^T + I$ is invertible; (ii) all the entries in the first row of the matrix $C_t$ are nonzero.

Assumption 2.5 implies assumption 2.3(ii), and is in fact stronger than assumption 2.3(ii), since if it holds then we may explicitly solve for $D_{\ln|\phi_\varepsilon|}(t)$ and $D_a(t)$ in simple closed form.
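Numerically, the constructive content of Theorem 2.1 via equation (2.8) reduces to an eigendecomposition. A minimal sketch, assuming the identified matrix on the left-hand side of (2.8) (here `LHS`) has already been estimated from the data:

```python
import numpy as np

def recover_m_and_misclassification(LHS):
    """Eigendecomposition of (Re Phi) A_t (Re Phi)^{-1} = F_{X|X*} D_m F_{X|X*}^{-1}.
    Eigenvalues are m(1) < ... < m(J); each eigenvector, rescaled so its entries
    sum to one, is the corresponding column f_{X|X*}(.|x*)."""
    eigvals, eigvecs = np.linalg.eig(LHS)
    order = np.argsort(eigvals.real)            # monotonicity (Assumption 2.4) pins down labels
    m = eigvals.real[order]
    F = eigvecs.real[:, order]
    F = F / F.sum(axis=0, keepdims=True)        # columns are conditional densities
    return m, F
```

The sort step is exactly where Assumption 2.4(ii) does its work: without monotonicity, the eigenvalues identify the support of $m^*$ but not which latent value each belongs to.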

Another alternative to assumption 2.3(ii) is the following:

Assumption 2.6 (symmetric rank). $a_\varepsilon(t) = 0$ for all $t$, and for any real-valued $J \times J$ diagonal matrix $D_1 = \mathrm{Diag}(0, d_{1,2}, \ldots, d_{1,J})$, if $D_1 + C_t D_1 C_t = 0$ then $D_1 = 0$.

The condition in assumption 2.6 that $a_\varepsilon(t) = 0$ for all $t$ is the same as assuming that the distribution of the error term $\varepsilon$ is symmetric. We call assumption 2.6 the symmetric rank condition because it implies our previous rank condition when $\varepsilon$ is symmetrically distributed.

Finally, as noted in the introduction, the assumption that the measurement error is independent of the regression error, assumption 2.2, is stronger than necessary. All that independence is used for is to obtain equation (1.3) for some given values of $t$. More formally, all that is required is that equation (2.4), and hence equations (2.6) and (2.7), hold for the vector $t$ in assumption 2.3. When there are covariates $W$ in the regression model, which we will use in the estimation, the requirement becomes that equation (2.4) hold for the vector $t$ in assumption 2.3 conditional on $W$. Therefore, Theorem 2.1 holds replacing assumption 2.2 with the following, strictly weaker assumption.

Assumption 2.7. For the known $t = (0, t_2, \ldots, t_J)$ that satisfies assumption 2.3, $\phi_{\varepsilon|X^*=x^*}(t_j) = \phi_{\varepsilon|X^*=1}(t_j)$ and $\frac{\partial}{\partial t}\phi_{\varepsilon|X^*=x^*}(t_j) = \frac{\partial}{\partial t}\phi_{\varepsilon|X^*=1}(t_j)$ for all $x^* \in \mathcal{X}^*$ and $j = 2, \ldots, J$.

This condition permits some correlation of the proxy $X$ with the regression error $\varepsilon$, and allows some moments of $\varepsilon$ to correlate with $X^*$.

2.1 THE DICHOTOMOUS CASE

We now show how the assumptions required for Theorem 2.1 can be relaxed and simplified in the special case where $X^*$ is a 0-1 dichotomous variable, i.e., $\mathcal{X}^* = \{0, 1\}$. Define $m_j = m(j)$

for $j = 0, 1$. Given just assumption 2.1, the relationship between the observed density and the latent ones becomes

$$f_{Y|X}(y|j) = f_{X^*|X}(0|j) f_{\varepsilon|X^*}(y - m_0 \mid 0) + f_{X^*|X}(1|j) f_{\varepsilon|X^*}(y - m_1 \mid 1) \quad \text{for } j = 0, 1. \qquad (2.9)$$

With assumption 2.2, equation (2.9) simplifies to

$$f_{Y|X}(y|j) = f_{X^*|X}(0|j) f_\varepsilon(y - m_0) + f_{X^*|X}(1|j) f_\varepsilon(y - m_1) \quad \text{for } j = 0, 1, \qquad (2.10)$$

which says that the observed density $f_{Y|X}(y|j)$ is a mixture of two distributions that only differ in their means. Studies on mixture models focus on parametric or nonparametric restrictions on $f_\varepsilon$ for a single value of $j$ that suffice to identify all the unknowns in this equation. For example, Bordes, Mottelet and Vandekerkhove (2006) show that all the unknowns in equation (2.10) are identified for each $j$ when the distribution of $\varepsilon$ is symmetric. In contrast, errors-in-variables models typically impose restrictions on $f_{X^*|X}$ (or exploit additional information regarding $f_{X^*|X}$, such as instruments or validation data) along with equation (2.9) or (2.10) to obtain identification with few restrictions on the distribution $f_\varepsilon$.

Now consider assumptions 2.3 or 2.5 in the dichotomous case. We then have, for any real-valued 2-vector $t = (0, t)$,

$$\Phi_{Y,X}(t) = \begin{pmatrix} f_X(0) & \phi_{Y|X=0}(t) f_X(0) \\ f_X(1) & \phi_{Y|X=1}(t) f_X(1) \end{pmatrix},$$

so that

$$\mathrm{Re}\{\Phi_{Y,X}(t)\} = \begin{pmatrix} f_X(0) & \mathrm{Re}\,\phi_{Y|X=0}(t)\, f_X(0) \\ f_X(1) & \mathrm{Re}\,\phi_{Y|X=1}(t)\, f_X(1) \end{pmatrix}, \qquad \det\big(\mathrm{Re}\{\Phi_{Y,X}(t)\}\big) = f_X(0) f_X(1) \big[\mathrm{Re}\,\phi_{Y|X=1}(t) - \mathrm{Re}\,\phi_{Y|X=0}(t)\big],$$

$$\mathrm{Im}\{\Phi_{Y,X}(t)\} + \Delta_X = \begin{pmatrix} f_X(0) & \mathrm{Im}\,\phi_{Y|X=0}(t)\, f_X(0) \\ f_X(1) & \mathrm{Im}\,\phi_{Y|X=1}(t)\, f_X(1) \end{pmatrix}, \qquad \det\big(\mathrm{Im}\{\Phi_{Y,X}(t)\} + \Delta_X\big) = f_X(0) f_X(1) \big[\mathrm{Im}\,\phi_{Y|X=1}(t) - \mathrm{Im}\,\phi_{Y|X=0}(t)\big].$$

Also,

$$C_t \equiv \big(\mathrm{Re}\{\Phi_{Y,X}(t)\}\big)^{-1} \big(\mathrm{Im}\{\Phi_{Y,X}(t)\} + \Delta_X\big) = \begin{pmatrix} 1 & \dfrac{f_X(0) f_X(1)\big[\mathrm{Im}\,\phi_{Y|X=0}(t)\,\mathrm{Re}\,\phi_{Y|X=1}(t) - \mathrm{Re}\,\phi_{Y|X=0}(t)\,\mathrm{Im}\,\phi_{Y|X=1}(t)\big]}{\det(\mathrm{Re}\{\Phi_{Y,X}(t)\})} \\ 0 & \dfrac{\det(\mathrm{Im}\{\Phi_{Y,X}(t)\} + \Delta_X)}{\det(\mathrm{Re}\{\Phi_{Y,X}(t)\})} \end{pmatrix},$$

thus

$$C_t \circ C_t^T + I = \mathrm{Diag}\left(2,\; 1 + \left[\frac{\det(\mathrm{Im}\{\Phi_{Y,X}(t)\} + \Delta_X)}{\det(\mathrm{Re}\{\Phi_{Y,X}(t)\})}\right]^2\right),$$

which is always invertible. Therefore, for the dichotomous case, assumption 2.3 and assumption 2.5 become the same, and can be expressed as the following rank condition for binary data:

Assumption 2.8 (binary rank). (i) $f_X(0) f_X(1) > 0$; (ii) there exists a real-valued scalar $t$ such that $\mathrm{Re}\,\phi_{Y|X=0}(t) \neq \mathrm{Re}\,\phi_{Y|X=1}(t)$, $\mathrm{Im}\,\phi_{Y|X=0}(t) \neq \mathrm{Im}\,\phi_{Y|X=1}(t)$, and $\mathrm{Im}\,\phi_{Y|X=0}(t)\,\mathrm{Re}\,\phi_{Y|X=1}(t) \neq \mathrm{Re}\,\phi_{Y|X=0}(t)\,\mathrm{Im}\,\phi_{Y|X=1}(t)$.

It should generally be easy to find a real-valued scalar $t$ that satisfies this binary rank condition. In the dichotomous case, instead of imposing Assumption 2.4, we may obtain the ordering of the $m_j$ from that of the observed $\mu_j \equiv E(Y|X=j)$ under the following assumption:

Assumption 2.9. (i) $\mu_1 > \mu_0$; (ii) $f_{X^*|X}(1|0) + f_{X^*|X}(0|1) < 1$.

Assumption 2.9(i) is not restrictive because one can always redefine $X$ as $1 - X$ if needed. Assumption 2.9(ii) reveals the ordering of $m_0$ and $m_1$, by making it the same as that of $\mu_0$ and $\mu_1$, because

$$\mu_1 - \mu_0 = \big[1 - f_{X^*|X}(1|0) - f_{X^*|X}(0|1)\big](m_1 - m_0),$$

so $m_1 > m_0$. Assumption 2.9(ii) says that the sum of misclassification probabilities is less than one, meaning that, on average, the observations $X$ are more accurate predictions of $X^*$ than pure guesses. See Lewbel (2007) for further discussion of this assumption. The following corollary is a direct application of Theorem 2.1; hence we omit its proof.

Corollary 2.2. Suppose that $\mathcal{X}^* = \{0, 1\}$, equations (1.1) and (2.10), and assumptions 2.8 and 2.9 hold. Then the density $f_{Y,X}$ uniquely determines $f_{Y|X^*}$ and $f_{X,X^*}$.
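The binary rank condition is a set of simple inequality checks on the two empirical conditional characteristic functions. A short sketch (the arrays `y`, `x` are hypothetical; `np.isclose` tolerances stand in for a formal test):

```python
import numpy as np

def binary_rank_ok(y, x, t):
    """Check Assumption 2.8(ii) at a scalar t: the two conditional e.c.f.'s must
    differ in real part, in imaginary part, and in the cross product Im*Re - Re*Im."""
    phi0 = np.mean(np.exp(1j * t * y[x == 0]))
    phi1 = np.mean(np.exp(1j * t * y[x == 1]))
    c1 = not np.isclose(phi0.real, phi1.real)
    c2 = not np.isclose(phi0.imag, phi1.imag)
    c3 = not np.isclose(phi0.imag * phi1.real, phi0.real * phi1.imag)
    return c1 and c2 and c3
```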

2.1.1 IDENTIFICATION WITHOUT INDEPENDENCE

We now show how to obtain identification in the dichotomous case without the independent regression error assumption 2.2. Given just assumption 2.1, equation (2.9) implies that the observed density $f_{Y|X}(y|j)$ is a mixture of the two conditional densities $f_{\varepsilon|X^*}(y - m_0 \mid 0)$ and $f_{\varepsilon|X^*}(y - m_1 \mid 1)$. Instead of assuming independence between $X^*$ and $\varepsilon$, we impose the following weaker assumption:

Assumption 2.10. $E(\varepsilon^k \mid X^*) = E(\varepsilon^k)$ for $k = 2, 3$.

Only these two moment restrictions are needed because we only need to solve for two unknowns, $m_0$ and $m_1$. Identification could also be obtained using other, similar restrictions such as quantiles or modes. For example, one of the moments in assumption 2.10 might be replaced with the assumption that the density $f_{\varepsilon|X^*=0}$ has zero median. Equation (2.9) then implies that

$$0.5 = \frac{\mu_1 - m_0}{\mu_1 - \mu_0} \int_{-\infty}^{m_0} f_{Y|X=0}(y)\, dy - \frac{\mu_0 - m_0}{\mu_1 - \mu_0} \int_{-\infty}^{m_0} f_{Y|X=1}(y)\, dy,$$

which may uniquely identify $m_0$ under some testable assumptions. An advantage of Assumption 2.10 is that we obtain a closed-form solution for $m_0$ and $m_1$. Define

$$v_j \equiv E\big[(Y - \mu_j)^2 \mid X = j\big], \qquad s_j \equiv E\big[(Y - \mu_j)^3 \mid X = j\big],$$

$$C_1 \equiv \frac{(v_1 + \mu_1^2) - (v_0 + \mu_0^2)}{\mu_1 - \mu_0}, \qquad C_2 \equiv \frac{1}{2}\left[\frac{s_1 - s_0}{\mu_1 - \mu_0} - C_1^2 + 3C_1(\mu_1 + \mu_0) - 2(\mu_1^2 + \mu_1\mu_0 + \mu_0^2)\right].$$

We leave the detailed proof to the appendix and present the result as follows:

Theorem 2.3. Suppose that $\mathcal{X}^* = \{0, 1\}$, equations (1.1) and (2.9), and assumptions 2.9 and 2.10 hold. Then the density $f_{Y|X}$ uniquely determines $f_{Y|X^*}$ and $f_{X^*|X}$. To be specific, we have

$$m_0 = \frac{C_1}{2} - \sqrt{\frac{C_1^2}{4} - C_2}, \qquad m_1 = \frac{C_1}{2} + \sqrt{\frac{C_1^2}{4} - C_2},$$

$$f_{X^*|X}(1|0) = \frac{\mu_0 - m_0}{\sqrt{C_1^2 - 4C_2}}, \qquad f_{X^*|X}(0|1) = \frac{m_1 - \mu_1}{\sqrt{C_1^2 - 4C_2}},$$

and

$$f_{\varepsilon|X^*=j}(y) = \frac{\mu_1 - m_j}{\mu_1 - \mu_0}\, f_{Y|X=0}(y + m_j) + \frac{m_j - \mu_0}{\mu_1 - \mu_0}\, f_{Y|X=1}(y + m_j), \qquad j = 0, 1.$$
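Theorem 2.3 is fully constructive, so a direct sample analogue is immediate: plug empirical conditional moments into the closed forms above. A sketch, using the expressions for $C_1$ and $C_2$ given before the theorem (the arrays `y`, `x` are hypothetical):

```python
import numpy as np

def closed_form_binary(y, x):
    """Closed-form estimates of (m0, m1) and the misclassification probabilities
    from conditional moments, following Theorem 2.3 (dichotomous X*)."""
    mu, v, s = np.empty(2), np.empty(2), np.empty(2)
    for j in (0, 1):
        yj = y[x == j]
        mu[j] = yj.mean()
        v[j] = np.mean((yj - mu[j]) ** 2)
        s[j] = np.mean((yj - mu[j]) ** 3)
    C1 = ((v[1] + mu[1] ** 2) - (v[0] + mu[0] ** 2)) / (mu[1] - mu[0])
    C2 = ((s[1] - s[0]) / (mu[1] - mu[0]) - C1 ** 2
          + 3 * C1 * (mu[1] + mu[0])
          - 2 * (mu[1] ** 2 + mu[1] * mu[0] + mu[0] ** 2)) / 2
    root = np.sqrt((C1 / 2) ** 2 - C2)
    m0, m1 = C1 / 2 - root, C1 / 2 + root
    q = (mu[0] - m0) / (m1 - m0)          # f_{X*|X}(1|0)
    p = (m1 - mu[1]) / (m1 - m0)          # f_{X*|X}(0|1)
    return m0, m1, p, q
```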

3 SIEVE MAXIMUM LIKELIHOOD ESTIMATION

This section considers estimation of the nonparametric regression model

$$Y = m(X^*, W) + \varepsilon,$$

where the function $m(\cdot)$ is unknown, $W$ is a vector of error-free covariates, and $\varepsilon$ is independent of $(X^*, W)$. Let $\{Z_t \equiv (Y_t, X_t, W_t)\}_{t=1}^{n}$ denote a random sample of $Z \equiv (Y, X, W)$. We have shown that $f_{Y|X^*,W}$ and $f_{X^*|X,W}$ are identified from $f_{Y|X,W}$. Let $\alpha_0 \equiv (f_1, f_2, f_3)^T \equiv (f_\varepsilon, f_{X^*|X,W}, m)^T$ be the true parameters of interest. Then the observed likelihood of $Y$ given $(X, W)$ (or the likelihood for $\alpha_0$) is

$$\prod_{t=1}^{n} f_{Y|X,W}(Y_t \mid X_t, W_t) = \prod_{t=1}^{n} \left\{ \sum_{x^* \in \mathcal{X}^*} f_\varepsilon\big(Y_t - m(x^*, W_t)\big)\, f_{X^*|X,W}(x^* \mid X_t, W_t) \right\}.$$

Before we present a sieve ML estimator $\hat\alpha$ for $\alpha_0$, we need to impose some mild smoothness restrictions on the unknown functions $(f_\varepsilon, f_{X^*|X,W}, m)^T$. The sieve method allows for unknown functions belonging to many different function spaces, such as Sobolev space, Besov space, and others; see, e.g., Shen and Wong (1994), Wong and Shen (1995), Shen (1997), and Van de Geer (1993, 2000). But, for the sake of concreteness and simplicity, we consider the widely used Hölder space of functions. Let $\xi = (\xi_1, \ldots, \xi_d)^T \in \mathbb{R}^d$, let $a = (a_1, \ldots, a_d)^T$ be a vector of non-negative integers, and let

$$\nabla^a h(\xi) \equiv \frac{\partial^{|a|} h(\xi_1, \ldots, \xi_d)}{\partial \xi_1^{a_1} \cdots \partial \xi_d^{a_d}}$$

denote the $|a| = a_1 + \cdots + a_d$-th derivative. Let $\|\cdot\|_E$ denote the Euclidean norm. Let $\mathcal{V} \subseteq \mathbb{R}^d$ and let $\gamma_0$ be the largest integer satisfying $\gamma > \gamma_0$. The Hölder space $\Lambda^\gamma(\mathcal{V})$ of order $\gamma > 0$ is the space of functions $h: \mathcal{V} \to \mathbb{R}$ such that the first $\gamma_0$ derivatives are continuous and bounded, and the $\gamma_0$-th derivatives are Hölder continuous with exponent $\gamma - \gamma_0 \in (0, 1]$. The Hölder space $\Lambda^\gamma(\mathcal{V})$ becomes a Banach space under the Hölder norm:

$$\|h\|_{\Lambda^\gamma} = \max_{|a| \le \gamma_0} \sup_\xi |\nabla^a h(\xi)| + \max_{|a| = \gamma_0} \sup_{\xi \neq \xi'} \frac{|\nabla^a h(\xi) - \nabla^a h(\xi')|}{(\|\xi - \xi'\|_E)^{\gamma - \gamma_0}} < \infty.$$

Denote $\Lambda_c^\gamma(\mathcal{V}) \equiv \{h \in \Lambda^\gamma(\mathcal{V}) : \|h\|_{\Lambda^\gamma} \le c < \infty\}$ as a Hölder ball. Let $\varepsilon \in \mathbb{R}$ and $W \in \mathcal{W}$, with $\mathcal{W}$ a compact convex subset of $\mathbb{R}^{d_w}$. Also denote

$$\mathcal{F}_1 = \left\{ \sqrt{f_1}(\cdot) \in \Lambda_c^{\gamma_1}(\mathbb{R}) : f_1(\cdot) > 0,\ \int_{\mathbb{R}} f_1(\varepsilon)\, d\varepsilon = 1 \right\},$$

$$\mathcal{F}_2 = \left\{ \sqrt{f_2}(x^* \mid x, \cdot) \in \Lambda_c^{\gamma_2}(\mathcal{W}) : f_2(\cdot \mid \cdot, \cdot) > 0,\ \int_{\mathcal{X}^*} f_2(x^* \mid x, w)\, dx^* = 1 \text{ for all } x \in \mathcal{X},\ w \in \mathcal{W} \right\},$$

and

$$\mathcal{F}_3 = \left\{ f_3(x^*, \cdot) \in \Lambda_c^{\gamma_3}(\mathcal{W}) : f_3(i, w) > f_3(j, w) \text{ for all } i > j,\ i, j \in \mathcal{X}^*,\ w \in \mathcal{W} \right\}.$$

We impose the following smoothness restrictions on the densities:

Assumption 3.1. (i) All the assumptions in Theorem 2.1 hold; (ii) $\sqrt{f_\varepsilon}(\cdot) \in \mathcal{F}_1$ with $\gamma_1 > 1/2$; (iii) $\sqrt{f_{X^*|X,W}}(x^* \mid x, \cdot) \in \mathcal{F}_2$ with $\gamma_2 > d_w/2$ for all $x^*, x \in \mathcal{X} \equiv \{1, \ldots, J\}$; (iv) $m(x^*, \cdot) \in \mathcal{F}_3$ with $\gamma_3 > d_w/2$ for all $x^* \in \mathcal{X}^*$.

Denote $\mathcal{A} = \mathcal{F}_1 \times \mathcal{F}_2 \times \mathcal{F}_3$ and $\alpha \equiv (f_1, f_2, f_3)^T$. Let $E[\cdot]$ denote the expectation with respect to the underlying true data generating process for $Z_t$. Then $\alpha_0 \equiv (f_{01}, f_{02}, f_{03})^T = \arg\max_{\alpha \in \mathcal{A}} E[\ell(Z_t; \alpha)]$, where

$$\ell(Z_t; \alpha) \equiv \ln \left\{ \sum_{x^* \in \mathcal{X}^*} f_1\big(Y_t - f_3(x^*, W_t)\big)\, f_2(x^* \mid X_t, W_t) \right\}. \qquad (3.1)$$

Let $\mathcal{A}_n = \mathcal{F}_1^n \times \mathcal{F}_2^n \times \mathcal{F}_3^n$ be a sieve space for $\mathcal{A}$, that is, a sequence of approximating spaces that are dense in $\mathcal{A}$ under some pseudo-metric. The sieve MLE $\hat\alpha_n = (\hat f_1, \hat f_2, \hat f_3)^T \in \mathcal{A}_n$ for $\alpha_0 \in \mathcal{A}$ is defined as

$$\hat\alpha_n = \arg\max_{\alpha \in \mathcal{A}_n} \sum_{t=1}^{n} \ell(Z_t; \alpha). \qquad (3.2)$$

We could apply infinite-dimensional approximating spaces as sieves $\mathcal{F}_j^n$ for $\mathcal{F}_j$, $j = 1, 2, 3$. However, in applications, we shall use finite-dimensional sieve spaces since they are easier to implement. For $j = 1, 2, 3$, let $p_j^{k_{j,n}}(\cdot)$ be a $k_{j,n} \times 1$ vector of known basis functions, such as power series, splines, Fourier series, etc. Then we denote the sieve spaces for $\mathcal{F}_j$, $j = 1, 2, 3$,

as follows:

$$\mathcal{F}_1^n = \left\{ \sqrt{f_1}(\cdot) = p_1^{k_{1,n}}(\cdot)^T \beta_1 \in \mathcal{F}_1 \right\},$$

$$\mathcal{F}_2^n = \left\{ \sqrt{f_2}(x^* \mid x, \cdot) = \sum_{k=1}^{J} \sum_{j=1}^{J} 1(x^* = k)\, 1(x = j)\, p_2^{k_{2,n}}(\cdot)^T \beta_{2,kj} \in \mathcal{F}_2 \right\},$$

$$\mathcal{F}_3^n = \left\{ f_3(x^*, \cdot) = \sum_{k=1}^{J} 1(x^* = k)\, p_3^{k_{3,n}}(\cdot)^T \beta_{3,k} \in \mathcal{F}_3 \right\}.$$

We note that the method of sieve MLE is very flexible, and we can easily impose prior information on the parameter space ($\mathcal{A}$) and the sieve space ($\mathcal{A}_n$). For example, if the functional form of the true regression function $m(x^*, w)$ is known up to some finite-dimensional parameters $\theta \in \mathcal{B}$, where $\mathcal{B}$ is a compact subset of $\mathbb{R}^{d_\theta}$, then we can take $\mathcal{A} = \mathcal{F}_1 \times \mathcal{F}_2 \times \mathcal{F}_{\mathcal{B}}$ and $\mathcal{A}_n = \mathcal{F}_1^n \times \mathcal{F}_2^n \times \mathcal{F}_{\mathcal{B}}$ with $\mathcal{F}_{\mathcal{B}} = \{f_3(x^*, w) = m(x^*, w; \theta) : \theta \in \mathcal{B}\}$. The sieve MLE becomes

$$\hat\alpha_n = \arg\max_{\alpha \in \mathcal{A}_n} \sum_{t=1}^{n} \ell(Z_t; \alpha), \qquad \text{with } \ell(Z_t; \alpha) = \ln \left\{ \sum_{x^* \in \mathcal{X}^*} f_1\big(Y_t - m(x^*, W_t; \theta)\big)\, f_2(x^* \mid X_t, W_t) \right\}. \qquad (3.3)$$
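To make (3.3) concrete, here is a compact sketch of the mixture log-likelihood for binary $X^*$. Everything implementation-specific below is an illustrative assumption, not the paper's exact recipe: the regression function is a stand-in parametric form, $f_2$ uses a logistic link in $(1, w)$, and $\sqrt{f_1}$ uses a squared Hermite-polynomial sieve normalized by a Riemann sum.

```python
import numpy as np
from scipy.optimize import minimize
from numpy.polynomial.hermite_e import hermevander

def neg_loglik(params, y, x, w, k1=4):
    theta = params[:3]                       # regression parameters
    b1 = params[3:3 + k1]                    # sieve coefficients for sqrt(f_eps)
    b2 = params[3 + k1:].reshape(2, 2)       # logistic-link coefficients, one row per X value
    grid = np.linspace(-8.0, 8.0, 401)
    dg = grid[1] - grid[0]

    def f_eps(e):
        # squared Hermite polynomial times a normal kernel, normalized numerically
        val = (hermevander(e, k1 - 1) @ b1) ** 2 * np.exp(-e ** 2 / 2.0)
        norm = np.sum((hermevander(grid, k1 - 1) @ b1) ** 2
                      * np.exp(-grid ** 2 / 2.0)) * dg
        return val / norm

    def m(xstar, wv, th):
        # stand-in regression function for illustration only
        return th[0] * wv + th[1] * (1 - xstar) * wv + th[2] * (1 - xstar)

    ll = 0.0
    for xv in (0, 1):                        # observed proxy value
        idx = (x == xv)
        z = b2[xv, 0] + b2[xv, 1] * w[idx]   # P(X* = 1 | X = xv, W) via logistic link
        p1 = 1.0 / (1.0 + np.exp(-z))
        dens = (1 - p1) * f_eps(y[idx] - m(0, w[idx], theta)) \
               + p1 * f_eps(y[idx] - m(1, w[idx], theta))
        ll += np.sum(np.log(np.maximum(dens, 1e-300)))
    return -ll

# hypothetical usage: res = minimize(neg_loglik, x0, args=(y, x, w), method="Nelder-Mead")
```

Any smooth sieve for $\sqrt{f_1}$ and any parameterization of $f_2$ respecting the simplex constraints would do equally well here; the essential structure is the mixture over latent $x^*$ inside the log.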

3.1 Consistency

The consistency of the sieve MLE $\hat\alpha_n$ can be established by applying either Geman and Hwang (1982) or Lemma A.1 of Newey and Powell (2003). First we define a norm on $\mathcal{A}$ as follows:

$$\|\alpha\|_s = \sup_\varepsilon \left| \sqrt{f_1(\varepsilon)}\, (1 + \varepsilon^2)^{\zeta/2} \right| + \sup_{x^*, x, w} \left| \sqrt{f_2(x^* \mid x, w)} \right| + \sup_{x^*, w} |f_3(x^*, w)| \quad \text{for some } \zeta > 0.$$

We assume:

Assumption 3.2. (i) $-\infty < E[\ell(Z_t; \alpha_0)] < \infty$, and $E[\ell(Z_t; \alpha)]$ is upper semicontinuous on $\mathcal{A}$ under the metric $\|\cdot\|_s$; (ii) there are a finite $\kappa > 0$ and a random variable $U(Z_t)$ with $E\{U(Z_t)\} < \infty$ such that $\sup_{\alpha \in \mathcal{A}_n : \|\alpha - \alpha_0\|_s \le \delta} |\ell(Z_t; \alpha) - \ell(Z_t; \alpha_0)| \le \delta^\kappa U(Z_t)$.

Assumption 3.3. (i) $p_1^{k_{1,n}}(\cdot)$ is a $k_{1,n} \times 1$ vector of spline wavelet basis functions on $\mathbb{R}$, and, for $j = 2, 3$, $p_j^{k_{j,n}}(\cdot)$ is a $k_{j,n} \times 1$ vector of tensor products of spline basis functions on $\mathcal{W}$; (ii) $k_n \equiv \max\{k_{1,n}, k_{2,n}, k_{3,n}\} \to \infty$ and $k_n/n \to 0$.

The following consistency lemma is a direct application of Lemma A.1 of Newey and Powell (2003) or Theorem 3.1 (or Remark 3.1(4), Remark 3.3) of Chen (2006); hence we omit its proof.

Lemma 3.1. Let $\hat\alpha_n$ be the sieve MLE. Under assumptions 3.1-3.3, we have $\|\hat\alpha_n - \alpha_0\|_s = o_p(1)$.

3.2 Convergence rate under the Fisher metric

Given Lemma 3.1, we can now restrict our attention to a shrinking $\|\cdot\|_s$-neighborhood around $\alpha_0$. Let $\mathcal{A}_s \equiv \{\alpha \in \mathcal{A} : \|\alpha - \alpha_0\|_s = o(1),\ \|\alpha\|_s \le c_0 < c\}$ and $\mathcal{A}_{sn} \equiv \{\alpha \in \mathcal{A}_n : \|\alpha - \alpha_0\|_s = o(1),\ \|\alpha\|_s \le c_0 < c\}$. Then, for the purpose of establishing a convergence rate under a pseudo-metric that is weaker than $\|\cdot\|_s$, we can treat $\mathcal{A}_s$ as the new parameter space and $\mathcal{A}_{sn}$ as its sieve space, and assume that both $\mathcal{A}_s$ and $\mathcal{A}_{sn}$ are convex parameter spaces. For any $\alpha_1, \alpha_2 \in \mathcal{A}_s$, we consider a continuous path $\{\alpha(\tau) : \tau \in [0, 1]\}$ in $\mathcal{A}_s$ such that $\alpha(0) = \alpha_1$ and $\alpha(1) = \alpha_2$. For simplicity we assume that, for any $\alpha, \alpha + v \in \mathcal{A}_s$,

$\{\alpha + \tau v : \tau \in [0, 1]\}$ is a continuous path in $\mathcal{A}_s$, and that $\ell(Z_t; \alpha + \tau v)$ is twice continuously differentiable at $\tau = 0$ for almost all $Z_t$ and any direction $v \in \mathcal{A}_s$. Define the pathwise first derivative as

$$\frac{d\ell(Z_t; \alpha)}{d\alpha}[v] \equiv \frac{d\ell(Z_t; \alpha + \tau v)}{d\tau}\bigg|_{\tau=0} \quad \text{a.s. } Z_t,$$

and the pathwise second derivative as

$$\frac{d^2\ell(Z_t; \alpha)}{d\alpha\, d\alpha^T}[v, v] \equiv \frac{d^2\ell(Z_t; \alpha + \tau v)}{d\tau^2}\bigg|_{\tau=0} \quad \text{a.s. } Z_t.$$

Define the Fisher metric $\|\cdot\|$ on $\mathcal{A}_s$ as follows: for any $\alpha_1, \alpha_2 \in \mathcal{A}_s$,

$$\|\alpha_1 - \alpha_2\|^2 \equiv E\left\{ \left( \frac{d\ell(Z_t; \alpha_0)}{d\alpha}[\alpha_1 - \alpha_2] \right)^2 \right\}.$$

We show that $\hat\alpha_n$ converges to $\alpha_0$ at a rate faster than $n^{-1/4}$ under the Fisher metric $\|\cdot\|$ with the following assumptions:

Assumption 3.4. (i) $\zeta > 1$; (ii) $\gamma \equiv \min\{\gamma_1, \gamma_2/d_w, \gamma_3/d_w\} > 1/2$.

Assumption 3.5. (i) $\mathcal{A}_s$ is convex at $\alpha_0$; (ii) $\ell(Z_t; \alpha)$ is twice continuously pathwise differentiable with respect to $\alpha \in \mathcal{A}_s$.

Assumption 3.6. $\sup_{\tilde\alpha \in \mathcal{A}_s} \sup_{\alpha \in \mathcal{A}_{sn}} \left| \frac{d\ell(Z_t; \tilde\alpha)}{d\alpha}\left[\frac{\alpha - \alpha_0}{\|\alpha - \alpha_0\|_s}\right] \right| \le U(Z_t)$ for a random variable $U(Z_t)$ with $E\{[U(Z_t)]^2\} < \infty$.

Assumption 3.7. (i) $\sup_{v \in \mathcal{A}_s : \|v\|_s = 1} E\left( \frac{d\ell(Z_t; \alpha_0)}{d\alpha}[v] \right)^2 \le c < \infty$; (ii) uniformly over $\tilde\alpha \in \mathcal{A}_s$ and $\alpha \in \mathcal{A}_{sn}$, we have

$$-E\left( \frac{d^2\ell(Z_t; \tilde\alpha)}{d\alpha\, d\alpha^T}[\alpha - \alpha_0, \alpha - \alpha_0] \right) = \|\alpha - \alpha_0\|^2 \{1 + o(1)\}.$$

Assumption 3.4 guarantees that the sieve approximation error under the strong norm $\|\cdot\|_s$ goes to zero at the rate $(k_n)^{-\gamma}$. Assumption 3.5 makes sure that the twice pathwise derivatives are well defined with respect to $\alpha \in \mathcal{A}_s$, hence the pseudo-metric $\|\cdot\|$ is well defined on $\mathcal{A}_s$. Assumption 3.6 imposes an envelope condition. Assumption 3.7(i) implies that $\|\alpha - \alpha_0\| \le \sqrt{c}\, \|\alpha - \alpha_0\|_s$ for all $\alpha \in \mathcal{A}_s$. Assumption 3.7(ii) implies that there are positive finite constants $c_1$ and $c_2$ such that, for all $\alpha \in \mathcal{A}_{sn}$, $c_1 \|\alpha - \alpha_0\|^2 \le E[\ell(Z_t; \alpha_0) - \ell(Z_t; \alpha)] \le c_2 \|\alpha - \alpha_0\|^2$; that is, $\|\alpha - \alpha_0\|^2$ is equivalent to the Kullback-Leibler discrepancy on the local sieve space $\mathcal{A}_{sn}$. The following convergence rate theorem is a direct application of Theorem 3.2 of Shen and Wong (1994) or Theorem 3.2 of Chen (2006) to the local parameter space $\mathcal{A}_s$ and the local sieve space $\mathcal{A}_{sn}$; hence we omit its proof.

Theorem 3.2. Under assumptions 3.1-3.7, we have

$$\|\hat\alpha_n - \alpha_0\| = O_P\left( \max\left\{ (k_n)^{-\gamma},\ \sqrt{\frac{k_n}{n}} \right\} \right) = O_P\left( n^{-\gamma/(2\gamma+1)} \right) \quad \text{if } k_n = O\left( n^{1/(2\gamma+1)} \right).$$

3.3 Asymptotic normality and semiparametric efficiency

Let $\mathbb{V}$ denote the closure of the linear span of $\mathcal{A}_s - \{\alpha_0\}$ under the Fisher metric $\|\cdot\|$. Then $(\mathbb{V}, \|\cdot\|)$ is a Hilbert space with the inner product

$$\langle v_1, v_2 \rangle \equiv E\left( \frac{d\ell(Z_t; \alpha_0)}{d\alpha}[v_1] \times \frac{d\ell(Z_t; \alpha_0)}{d\alpha}[v_2] \right).$$

We are interested in estimation of a functional $\rho(\alpha_0)$, where $\rho: \mathcal{A} \to \mathbb{R}$. It is known that the asymptotic properties of $\rho(\hat\alpha_n)$ depend on the smoothness of the functional $\rho$ and on the rate of convergence of the sieve MLE $\hat\alpha_n$. For any $v \in \mathbb{V}$, we denote

$$\frac{d\rho(\alpha_0)}{d\alpha}[v] \equiv \lim_{\tau \to 0} \frac{\rho(\alpha_0 + \tau v) - \rho(\alpha_0)}{\tau}$$

whenever the right-hand-side limit is well defined. We impose the following additional conditions for asymptotic normality of the plug-in sieve MLE $\rho(\hat\alpha_n)$:

Assumption 3.8. (i) For any $v \in \mathbb{V}$, $\rho(\alpha_0 + \tau v)$ is continuously differentiable in $\tau \in [0, 1]$ near $\tau = 0$, and

$$\left\| \frac{d\rho(\alpha_0)}{d\alpha} \right\| \equiv \sup_{v \in \mathbb{V} : \|v\| > 0} \frac{\left| \frac{d\rho(\alpha_0)}{d\alpha}[v] \right|}{\|v\|} < \infty;$$

(ii) there exist constants $c > 0$, $\omega > 0$, and a small $\bar\varepsilon > 0$ such that, for any $v \in \mathbb{V}$ with $\|v\| \le \bar\varepsilon$, we have

$$\left| \rho(\alpha_0 + v) - \rho(\alpha_0) - \frac{d\rho(\alpha_0)}{d\alpha}[v] \right| \le c \|v\|^\omega.$$

Under Assumption 3.8(i), by the Riesz representation theorem, there exists $v^* \in \mathbb{V}$ such

that

$$\langle v^*, v \rangle = \frac{d\rho(\alpha_0)}{d\alpha}[v] \quad \text{for all } v \in \mathbb{V} \qquad (3.4)$$

and

$$\|v^*\|^2 \equiv \left\| \frac{d\rho(\alpha_0)}{d\alpha} \right\|^2 = \sup_{v \in \mathbb{V} : \|v\| > 0} \frac{\left| \frac{d\rho(\alpha_0)}{d\alpha}[v] \right|^2}{\|v\|^2} < \infty. \qquad (3.5)$$

Under Theorem 3.2, we have $\|\hat\alpha_n - \alpha_0\| = O_P(\delta_n)$ with $\delta_n = n^{-\gamma/(2\gamma+1)}$. In the following we denote $\mathcal{N}_0 \equiv \{\alpha \in \mathcal{A}_s : \|\alpha - \alpha_0\| = O(\delta_n)\}$ and $\mathcal{N}_{0n} \equiv \{\alpha \in \mathcal{A}_{sn} : \|\alpha - \alpha_0\| = O(\delta_n)\}$.

Assumption 3.9. (i) $(\delta_n)^\omega = o(n^{-1/2})$; (ii) there is a $v_n^* \in \mathcal{A}_n - \{\alpha_0\}$ such that $\|v_n^* - v^*\| = o(1)$ and $\delta_n \times \|v_n^* - v^*\| = o(n^{-1/2})$.

Assumption 3.10. There is a random variable $U(Z_t)$ with $E\{[U(Z_t)]^2\} < \infty$ and a non-negative measurable function $\eta$ with $\lim_{\delta \to 0} \eta(\delta) = 0$ such that, for all $\alpha \in \mathcal{N}_{0n}$,

$$\sup_{\tilde\alpha \in \mathcal{N}_0} \left| \frac{d^2\ell(Z_t; \tilde\alpha)}{d\alpha\, d\alpha^T}[\alpha - \alpha_0, v_n^*] \right| \le U(Z_t) \cdot \eta(\|\alpha - \alpha_0\|_s).$$

Assumption 3.11. Uniformly over $\tilde\alpha \in \mathcal{N}_0$ and $\alpha \in \mathcal{N}_{0n}$,

$$E\left( \frac{d^2\ell(Z_t; \tilde\alpha)}{d\alpha\, d\alpha^T}[\alpha - \alpha_0, v_n^*] - \frac{d^2\ell(Z_t; \alpha_0)}{d\alpha\, d\alpha^T}[\alpha - \alpha_0, v_n^*] \right) = o\left( n^{-1/2} \right).$$

Assumption 3.8(i) is critical for obtaining $\sqrt{n}$ convergence of the plug-in sieve MLE $\rho(\hat\alpha_n)$ to $\rho(\alpha_0)$ and its asymptotic normality. If Assumption 3.8(i) is not satisfied, then the plug-in sieve MLE $\rho(\hat\alpha_n)$ is still consistent for $\rho(\alpha_0)$, but the best achievable convergence rate is slower than $\sqrt{n}$. Assumption 3.9 implies that the asymptotic bias of the Riesz representer is negligible. Assumptions 3.10 and 3.11 control the remainder term.

Applying Theorems 1 and 4 of Shen (1997), we immediately obtain:

Theorem 3.3. Suppose that assumptions 3.1-3.11 hold. Then the plug-in sieve MLE $\rho(\hat\alpha_n)$ is semiparametrically efficient, and $\sqrt{n}\big(\rho(\hat\alpha_n) - \rho(\alpha_0)\big) \to_d N(0, \|v^*\|^2)$.

Following Ai and Chen (2003), the asymptotic efficient variance $\|v^*\|^2$ of the plug-in sieve MLE $\rho(\hat\alpha_n)$ can be consistently estimated by

$$\hat\sigma_n^2 = \max_{v \in \mathcal{A}_n} \frac{\left| \frac{d\rho(\hat\alpha_n)}{d\alpha}[v] \right|^2}{\frac{1}{n} \sum_{t=1}^{n} \left( \frac{d\ell(Z_t; \hat\alpha_n)}{d\alpha}[v] \right)^2}.$$

Instead of estimating this asymptotic variance, one could also construct confidence intervals by applying likelihood ratio inference as in Murphy and Van der Vaart (1996, 2000).

4 Simulation

This section presents two small simulation studies: the first corresponds to the identification strategy, and the second checks the performance of the sieve MLE.

4.1 Moment-based estimation

This subsection applies the identification procedure to a simple nonlinear regression model with simulated data. We consider the regression model

$$y = 1 + 0.25 (x^*)^2 + 0.1 (x^*)^3 + \varepsilon,$$

where $\varepsilon \sim N(0, 1)$ is independent of $x^*$. The marginal distribution of $x^*$ is

$$\Pr(x^*) = 0.2\,[1(x^* = 1) + 1(x^* = 4)] + 0.3\,[1(x^* = 2) + 1(x^* = 3)],$$

and the misclassification probability matrices $F_{x|x^*}$ are in Tables 1-2. We consider two examples of the misclassification probability matrix. Example 1 considers a strictly diagonally dominant matrix $F_{x|x^*}$ as follows:

$$F_{x|x^*} = \begin{pmatrix} 0.6 & 0.2 & 0.1 & 0.1 \\ 0.2 & 0.6 & 0.1 & 0.1 \\ 0.1 & 0.1 & 0.7 & 0.1 \\ 0.1 & 0.1 & 0.1 & 0.7 \end{pmatrix}.$$

Example 2 has a misclassification probability matrix

$$F_{x|x^*} = 0.7\, F_u + 0.3\, I,$$

where $I$ is an identity matrix and $F_u = \left[ u_{ij} / \sum_k u_{kj} \right]_{ij}$ with the $u_{ij}$ independently drawn from a uniform distribution on $[0, 1]$. In each repetition, we directly follow the identification procedure shown in the proof of Theorem 2.1. The matrix $\Phi_{Y,X}$ is estimated by replacing the function $\phi_{Y,X=x}(t)$ with its empirical counterpart:

$$\hat\phi_{Y,X=x}(t) = \frac{1}{n} \sum_{j=1}^{n} \exp(itY_j)\, 1(X_j = x).$$
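The Example 1 design can be simulated in a few lines. The cubic regression function below follows the reconstruction above, and the sample size is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, F):
    """Draw (y, x, xstar) from the Example 1 design; F[i, j] = Pr(X = i+1 | X* = j+1)."""
    probs = np.array([0.2, 0.3, 0.3, 0.2])
    xstar = rng.choice([1, 2, 3, 4], size=n, p=probs)
    x = np.array([rng.choice([1, 2, 3, 4], p=F[:, j - 1]) for j in xstar])
    y = 1 + 0.25 * xstar ** 2 + 0.1 * xstar ** 3 + rng.standard_normal(n)
    return y, x, xstar

F1 = np.array([[0.6, 0.2, 0.1, 0.1],
               [0.2, 0.6, 0.1, 0.1],
               [0.1, 0.1, 0.7, 0.1],
               [0.1, 0.1, 0.1, 0.7]])   # columns sum to one: each column is f_{X|X*}(.|x*)
y, x, xstar = simulate(5000, F1)
```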

Since it is directly testable, assumption 2.3 is verified with the $t_j$ in the vector $t = (0, t_2, t_3, t_4)$ independently drawn from a uniform distribution until a desirable $t$ is found. The sample size is 5,000 and the number of repetitions is 1,000. The simulation results in Tables 1-2 include the estimates of the regression function $m(x^*)$, the marginal distribution $\Pr(x^*)$, and the estimated misclassification probability matrix $F_{x|x^*}$, together with standard errors of each element. As shown in Tables 1-2, the estimator following the identification procedure performs well with the simulated data.

4.2 Sieve MLE

This subsection applies the sieve ML procedure to a semiparametric model:

$$Y = \theta_1 W + \theta_2 (1 - X^*) W + \varepsilon,$$

where $\varepsilon$ is independent of $X^* \in \{0, 1\}$ and $W$. The unknowns include the parameter of interest $\theta = (\theta_1, \theta_2, \theta_3)$ and the nuisance functions $f_\varepsilon$ and $f_{X^*|X,W}$. We simulate the model from $\varepsilon \sim N(0, 1)$ and $X^* \in \{0, 1\}$ according to the marginal distribution $f_{X^*}(x^*) = 0.4\, 1(x^* = 0) + 0.6\, 1(x^* = 1)$. We generate the covariate $W$ as $W = \zeta (1 - 0.5 X^*)$, where $\zeta \sim N(0, 1)$ is independent of $X^*$. The observed mismeasured $X$ is generated as follows:

$$X = \begin{cases} 1 - X^* & \text{with probability } p(X^*) \\ X^* & \text{otherwise} \end{cases},$$

where $p(0) = 0.5$ and $p(1) = 0.3$.
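A sketch of this data-generating process follows. The regression-function form and the "true" parameter values below are placeholders keyed to the reconstruction above, not the paper's exact design (Table 3, which reports the true $\theta$, is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_semipar(n, p_flip=(0.5, 0.3)):
    """Section 4.2 design: binary X*, covariate W correlated with X*,
    and X a misclassified version of X*."""
    xstar = (rng.random(n) < 0.6).astype(int)             # f_{X*}(1) = 0.6
    w = rng.standard_normal(n) * (1 - 0.5 * xstar)        # W = zeta * (1 - 0.5 X*)
    flip = rng.random(n) < np.where(xstar == 0, p_flip[0], p_flip[1])
    x = np.where(flip, 1 - xstar, xstar)
    theta = (1.0, 1.0)                                    # placeholder true values
    y = theta[0] * w + theta[1] * (1 - xstar) * w + rng.standard_normal(n)
    return y, x, w, xstar

y, x, w, xstar = simulate_semipar(3000)
```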

The Monte Carlo simulation consists of 400 repetitions. In each repetition, we randomly draw 3,000 observations of $(Y, X, W)$, and then apply three ML estimators to compute the parameter of interest. All three estimators treat the true density $f_\varepsilon$ of the regression error $\varepsilon$ as unknown. The first estimator uses the contaminated sample $\{Y_i, X_i, W_i\}_{i=1}^n$ as if it were accurate; this estimator is inconsistent, and its bias should dominate its square root of mean square error (root MSE). The second estimator is the sieve MLE using the uncontaminated data $\{Y_i, X_i^*, W_i\}_{i=1}^n$; this estimator is consistent and most efficient. However, we call it the infeasible MLE, since $X_i^*$ is not observed in practice. The third estimator is the sieve MLE (3.3) presented in Section 3, using the sample $\{Y_i, X_i, W_i\}_{i=1}^n$ and allowing for arbitrary measurement error by treating $f_{X|X^*,W}$ as unknown. In this simulation study, all three estimators are computed by approximating the unknown $\sqrt{f_\varepsilon}$ with the same Hermite polynomial sieve; for the third estimator (the sieve MLE) we also approximate $\sqrt{f_{X|X^*,W}}$ by another Hermite polynomial sieve.

The Monte Carlo results in Table 3 show that the sieve MLE has a much smaller bias than the first estimator, which ignores the measurement error. Since the sieve MLE must estimate the additional unknown function $f_{X|X^*,W}$, its estimates $\hat\theta_j$, $j = 1, 2, 3$, may have larger standard errors than those of the other two estimators. In summary, our sieve MLE performs well in this Monte Carlo simulation.

5 Discussion

We have provided nonparametric identification and estimation of a regression model in the presence of a mismeasured discrete regressor, without the use of additional sample information such as instruments, repeated measurements, or validation data, and without parameterizing the distributions of the measurement error or of the regression error.

Identification mainly comes from the monotonicity of the regression function, the limited support of the mismeasured regressor, sufficient variation in the dependent variable, and from some independence-related assumptions regarding the regression model error. It may be possible to extend these results to continuously distributed nonclassically mismeasured regressors, by replacing many of our matrix-related assumptions and calculations with corresponding linear operators.

APPENDIX: MATHEMATICAL PROOFS

Proof (Theorem 2.1). Notice that $\nu_\varepsilon(0) = 1$ and $a_\varepsilon(0) = 0$. We define

$$\Phi'_{Y,X}(t) = \begin{pmatrix} iE[Y|X=1] f_X(1) & \frac{\partial}{\partial t}\phi_{Y,X=1}(t_2) & \cdots & \frac{\partial}{\partial t}\phi_{Y,X=1}(t_J) \\ iE[Y|X=2] f_X(2) & \frac{\partial}{\partial t}\phi_{Y,X=2}(t_2) & \cdots & \frac{\partial}{\partial t}\phi_{Y,X=2}(t_J) \\ \vdots & \vdots & & \vdots \\ iE[Y|X=J] f_X(J) & \frac{\partial}{\partial t}\phi_{Y,X=J}(t_2) & \cdots & \frac{\partial}{\partial t}\phi_{Y,X=J}(t_J) \end{pmatrix}.$$

By taking the derivative with respect to the scalar $t$, we have from equation (2.3)

$$\frac{\partial}{\partial t}\phi_{Y,X=x}(t) = \frac{\partial \nu_\varepsilon(t)}{\partial t} \sum_{x^*} \exp(itm(x^*) + i a_\varepsilon(t)) f_{X,X^*}(x, x^*) + i\,\nu_\varepsilon(t)\, \frac{\partial a_\varepsilon(t)}{\partial t} \sum_{x^*} \exp(itm(x^*) + i a_\varepsilon(t)) f_{X,X^*}(x, x^*) + i\,\nu_\varepsilon(t) \sum_{x^*} \exp(itm(x^*) + i a_\varepsilon(t))\, m(x^*) f_{X,X^*}(x, x^*). \qquad (A.1)$$

Equation (A.1) is equivalent to

$$\Phi'_{Y,X}(t) = F_{X,X^*} \Phi_{m,a}(t) D'_{|\phi_\varepsilon|}(t) + i\, F_{X,X^*} \Phi_{m,a}(t) D_{|\phi_\varepsilon|}(t) D_a(t) + i\, F_{X,X^*} D_m \Phi_{m,a}(t) D_{|\phi_\varepsilon|}(t), \qquad (A.2)$$

where

$$D_a(t) = \mathrm{Diag}\left(0, \frac{\partial a_\varepsilon(t_2)}{\partial t}, \ldots, \frac{\partial a_\varepsilon(t_J)}{\partial t}\right), \quad D'_{|\phi_\varepsilon|}(t) = \mathrm{Diag}\left(0, \frac{\partial \nu_\varepsilon(t_2)}{\partial t}, \ldots, \frac{\partial \nu_\varepsilon(t_J)}{\partial t}\right), \quad D_m = \mathrm{Diag}(m_1, m_2, \ldots, m_J).$$

Since, by definition, $D_{|\phi_\varepsilon|}(t)$ and $D_a(t)$ are real-valued, we have from equation (A.2)

$$\mathrm{Re}\{\Phi'_{Y,X}(t)\} = F_{X,X^*} \mathrm{Re}\{\Phi_{m,a}(t)\} D'_{|\phi_\varepsilon|}(t) - F_{X,X^*} \mathrm{Im}\{\Phi_{m,a}(t)\} D_{|\phi_\varepsilon|}(t) D_a(t) - F_{X,X^*} D_m \mathrm{Im}\{\Phi_{m,a}(t)\} D_{|\phi_\varepsilon|}(t). \qquad (A.3)$$

In order to replace the singular matrix $\mathrm{Im}\{\Phi_{m,a}(t)\}$ with the invertible $(\mathrm{Im}\{\Phi_{m,a}(t)\} + \Delta_1)$, we define

$$\Delta_{E[Y|X]} = \begin{pmatrix} E[Y|X=1] f_X(1) & 0 & \cdots & 0 \\ E[Y|X=2] f_X(2) & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ E[Y|X=J] f_X(J) & 0 & \cdots & 0 \end{pmatrix} = F_{X,X^*} D_m \Delta_1.$$

We then have

$$\mathrm{Re}\{\Phi'_{Y,X}(t)\} - \Delta_{E[Y|X]} = F_{X,X^*} \mathrm{Re}\{\Phi_{m,a}(t)\} D'_{|\phi_\varepsilon|}(t) - F_{X,X^*} \big(\mathrm{Im}\{\Phi_{m,a}(t)\} + \Delta_1\big) D_{|\phi_\varepsilon|}(t) D_a(t) - F_{X,X^*} D_m \big(\mathrm{Im}\{\Phi_{m,a}(t)\} + \Delta_1\big) D_{|\phi_\varepsilon|}(t), \qquad (A.4)$$

where we use $\Delta_1 D_{|\phi_\varepsilon|}(t) D_a(t) = 0$ and $\Delta_1 D_{|\phi_\varepsilon|}(t) = \Delta_1$. Similarly, we have

$$\mathrm{Im}\{\Phi'_{Y,X}(t)\} = F_{X,X^*} \mathrm{Im}\{\Phi_{m,a}(t)\} D'_{|\phi_\varepsilon|}(t) + F_{X,X^*} \mathrm{Re}\{\Phi_{m,a}(t)\} D_{|\phi_\varepsilon|}(t) D_a(t) + F_{X,X^*} D_m \mathrm{Re}\{\Phi_{m,a}(t)\} D_{|\phi_\varepsilon|}(t)$$
$$= F_{X,X^*} \big(\mathrm{Im}\{\Phi_{m,a}(t)\} + \Delta_1\big) D'_{|\phi_\varepsilon|}(t) + F_{X,X^*} \mathrm{Re}\{\Phi_{m,a}(t)\} D_{|\phi_\varepsilon|}(t) D_a(t) + F_{X,X^*} D_m \mathrm{Re}\{\Phi_{m,a}(t)\} D_{|\phi_\varepsilon|}(t),$$

where $\Delta_1 D'_{|\phi_\varepsilon|}(t) = 0$. Define $\Psi_{Y|X}(t) \equiv \Phi_{m,a}(t) D_{|\phi_\varepsilon|}(t)$. We then have

$$\mathrm{Re}\{\Psi_{Y|X}(t)\} = \mathrm{Re}\{\Phi_{m,a}(t)\} D_{|\phi_\varepsilon|}(t), \qquad \mathrm{Im}\{\Psi_{Y|X}(t)\} + \Delta_1 = \big(\mathrm{Im}\{\Phi_{m,a}(t)\} + \Delta_1\big) D_{|\phi_\varepsilon|}(t).$$

In summary, we have

$$\mathrm{Re}\{\Phi_{Y,X}(t)\} = F_{X,X^*} \mathrm{Re}\{\Psi_{Y|X}(t)\}, \qquad (A.5)$$
$$\big(\mathrm{Im}\{\Phi_{Y,X}(t)\} + \Delta_X\big) = F_{X,X^*} \big(\mathrm{Im}\{\Psi_{Y|X}(t)\} + \Delta_1\big), \qquad (A.6)$$
$$\mathrm{Re}\{\Phi'_{Y,X}(t)\} - \Delta_{E[Y|X]} = F_{X,X^*} \mathrm{Re}\{\Phi_{m,a}(t)\} D'_{|\phi_\varepsilon|}(t) - F_{X,X^*} \big(\mathrm{Im}\{\Psi_{Y|X}(t)\} + \Delta_1\big) D_a(t) - F_{X,X^*} D_m \big(\mathrm{Im}\{\Psi_{Y|X}(t)\} + \Delta_1\big), \qquad (A.7)$$
$$\mathrm{Im}\{\Phi'_{Y,X}(t)\} = F_{X,X^*} \big(\mathrm{Im}\{\Phi_{m,a}(t)\} + \Delta_1\big) D'_{|\phi_\varepsilon|}(t) + F_{X,X^*} \mathrm{Re}\{\Psi_{Y|X}(t)\} D_a(t) + F_{X,X^*} D_m \mathrm{Re}\{\Psi_{Y|X}(t)\}. \qquad (A.8)$$

The left-hand sides of these equations are all observed, while the right-hand sides contain all the unknowns. Assumption 2.3(i) also implies that $F_{X,X^*}$, $\mathrm{Re}\{\Psi_{Y|X}(t)\}$, and $(\mathrm{Im}\{\Psi_{Y|X}(t)\} + \Delta_1)$ are invertible in equations (2.5) and (2.7). Recall the definition of the observed matrix $C_t$, which by equations (A.5) and (A.6) equals

$$C_t \equiv \big(\mathrm{Re}\,\Phi_{Y,X}(t)\big)^{-1} \big(\mathrm{Im}\,\Phi_{Y,X}(t) + \Delta_X\big) = \big(\mathrm{Re}\,\Psi_{Y|X}(t)\big)^{-1} \big(\mathrm{Im}\,\Psi_{Y|X}(t) + \Delta_1\big).$$

Denote $A_t \equiv \big(\mathrm{Re}\,\Psi_{Y|X}(t)\big)^{-1} D_m\, \mathrm{Re}\,\Psi_{Y|X}(t)$. With equations (A.5) and (A.7), we consider

$$B_R \equiv \big(\mathrm{Re}\,\Phi_{Y,X}(t)\big)^{-1} \big(\mathrm{Re}\{\Phi'_{Y,X}(t)\} - \Delta_{E[Y|X]}\big) = \big(\mathrm{Re}\,\Psi_{Y|X}(t)\big)^{-1} \Big[ \mathrm{Re}\,\Phi_{m,a}(t)\, D'_{|\phi_\varepsilon|}(t) - \big(\mathrm{Im}\,\Psi_{Y|X}(t) + \Delta_1\big) D_a(t) - D_m \big(\mathrm{Im}\,\Psi_{Y|X}(t) + \Delta_1\big) \Big]$$
$$= \big[D_{|\phi_\varepsilon|}(t)\big]^{-1} D'_{|\phi_\varepsilon|}(t) - C_t D_a(t) - A_t C_t = D_{\ln|\phi_\varepsilon|}(t) - C_t D_a(t) - A_t C_t. \qquad (A.9)$$

Similarly, we have by equations (A.6) and (A.8)

$$B_I \equiv \big(\mathrm{Im}\,\Phi_{Y,X}(t) + \Delta_X\big)^{-1} \mathrm{Im}\{\Phi'_{Y,X}(t)\} = D_{\ln|\phi_\varepsilon|}(t) + C_t^{-1} D_a(t) + C_t^{-1} A_t. \qquad (A.10)$$

We eliminate the matrix $A_t$ from equations (A.9) and (A.10) to obtain

$$B_R + C_t B_I C_t = D_{\ln|\phi_\varepsilon|}(t) + C_t D_{\ln|\phi_\varepsilon|}(t) C_t + D_a(t) C_t - C_t D_a(t). \qquad (A.11)$$

Notice that both $D_{\ln|\phi_\varepsilon|}(t)$ and $D_a(t)$ are diagonal; Assumption 2.3(ii) implies that $D_{\ln|\phi_\varepsilon|}(t)$ and $D_a(t)$ are uniquely identified from equation (A.11), since the difference between any two solutions would satisfy the homogeneous version of this equation. Further, since the diagonal entries of $(D_a(t) C_t - C_t D_a(t))$ are zeros, we have

$$\mathrm{diag}\big(B_R + C_t B_I C_t\big) = \mathrm{diag}\big(D_{\ln|\phi_\varepsilon|}(t)\big) + \big(C_t \circ C_t^T\big)\, \mathrm{diag}\big(D_{\ln|\phi_\varepsilon|}(t)\big) = \big(C_t \circ C_t^T + I\big)\, \mathrm{diag}\big(D_{\ln|\phi_\varepsilon|}(t)\big),$$

where $\mathrm{diag}(\cdot)$ generates a vector of the diagonal entries of its argument and "$\circ$" stands for the Hadamard (element-wise) product. By assumption

2.5(i), we may solve for $D_{\ln|\phi_\varepsilon|}(t)$ as follows:

$$\mathrm{diag}\big(D_{\ln|\phi_\varepsilon|}(t)\big) = \big(C_t \circ C_t^T + I\big)^{-1} \mathrm{diag}\big(B_R + C_t B_I C_t\big). \qquad (A.12)$$

Furthermore, equation (A.11) implies that

$$U \equiv B_R + C_t B_I C_t - D_{\ln|\phi_\varepsilon|}(t) - C_t D_{\ln|\phi_\varepsilon|}(t) C_t = D_a(t) C_t - C_t D_a(t). \qquad (A.13)$$

Define the $J \times 1$ vector $e_1 = (1, 0, 0, \ldots, 0)^T$. The definition of $D_a(t)$ implies that $e_1^T D_a(t) = 0$. Therefore, equation (A.13) implies

$$e_1^T U = -e_1^T C_t D_a(t).$$

Assumption 2.5(ii) implies that all the entries in the row vector $e_1^T C_t$ are nonzero. Let $e_1^T C_t \equiv (c_1, c_2, \ldots, c_J)$. The vector $\mathrm{diag}(D_a(t))$ is then uniquely determined as

$$\mathrm{diag}\big(D_a(t)\big) = -\big[\mathrm{Diag}(c_1, c_2, \ldots, c_J)\big]^{-1} U^T e_1.$$

After $D_{\ln|\phi_\varepsilon|}(t)$ and $D_a(t)$ are identified, we may then identify the matrix $A_t \equiv (\mathrm{Re}\,\Psi_{Y|X}(t))^{-1} D_m\, \mathrm{Re}\,\Psi_{Y|X}(t)$ from equation (A.10):

$$A_t = C_t \big[B_I - D_{\ln|\phi_\varepsilon|}(t)\big] - D_a(t).$$

Notice that

$$\mathrm{Re}\,\Psi_{Y|X}(t) = (F_{X,X^*})^{-1}\, \mathrm{Re}\,\Phi_{Y,X}(t) = \Delta_{f_{X^*}}^{-1} F_{X|X^*}^{-1}\, \mathrm{Re}\,\Phi_{Y,X}(t),$$

where $F_{X|X^*}$ is as defined in Section 2 and $\Delta_{f_{X^*}} = \mathrm{Diag}\big(f_{X^*}(1), \ldots, f_{X^*}(J)\big)$, so that $F_{X,X^*} = F_{X|X^*} \Delta_{f_{X^*}}$. Therefore, since diagonal matrices commute, we have

$$\big(\mathrm{Re}\,\Phi_{Y,X}(t)\big) A_t \big(\mathrm{Re}\,\Phi_{Y,X}(t)\big)^{-1} = F_{X|X^*} \Delta_{f_{X^*}} D_m \Delta_{f_{X^*}}^{-1} F_{X|X^*}^{-1} = F_{X|X^*} D_m F_{X|X^*}^{-1}. \qquad (A.14)$$
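Steps (A.12) and (A.13) amount to solving two small linear systems. A numerical sketch, assuming estimates of $B_R$, $B_I$, and $C_t$ have already been formed from the sample:

```python
import numpy as np

def solve_diagonal_system(BR, BI, C):
    """Steps (A.12)-(A.13): recover diag(D_ln|phi|) and diag(D_a) from B_R, B_I, C_t."""
    J = C.shape[0]
    M = BR + C @ BI @ C
    # (A.12): Hadamard-product linear system for the log-modulus derivatives
    d_ln = np.linalg.solve(C * C.T + np.eye(J), np.diag(M))
    # (A.13): U = D_a C - C D_a; its first row pins down diag(D_a)
    U = M - np.diag(d_ln) - C @ np.diag(d_ln) @ C
    d_a = -U[0, :] / C[0, :]        # entries of e_1^T C_t must be nonzero (Assumption 2.5(ii))
    d_a[0] = 0.0                    # normalization at t_1 = 0
    return d_ln, d_a
```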

Equation (A.14) implies that the unknowns $m_j$ in the matrix $D_m$ are the eigenvalues of a directly estimatable matrix on the left-hand side, and each column of the matrix $F_{X|X^*}$ is an eigenvector. Assumption 2.4 guarantees that all the eigenvalues are distinct and nonzero in the diagonalization in equation (A.14). We may then identify the $m_j$ as the roots of

$$\det(A_t - m_j I) = 0.$$

To be specific, $m_j$ may be identified as the $j$-th smallest root. Equation (A.14) also implies that the $j$-th column of the matrix $F_{X|X^*}$ is the eigenvector corresponding to the eigenvalue $m_j$. Notice that each eigenvector is already normalized, because each column of $F_{X|X^*}$ is a conditional density and the sum of the entries in each column equals one. Therefore, each column of $F_{X|X^*}$ is identified as the normalized eigenvector corresponding to each eigenvalue $m_j$. Finally, we may identify $f_{Y,X^*}$ through equation (2.1) as follows, for any $y \in \mathcal{Y}$:

$$\big(f_{Y,X^*}(y, 1),\ f_{Y,X^*}(y, 2),\ \ldots,\ f_{Y,X^*}(y, J)\big)^T = F_{X|X^*}^{-1} \big(f_{Y,X}(y, 1),\ f_{Y,X}(y, 2),\ \ldots,\ f_{Y,X}(y, J)\big)^T. \qquad \text{Q.E.D.}$$

Proof (Theorem 2.3). First, we introduce notation as follows: for $j = 0, 1$,

$$m_j = m(j), \quad \mu_j = E(Y|X=j), \quad v_j = E\big[(Y - \mu_j)^2 \mid X = j\big], \quad s_j = E\big[(Y - \mu_j)^3 \mid X = j\big],$$
$$p = f_{X^*|X}(0|1), \quad q = f_{X^*|X}(1|0), \quad f_{Y|X=j}(y) = f_{Y|X}(y|j).$$

We start the proof with equation (2.9), which, using the notation above, is equivalent to

$$\begin{pmatrix} f_{Y|X=0}(y) \\ f_{Y|X=1}(y) \end{pmatrix} = \begin{pmatrix} 1-q & q \\ p & 1-p \end{pmatrix} \begin{pmatrix} f_{\varepsilon|X^*=0}(y - m_0) \\ f_{\varepsilon|X^*=1}(y - m_1) \end{pmatrix}. \qquad (A.15)$$

Since $E[\varepsilon|X^*] = 0$, we have $\mu_0 = (1-q) m_0 + q m_1$ and $\mu_1 = p m_0 + (1-p) m_1$. We may solve for $p$ and $q$ as follows:

$$p = \frac{m_1 - \mu_1}{m_1 - m_0} \qquad \text{and} \qquad q = \frac{\mu_0 - m_0}{m_1 - m_0}. \qquad (A.16)$$

We also have

$$1 - p - q = \frac{\mu_1 - \mu_0}{m_1 - m_0}.$$

Assumption 2.9 implies that $1 - p - q > 0$ and $m_1 > m_0$. Inverting (A.15) gives

$$\begin{pmatrix} f_{\varepsilon|X^*=0}(y - m_0) \\ f_{\varepsilon|X^*=1}(y - m_1) \end{pmatrix} = \frac{1}{1-p-q} \begin{pmatrix} 1-p & -q \\ -p & 1-q \end{pmatrix} \begin{pmatrix} f_{Y|X=0}(y) \\ f_{Y|X=1}(y) \end{pmatrix}.$$

Plugging in the expressions for $p$ and $q$ from equation (A.16), we have

$$\frac{1-p}{1-p-q} = \frac{\mu_1 - m_0}{\mu_1 - \mu_0}, \quad \frac{q}{1-p-q} = \frac{\mu_0 - m_0}{\mu_1 - \mu_0}, \quad \frac{p}{1-p-q} = \frac{m_1 - \mu_1}{\mu_1 - \mu_0}, \quad \frac{1-q}{1-p-q} = \frac{m_1 - \mu_0}{\mu_1 - \mu_0}.$$

In other words, we have, for $j = 0, 1$,

$$f_{\varepsilon|X^*=j}(y) = \frac{\mu_1 - m_j}{\mu_1 - \mu_0}\, f_{Y|X=0}(y + m_j) + \frac{m_j - \mu_0}{\mu_1 - \mu_0}\, f_{Y|X=1}(y + m_j). \qquad (A.17)$$

In summary, $f_{X^*|X}$ (or $p$ and $q$) and $f_{\varepsilon|X^*}$ are identified if we can identify $m_0$ and $m_1$. Next, we show that $m_0$ and $m_1$ are indeed identified. By assumption 2.10, we have $E(\varepsilon^k|X^*) = E(\varepsilon^k)$ for $k = 2, 3$. For $k = 2$, we consider

$$v_0 = E\big[(m(X^*) - \mu_0)^2 \mid X = 0\big] + E(\varepsilon^2) = (1-q) m_0^2 + q m_1^2 - \mu_0^2 + E(\varepsilon^2).$$

Similarly, we have

$$v_1 = p m_0^2 + (1-p) m_1^2 - \mu_1^2 + E(\varepsilon^2).$$

We eliminate $E(\varepsilon^2)$ to obtain

$$(v_1 + \mu_1^2) - (v_0 + \mu_0^2) = (1 - p - q)(m_1^2 - m_0^2).$$

We have shown that $1 - p - q = (\mu_1 - \mu_0)/(m_1 - m_0)$. Thus $m_0$ and $m_1$ satisfy the linear equation

$$m_0 + m_1 = \frac{(v_1 + \mu_1^2) - (v_0 + \mu_0^2)}{\mu_1 - \mu_0} = C_1.$$

This means we need one more restriction to identify $m_0$ and $m_1$. We consider

$$s_0 = E\big[(Y - \mu_0)^3 \mid X = 0\big] = E\big[(m(X^*) - \mu_0)^3 \mid X = 0\big] + E(\varepsilon^3) = (1-q)(m_0 - \mu_0)^3 + q(m_1 - \mu_0)^3 + E(\varepsilon^3)$$

and

$$s_1 = p(m_0 - \mu_1)^3 + (1-p)(m_1 - \mu_1)^3 + E(\varepsilon^3).$$

We eliminate $E(\varepsilon^3)$ from the two equations above, and plug in the expressions for $p$ and $q$ from equation (A.16); the two cubic terms then factor, giving

$$s_1 - s_0 = (m_1 - \mu_1)(\mu_1 - m_0)(m_0 + m_1 - 2\mu_1) - (m_1 - \mu_0)(\mu_0 - m_0)(m_0 + m_1 - 2\mu_0).$$

Using $m_0 + m_1 = C_1$ and $(m_1 - \mu_j)(\mu_j - m_0) = C_1 \mu_j - \mu_j^2 - m_0 m_1$ for $j = 0, 1$, and collecting terms, we obtain

$$s_1 - s_0 = C_1^2(\mu_1 - \mu_0) - 3C_1(\mu_1^2 - \mu_0^2) + 2(\mu_1^3 - \mu_0^3) + 2 m_0 m_1 (\mu_1 - \mu_0),$$

that is, $m_0 m_1 = C_2$, with $C_2$ as defined in Section 2.

Therefore $m_0$ and $m_1$ satisfy

$$m^2 - C_1 m + C_2 = 0,$$

which implies that $m_0$ and $m_1$ are the two roots of this quadratic equation, with $m_1 - m_0 = \sqrt{C_1^2 - 4C_2}$. Since $m_1 > m_0$, we have

$$m_0 = \frac{C_1}{2} - \sqrt{\frac{C_1^2}{4} - C_2}, \qquad m_1 = \frac{C_1}{2} + \sqrt{\frac{C_1^2}{4} - C_2}.$$

After we have identified $m_0$ and $m_1$, $p$ and $q$ (or $f_{X^*|X}$) are identified from equation (A.16), and the density $f_{\varepsilon|X^*}$ (or $f_{Y|X^*}$) is also identified from equation (A.17). Thus, we have identified the latent densities $f_{Y|X^*}$ and $f_{X^*|X}$ from the observed density $f_{Y|X}$. Q.E.D.

REFERENCES

Ai, C. and Chen, X. (2003), "Efficient Estimation of Models with Conditional Moment Restrictions Containing Unknown Functions," Econometrica, 71, 1795-1843.

Bickel, P. J. and Ritov, Y. (1987), "Efficient Estimation in the Errors in Variables Model," Annals of Statistics, 15, 513-540.

Bordes, L., Mottelet, S. and Vandekerkhove, P. (2006), "Semiparametric Estimation of a Two-Component Mixture Model," Annals of Statistics, 34, 1204-1232.

Bound, J., Brown, C. and Mathiowetz, N. (2001), "Measurement Error in Survey Data," in Handbook of Econometrics, Vol. 5, ed. by J. Heckman and E. Leamer, North Holland.

Carroll, R. J., Ruppert, D., Crainiceanu, C., Tosteson, T. and Karagas, M. (2004), "Nonlinear and Nonparametric Regression and Instrumental Variables," Journal of the American Statistical Association, 99(467), 736-750.

Carroll, R. J., Ruppert, D., Stefanski, L. and Crainiceanu, C. (2006), Measurement Error in Nonlinear Models: A Modern Perspective, Second Edition, Chapman & Hall/CRC.

Carroll, R. J. and Stefanski, L. A. (1990), "Approximate Quasi-likelihood Estimation in Models with Surrogate Predictors," Journal of the American Statistical Association, 85, 652-663.

Chen, X. (2006), "Large Sample Sieve Estimation of Semi-Nonparametric Models," in J. J. Heckman and E. E. Leamer (eds.), The Handbook of Econometrics, Vol. 6, North-Holland, Amsterdam, forthcoming.

Chen, X., Hong, H. and Nekipelov, D. (2007), "Measurement Error Models," working paper, New York University and Stanford University; survey prepared for the Journal of Economic Literature.

Chen, X., Hong, H. and Tamer, E. (2005), "Measurement Error Models with Auxiliary Data," Review of Economic Studies, 72, 343-366.

Chen, X., Hong, H. and Tarozzi, A. (2007), "Semiparametric Efficiency in GMM Models with Auxiliary Data," Annals of Statistics, forthcoming.

Cheng, C. L. and Van Ness, J. W. (1999), Statistical Regression with Measurement Error, Arnold, London.

Fuller, W. (1987), Measurement Error Models, New York: John Wiley & Sons.

Geman, S. and Hwang, C. (1982), "Nonparametric Maximum Likelihood Estimation by the Method of Sieves," The Annals of Statistics, 10, 401-414.

Grenander, U. (1981), Abstract Inference, New York: Wiley.

Hansen, L. P. (1982), "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, 50, 1029-1054.


More information

The Asymptotic Variance of Semi-parametric Estimators with Generated Regressors

The Asymptotic Variance of Semi-parametric Estimators with Generated Regressors The Asymptotic Variance of Semi-parametric stimators with Generated Regressors Jinyong Hahn Department of conomics, UCLA Geert Ridder Department of conomics, USC October 7, 00 Abstract We study the asymptotic

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

More information

Chapter 1. GMM: Basic Concepts

Chapter 1. GMM: Basic Concepts Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating

More information

The Econometrics of Unobservables: Applications of Measurement Error Models in Empirical Industrial Organization and Labor Economics

The Econometrics of Unobservables: Applications of Measurement Error Models in Empirical Industrial Organization and Labor Economics The Econometrics of Unobservables: Applications of Measurement Error Models in Empirical Industrial Organization and Labor Economics Yingyao Hu Johns Hopkins University July 12, 2017 Abstract This paper

More information

Parametric Inference on Strong Dependence

Parametric Inference on Strong Dependence Parametric Inference on Strong Dependence Peter M. Robinson London School of Economics Based on joint work with Javier Hualde: Javier Hualde and Peter M. Robinson: Gaussian Pseudo-Maximum Likelihood Estimation

More information

Cross-fitting and fast remainder rates for semiparametric estimation

Cross-fitting and fast remainder rates for semiparametric estimation Cross-fitting and fast remainder rates for semiparametric estimation Whitney K. Newey James M. Robins The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP41/17 Cross-Fitting

More information

GMM estimation of spatial panels

GMM estimation of spatial panels MRA Munich ersonal ReEc Archive GMM estimation of spatial panels Francesco Moscone and Elisa Tosetti Brunel University 7. April 009 Online at http://mpra.ub.uni-muenchen.de/637/ MRA aper No. 637, posted

More information

11. Bootstrap Methods

11. Bootstrap Methods 11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods

More information

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley Time Series Models and Inference James L. Powell Department of Economics University of California, Berkeley Overview In contrast to the classical linear regression model, in which the components of the

More information

Quantile Processes for Semi and Nonparametric Regression

Quantile Processes for Semi and Nonparametric Regression Quantile Processes for Semi and Nonparametric Regression Shih-Kang Chao Department of Statistics Purdue University IMS-APRM 2016 A joint work with Stanislav Volgushev and Guang Cheng Quantile Response

More information

The Influence Function of Semiparametric Estimators

The Influence Function of Semiparametric Estimators The Influence Function of Semiparametric Estimators Hidehiko Ichimura University of Tokyo Whitney K. Newey MIT July 2015 Revised January 2017 Abstract There are many economic parameters that depend on

More information

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Yingying Dong and Arthur Lewbel California State University Fullerton and Boston College July 2010 Abstract

More information

Online publication date: 30 April 2010 PLEASE SCROLL DOWN FOR ARTICLE

Online publication date: 30 April 2010 PLEASE SCROLL DOWN FOR ARTICLE This article was downloaded by: [JHU John Hopkins University] On: 30 April 2010 Access details: Access Details: [subscription number 917268152] Publisher Taylor & Francis Informa Ltd Registered in England

More information

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 Revised March 2012

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 Revised March 2012 SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER By Donald W. K. Andrews August 2011 Revised March 2012 COWLES FOUNDATION DISCUSSION PAPER NO. 1815R COWLES FOUNDATION FOR

More information

Estimation and Inference with Weak Identi cation

Estimation and Inference with Weak Identi cation Estimation and Inference with Weak Identi cation Donald W. K. Andrews Cowles Foundation Yale University Xu Cheng Department of Economics University of Pennsylvania First Draft: August, 2007 Revised: March

More information

Chapter 2. Dynamic panel data models

Chapter 2. Dynamic panel data models Chapter 2. Dynamic panel data models School of Economics and Management - University of Geneva Christophe Hurlin, Université of Orléans University of Orléans April 2018 C. Hurlin (University of Orléans)

More information

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2. APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product

More information

A note on L convergence of Neumann series approximation in missing data problems

A note on L convergence of Neumann series approximation in missing data problems A note on L convergence of Neumann series approximation in missing data problems Hua Yun Chen Division of Epidemiology & Biostatistics School of Public Health University of Illinois at Chicago 1603 West

More information

Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770

Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770 Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770 Jonathan B. Hill Dept. of Economics University of North Carolina - Chapel Hill November

More information

SIEVE EXTREMUM ESTIMATION OF TRANSFORMATION MODELS

SIEVE EXTREMUM ESTIMATION OF TRANSFORMATION MODELS SIEVE EXTREMUM ESTIMATION OF TRANSFORMATION MODELS JONG-MYUN MOON Abstract. This paper studies transformation models T (Y ) = X + " with an unknown monotone transformation T. Our focus is on the identi

More information

Economics 620, Lecture 19: Introduction to Nonparametric and Semiparametric Estimation

Economics 620, Lecture 19: Introduction to Nonparametric and Semiparametric Estimation Economics 620, Lecture 19: Introduction to Nonparametric and Semiparametric Estimation Nicholas M. Kiefer Cornell University Professor N. M. Kiefer (Cornell University) Lecture 19: Nonparametric Analysis

More information

Robust Con dence Intervals in Nonlinear Regression under Weak Identi cation

Robust Con dence Intervals in Nonlinear Regression under Weak Identi cation Robust Con dence Intervals in Nonlinear Regression under Weak Identi cation Xu Cheng y Department of Economics Yale University First Draft: August, 27 This Version: December 28 Abstract In this paper,

More information

Speci cation of Conditional Expectation Functions

Speci cation of Conditional Expectation Functions Speci cation of Conditional Expectation Functions Econometrics Douglas G. Steigerwald UC Santa Barbara D. Steigerwald (UCSB) Specifying Expectation Functions 1 / 24 Overview Reference: B. Hansen Econometrics

More information

Nonparametric Estimation of Wages and Labor Force Participation

Nonparametric Estimation of Wages and Labor Force Participation Nonparametric Estimation of Wages Labor Force Participation John Pepper University of Virginia Steven Stern University of Virginia May 5, 000 Preliminary Draft - Comments Welcome Abstract. Model Let y

More information

Introduction to Nonparametric and Semiparametric Estimation. Good when there are lots of data and very little prior information on functional form.

Introduction to Nonparametric and Semiparametric Estimation. Good when there are lots of data and very little prior information on functional form. 1 Introduction to Nonparametric and Semiparametric Estimation Good when there are lots of data and very little prior information on functional form. Examples: y = f(x) + " (nonparametric) y = z 0 + f(x)

More information

Minimax Estimation of a nonlinear functional on a structured high-dimensional model

Minimax Estimation of a nonlinear functional on a structured high-dimensional model Minimax Estimation of a nonlinear functional on a structured high-dimensional model Eric Tchetgen Tchetgen Professor of Biostatistics and Epidemiologic Methods, Harvard U. (Minimax ) 1 / 38 Outline Heuristics

More information

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider

More information

Bayesian Modeling of Conditional Distributions

Bayesian Modeling of Conditional Distributions Bayesian Modeling of Conditional Distributions John Geweke University of Iowa Indiana University Department of Economics February 27, 2007 Outline Motivation Model description Methods of inference Earnings

More information

Supplemental Material 1 for On Optimal Inference in the Linear IV Model

Supplemental Material 1 for On Optimal Inference in the Linear IV Model Supplemental Material 1 for On Optimal Inference in the Linear IV Model Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Vadim Marmer Vancouver School of Economics University

More information

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 Instructions: Answer all four (4) questions. Point totals for each question are given in parenthesis; there are 00 points possible. Within

More information

SIEVE INFERENCE ON SEMI-NONPARAMETRIC TIME SERIES MODELS. Xiaohong Chen, Zhipeng Liao and Yixiao Sun. February 2012

SIEVE INFERENCE ON SEMI-NONPARAMETRIC TIME SERIES MODELS. Xiaohong Chen, Zhipeng Liao and Yixiao Sun. February 2012 SIEVE INFERENCE ON SEMI-NONPARAMERIC IME SERIES MODELS By Xiaohong Chen, Zhipeng Liao and Yixiao Sun February COWLES FOUNDAION DISCUSSION PAPER NO. 849 COWLES FOUNDAION FOR RESEARCH IN ECONOMICS YALE UNIVERSIY

More information

NUCLEAR NORM PENALIZED ESTIMATION OF INTERACTIVE FIXED EFFECT MODELS. Incomplete and Work in Progress. 1. Introduction

NUCLEAR NORM PENALIZED ESTIMATION OF INTERACTIVE FIXED EFFECT MODELS. Incomplete and Work in Progress. 1. Introduction NUCLEAR NORM PENALIZED ESTIMATION OF IERACTIVE FIXED EFFECT MODELS HYUNGSIK ROGER MOON AND MARTIN WEIDNER Incomplete and Work in Progress. Introduction Interactive fixed effects panel regression models

More information

Notes on Generalized Method of Moments Estimation

Notes on Generalized Method of Moments Estimation Notes on Generalized Method of Moments Estimation c Bronwyn H. Hall March 1996 (revised February 1999) 1. Introduction These notes are a non-technical introduction to the method of estimation popularized

More information

Lecture Notes on Measurement Error

Lecture Notes on Measurement Error Steve Pischke Spring 2000 Lecture Notes on Measurement Error These notes summarize a variety of simple results on measurement error which I nd useful. They also provide some references where more complete

More information

Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Closest Moment Estimation under General Conditions Chirok Han and Robert de Jong January 28, 2002 Abstract This paper considers Closest Moment (CM) estimation with a general distance function, and avoids

More information

GMM Estimation of SAR Models with. Endogenous Regressors

GMM Estimation of SAR Models with. Endogenous Regressors GMM Estimation of SAR Models with Endogenous Regressors Xiaodong Liu Department of Economics, University of Colorado Boulder E-mail: xiaodongliucoloradoedu Paulo Saraiva Department of Economics, University

More information

Informational Content of Special Regressors in Heteroskedastic Binary Response Models 1

Informational Content of Special Regressors in Heteroskedastic Binary Response Models 1 Informational Content of Special Regressors in Heteroskedastic Binary Response Models 1 Songnian Chen, Shakeeb Khan 3 and Xun Tang 4 This Version: April 3, 013 We quantify the informational content of

More information

Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Closest Moment Estimation under General Conditions Chirok Han Victoria University of Wellington New Zealand Robert de Jong Ohio State University U.S.A October, 2003 Abstract This paper considers Closest

More information

UNIVERSITY OF CALIFORNIA Spring Economics 241A Econometrics

UNIVERSITY OF CALIFORNIA Spring Economics 241A Econometrics DEPARTMENT OF ECONOMICS R. Smith, J. Powell UNIVERSITY OF CALIFORNIA Spring 2006 Economics 241A Econometrics This course will cover nonlinear statistical models for the analysis of cross-sectional and

More information

ESTIMATION OF NONPARAMETRIC MODELS WITH SIMULTANEITY

ESTIMATION OF NONPARAMETRIC MODELS WITH SIMULTANEITY ESTIMATION OF NONPARAMETRIC MODELS WITH SIMULTANEITY Rosa L. Matzkin Department of Economics University of California, Los Angeles First version: May 200 This version: August 204 Abstract We introduce

More information

Closed-Form Estimation of Nonparametric Models. with Non-Classical Measurement Errors

Closed-Form Estimation of Nonparametric Models. with Non-Classical Measurement Errors Closed-Form Estimation of Nonparametric Models with Non-Classical Measurement Errors Yingyao Hu Johns Hopkins Yuya Sasaki Johns Hopkins November 30, 04 Abstract This paper proposes closed-form estimators

More information

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,

More information

Identification and Estimation of Regression Models with Misclassification

Identification and Estimation of Regression Models with Misclassification Identification and Estimation of Regression Models with Misclassification Aprajit Mahajan 1 First Version: October 1, 2002 This Version: December 1, 2005 1 I would like to thank Han Hong, Bo Honoré and

More information

Computation Of Asymptotic Distribution. For Semiparametric GMM Estimators. Hidehiko Ichimura. Graduate School of Public Policy

Computation Of Asymptotic Distribution. For Semiparametric GMM Estimators. Hidehiko Ichimura. Graduate School of Public Policy Computation Of Asymptotic Distribution For Semiparametric GMM Estimators Hidehiko Ichimura Graduate School of Public Policy and Graduate School of Economics University of Tokyo A Conference in honor of

More information

Tests for Cointegration, Cobreaking and Cotrending in a System of Trending Variables

Tests for Cointegration, Cobreaking and Cotrending in a System of Trending Variables Tests for Cointegration, Cobreaking and Cotrending in a System of Trending Variables Josep Lluís Carrion-i-Silvestre University of Barcelona Dukpa Kim y Korea University May 4, 28 Abstract We consider

More information

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data Song Xi CHEN Guanghua School of Management and Center for Statistical Science, Peking University Department

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

Random Utility Models, Attention Sets and Status Quo Bias

Random Utility Models, Attention Sets and Status Quo Bias Random Utility Models, Attention Sets and Status Quo Bias Arie Beresteanu and Roee Teper y February, 2012 Abstract We develop a set of practical methods to understand the behavior of individuals when attention

More information

ECO Class 6 Nonparametric Econometrics

ECO Class 6 Nonparametric Econometrics ECO 523 - Class 6 Nonparametric Econometrics Carolina Caetano Contents 1 Nonparametric instrumental variable regression 1 2 Nonparametric Estimation of Average Treatment Effects 3 2.1 Asymptotic results................................

More information

Lecture 8: Basic convex analysis

Lecture 8: Basic convex analysis Lecture 8: Basic convex analysis 1 Convex sets Both convex sets and functions have general importance in economic theory, not only in optimization. Given two points x; y 2 R n and 2 [0; 1]; the weighted

More information

Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity

Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity John C. Chao, Department of Economics, University of Maryland, chao@econ.umd.edu. Jerry A. Hausman, Department of Economics,

More information

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Arthur Lewbel Boston College December 2016 Abstract Lewbel (2012) provides an estimator

More information

Instrumental Variable Treatment of Nonclassical Measurement Error Models

Instrumental Variable Treatment of Nonclassical Measurement Error Models Instrumental Variable Treatment of Nonclassical Measurement Error Models Yingyao Hu Department of Economics The University of Texas at Austin 1 University Station C3100 BRB 1.116 Austin, TX 78712 hu@eco.utexas.edu

More information

Notes on Mathematical Expectations and Classes of Distributions Introduction to Econometric Theory Econ. 770

Notes on Mathematical Expectations and Classes of Distributions Introduction to Econometric Theory Econ. 770 Notes on Mathematical Expectations and Classes of Distributions Introduction to Econometric Theory Econ. 77 Jonathan B. Hill Dept. of Economics University of North Carolina - Chapel Hill October 4, 2 MATHEMATICAL

More information

Made available courtesy of Elsevier:

Made available courtesy of Elsevier: Binary Misclassification and Identification in Regression Models By: Martijn van Hasselt, Christopher R. Bollinger Van Hasselt, M., & Bollinger, C. R. (2012). Binary misclassification and identification

More information

Generated Covariates in Nonparametric Estimation: A Short Review.

Generated Covariates in Nonparametric Estimation: A Short Review. Generated Covariates in Nonparametric Estimation: A Short Review. Enno Mammen, Christoph Rothe, and Melanie Schienle Abstract In many applications, covariates are not observed but have to be estimated

More information

Conditions for the Existence of Control Functions in Nonseparable Simultaneous Equations Models 1

Conditions for the Existence of Control Functions in Nonseparable Simultaneous Equations Models 1 Conditions for the Existence of Control Functions in Nonseparable Simultaneous Equations Models 1 Richard Blundell UCL and IFS and Rosa L. Matzkin UCLA First version: March 2008 This version: October 2010

More information

Identi cation and Estimation of Semiparametric Two Step Models

Identi cation and Estimation of Semiparametric Two Step Models Identi cation and Estimation of Semiparametric Two Step Models Juan Carlos Escanciano y Indiana University David Jacho-Chávez z Indiana University Arthur Lewbel x Boston College original May 2010, revised

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Identi cation of Positive Treatment E ects in. Randomized Experiments with Non-Compliance

Identi cation of Positive Treatment E ects in. Randomized Experiments with Non-Compliance Identi cation of Positive Treatment E ects in Randomized Experiments with Non-Compliance Aleksey Tetenov y February 18, 2012 Abstract I derive sharp nonparametric lower bounds on some parameters of the

More information

Non-parametric Identi cation and Testable Implications of the Roy Model

Non-parametric Identi cation and Testable Implications of the Roy Model Non-parametric Identi cation and Testable Implications of the Roy Model Francisco J. Buera Northwestern University January 26 Abstract This paper studies non-parametric identi cation and the testable implications

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2016 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2016 Instructor: Victor Aguirregabiria ECOOMETRICS II (ECO 24S) University of Toronto. Department of Economics. Winter 26 Instructor: Victor Aguirregabiria FIAL EAM. Thursday, April 4, 26. From 9:am-2:pm (3 hours) ISTRUCTIOS: - This is a closed-book

More information

Statistical Properties of Numerical Derivatives

Statistical Properties of Numerical Derivatives Statistical Properties of Numerical Derivatives Han Hong, Aprajit Mahajan, and Denis Nekipelov Stanford University and UC Berkeley November 2010 1 / 63 Motivation Introduction Many models have objective

More information

Measuring robustness

Measuring robustness Measuring robustness 1 Introduction While in the classical approach to statistics one aims at estimates which have desirable properties at an exactly speci ed model, the aim of robust methods is loosely

More information

Estimating Semi-parametric Panel Multinomial Choice Models

Estimating Semi-parametric Panel Multinomial Choice Models Estimating Semi-parametric Panel Multinomial Choice Models Xiaoxia Shi, Matthew Shum, Wei Song UW-Madison, Caltech, UW-Madison September 15, 2016 1 / 31 Introduction We consider the panel multinomial choice

More information

Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON

Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON Department of Economics, Michigan State University East Lansing, MI 48824-1038, United States (email: carls405@msu.edu)

More information

Nonparametric Identification and Semiparametric Estimation of Classical Measurement Error Models Without Side Information

Nonparametric Identification and Semiparametric Estimation of Classical Measurement Error Models Without Side Information Supplementary materials for this article are available online. Please go to www.tandfonline.com/r/jasa Nonparametric Identification and Semiparametric Estimation of Classical Measurement Error Models Without

More information

Linear Regression and Its Applications

Linear Regression and Its Applications Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start

More information

We begin by thinking about population relationships.

We begin by thinking about population relationships. Conditional Expectation Function (CEF) We begin by thinking about population relationships. CEF Decomposition Theorem: Given some outcome Y i and some covariates X i there is always a decomposition where

More information

Pseudo panels and repeated cross-sections

Pseudo panels and repeated cross-sections Pseudo panels and repeated cross-sections Marno Verbeek November 12, 2007 Abstract In many countries there is a lack of genuine panel data where speci c individuals or rms are followed over time. However,

More information

Sieve Inference on Semi-nonparametric Time Series Models

Sieve Inference on Semi-nonparametric Time Series Models Sieve Inference on Semi-nonparametric ime Series Models Xiaohong Chen y, Zhipeng Liao z, and Yixiao Sun x First Draft: October 2009; his draft: August 202 Abstract his paper provides a general theory on

More information

Informational Content in Static and Dynamic Discrete Response Panel Data Models

Informational Content in Static and Dynamic Discrete Response Panel Data Models Informational Content in Static and Dynamic Discrete Response Panel Data Models S. Chen HKUST S. Khan Duke University X. Tang Rice University May 13, 2016 Preliminary and Incomplete Version Abstract We

More information

Semiparametric Estimation of Partially Linear Dynamic Panel Data Models with Fixed Effects

Semiparametric Estimation of Partially Linear Dynamic Panel Data Models with Fixed Effects Semiparametric Estimation of Partially Linear Dynamic Panel Data Models with Fixed Effects Liangjun Su and Yonghui Zhang School of Economics, Singapore Management University School of Economics, Renmin

More information