Relaxing Conditional Independence in a Semi-Parametric Endogenous Binary Response Model


Relaxing Conditional Independence in a Semi-Parametric Endogenous Binary Response Model

Alyssa Carlson

Dated: October 5, 2017

Abstract

Expanding on the works of Rivers and Vuong (1988), Blundell and Powell (2004), and Rothe (2009), this paper presents a new flexible conditional maximum likelihood estimator that is able to address issues previously ignored in the literature. This estimator follows the standard two-step control function approach to address endogeneity of a continuous random variable and is semi-parametric in the standard sense of having a preliminary infinite-dimensional nuisance parameter. By relaxing the Conditional Independence assumption that was previously used for identification, the proposed estimator is more robust in certain respects. For instance, this estimation procedure allows for parametric specifications of heteroskedasticity that the Blundell and Powell and Rothe estimators can address only in restricted forms. In addition, following the work of Kim and Petrin (working paper), the model allows for a more flexible (although parametrically specified) control function. Standard asymptotic results for the estimator are derived, including consistency, √n-asymptotic normality, and an estimator for the asymptotic variance. Simulation results on parameter estimates, Average Partial Effects estimates, and Average Structural Function estimates are provided for two different specifications. The data generating process for the simulations models the empirical data given in Blundell and Powell (2004) and Rothe (2009) to give some economic context to the results. This paper concludes that there is a trade-off between a flexible specification and a structural interpretation, and that the consequences of assuming Conditional Independence cannot be ignored.

Keywords: Binary Response, Probit, Endogeneity, Control Function, Sieve Estimation, Heteroskedasticity

PhD student at the Economics Department of Michigan State University. I would like to thank Professor Jeffrey Wooldridge, Professor Kyoo Il Kim, and Professor Ben Zou for all their direction, helpful comments, and advice.

Contents

1 Introduction
2 Model Set Up
3 Identification
  3.1 Multiplicative Heteroskedasticity
  3.2 Conditional Mean Restriction
  3.3 Conditional Mean and Variance Functions are Nonlinear in Parameters
4 Estimation
  4.1 Average Structural Function
  4.2 Average Partial Effects
5 Asymptotics
  5.1 Consistency
  5.2 √n-Asymptotic Normality
6 Simulation
  6.1 Conditional Independence does not hold
  6.2 Linear Index
7 Conclusion
A Equations and Notation
B Consistent Variance Estimation
C Proofs
  C.1 Proof of Theorem
  C.2 Proof of Lemma B
D Simulation Implementation
  D.1 Conditional Independence does not hold
  D.2 Linear Index

1 Introduction

In investigating causal effects in a binary response framework, a predominant concern is the ability to address endogeneity. For instance, in models of labor participation, one is usually concerned about the endogeneity of non-wage income, as it is usually simultaneously determined within the household. Rivers and Vuong (1988), Blundell and Powell (2004), and Rothe (2009) are a series of papers that propose a control function method to address endogeneity of a continuous regressor. However, all three papers either explicitly or implicitly assume Conditional Independence (CI) to simplify identification. Following the notation of Blundell and Powell (2004),

y_{1i} = \begin{cases} 1 & y_{1i}^* \ge 0 \\ 0 & y_{1i}^* < 0 \end{cases}, \qquad y_{1i}^* = x_i\beta_o + u_{1i}, \qquad y_{2i} = \pi_o(z_i) + v_{2i}   (1)

where x_i includes the exogenous regressors (z_{1i}) and the continuous endogenous regressor (y_{2i}). Then the CI assumption is

u_{1i} \mid v_{2i}, z_i \sim u_{1i} \mid v_{2i}   (2)

in other words, conditional on v_{2i}, z_i is independent of u_{1i}. In Rivers and Vuong (1988), CI is a consequence of their assumption that u_1 and v_2 are jointly independent of z. Blundell and Powell (2004) and Rothe (2009) state that the CI assumption is necessary for identification. But CI may be too stringent an assumption in many empirical contexts.

This paper proposes an estimator that utilizes the same control function technique but extends the model by relaxing the CI assumption. Following the literature on Non-Parametric Instrumental Variables (NPIV)[1] and Kim and Petrin (working paper), the CI assumption can, and in many cases should, be relaxed. Kim and Petrin (working paper) relax CI in the non-parametric case where the unobserved heterogeneity is additively separable. They also provide several examples where CI is too restrictive, such as returns to education, production functions, and supply and demand frameworks where the reduced form for price is non-separable. This paper extends the generalization to a binary response model where the error is not additively separable.

Footnote 1: Ai and Chen (2003), Newey and Powell (2003), Hall et al. (2005), and Blundell et al. (2007).

Alternative estimators that do not require the CI assumption in the context of the control function approach are the special regressor estimator proposed in Lewbel (2000) and Dong and Lewbel (2015) and the minimum distance estimator proposed by Hong and Tamer (2003). However, both of these methods require alternative conditional restrictions. The special regressor estimator requires a strictly exogenous regressor that has large support, while the minimum distance estimator requires a conditional median restriction: that the median of the error u_{1i} conditional on the instruments z_i is zero. Under these alternative assumptions, those proposed estimators are valid. This paper aims to propose an alternative estimator that falls under the control function method with weaker assumptions than conditional independence.

The remainder of this paper is organized as follows. Section 2 describes the set-up of the model and how it relates to and differs from previous models and assumptions in the literature. The generalization proposed in this paper does have its costs, in that identification is slightly more difficult to obtain, so the following section discusses what restrictions are needed to obtain local and global identification. Section 4 gives instructions on how the estimator would be implemented and discusses the usual functions of interest, the Average Structural Function (ASF) and Average Partial Effects (APE), and how to estimate them. Standard asymptotic results are presented in Section 5; this section relies heavily on the work in Newey (1994). Section 6 presents the results from a simulation study comparing the proposed estimator with current estimation procedures under two scenarios. The data generating process for the simulations models the empirical data given in Blundell and Powell (2004) and Rothe (2009) to give some economic context to the results. Finally, the paper concludes with a short discussion on possible directions to expand this research.

2 Model Set Up

Consider the triangular system described by equation (1), where y_{1i} is a binary response variable, z_i = (z_{1i}, z_{2i}) is a 1 × (k_1 + k_2) vector of covariates, y_{2i} is a single continuous endogenous regressor, and x_i

is a 1 × K_x vector where each element is a function of (z_{1i}, y_{2i}). Let u_{1i} and v_{2i} be mean-zero unobserved heterogeneity, and let the function π_o(·) be an unknown function. Unlike in the linear case (see Chapter 6.2 in Wooldridge (2010) for reference), where constructing the control function relies only on linear projection arguments, in the nonlinear case stronger assumptions are needed. In particular, the function π_o(·) will need to be the true conditional mean. This will be discussed further in the section on identification.

The distributional assumptions for u_{1i} and v_{2i} determine the estimation procedure and the consistency of estimates. For example, if one were to assume u_{1i} | v_{2i}, z_i ~ N(0, 1), then there is no endogeneity and no heteroskedasticity, and therefore a standard probit MLE procedure will yield consistent estimates. On the other hand, if u_{1i} | v_{2i}, z_i ~ N(0, exp(2 z_i δ)), then heteroskedasticity is present and the standard probit MLE procedure would be inconsistent, but a het-probit MLE procedure would be consistent. If u_{1i} | v_{2i}, z_i ~ N(ρ v_{2i}, 1), then the two-step CMLE procedure developed by Smith and Blundell (1986) and further explored by Rivers and Vuong (1988) would be consistent, and other methods that ignore the endogeneity would be inconsistent. More generally, if the CI assumption holds, so that u_{1i} | v_{2i}, z_i ~ u_{1i} | v_{2i} with some unknown distribution, Blundell and Powell (2004) (for the remainder of the paper referred to as BP) and Rothe (2009) provide semi-parametric methods that estimate the parameters consistently. As a first step in relaxing the CI assumption, the proposed estimation procedure will require the conditional distribution to be normal with known conditional mean and conditional variance functions, as stated in Assumption 2.1; further generalizations to an unknown conditional mean, conditional variance, or distribution are left to future research.

Assumption 2.1. From the set-up in equation (1), let the unobserved heterogeneity have the following conditional distribution:

u_{1i} \mid z_i, v_{2i}, y_{2i} = u_{1i} \mid z_i, v_{2i} \sim N\big(h(v_{2i}, z_i; \gamma_o),\ \exp(2\, g(v_{2i}, z_i; \delta_o))\big)

where z_i = (z_{1i}, z_{2i}), and h(v_{2i}, z_i; γ_o) and g(v_{2i}, z_i; δ_o) are known functions up to a finite number of unknown parameters (γ_o, δ_o).

Under Assumption 2.1, the conditional mean of y_{1i} is

E(y_{1i} \mid z_i, y_{2i}, v_{2i}) = \Phi\left(\frac{x_i\beta_o + h(v_{2i}, z_i; \gamma_o)}{\exp(g(v_{2i}, z_i; \delta_o))}\right)   (3)

This result should be unsurprising, as it appears to be a heteroskedastic probit model that adjusts for endogeneity using the control function approach, both of which have been discussed extensively in the literature before. Since the normal distribution is indexed by its mean and variance, relaxing CI under a normality assumption simply allows for heteroskedasticity and a flexible control function in a probit model. As a result, estimation will follow a simple two-step approach: the conditional mean of y_{2i} is estimated in the first stage to obtain residuals (v̂_{2i}) that are then plugged into a second-step heteroskedastic probit using the conditional mean given in equation (3). This will be discussed in more detail in Section 3.

There are several important implications of Assumption 2.1 in comparison to the standard CI assumption. First, Assumption 2.1 allows the control function to be a general function of (v_{2i}, z_i), whereas CI implies that the control function cannot be a function of z_i. In the linear model this was not an issue, since the control function was derived using a projection argument. Consider the example of the demand model from Kim and Petrin (working paper), where the outcome (y_{1i}) is demand for the product, the endogenous variable (y_{2i}) is price, the exogenous variables (z_i) are observable characteristics, and the latent error (u_{1i}) can be interpreted as the unobservable characteristics. Then CI requires the expectation of the unobservable characteristics of a product, conditional on the price and observable characteristics, to be just a function of the price. One could imagine how the observable characteristics may interact with price in the conditional expectation. For example, consider the demand for purchasing a home where the unobservable characteristic is the quality of the neighbors (i.e., are they loud neighbors?). Then observable characteristics such as proximity to the neighbors (i.e., apartment versus spaced-out homes) and affluence of the neighborhood would have interactive effects on the expected price of the home. Therefore CI may be too strong an assumption.

Second, CI implies that u_1 conditional on v_2 cannot be heteroskedastic in z. Assumption 2.1 allows for general forms of heteroskedasticity through the function g(v_{2i}, z_i; δ_o).[3] This is particularly relevant in the nonlinear model, where unaccounted heteroskedasticity can result in inconsistent parameter estimates.

Footnote 2: The normality assumption could easily be generalized to any known distribution with CDF G(·). This would then allow for the logistic distribution, and therefore this paper would generalize to the logit model as well.

Footnote 3: Note that assuming the conditional variance is an exponential function is not restrictive; it merely enforces non-negativity of the variance and allows g(v_{2i}, z_i; δ_o) to be unrestricted, which eases estimation.
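To make the structure of equation (3) concrete, here is a minimal sketch in Python (assuming NumPy and SciPy are available) of the conditional mean under one hypothetical linear-in-parameters choice of h and g; the particular functions, variable names, and parameter values are illustrative only, not the specifications used later in the paper.

```python
import numpy as np
from scipy.stats import norm

def conditional_mean_y1(x, z, v2, beta, gamma, delta):
    """Conditional mean in equation (3): Phi((x*beta + h) / exp(g)).

    Hypothetical specification (illustration only):
        h(v2, z; gamma) = gamma[0]*v2 + gamma[1]*z[:, 0]*v2   (control function)
        g(v2, z; delta) = delta[0]*z[:, 0] + delta[1]*v2      (heteroskedasticity)
    Both depend on z, which the CI assumption would rule out.
    """
    h = gamma[0] * v2 + gamma[1] * z[:, 0] * v2
    g = delta[0] * z[:, 0] + delta[1] * v2
    return norm.cdf((x @ beta + h) / np.exp(g))

# Evaluate the conditional mean on a few artificial observations.
rng = np.random.default_rng(0)
x = np.column_stack([np.ones(5), rng.normal(size=5)])   # x_i includes a constant
z = rng.normal(size=(5, 2))
v2 = rng.normal(size=5)
print(conditional_mean_y1(x, z, v2, beta=np.array([0.2, -0.5]),
                          gamma=np.array([0.3, 0.1]), delta=np.array([0.2, 0.0])))
```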

BP and Rothe claim that although they impose CI, their estimation procedure should be able to handle limited forms of heteroskedasticity. This proposition will be addressed further in the section on deriving the ASF and explored in the simulation study.

The third implication is the distributional assumption and possible misspecification of the probit functional form. This is particularly pertinent in contrast to the estimators of BP and Rothe, which impose no distributional assumptions. Although the distributional assumption may not hold in empirical contexts, having the probit functional form simplifies identification and estimation. This could be relaxed in future research. In the following section, identification will be shown for u_{1i} | v_{2i}, z_i normally distributed and the functions h(v_{2i}, z_i; γ) and g(v_{2i}, z_i; δ) known. To see why we cannot relax both CI and the distributional assumption, consider the most general case: let F(·; v_{2i}, z_i) be the conditional CDF of −u_{1i} given (v_{2i}, z_i); then

E(y_{1i} \mid v_{2i}, z_i) = P(-u_{1i} \le x_i\beta_o \mid v_{2i}, z_i) = F(x_i\beta_o; v_{2i}, z_i)

where F(·; v_{2i}, z_i) would be estimated non-parametrically. But since y_{2i} is in x_i and is perfectly determined by z_i and v_{2i}, identification would not hold. Therefore some structure must be imposed in order to generalize. There will always be instances where one can argue which assumption should take precedence. However, one of the benefits of assuming a known distributional functional form is the computational ease of implementation. In particular, the proposed estimation procedure can be executed using preprogrammed commands in common statistical packages and can be computed quite quickly, compared to the estimation procedures described in BP and Rothe (2009), which are much more computationally demanding.[4]

Footnote 4: Rothe provides code in R for his estimator, which makes it fairly easy to implement.

3 Identification

Since the estimation process is done in two stages, identification needs to be shown for both steps. Identification of the first stage is a standard application of non-parametric identification of a conditional mean.

The exogeneity condition for the reduced form of y_{2i} is a decomposition of its conditional mean (π_o(z_{1i}, z_{2i})) and unobserved heterogeneity (v_{2i}). Therefore the reduced form is a model without endogeneity.[5] This is stated in the following lemma.

Lemma 3.1 (First Stage Identification). Consider the set-up in equation (1). Let E(v_{2i} | z_i) = 0, so that E(y_{2i} | z_i) = π_o(z_{1i}, z_{2i}); then the function π_o(z_{1i}, z_{2i}) is non-parametrically identified.

Notice that the assumption in this lemma is much stronger than the usual projection argument when the control function approach is used in linear regression. When using a projection argument, one may always write y_{2i} = z_iρ + v_{2i} with E(z_i'v_{2i}) = 0, but z_iρ is not necessarily the conditional mean.

Identification of the second stage parameters β_o, γ_o, and δ_o requires more thought and depends on the functional forms of h(v_{2i}, z_i; γ_o) and g(v_{2i}, z_i; δ_o). Let's first consider identification in the most common setting, in which h(v_{2i}, z_i; γ_o) = h(v_{2i}, z_i)γ_o and g(v_{2i}, z_i; δ_o) = g(v_{2i}, z_i)δ_o are linear in parameters. There are two major issues that need to be addressed for identification: multiplicative heteroskedasticity and the conditional mean restriction. The next two subsections address both issues in the linear case.

3.1 Multiplicative Heteroskedasticity

Carlson (working paper) addresses the issue of identification with exponential multiplicative heteroskedasticity and provides sufficient conditions for identification. From Theorem 1 of Carlson (working paper), the following is sufficient for identification of the parameters in the heteroskedastic probit model.

Assumption 3.1. In the set-up in equation (1), where h(v_{2i}, z_i; γ_o) = h(v_{2i}, z_i)γ_o and g(v_{2i}, z_i; δ_o) = g(v_{2i}, z_i)δ_o are linear in parameters,
(i) E[(x_i, h(v_{2i}, z_i))'(x_i, h(v_{2i}, z_i))] is non-singular and x_i includes a constant;
(ii) E[g(v_{2i}, z_i)'g(v_{2i}, z_i)] is non-singular and g(v_{2i}, z_i) does not include a constant;
(iii) the joint support of (x_i, h(v_{2i}, z_i), g(v_{2i}, z_i)) has at least three points;
(iv) the parameter space of (β_o, γ_o) does not allow the coefficient on the constant to be 0 or the coefficients on the remainder of the terms to all be 0.

Parts (i) and (ii) are fairly standard in the literature. Parts (iii) and (iv) are needed to ensure there is no manipulation of the support or heteroskedastic transformation that would prevent separate identification of the mean parameters β_o and γ_o and the heteroskedastic parameters δ_o.

Footnote 5: As discussed in Chen (2007), this implies the true parameter π_o is identified as the unique minimizer of Q(π) = E[(y_{2i} − π(z_i))²].

Although part (i) does not appear to be restrictive, it does make assumptions on the relationship between x_i and h(v_{2i}, z_i). The next section discusses this issue in more detail.

3.2 Conditional Mean Restriction

The random variables that compose the elements of x_i and h(v_{2i}, z_i) are the same, since y_{2i} is a function of z_i and v_{2i}; therefore it is quite likely that even if E[x_i'x_i] and E[h(v_{2i}, z_i)'h(v_{2i}, z_i)] are non-singular, Assumption 3.1 (i) may not be satisfied. This section aims to develop a sufficient condition on the construction of the control function to ensure identification. Since Assumption 3.1 (i) essentially requires that none of the elements of x_i and h(v_{2i}, z_i) can be written as linear combinations of the other elements, a sufficient assumption is the Conditional Mean Restriction (CMR).

Assumption 3.2. Given the set-up in Assumption 2.1, with the function h(v_{2i}, z_i; γ_o) = h(v_{2i}, z_i)γ_o linear in its parameters,
(i) E(x_i'x_i) is non-singular;
(ii) E(h(v_{2i}, z_i)'h(v_{2i}, z_i)) is non-singular;
(iii) (CMR) E(h(v_{2i}, z_i) | z_i) is a zero vector.

The Conditional Mean Restriction is from Kim and Petrin (working paper) and the literature on non-parametric IV (Newey and Powell (2003), Hall et al. (2005), Blundell et al. (2007), and others). Kim and Petrin (working paper) use a similar assumption to show identification in a non-parametric triangular system with an additively separable error. The CMR can be interpreted as a way to distinguish the endogeneity of y_{2i} from the exogeneity of z_i. Previous papers have utilized the CI assumption (Newey et al. (1999), Blundell and Powell (2004), Rothe (2009)), equation (2), and as a result the distribution of u_{1i} conditional on v_{2i} and z_i cannot be a function of z_i. CI ensures identification since v_{2i} (and any function of v_{2i}) is linearly independent of x_i by construction. After removing the CI assumption, z_i and u_{1i} are allowed to have a relationship, but the CMR restricts that relationship so that z_i cannot be interpreted as an endogenous variable. By the law of iterated expectations,

E(u_{1i} \mid z_i) = E\big(E(u_{1i} \mid z_i, v_{2i}) \mid z_i\big) = E\big(h(v_{2i}, z_i)\gamma_o \mid z_i\big) = 0

The middle equality holds by the specification provided in Assumption 2.1, and the last equality holds by the CMR. As a result, the CMR merely requires that z_i be mean independent of u_{1i}.

To provide some intuition for the implications, the CMR requires the elements of h_i to be random and to be demeaned. For instance, v_{2i}² could not be an element of h_i, but v_{2i}² − E(v_{2i}² | z_{1i}, z_{2i}) could be. In addition, no element can be a function of (z_{1i}, z_{2i}) alone; such terms can only enter as interactions with functions of v_{2i}. This prevents any issues of linear dependence between the elements of x_i and h(v_{2i}, z_i). To show this, let ξ be a non-random vector such that

\big(x_i \;\; h(v_{2i}, z_i)\big)\begin{pmatrix}\xi_1 \\ \xi_2\end{pmatrix} = 0 \iff x_i\xi_1 + h(v_{2i}, z_i)\xi_2 = 0

Taking the conditional expectation with respect to z_i,

E(x_i \mid z_i)\xi_1 + E(h_i \mid z_i)\xi_2 = 0 \implies E(x_i \mid z_i)\xi_1 = 0

By standard exclusion and relevance conditions on the instrument z_{2i} and linear independence of x_i, E(x_i | z_i) is also linearly independent. Therefore ξ_1 is a zero vector, and it follows that ξ_2 is a zero vector. Therefore Assumption 3.2 is sufficient for Assumption 3.1 (i).

The next subsection extends this discussion to the nonlinear case. Since the specifications for the control function and heteroskedastic function are left general, identification cannot be shown without knowing the functional forms. Consequently, the next subsection shows how one would go about showing identification given a known nonlinear functional form.

3.3 Conditional Mean and Variance Functions are Nonlinear in Parameters

If h(v_{2i}, z_i; γ_o) and g(v_{2i}, z_i; δ_o) are nonlinear in the parameters, showing identification follows the works of Rothenberg (1971) and Komunjer (2012). Using Theorem 1 of Rothenberg (1971), local identification requires verification that the information matrix is full rank, along with some other regularity conditions.[6]

Footnote 6: The standard regularity assumptions are that the parameter space is open, the support of the random variables does not depend on the values of the parameters, the functions h(v_{2i}, z_i; γ_o) and g(v_{2i}, z_i; δ_o) are continuously differentiable in γ_o and δ_o respectively, and θ_o is a regular point of the information matrix I(θ), where θ_o = (β_o', γ_o', δ_o')'.

Let θ_o = (β_o', γ_o', δ_o')' and let S_i(θ_o, π_o) denote the score of the log-likelihood for a probit model derived using the conditional mean given in equation (3). Then the information matrix can be written as

I(\theta_o) = E[S_i(\theta_o, \pi_o) S_i(\theta_o, \pi_o)']
           = E\left[\omega(x_i, z_i; \theta_o, \pi_o)\begin{pmatrix} x_i'x_i & x_i'h_\gamma & (x_i\beta_o + h(v_{2i}, z_i; \gamma_o))\,x_i'g_\delta \\ h_\gamma'x_i & h_\gamma'h_\gamma & (x_i\beta_o + h(v_{2i}, z_i; \gamma_o))\,h_\gamma'g_\delta \\ (x_i\beta_o + h(v_{2i}, z_i; \gamma_o))\,g_\delta'x_i & (x_i\beta_o + h(v_{2i}, z_i; \gamma_o))\,g_\delta'h_\gamma & (x_i\beta_o + h(v_{2i}, z_i; \gamma_o))^2\,g_\delta'g_\delta \end{pmatrix}\right]   (4)

where h_γ and g_δ are the row vectors of partial derivatives of the functions h(v_{2i}, z_i; γ) and g(v_{2i}, z_i; δ) with respect to γ and δ (respectively), evaluated at γ_o and δ_o (respectively), and ω(x_i, z_i; θ_o, π_o) is a non-zero scalar function of its arguments.[7] Verification of full rank reduces to verification of linear independence of the random row vectors (x_i, h_γ) and g_δ, where an analogue of the CMR will be used, as well as no element of g_δ being a constant term.

This can be extended to global identification following Theorem 2 of Komunjer (2012). However, this does require that the determinant of the information matrix be non-positive for all values of the parameters in the parameter space and that the expectation of the score be a proper function of the parameters θ. This is much more difficult to show, or to provide intuition for, without the functional forms of h(v_{2i}, z_i; γ_o) and g(v_{2i}, z_i; δ_o).

Footnote 7: For it to be non-zero, β_o is required to be non-zero, as well as sufficient support of x_i so that x_iβ_o is non-zero. This assumption is paralleled in the linear case, where (β_o, γ_o) is required to be non-zero to identify the heteroskedastic component.

4 Estimation

The estimator follows a two-step procedure, as an application of a semi-parametric two-step estimator. First, estimate the conditional mean function E(y_{2i} | z_i) = π_o(z_{1i}, z_{2i}) non-parametrically. This can be done via the sieve method, where Π denotes a space of functions that includes π_o(·), with a metric induced by the norm ‖π‖. Let {p_{lL_n}(z_i), l = 1, 2, ..., L_n} be a sequence of basis functions of z_i. Letting Z denote the support of z_i, the sieve spaces are defined as

\Pi_{L_n} = \Big\{\pi : \pi(z) = \sum_{l < L_n} p_{lL_n}(z)\lambda_l,\ \lambda_l \in \mathbb{R}^{\dim(p_{lL_n}(z))},\ z \in \mathcal{Z}\Big\}   (5)

where L_n → ∞ and L_n/n → 0, so that Π_L ⊆ Π_{L+1} ⊆ ... ⊆ Π. Consider a non-parametric multivariate least squares regression where the population criterion function is

Q(\pi) = E[(y_{2i} - \pi(z_i))^2]   (6)

Then the series estimator minimizes the sample criterion function:

\hat{\pi} = \arg\min_{\pi \in \Pi_{L_n}} \hat{Q}_n(\pi) \equiv \arg\min_{\pi \in \Pi_{L_n}} \frac{1}{n}\sum_{i=1}^n (y_{2i} - \pi(z_i))^2   (7)

In practice this is as simple as OLS estimation: if P_{L_n}(z_i) = (p_{1L_n}(z_i), p_{2L_n}(z_i), ..., p_{L_nL_n}(z_i)), then the estimate of the conditional mean is

\hat{\pi}(z_i) = P_{L_n}(z_i)\left(\sum_{k=1}^n P_{L_n}(z_k)'P_{L_n}(z_k)\right)^{-1}\sum_{j=1}^n P_{L_n}(z_j)'y_{2j} = P_{L_n}(z_i)\hat{\lambda}

and the residuals, constructed as v̂_{2i} = y_{2i} − π̂(z_i), are used in the second stage.

In determining a sieve space, suppose for a moment that z_i is a scalar; if the support of z_i is the unit interval [0, 1], one could consider a simple linear sieve such as polynomials.[8] The polynomial sieve space is

\Pi^{Poly}_{L_n} = \Big\{\pi : \pi(z) = \sum_{l < L_n} \lambda_l z^l,\ \lambda_l \in \mathbb{R},\ z \in [0, 1]\Big\}   (8)

This can be extended to a multi-dimensional space using tensor products. One could also consider splines or orthogonal wavelets in place of the polynomial sieve if there is bounded support. However, if z_i has unbounded support, it would be more appropriate to use a Hermite polynomial or Laguerre polynomial sieve space. In Section 5, where the asymptotics are derived, it will be assumed that a polynomial sieve space is utilized. Since only the convergence rate of the first stage estimator comes into play when deriving √n-asymptotic normality for the second stage parameters, any non-parametric first stage estimator, such as a kernel based method, can be considered in place of the series estimator.

Footnote 8: Of course, any bounded random variable with known bounds can be transformed into a random variable with support [0, 1].
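As an illustration of the first stage, the following Python sketch runs the series regression above using a tensor-product polynomial basis and returns the fitted conditional mean π̂(z_i) and the residuals v̂_{2i}. The basis order, variable names, and placeholder data are hypothetical; in practice L_n would grow with the sample size as discussed above.

```python
import numpy as np
from itertools import product

def polynomial_basis(z, order):
    """Tensor-product polynomial basis P_Ln(z) for an (n x k) matrix z."""
    n, k = z.shape
    columns = []
    # all exponent combinations with total degree <= order (includes the constant term)
    for exps in product(range(order + 1), repeat=k):
        if sum(exps) <= order:
            columns.append(np.prod(z ** np.array(exps), axis=1))
    return np.column_stack(columns)

def first_stage(y2, z, order=3):
    """Series (OLS) estimate of pi_o(z) = E(y2 | z) and the residuals v2_hat."""
    P = polynomial_basis(z, order)
    lam, *_ = np.linalg.lstsq(P, y2, rcond=None)   # least squares coefficients lambda_hat
    pi_hat = P @ lam                               # fitted conditional mean pi_hat(z_i)
    return pi_hat, y2 - pi_hat                     # residuals v2_hat for the second step

# Usage with placeholder simulated data.
rng = np.random.default_rng(1)
z = rng.normal(size=(500, 2))
y2 = 1.0 + z[:, 0] - 0.5 * z[:, 1] ** 2 + rng.normal(size=500)
pi_hat, v2_hat = first_stage(y2, z, order=3)
```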

Then, in the second step, one maximizes the following likelihood with respect to β, γ, and δ to obtain estimates of the parameters:

L(y_{1i}, x_i, z_i; \theta, \hat{\pi}) = \frac{1}{n}\sum_{i=1}^n \left[y_{1i}\log\Phi\!\left(\frac{x_i\beta + h(\hat{v}_{2i}, z_i; \gamma)}{\exp(g(\hat{v}_{2i}, z_i; \delta))}\right) + (1 - y_{1i})\log\!\left(1 - \Phi\!\left(\frac{x_i\beta + h(\hat{v}_{2i}, z_i; \gamma)}{\exp(g(\hat{v}_{2i}, z_i; \delta))}\right)\right)\right]   (9)

This can be as simple as running a heteroskedastic probit, a standard preprogrammed command in many statistical packages. However, standard errors need to be adjusted to account for the variation from using the first stage estimates; a simple solution is to bootstrap the standard errors. In many instances the parameters themselves are not of much interest, but rather the average structural function and the partial effects. The next two subsections provide estimators for the ASF and APE and compare them with other estimators standard in the literature.

4.1 Average Structural Function

Consider a general, possibly nonlinear, model y = m(x, u). Then, from BP, the ASF for y is E_u[m(x^o, u)], evaluated at the non-random vector x^o and averaged over the unobserved heterogeneity u. The ASF for the model considered here can be derived using the law of iterated expectations:

ASF(x^o) = E\left[\Phi\!\left(\frac{x^o\beta_o + h(v_{2i}, z_i; \gamma_o)}{\exp(g(v_{2i}, z_i; \delta_o))}\right)\right]   (10)

which can be estimated by approximating the expectation (with respect to v_{2i} and z_i) with a sample average and plugging in the parameter estimates for β_o, γ_o, and δ_o and the first stage residuals for v_{2i}.

Now I would like to compare this to the ASFs derived in BP and Rothe. Recall that they assume CI, so that equation (2) holds. By doing so, they need no distributional assumptions, whereas this paper imposes a normality assumption. Following the same set-up as in equation (1), let G(·; v_{2i}) be the unknown CDF of −u_{1i} given v_{2i}. Because of the CI assumption, u_{1i} is independent of z_i conditional on v_{2i}; therefore G(·; v_{2i}) is also the conditional CDF of −u_{1i} given v_{2i}, z_i. Fixing x_i = x^o, averaging out the unobserved heterogeneity u_{1i}, and using the law of iterated expectations,

ASF(x^o) = E\big(G(x^o\beta; v_{2i})\big)   (11)

where the expectation is with respect to v_{2i}. Comparing equations (10) and (11) highlights the differences in the two approaches.
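To illustrate the second step, the sketch below maximizes the likelihood in equation (9) (by minimizing its negative) for one hypothetical linear-in-parameters specification of h and g, and then computes the sample analogue of the ASF in equation (10) at a fixed vector x^o. The function names, the specification of h and g, and the starting values are illustrative assumptions only; standard errors would still need to be bootstrapped or otherwise corrected for the first-stage estimation as noted above.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def make_index(theta, x, z, v2):
    """Standardized index (x*beta + h) / exp(g) for a hypothetical specification:
    h = gamma[0]*v2 + gamma[1]*z[:,0]*v2 and g = delta[0]*z[:,0]."""
    kx = x.shape[1]
    beta, gamma, delta = theta[:kx], theta[kx:kx + 2], theta[kx + 2:]
    h = gamma[0] * v2 + gamma[1] * z[:, 0] * v2
    g = delta[0] * z[:, 0]
    return (x @ beta + h) / np.exp(g)

def neg_loglik(theta, y1, x, z, v2):
    """Negative of the sample log-likelihood in equation (9)."""
    p = np.clip(norm.cdf(make_index(theta, x, z, v2)), 1e-10, 1 - 1e-10)
    return -np.mean(y1 * np.log(p) + (1 - y1) * np.log(1 - p))

def fit_second_stage(y1, x, z, v2_hat):
    """Heteroskedastic probit with control function terms, using first-stage residuals."""
    theta0 = np.zeros(x.shape[1] + 3)                  # crude starting values
    res = minimize(neg_loglik, theta0, args=(y1, x, z, v2_hat), method="BFGS")
    return res.x

def asf(theta_hat, x_o, z, v2_hat):
    """Sample analogue of the ASF in equation (10) at the fixed row vector x_o."""
    x_rep = np.tile(x_o, (z.shape[0], 1))              # hold x fixed at x_o
    return np.mean(norm.cdf(make_index(theta_hat, x_rep, z, v2_hat)))
```

Given residuals v2_hat from the first-stage sketch, fit_second_stage returns the estimate of θ = (β', γ', δ')' and asf evaluates the estimated ASF at any chosen x^o.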

The proposed approach requires the normality assumption, but by doing so it is able to specify more flexible functions for the conditional mean and variance of u_{1i} | v_{2i}, z_i. However, if the normality assumption fails (which may be very likely), then this method is no longer valid. A small consolation is derived by Ruud (1983): if E(x_i | x_iβ) is linear in x_iβ (which is true for x_i distributed multivariate normal), then MLE assuming normality will produce consistent estimates of a scaled β_o. Consequently, the APE estimates would still be consistent.

The BP approach relaxes the distributional assumption and, in doing so, requires that the unobserved heterogeneity u_{1i} be conditionally independent of the instruments z_i. This generally rules out z_i entering the control function and any heteroskedasticity in z_i. But BP argue that their estimation method could technically allow for heteroskedasticity in terms of the linear index x_iβ, where the CI assumption is relaxed to

u_{1i} \mid v_{2i}, z_i \sim u_{1i} \mid v_{2i}, x_i\beta   (12)

(equivalent to equation 2.2b in Rothe). This is because the function G(x_iβ_o; v_{2i}) is merely estimated as an approximation over its two arguments and therefore could represent any function of x_iβ_o and v_{2i}, not restricted to those that impose CI. It is important to note that this is still a fairly strong restriction on the conditional distribution of u_{1i} | v_{2i}, z_i.

For a moment, imagine the best case scenario in which CI does not hold but the linear index condition (equation (12)) does hold. Then the estimators in BP and Rothe (2009) should consistently estimate the function G(x_iβ_o; v_{2i}), which is the conditional distribution of −u_{1i} given v_{2i} and z_i. Now re-examine the ASF in equation (11) and what is being averaged out. Since it is only averaging out the second argument, v_{2i}, it would not be averaging out the effect of the heterogeneity due to the linear index (i.e., the parts of the conditional CDF P(−u_{1i} < c | v_{2i}, x_iβ_o) that involve the linear index, but not at the point of evaluation). Therefore, even if their estimation procedure could handle limited forms of heteroskedasticity and a flexible control function, I would argue that their estimate of the Average Structural Function no longer captures what they stress is important.[9] This will be reflected in the simulation study presented in Section 6.2.

Footnote 9: Rothe (2009) considers this case in his simulation study with Design III, where there is heteroskedasticity in the latent variable error in terms of the linear index Xβ, but he only reports results on coefficient estimates and not ASF or APE estimates.

4.2 Average Partial Effects

The APE captures the causal effect that a regressor has on the outcome variable, averaged over the distribution of the explanatory variables. At first impression it would seem appropriate, as many statistical packages do, to calculate average partial effects from the conditional mean,

APE = E\left(\frac{\partial E(y \mid x)}{\partial x}\right)   (13)

This definition is correct in simple cases such as linear regression and a probit model without endogeneity or heteroskedasticity. In more complicated nonlinear models, such as a binary response model where the distribution of the unobserved heterogeneity depends on the explanatory variables, for instance through its mean (endogeneity) or through its variance (heteroskedasticity), the above definition of the APE is incorrect. As clearly put in Lin and Wooldridge (2015), if E(Y | X) were the object of interest, then decades of published research on accounting for [endogenous explanatory variables] in econometric models would be irrelevant. They go on to argue that an APE derived from the partial effect on the ASF is intuitive, can be derived via counterfactual reasoning, and has the desirable property that it corresponds to the parameter of interest in linear models with endogeneity. As a result, this paper derives the APE from the ASF.

First consider the partial derivative of the average structural function with respect to the jth element of x^o, assuming regularity conditions that allow passing the derivative through the integral:

\frac{\partial}{\partial x^o_j} E\left[\Phi\!\left(\frac{x^o\beta_o + h(v_{2i}, z_i; \gamma_o)}{\exp(g(v_{2i}, z_i; \delta_o))}\right)\right]
= E\left[\frac{\partial}{\partial x^o_j}\Phi\!\left(\frac{x^o\beta_o + h(v_{2i}, z_i; \gamma_o)}{\exp(g(v_{2i}, z_i; \delta_o))}\right)\right]
= E\left[\phi\!\left(\frac{x^o\beta_o + h(v_{2i}, z_i; \gamma_o)}{\exp(g(v_{2i}, z_i; \delta_o))}\right)\frac{\beta_{j,o}}{\exp(g(v_{2i}, z_i; \delta_o))}\right]

Then the APE averages this partial derivative over the distribution of x. Using sample averages and consistent estimates of the parameters, the APEs can be estimated with

\widehat{APE}_j = \frac{1}{n}\sum_{k=1}^n \frac{1}{n}\sum_{i=1}^n \phi\!\left(\frac{x_k\hat{\beta} + h(\hat{v}_{2i}, z_i; \hat{\gamma})}{\exp(g(\hat{v}_{2i}, z_i; \hat{\delta}))}\right)\frac{\hat{\beta}_j}{\exp(g(\hat{v}_{2i}, z_i; \hat{\delta}))}   (14)

Notice that the inner sum is taken with respect to the explanatory variables in the control function and heteroskedasticity, and the outer sum averages with respect to the explanatory variables in the structural function.
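The double average in equation (14) can be computed directly. The sketch below is a minimal implementation under the same hypothetical specification of h and g used in the earlier sketches, where j indexes the element of x whose partial effect is of interest; it is an illustration of the formula, not the paper's own code.

```python
import numpy as np
from scipy.stats import norm

def ape_j(theta_hat, x, z, v2_hat, j):
    """Estimated APE for the j-th element of x, as in equation (14).

    Hypothetical specification (same as the earlier sketches):
        h = gamma[0]*v2 + gamma[1]*z[:,0]*v2,   g = delta[0]*z[:,0].
    """
    kx = x.shape[1]
    beta, gamma, delta = theta_hat[:kx], theta_hat[kx:kx + 2], theta_hat[kx + 2:]
    h = gamma[0] * v2_hat + gamma[1] * z[:, 0] * v2_hat   # control function terms h(v2_i, z_i)
    scale = np.exp(delta[0] * z[:, 0])                    # exp(g(v2_i, z_i))
    total = 0.0
    for k in range(x.shape[0]):                           # outer average over x_k
        index = (x[k] @ beta + h) / scale                 # inner average over (v2_i, z_i)
        total += np.mean(norm.pdf(index) * beta[j] / scale)
    return total / x.shape[0]
```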

5 Asymptotics

This section presents the standard asymptotic results for the proposed estimator. Useful equations and notation are defined in Appendix A. Proofs of theorems and lemmas that are not direct restatements of previous results in the literature are given in Appendix C. In general, since this is a two-step semi-parametric estimator with a conditional mean non-parametrically estimated in the first step, consistency and √n-asymptotic normality follow from Newey (1994). This does require that the score of the log-likelihood be continuous in the second stage parameters over the entire parameter space. Although it is beyond the scope of this paper, this assumption can be relaxed following Theorem 1 of Chen et al. (2003). Moreover, it is assumed that the first stage is estimated using a polynomial sieve. This could be extended to other sieve spaces or other non-parametric estimation techniques as long as their convergence rates are known.

5.1 Consistency

For now, we will assume consistency of the first stage estimates. Later, when deriving asymptotic normality, we will derive the convergence rate of the first stage estimates, so deriving consistency separately would be redundant. Consistency of the second stage estimates follows directly from Lemma 5.2 of Newey (1994). Let θ = (β', γ', δ')' have support Θ and W_i = (y_{1i}, y_{2i}, z_i) have support W, where the maximum likelihood estimation can be formatted to fit a generalized method of moments framework. Recall the sample log-likelihood function in equation (9), where the first stage parameter π enters the log-likelihood through v_{2i}, and let S_i(θ, π) denote the score of the likelihood (explicitly written in Appendix A). Note that the MLE parameter estimates also solve the sample minimization problem

\min_{\theta \in \Theta}\ \Big\|\frac{1}{n}\sum_{i=1}^n S_i(\theta, \hat{\pi})\Big\|^2   (15)

where the norm ‖·‖ is defined as ‖A‖ = (tr(A'A))^{1/2} for any matrix A. This follows the framework of Newey (1994), where the weighting matrix is the identity matrix. Moreover, let ‖·‖_d denote the Sobolev norm, defined as

\|f(x)\|_d = \sup_{|\lambda| \le d}\ \sup_{x \in \mathcal{X}} \big\|\partial^{\lambda} f(x)\big\|

The following two assumptions correspond to Assumptions 5.4 and 5.5 of Newey (1994) and are needed to apply Lemma 5.2.

Assumption 5.1 (5.4 of Newey (1994)). There are ε, b(W_i), b̃(W_i) > 0 such that (i) for all θ ∈ Θ, S_i(θ, π_o) is continuous in θ with probability 1 and ‖S_i(θ, π_o)‖ ≤ b(W_i); (ii) ‖S_i(θ, π) − S_i(θ, π_o)‖ ≤ b̃(W_i)(‖π − π_o‖_0)^ε.

This assumption provides sufficient conditions for uniform convergence, that is, sup_{θ∈Θ} ‖(1/n)Σ_{i=1}^n S_i(θ, π̂) − E(S_i(θ, π_o))‖ →_p 0. Part (i) should be easily verifiable as long as h(v_{2i}, z_i; γ) and g(v_{2i}, z_i; δ) are continuous. Note that part (ii) is a smoothness condition on S_i(θ, π) in π; to derive any lower level assumptions, one would need to know more specifically how π (or equivalently v_{2i}) enters the control function and the heteroskedastic function. The following is a standard assumption for identification.

Assumption 5.2 (5.5 of Newey (1994)). (i) E(S_i(θ, π_o)) = 0 has a unique solution on Θ at θ_o; (ii) Θ is compact.

Section 3 discussed sufficient (lower level) assumptions for part (i) (global identification). The following theorem gives consistency of the second stage estimators.

Theorem 5.1 (Consistency of θ̂). Suppose that {W_i}_{i=1}^n are i.i.d., ‖π̂ − π_o‖_0 = o_p(1), and that Assumptions 5.1 and 5.2 hold. Then ‖θ̂ − θ_o‖ = o_p(1).

The proof is a direct application of Lemma 5.2 of Newey (1994); Assumptions 5.1 and 5.2 correspond to Assumptions 5.4 and 5.5 of Newey (1994).

5.2 √n-Asymptotic Normality

This section derives the asymptotic normality of the second stage estimates using Lemma 5.3 and Theorem 6.1 of Newey (1994). Lemma 5.3 is the general √n-asymptotic normality result for two-step semi-parametric estimators, while Theorem 6.1 applies Lemma 5.3 to the case where the first stage is a least squares projection using a series estimation method. Since the estimator is based on a general model where the functions h(v_{2i}, z_i; γ) and g(v_{2i}, z_i; δ) are not specified, some of the assumptions are instead conditions that should be verified given particular specifications of the functions.

Denote the derivative of the score with respect to θ as ∇_θ S_i(θ, π) and define Γ ≡ E[∇_θ S_i(θ_o, π_o)] (both explicitly defined in Appendix A). The following set of assumptions are (or are sufficient for) Assumptions 5.6 and 6.1-6.7 of Newey (1994), which are needed to apply Theorem 6.1 of Newey (1994) to obtain √n-asymptotic normality. The first corresponds to Assumption 5.6 of Newey (1994).

Assumption 5.3 (5.6 of Newey (1994)). (i) θ_o ∈ int(Θ); (ii) there is an ε > 0, a d ≥ 0, and a neighborhood N of θ_o such that for all ‖π − π_o‖_d < ε, S_i(θ, π) is differentiable in θ on N; (iii) Γ is non-singular; (iv) E[‖S_i(θ_o, π_o)‖²] < ∞; (v) Assumption 5.1 is satisfied with S_i(θ, π) equal to a column of ∇_θ S_i(θ, π).

This assumption is sufficient for uniform convergence of the Jacobian terms. Generally part (i) is assumed, while parts (ii), (iv), and (v) can be verified given specifications of h(v_{2i}, z_i; γ) and g(v_{2i}, z_i; δ). Recall from the discussion on identification that part (iii) of this assumption is needed for the second stage parameters to be locally identified.

The next set of assumptions is used to derive the convergence rates of the first stage estimator under the Sobolev norm.

Assumption 5.4 (6.1 and 6.2(i) of Newey (1994)). (i) E[(y_{2i} − π_o(z_i))² | z_i] < ∞; (ii) the smallest eigenvalue of E[P_{L_n}(z_i)'P_{L_n}(z_i)] is bounded away from 0.

The first part is standard in the literature and difficult to relax without affecting the convergence rates. The second part holds when there is no perfect multicollinearity in the sequence of polynomial basis functions of z_i.

The next assumption provides the remaining necessary conditions for deriving the convergence rates for power series.

Assumption 5.5 (Assumptions 8 and 9 of Newey (1997)). (i) The support of z_i is a Cartesian product of compact connected intervals on which z_i has a probability density function that is bounded away from zero; (ii) π_o(z_i) = E(y_{2i} | z_i) is continuously differentiable of order s on the support of z_i.

The following is a restatement of Theorem 4 of Newey (1997) and provides the convergence rate results.

Lemma 5.1 (Theorem 4 of Newey (1997)). Suppose {(y_{2i}, z_i)}_{i=1}^n is i.i.d. If Assumptions 5.4 and 5.5 are satisfied and L_n^3/n → 0, then

\|\hat{\pi} - \pi_o\|_0 = O_p\Big(L_n\big[\sqrt{L_n/n} + L_n^{-s/(K_1+K_2)}\big]\Big)   (16)

where K_1 + K_2 is the dimension of z_i.

The condition L_n^3/n → 0 is needed to place limits on the growth of the number of series terms. The remaining assumptions needed to show asymptotic normality place conditions on the linearization D(W_i, π; θ, π̄). This function is linear in π, so that one could write D(W_i, π; θ, π̄) = D(W_i; θ, π̄)π, and in this context it is the path-wise derivative of the score with respect to the function π(·) (explicitly written in Appendix A). The next assumption states conditions that the linearization D(W_i, π; θ, π̄) needs to satisfy in order to be considered a good approximation of S_i(θ_o, π̂) − S_i(θ_o, π_o), primitive conditions for stochastic equicontinuity of the linear function D(W_i, π; θ, π̄), and lower level conditions for mean square continuity that allow the first stage estimator to be √n-consistent.

Assumption 5.6 (Newey (1994)). (i) There are ε, b(W_i) > 0 such that for all ‖θ − θ_o‖ < ε and ‖π − π_o‖_0 < ε, ‖S_i(θ, π) − S_i(θ, π_o) − D(W_i, π − π_o; θ, π_o)‖ < b(W_i)(‖π − π_o‖_0)², with E[b(W_i)] < ∞; (ii) there is a b̃(W_i) > 0 such that E[b̃(W_i)²] < ∞ and ‖D(W_i, π; θ_o, π_o)‖ ≤ b̃(W_i)‖π‖_0; (iii) let d(z_i) be defined as

d(z_i) = E\left[E\left[l_{xx}\!\left(y_{1i}, \frac{x_i\beta_o + h(\gamma_o)}{\exp(g(\delta_o))}\right)\,\Big|\, y_{2i}, z_i\right]\frac{h_v(\gamma_o) + g_v(\delta_o)(x_i\beta_o + h(\gamma_o))}{\exp(2g(\delta_o))}\Big(x_i,\ h_\gamma(\gamma_o),\ (x_i\beta_o + h(\gamma_o))\,g_\delta(\delta_o)\Big)\,\Big|\, z_i\right]

where E[l_{xx}(y_{1i}, (x_iβ_o + h(γ_o))/exp(g(δ_o))) | y_{2i}, z_i] is defined in equation (27), and let d(z_i) be continuously differentiable of order s̄ on the support of z_i such that

\sqrt{n}\,L_n^{-(s+\bar{s})/(K_1+K_2)} \to 0 \quad \text{and} \quad L_n^{-2\bar{s}/(K_1+K_2)} \to 0   (17)

This assumption should be verifiable given functional forms for h(v_{2i}, z_i; γ) and g(v_{2i}, z_i; δ). Part (i) corresponds to 6.4(i) of Newey (1994), and showing 6.4(ii) requires

\sup_{z \in \mathcal{Z}} \|P_{L_n}(z)\|\,\big[(L_n/n)^{1/2} + L_n^{-s/(K_1+K_2)}\big] \to 0   (18)

\sqrt{n}\,\big[\sup_{z \in \mathcal{Z}} \|P_{L_n}(z)\|\big]^2\,\big[L_n/n + L_n^{-2s/(K_1+K_2)}\big] \to 0   (19)

Part (ii) corresponds to the first part of 6.5 of Newey (1994), whereas the second part requires

L_n^{-2s/(K_1+K_2)} \to 0   (20)

\Big(\sum_{l=1}^{L_n} \|p_{lL_n}(z_i)\|_0^2\Big)^{1/2}\,\big[(L_n/n)^{1/2} + L_n^{-s/(K_1+K_2)}\big] \to 0   (21)

Equations (18)-(21) will hold under the rate and smoothness conditions specified in Theorem 5.2. Part (iii) is sufficient for 6.6(ii) of Newey (1994), while 6.6(i) assumes the existence of the correction term α(W_i). Since the first stage is estimating a conditional mean, by Proposition 4 of Newey (1994) the correction term will be of the form α(W_i) = d(z_i)[y_{2i} − π_o(z_i)], where d(z_i) satisfies

E[D(W_i, \pi; \theta_o, \pi_o)] = E[d(z_i)\pi(z_i)]   (22)

The existence of such a function d(z_i) follows from the Riesz Representation Theorem. The proposed d(z_i) from Assumption 5.6 is shown to satisfy equation (22) in Appendix A. Last, we define the asymptotic variance,

V \equiv \mathrm{Avar}\big(\sqrt{n}(\hat{\theta} - \theta_o)\big) = \Gamma^{-1} + \Gamma^{-1}E\big[d(z_i)'[y_{2i} - \pi_o(z_i)]^2 d(z_i)\big]\Gamma^{-1}   (23)

which follows from equation (5.2) of Newey (1994). Finally, the general √n-asymptotic normality result is stated below.

Theorem 5.2 (Asymptotic Normality of θ̂). Suppose that θ_o ∈ Θ satisfies E[S_i(θ_o, π_o)] = 0 (or that the specification in Assumption 2.1 holds) and {W_i}_{i=1}^n is i.i.d. Then, under Assumptions 5.1-5.6 and the following growth rate and smoothness conditions

(i) L_n^6/n → 0,
(ii) s/(K_1 + K_2) > 5/2,

and for V defined in equation (23),

\sqrt{n}(\hat{\theta} - \theta_o) \to_d N(0, V)

Conditions (i) and (ii) are the growth rate conditions on L_n under which equations (18)-(19) and (20)-(21) can be shown to hold. First, since the estimation procedure uses a polynomial sieve, by Lemma A.15 of Newey (1995), sup_{z∈Z} ‖P_{L_n}(z)‖ < C L_n for some constant C and ‖p_{lL_n}(z_i)‖_0 < L_n^{1/2}. Then equations (18)-(21) become

(L_n^3/n)^{1/2} + L_n^{1 - s/(K_1+K_2)} \to 0   (18')

\sqrt{n}\,\big[L_n^3/n + L_n^{2 - 2s/(K_1+K_2)}\big] \to 0   (19')

L_n^{-2s/(K_1+K_2)} \to 0   (20')

(L_n^3/n)^{1/2} + L_n^{1 - s/(K_1+K_2)} \to 0   (21')

which easily hold given the rate and smoothness conditions (i) and (ii) of Theorem 5.2. An estimator for the asymptotic variance is provided in Appendix B, along with the regularity conditions needed for the estimator to be consistent. Because of the numerical equivalence results of Newey (1994), Chen (2007), and Ackerberg et al. (2012), the consistent estimator for the asymptotic variance is equivalent to the estimator that would be used if the first stage parameters were finite dimensional. As a result, implementing the calculations for correct standard errors is fairly simple.

6 Simulation

I will consider two different simulation designs in order to highlight the impact of the CI assumption. The first design specifies the conditional distribution of u_{1i} to be normal, but CI does not hold in the conditional mean (control function) or conditional variance (heteroskedasticity). The second design examines in more detail the ability of the BP and Rothe estimators to handle relaxing the CI assumption in terms of the linear

index x_iβ_o. In both designs, I will compare the proposed estimator to a sieve variation on the BP and Rothe estimators, as well as standard parametric approaches: probit, probit with a control variable, and linear probability models such as OLS and 2SLS.

6.1 Conditional Independence does not hold

To give this simulation some context, consider the BP application to the labor force participation of British men (without college education) during 1985 to 1990. Consider the latent variable setting as described in equation (1), where x_i = (z_{1i}, y_{2i}). In this application, y_{1i} is labor force participation, z_{1i} is the education level of the husband (and possibly other factors that can influence the market wage level), and y_{2i} is the log of other (non-wage) income. The usual argument for endogeneity is that other non-wage income is contemporaneously determined with labor force participation. They consider two instruments: the first, z_{21i}, is the potential welfare benefit entitlement of the family if neither spouse was working,[10] and the second, z_{22i}, is the wife's education level. The education variables, z_{1i} and z_{22i}, are indicators for whether or not the individual stayed in school past the minimum school leaving age of 16. As usual, the argument is that potential welfare benefit entitlement and the wife's education level should have no direct impact on the husband's labor force participation except through non-wage income.

Why should CI not hold in this set-up? Consider the scenario where individual i has a particularly large shock to his outside income (inheritance, lottery, etc.). I would argue that the probability that he is in the labor force depends on his education level. For instance, someone with higher education is more likely pursuing a passion, and therefore a positive shock to income would likely not dissuade him from working. In addition, one would expect more variability among lower educated individuals than higher educated ones. As a result, I construct the conditional distribution of u_{1i} conditional on z_i and v_{2i} as

u_{1i} \mid z_i, v_{2i} \sim N\big(v_{2i}\gamma_1 + (v_{2i}^2 - \sigma_v^2)\gamma_2 + z_{1i}v_{2i}\gamma_3,\ \exp(2(z_{1i}\delta_1 + z_{22i}\delta_2))\big)   (24)

so that CI does not hold. Furthermore, the conditional distribution cannot be written as a function of v_{2i} and the linear index x_iβ alone; therefore we would expect the BP and Rothe estimator to fail in both the ASF and APE estimates as well as the coefficient estimates.

Footnote 10: In the application of BP, this variable is constructed from local welfare benefit rules, the demographic structure of the family, the geographic location, and housing costs.
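As a rough sketch of how a sample could be drawn under design (24), the code below generates one simulated data set; the instrument distributions, the reduced form for y_{2i}, and all parameter values are placeholders chosen for illustration (the actual construction and calibration, which match the BP summary statistics, are described in Appendix D).

```python
import numpy as np

def simulate_design_1(n, rng, gamma=(0.5, 0.2, -0.3), delta=(-0.4, 0.3)):
    """Draw one sample in which CI fails as in equation (24); all values are placeholders."""
    z1 = (rng.random(n) < 0.4).astype(float)           # husband education indicator (placeholder)
    z21 = rng.normal(size=n)                            # ln(benefit income) instrument (placeholder)
    z22 = (rng.random(n) < 0.4).astype(float)           # wife education indicator (placeholder)
    sigma_v = 1.0
    v2 = rng.normal(scale=sigma_v, size=n)
    y2 = 0.5 + 0.3 * z1 + 0.4 * z21 + 0.2 * z22 + v2    # reduced form for ln(other income)
    # u1 | z, v2 as in equation (24): CI fails through both the mean and the variance
    mean_u = gamma[0] * v2 + gamma[1] * (v2**2 - sigma_v**2) + gamma[2] * z1 * v2
    sd_u = np.exp(delta[0] * z1 + delta[1] * z22)
    u1 = mean_u + sd_u * rng.normal(size=n)
    x = np.column_stack([np.ones(n), z1, y2])           # x_i = (1, z1_i, y2_i)
    beta = np.array([0.5, 0.3, -0.5])                   # placeholder structural coefficients
    y1 = (x @ beta + u1 >= 0).astype(float)             # binary outcome from the latent index
    return y1, y2, z1, z21, z22

rng = np.random.default_rng(123)
y1, y2, z1, z21, z22 = simulate_design_1(1606, rng)    # sample size matching the application
```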

Table 1: Comparison of Summary Statistics. Means and standard deviations of Work (y_1), Education > 16 (z_1), ln(other income) (y_2), ln(benefit income) (z_{21}), and Education (spouse) (z_{22}) for the Blundell and Powell data and for the simulated data; based on 1,000 simulations with a sample size of 1,606.

The construction of the random variables z_{1i}, z_{21i}, z_{22i}, and y_{2i} is discussed in Appendix D. Generating 1,000 simulations with a sample size of 1,606, Table 1 presents the summary statistics from BP as well as the summary statistics from the simulated samples. The parameter values in the construction of the random variables z_{1i}, z_{21i}, z_{22i}, and y_{2i} are calibrated so that the summary statistics of the simulated data match those from BP. Table 2 presents the results for parametric specifications (corresponding to Table 4.2 of BP).

Table 2: Comparison of Parametric Estimators. Reduced Form, Probit, and Probit (CF) estimates, in columns (1)-(3) for the Blundell and Powell data and columns (4)-(6) for the simulated data, for Education (z_1), ln(other income) (y_2), ln(benefit income) (z_{21}), and Education (spouse) (z_{22}), along with the R². Standard errors (for Blundell and Powell) and standard deviations (for the simulated data) are given in parentheses. Omitted are the estimates of the intercept and the coefficient on the control function (v̂_{2i}) in columns (3) and (6).

The simulated data is not an exact replica of the BP data, but it still delivers the main takeaway: the impact of adjusting for endogeneity is quite dramatic. This is reflected in the BP data in the comparison of the coefficient estimates in columns (2) vs (3). In the simulated data we see a similar effect when comparing columns (5) and (6).
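For the parametric comparisons in Table 2, a minimal sketch of the probit and probit-with-control-function columns, using statsmodels and the placeholder sample from the DGP sketch above, might look as follows (a linear reduced form is used here purely for brevity).

```python
import numpy as np
import statsmodels.api as sm

# Probit and Probit (CF) on one simulated sample; illustrative only.
# Uses y1, y2, z1, z21, z22 from the DGP sketch above.
Z = sm.add_constant(np.column_stack([z1, z21, z22]))
v2_hat = y2 - Z @ sm.OLS(y2, Z).fit().params           # linear reduced form residuals

X = sm.add_constant(np.column_stack([z1, y2]))
probit = sm.Probit(y1, X).fit(disp=0)                   # ignores endogeneity
probit_cf = sm.Probit(y1, np.column_stack([X, v2_hat])).fit(disp=0)   # adds the control function
print(probit.params, probit_cf.params)
```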

Table 3: APE Results and Simulated Distribution (True APE = …). Reported for each specification: the mean and standard deviation of the APE estimates over the simulations and the 10%, 25%, 50%, 75%, and 90% quantiles of their empirical distribution. Specifications: Het-2SCML, Het-2SCML (true first stage), BP and Rothe (Sieve), Probit, Probit (CF), Lin. Prob. (OLS), and Lin. Prob. (2SLS). Het-2SCML is the proposed estimator; Het-2SCML (true first stage) is the proposed estimator using the true values of the control variable (v_{2i}) instead of the reduced form residuals; and BP and Rothe is the sieve analogue of the BP and Rothe estimators, estimated using polynomials of (x_iβ, v̂_{2i}) up to order 3.

Table 3 provides the APE estimates using different estimation specifications on the simulated data. The first column provides the mean of the APE estimates over the simulations, the second column is the standard deviation, and the last five columns are the quantiles of the empirical distribution of the simulated APE estimates. The first step (when applicable) uses a polynomial sieve up to order three. The first three rows are semi-parametric estimators, the next two are parametric probit models, and the last two are linear probability models.

The most prominent result is that addressing endogeneity in any of the specifications makes the largest impact (compare probit vs probit (CF) and lin. prob. (OLS) vs lin. prob. (2SLS)). In addition, the proposed Het-2SCML performs better than all the other specifications. This is unsurprising, since CI does not hold, so the BP and Rothe estimator should not perform well. To give the difference some context: the Het-2SCML would estimate that for every 10% increase in other income, the probability of being employed decreases by … while BP and Rothe would estimate a decrease of …. There is also a much larger spread in the simulated distribution of the BP and Rothe estimated APE, in which the 90th quantile is positive. As a result, the BP and Rothe estimated APE would not be able to reject the null that the APE is 0 under standard confidence levels. In addition, we see that the probit estimator with the control function performs quite well even though there is heteroskedasticity and a more complex control function than what was included. It is concerning that even though the proposed estimator performed the best, there seems to be an upward


More information

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Lecture 11/12. Roy Model, MTE, Structural Estimation

Lecture 11/12. Roy Model, MTE, Structural Estimation Lecture 11/12. Roy Model, MTE, Structural Estimation Economics 2123 George Washington University Instructor: Prof. Ben Williams Roy model The Roy model is a model of comparative advantage: Potential earnings

More information

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE Chapter 6. Panel Data Joan Llull Quantitative Statistical Methods II Barcelona GSE Introduction Chapter 6. Panel Data 2 Panel data The term panel data refers to data sets with repeated observations over

More information

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Arthur Lewbel Boston College December 2016 Abstract Lewbel (2012) provides an estimator

More information

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Arthur Lewbel Boston College Original December 2016, revised July 2017 Abstract Lewbel (2012)

More information

CENSORED DATA AND CENSORED NORMAL REGRESSION

CENSORED DATA AND CENSORED NORMAL REGRESSION CENSORED DATA AND CENSORED NORMAL REGRESSION Data censoring comes in many forms: binary censoring, interval censoring, and top coding are the most common. They all start with an underlying linear model

More information

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 1/28 Why IV estimation? So far, in OLS, we assumed independence.

More information

Estimating Panel Data Models in the Presence of Endogeneity and Selection

Estimating Panel Data Models in the Presence of Endogeneity and Selection ================ Estimating Panel Data Models in the Presence of Endogeneity and Selection Anastasia Semykina Department of Economics Florida State University Tallahassee, FL 32306-2180 asemykina@fsu.edu

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

Applied Economics. Regression with a Binary Dependent Variable. Department of Economics Universidad Carlos III de Madrid

Applied Economics. Regression with a Binary Dependent Variable. Department of Economics Universidad Carlos III de Madrid Applied Economics Regression with a Binary Dependent Variable Department of Economics Universidad Carlos III de Madrid See Stock and Watson (chapter 11) 1 / 28 Binary Dependent Variables: What is Different?

More information

IV estimators and forbidden regressions

IV estimators and forbidden regressions Economics 8379 Spring 2016 Ben Williams IV estimators and forbidden regressions Preliminary results Consider the triangular model with first stage given by x i2 = γ 1X i1 + γ 2 Z i + ν i and second stage

More information

Estimation of Dynamic Panel Data Models with Sample Selection

Estimation of Dynamic Panel Data Models with Sample Selection === Estimation of Dynamic Panel Data Models with Sample Selection Anastasia Semykina* Department of Economics Florida State University Tallahassee, FL 32306-2180 asemykina@fsu.edu Jeffrey M. Wooldridge

More information

Missing dependent variables in panel data models

Missing dependent variables in panel data models Missing dependent variables in panel data models Jason Abrevaya Abstract This paper considers estimation of a fixed-effects model in which the dependent variable may be missing. For cross-sectional units

More information

Short Questions (Do two out of three) 15 points each

Short Questions (Do two out of three) 15 points each Econometrics Short Questions Do two out of three) 5 points each ) Let y = Xβ + u and Z be a set of instruments for X When we estimate β with OLS we project y onto the space spanned by X along a path orthogonal

More information

Comments on: Panel Data Analysis Advantages and Challenges. Manuel Arellano CEMFI, Madrid November 2006

Comments on: Panel Data Analysis Advantages and Challenges. Manuel Arellano CEMFI, Madrid November 2006 Comments on: Panel Data Analysis Advantages and Challenges Manuel Arellano CEMFI, Madrid November 2006 This paper provides an impressive, yet compact and easily accessible review of the econometric literature

More information

Linear Regression. Junhui Qian. October 27, 2014

Linear Regression. Junhui Qian. October 27, 2014 Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics T H I R D E D I T I O N Global Edition James H. Stock Harvard University Mark W. Watson Princeton University Boston Columbus Indianapolis New York San Francisco Upper Saddle

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information

The Influence Function of Semiparametric Estimators

The Influence Function of Semiparametric Estimators The Influence Function of Semiparametric Estimators Hidehiko Ichimura University of Tokyo Whitney K. Newey MIT July 2015 Revised January 2017 Abstract There are many economic parameters that depend on

More information

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 Instructions: Answer all four (4) questions. Point totals for each question are given in parenthesis; there are 00 points possible. Within

More information

Lecture 8. Roy Model, IV with essential heterogeneity, MTE

Lecture 8. Roy Model, IV with essential heterogeneity, MTE Lecture 8. Roy Model, IV with essential heterogeneity, MTE Economics 2123 George Washington University Instructor: Prof. Ben Williams Heterogeneity When we talk about heterogeneity, usually we mean heterogeneity

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models University of Illinois Fall 2016 Department of Economics Roger Koenker Economics 536 Lecture 7 Introduction to Specification Testing in Dynamic Econometric Models In this lecture I want to briefly describe

More information

ECON Introductory Econometrics. Lecture 16: Instrumental variables

ECON Introductory Econometrics. Lecture 16: Instrumental variables ECON4150 - Introductory Econometrics Lecture 16: Instrumental variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 12 Lecture outline 2 OLS assumptions and when they are violated Instrumental

More information

Locally Robust Semiparametric Estimation

Locally Robust Semiparametric Estimation Locally Robust Semiparametric Estimation Victor Chernozhukov Juan Carlos Escanciano Hidehiko Ichimura Whitney K. Newey The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper

More information

Nonlinear Regression Functions

Nonlinear Regression Functions Nonlinear Regression Functions (SW Chapter 8) Outline 1. Nonlinear regression functions general comments 2. Nonlinear functions of one variable 3. Nonlinear functions of two variables: interactions 4.

More information

Final Exam. Economics 835: Econometrics. Fall 2010

Final Exam. Economics 835: Econometrics. Fall 2010 Final Exam Economics 835: Econometrics Fall 2010 Please answer the question I ask - no more and no less - and remember that the correct answer is often short and simple. 1 Some short questions a) For each

More information

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2015-16 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

More information

Jeffrey M. Wooldridge Michigan State University

Jeffrey M. Wooldridge Michigan State University Fractional Response Models with Endogenous Explanatory Variables and Heterogeneity Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. Fractional Probit with Heteroskedasticity 3. Fractional

More information

Generated Covariates in Nonparametric Estimation: A Short Review.

Generated Covariates in Nonparametric Estimation: A Short Review. Generated Covariates in Nonparametric Estimation: A Short Review. Enno Mammen, Christoph Rothe, and Melanie Schienle Abstract In many applications, covariates are not observed but have to be estimated

More information

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) 1 2 Model Consider a system of two regressions y 1 = β 1 y 2 + u 1 (1) y 2 = β 2 y 1 + u 2 (2) This is a simultaneous equation model

More information

Working Paper No Maximum score type estimators

Working Paper No Maximum score type estimators Warsaw School of Economics Institute of Econometrics Department of Applied Econometrics Department of Applied Econometrics Working Papers Warsaw School of Economics Al. iepodleglosci 64 02-554 Warszawa,

More information

8. Instrumental variables regression

8. Instrumental variables regression 8. Instrumental variables regression Recall: In Section 5 we analyzed five sources of estimation bias arising because the regressor is correlated with the error term Violation of the first OLS assumption

More information

Greene, Econometric Analysis (6th ed, 2008)

Greene, Econometric Analysis (6th ed, 2008) EC771: Econometrics, Spring 2010 Greene, Econometric Analysis (6th ed, 2008) Chapter 17: Maximum Likelihood Estimation The preferred estimator in a wide variety of econometric settings is that derived

More information

13 Endogeneity and Nonparametric IV

13 Endogeneity and Nonparametric IV 13 Endogeneity and Nonparametric IV 13.1 Nonparametric Endogeneity A nonparametric IV equation is Y i = g (X i ) + e i (1) E (e i j i ) = 0 In this model, some elements of X i are potentially endogenous,

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2016 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2016 Instructor: Victor Aguirregabiria ECOOMETRICS II (ECO 24S) University of Toronto. Department of Economics. Winter 26 Instructor: Victor Aguirregabiria FIAL EAM. Thursday, April 4, 26. From 9:am-2:pm (3 hours) ISTRUCTIOS: - This is a closed-book

More information

UNIVERSITY OF CALIFORNIA Spring Economics 241A Econometrics

UNIVERSITY OF CALIFORNIA Spring Economics 241A Econometrics DEPARTMENT OF ECONOMICS R. Smith, J. Powell UNIVERSITY OF CALIFORNIA Spring 2006 Economics 241A Econometrics This course will cover nonlinear statistical models for the analysis of cross-sectional and

More information

AGEC 661 Note Fourteen

AGEC 661 Note Fourteen AGEC 661 Note Fourteen Ximing Wu 1 Selection bias 1.1 Heckman s two-step model Consider the model in Heckman (1979) Y i = X iβ + ε i, D i = I {Z iγ + η i > 0}. For a random sample from the population,

More information

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,

More information

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35 Rewrap ECON 4135 November 18, 2011 () Rewrap ECON 4135 November 18, 2011 1 / 35 What should you now know? 1 What is econometrics? 2 Fundamental regression analysis 1 Bivariate regression 2 Multivariate

More information

Binary Models with Endogenous Explanatory Variables

Binary Models with Endogenous Explanatory Variables Binary Models with Endogenous Explanatory Variables Class otes Manuel Arellano ovember 7, 2007 Revised: January 21, 2008 1 Introduction In Part I we considered linear and non-linear models with additive

More information

Chapter 11. Regression with a Binary Dependent Variable

Chapter 11. Regression with a Binary Dependent Variable Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score

More information

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Michele Aquaro University of Warwick This version: July 21, 2016 1 / 31 Reading material Textbook: Introductory

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor guirregabiria SOLUTION TO FINL EXM Monday, pril 14, 2014. From 9:00am-12:00pm (3 hours) INSTRUCTIONS:

More information

Lecture #8 & #9 Multiple regression

Lecture #8 & #9 Multiple regression Lecture #8 & #9 Multiple regression Starting point: Y = f(x 1, X 2,, X k, u) Outcome variable of interest (movie ticket price) a function of several variables. Observables and unobservables. One or more

More information

So far our focus has been on estimation of the parameter vector β in the. y = Xβ + u

So far our focus has been on estimation of the parameter vector β in the. y = Xβ + u Interval estimation and hypothesis tests So far our focus has been on estimation of the parameter vector β in the linear model y i = β 1 x 1i + β 2 x 2i +... + β K x Ki + u i = x iβ + u i for i = 1, 2,...,

More information

Models of Qualitative Binary Response

Models of Qualitative Binary Response Models of Qualitative Binary Response Probit and Logit Models October 6, 2015 Dependent Variable as a Binary Outcome Suppose we observe an economic choice that is a binary signal. The focus on the course

More information

ECON3327: Financial Econometrics, Spring 2016

ECON3327: Financial Econometrics, Spring 2016 ECON3327: Financial Econometrics, Spring 2016 Wooldridge, Introductory Econometrics (5th ed, 2012) Chapter 11: OLS with time series data Stationary and weakly dependent time series The notion of a stationary

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

More information

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Vadim Marmer University of British Columbia Artyom Shneyerov CIRANO, CIREQ, and Concordia University August 30, 2010 Abstract

More information

Estimation of Dynamic Regression Models

Estimation of Dynamic Regression Models University of Pavia 2007 Estimation of Dynamic Regression Models Eduardo Rossi University of Pavia Factorization of the density DGP: D t (x t χ t 1, d t ; Ψ) x t represent all the variables in the economy.

More information

Thoughts on Heterogeneity in Econometric Models

Thoughts on Heterogeneity in Econometric Models Thoughts on Heterogeneity in Econometric Models Presidential Address Midwest Economics Association March 19, 2011 Jeffrey M. Wooldridge Michigan State University 1 1. Introduction Much of current econometric

More information

Review of Econometrics

Review of Econometrics Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University

More information

Identification and Estimation of Partially Linear Censored Regression Models with Unknown Heteroscedasticity

Identification and Estimation of Partially Linear Censored Regression Models with Unknown Heteroscedasticity Identification and Estimation of Partially Linear Censored Regression Models with Unknown Heteroscedasticity Zhengyu Zhang School of Economics Shanghai University of Finance and Economics zy.zhang@mail.shufe.edu.cn

More information

A Goodness-of-fit Test for Copulas

A Goodness-of-fit Test for Copulas A Goodness-of-fit Test for Copulas Artem Prokhorov August 2008 Abstract A new goodness-of-fit test for copulas is proposed. It is based on restrictions on certain elements of the information matrix and

More information

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Songnian Chen a, Xun Lu a, Xianbo Zhou b and Yahong Zhou c a Department of Economics, Hong Kong University

More information

Chapter 1. GMM: Basic Concepts

Chapter 1. GMM: Basic Concepts Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating

More information

Introduction: structural econometrics. Jean-Marc Robin

Introduction: structural econometrics. Jean-Marc Robin Introduction: structural econometrics Jean-Marc Robin Abstract 1. Descriptive vs structural models 2. Correlation is not causality a. Simultaneity b. Heterogeneity c. Selectivity Descriptive models Consider

More information

Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances

Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances Discussion Paper: 2006/07 Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances J.S. Cramer www.fee.uva.nl/ke/uva-econometrics Amsterdam School of Economics Department of

More information

A Discontinuity Test for Identification in Nonparametric Models with Endogeneity

A Discontinuity Test for Identification in Nonparametric Models with Endogeneity A Discontinuity Test for Identification in Nonparametric Models with Endogeneity Carolina Caetano 1 Christoph Rothe 2 Nese Yildiz 1 1 Department of Economics 2 Department of Economics University of Rochester

More information

We begin by thinking about population relationships.

We begin by thinking about population relationships. Conditional Expectation Function (CEF) We begin by thinking about population relationships. CEF Decomposition Theorem: Given some outcome Y i and some covariates X i there is always a decomposition where

More information

Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity

Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity Jörg Schwiebert Abstract In this paper, we derive a semiparametric estimation procedure for the sample selection model

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

ivporbit:an R package to estimate the probit model with continuous endogenous regressors

ivporbit:an R package to estimate the probit model with continuous endogenous regressors MPRA Munich Personal RePEc Archive ivporbit:an R package to estimate the probit model with continuous endogenous regressors Taha Zaghdoudi University of Jendouba, Faculty of Law Economics and Management

More information

ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS

ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS Olivier Scaillet a * This draft: July 2016. Abstract This note shows that adding monotonicity or convexity

More information

Econometrics II - EXAM Answer each question in separate sheets in three hours

Econometrics II - EXAM Answer each question in separate sheets in three hours Econometrics II - EXAM Answer each question in separate sheets in three hours. Let u and u be jointly Gaussian and independent of z in all the equations. a Investigate the identification of the following

More information

Next, we discuss econometric methods that can be used to estimate panel data models.

Next, we discuss econometric methods that can be used to estimate panel data models. 1 Motivation Next, we discuss econometric methods that can be used to estimate panel data models. Panel data is a repeated observation of the same cross section Panel data is highly desirable when it is

More information

Quantile Regression for Panel Data Models with Fixed Effects and Small T : Identification and Estimation

Quantile Regression for Panel Data Models with Fixed Effects and Small T : Identification and Estimation Quantile Regression for Panel Data Models with Fixed Effects and Small T : Identification and Estimation Maria Ponomareva University of Western Ontario May 8, 2011 Abstract This paper proposes a moments-based

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Specification testing in panel data models estimated by fixed effects with instrumental variables

Specification testing in panel data models estimated by fixed effects with instrumental variables Specification testing in panel data models estimated by fixed effects wh instrumental variables Carrie Falls Department of Economics Michigan State Universy Abstract I show that a handful of the regressions

More information