Nonseparable Unobserved Heterogeneity and Partial Identification in IV models for Count Outcomes

Dongwoo Kim
Department of Economics, University College London
[Latest update: March 29, 2017]

Abstract

This paper studies count data instrumental variable (IV) models in which explanatory variables are endogenous and unobserved heterogeneity is nonseparable. Prevailing models in the literature are shown to suffer from undesirable specification problems. I propose a single equation count data model in which neither parametric restrictions nor strong separability is required. This model explicitly accommodates the discreteness of count data by modifying an ordered choice model. Structural features of interest are set identified, and the characterisation of the identified set is provided by the generalised IV model framework introduced in Chesher and Rosen (2016). Identified sets can be rather small since count data often have a rich support. Numerical examples and an application to the effect of supplemental insurance on doctor visits are provided. State-of-the-art inference methods are employed to find confidence regions for set estimates. The empirical application shows that the set estimation framework delivers useful information about structural features. It also examines misspecification.

Keywords: Count data; Poisson regression; negative binomial regression; endogeneity; instrumental variables; single equation models; partial identification; set identification; intersection bounds; incomplete models; confidence regions

JEL Classification Numbers: C25, C26, I12

Address: Room G01, Department of Economics, University College London, 30 Gordon Street, London, UK, dongwoo.kim.13@ucl.ac.uk. The author is deeply grateful to Andrew Chesher and Toru Kitagawa for their supervision. I also thank Ivan Canay, Matias Cattaneo, Sokbae Lee, Jeff Rowley, Alexander Torgovitsky, Daniel Wilhelm, and seminar participants at UCL and LSE for helpful discussions.

1 Introduction

This paper introduces a new approach to count data¹ instrumental variable (IV) models in which explanatory variables are potentially endogenous and unobserved heterogeneity is nonseparable. The proposed approach is widely applicable in applied studies, as many outcomes of interest are count-measured. For instance, in health economics, the numbers of doctor visits and other types of health care utilisation, occupational injuries, and illnesses are all count outcomes. Other examples are widely found in labour economics, empirical IO, and even finance: absenteeism in the workplace, recreational or shopping trips, entries to and exits from industries, mortgage prepayments and loan defaults, bank failures, patent registrations in connection with industrial R&D, and the frequency of airline accidents (see Cameron and Trivedi (2013) - CT2013 henceforth - for more detailed examples). Endogeneity may arise in count data models. In the context of doctor visits, some observable characteristics can be correlated with unobserved heterogeneity. Suppose that individuals in a survey self-report their current health status but do not report whether they have private health insurance. Explanatory variables such as income would then be endogenous, as this unobserved factor is probably correlated with both health conditions and income. In this case, the OLS estimator fails to deliver correct information about the causal effects of interest. IV models are the usual ploy to cope with this problem. I propose a single equation count IV model by modifying an ordered choice model² suggested in Chesher and Smolinski (2012). Structural features of interest are set identified, and a parsimonious characterisation of the sharp identified set is provided by the generalised IV model framework introduced in Chesher and Rosen (2016).
The use of this model is beneficial because popular count data models with endogeneity in the current literature tend to be misspecified, as the discreteness of the count outcome is often ignored. Using simulated data, I demonstrate that popular approaches such as the control function approach and moment based models deliver misleading information about the causal effects of interest. The proposed model respects the discreteness of count data and hence is more robust to misspecification. Partially identifying models often provide uninformative (possibly very large) identified sets, so their usefulness in practice is a concern in empirical studies (Ho and Rosen (2015), Section 7.2). In the context of count data models, this concern can be overcome since the count outcome

1 Count data are a type of discrete data in which observations only take non-negative integer values.
2 Ordered choice models are sometimes used for count data, as count outcomes are also ordered. CT2013 suggests that parametric models such as logit and probit are particularly suitable when the support of the outcome is very limited, such as binary or {0, 1, 2}, or if the outcome is generated from threshold crossing of a latent continuous variable.

may have a rich support, depending upon the duration over which the count is aggregated. A richer support of the outcome in general leads to a smaller identified set. I show that identified sets of structural features are very close to points in some numerical examples where the IV is strong or the support of the outcome is rich. A simple algorithm is introduced to numerically implement the characterisation of the identified set. As a simple grid search with a selection of conditional moment inequalities provides a good approximation of the identified set, the problem at hand becomes substantially more tractable and computationally feasible. Estimation in a finite sample is also studied. An empirical example is provided on a data set used in CT2013, and the set estimates of structural parameters are compared to point estimation results. Recent developments in the partial identification literature provide inference methods for identified sets. Chernozhukov et al. (2013) develop a novel inference method for identified sets characterised by intersection bounds. Kaido et al. (2016) introduce a bootstrap based inference method for projections of high dimensional identified sets. These state-of-the-art methods are employed to find confidence regions for set estimates. Therefore, this paper documents a unified framework for partially identifying count data IV models, from identification to inference. In the current literature, two branches of IV estimation for count outcomes are particularly prevalent. The first is a full information (FI) approach in which data generating processes (DGPs) are specified for all endogenous variables. The control function method is a representative example. Terza et al. (2008) implement this approach in the context of count models, namely two stage residual inclusion (2SRI) estimation. The control function approach is widely used in applied studies but is known to have several problems.
For example, the recursive structure rules out full simultaneity. Moreover, endogenous variables are required to be continuously distributed. Otherwise, structural features of interest are generally set identified, as shown in Chesher (2005). On the contrary, a limited information (LI) method does not specify the DGPs of some or all endogenous variables. LI models are more robust to misspecification as fewer restrictions are imposed. This robustness is, however, often obtained at the cost of identification power or efficiency of an estimator. Moment based LI approaches are suggested in Windmeijer and Santos Silva (1997) (WS1997 henceforth) and Mullahy (1997). These single equation models are argued to be point identifying under strong separability. However, they ignore the discreteness of count outcomes. Therefore, even though the parameters in their moment conditions are point identified, the models explain nothing about the DGPs of the outcome variables. Furthermore, count outcomes are discrete but a continuous structural function is imposed in their specifications. Hence the separable errors absorb the discreteness of the

outcomes. Consequently, the conditional support of the separable error depends on the given values of the explanatory variables. It can be shown that no instrument satisfies the strong independence condition if endogenous variables are discrete. Therefore, a more flexible form of the error term is required to avoid this specification problem. It may give rise to partial identification, as Chesher (2010) points out that IV models for discrete outcomes are generally not point identifying but set identifying unless strong restrictions are invoked. The importance of model specification cannot be emphasised enough in applied economic studies. Applied researchers often impose simplifying assumptions which are not based on economic theory in order to make identification and estimation more tractable. In many cases, these become the primary source of misspecification. Misspecified models deliver misleading information on the causal relationships of interest. Therefore, econometricians have tried to minimise redundant and unjustifiable restrictions. Partial identification is a brainchild of this philosophy. It imposes a minimal set of restrictions to read useful information from data and hence is less vulnerable to attacks on econometric assumptions. LI often induces partial identification, but even with FI, point identification is not always guaranteed. This paper is structured as follows. Section 2 points out potential flaws of prevailing approaches in the literature. Section 3 introduces count data IV models with a nonseparable error and the characterisation of identified sets. Section 4 demonstrates identified sets in numerical examples employing parametric restrictions. Section 5 shows estimation and inference results for an empirical example. Section 6 concludes. All proofs are provided in Appendix I.

2 Prevailing approaches and potential problems

Standard linear regression models are incoherent for count outcomes as fitted values can be negative.
The most widely used count data method is Poisson regression. Let Y be a scalar count outcome and X a vector of explanatory variables. The conditional mean function of Y given X is E(Y | X) = exp(X'β). The vector of structural parameters β governs both the conditional mean and the variance of Y given X. Equidispersion is the main feature of this model. It is too simple and highly restrictive, because count data are in general over- or underdispersed. The negative binomial (NB) model, in which a shape parameter controls the degree of dispersion of Y, is a popular ploy in such a case. Now suppose that there is an unobserved characteristic U which would be included in the outcome equation if it were observable. The endogeneity problem arises when U is correlated with X, in which case the parameters are not identified. Instrumental variable models address this problem. The control function approach and moment based methods are the most popular in the literature.
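The equidispersion restriction mentioned above is easy to see in simulation. The following sketch, with hypothetical parameter values not taken from the paper, contrasts a Poisson outcome with a gamma-Poisson (negative binomial) mixture:

```python
import numpy as np

# A minimal simulation sketch (hypothetical parameters): under the Poisson
# model the conditional variance equals the conditional mean (equidispersion),
# while a gamma-Poisson mixture (the negative binomial) is overdispersed.
rng = np.random.default_rng(0)
n = 200_000
lam = 3.0  # conditional mean exp(x'beta) at some fixed x

y_pois = rng.poisson(lam, size=n)

# NB as a gamma-Poisson mixture: lambda_i = lam * V_i with E[V] = 1, so the
# mean is preserved but the variance inflates to lam + lam**2 / shape.
shape = 2.0
v = rng.gamma(shape, 1.0 / shape, size=n)
y_nb = rng.poisson(lam * v)

print(y_pois.mean(), y_pois.var())  # both close to 3.0
print(y_nb.mean(), y_nb.var())      # mean close to 3.0, variance close to 7.5
```

The mixture keeps the conditional mean intact while inflating the variance, which is exactly why the NB shape parameter can absorb overdispersion that the Poisson model cannot.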

2.1 Control function approach

Terza et al. (2008) introduce the control function approach in the context of count data models. The model is specified as

Y ~ Poisson[λ(X) = exp(X'β + U)]  (1)
X = g(Z'δ) + V  (2)
U = αV + e  (3)

Z is assumed to be independent of e and V, e and V are mutually independent, and E[exp(e)] is normalised to 1. Then

E[λ(X) | X, Z, V] = E[exp(e) | X, Z, V] exp(X'β + αV) = exp(X'β + αV)

and V is identified by the second equation. Therefore, in the first stage, the regression of X on Z yields V̂. Secondly, the Poisson regression of Y on X and V̂ gives the IV estimate of β. This method is widely used as it is very tractable, but it is somewhat restrictive in the sense that the recursive structure rules out full simultaneity³ (Koenker (2005), Section 8.8.3). Moreover, there are additional sources of misspecification due to the auxiliary first stage equation, for which simple linear models are usually employed in practice. If the true function g is nonlinear, then all the estimation results are invalid. Furthermore, the endogenous variable X is generally required to be continuously distributed. Otherwise, the error term in the first stage is not point identified. For instance, if X is an ordered choice, standard parametric models do not provide a single value of V given Z and X. The instrument Z is also required to be continuous unless the first stage is linear. Chesher (2005) suggests that set identification is possible when X is discrete and the error term is nonseparable, but his method is not applicable if X is a single binary variable.

2.2 Moment based approaches with strong separability

Moment based approaches do not rely on the recursive structure. Suppose that unobserved heterogeneity U is additively separable. Then the model is specified as follows.

Y = exp(X'β) + U, E[U | Z] = 0  (4)

3 In simultaneous equation models, endogenous variables might affect each other.
Therefore, variation in Y could in principle induce changes in X. The recursive system in the control function approach rules out this relationship, as Y is restricted to have no effect on X.
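The two stages of 2SRI can be sketched on simulated data. Everything in the snippet (normal errors, a linear first stage, the coefficient values) is an illustrative choice of mine rather than the paper's specification; the second stage is a Poisson regression fitted by maximum likelihood:

```python
import numpy as np
from scipy.optimize import minimize

# A sketch of 2SRI on simulated data (hypothetical DGP, not from the paper).
rng = np.random.default_rng(1)
n = 50_000
z = rng.normal(size=n)
e = rng.normal(size=n)
v = rng.normal(size=n)
x = 0.8 * z + v          # first stage: X = g(Z) + V with linear g
u = 0.5 * v + e          # U = alpha*V + e, so X is endogenous
y = rng.poisson(np.exp(0.3 + 0.7 * x + u))   # true slope beta = 0.7

def nll(b, yv, D):
    # Poisson negative log-likelihood (up to a constant in y)
    eta = D @ b
    return np.sum(np.exp(eta) - yv * eta)

def grad(b, yv, D):
    return D.T @ (np.exp(D @ b) - yv)

# Stage 1: OLS of X on Z, keep the residuals V-hat.
Z1 = np.column_stack([np.ones(n), z])
vhat = x - Z1 @ np.linalg.lstsq(Z1, x, rcond=None)[0]

# Stage 2: Poisson regression of Y on (1, X, V-hat).
D = np.column_stack([np.ones(n), x, vhat])
fit = minimize(nll, np.zeros(3), args=(y, D), jac=grad, method="BFGS")
print(fit.x)  # slope fit.x[1] close to 0.7; fit.x[2] close to alpha = 0.5
```

With this recursive DGP the slope estimate is consistent, while the intercept converges to the true intercept plus log E[exp(e)]; nonlinearity in g or a discrete X would break both stages, as discussed above.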

Then, under the existence of a relevant instrument, WS1997 show that β is point identified by the moment condition E[Z(Y − exp(X'β))] = 0. The generalised method of moments (GMM) estimator with an appropriate weight matrix consistently estimates β. However, Mullahy (1997) points out that this specification treats X and U asymmetrically without a particular reason. Suppose now that unobserved heterogeneity W is an omitted characteristic and U is a regression error such that E[U | X, W, Z] = 0. Then the structural equation is written as

Y = exp(X'β + W'δ) + U = exp(X'β)V + U  (5)

where V = exp(W'δ). V is multiplicatively separable, and X and V are treated symmetrically. Normalise E[V | Z] = 1. The following moment condition point identifies β, as shown in Mullahy (1997).

E[Y / exp(X'β) − 1 | Z] = 0  ⟹  E[Z(Y / exp(X'β) − 1)] = 0  (6)

The two specifications (4) and (5) are observationally equivalent (see Wooldridge (1992)). The moment based approaches involve a fundamental problem when unobserved heterogeneity is given an economic interpretation. In econometric models with endogeneity, unobserved heterogeneity generally has a clear economic meaning. In the returns-to-schooling example, years of education (X) are supposed to be correlated with unobservable ability (U), which affects X as well as income (Y) for an individual. Therefore, an instrument Z is necessary in order to separately identify the causal effect of education on earnings from that of unobserved ability, and a persuasive explanation of the relationship between Z and U should be presented, as it is untestable. Now suppose that a model specification per se highly restricts the distribution of U, so that endowing U with an economic interpretation is hard. If one cannot devise an economic example of such unobserved heterogeneity, then it is also impossible to argue that good instruments Z exist. For example, suppose that one writes a linear probability model (LPM) when Y and X are binary.
Y = α + βX + U, E[U | Z] = 0  (7)

Then the conditional support of U given X is binary, as U only takes either 1 − α − βX or −α − βX, and hence X and U are not independent. This arises from the attempt to fit a discrete outcome with a continuous (linear) function: U absorbs the discreteness of Y. However, it is seldom justified to impose such discreteness on U. Can unobserved heterogeneity whose discrete conditional support varies with X be found in any economic

example? How can one endow it with an economic meaning? These questions are very hard to answer, even though this model specification is not uncommon. The more fundamental problem is that there exists no instrument which is independent of U but correlated with X. Suppose that Z ⊥ U and X is binary. The conditional support and the probability mass of U given Z are shown in Table 1. As P[U | Z] = P[U] by the independence assumption, P[Y = y, X = x | Z] = P[Y = y, X = x] for any x, y ∈ {0, 1}, and hence X and Z are also independent. Therefore, the rank condition is never satisfied. This result extends to the more general case where X is continuous.

Y   X   U            P[U | Z]
0   0   −α           P[Y = 0, X = 0 | Z]
1   0   1 − α        P[Y = 1, X = 0 | Z]
0   1   −α − β       P[Y = 0, X = 1 | Z]
1   1   1 − α − β    P[Y = 1, X = 1 | Z]

Table 1: Conditional support of U given Z

Proposition 1 Under the LPM (7), suppose that Y is binary and X is continuous. Assume that all structures admitted by the model satisfy 0 < α + βx < 1 for all x ∈ R_X. If there exists an instrument Z which is independent of U, then X and Z are independent of each other.

Remark 1 The crucial condition for Proposition 1 is 0 < α + βx < 1. This is sensible, as α + βx is the conditional probability of Y = 1 given X. If it is violated, then variation of X with Z cannot be ruled out, because a value of u matches two pairs of (Y, X) in the area where the conditional supports overlap. This is paradoxical, as more extreme behaviour of the conditional probability is the key to resolving the problem. Conditional mean independence of U given Z is required for identification of α and β. It is slightly weaker than the strong independence condition in Proposition 1 above. However, in many applied economic studies, it is rarely justifiable to argue that Z satisfies conditional mean independence but is not independent of U. As neither is testable, most applied researchers argue that their instruments are completely exogenous to unobserved heterogeneity.
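The mechanics behind Table 1 can be made concrete in a few lines of code; α and β below are hypothetical values satisfying the support restriction:

```python
# Each support point of U maps to exactly one (Y, X) cell, so Z independent
# of U forces P[Y, X | Z] = P[Y, X], and hence X independent of Z.
alpha, beta = 0.3, 0.25  # hypothetical, with 0 < alpha + beta*x < 1
support = {
    (0, 0): -alpha,            # U = Y - alpha - beta*X
    (1, 0): 1 - alpha,
    (0, 1): -alpha - beta,
    (1, 1): 1 - alpha - beta,
}
# With beta != 0 and 0 < alpha + beta*x < 1, the four u-values are distinct,
# so the map u -> (y, x) is one-to-one.
assert len(set(support.values())) == 4
print("u -> (y, x) is one-to-one: Z independent of U implies Z independent of X")
```

Because the map from u to the (Y, X) cell is invertible, fixing the distribution of U given Z fixes the joint distribution of (Y, X) given Z, which is the rank-condition failure stated in the text.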
(For example, see Angrist and Krueger (1991).) However, this argument is fundamentally impossible under the specification of the LPM. A partial identification approach to this problem is proposed in Chesher and Rosen (2013) for binary outcomes. Models (4) and (5) have a similar problem. Suppose the model is Y = exp(α + βX) + U. There is no support restriction on α + βX. Define the support of Y as M ≡ {0, 1, 2, …}. M

is possibly unbounded. If X is continuous and unbounded, then for any u = m − exp(α + βx) with m ∈ M and x ∈ R_X, there exist m′ ∈ M and x′ ∈ R_X such that m − exp(α + βx) = m′ − exp(α + βx′). Then the probability distribution of X can vary with Z even if Z ⊥ U. This can be shown easily. Suppose that β > 0. For any given u ∈ R_U, a level set of pairs (m, x_m) is defined as follows.

C(u) ≡ {(m, x_m) : u = m − exp(α + βx_m), x_m ∈ R_X, m ∈ M}  (8)

As the exponential term is always positive, if m ≤ u, then u > m − exp(α + βx) regardless of the value of X. Thus the cumulative distribution function of U, F_U(·), satisfies

F_U(u) = P[Y ≤ u] + Σ_{m>u} P[Y = m, X ≥ x_m].

The strict independence condition requires that F_{U|Z}(u | z) = F_U(u) for all z ∈ R_Z, and hence

F_{U|Z}(u | z) = P[Y ≤ u | z] + Σ_{m>u} P[Y = m, X ≥ x_m | z] = F_U(u), ∀z ∈ R_Z.  (9)

Neither P[Y ≤ u | z] = P[Y ≤ u] nor P[Y = m, X ≥ x_m | z] = P[Y = m, X ≥ x_m] is required for (9) to hold. Therefore, the possibility of variation of X with Z cannot be ruled out. If X is discrete and bounded, however, the same problem occurs. The following proposition shows that the existence of a good IV is rarely assured.

Proposition 2 Suppose that Y is a count and X is discrete and finite, i.e. R_X ≡ {x_1, x_2, …, x_n}. Under the model Y = exp(α + βX) + U, only a particular set of pairs (α, β), whose Lebesgue measure is zero, allows for an instrument Z that is independent of U but correlated with X.

The true parameter values are never known, so it is never assured that a proper instrument exists. Even if the true parameters indeed lie in the particular set of Proposition 2, only limited variation between certain values of X is allowed. The result in Proposition 2 extends to the model (5) with the multiplicative error; the additive error U is omitted there as it is redundant.

Proposition 3 Suppose that Y is a count and X is discrete and finite.
Under the model Y = exp(α + βX)V, only a particular set of pairs (α, β), whose Lebesgue measure is zero, allows for an instrument Z that is independent of V but correlated with X.
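To see the measure-zero statement in Propositions 2 and 3 concretely, one can enumerate the conditional supports of U numerically. The intercept, the slopes, and the knife-edge value log 2 below are illustrative:

```python
import math

# Support points of U given x in the model Y = exp(a + b*x) + U are
# u = m - exp(a + b*x), m = 0, 1, 2, ...  Supports at x = 0 and x = 1 overlap
# only if exp(a + b) - exp(a) is an integer: a measure-zero set of (a, b).
def u_support(a, b, x, m_max=50):
    return {round(m - math.exp(a + b * x), 10) for m in range(m_max)}

a = 0.0
generic = u_support(a, 0.6, 0) & u_support(a, 0.6, 1)            # generic slope
knife = u_support(a, math.log(2.0), 0) & u_support(a, math.log(2.0), 1)
print(len(generic), len(knife))  # -> 0 49
```

For a generic slope the supports at x = 0 and x = 1 are disjoint, so any instrument that moves X must also move the distribution of U; overlap, which Z ⊥ U with Z correlated with X requires, occurs only at knife-edge parameters such as b = log 2 here.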

The moment based approaches ignore the discreteness of count outcomes. Even though the parameters in the models are point identified, they do not necessarily say anything about the underlying DGP. Furthermore, the specifications cannot accommodate more complex interactions between X and U. For instance, polynomial regression models involving interaction terms between explanatory variables are often employed in applied studies. Suppose that the true model is Y = exp(α + βX + γW + δXW) + U and W is unobservable. Then the multiplicative unobserved heterogeneity V in the model (5) is exp(γW + δXW), and hence there exists no instrument satisfying the moment condition E[exp(γW + δXW) | Z] = 1 and the rank condition simultaneously⁴. Therefore, the models (4) and (5) cannot handle more complex interactions between X and U. Needless to say, the moment based approach does not work when unobserved heterogeneity is nonseparable.

3 Count data IV Models with nonseparable error

A nonparametric count data IV model is built upon an ordered outcome model suggested by Chesher (2010). Define M as the set of all non-negative integers, M ≡ {0, 1, 2, …}, with typical element m ∈ M. M is possibly unbounded. Y is a random count outcome, X is a vector of potentially endogenous explanatory variables, and U is a random scalar. The model is

Y = h(X, U) = 0 if p_0(X) ≤ U ≤ p_1(X)
            = 1 if p_1(X) < U ≤ p_2(X)
            ⋮
            = m if p_m(X) < U ≤ p_{m+1}(X)
            ⋮                                  (10)

where p_0(X) = 0, 0 ≤ p_m(X) ≤ 1, and p_m(X) ≤ p_{m+1}(X) for all X and m. U is normalised to U ~ Unif(0, 1) without loss of generality. The threshold functions {p_m(X)}_{m=1}^∞ are the objects of interest. Suppose that X is discrete and independent of U. Then it is reasonable to define the conditional distribution function of Y given X as p_{m+1}(X) = P[Y ≤ m | X] = F_{Y|X}(m | X).
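Under exogeneity this identification argument is easy to verify by simulation. The Poisson threshold structure below is an illustrative parametric choice of mine, not part of the model:

```python
import numpy as np
from scipy.stats import poisson

# Model (10) with an exogenous binary X: inverting the Poisson cdf at a
# Unif(0,1) draw is exactly the threshold-crossing structure h(x, u).
rng = np.random.default_rng(3)
n = 100_000
x = rng.integers(0, 2, size=n)   # exogenous binary regressor
u = rng.uniform(size=n)          # U independent of X
y = poisson.ppf(u, np.exp(0.2 + 0.8 * x)).astype(int)

# The empirical conditional cdf recovers p_{m+1}(x) = F_{Y|X}(m | x).
for xv in (0, 1):
    for m in range(3):
        est = np.mean(y[x == xv] <= m)
        true = poisson.cdf(m, np.exp(0.2 + 0.8 * xv))
        print(xv, m, round(est, 3), round(true, 3))
```

With X independent of U, the sample analogue of the conditional cdf consistently estimates every threshold; the next paragraphs show how this breaks down once X and U are dependent.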
Since U | X ~ Unif(0, 1), the thresholds {{p_m(x)}_{m=1}^∞}_{x ∈ R_X} are all point⁴

4 In a standard linear regression model with no endogeneity, Y = α + βX + γW + δXW + U, if X is independent of U and W, the moment condition E[X(γW + δXW + U)] = 0 can be satisfied. Therefore, the parameters of interest are identified. Models (4) and (5), by contrast, cannot identify the parameters even if X is independent of W, as X and the unobserved variable are nonseparable.

identified by the cumulative distribution function (cdf) of Y conditional on X. Therefore, the full conditional distribution of Y given X is nonparametrically identified, and it provides useful insight about the causal relationship between X and Y, as the distributional shift of Y with respect to X is captured. Structural features of interest such as average treatment effects are also identified. In a finite sample, the thresholds are consistently estimated by the sample analogue estimator. If X and U are not independent, the thresholds are not identified by F_{Y|X}. Suppose that X is binary, i.e. R_X ≡ {0, 1}, and that F_{U|X}(τ | X = 1) first order stochastically dominates F_{U|X}(τ | X = 0):

F_{U|X}(τ | X = 1) ≤ F_U(τ) ≤ F_{U|X}(τ | X = 0), ∀τ ∈ [0, 1]

Then F_{Y|X}(m | X = 1) ≤ p_{m+1}(1) and p_{m+1}(0) ≤ F_{Y|X}(m | X = 0). Therefore, without additional information, {p_m(0), p_m(1)}_{m=1}^∞ are not identified. What one can identify are lower bounds for {p_m(1)}_{m=1}^∞ and upper bounds for {p_m(0)}_{m=1}^∞, which might not be very informative. Without the first order stochastic dominance assumption, one may be able to identify no-assumption bounds as in Manski and Pepper (2000). The main question of this paper is how to identify the threshold functions under the existence of an instrument Z. Strong separability has been the key source of point identification in count data models. As Chesher (2010) shows, point identification is generally not achievable in single equation nonseparable IV models for discrete outcomes, even with parametric restrictions. However, under the existence of a relevant instrument, one is able to identify bounds that are more informative than the no-assumption bounds.

3.1 Generalised Instrumental Variable Model

The characterisation of identified sets in count data IV models is provided under the generalised instrumental variable (GIV) model restrictions in Chesher and Rosen (2016). Let G_{U|Z} denote the collection of conditional distributions of U given Z.
G_{U|Z} ≡ {G_{U|Z}(· | z) : z ∈ R_Z}

Under the GIV Restrictions 1-6 in their paper, the identified set for the structural function h and G_{U|Z} is characterised. The following Restrictions 1-3 satisfy the GIV restrictions, so they facilitate the use of the same characterisation of the identified set in the model (10).

Restriction 1 Y and U are random scalars and X and Z are random vectors defined on a probability space (Ω, L, P), endowed with the Borel sets on Ω.

Restriction 2 The support of Y is a subset of the non-negative integers, M ≡ {0, 1, 2, …}, and the support of (X, Z) is a subset of Euclidean space. A collection of conditional distributions F_{YX|Z} ≡ {F_{YX|Z}(· | z) : z ∈ R_Z} is identified by the sampling process, where F_{YX|Z}(T | z) ≡ P[(Y, X) ∈ T | z] for all T ⊆ {(y, x) : y ∈ R_Y, x ∈ R_X}.

Restriction 3 U is uniformly distributed on the unit interval [0, 1] and G_{U|Z}(· | z) = G_U(·) for all z ∈ R_Z, where G_U denotes the marginal distribution function of U.

As G_{U|Z} is a singleton by Restriction 3, the object of identification is only the structural function h, which is fully characterised by the threshold functions {{p_m(x)}_{m ∈ M}}_{x ∈ R_X}. To use the theory of random sets, define two level sets:

Y(U; h) ≡ {(y, x) ∈ R_{YX} : h(x, U) = y},  U(Y, X; h) ≡ {u ∈ R_U : h(X, u) = Y}

Then under the model (10), these two level sets become

Y(u; h) = {(m, x) ∈ R_{YX} : p_m(x) < u ≤ p_{m+1}(x)},  U(m, x; h) = [p_m(x), p_{m+1}(x)]

To be more precise, U(m, x; h) should be left-open, but I stick to the closed set definition henceforth. This permits the use of random set theory, which characterises distributions of random closed sets, and it causes no problem, because the difference between the level set U(m, x; h) and its closure always has probability zero, as U is continuously distributed⁵. Let S be a closed subset of [0, 1]. The containment functional of U(Y, X; h) is

C_h(S | z) ≡ P[U(Y, X; h) ⊆ S | z].

A set function G_U(S) ≡ P[U ∈ S] is also defined. Let H denote the identified set of the structural function h, and let F(A) be the collection of all closed subsets of a set A. Then Corollary 1 provides the sharp characterisation of the identified set.

Corollary 1 (Chesher and Rosen (2016)) Under Restrictions 1-3, the sharp identified set of the structural function h in the model (10) is

H ≡ {h : ∀S ∈ F([0, 1]), C_h(S | z) ≤ G_U(S), a.e. z ∈ R_Z}.

3.2 Core determining test sets

The number of closed subsets of [0, 1] is infinite.
Computation of the sharp identified set is thus often infeasible in practice. To find a practically implementable characterisation of

5 This is because the left end point of the level set U has Lebesgue measure zero. See Chesher and Rosen (2016), p. 9, for a detailed discussion.

the identified set, a notion of core determining classes is employed, as Galichon and Henry (2011) suggest. In the context of the model (10), a collection of core determining test sets (CDTS) is defined as follows.

Definition 1 (Core determining test sets) Let Q_h denote a subcollection of F([0, 1]) such that

C_h(S | z) ≤ G_U(S), ∀S ∈ Q_h  (11)

for almost every z ∈ R_Z. Q_h is a collection of core determining test sets if the same inequality then also holds for every closed S ⊆ [0, 1].

Therefore, finding the smallest possible subcollection of F([0, 1]) which satisfies the definition of CDTS is essential to reduce the computational burden of identifying H. Let U_h denote the support of the random level set U(Y, X; h):

U_h ≡ {[0, p_1(x)], [p_1(x), p_2(x)], …, [p_m(x), p_{m+1}(x)], … : x ∈ R_X}

Theorem 3 in Chesher and Rosen (2016) (TH3 henceforth) suggests the collection of all connected unions⁶ of elements of U_h as the collection of CDTS. But under the model (10), the number of elements of U_h is possibly infinite, as M may be unbounded. Thus the number of core determining test sets is also infinite. Let Q̄_h be the collection of all connected unions of elements of U_h. To make identification feasible in practice, a finite subcollection of Q̄_h should be selected. The objective is to construct a finite collection which has as few elements as possible without loss of information. Further refinement of Q̄_h is achievable under a certain condition.

Condition 1 For every interval [p_m(x), p_{m+1}(x)] ∈ U_h, there exists k ∈ M such that p_k(x′) ∈ [p_m(x), p_{m+1}(x)] for all x′ ≠ x.

By exploiting Condition 1, I propose a refinement of Q̄_h and show in the following theorem that the suggested collection loses no information relative to Q̄_h.

Theorem 1 Suppose that X is discrete. Under the model (10) and Condition 1, the collection Q*_h with

Q*_h ≡ {[0, p_m(x)], [p_m(x), 1] : m ∈ M\{0}, x ∈ R_X}  (12)

is a collection of core determining test sets.

6 All disconnected unions and [0, 1] are excluded by TH3.
The inequality (11) is trivially satisfied by [0, 1], so there is no need to check this interval.
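A sketch of how the Q*_h criterion can be checked on data: for candidate thresholds P and each test set S = [0, p_m(x)] or [p_m(x), 1], compare the empirical containment functional with Leb(S). The Poisson threshold structure, the instrument design, and the sampling tolerance below are illustrative choices of mine, not the paper's algorithm:

```python
import numpy as np
from scipy.stats import poisson

# Membership test in the spirit of Theorem 1 / Corollary 3: keep a candidate
# P in the outer region iff C_P(S | z) <= G_U(S) for all S in Q*_h and all z,
# up to sampling tolerance.  U(y, x; h) = [p_y(x), p_{y+1}(x)] is contained
# in [0, t] iff p_{y+1}(x) <= t, and in [t, 1] iff p_y(x) >= t.
def in_outer_region(y, x, z, p, m_max, x_vals, tol=0.025):
    for zv in np.unique(z):
        ys, xs = y[z == zv], x[z == zv]
        lo, hi = p(ys, xs), p(ys + 1, xs)       # endpoints of U(y, x; h)
        for xv in x_vals:
            for m in range(1, m_max + 1):
                t = p(m, xv)
                if np.mean(hi <= t) > t + tol:          # S = [0, t]
                    return False
                if np.mean(lo >= t) > (1.0 - t) + tol:  # S = [t, 1]
                    return False
    return True

def make_p(a, b):
    # Parametric thresholds p_m(x) = F_Poisson(m - 1; exp(a + b*x)), p_0 = 0.
    return lambda m, x: poisson.cdf(np.asarray(m) - 1, np.exp(a + b * x))

# Simulated data with exogenous X = Z, so the true (a, b) must survive.
rng = np.random.default_rng(4)
n = 50_000
z = rng.integers(0, 2, size=n)
x = z.copy()
y = poisson.ppf(rng.uniform(size=n), np.exp(0.1 + 0.9 * x)).astype(int)

print(in_outer_region(y, x, z, make_p(0.1, 0.9), 8, (0, 1)))    # truth kept
print(in_outer_region(y, x, z, make_p(0.1, -0.9), 8, (0, 1)))   # rejected
```

Embedding this check in a grid search over (a, b) traces out the outer region used in the numerical discussion below.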

Theorem 1 also works for an outcome with a bounded support, so it is applicable to other ordered choice models. Identification of the structural function is now straightforward. Corollary 1 with Q̄_h gives the sharp characterisation of the identified set for h. Under Condition 1, Q*_h suffices to deliver the sharp identified set. Define a set of threshold functions P ≡ {p_m(x) : x ∈ R_X, m ∈ M}. A function h is then fully characterised by P, so I substitute P for h in the notation henceforth. Let P denote the collection of all admissible P:

P ≡ {P : p_m(x) ∈ [0, 1], p_m(x) ≤ p_{m+1}(x) for all x ∈ R_X, m ∈ M}

Then the identified set P*, a subcollection of P, is found by the following corollaries.

Corollary 2 Given the joint distribution of (Y, X, Z), the identified set for the structural function h is characterised as

P* = {P : ∀S ∈ Q̄_h, C_P(S | z) ≤ G_U(S) a.e. z ∈ R_Z}.

Corollary 3 Given the joint distribution of (Y, X, Z) and under Condition 1, the identified set for the structural function h is characterised as

P* = {P : ∀S ∈ Q*_h, C_P(S | z) ≤ G_U(S) a.e. z ∈ R_Z}.

Corollaries 2 and 3 are direct applications of TH3 and Theorem 1. They provide a parsimonious characterisation of the sharp identified set, and the identification result here is fully nonparametric. The value of the containment functional is determined by the ordering of the elements of P, so all possible orderings need to be considered for identification. Given a particular set of threshold functions P, its ordering gives upper and lower bounds for each of its elements. If all the elements lie between their bounds, P ∈ P*. However, as the supports of Y and X become richer, the number of admissible orderings increases explosively. The number of admissible orderings is computed in Chesher and Smolinski (2012):

L = (K(m̄ − 1))! / ((m̄ − 1)!)^K

As M is possibly unbounded, computation for identification is highly cumbersome in count data IV models.
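The explosion of L is easy to appreciate numerically; the (K, m̄) values below are arbitrary:

```python
from math import factorial

# Number of admissible orderings L = (K(m - 1))! / ((m - 1)!)**K from
# Chesher and Smolinski (2012); it explodes as the supports of X and Y grow.
def n_orderings(K, m_bar):
    return factorial(K * (m_bar - 1)) // factorial(m_bar - 1) ** K

print(n_orderings(2, 3))  # -> 6
print(n_orderings(2, 6))  # -> 252
print(n_orderings(3, 6))  # -> 756756
```

Already with three support points for X and thresholds up to m̄ = 6 there are close to a million orderings, which is what motivates the shape and parametric restrictions discussed next.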
Therefore, appropriate shape restrictions or parametric restrictions might be imposed in practice to reduce the computational burden. In particular, parametric restrictions generate a large number of threshold functions from a few structural parameters. This motivates the use of appropriate parametric specifications which fit the data well.

Remark 2 Condition 1 is highly restrictive, as it is only satisfied for a limited subset of

P. In the process of identification, therefore, Condition 1 should not be imposed unless the impact of X on Y is expected to be very small. Nonetheless, the use of Q*_h can still be beneficial without Condition 1 if it provides a good approximation of the sharp identified set, as discussed below. If a parametric restriction is imposed, the use of Q*_h is particularly beneficial even without Condition 1 being satisfied. As the condition is not very intuitive, it is helpful to find more intuitive sufficient conditions under which Condition 1 is satisfied.

Condition 2 For all m ∈ M,
(i) Complete separation: max{p_m(x_1), p_m(x_2), …, p_m(x_K)} ≤ min{p_{m+1}(x_1), p_{m+1}(x_2), …, p_{m+1}(x_K)}
(ii) Monotonicity: p_m(x_1) ≤ p_m(x_2) ≤ … ≤ p_m(x_K) or p_m(x_K) ≤ p_m(x_{K−1}) ≤ … ≤ p_m(x_1)

Lemma 1 Under Condition 2, Condition 1 is satisfied.

If the threshold functions p_m(x) are generated from a parametric structure, i.e. p_m(x) = F(m, λ(x)) with λ(x) = exp(α + βx), where F belongs to a known class of parametric count distribution functions, complete separation means that X has only a weak impact on the thresholds; in other words, β is close enough to zero. Under this type of parametric restriction, a set of threshold functions P(α, β) is generated by a finite number of structural parameters. Thus identification of α and β is provided by the conditional moment inequalities from Corollary 3 given P(α, β). A pair (α, β) is included in the outer region if P(α, β) satisfies all the conditional moment inequalities. As one can always find a small enough β to ensure Condition 2, Q*_h provides the strongest possible criterion for values of β around 0. If the true β is close to 0, the main interest of identification is whether the identified set for β includes 0. Therefore, when the outer region provided by Q*_h contains 0, the core determining collection Q̄_h is also unable to exclude 0.
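For thresholds generated from a Poisson cdf as above, the complete separation part of Condition 2 can be checked directly; the parameter values below are illustrative:

```python
import numpy as np
from scipy.stats import poisson

# Check Condition 2(i) for p_m(x) = F_Poisson(m - 1; exp(alpha + beta*x)):
# max_x p_m(x) <= min_x p_{m+1}(x) for every m up to m_max.
def complete_separation(alpha, beta, x_vals, m_max=20):
    lam = np.exp(alpha + beta * np.asarray(x_vals, dtype=float))
    for m in range(1, m_max):
        if poisson.cdf(m - 1, lam).max() > poisson.cdf(m, lam).min():
            return False
    return True

x_vals = (0, 1, 2)
print(complete_separation(0.5, 0.05, x_vals))  # weak effect of X: holds
print(complete_separation(0.5, 1.00, x_vals))  # strong effect: fails
```

A small β keeps every p_m(x) below every p_{m+1}(x′), as complete separation requires, while a large β breaks the condition already at m = 1.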
In the case where $\beta$ is large (so that Condition 2 is severely violated), the outer region delivered by $\tilde{\mathcal{Q}}_h$ can still be very close to the sharp identified set. The proximity of the outer region to the sharp identified set depends on the data generating process of $Y$.

Theorem 2 As $K$ or $\mathrm{var}(Y)$ increases, the outer region delivered by $\tilde{\mathcal{Q}}_h$ converges to the sharp identified set of the thresholds $\{p_m(x)\}_{m \in \mathcal{M}, x \in \mathcal{R}_X}$.

Theorem 2 implies that the use of $\tilde{\mathcal{Q}}_h$ is particularly beneficial if the supports of $Y$ and $X$ are rich or the impact of $X$ on $Y$ is large. In such cases the outer region is a good approximation of the identified set. As $X$ or $Y$ becomes less discrete, the identified set tends to be smaller, and so does the outer region.

Remark 3 If a large outer region excluding $\beta = 0$ is obtained, further refinement of the outer region is available via Corollary 2. How much it can be refined is a question left to numerical exercises.

For computational feasibility, attention is restricted to a finite subset of $\mathcal{M}$, as $\mathcal{M}$ is infinite. Since $p_m(x)$ converges to 1 as $m$ goes to infinity, one can find a large enough $m$ at which $p_m(x)$ is very close to 1 for all $x$. For identification in the numerical examples, $\bar{m}$ is defined as follows:

$$\bar{m} \equiv \min\{n : \min_{x \in \mathcal{R}_X} p_n(x) > 1 - \epsilon\} \quad \text{for a small } \epsilon > 0.$$

Values greater than $\bar{m}$ are almost never realised, so ignoring them has a negligible effect on the size of the identified set. When it comes to estimation, one can use the largest realisation of $Y$ in the data. The subset of $\mathcal{M}$ of interest is now $\bar{\mathcal{M}} \equiv \{0, 1, 2, \dots, \bar{m}\}$, and $\mathcal{Q}_h$ and $\tilde{\mathcal{Q}}_h$ are redefined with respect to $\bar{\mathcal{M}}$ rather than $\mathcal{M}$. The following examples show the improvement in computation achieved by using $\tilde{\mathcal{Q}}_h$ rather than $\mathcal{Q}_h$. The number of elements of $\tilde{\mathcal{Q}}_h$ is $2\bar{m}K$, where $K = |\mathcal{R}_X|$.

Example 1 Suppose that $\bar{m} = 2$. Then the support of the $U$-level set is
$$\bar{\mathcal{U}}_h \equiv \{[0, p_1(x)],\, [p_1(x), p_2(x)],\, [p_2(x), 1] : x \in \mathcal{R}_X\}.$$
All disconnected unions and the unit interval are excluded from $\mathcal{Q}_h$. Therefore, all possible unions in $\mathcal{Q}_h$ are
$$[0, p_1(x)],\; [0, p_2(x)],\; [p_1(x), p_2(x')],\; \dots,\; [p_1(x), 1],\; [p_2(x), 1], \qquad x, x' \in \mathcal{R}_X.$$
The number of intervals in $\mathcal{Q}_h$ is then $5K + K(K-1)$. Suppose that the computing time for an information bound induced by each interval is $T$.
Let $\bar{T}$ and $\tilde{T}$ denote the total computing times with $\mathcal{Q}_h$ and $\tilde{\mathcal{Q}}_h$ respectively. Then
$$\bar{T} = [5K + K(K-1)]T, \qquad \tilde{T} = 4KT, \qquad \bar{T}/\tilde{T} = \frac{4 + K}{4} = 1 + \frac{K}{4}.$$
$X$ is not constant, so $K \ge 2$ and $\bar{T} > \tilde{T}$. As $K$ increases, the speed-up achieved by $\tilde{\mathcal{Q}}_h$ grows. If $K = 4$, the computation with $\tilde{\mathcal{Q}}_h$ is 2 times faster than that with $\mathcal{Q}_h$.

Example 2 Suppose that $\bar{m} = 3$. Then
$$\bar{\mathcal{U}}_h \equiv \{[0, p_1(x)],\, [p_1(x), p_2(x)],\, [p_2(x), p_3(x)],\, [p_3(x), 1] : x \in \mathcal{R}_X\}.$$
The number of unions in $\mathcal{Q}_h$ is $9K + 3K(K-1)$. Then
$$\bar{T} = [9K + 3K(K-1)]T, \qquad \tilde{T} = 6KT, \qquad \bar{T}/\tilde{T} = 1 + \frac{K}{2}.$$
Therefore $\bar{T} > \tilde{T}$ for all $K \ge 2$. As $K$ increases, the improvement achieved by $\tilde{\mathcal{Q}}_h$ grows. If $K = 4$, the computation with $\tilde{\mathcal{Q}}_h$ is 3 times faster.

Given $\bar{m}$ and $K$, the number of elements of $\mathcal{Q}_h$ is $2\bar{m}K + K^2 \frac{\bar{m}(\bar{m}-1)}{2}$. Accordingly, the ratio of $\bar{T}$ to $\tilde{T}$ is $\frac{K(\bar{m}-1)}{4} + 1$, which increases rapidly as $K$ and $\bar{m}$ go up. Therefore, the computational gain from using $\tilde{\mathcal{Q}}_h$ becomes greater when $K$ and $\bar{m}$ are large. $\tilde{\mathcal{Q}}_h$ is also very straightforward to apply, as one need not keep track of the ordering of the threshold functions.

Remark 4 Using $\tilde{\mathcal{Q}}_h$ may be beneficial even when it is not core determining. It is plausible that the primary identifying power comes from only a subcollection of the CDTS. In some numerical examples, more than half of the sets in $\mathcal{Q}_h$ are shown to deliver only negligible information about the identified set. Therefore, $\tilde{\mathcal{Q}}_h$ can be numerically non-dominated by $\mathcal{Q}_h$.

Remark 5 Estimation of the identified set sometimes involves only a subcollection of the CDTS because the full collection would deliver an empty set due to finite sample bias. In count data models this is more likely to happen, as the support of the outcome is rich and the number of CDTS is large. Therefore, the use of $\tilde{\mathcal{Q}}_h$ is justifiable even though it does not always guarantee the core determining property.

3.3 Additional heterogeneity

Unobserved heterogeneity $U$ has so far been assumed to be a random scalar. Suppose that some elements of $X$ are unobserved. As these are relevant variables in the structural function of $Y$, an omitted variable problem arises.
This may happen in practice, so it needs to be accommodated in the model (10). Let the scalar unobserved heterogeneity assumption be relaxed: there is now two-dimensional unobserved heterogeneity $(U, V)$, where $U$ is a uniformly distributed latent variable as before and $V$ is a continuously or discretely distributed characteristic which is unobservable and hence omitted. The model (10) is modified as follows:

$$Y = h(X, U, V) = \begin{cases} 0 & \text{if } p_0(X, V) \le U \le p_1(X, V) \\ 1 & \text{if } p_1(X, V) < U \le p_2(X, V) \\ \ \vdots & \\ m & \text{if } p_m(X, V) < U \le p_{m+1}(X, V) \\ \ \vdots & \end{cases} \tag{13}$$

where $p_0(X, V)$ is normalised to 0. If $V$ were observable, the set of threshold functions $\{p_m(x, v)\}_{m \in \mathcal{M}, v \in \mathcal{R}_V, x \in \mathcal{R}_X}$ would be the object of identification. Suppose that $X$ is independent of $(U, V)$ and that $U$ and $V$ are mutually independent. Under these assumptions, without observing $V$, there is no hope of identifying $p_m(x, v)$ itself. As $U \sim \mathrm{Unif}(0, 1)$, one may specify $p_m(x, v) = \sum_{y=0}^{m-1} P[Y = y \mid X = x, V = v]$. This probability cannot be identified without conditioning on $V$, but one can identify the averaged threshold functions
$$\bar{p}_m(x) \equiv \int_{\mathcal{R}_V} p_m(x, v) f_V(v)\, dv.$$
As $X$ and $V$ are independent of each other, $f_{V|X} = f_V$ and $f_{XV} = f_X f_V$. Therefore $\bar{p}_m(x)$ equals $P[Y \le m-1 \mid X = x]$:

$$\bar{p}_m(x) = \int_{\mathcal{R}_V} \sum_{y=0}^{m-1} P[Y = y \mid X = x, V = v]\, f_{V|X}(v \mid x)\, dv = \sum_{y=0}^{m-1} \int_{\mathcal{R}_V} \frac{f_{YXV}(y, x, v)}{f_{XV}(x, v)}\, \frac{f_{XV}(x, v)}{f_X(x)}\, dv = \sum_{y=0}^{m-1} \int_{\mathcal{R}_V} f_{YV|X}(y, v \mid x)\, dv = \sum_{y=0}^{m-1} P[Y = y \mid X = x].$$

In the case where $X$ is not independent of $V$, the marginal response of $Y$ to $X$ is not separately identified from the effect of $V$, as $\bar{p}_m(x) \ne P[Y \le m-1 \mid X = x]$; $\bar{p}_m(x)$ is now

the counterfactual cdf of $Y$ conditional on $X$. However, one can set-identify $\bar{p}_m(x)$ as before given the existence of a relevant instrument $Z$. Given the values of $Y$ and $X$, the location of $(V, U)$ is partially identified on $\mathcal{R}_V \times [0, 1]$:

$$\mathcal{U}(m, x; h) = \{(v, [p_{m-1}(x, v), p_m(x, v)]) : v \in \mathcal{R}_V\}.$$

Therefore $\mathcal{U}_h$, the support of the $U$-level set, is defined as follows:

$$\mathcal{U}_h = \{(v, [0, p_1(x, v)]), \dots, (v, [p_{m-1}(x, v), p_m(x, v)]), \dots : x \in \mathcal{R}_X,\, v \in \mathcal{R}_V\}.$$

Let $\mathcal{Q}_h$ be the collection of all connected unions of elements of $\mathcal{U}_h$. Then $\mathcal{Q}_h$ is the collection of CDTS. Under the condition that $Z \perp (V, U)$:

Corollary 4 Given the joint distribution of $(Y, X, Z)$ and the model (13), the identified set for the structural function $h$ is characterised as
$$\mathcal{H} = \{h : \forall S \in \mathcal{Q}_h,\; C_h(S \mid z) \le G_{VU}(S) \text{ a.e. } z \in \mathcal{R}_Z\}.$$

However, this characterisation is hard to use in practice. As $V$ is possibly continuously distributed, there are infinitely many elements in $\mathcal{Q}_h$ even when $\mathcal{M}$ is bounded and small. Furthermore, the location of $V$ cannot be inferred from the values of $Y$, $X$ and $U$, since no distributional assumption is imposed. To make identification feasible, let $\tilde{\mathcal{Q}}_h$ be the subcollection of $\mathcal{Q}_h$ given by
$$\tilde{\mathcal{Q}}_h \equiv \{S^V_1(m, x),\, S^V_2(m, x) : m \in \mathcal{M},\, x \in \mathcal{R}_X\},$$
where $S^V_1(m, x) \equiv \{(v, [0, p_m(x, v)]) : v \in \mathcal{R}_V\}$ and $S^V_2(m, x) \equiv \{(v, [p_m(x, v), 1]) : v \in \mathcal{R}_V\}$. Then the following lemma characterises the outer region for the threshold functions $\bar{p}_m(x)$.

Lemma 2 For all $m \in \mathcal{M}$ and $x \in \mathcal{R}_X$, the outer region for $\bar{p}_m(x)$ is
$$\sup_{z \in \mathcal{R}_Z} P[Y \le m-1 \mid X = x, Z = z] \;\le\; \bar{p}_m(x) \;\le\; \inf_{z \in \mathcal{R}_Z} P[Y \le m \mid X = x, Z = z].$$

4 Identified sets with parametric restrictions

Identification analysis is now demonstrated on different data generating processes. To avoid dealing with the tremendous number of orderings, parametric restrictions (Poisson and negative binomial) are imposed. The model remains set identifying under these restrictions, but

the size of the identified set in the numerical examples tends to be small enough. A parametric restriction allows all the threshold functions $\{\{p_m(x)\}_{m=1}^{\bar{m}}\}_{x \in \mathcal{R}_X}$ to be generated by a small number of structural parameters, and the ordering is given. Let $\hat{\mathcal{P}}$ denote the approximation of the identified set (I call it the identified set henceforth, though it is not necessarily sharp) delivered by $\tilde{\mathcal{Q}}_h$. Suppose that $r$ is the number of structural parameters. Then the algorithm for identification is as follows.

1. Define dense grid points on $\mathbb{R}^r$. Let $\Theta$ denote the set of grid points, $\Theta \equiv \{\theta_1, \theta_2, \dots, \theta_J\}$, where $J$ is the number of grid points in $\Theta$.
2. Generate the thresholds $\{\{p_m(x)\}_{m=1}^{\bar{m}}\}_{x \in \mathcal{R}_X}$ using $\theta_i$. The ordering $\ell_i$ of the threshold values is then given.
3. Compute upper and lower bounds of the threshold functions using the given ordering $\ell_i$ and Corollary 3.
4. Check whether all the threshold functions lie between their lower and upper bounds. If so, include $\theta_i$ in $\hat{\mathcal{P}}$; otherwise $\theta_i \notin \hat{\mathcal{P}}$.
5. Repeat the above steps for all $i = 1, \dots, J$.

This algorithm is used to deliver the identified sets for the various data generating processes considered throughout this paper.

4.1 Poisson restriction

Triangular DGP

A recursive data generating process (DGP) is specified. This triangular system is particularly useful for identification analysis in the sense that it involves less computational burden.[7]

$$Z^* \sim N(0, 1), \qquad \begin{pmatrix} \varepsilon \\ V \end{pmatrix} \sim N\!\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \gamma \\ \gamma & 1 \end{pmatrix}\right),$$

and $Z^*$ is independent of $\varepsilon$ and $V$. $X$ and $Z$ are binary variables generated as follows:
$$Z = 1[Z^* \ge 0], \qquad X = 1[X^* \ge 0], \qquad X^* = \delta_1 + \delta_2 Z + V.$$

[7] An alternative DGP is a joint Gaussian system where $X^*$, $Z^*$ and $\varepsilon$ are all generated by a joint normal distribution. This DGP implements full simultaneity between the variables. It has more variables in the multivariate normal distribution, so computation of the identified set takes much longer than under the triangular DGP.
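The grid-search algorithm above, specialised to the Poisson restriction of this section, can be sketched as follows. The bound functions `lower_bound` and `upper_bound` are placeholders for the intersection bounds of Corollary 3, which depend on the observed distribution and are left abstract here:

```python
import math

def poisson_cdf(m, lam):
    """P[Y <= m] for Y ~ Poisson(lam): the threshold p_{m+1}(x) in equation (14)."""
    return sum(math.exp(-lam) * lam**y / math.factorial(y) for y in range(m + 1))

def thresholds(alpha, beta, x, m_bar):
    """Threshold values p_1(x), ..., p_{m_bar}(x) with lam = exp(alpha + beta * x)."""
    lam = math.exp(alpha + beta * x)
    return [poisson_cdf(m - 1, lam) for m in range(1, m_bar + 1)]

def identified_set(grid, support_x, m_bar, lower_bound, upper_bound):
    """Steps 2-5: keep theta = (alpha, beta) whose implied thresholds satisfy
    lower_bound(m, x) <= p_m(x) <= upper_bound(m, x) for all m and x."""
    kept = []
    for alpha, beta in grid:
        ok = all(lower_bound(m, x) <= p <= upper_bound(m, x)
                 for x in support_x
                 for m, p in enumerate(thresholds(alpha, beta, x, m_bar), start=1))
        if ok:
            kept.append((alpha, beta))
    return kept
```

With the trivial bounds 0 and 1 every grid point is retained; informative intersection bounds computed from the data shrink the retained set toward the identified set.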

Unobserved heterogeneity $\varepsilon$ is marginalised, i.e. $U \equiv \Phi(\varepsilon)$. To generate a count outcome, the Poisson cumulative distribution function (cdf) is used for $p_m(X)$. The threshold values $\{p_m(X)\}_{m \in \mathcal{M}}$ are generated as follows:

$$p_{m+1}(X) = \sum_{y=0}^{m} \exp(-\exp(\alpha + \beta X))\, \frac{\exp(\alpha + \beta X)^y}{y!} \tag{14}$$

Then a function $g(\cdot \mid X)$ generates $Y$ by taking $U$ as an argument. This function is distinct from the conditional quantile function of $Y$ given $X$, even though the expression is very similar, because $p_m(X)$ is not the conditional cdf of $Y$ but the counterfactual conditional cdf:

$$g(\tau \mid X) \equiv \inf\{m : p_{m+1}(X) \ge \tau\}, \qquad Y = g(U \mid X).$$

For identification of the parameter values, the Poisson restriction is imposed within the algorithm introduced above.

This DGP is convenient for understanding identification at infinity. If $Z$ is a perfect predictor for $X$, the identified set collapses to a point as the endogeneity of $X$ disappears. The size of the identified set varies with the values of $\delta_1$ and $\delta_2$. Let $J$ be a positive real number. When $\delta_1 = -J$ and $\delta_2 = 2J$, $P[X = z \mid Z = z] \to 1$ as $J \to \infty$; $Z$ becomes a near-perfect predictor for $X$ already at values of $J$ smaller than 5. The first IV, classified as moderate, has $\delta_1 = 0$ and $\delta_2 = 1$. The strong and super strong IVs have $J$ equal to 2 and 4 respectively.

[Figure 1: Identified sets under triangular structures when E[Y] = 3.9]

Figure 1 shows the identified sets associated with these instruments. Considering the scale of the figure, the sets are very small. For the moderate IV, $\alpha$ and $\beta$ lie in $[0.497, 0.536]$ and $[0.965, 1.003]$ respectively. Each point of the identified set is linked to a collection of counterfactual conditional cdfs of $Y$ given $X$, so all the features of interest can be computed. The interval identified ATE is $[1.754, 1.860]$. For the strong and super strong IVs, the identified sets are extremely small: the ATEs lie in $[1.763, 1.766]$ and $[1.7634,\ ]$ respectively. For larger values of $J$, e.g. 5, the parameters are point identified.

The moment based approaches do not provide correct information about the true parameters.

[Figure 2: Moment based results and the identified set with the moderate IV]

Figure 2 shows the point estimates delivered by the moment conditions on a large sample ($n = 100{,}000$) generated by the triangular DGP. They all lie outside the identified sets and far away from the true point. This is natural, because the moment conditions are in general not satisfied under the DGP specified here.

The identified sets are small even if the IV is very weak. The strong identification power

is from the rich support of $Y$: under the true parameter values, $Y$ takes values from 0 to 17. In the case where $\alpha = 0.1$ and $\beta = 0.1$, the mean of $Y$ is fairly small and so is the variance ($E[Y] = 1.18$). The support of $Y$ is now $\{0, \dots, 9\}$. This represents the more unfavourable cases in practice where the support of $Y$ is not rich. Note that under these parameter values Restriction 5 is satisfied, and hence $\tilde{\mathcal{Q}}_h$ is core determining. Figure 3 shows the identified sets. They are in general large unless the instruments are very strong. However, even with $Z$ having no correlation with $X$, the sign of the ATE is correctly identified. When $\delta_1 = \delta_2 = 0$, the identified sets for the parameters and the ATE are
$$\alpha \in [-0.25, 0.135], \qquad \beta \in [0.06, 0.59], \qquad \exp(\alpha + \beta) - \exp(\alpha) \in [0.071, 0.626],$$
where the true ATE is .

[Figure 3: Identified sets under triangular structures when E[Y] = 1.18]

This result looks surprising. The GIV framework employed here does not require the rank condition, so it is applicable in cases where an instrument is independent of $U$ but has no predictive power for $X$. However, this identification power does not entirely

come from this framework. In the example, $X$ is positively correlated with $U$. Therefore, the observable joint distribution of $Y$, $X$ and $Z$ does not allow for negative values of $\beta$. If $X$ is negatively correlated with $U$, the identification power disappears: let the correlation parameter $\gamma$ be $-0.5$; then the identified set contains grid points on which $\beta$ is negative, so in such a case the identified set is not informative about the ATE at all.

One other interesting experiment is to evaluate the identifying power of each interval in $\tilde{\mathcal{Q}}_h$. This exercise answers the question of where the identifying power primarily comes from. Under the triangular DGP with $\alpha = 1$ and $\beta = 0.5$, $\bar{m}$ is 17. For numerical identification, I use $\tilde{\mathcal{Q}}_h = \{[0, p_m(x)], [p_m(x), 1] : 1 \le m \le 17,\, x \in \{0, 1\}\}$, which includes 68 intervals in total. Is it possible to deliver the same approximation of the identified set with a smaller number of intervals in $\tilde{\mathcal{Q}}_h$? The answer is shown in Figure 4.

[Figure 4: Identifying power of the intervals in $\tilde{\mathcal{Q}}_h$]

A collection of intervals $\{[0, p_m(X)], [p_m(X), 1]\}_{m=1}^{t}$ is defined, and the figure demonstrates the

outer regions delivered by various values of $t$. As $t$ increases, the outer region converges to the identified set. Note that, given the scale of the figure, convergence is achieved at $t = 9$. This means that essentially all the identifying power comes from the first 9 values of $\mathcal{M}$, and the additional information provided by higher values is very marginal. This experiment shows that the identifying power of $\tilde{\mathcal{Q}}_h$ is strong enough to deliver a good approximation of the identified set even when $K = 2$ and $\beta$ is not large; $\tilde{\mathcal{Q}}_h$ does contain some redundant intervals which provide only marginal information on the parameters, even though we are not aware of them ex ante.

Lastly, the discreteness of $Y$ is pivotal for the size of the identified set. As the number of points in $\mathrm{supp}(Y)$ increases, the discreteness disappears.

[Figure 5: Size variation of identified sets]

Figure 5 demonstrates the size variation of the identified sets with respect to the richness of $\mathrm{supp}(Y)$. The IV used in identification is moderate ($\delta_1 = 0$, $\delta_2 = 1$). The size of the identified set shrinks with the mean of $Y$. When $E[Y] = 8.17$, the identified set is effectively a point. Therefore, if there is high dispersion in the count data, the identified set is close to a point.
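The link between the mean of $Y$ and the richness of its effective support can be illustrated directly from the Poisson thresholds. In the sketch below, the cutoff 0.9999 is illustrative, standing in for the "very close to 1" criterion defining $\bar{m}$:

```python
import math

def effective_support_size(lam, cutoff=0.9999):
    """Smallest n with P[Y <= n] > cutoff for Y ~ Poisson(lam):
    values above n are almost never realised."""
    cdf, n = 0.0, 0
    while True:
        cdf += math.exp(-lam) * lam**n / math.factorial(n)
        if cdf > cutoff:
            return n
        n += 1
```

The effective support grows with $\lambda = E[Y]$, so a larger mean of $Y$ means a richer support and, per Figure 5, a smaller identified set.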

Three points in the support of X

The size of the identified set tends to be smaller as the support of $X$ becomes richer. The reason is straightforward: the number of conditional moment inequalities increases. The set geometry is therefore studied here with a higher dimensional $X$. The previous DGP is preserved, but now $X$ is trinary:

$$X = \begin{cases} -1 & \text{if } X^* \in (-\infty, -1) \\ \ \ 0 & \text{if } X^* \in [-1, 1) \\ \ \ 1 & \text{if } X^* \in [1, \infty) \end{cases}$$

[Figure 6: The identified set when X is trinary]

Figure 6 shows the identification results with the IVs. The shapes of the identified sets are rather different from the binary $X$ case, where the sets are in general parallelograms; now the shapes are more like general polygons. The projections of the identified set show that, for a given strength of the instrument, the richer support of $X$ delivers a much smaller set. For the moderate IV, $\alpha$ and $\beta$ lie in $[0.488, 0.504]$ and $[0.996, 1.010]$ respectively, which are indeed
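The trinary discretisation above can be simulated directly. A minimal sketch follows; the helper name and the moderate-IV parameter values ($\delta_1 = 0$, $\delta_2 = 1$) are illustrative:

```python
import random

def draw_trinary_x(delta1, delta2, z, rng):
    """Discretise the latent index X* = delta1 + delta2*Z + V into {-1, 0, 1}
    using the cutoffs -1 and 1, with V ~ N(0, 1)."""
    x_star = delta1 + delta2 * z + rng.gauss(0.0, 1.0)
    if x_star < -1.0:
        return -1
    if x_star < 1.0:
        return 0
    return 1

# Draw Z = 1[Z* >= 0] with Z* ~ N(0, 1), then X given Z.
rng = random.Random(0)
draws = [draw_trinary_x(0.0, 1.0, int(rng.gauss(0.0, 1.0) >= 0), rng)
         for _ in range(2000)]
```

All three support points occur with non-negligible probability under these parameter values, which is what generates the additional conditional moment inequalities relative to the binary case.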


More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Most of this course will be concerned with use of a regression model: a structure in which one or more explanatory

More information

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. The Basic Methodology 2. How Should We View Uncertainty in DD Settings?

More information

Lecture 8. Roy Model, IV with essential heterogeneity, MTE

Lecture 8. Roy Model, IV with essential heterogeneity, MTE Lecture 8. Roy Model, IV with essential heterogeneity, MTE Economics 2123 George Washington University Instructor: Prof. Ben Williams Heterogeneity When we talk about heterogeneity, usually we mean heterogeneity

More information

On IV estimation of the dynamic binary panel data model with fixed effects

On IV estimation of the dynamic binary panel data model with fixed effects On IV estimation of the dynamic binary panel data model with fixed effects Andrew Adrian Yu Pua March 30, 2015 Abstract A big part of applied research still uses IV to estimate a dynamic linear probability

More information

Lectures on Identi cation 2

Lectures on Identi cation 2 Lectures on Identi cation 2 Andrew Chesher CeMMAP & UCL April 16th 2008 Andrew Chesher (CeMMAP & UCL) Identi cation 2 4/16/2008 1 / 28 Topics 1 Monday April 14th. Motivation, history, de nitions, types

More information

Graduate Econometrics I: What is econometrics?

Graduate Econometrics I: What is econometrics? Graduate Econometrics I: What is econometrics? Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: What is econometrics?

More information

Economics 536 Lecture 21 Counts, Tobit, Sample Selection, and Truncation

Economics 536 Lecture 21 Counts, Tobit, Sample Selection, and Truncation University of Illinois Fall 2016 Department of Economics Roger Koenker Economics 536 Lecture 21 Counts, Tobit, Sample Selection, and Truncation The simplest of this general class of models is Tobin s (1958)

More information

An Instrumental Variable Model of Multiple Discrete Choice

An Instrumental Variable Model of Multiple Discrete Choice An Instrumental Variable Model of Multiple Discrete Choice Andrew Chesher y UCL and CeMMAP Adam M. Rosen z UCL and CeMMAP February, 20 Konrad Smolinski x UCL and CeMMAP Abstract This paper studies identi

More information

Using all observations when forecasting under structural breaks

Using all observations when forecasting under structural breaks Using all observations when forecasting under structural breaks Stanislav Anatolyev New Economic School Victor Kitov Moscow State University December 2007 Abstract We extend the idea of the trade-off window

More information

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. The Sharp RD Design 3.

More information

Empirical Processes: General Weak Convergence Theory

Empirical Processes: General Weak Convergence Theory Empirical Processes: General Weak Convergence Theory Moulinath Banerjee May 18, 2010 1 Extended Weak Convergence The lack of measurability of the empirical process with respect to the sigma-field generated

More information

Some Background Material

Some Background Material Chapter 1 Some Background Material In the first chapter, we present a quick review of elementary - but important - material as a way of dipping our toes in the water. This chapter also introduces important

More information

A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II Jeff Wooldridge IRP Lectures, UW Madison, August 2008 5. Estimating Production Functions Using Proxy Variables 6. Pseudo Panels

More information

ECNS 561 Multiple Regression Analysis

ECNS 561 Multiple Regression Analysis ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking

More information

A Course in Applied Econometrics. Lecture 10. Partial Identification. Outline. 1. Introduction. 2. Example I: Missing Data

A Course in Applied Econometrics. Lecture 10. Partial Identification. Outline. 1. Introduction. 2. Example I: Missing Data Outline A Course in Applied Econometrics Lecture 10 1. Introduction 2. Example I: Missing Data Partial Identification 3. Example II: Returns to Schooling 4. Example III: Initial Conditions Problems in

More information

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit R. G. Pierse 1 Introduction In lecture 5 of last semester s course, we looked at the reasons for including dichotomous variables

More information

ON THE EQUIVALENCE OF CONGLOMERABILITY AND DISINTEGRABILITY FOR UNBOUNDED RANDOM VARIABLES

ON THE EQUIVALENCE OF CONGLOMERABILITY AND DISINTEGRABILITY FOR UNBOUNDED RANDOM VARIABLES Submitted to the Annals of Probability ON THE EQUIVALENCE OF CONGLOMERABILITY AND DISINTEGRABILITY FOR UNBOUNDED RANDOM VARIABLES By Mark J. Schervish, Teddy Seidenfeld, and Joseph B. Kadane, Carnegie

More information

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure A Robust Approach to Estimating Production Functions: Replication of the ACF procedure Kyoo il Kim Michigan State University Yao Luo University of Toronto Yingjun Su IESR, Jinan University August 2018

More information

Binary Choice Models with Discrete Regressors: Identification and Misspecification

Binary Choice Models with Discrete Regressors: Identification and Misspecification Binary Choice Models with Discrete Regressors: Identification and Misspecification Tatiana Komarova London School of Economics May 24, 2012 Abstract In semiparametric binary response models, support conditions

More information

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F

More information

Weak Stochastic Increasingness, Rank Exchangeability, and Partial Identification of The Distribution of Treatment Effects

Weak Stochastic Increasingness, Rank Exchangeability, and Partial Identification of The Distribution of Treatment Effects Weak Stochastic Increasingness, Rank Exchangeability, and Partial Identification of The Distribution of Treatment Effects Brigham R. Frandsen Lars J. Lefgren December 16, 2015 Abstract This article develops

More information

Censored quantile instrumental variable estimation with Stata

Censored quantile instrumental variable estimation with Stata Censored quantile instrumental variable estimation with Stata Victor Chernozhukov MIT Cambridge, Massachusetts vchern@mit.edu Ivan Fernandez-Val Boston University Boston, Massachusetts ivanf@bu.edu Amanda

More information

Generated Covariates in Nonparametric Estimation: A Short Review.

Generated Covariates in Nonparametric Estimation: A Short Review. Generated Covariates in Nonparametric Estimation: A Short Review. Enno Mammen, Christoph Rothe, and Melanie Schienle Abstract In many applications, covariates are not observed but have to be estimated

More information

Marginal Specifications and a Gaussian Copula Estimation

Marginal Specifications and a Gaussian Copula Estimation Marginal Specifications and a Gaussian Copula Estimation Kazim Azam Abstract Multivariate analysis involving random variables of different type like count, continuous or mixture of both is frequently required

More information

Generating p-extremal graphs

Generating p-extremal graphs Generating p-extremal graphs Derrick Stolee Department of Mathematics Department of Computer Science University of Nebraska Lincoln s-dstolee1@math.unl.edu August 2, 2011 Abstract Let f(n, p be the maximum

More information

Identifying Effects of Multivalued Treatments

Identifying Effects of Multivalued Treatments Identifying Effects of Multivalued Treatments Sokbae Lee Bernard Salanie The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP72/15 Identifying Effects of Multivalued Treatments

More information

Missing dependent variables in panel data models

Missing dependent variables in panel data models Missing dependent variables in panel data models Jason Abrevaya Abstract This paper considers estimation of a fixed-effects model in which the dependent variable may be missing. For cross-sectional units

More information

Nonparametric Instrumental Variables Identification and Estimation of Nonseparable Panel Models

Nonparametric Instrumental Variables Identification and Estimation of Nonseparable Panel Models Nonparametric Instrumental Variables Identification and Estimation of Nonseparable Panel Models Bradley Setzler December 8, 2016 Abstract This paper considers identification and estimation of ceteris paribus

More information

A Rothschild-Stiglitz approach to Bayesian persuasion

A Rothschild-Stiglitz approach to Bayesian persuasion A Rothschild-Stiglitz approach to Bayesian persuasion Matthew Gentzkow and Emir Kamenica Stanford University and University of Chicago December 2015 Abstract Rothschild and Stiglitz (1970) represent random

More information

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory Part V 7 Introduction: What are measures and why measurable sets Lebesgue Integration Theory Definition 7. (Preliminary). A measure on a set is a function :2 [ ] such that. () = 2. If { } = is a finite

More information

Poisson Regression. Ryan Godwin. ECON University of Manitoba

Poisson Regression. Ryan Godwin. ECON University of Manitoba Poisson Regression Ryan Godwin ECON 7010 - University of Manitoba Abstract. These lecture notes introduce Maximum Likelihood Estimation (MLE) of a Poisson regression model. 1 Motivating the Poisson Regression

More information

Empirical approaches in public economics

Empirical approaches in public economics Empirical approaches in public economics ECON4624 Empirical Public Economics Fall 2016 Gaute Torsvik Outline for today The canonical problem Basic concepts of causal inference Randomized experiments Non-experimental

More information

A General Overview of Parametric Estimation and Inference Techniques.

A General Overview of Parametric Estimation and Inference Techniques. A General Overview of Parametric Estimation and Inference Techniques. Moulinath Banerjee University of Michigan September 11, 2012 The object of statistical inference is to glean information about an underlying

More information

Identification and Estimation of Marginal Effects in Nonlinear Panel Models 1

Identification and Estimation of Marginal Effects in Nonlinear Panel Models 1 Identification and Estimation of Marginal Effects in Nonlinear Panel Models 1 Victor Chernozhukov Iván Fernández-Val Jinyong Hahn Whitney Newey MIT BU UCLA MIT February 4, 2009 1 First version of May 2007.

More information

Supplementary material to: Tolerating deance? Local average treatment eects without monotonicity.

Supplementary material to: Tolerating deance? Local average treatment eects without monotonicity. Supplementary material to: Tolerating deance? Local average treatment eects without monotonicity. Clément de Chaisemartin September 1, 2016 Abstract This paper gathers the supplementary material to de

More information

A Guide to Modern Econometric:

A Guide to Modern Econometric: A Guide to Modern Econometric: 4th edition Marno Verbeek Rotterdam School of Management, Erasmus University, Rotterdam B 379887 )WILEY A John Wiley & Sons, Ltd., Publication Contents Preface xiii 1 Introduction

More information

BAYESIAN INFERENCE IN A CLASS OF PARTIALLY IDENTIFIED MODELS

BAYESIAN INFERENCE IN A CLASS OF PARTIALLY IDENTIFIED MODELS BAYESIAN INFERENCE IN A CLASS OF PARTIALLY IDENTIFIED MODELS BRENDAN KLINE AND ELIE TAMER UNIVERSITY OF TEXAS AT AUSTIN AND HARVARD UNIVERSITY Abstract. This paper develops a Bayesian approach to inference

More information

Khinchin s approach to statistical mechanics

Khinchin s approach to statistical mechanics Chapter 7 Khinchin s approach to statistical mechanics 7.1 Introduction In his Mathematical Foundations of Statistical Mechanics Khinchin presents an ergodic theorem which is valid also for systems that

More information

Introduction to Real Analysis Alternative Chapter 1

Introduction to Real Analysis Alternative Chapter 1 Christopher Heil Introduction to Real Analysis Alternative Chapter 1 A Primer on Norms and Banach Spaces Last Updated: March 10, 2018 c 2018 by Christopher Heil Chapter 1 A Primer on Norms and Banach Spaces

More information

Endogeneity and Discrete Outcomes. Andrew Chesher Centre for Microdata Methods and Practice, UCL & IFS. Revised April 2nd 2008

Endogeneity and Discrete Outcomes. Andrew Chesher Centre for Microdata Methods and Practice, UCL & IFS. Revised April 2nd 2008 Endogeneity and Discrete Outcomes Andrew Chesher Centre for Microdata Methods and Practice, UCL & IFS Revised April 2nd 2008 Abstract. This paper studies models for discrete outcomes which permit explanatory

More information

Quantile Regression for Panel Data Models with Fixed Effects and Small T : Identification and Estimation

Quantile Regression for Panel Data Models with Fixed Effects and Small T : Identification and Estimation Quantile Regression for Panel Data Models with Fixed Effects and Small T : Identification and Estimation Maria Ponomareva University of Western Ontario May 8, 2011 Abstract This paper proposes a moments-based

More information

Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data. Jeff Dominitz RAND. and

Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data. Jeff Dominitz RAND. and Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data Jeff Dominitz RAND and Charles F. Manski Department of Economics and Institute for Policy Research, Northwestern

More information

IV estimators and forbidden regressions

IV estimators and forbidden regressions Economics 8379 Spring 2016 Ben Williams IV estimators and forbidden regressions Preliminary results Consider the triangular model with first stage given by x i2 = γ 1X i1 + γ 2 Z i + ν i and second stage

More information

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics A short review of the principles of mathematical statistics (or, what you should have learned in EC 151).

More information

Simulation-based robust IV inference for lifetime data

Simulation-based robust IV inference for lifetime data Simulation-based robust IV inference for lifetime data Anand Acharya 1 Lynda Khalaf 1 Marcel Voia 1 Myra Yazbeck 2 David Wensley 3 1 Department of Economics Carleton University 2 Department of Economics

More information

The Instability of Correlations: Measurement and the Implications for Market Risk

The Instability of Correlations: Measurement and the Implications for Market Risk The Instability of Correlations: Measurement and the Implications for Market Risk Prof. Massimo Guidolin 20254 Advanced Quantitative Methods for Asset Pricing and Structuring Winter/Spring 2018 Threshold

More information

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 1/28 Why IV estimation? So far, in OLS, we assumed independence.

More information

An instrumental variable model of multiple discrete choice

An instrumental variable model of multiple discrete choice Quantitative Economics 4 (2013), 157 196 1759-7331/20130157 An instrumental variable model of multiple discrete choice Andrew Chesher Department of Economics, University College London and CeMMAP Adam

More information

EMERGING MARKETS - Lecture 2: Methodology refresher

EMERGING MARKETS - Lecture 2: Methodology refresher EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Econ 2148, fall 2017 Instrumental variables II, continuous treatment

Econ 2148, fall 2017 Instrumental variables II, continuous treatment Econ 2148, fall 2017 Instrumental variables II, continuous treatment Maximilian Kasy Department of Economics, Harvard University 1 / 35 Recall instrumental variables part I Origins of instrumental variables:

More information

Linear Models in Econometrics

Linear Models in Econometrics Linear Models in Econometrics Nicky Grant At the most fundamental level econometrics is the development of statistical techniques suited primarily to answering economic questions and testing economic theories.

More information

Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances

Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances Discussion Paper: 2006/07 Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances J.S. Cramer www.fee.uva.nl/ke/uva-econometrics Amsterdam School of Economics Department of

More information

STAT5044: Regression and Anova

STAT5044: Regression and Anova STAT5044: Regression and Anova Inyoung Kim 1 / 18 Outline 1 Logistic regression for Binary data 2 Poisson regression for Count data 2 / 18 GLM Let Y denote a binary response variable. Each observation

More information