Estimating the causal effect of fertility on economic wellbeing: data requirements, identifying assumptions and estimation methods


Empir Econ (2013) 44

Estimating the causal effect of fertility on economic wellbeing: data requirements, identifying assumptions and estimation methods

Bruno Arpino, Arnstein Aassve
Department of Decision Sciences, DONDENA Centre for Research on Social Dynamics, Bocconi University, Via Roentgen, Milan, Italy. E-mail: bruno.arpino@unibocconi.it
Received: 21 April 2008 / Accepted: 20 November 2009 / Published online: 14 March 2010
Springer-Verlag 2010

Abstract This article aims to assess to what extent fertility has a causal effect on households' economic wellbeing, an issue that has received considerable interest in development studies and policy analysis. However, only recently has this literature begun to give importance to adequate modelling for the estimation of causal effects. We discuss several strategies for causal inference, stressing that their validity must be judged on the assumptions we can plausibly formulate in a given application, which in turn depends on the richness of the available data. We contrast methods relying on the unconfoundedness assumption, which include regression and propensity score matching, with instrumental variable methods. This discussion has a general importance, representing a set of guidelines that are useful for choosing an appropriate strategy of analysis. The discussion is valid for both cross-sectional and panel data.

Keywords: Fertility, Poverty, Causal inference, Unconfoundedness, Instrumental variables, VLSMS
JEL Classification: D19, I32, J13

1 Introduction

There is a strong positive correlation between poverty and family size in most developing countries (Schoumaker and Tabutin 1999). Not much is known, however, about the extent to which fertility has a causal impact on households' wellbeing. Needless to say, the issue is of critical importance for implementing sound policies.

This article considers different strategies for establishing the causal effect of fertility on households' wellbeing. We take a quasi-experimental approach where fertility is considered as a treatment and the outcome is the equivalised household consumption expenditure. We adopt the potential outcomes framework (Neyman 1923; Rubin 1974, 1978), where recorded childbearing events are used as a measure of fertility. Consequently, each household i has two potential outcomes: $Y_{i1}$ if it experiences a childbearing event between two points in time (treated) and $Y_{i0}$ otherwise (untreated or control). However, childbearing is, at least in part, down to individual choice, giving rise to self-selection: households that choose to have more children (self-selected into the treatment) may be very different from households that choose to have fewer children, irrespective of the treatment. Hence, if we observe that the first group of households has on average lower per capita expenditure, we cannot necessarily assert that this is due to fertility, since the two groups of households are likely to differ in many other characteristics, such as education. Thus, a simple difference in the average consumption (or income) of the two groups of households gives a biased estimate.

We discuss different strategies to deal with the self-selection problem, stressing that their validity must be judged on the assumptions we can formulate in a given application, which in turn depends on the richness of the available data. A key distinction is between those situations where we can assume that selection depends only on characteristics that are observed by the researcher (selection on observables) and those situations where one or more of the relevant characteristics are unobserved (selection on unobservables). In the first case, we compare units with similar characteristics that differ only in their treatment status. For these units the observed difference in the outcome can reasonably be attributed to the treatment. Propensity score matching (PSM) relies on the selection on observables assumption, which is referred to as the unconfoundedness assumption (UNC). Multiple regression is also a method relying on this assumption, though its identifying assumption can be stated in a weaker way (see, e.g. Wooldridge 2002).

The empirical analyses use the Vietnam Living Standard Measurement Survey, a rich panel data set first surveyed in 1992/1993 with a follow-up in 1997/1998. Exploiting the longitudinal structure of the data, we develop our estimators in a pre-post treatment setting. This has several advantages. First, covariates are measured before the exposure to the treatment, which makes it more likely that covariates are not affected by the treatment (e.g. Rosenbaum 1984; Imbens 2004). A second advantage is that the lagged value of the outcome variable, $Y_{t1}$, can be included in the set of matching covariates, all of which are measured at the first wave. This is important because the household's level of living standards prior to treatment is relevant both for the probability of experiencing a childbearing event between the two waves and for the consumption expenditure level at the second wave, $Y_{t2}$. Having information at two points in time, the dependent variable can be defined as the difference between the levels of the outcome after and before the treatment.
In particular, we match individuals in the treatment group with individuals in the control group having similar first-period values, and their changes in outcomes are compared. An advantage of taking the difference in the pre- and post-treatment outcomes is that this helps remove residual imbalance in the average values of $Y_{t1}$ between the treated and control groups.

Moreover, it is likely (in our application at least) that the variance is lower when the outcome is defined as a change rather than as a level. Hence, the resulting estimator will be more efficient than one relying on levels. Importantly, we stress the fact that specifying the outcome as a difference does not change the estimand. The interest remains in the effect of childbearing events between the two waves on the consumption expenditure level at the second wave. Our approach is useful in the sense that the general discussion of methods based on selection on observables, as opposed to those allowing for selection on unobservables, applies independently of whether the application is based on longitudinal or cross-sectional data.

The standard solution to deal with selection on unobservables is to use an instrumental variable (IV) method, which of course relies on the availability of a good instrument; in our case this should be a variable that influences fertility but has no direct impact on consumption expenditure. However, even if such a variable is available, the estimator can be unsatisfactory. The reason is that, unless we are willing to impose very strong assumptions, IV estimates refer only to the unobserved sub-sample of the population that reacts to the chosen instrument, i.e. the compliers (Imbens and Angrist 1994; Angrist et al. 1996). The corresponding parameter estimate is, consequently, the local average treatment effect (LATE), which, in the presence of heterogeneous treatment effects, may be different from the average treatment effect (ATE) and the average treatment effect for the treated (ATT), which usually are the parameters of interest. This is of course important for policy analysis, since only if the instrument coincides with a variable of real policy relevance can we also argue that the estimated LATE has direct policy usefulness (Heckman 1997). Moreover, LATEs estimated with different instruments are generally different, because the identified sub-populations of compliers are different.

We implement the IV approach, demonstrating its benefits and drawbacks, by using two very different instruments. The first instrument is the sex composition of existing children. This is a widely used instrument (see, e.g. Angrist and Evans 1998; Chun and Oh 2002; Gupta and Dubey 2003) and is based on the fact that parents in Vietnam tend to have a strong preference for boys, especially in the North (Haughton and Haughton 1995; Johansson 1996, 1998; Belanger 2002). Since the preference for sons is a widespread phenomenon among Vietnamese households, we expect the proportion of compliers to be rather high. The second instrument is the availability of contraception at the community level. This is similar to other widely used instruments related to the availability of services in the neighbourhood or their distance from the dwelling (examples include McClellan et al. (1994), who use proximity to cardiac care centres, or Card (1995), who uses college proximity). An interesting aspect of this instrument is that it corresponds to a potential policy variable on which policy makers can act to reduce fertility and, through it, make an impact on poverty. However, areas without availability of contraceptives in Vietnam are few (Nguyen-Dinh 1997; Duy et al. 2001; Anh and Thang 2002), which means that it cannot be considered as a general policy tool.
From a statistical point of view, a key difference between the two instruments is that the second one cannot be considered as randomised, because the availability of contraception is related to other characteristics of the community, which in turn may influence households' wellbeing.

As a consequence, detailed control for covariates is required, which is usually accomplished by imposing a functional form and additive separability in the error term. However, these and other strong assumptions can be avoided by implementing a non-parametric approach, such as the one suggested by Frölich (2007).

Whereas in our application we have access to valid instruments, this is not always the case. In those situations, IV estimators cannot be used and it becomes important to implement sensitivity analysis for estimators based on selection on observables. So far, this is not very common in the applied literature, but it is a critical tool for assessing the credibility of the identifying assumption. The key idea of the approach is to evaluate how strong the associations between an unmeasured variable and the treatment and outcome variables must be in order to undermine the results of the analysis based on the UNC. If the results are highly sensitive, the validity of the identifying assumption becomes questionable. Among the different approaches for sensitivity analysis proposed in the literature, we discuss and apply those suggested by Rosenbaum (1987b) and Ichino et al. (2008).

The article is organised as follows. Section 2 reviews the statistical issues, Sect. 3 provides background information about the application, Sect. 4 shows the results, and Sect. 5 concludes.

2 Causal inference in observational studies under the potential outcomes approach

The potential outcomes approach was introduced by Neyman (1923) and extended by Rubin (1974) to observational studies. We invoke the stable unit treatment value assumption (SUTVA) (Rubin 1980), which states that the potential outcomes for each unit are not affected by the treatments assigned to any other units and that there are no hidden versions of the treatment. Potential outcomes are denoted by $Y_1$, to indicate the outcome that would have resulted if the unit was exposed to the treatment, and $Y_0$ if it was exposed to the control (Rosenbaum and Rubin 1983a). Since each unit receives only the treatment or the control, either $Y_1$ or $Y_0$ is observed for each unit. Assume that we have a random sample of N individual units under study, $\{d_i, y_i, x_i\}_{i=1}^{N}$. D represents the treatment indicator that takes the value 1 for treated units and 0 for the untreated or controls, Y indicates the observed outcome, and X indicates the set of covariates or confounders.

Footnote 1: As a convention, capital letters usually denote random variables, whereas small letters indicate their realisations. For simplicity, population units are usually not indexed by unit indicators unless this is necessary for clarity.

The two causal parameters usually of interest are the ATE and the ATT, which are defined as:

$ATE = E(Y_1 - Y_0)$   (1)

$ATT = E(Y_1 - Y_0 \mid D = 1)$.   (2)

The ATE is the expected effect of the treatment on a randomly drawn unit from the population, while the ATT gives the expected effect of the treatment on a randomly drawn unit from the population of the treated. It is consequently the parameter that tends to be of interest to policy makers (Heckman et al. 1997).

2.1 Identifying assumptions

A critical distinction is between those situations where selection depends only on observed characteristics and those where selection also depends on unobserved characteristics. The selection on observables assumption is also known as the UNC and represents the fundamental identifying assumption for a large range of empirical studies:

Assumption A.1 (Unconfoundedness) $(Y_1, Y_0) \perp D \mid X$

where $\perp$, in the notation introduced by Dawid (1979), indicates independence. Assumption A.1 implies that, after conditioning on the variables influencing both the selection and the outcome, the dependence between the potential outcomes and the treatment is cancelled out. Regression and matching techniques, as well as stratification and weighting methods, all rely on this assumption. In regression analysis, it suffices to assume that conditional independence of the potential outcomes from the treatment holds in expected values (see, e.g. Wooldridge 2002, p. 607). That is, we can substitute assumption A.1 with the weaker: $E(Y_1 \mid D, X) = E(Y_1 \mid X)$ and $E(Y_0 \mid D, X) = E(Y_0 \mid X)$. The fundamental idea behind these methods is to compare treated units with control units that are similar in their characteristics. Another assumption, termed overlap, is also required.

Assumption A.2 (Overlap) $0 < P(D = 1 \mid X) < 1$,

where $P(D = 1 \mid X)$ is the conditional probability of receiving the treatment given the covariates X. Assumption A.2 implies equality of the support of X in the two groups of treated and controls (i.e. $\mathrm{Supp}(X \mid D = 1) = \mathrm{Supp}(X \mid D = 0)$), which guarantees that the ATE is well defined (Heckman et al. 1997). If the assumption does not hold, then it is possible that for some values of the covariates there are no comparable units.

Footnote 2: The unconfoundedness assumption is sometimes referred to as the conditional independence or the exogeneity assumption (Imbens 2004).
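Assumption A.1 cannot be tested directly, but its practical content is easy to illustrate. The following minimal sketch is our own toy simulation in Python (the data-generating process, variable names and numbers are purely illustrative and are not taken from the article): a single observed confounder drives both treatment take-up and the outcome, so the naive difference in means is biased, whereas a regression that conditions on the confounder recovers the true effect, in line with the conditional mean-independence version of the assumption.

import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# One observed confounder (think of it as education); the true treatment effect is -2.
x = rng.normal(size=n)
p_treat = 1.0 / (1.0 + np.exp(-0.8 * x))        # selection into treatment depends on x only
d = rng.binomial(1, p_treat)
y0 = 10.0 + 3.0 * x + rng.normal(size=n)        # potential outcome without treatment
y1 = y0 - 2.0                                   # homogeneous treatment effect of -2
y = np.where(d == 1, y1, y0)                    # observed outcome

# Naive comparison of means: biased, because treated units have higher x on average.
naive = y[d == 1].mean() - y[d == 0].mean()

# Regression adjustment: conditioning on x removes the confounding under A.1.
design = np.column_stack([np.ones(n), d, x])
beta = np.linalg.lstsq(design, y, rcond=None)[0]

print(f"naive difference in means: {naive:.2f}")            # noticeably above -2
print(f"regression estimate of the effect: {beta[1]:.2f}")  # close to -2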

The most common approach to deal with selection on unobservables is to exploit the availability of an IV, a variable assumed to affect the selection into treatment but to have no direct influence on the outcome. The concrete possibility of using an IV method relies, of course, on the availability of such a variable. In practice, instruments are often difficult to find. In that case, a sensitivity analysis becomes very useful, because it can be used to assess the importance of a violation of the UNC for the estimated causal effect. Of course, it does not represent an alternative to the IV approach.

As with methods based on the UNC, IV methods also impose a range of critical identifying assumptions. Let us consider a binary instrument indicated by Z. In randomised settings, the levels of the instrument can be seen as the assignment to the treatment, which may differ from the treatment actually taken because of non-compliance. Under SUTVA, we indicate with $D_z$ and $Y_{z,d}$, respectively, the binary potential treatment indicator and the potential outcomes for unit i. The identifying assumptions for the estimation of causal effects using one IV are clarified by Angrist et al. (1996, in the following AIR). Apart from SUTVA and the randomisation of the instrument, the fundamental assumptions are:

Assumption B.1 (Exclusion Restriction) $Y_{z,d} = Y_d$

Assumption B.2 (Nonzero Average Causal Effect of Z on D) $E[D_1 - D_0] \neq 0$

Assumption B.3 (Monotonicity) $D_{i1} \geq D_{i0}$ for all $i = 1, \ldots, N$.

The exclusion restriction means that the instrument Z affects Y only through D, and corresponds to the validity of the instrument. Assumption B.2 requires that for at least some units the instrument changes the treatment status, and corresponds to the hypothesis of nonzero correlation between the instrument and the endogenous variable (i.e. the relevance of the instrument). The assumption of monotonicity is critical when comparing the IV approach to methods based on the UNC. To see how, we have to characterise units by the way they might react to the level of the instrument. A first group, termed compliers, is defined by units that are induced to take the treatment by the instrument: $D_{i1} - D_{i0} = 1$. Other units may not be influenced by the instrument and are defined as either always-takers, for whom $D_{i1} = D_{i0} = 1$ (they take the treatment whatever the level of the instrument), or never-takers, for whom $D_{i1} = D_{i0} = 0$ (they always take the control). Finally, we might encounter defiers, units that do the opposite of their assignment status. The monotonicity assumption implies that there are no defiers and is crucial for identification since, otherwise, the treatment effect for those who shift from non-participation to participation when Z shifts from 0 to 1 can be cancelled out by the treatment effect of those who shift from participation to non-participation (Imbens and Angrist 1994). Importantly, the monotonicity assumption, like the exclusion restriction and the UNC, is untestable, and its plausibility has to be evaluated in the context of the given application. AIR demonstrate that under the aforementioned assumptions we can only identify the average causal effect calculated on the sub-population of compliers, which is termed the LATE:

$LATE = E[Y_1 - Y_0 \mid D_1 - D_0 = 1]$.   (3)
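With a randomised binary instrument, the LATE in (3) can be estimated by the ratio of the effect of Z on Y to the effect of Z on D (the Wald estimator). The sketch below is a toy simulation with hypothetical parameter values (not the article's data): it draws compliers, always-takers and never-takers explicitly, so that the Wald estimate can be compared with the true effect among compliers and with the estimated share of compliers.

import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Principal strata: compliers take the treatment only when the instrument is switched on.
stratum = rng.choice(["complier", "never", "always"], size=n, p=[0.4, 0.4, 0.2])
z = rng.binomial(1, 0.5, size=n)                        # randomised binary instrument
d = np.where(stratum == "always", 1,
             np.where(stratum == "never", 0, z))        # monotonicity: no defiers

# Heterogeneous effects: -1.5 for compliers, -3.0 for always-takers (true LATE = -1.5).
effect = np.select([stratum == "complier", stratum == "always"], [-1.5, -3.0], default=0.0)
y = rng.normal(10, 2, size=n) + effect * d

# Wald estimator of the LATE: effect of Z on Y divided by the effect of Z on D.
itt_y = y[z == 1].mean() - y[z == 0].mean()
itt_d = d[z == 1].mean() - d[z == 0].mean()             # equals the share of compliers
print(f"estimated LATE: {itt_y / itt_d:.2f}")           # about -1.5
print(f"estimated share of compliers: {itt_d:.2f}")     # about 0.40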

Critically important for empirical work is that, in the case of heterogeneous treatment effects, the LATE is in general different from the ATE and the ATT, which tend to be the parameters of interest. This is because the LATE refers only to the sub-population of compliers, while the ATE and the ATT are defined, respectively, on the whole population and on the sub-population of the treated. Moreover, a serious drawback of the LATE is that the sub-population of compliers cannot be identified from the data. Finally, the estimated LATE depends on the instrument used, because different instruments identify different sub-populations of compliers.

In specific applications the LATE becomes an interesting parameter for policy. Suppose that the policy maker wants to know the (average) causal effect of D on Y when we obtain a change in D by manipulating it through Z. In this case, the interest lies in the (average) causal effect of D on Y for units that react to the policy intervention on Z (the compliers). In this situation, however, the policy maker cannot identify who the compliers are, but can only estimate the size of this group. The presumption in such cases is that the average causal effect calculated on the sub-population of units whose behaviour was modified by assignment is likely to be informative about sub-populations that will comply in the future.

2.2 Strategies for the estimation of causal effects

We discuss here three strategies for the estimation of causal effects. The first is based on assumptions A.1 and A.2, and includes regression and PSM. The second strategy consists of combining these methods with a sensitivity analysis. In essence, the sensitivity analysis assesses the robustness of the estimates when we suspect failure of the UNC assumption. The third method is the IV approach. Rather than discussing the technical details of each estimator in depth, we present instead the general ideas and limitations of the different techniques. For a formal comparison of these methods, we refer to Blundell et al. (2005) and Imbens (2004).

Strategy 1: methods based on the UNC

In the standard multivariate regression model, we assume a linear relationship between the outcome and the independent variables and homogeneity of treatment effects; in fact, in the simplest regression model the treatment variable is not interacted with the covariates and its coefficient is the same for all units. This model constrains the ATE to coincide with the ATT, and if treatment effects are heterogeneous we are not able to make separate estimates of the two quantities.

Footnote 3: In general, ATE and ATT are expected to differ if the distributions of covariates in the treated and control groups are different and if the treatment interacts with (at least some of) the covariates.

Moreover, if the true model is nonlinear, the OLS estimates of the treatment effects will in general be biased. In parametric regression, the overlap assumption is not required insofar as we can be sure that the model is correctly specified. Otherwise, the comparison of treated and control units outside the common support relies heavily on linear extrapolation. Of course, the standard model can be extended and made flexible to overcome these limitations.

For example, the common support problem can be circumvented by first estimating the common support and then running the regression conditioning on it. Moreover, we can avoid assuming homogeneous treatment effects by including a complete set of interactions between each of the covariates X and the treatment indicator D. This gives rise to the so-called fully interacted linear model (FILM in the following; see Goodman and Sianesi 2005). Since in the FILM, unlike in a fully saturated model, covariates are not recoded into qualitative variables, this approach is still parametric with respect to the way continuous covariates enter the regression function and interact with the treatment. Also, the linearity assumption can be avoided if we use a non-parametric method, such as a kernel estimator (see Hardle and Linton 1994), which allows the functional form between the outcome and the independent variables to be determined by the data themselves. Non-parametric methods, however, have computational drawbacks when the set of covariates is large and many of them are multi-valued or, worse, continuous. This problem, known as the curse of dimensionality, is also relevant for matching methods.

A popular way to overcome the dimensionality problem is to implement the matching on the basis of a univariate propensity score (Rosenbaum and Rubin 1983a). This is defined as the conditional probability of receiving the treatment given pre-treatment characteristics: $e(X) \equiv \Pr\{D = 1 \mid X\} = E\{D \mid X\}$. When the propensity score is balanced across the treatment and control groups, the distributions of all the covariates X are balanced in expectation across the two groups (the balancing property of the propensity score). Therefore, matching on the propensity score is equivalent to matching on X. Once the propensity score is estimated, several methods of matching are available. The most common ones are kernel (Gaussian and Epanechnikov), nearest neighbour, radius and stratification matching (for a discussion of these methods see Caliendo and Kopeinig 2005; Smith and Todd 2005; Becker and Ichino 2002). Asymptotically, all PSM estimators should yield the same results (Smith 2000), while in small samples the choice of the matching algorithm can be important and generally a trade-off between bias and variance arises (Caliendo and Kopeinig 2005). As noted by Bryson et al. (2002), it is sensible to try a number of approaches. If they give similar results, the choice is irrelevant. Otherwise, further investigation is needed in order to reveal the source of the disparity. As will be explained in Sect. 4, we adopt this pragmatic approach and assess the sensitivity of the results with respect to the matching method. Consistent with many other previous studies (see, e.g. Smith and Todd 2005), the different estimators yield very similar results (both in terms of point estimates and standard errors). The analysis in Sect. 4 is based on a nearest-neighbour matching method, meaning that for each treated (control) unit the algorithm finds the control (treated) unit with the closest propensity score. We use the variant with replacement, implying that a control (treated) individual may be used more than once as a match for individuals in the treated (control) sample.
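For concreteness, the sketch below illustrates the mechanics of nearest-neighbour propensity score matching with replacement on simulated data; the data-generating process, the variable names and the use of scikit-learn are our own illustrative choices and do not reproduce the article's implementation. The propensity score is estimated by a logistic regression, treated units outside the controls' support are discarded, and the ATT is computed as the average difference between each treated outcome and that of its closest control.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 5_000

# Two observed confounders; the true (homogeneous) effect of the treatment is -1.
x = rng.normal(size=(n, 2))
d = rng.binomial(1, 1.0 / (1.0 + np.exp(-(x[:, 0] + 0.5 * x[:, 1]))))
y = 5.0 + 2.0 * x[:, 0] + x[:, 1] - 1.0 * d + rng.normal(size=n)

# 1. Estimate the propensity score e(X) = P(D = 1 | X).
pscore = LogisticRegression().fit(x, d).predict_proba(x)[:, 1]

# 2. Common support: keep treated units whose score lies within the controls' range.
lo, hi = pscore[d == 0].min(), pscore[d == 0].max()
treated = np.where((d == 1) & (pscore >= lo) & (pscore <= hi))[0]
controls = np.where(d == 0)[0]

# 3. Nearest-neighbour matching with replacement on the propensity score.
gaps = np.abs(pscore[controls][None, :] - pscore[treated][:, None])
matches = controls[gaps.argmin(axis=1)]

# 4. ATT: average difference between treated outcomes and matched control outcomes.
att = (y[treated] - y[matches]).mean()
print(f"ATT estimate: {att:.2f}")   # close to the true effect of -1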
Among the other methods we tried (nearest neighbour without replacement, k-nearest neighbour, radius and kernel matching), this approach guarantees the best quality of matches, because only the units with the closest propensity scores are matched, but at the cost of a higher variance (Caliendo and Kopeinig 2005). Focussing on the estimation of the ATT, to estimate the treatment effect for a treated unit i, the observed outcome $y_{i1}$ is compared with the outcome $y_{j0}$ of the matched unit j in the untreated sample. The ATT estimator can be written as:

$\widehat{ATT} = \frac{1}{n_D} \sum_{i: d_i = 1} \left[ y_{i1} - y_{m(i)0} \right]$,   (4)

where $n_D$ is the number of treated units that find a match in the untreated group and $m(i)$ indicates the matched control for treated unit i.

Under assumptions A.1 and A.2, regression and matching techniques can be used with cross-sectional data to estimate the ATE and the ATT, in which case Y, X and D are all measured at the same time. Longitudinal data available for at least two time points offer some important practical advantages. First, one is in a better position to measure covariates before the exposure to treatment. As is well known, one should only control for those covariates not affected by the treatment itself (e.g. Rosenbaum 1984). Hence, being able to measure variables before the treatment makes this condition more likely to hold (Imbens 2004). To make this explicit, we indicate the covariates as $X_{t1}$ and the outcome as $Y_{t2}$. The treatment indicator, D, measures childbearing events between $t_1$ and $t_2$, and the ATT estimator can be written as:

$\widehat{ATT} = \frac{1}{n_D} \sum_{i: d_i = 1} \left[ y_{i1,t2} - y_{m(i)0,t2} \right].$

A second advantage is that we can include in the matching set the outcome variable of interest measured before the exposure to treatment. In our application, where the outcome is the consumption expenditure (see Sect. 4), we include this variable, $Y_{t1}$, measured at the first wave, in the conditioning set. This reflects the household's level of living standards prior to treatment, and is likely to be relevant both for the probability of experiencing a childbearing event between the two waves and for the consumption expenditure level at the second wave, $Y_{t2}$. The UNC assumption can be more explicitly written as:

Assumption A.3 (Unconfoundedness) $(Y_{1t2}, Y_{0t2}) \perp D \mid X_{t1}, Y_{t1}$

As noted by Athey and Imbens (2006), assumption A.3 implies that individuals in the treatment group should be matched with individuals in the control group with similar (identical if the matching could be perfect) first-period outcomes, as well as other pre-treatment characteristics, and that their second-period outcomes should be compared. However, perfect matching is not feasible, and matching on the propensity score guarantees that, on average, covariates (including $Y_{t1}$ in our case) are balanced between the matched treated and control groups. Importantly, taking the difference in the pre- and post-treatment outcomes helps in reducing any remaining imbalance in $Y_{t1}$. This approach is similar in spirit to the bias-correction proposal of Abadie and Imbens (2002) to reduce the bias due to residual imbalance in covariates after matching. The fact that the dependent variable is now defined as the difference in the levels of the outcome after and before the treatment implies that the ATT estimator can be written as:

$\widehat{ATT} = \frac{1}{n_D} \sum_{i: d_i = 1} \left[ (y_{iD,t2} - y_{iD,t1}) - (y_{m(i)U,t2} - y_{m(i)U,t1}) \right]$,   (5)

where the subscripts D and U make explicit that the first two outcomes in (5) are measured on treated units and the other two on untreated units. From formula (5) we can see that if the matching is exact on the variable $Y_{t1}$, then the estimate obtained using the difference as the outcome, as in (5), is exactly equal to that in (4). However, even if the matching is not exact, provided the PSM works well (i.e. we succeed in balancing $Y_{t1}$), the two estimators are expected to give similar results. It is worth noting that, although estimator (5) uses as dependent variable the change instead of the level at time $t_2$, the estimands of interest, namely the ATE and the ATT as defined in (1) and (2), are the same. For example, for the ATT we can note that:

$ATT = E(Y_{1t2} - Y_{0t2} \mid D = 1) = E[(Y_{1t2} - Y_{t1}) - (Y_{0t2} - Y_{t1}) \mid D = 1].$

Another advantage of taking the difference is that we expect the resulting estimator to be more efficient. This is likely to be the case in our application (and, we suspect, in many other applications), since there will be more heterogeneity in the levels of consumption expenditure at time $t_2$ than in the consumption growth between the two waves. In other words, the variable $Y_{t2}$ is likely to have a higher variance than $(Y_{t2} - Y_{t1})$, although this is not true in general.

A related literature motivates the advantages of considering the difference in the pre- and post-treatment levels of the outcome as a way to improve the robustness of the matching method through the elimination of possible time-invariant unobservables (Heckman et al. 1997; Smith and Todd 2005; Aassve et al. 2007). The resulting estimator is similar to ours, apart from the fact that $Y_{t1}$ is not included in the set of matching covariates. This estimator is labelled matching difference-in-differences (MDID) and relies on an identifying assumption that is different from A.3. For example, for the ATT the identifying assumption can be written as: $(Y_{0t2} - Y_{0t1}) \perp D \mid X_{t1}$.

Footnote 4: If only the ATE is to be identified, the assumption can be stated in a weaker form, as mean independence instead of full independence (e.g. Heckman et al. 1997).

As noted by Athey and Imbens (2006, p. 448), the two assumptions coincide under special conditions imposed on the unobserved components. Otherwise, the two identifying strategies, even though similar, are different, and A.3 remains a selection on observables assumption. The choice is a subject-matter one and depends on what the researcher believes is the best identifying strategy for his or her application. We use A.3 as a starting point and compare treated and control units with similar background characteristics X and initial values of the outcome, instead of relying on an assumption of conditional parallel trends in the outcome, as with the MDID. Having maintained an unconfoundedness-type assumption, the discussion in this section also applies to cross-sectional studies. To deal with the possible presence of unobservables, we discuss methods for sensitivity analysis and IV methods.
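The practical point made above, namely that matching on the first-wave outcome and then using either the second-wave level (estimator (4)) or the pre-post change (estimator (5)) should give similar answers when $Y_{t1}$ is well balanced, can be illustrated with a short simulation. The sketch below uses invented data and our own simple matching routine; it is not the article's implementation.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 5_000

# First-wave covariate and outcome; both drive selection into the treatment.
x = rng.normal(size=n)
y_t1 = 8.0 + 2.0 * x + rng.normal(size=n)                    # pre-treatment outcome
d = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 * x - 0.3 * (y_t1 - 8.0)))))
y_t2 = 1.0 + 0.9 * y_t1 + x - 1.0 * d + rng.normal(size=n)   # true treatment effect = -1

# Match on the propensity score estimated from X_t1 and Y_t1, as in assumption A.3.
covs = np.column_stack([x, y_t1])
ps = LogisticRegression().fit(covs, d).predict_proba(covs)[:, 1]
tr, ct = np.where(d == 1)[0], np.where(d == 0)[0]
m = ct[np.abs(ps[ct][None, :] - ps[tr][:, None]).argmin(axis=1)]

att_levels = (y_t2[tr] - y_t2[m]).mean()                            # estimator (4)
att_changes = ((y_t2[tr] - y_t1[tr]) - (y_t2[m] - y_t1[m])).mean()  # estimator (5)
print(f"ATT from second-wave levels: {att_levels:.2f}")
print(f"ATT from pre-post changes:   {att_changes:.2f}")  # close to the levels estimate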

Strategy 2: sensitivity analysis of methods based on the UNC

The UNC becomes implausible once one or more relevant confounders are unobserved. If an instrument is available, then one can proceed with the IV estimator that we discuss in the next sub-section. Several approaches have been proposed in the literature to deal with situations where instruments are not available and where the plausibility of the unconfoundedness is doubtful. One approach is to implement indirect tests of the UNC assumption, relying on the estimation of a pseudo-causal effect that is known to be zero (Imbens 2004). A first type focuses on estimating the causal effect of the treatment of interest on a variable that is known to be unaffected by it. Another type of test relies on the presence of multiple control groups (Rosenbaum 1987a; Heckman et al. 1997), which arise, for example, when rules for eligibility are in place. The presence of ineligibility rules is also the basis for the bias-correction method proposed by Costa Dias et al. (2008).

An important alternative to the indirect tests is the implementation of sensitivity analyses. The fundamental idea of this approach is to relax the unconfoundedness assumption with the aim of assessing how strongly an unmeasured variable must be associated with the treatment and the outcome in order to undermine the implications of the matching analysis. If the results are highly sensitive, then the validity of the identifying assumption becomes questionable and alternative estimation strategies must be considered. Different approaches for sensitivity analysis have been proposed in the literature. Rosenbaum and Rubin (1983b) and Imbens (2003) propose methods to assess the sensitivity of ATE estimates in parametric regression models. Here, we apply the approaches suggested by Rosenbaum (1987b) and Ichino et al. (2008, in the following IMN), which do not rely on any parametric model for the estimation of the treatment effects. The underlying hypothesis in all of these approaches is that assignment to treatment may be confounded given the set of observed covariates, but is unconfounded given the observed covariates and an unobservable covariate U: $(Y_1, Y_0) \perp D \mid X, U$.

In Rosenbaum's approach, sensitivity is measured using only the relation between the unobserved covariate and the treatment assignment. To briefly describe it, we link the probability of receiving the treatment, $\pi$, to the observed characteristics, X, and an unobserved covariate, U, with a logistic regression function:

$\log\left(\frac{\pi}{1 - \pi}\right) = \kappa(X) + \gamma U$, with $0 \leq U \leq 1$.

Under these assumptions, Rosenbaum shows that the ratio of the treatment odds of two units i and j with the same X values can be bounded in the following way:

$\frac{1}{\Gamma} \leq \frac{\pi_i/(1 - \pi_i)}{\pi_j/(1 - \pi_j)} \leq \Gamma$, where $\Gamma = e^{\gamma}$.

If $\Gamma = 1$, unconfoundedness holds and no hidden bias exists. Increasing values of $\Gamma$ imply an increasingly important role of unobservables in the selection into treatment. Rosenbaum suggests progressively increasing the value of $\Gamma$ in order to assess the degree of association required to overturn, or change substantially, the p-values of statistical tests of no treatment effect. If this happens only at high values of $\Gamma$, the results of the analysis based on the UNC are sensitive to the presence of an unobservable only if it is strongly associated with treatment selection.

The plausibility of the presence of such an unobservable has to be judged by the researcher, depending on the richness of the information included in the analysis.

Unlike Rosenbaum's, the approach by IMN assesses the sensitivity of point estimates of the ATT under different possible scenarios of deviation from the UNC.

Footnote 5: Under the assumption of an additive treatment effect, Rosenbaum also derives bounds on the Hodges-Lehmann point estimate of the treatment effect (see Rosenbaum 2002 for details).

The underlying hypothesis is, as in the previous approaches, that assignment to treatment may be confounded given the set of observed covariates, but is unconfounded given the observed covariates and an unobservable covariate U. The procedure can be summarised in the following steps: (1) calculate the ATT using PSM on X; (2) simulate a variable U representing a potential unobserved confounder; (3) include U together with X in the matching set and calculate the ATT; (4) repeat steps 2 and 3 several times (e.g. 1,000) and calculate the average ATT, to be compared with the baseline estimate obtained in (1) under the UNC.

In the simulation process, IMN assume that U and the outcome are binary variables. In the case of continuous outcomes, as in our application, a transformation is needed so that the outcome takes the value 1 if it is above a certain threshold (the median, for example) and 0 otherwise; alternatively, one could consider other outcome variables, such as poverty status, which is essentially a dichotomous transformation of consumption expenditure. However, this transformed variable is only required to simulate the values of U (step 2); it is not used as the outcome variable when estimating the ATT (step 3).

Footnote 6: For more details on the simulations, see Ichino et al. (2008), and see Nannicini (2007) for details on the STATA module sensatt, which implements this method.

Since all the variables involved in the simulation are binary, the distribution of U is specified by four key parameters:

$p_{kw} = P(U = 1 \mid D = k, Y = w) = P(U = 1 \mid D = k, Y = w, X), \quad k, w = 0, 1$   (6)

It is assumed here that U is independent of X conditional on D and Y. In order to choose the signs of the associations between U, $Y_0$ and D, IMN note that if $q = p_{01} - p_{00} > 0$ then U has a positive effect on $Y_0$ (conditioning on X), whereas if $s = p_1 - p_0 > 0$, where $p_k = P(U = 1 \mid D = k)$, then U has a positive effect on D. If we set $p_u = P(U = 1)$ and $q = p_{11} - p_{10}$, the four parameters $p_{kw}$ are univocally identified by specifying the values of q and s. Hence, by changing the values of q and s we can produce different scenarios for U. For example, if we want to mimic the effect of unobserved ability, we can set q to a positive value (positive effect on consumption) and s to a negative value (negative effect on fertility). It is important to note that with this approach we can only choose the signs of the associations of U with D and $Y_0$ through the values of q and s. However, for increasingly higher absolute values of q and s the strength of the associations increases. Therefore, the idea is to use this sensitivity analysis as in the Rosenbaum approach. The difference is that now, by progressively increasing the values of both q and s, we can increase the levels of association between U and both the treatment and the outcome, instead of the treatment only.
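The mechanics of steps (1)-(4) can be sketched compactly. The code below is a stripped-down illustration on simulated data, not the authors' STATA module sensatt: the data-generating process, the chosen values of $p_{kw}$ and the helper function are our own. The outcome is binarised only in order to draw the confounder U from the four probabilities $p_{kw}$; U is then added to the matching set and the ATT is re-estimated and averaged over replications.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

def att_nn(covs, d, y):
    """ATT by nearest-neighbour propensity score matching with replacement."""
    ps = LogisticRegression().fit(covs, d).predict_proba(covs)[:, 1]
    tr, ct = np.where(d == 1)[0], np.where(d == 0)[0]
    m = ct[np.abs(ps[ct][None, :] - ps[tr][:, None]).argmin(axis=1)]
    return (y[tr] - y[m]).mean()

# Toy data with one observed confounder x (true treatment effect = -1).
n = 3_000
x = rng.normal(size=(n, 1))
d = rng.binomial(1, 1.0 / (1.0 + np.exp(-x[:, 0])))
y = 5.0 + 2.0 * x[:, 0] - d + rng.normal(size=n)

baseline_att = att_nn(x, d, y)                  # step (1): ATT under the UNC

# Steps (2)-(4): draw U from p_kw = P(U = 1 | D = k, Y = w) and add it to the matching set.
y_bin = (y > np.median(y)).astype(int)          # binarised outcome, used only to draw U
# Hypothetical scenario: U positively related to the outcome, negatively to the treatment.
p_kw = {(0, 0): 0.3, (0, 1): 0.7, (1, 0): 0.1, (1, 1): 0.5}
sims = []
for _ in range(200):                            # a modest number of replications for speed
    u = np.array([rng.binomial(1, p_kw[(k, w)]) for k, w in zip(d, y_bin)])
    sims.append(att_nn(np.column_stack([x, u]), d, y))

print(f"baseline ATT under the UNC: {baseline_att:.2f}")
print(f"average simulated ATT with U in the matching set: {np.mean(sims):.2f}")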

In order to have an easily interpretable measure of these associations, IMN propose to use the following parameters:

$\Gamma = \frac{1}{rep} \sum_{r=1}^{rep} \left[ \frac{\Pr(Y = 1 \mid D = 0, U_r = 1, X) / \Pr(Y = 0 \mid D = 0, U_r = 1, X)}{\Pr(Y = 1 \mid D = 0, U_r = 0, X) / \Pr(Y = 0 \mid D = 0, U_r = 0, X)} \right]$

and

$\Lambda = \frac{1}{rep} \sum_{r=1}^{rep} \left[ \frac{\Pr(D = 1 \mid U_r = 1, X) / \Pr(D = 0 \mid U_r = 1, X)}{\Pr(D = 1 \mid U_r = 0, X) / \Pr(D = 0 \mid U_r = 0, X)} \right]$

where rep indicates the number of replications. The parameter $\Gamma$ is the average odds ratio from the logit model of $P(Y = 1 \mid D = 0, U, X)$, calculated over several replications of the simulation procedure. It is, in other words, a measure of the effect of U on Y, and is in this sense an outcome effect. The parameter $\Lambda$ refers to the average odds ratio from the logit model of $P(D = 1 \mid U, X)$. This is a measure of the effect of U on D, and is therefore a measure of the selection effect. At each replication of the simulation exercise, together with the two mentioned odds ratios, the ATT is estimated using as covariates the set X and the simulated U. The final simulated ATT estimate is the average of the estimates obtained in all the replications.

Strategy 3: IV methods

When the UNC is implausible and an instrument is available, one would naturally implement IV methods. The way they are implemented depends on whether the available instrument can be thought of as randomised or not. In the previous discussion, we assumed that the instrument is randomised, which means that there is no need to control for covariates. In this case, AIR show that the LATE can simply be estimated by the Wald estimator. However, in many applications Z is not randomly assigned and can be confounded with D or with Y or both. The implication is that in these contexts the IV assumptions, such as the exclusion restriction, can usually be thought of as reliable only conditional on a set of covariates. In other words, in these situations Z can be considered unconfounded only conditional on covariates. The conventional approach to accommodating covariates in IV estimation consists of parametric or semi-parametric methods, two-stage least squares being the most common; classic examples include Card (1995) and Angrist and Krueger (1991). A serious drawback of these methods is that most of them impose additive separability in the error term, which amounts to ruling out unobserved heterogeneity in the treatment effects.

Footnote 7: A complementary approach, proposed by Manski (1990), consists of dropping the UNC assumption entirely and constructing bounds for the ATT that rely on alternative identifying assumptions, for example that the outcome is bounded. IMN show how this approach is related to their sensitivity analysis and argue that non-parametric bounds are too conservative a method and that the bound calculations rely on extreme circumstances that are implausible. Moreover, in our application the outcome is continuous and has no natural bounds.

One approach that overcomes the strong assumptions used by the aforementioned IV methods is the non-parametric approach suggested by Frölich (2007). The identifying assumptions in this case are basically the same as in the case of a randomised instrument, but stated in terms of conditioning on covariates. In this way, we can identify the conditional LATE, which is the LATE defined for units with specific observed characteristics. The marginal LATE is identified as follows:

$LATE = \frac{\int \left( E[Y \mid X, Z = 1] - E[Y \mid X, Z = 0] \right) dF_X}{\int \left( E[D \mid X, Z = 1] - E[D \mid X, Z = 0] \right) dF_X}$.   (7)

When the number of covariates included in the set X is high, non-parametric estimation of equation (7) becomes difficult, especially in small samples. An alternative is to make use of the aforementioned balancing property of the propensity score, which allows us to substitute the high-dimensional set X in (7) with a univariate variable, $\pi = P(Z = 1 \mid X)$.

Footnote 8: It is important to note that a common support assumption is needed, as stated by Frölich: $\mathrm{Supp}(X \mid Z = 1) = \mathrm{Supp}(X \mid Z = 0)$. However, here we give only some intuition about the assumptions underlying this method; for a detailed and more formal discussion we refer to Frölich's paper.

3 Fertility and economic wellbeing and the Vietnamese context

Our application is concerned with estimating the causal effect of fertility on economic wellbeing. The interrelationship between the two has received considerable interest in development studies and the economics literature. The traditional micro-economic framework considers children as an essential part of the household's work force, as they generate income. This is especially true for male children. In rural underdeveloped regions of the world, which rely largely on a low level of farming technology and where households have little or no access to state benefits, this argument makes a great deal of sense (Admassie 2002). In this setting households will have a high demand for children. The downside is that a large number of children participating in household production hampers investment in human capital (Moav 2005). There are of course important supply-side considerations in this regard: rural areas in developing countries have poor access to both education and contraceptives, both of which limit the extent to which couples are able to make choices about fertility outcomes (Easterlin and Crimmins 1985). As households attain higher levels of income and wealth, they also have fewer children, either due to a quantity-quality trade-off, as suggested by Becker and Lewis (1973), or due to an increase in the opportunity cost of women earning a higher income, as suggested by Willis (1973).

An important aspect with regard to Vietnam is that the country has experienced a tremendous decline in fertility over the past two decades, and at present one can safely claim that the country has completed the fertility transition. The figures speak for themselves: in 1980 the total fertility rate (TFR) was 5.0; in 2003 it was 1.9. Contraceptive availability and knowledge are widespread, and family planning programs were initiated as early as the 1960s (Scornet 2007).

Footnote 9: An important factor in this change was the introduction of the Doi Moi (renewal) policy in the late eighties, which consisted of the replacement of collective farms by the allocation of land to individual households, the legalisation of many forms of private economic activity, the removal of price controls, and the legalisation and encouragement of Foreign Direct Investment (FDI). Since the introduction of Doi Moi, the country embarked on a remarkable economic recovery, followed by a substantial poverty reduction (Glewwe et al. 2002).

In light of our technical discussion in Sect. 2, the key issue in this application is that fertility decisions can be driven by both observed and unobserved selection. In terms of observables, predicting their effects is relatively straightforward within an economics framework. The key is to understand the drivers behind women's perceived opportunity cost of childbearing. Higher education and labour force participation among women increase women's opportunity cost, producing a negative effect on fertility. They also increase their income level and hence consumption expenditure. Typically, any increase in the opportunity cost dominates the positive income effect. Increased education among men, and therefore higher earnings, translates into a positive income effect, and hence has a positive effect on fertility (Ermisch 1989). However, empirical analysis shows that there is not necessarily a positive relationship between income and family size (i.e. number of children), the key explanation being that couples make trade-offs between quantity and quality (Becker and Lewis 1973), especially as the country in question develops and passes through the fertility transition.

As for the unobservables, these can operate through different mechanisms. The key unobserved variables are ability and aspirations, and they play an important role in our application. In general, we would expect those with higher ability or aspirations in terms of work and career to have lower fertility because of their higher opportunity cost. Thus, ability is negatively correlated with fertility but positively associated with consumption expenditure. Moreover, fertility is commonly measured in terms of childbearing events, as we do here. However, childbearing outcomes are the direct result of contraceptive practices, which are typically unmeasured in household surveys. Better knowledge and higher uptake of contraceptives reduce unwanted pregnancies, which would reduce fertility. However, unobserved ability is positively associated with contraceptive use, which reinforces the negative effect of ability on fertility. Fertility is of course based on the joint decision of a couple, and not of the woman alone. Hence, behind the childbearing outcomes there is also a bargaining process taking place. Again, unobserved ability may play an important role. High-ability women may have stronger bargaining power, either as a result of the ability itself (e.g. they are better negotiators) or through the effect that higher ability has on their labour supply and hence earnings. Whereas ability works through different mechanisms, the prediction of its effect is rather clear, in the sense that high ability is associated with lower fertility, but higher income and hence consumption expenditure. Consequently, its omission implies a negative bias in the estimation of the effect of fertility on consumption expenditure.

The data we use come from the Vietnam Living Standard Measurement Survey (VLSMS), first surveyed in 1992/1993 with a follow-up in 1997/1998. The longitudinal nature of the data set allows us to measure whether any woman in the household experienced another birth between the two waves.

The treatment is then defined as a binary variable taking the value 1 if the household experiences a childbearing event between the two waves (treated) and 0 otherwise (untreated or control). The outcome of interest is the equivalised consumption expenditure level in the second wave. In the empirical implementation presented in the next section, we control for a range of explanatory variables measured in the first wave. The data otherwise follow the standard format of the World Bank LSMS, including detailed information about education, employment, fertility, expenditure and incomes. The survey also provides detailed community information from a separate community questionnaire. This information is available for the 120 rural communities sampled and consists of data on health, schooling and main economic activities. The availability of this information is important for two reasons. First, the characteristics of the communities where households reside are likely to influence both economic wellbeing and fertility and, hence, are potentially relevant confounders. Second, from this information we obtain an interesting IV, namely the availability of contraceptives in the community.

The conventional approximation of a household's welfare is its observed consumption expenditure, which requires detailed information on consumption behaviour and its expenditure pattern (Coudouel et al. 2002; Deaton and Zaidi 2002). The expenditure variables are calculated by the World Bank procedure, which is readily available with the VLSMS. We choose a relatively simple equivalence scale, giving each child aged 0-14 in the household a weight of 0.65 relative to adults. Table 1 shows a simple descriptive analysis highlighting a clear negative association between the number of children and economic wellbeing.

Footnote 10: We assessed the robustness of the results to the imposed equivalence scale. Results are consistent with those presented here for reasonable equivalence scales. This analysis is available from the authors upon request.

Table 1: Average equivalised household consumption expenditure at the two waves, and its growth, by the number of children born between the two waves. Columns: number of children born between the two waves (0, 1, 2, at least 3, and the total of 2,023 households); number of observations; average consumption in 1992; average consumption at the second wave; average consumption growth between the waves. Notes: We consider the number of children of all household members born between the two waves and still alive at the second wave. All consumption measures are valued in dongs and rescaled to constant prices. The 2,023 households represented in the table are selected by taking only households with at least one married woman aged between 15 and 40 in the first wave. Consumption is expressed in thousands of dongs.

Our choice of covariates is based mainly on dimensions which are important for both the household's standard of living and its fertility behaviour, and which are hence potential confounders that have to be included in the conditioning set X to make the UNC plausible. All these variables can theoretically have an impact on the change in consumption
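To make the construction of the outcome and treatment variables concrete, the sketch below applies the equivalence scale described above (each child aged 0-14 weighted 0.65 relative to adults) to a few hypothetical household records; the column names and values are invented for the example and do not come from the VLSMS.

import pandas as pd

# Hypothetical household records (column names and values are invented for illustration).
households = pd.DataFrame({
    "hh_id":           [1, 2, 3],
    "consumption_w2":  [9_000, 14_000, 7_500],  # household consumption at the second wave
    "adults":          [2, 3, 2],
    "children_0_14":   [3, 1, 2],
    "births_between":  [1, 0, 2],               # children born between the two waves
})

CHILD_WEIGHT = 0.65  # equivalence scale: each child aged 0-14 counts as 0.65 adults

# Equivalised consumption expenditure: consumption per adult-equivalent household member.
adult_equivalents = households["adults"] + CHILD_WEIGHT * households["children_0_14"]
households["equiv_consumption_w2"] = households["consumption_w2"] / adult_equivalents

# Treatment indicator: 1 if the household experienced at least one birth between the waves.
households["treated"] = (households["births_between"] > 0).astype(int)

print(households[["hh_id", "equiv_consumption_w2", "treated"]])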


More information

Econ 673: Microeconometrics Chapter 12: Estimating Treatment Effects. The Problem

Econ 673: Microeconometrics Chapter 12: Estimating Treatment Effects. The Problem Econ 673: Microeconometrics Chapter 12: Estimating Treatment Effects The Problem Analysts are frequently interested in measuring the impact of a treatment on individual behavior; e.g., the impact of job

More information

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. The Sharp RD Design 3.

More information

Instrumental Variables

Instrumental Variables Instrumental Variables Yona Rubinstein July 2016 Yona Rubinstein (LSE) Instrumental Variables 07/16 1 / 31 The Limitation of Panel Data So far we learned how to account for selection on time invariant

More information

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case Maximilian Kasy Department of Economics, Harvard University 1 / 40 Agenda instrumental variables part I Origins of instrumental

More information

A SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS

A SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS A SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS TOMMASO NANNICINI universidad carlos iii de madrid North American Stata Users Group Meeting Boston, July 24, 2006 CONTENT Presentation of

More information

Imbens/Wooldridge, IRP Lecture Notes 2, August 08 1

Imbens/Wooldridge, IRP Lecture Notes 2, August 08 1 Imbens/Wooldridge, IRP Lecture Notes 2, August 08 IRP Lectures Madison, WI, August 2008 Lecture 2, Monday, Aug 4th, 0.00-.00am Estimation of Average Treatment Effects Under Unconfoundedness, Part II. Introduction

More information

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014 Assess Assumptions and Sensitivity Analysis Fan Li March 26, 2014 Two Key Assumptions 1. Overlap: 0

More information

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation SS 2010 WS 2014/15 Alexander Spermann Evaluation With Non-Experimental Approaches Selection on Unobservables Natural Experiment (exogenous variation in a variable) DiD Example: Card/Krueger (1994) Minimum

More information

Causal inference for observational studies extended to a multilevel setting. The impact of fertility on poverty in Vietnam

Causal inference for observational studies extended to a multilevel setting. The impact of fertility on poverty in Vietnam Università degli Studi di Firenze Dipartimento di Statistica G. Parenti Dottorato di Ricerca in Statistica Applicata XX ciclo SECS-S/01 Causal inference for observational studies extended to a multilevel

More information

Measuring Social Influence Without Bias

Measuring Social Influence Without Bias Measuring Social Influence Without Bias Annie Franco Bobbie NJ Macdonald December 9, 2015 The Problem CS224W: Final Paper How well can statistical models disentangle the effects of social influence from

More information

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Arthur Lewbel Boston College Original December 2016, revised July 2017 Abstract Lewbel (2012)

More information

PSC 504: Instrumental Variables

PSC 504: Instrumental Variables PSC 504: Instrumental Variables Matthew Blackwell 3/28/2013 Instrumental Variables and Structural Equation Modeling Setup e basic idea behind instrumental variables is that we have a treatment with unmeasured

More information

The problem of causality in microeconometrics.

The problem of causality in microeconometrics. The problem of causality in microeconometrics. Andrea Ichino European University Institute April 15, 2014 Contents 1 The Problem of Causality 1 1.1 A formal framework to think about causality....................................

More information

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable

More information

Gov 2002: 4. Observational Studies and Confounding

Gov 2002: 4. Observational Studies and Confounding Gov 2002: 4. Observational Studies and Confounding Matthew Blackwell September 10, 2015 Where are we? Where are we going? Last two weeks: randomized experiments. From here on: observational studies. What

More information

Implementing Matching Estimators for. Average Treatment Effects in STATA

Implementing Matching Estimators for. Average Treatment Effects in STATA Implementing Matching Estimators for Average Treatment Effects in STATA Guido W. Imbens - Harvard University West Coast Stata Users Group meeting, Los Angeles October 26th, 2007 General Motivation Estimation

More information

Applied Microeconometrics (L5): Panel Data-Basics

Applied Microeconometrics (L5): Panel Data-Basics Applied Microeconometrics (L5): Panel Data-Basics Nicholas Giannakopoulos University of Patras Department of Economics ngias@upatras.gr November 10, 2015 Nicholas Giannakopoulos (UPatras) MSc Applied Economics

More information

Instrumental Variables

Instrumental Variables Instrumental Variables Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Noncompliance in Experiments Stat186/Gov2002 Fall 2018 1 / 18 Instrumental Variables

More information

The Problem of Causality in the Analysis of Educational Choices and Labor Market Outcomes Slides for Lectures

The Problem of Causality in the Analysis of Educational Choices and Labor Market Outcomes Slides for Lectures The Problem of Causality in the Analysis of Educational Choices and Labor Market Outcomes Slides for Lectures Andrea Ichino (European University Institute and CEPR) February 28, 2006 Abstract This course

More information

The Economics of European Regions: Theory, Empirics, and Policy

The Economics of European Regions: Theory, Empirics, and Policy The Economics of European Regions: Theory, Empirics, and Policy Dipartimento di Economia e Management Davide Fiaschi Angela Parenti 1 1 davide.fiaschi@unipi.it, and aparenti@ec.unipi.it. Fiaschi-Parenti

More information

An Alternative Assumption to Identify LATE in Regression Discontinuity Designs

An Alternative Assumption to Identify LATE in Regression Discontinuity Designs An Alternative Assumption to Identify LATE in Regression Discontinuity Designs Yingying Dong University of California Irvine September 2014 Abstract One key assumption Imbens and Angrist (1994) use to

More information

What s New in Econometrics? Lecture 14 Quantile Methods

What s New in Econometrics? Lecture 14 Quantile Methods What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression

More information

Job Training Partnership Act (JTPA)

Job Training Partnership Act (JTPA) Causal inference Part I.b: randomized experiments, matching and regression (this lecture starts with other slides on randomized experiments) Frank Venmans Example of a randomized experiment: Job Training

More information

An Alternative Assumption to Identify LATE in Regression Discontinuity Design

An Alternative Assumption to Identify LATE in Regression Discontinuity Design An Alternative Assumption to Identify LATE in Regression Discontinuity Design Yingying Dong University of California Irvine May 2014 Abstract One key assumption Imbens and Angrist (1994) use to identify

More information

A Course in Applied Econometrics. Lecture 2 Outline. Estimation of Average Treatment Effects. Under Unconfoundedness, Part II

A Course in Applied Econometrics. Lecture 2 Outline. Estimation of Average Treatment Effects. Under Unconfoundedness, Part II A Course in Applied Econometrics Lecture Outline Estimation of Average Treatment Effects Under Unconfoundedness, Part II. Assessing Unconfoundedness (not testable). Overlap. Illustration based on Lalonde

More information

Controlling for overlap in matching

Controlling for overlap in matching Working Papers No. 10/2013 (95) PAWEŁ STRAWIŃSKI Controlling for overlap in matching Warsaw 2013 Controlling for overlap in matching PAWEŁ STRAWIŃSKI Faculty of Economic Sciences, University of Warsaw

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Research Design: Causal inference and counterfactuals

Research Design: Causal inference and counterfactuals Research Design: Causal inference and counterfactuals University College Dublin 8 March 2013 1 2 3 4 Outline 1 2 3 4 Inference In regression analysis we look at the relationship between (a set of) independent

More information

Causal Inference with Big Data Sets

Causal Inference with Big Data Sets Causal Inference with Big Data Sets Marcelo Coca Perraillon University of Colorado AMC November 2016 1 / 1 Outlone Outline Big data Causal inference in economics and statistics Regression discontinuity

More information

Instrumental Variables

Instrumental Variables Instrumental Variables Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 Noncompliance in Randomized Experiments Often we cannot force subjects to take specific treatments Units

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

A Course in Applied Econometrics. Lecture 5. Instrumental Variables with Treatment Effect. Heterogeneity: Local Average Treatment Effects.

A Course in Applied Econometrics. Lecture 5. Instrumental Variables with Treatment Effect. Heterogeneity: Local Average Treatment Effects. A Course in Applied Econometrics Lecture 5 Outline. Introduction 2. Basics Instrumental Variables with Treatment Effect Heterogeneity: Local Average Treatment Effects 3. Local Average Treatment Effects

More information

Statistical Models for Causal Analysis

Statistical Models for Causal Analysis Statistical Models for Causal Analysis Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 Three Modes of Statistical Inference 1. Descriptive Inference: summarizing and exploring

More information

Empirical Methods in Applied Microeconomics

Empirical Methods in Applied Microeconomics Empirical Methods in Applied Microeconomics Jörn-Ste en Pischke LSE November 2007 1 Nonlinearity and Heterogeneity We have so far concentrated on the estimation of treatment e ects when the treatment e

More information

Recitation Notes 5. Konrad Menzel. October 13, 2006

Recitation Notes 5. Konrad Menzel. October 13, 2006 ecitation otes 5 Konrad Menzel October 13, 2006 1 Instrumental Variables (continued) 11 Omitted Variables and the Wald Estimator Consider a Wald estimator for the Angrist (1991) approach to estimating

More information

The Econometric Evaluation of Policy Design: Part I: Heterogeneity in Program Impacts, Modeling Self-Selection, and Parameters of Interest

The Econometric Evaluation of Policy Design: Part I: Heterogeneity in Program Impacts, Modeling Self-Selection, and Parameters of Interest The Econometric Evaluation of Policy Design: Part I: Heterogeneity in Program Impacts, Modeling Self-Selection, and Parameters of Interest Edward Vytlacil, Yale University Renmin University, Department

More information

ECO Class 6 Nonparametric Econometrics

ECO Class 6 Nonparametric Econometrics ECO 523 - Class 6 Nonparametric Econometrics Carolina Caetano Contents 1 Nonparametric instrumental variable regression 1 2 Nonparametric Estimation of Average Treatment Effects 3 2.1 Asymptotic results................................

More information

Wooldridge, Introductory Econometrics, 3d ed. Chapter 16: Simultaneous equations models. An obvious reason for the endogeneity of explanatory

Wooldridge, Introductory Econometrics, 3d ed. Chapter 16: Simultaneous equations models. An obvious reason for the endogeneity of explanatory Wooldridge, Introductory Econometrics, 3d ed. Chapter 16: Simultaneous equations models An obvious reason for the endogeneity of explanatory variables in a regression model is simultaneity: that is, one

More information

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha January 18, 2010 A2 This appendix has six parts: 1. Proof that ab = c d

More information

A Simulation-Based Sensitivity Analysis for Matching Estimators

A Simulation-Based Sensitivity Analysis for Matching Estimators A Simulation-Based Sensitivity Analysis for Matching Estimators Tommaso Nannicini Universidad Carlos III de Madrid Abstract. This article presents a Stata program (sensatt) that implements the sensitivity

More information

(Mis)use of matching techniques

(Mis)use of matching techniques University of Warsaw 5th Polish Stata Users Meeting, Warsaw, 27th November 2017 Research financed under National Science Center, Poland grant 2015/19/B/HS4/03231 Outline Introduction and motivation 1 Introduction

More information

One Economist s Perspective on Some Important Estimation Issues

One Economist s Perspective on Some Important Estimation Issues One Economist s Perspective on Some Important Estimation Issues Jere R. Behrman W.R. Kenan Jr. Professor of Economics & Sociology University of Pennsylvania SRCD Seattle Preconference on Interventions

More information

INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011

INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011 INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA Belfast 9 th June to 10 th June, 2011 Dr James J Brown Southampton Statistical Sciences Research Institute (UoS) ADMIN Research Centre (IoE

More information

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 Logistic regression: Why we often can do what we think we can do Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 1 Introduction Introduction - In 2010 Carina Mood published an overview article

More information

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors Laura Mayoral IAE, Barcelona GSE and University of Gothenburg Gothenburg, May 2015 Roadmap Deviations from the standard

More information

Lecture 11/12. Roy Model, MTE, Structural Estimation

Lecture 11/12. Roy Model, MTE, Structural Estimation Lecture 11/12. Roy Model, MTE, Structural Estimation Economics 2123 George Washington University Instructor: Prof. Ben Williams Roy model The Roy model is a model of comparative advantage: Potential earnings

More information

CompSci Understanding Data: Theory and Applications

CompSci Understanding Data: Theory and Applications CompSci 590.6 Understanding Data: Theory and Applications Lecture 17 Causality in Statistics Instructor: Sudeepa Roy Email: sudeepa@cs.duke.edu Fall 2015 1 Today s Reading Rubin Journal of the American

More information

Introduction to Causal Inference. Solutions to Quiz 4

Introduction to Causal Inference. Solutions to Quiz 4 Introduction to Causal Inference Solutions to Quiz 4 Teppei Yamamoto Tuesday, July 9 206 Instructions: Write your name in the space provided below before you begin. You have 20 minutes to complete the

More information

Implementing Matching Estimators for. Average Treatment Effects in STATA. Guido W. Imbens - Harvard University Stata User Group Meeting, Boston

Implementing Matching Estimators for. Average Treatment Effects in STATA. Guido W. Imbens - Harvard University Stata User Group Meeting, Boston Implementing Matching Estimators for Average Treatment Effects in STATA Guido W. Imbens - Harvard University Stata User Group Meeting, Boston July 26th, 2006 General Motivation Estimation of average effect

More information

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke

More information

Chapter 1 Introduction. What are longitudinal and panel data? Benefits and drawbacks of longitudinal data Longitudinal data models Historical notes

Chapter 1 Introduction. What are longitudinal and panel data? Benefits and drawbacks of longitudinal data Longitudinal data models Historical notes Chapter 1 Introduction What are longitudinal and panel data? Benefits and drawbacks of longitudinal data Longitudinal data models Historical notes 1.1 What are longitudinal and panel data? With regression

More information

EMERGING MARKETS - Lecture 2: Methodology refresher

EMERGING MARKETS - Lecture 2: Methodology refresher EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different

More information

Development. ECON 8830 Anant Nyshadham

Development. ECON 8830 Anant Nyshadham Development ECON 8830 Anant Nyshadham Projections & Regressions Linear Projections If we have many potentially related (jointly distributed) variables Outcome of interest Y Explanatory variable of interest

More information

A Measure of Robustness to Misspecification

A Measure of Robustness to Misspecification A Measure of Robustness to Misspecification Susan Athey Guido W. Imbens December 2014 Graduate School of Business, Stanford University, and NBER. Electronic correspondence: athey@stanford.edu. Graduate

More information

New Developments in Econometrics Lecture 16: Quantile Estimation

New Developments in Econometrics Lecture 16: Quantile Estimation New Developments in Econometrics Lecture 16: Quantile Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. Review of Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile

More information

Technical Track Session I: Causal Inference

Technical Track Session I: Causal Inference Impact Evaluation Technical Track Session I: Causal Inference Human Development Human Network Development Network Middle East and North Africa Region World Bank Institute Spanish Impact Evaluation Fund

More information

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 1/28 Why IV estimation? So far, in OLS, we assumed independence.

More information

Four Parameters of Interest in the Evaluation. of Social Programs. James J. Heckman Justin L. Tobias Edward Vytlacil

Four Parameters of Interest in the Evaluation. of Social Programs. James J. Heckman Justin L. Tobias Edward Vytlacil Four Parameters of Interest in the Evaluation of Social Programs James J. Heckman Justin L. Tobias Edward Vytlacil Nueld College, Oxford, August, 2005 1 1 Introduction This paper uses a latent variable

More information

Imbens, Lecture Notes 2, Local Average Treatment Effects, IEN, Miami, Oct 10 1

Imbens, Lecture Notes 2, Local Average Treatment Effects, IEN, Miami, Oct 10 1 Imbens, Lecture Notes 2, Local Average Treatment Effects, IEN, Miami, Oct 10 1 Lectures on Evaluation Methods Guido Imbens Impact Evaluation Network October 2010, Miami Methods for Estimating Treatment

More information

Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data. Jeff Dominitz RAND. and

Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data. Jeff Dominitz RAND. and Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data Jeff Dominitz RAND and Charles F. Manski Department of Economics and Institute for Policy Research, Northwestern

More information

Front-Door Adjustment

Front-Door Adjustment Front-Door Adjustment Ethan Fosse Princeton University Fall 2016 Ethan Fosse Princeton University Front-Door Adjustment Fall 2016 1 / 38 1 Preliminaries 2 Examples of Mechanisms in Sociology 3 Bias Formulas

More information

Composite Causal Effects for. Time-Varying Treatments and Time-Varying Outcomes

Composite Causal Effects for. Time-Varying Treatments and Time-Varying Outcomes Composite Causal Effects for Time-Varying Treatments and Time-Varying Outcomes Jennie E. Brand University of Michigan Yu Xie University of Michigan May 2006 Population Studies Center Research Report 06-601

More information

Propensity Score Analysis with Hierarchical Data

Propensity Score Analysis with Hierarchical Data Propensity Score Analysis with Hierarchical Data Fan Li Alan Zaslavsky Mary Beth Landrum Department of Health Care Policy Harvard Medical School May 19, 2008 Introduction Population-based observational

More information

Difference-in-Differences Methods

Difference-in-Differences Methods Difference-in-Differences Methods Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 1 Introduction: A Motivating Example 2 Identification 3 Estimation and Inference 4 Diagnostics

More information

Causal inference in multilevel data structures:

Causal inference in multilevel data structures: Causal inference in multilevel data structures: Discussion of papers by Li and Imai Jennifer Hill May 19 th, 2008 Li paper Strengths Area that needs attention! With regard to propensity score strategies

More information

Potential Outcomes and Causal Inference I

Potential Outcomes and Causal Inference I Potential Outcomes and Causal Inference I Jonathan Wand Polisci 350C Stanford University May 3, 2006 Example A: Get-out-the-Vote (GOTV) Question: Is it possible to increase the likelihood of an individuals

More information

PROPENSITY SCORE MATCHING. Walter Leite

PROPENSITY SCORE MATCHING. Walter Leite PROPENSITY SCORE MATCHING Walter Leite 1 EXAMPLE Question: Does having a job that provides or subsidizes child care increate the length that working mothers breastfeed their children? Treatment: Working

More information

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i. Weighting Unconfounded Homework 2 Describe imbalance direction matters STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University

More information

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. The Basic Methodology 2. How Should We View Uncertainty in DD Settings?

More information

Combining multiple observational data sources to estimate causal eects

Combining multiple observational data sources to estimate causal eects Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,

More information

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION VICTOR CHERNOZHUKOV CHRISTIAN HANSEN MICHAEL JANSSON Abstract. We consider asymptotic and finite-sample confidence bounds in instrumental

More information

Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems

Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Functional form misspecification We may have a model that is correctly specified, in terms of including

More information