Matching to estimate the causal effects from multiple treatments

Size: px
Start display at page:

Download "Matching to estimate the causal effects from multiple treatments"

Transcription

1 Matching to estimate the causal effects from multiple treatments Michael J. Lopez, and Roee Gutman Department of Biostatistics, Box G-S121-7, 121 S. Main Street, Providence, RI michael roee Abstract: The propensity score is a common tool for estimating the causal effect of a binary treatment using observational data. In this setting, matched methods, defined as either individual matching, subclassifying, or using inverse probability weighting on the propensity score, can reduce the initial covariate bias between the treatment and control groups. With more than two treatment options, however, matched methods require additional assumptions and techniques, the implementations of which have varied across disciplines, including public health and economics. This paper blends current methods together, identifying and contrasting the treatment effects each one estimates. Further, we propose a matching technique for use with multiple, nominal categorical treatments, and use simulations to show improved covariate similarity between those in the matched sets compared both the pre-matched cohort and to other current matching strategies. In summary, this manuscript provides a central location for those looking how to notate and use causal methods for categorical treatments. (< 150 words) Keywords and phrases: causal inference, propensity score, multiple treatments, matching, observational data. 1. Introduction The primary goal of many scientific applications is to identify causes and effects. While clinical trials are the gold standard for estimating a causal relationship, trials for certain exposures may be implausible due to logistical, ethical, or financial issues. Further, clinical trials may not be as generalizable as observational studies due to the restricted population used in the trials. When assignment to treatment is not randomized, as in observational data, those that receive one level of the treatment may differ from those that receive another with respect to variables that also influence the outcome. In such settings, establishing causes and effects requires additional assumptions and more sophisticated statistical tools. Matched methods using propensity scores are popular for estimating causal effects from observational studies with binary treatment (Rosenbaum and Rubin, 1983; Stuart, 2010). Propensity score is defined as the probability of receiving treatment conditional on a set of observed covariates. When treatment assignment is not confounded by unmeasured variables, individuals with the same propensity score can be considered subjects of a randomized trial, and differences in their outcomes provide unbiased unit-level estimates of the treatment s causal effect (Rosenbaum and Rubin, 1983). As such, the propensity score can be used to replicate the results which would have occurred in a clinical trial using observational data. Generalizations and applications of propensity score methods for multiple treatments, however, remain scattered in the literature, in large part because the advanced techniques required are unfamiliar and inaccessible. Our first aim is to coalesce existing propensity score methods previously suggested for multiple treatments, with a specific focus on methods for nominal categorical ones (e.g., a comparison of pain-killers Motrin, Advil, and Tylenol). We contrast these methods required assumptions and define the causal effects they each attempt to estimate. In doing so, potential pitfalls in the commonly used practice of applying binary propensity score tools to multiple treatments are identified. Further, we explain the elevated importance of defining a common support region when studying multiple treatments, where minor differences in the execution of certain approaches can vary the causal estimands and change the study population to which inference is generalizable. We provide a technique for generating matched sets when there are more than two treatments, supplementing our algorithm with extensive simulations to demonstrate the resulting similarity in the covariates distribution between those matched. In short, we both demonstrate proper specification of causal effects for multiple treatments and how to implement a matching algorithm to estimate these effects. In the remainder of Section 1, we introduce our notation and required assumptions. Section 2 explores existing causal methods for multiple treatments, and Sections 3 proposes a new algorithm for matching with multiple treatments. Section 4 uses simulations to contrast matched approaches, while Section 5 concludes The setting Randomized designs are attractive for estimating causal effects because proper randomization ensures that the distributions of both observed and unobserved covariates among each treatment group are equivalent in expectation. Data from observational studies, meanwhile, frequently suffer from treatment selection bias, where those who receive one treatment differ from those that receive another. For example, in a study estimating the causal effects of neighborhood choice on employment, persons who move into deprived neighborhoods differ from those who move into privileged ones on a variety of characteristics which also effect the outcome, such as socioeconomic status and education levels (Hedman and Van Ham, 2012). As such, it may be difficult to distinguish between neighborhood effects and the differences between subjects which existed before they chose their neighborhoods. MJ Lopez was supported by the National Institute for Health (IMSD Grant #R25GM083270) and the National Institute on Aging (NIA grant #F31AG046056) 1

2 Lopez & Gutman/Matching with multiple treatments 2 Matched methods attempt to replicate the covariate balance which would have occurred had treatment assignment been randomized, where no systematic covariate differences between those receiving each treatment exist. This is done by selecting subsets of the study population which are balanced on measured covariates. Balance refers to the distribution of each covariate and all covariate interactions; the matched set chosen should be the one in which subjects at different treatments are as similar as possible, with differences in covariates between those at different treatments no more than the differences which would of occurred in a randomized experiment (Stuart and Rubin, 2008). By ensuring that those receiving different treatments are similar, matching works to eliminate the effects of treatment selection bias on causal estimates. A commonly used option to account for observed differences in the background variables between those at different treatment levels is regression adjustment, in which functions of these observed covariates are used in a model of outcome on treatment. Matching using propensity scores can provide several advantages over standard regression adjustment. First, regression adjustment is both biased and requires extrapolation when there is insufficient overlap between those at different treatments (Dehejia and Wahba, 1998, 2002). Further, causal effects from matched methods do not require the formulation of an outcome model, a critical advantage because model misspecification can lead to biased causal effects when using regression adjustment (Rubin, 1973a, 1979; Ho et al., 2007). Lastly, in many scenarios, matching methods should be used to supplement regression adjustment; the combination of subclassification with regression adjustment, for example, has been shown to work better than either method alone (Cochran and Rubin (1973); Rubin (1979); Lunceford and Davidian (2004); Hade and Lu (2013); Lopez and Gutman (2013)) Notation for binary treatment Our notation stems from framework of potential outcomes and the Rubin Causal Model (Rubin, 1975; Holland, 1986). Let Y i, X i, and T i T be the observed outcome, set of covariates, and binary treatment assignment, respectively, for each subject i = 1,..., N, with N<N, where N is the population size which is possibly infinite. For T = {t 1, t 2 }, let n 1 and n 2 be the number of subjects receiving treatments t 1 and t 2, respectively. The RCM defines Y i (T i = t) = Y i (t) as the potential outcome for subject i if the subject were to have received treatment t. Two estimands of interest are the population average treatment effect, P AT E t1,t 2, and the population average treatment effect among those treated, P AT T t1,t 2. P AT E t1,t 2 = E[Y (t 1 ) Y (t 2 )] (1) P AT T t1,t 2 = E[Y (t 1 ) Y (t 2 ) T = t 1 ] (2) Letting I(T i = t 1 ) be the indicator function for an individual receiving treatment t 1, P AT E t1,t 2 and P AT T t1,t 2 are generally estimated by using sample average treatment effects, SAT E t1,t 2 and SAT T t1,t 2, which can be estimated using the data. SAT E t1,t 2 = 1 N N (Y i (t 1 ) Y i (t 2 )) (3) i=1 SAT T t1,t 2 = 1 N (Y i (t 1 ) Y i (t 2 )) I(T i = t 1 ) (4) n 1 i=1 Because each individual receives only one treatment, only Y i (t 1 ) or Y i (t 2 ) is observed for each subject, an issue known as the fundamental problem of causal inference (Holland, 1986). As a result, two assumptions are needed to estimate (3) and (4) using the RCM. First, the Stable Unit Treatment Value Assumption (SU T V A) specifies no interference between subjects and no hidden treatment versions, entailing that the set of potential outcomes for each subject is independent of the treatment assignment of others. A second assumption is that of strong unconfoundedness, under which contrasting outcomes for subjects with the same X are unbiased estimates of the treatment s causal effect at that X. Specifically, strong unconfoundedness implies (i) P (T = t {Y (t 1 ), Y (t 2 )}, X) = P (T = t X) and (ii) 0<P (T = t X)<1 t T (Rosenbaum and Rubin, 1983). Weaker versions of unconfoundedness are sufficient for some estimation techniques (Imbens, 2000), and will be explored in Section As the dimensions of X increase, matching on X can be either computationally intensive or impossible, and propensity scores allow inference under the RCM even in a high dimensional setting. Let e t1,t 2 (X) = P (T = t 1 X) be the propensity score (PS), with ê t1,t 2 (X) the estimated PS, traditionally calculated using logistic or probit regression. If treatment assignment is strongly unconfounded given X, then it is possible to estimate unbiased unit-level causal effects between those at different treatment assignments with equal PS s (Rosenbaum and Rubin, 1983). Propensity scores are often used for one-to-one or one-to-many matching (generally to estimate (4)) or via inverse probability weighting or subclassification (to estimate (3)), the decision of which should lie in the hands of the investigators (Stuart, 2010) Matching with binomial treatment Before looking at a multiple treatment setting, we first differentiate how estimation of the P AT T t1,t 2 for a binomial treatment is driven by the study population which is matched. Figure 1 shows different sets of covariate overlap between those receiving t 1 and

3 Lopez & Gutman/Matching with multiple treatments 3 t 2 t 1 t 1 t 2 t 1 t 2 Scenario 1a Scenario 1b Scenario 1c Fig 1: Three scenarios of covariate overlap for binary treatment: shaded areas represent subjects included in a matched analysis t 2. Each circle in Figure 1 represents a hypothetical distribution of X among those on each treatment, allowing for an infinitesimally small number of subjects outside of it. For example, each circle could represent the 99th percentile of a two-dimensional multivariate normal distribution. In Figure 1, shaded regions correspond to the subjects receiving t 1 who were matched to a subject receiving t 2. When all subjects receiving t 1 have at least one subject to be matched to among those receiving t 2, the population matched reflects the covariates distribution of those receiving treatment t 1, making P AT T t1,t 2 generalizable to the population of those treated on t 1 (Scenario 1a). In Scenario 1b, P AT T t1,t 2 also intends to reflect the population receiving t 1. However, because subjects receiving t 1 without obvious matches are also included in the analysis, there may be non-random differences in the covariates among those matched, yielding biased causal effects. Moreover, it is impossible to approximate this estimand without making unassailable assumptions with respect to those falling outside the common support. A common step to handle these issues is to eliminate extreme propensity scores using a common support region, where those with either X or ê (t1,t 2)(X) beyond the range of X or ê (t1,t 2)(X) of those on the other treatment are excluded from the analysis phase (Dehejia and Wahba, 1998). It is also possible to reduce differences between the subjects matched by using a caliper in the matching procedure, where subjects are dropped if no eligible matches have similar ê t1,t 2 (X) (Caliendo and Kopeinig, 2008; Stuart, 2010). With either caliper matching or use of a common support region, the treatment effect generalizes to those receiving t 1 who were eligible to be treated with treatment t 2 (Figure 1, Scenario 1c). Let E 1 be an indicator for having a propensity score within the common support, such that E 1 = { 1 if e (t1,t 2)(X) e (t1,t 2)(X, T = t 1 ) e (t1,t 2)(X, T = t 2 ) 0 if otherwise Here, the specific estimand of interest changes to P AT T E1(t 1,t 2) = E[Y (t 1 ) Y (t 2 ) T = t 1, E 1 = 1], the average treatment effect among those receiving t 1 who were eligible to have received t 2, with eligibility, E 1, defined by having a propensity score within the common support region Notation for multiple treatments The choices of population subsets to which inference is generalizable grow with increasing treatment choice. Let T = {t 1, t 2,..., t Z } be the treatment support for Z total treatments, with Y i = {Y i (t 1 ), Y i (t 2 ),..., Y i (t Z )} the set of potential outcomes for subject i, and let P AT E s and P AT T s refer to the class of estimands attempting to replicate effects (1) and (2) in the multiple treatment setting. There are ( Z 2) possible P AT E s of interest, one for every pairwise comparison of treatments, such that P AT E t1,t 2 = E[Y (t 1 ) Y (t 2 )] {t 1, t 2 } T, t 1 t 2 (5) P AT E s for multiple treatments are transitive P AT E t1,t 2 + P AT E t2,t 3 = P AT E t1,t 3. To contrast two treatments among subsets of the population which received one of the Z treatments, formulation of P AT T s for multiple treatments requires extended notation. For example, let P AT T t1 t 1,t 2 be the population average treatment effect of t 1 versus t 2 among those receiving t 1. With a reference treatment group t 1, there would traditionally be Z 1 P AT T s to estimate, one for each of the treatments which the reference group did not receive (McCaffrey et al., 2013). P AT T Et1 t 1,t 2 = E[Y (t 1 ) Y (t 2 ) T = t 1 ] (6) P AT T t1 t 1,t 3 = E[Y (t 1 ) Y (t 3 ) T = t 1 ] (7)... =... P AT T t1 t 1,t Z = E[Y (t 1 ) Y (t Z ) T = t 1 ] (8)

4 Lopez & Gutman/Matching with multiple treatments 4 Conditional on receiving one specific treatment, P AT T s are transitive, such that P AT T t1 t 1,t 2 +P AT T t1 t 1,t 3 = P AT T t1 t 2,t 3, but this property generally does not extend when conditioning upon receiving a different treatment. For example, unless the covariates distribution of those receiving treatments t 1 and t 2 are identical, P AT T t1 t 1,t 2 + P AT T t2 t 2,t 3 P AT T t1 t 1,t 3. To estimate P AT E s and P AT T s across exposure pairs, assumptions are expanded as follows. First, the SUTVA expands across a subject s vector of potential outcomes. Second, a strongly ignorable treatment assignment for multiple exposures entails (i) P r[y T = t, X] = P r[y X] and (ii) 0<P r[t = t X] t T (Imai and Van Dyk, 2004) The generalized propensity score The generalized propensity score (GPS), r(t, X) = P r(t = t X = x), extends the PS from a binary to the multiple treatment setting (Imbens, 2000; Imai and Van Dyk, 2004). Unlike the PS, which allows for matching, subclassification, or weighting on a scalar, conditioning with multiple treatments often must be done on a vector of GPS s, R(X) = (r(t 1, X),..., r(t Z, X)), or a function thereof (Imai and Van Dyk, 2004). Specifically, with multiple treatments, two individuals with the same r(t, X) for treatment t may have differing R(X) s. For example, for Z = 3 and T = {t 1, t 2, t 3 }, let R(X i ), R(X j ), and R(X k ) be the GPS vectors for subjects i, j, and k, respectively, where T i = t 1, T j = t 2, and T k = t 3, with R(X i ) = (0.30, 0.60, 0.10) R(X j ) = (0.30, 0.35, 0.35) R(X k ) = (0.30, 0.10, 0.60) Even though subject s i, j, and k have the same r(t 1, X) = 0.30, because these subjects r(t 2, X) and r(t 3, X) differ, differences in outcomes between these subjects would generally not provide unbiased causal effect estimates (Imbens, 2000). In part due to this limitation, Imbens (2000) called individual matching less well-suited to multiple treatment settings. Only under the scenario of R(X i ) = R(X j ) = R(X k ) would contrasts in the outcomes of subjects i, j, and k provide unbiased unit-level estimates of the causal effects between treatments (Imbens, 2000; Imai and Van Dyk, 2004). If this conditioning on equivalent GPS vectors were possible, however, and one set were available for each subject receiving treatment t 1, these comparisons could be aggregated across matched sets to estimate P AT T s, generalizable to those receiving t 1. As a result, grouping subjects with similar GPS vectors is the primary goal in estimating causal effects from multiple treatments. GPS vectors are oten estimated using the multinomial logistic, multinomial probit, and nested logit models for nominal categorical treatments, with the proportional odds model recommended with ordinal treatments (Tchernis, Horvitz-Lennon and Normand, 2005). An important point which deserves mention is an assumption made for the multinomial logistic model, called independence from irrelevant alternatives. (IIA). IIA states that the ratio of any two predicted probabilities does not depend on changes in the remaining Z 2 treatments. For example, in a setting with two conventional and two atypical antipsychotic drugs, physicians often first choose drug type (conventional or atypical) before choosing an exact prescription (Tchernis, Horvitz-Lennon and Normand, 2005). If, for example, one of the two atypical drugs were to be taken off the market, the only treatment assignment probability which might change would be that of the atypical drug remaining on the market. In these settings, the IIA assumption would be violated, in which case a nested logit or a multinomial probit treatment assignment model, ones allowing for flexible levels of pair-wise correlations, would be preferred to a multinomial logistic one (Tchernis, Horvitz-Lennon and Normand, 2005). In the sections that follow, we explain how matching methods have been implemented for both ordinal and nominal categorical treatments. One strategy we do not explore is covariate adjustment for the GPS or a function of the GPS in a regression model (Filardo et al., 2009, 2007; Dearing, McCartney and Taylor, 2009; Spreeuwenberg et al., 2010). Such techniques are subject to the same model misspecification and extrapolation problems as standard regression adjustment, and simulations have shown that these strategies perform worse than matching, stratification, or weighting (Hade and Lu, 2013) Ordinal treatments With ordinal treatments, such as ones measured on a scale (e.g. never - sometimes - always) or received as a dose (e.g. low - medium - high), it is possible to condition on a scalar balancing score in place of a GPS vector (Joffe and Rosenbaum, 1999; Imai and Van Dyk, 2004). This is done by estimating treatment assignment as a function of X using the proportional odds model (McCullagh, 1980), such that ( ) P (Ti <t) log = θ t β T X i, t = 1,..., Z 1 (9) P (T i t) Letting β T = (β 1,..β p ) T, Imai and Van Dyk (2004) showed that differences in outcomes between subjects with different exposure levels but equal β T X scores can provide unbiased unit-level estimates of causal effects at that β T X. The balancing property of β T X can be used to match or subclassify subjects receiving different levels of an ordinal exposure. Lu et al. (2001) used non-bipartite matching to form matched sets based on a function of β T X and the relative distance between

5 Lopez & Gutman/Matching with multiple treatments 5 exposure levels. While this method does not specify an exact causal estimand, it is popular for generating an overall sense of whether or not a dose-response relationship exists between T and Y. A September, 2013 google scholar search for this manuscript yielded 115 citations (including, among others, Armstrong, Jagolinzer and Larcker (2010); Frank, Akresh and Lu (2010); Snodgrass et al. (2011)). Imai and Van Dyk (2004); Zanutto, Lu and Hornik (2005); Yanovitzky, Zanutto and Hornik (2005) and Lopez and Gutman (2013) used Equation (9) to estimate treatment assignment and subclassify subjects with similar β T X values. Subclassification is attractive with ordinal exposures because individuals with roughly equivalent β T X estimates are likely to have similar distributions of X. After subclassifying, unbiased causal effects can be estimated within each subclass, and aggregated across subclasses using a weighted average to estimate either P AT E s or P AT T s (Zanutto, Lu and Hornik, 2005). Lopez and Gutman (2013) recommended that regression adjustment be used within each subclass when possible, with results likewise combined to estimate causal effects. A final strategy for estimating the causal effects of ordinal exposures is to dichotomize the treatment using a pre-specified cutoff for use with binary propensity score methods (Chertow, Normand and McNeil, 2004; Davidson et al., 2006; Schneeweiss et al., 2007). However, this procedure may result in a loss of information, as all subjects on one side of the cutoff are treated as having the same exposure level. Royston, Altman and Sauerbrei (2006) identified a loss of power, residual confounding of the treatment assignment mechanism, and possible bias in estimates as the result of dichotimization. Moreover, dichotomization makes identification of an optimal exposure level impossible. As a result, matching or subclassifications methods which maintain all exposure levels while balancing on β T X are preferred for causal inference with ordinal exposures (Imai and Van Dyk, 2004) Nominal treatments Matching on a scalar is feasible for either a binary or ordinal treatment assignment, using the propensity score and β T X from Equation (9), respectively. Such shortcuts, however, do not exist for nominal categorical treatments, where one must condition on two or more elements of a GPS vector to estimate causal effects (Imbens, 2000; Imai and Van Dyk, 2004). Because it is difficult to calculate matched sets with nominal treatments, researchers have turned to the application of binary matching tools to handle these contrasts. Two approaches, a series of binomial comparisons (Lechner, 2001) and common referent matching (Rassen et al., 2011), are discussed next. Following a description of each, we provide a simple example of how applying a binary PS approach to multiple exposures could fail to match balanced subjects Series of binomial comparisons Lechner (2001, 2002) estimated P AT T s between multiple treatments using a series of binary comparisons (SBC). With Z treatments, SBC implements binary propensity score methods within each of the ( Z 2) pairwise population subsets. For example, a treatment effect comparing t 1 to t 2 uses only subjects receiving either t 1 or t 2, with matching done on a scalar PS. Lechner advocates matching on either e (t1,t 2)(X), estimated using logistic or probit regression, or r(t 1, X)/(r(t 1, X) + r(t 2, X)), where r(t 1, X) and r(t 2, X) are estimated using a multinomial model. Figure 2 (Scenario 2a) depicts the unique common support regions for Z = 3 when using SBC, where treatment effects reflect different subsets of the population. For example, let E 2 (t 1, t 2 ) be the indicator for having a binary propensity score for treatments t 1 and t 2 within a common support: E 2 (t 1, t 2 ) = { 1 if e (t1,t 2)(X) e (t1,t 2)(X, T = t) e (t1,t 2)(X, T = t 2 ) 0 if otherwise SBC estimates the causal effect of treatment t 1 versus treatment t 2, among those on t 1, as P AT T E2(t 1 t 1,t 2), where P AT T E2(t 1 t 1,t 2) = E[Y (t 1 ) Y (t 2 ) T = t 1, E 2 (t 1, t 2 ) = 1] (10) Each pairwise treatment effect from SBC generalizes only to subjects eligible for that specific pair of treatments, as opposed to those eligible for all treatments. Such pairwise treatment effects are not transitive. For example, P AT T E2(t 1 t 1,t 2) and P AT T E2(t 1 t 1,t 3) are unlikely to provide useful comparisons of treatments t 2 and t 3 because each effect generalizes to separate subsets of subjects who received t 1 (i.e., subjects with E 2 (t 1 t 1, t 2 ) = 1 are different from those with E 2 (t 1 t 1, t 3 )=1). This lack of transitivity jeopardizes the generalizability of SBC when pairwise comparisons are used to make inference across a population. Despite this limitation, versions of SBC have been replicated often in the fields of economics, politics, and public health (Bryson, Dorsett and Purdon, 2002; Dorsett, 2006; Levin and Alvarez, 2009; Drichoutis, Lazaridis and Nayga Jr, 2005; Kosteas, 2010), with the original papers boasting 562 (Lechner, 2001) and 316 (Lechner, 2002) google scholar citations, respectively, as of March, 2014.

6 Lopez & Gutman/Matching with multiple treatments 6 t 1 t 2 t 1 t 2 t 3 Scenario 2a t 3 Scenario 2b Fig 2: Two scenarios of eligible subjects with three treatments: shaded areas represent subjects included in a matched analysis Common referent matching Rassen et al. (2011) proposed common referent matching (CRM) to create sets with one individual from each treatment type. For T = {t 1, t 2, t 3 }, the steps of CRM are as follows. First, identify t 1 as the reference treatment, where n 1 = min {n 1, n 2, n 3 }. Among those receiving each pair of treatments, {t 1, t 2 } or {t 1, t 3 }, use logistic or probit regression to estimate e t1,t 2 (X) and e t1,t 3 (X), respectively. Next, match pairs of subjects receiving t 1 or t 2 using ê t1,t 2 (X), and likewise match pairs of subjects with overlapping propensity scores on treatment t 1 or t 3 using ê t1,t 3 (X). The authors advocate 1:1 caliper matching to calculate these sets. Lastly, these two cohorts are used to construct 1:1:1 matched triplets using the patients receiving t 1 who were matched to both a patient receiving t 2 and a patient receiving t 3, along with their associated matches. Matched pairs from treatments t 1 and t 3 are discarded if the subject receiving t 1 was not matched with a subject on treatment t 2, and likewise pairs of subjects receiving t 1 and t 2 are discarded there was no match for the referral subject to someone receiving t 3. If all subjects receiving t 1 are matched to both someone in treatment groups t 2 and in t 3, then matched pairs using CRP M would be equivalent to those produced using SBC. Let E 3 be the indicator for having two pairwise binary PS s within their respective common supports, such that E 3 = CRM estimates the following treatment effects: { 1 if E 2 (t 1, t 2 ) = 1 and E 2 (t 1, t 3 ) = 1 0 if otherwise P AT T E3(t 1 t 1,t 2) = E[Y (t 1 ) Y (t 2 ) T = t 1, E 3 = 1] P AT T E3(t 1 t 1,t 3) = E[Y (t 1 ) Y (t 3 ) T = t 1, E 3 = 1] P AT T E3(t 1 t 2,t 3) = E[Y (t 2 ) Y (t 3 ) T = t 1, E 3 = 1] A possible weakness of CRM is that there is no guarantee that subjects from treatment groups t 2 and t 3, paired to the same subject receiving t 1, are equivalent across R(x). Thus, inference using these triplets to compare treatments could be confounded by residual covariate differences between the groups Evidence against a binary PS approach The following hypothetical example illustrates possible issues with implementing standard binary PS tools, as in SBC and CRM, when there are multiple treatments. Let R(X i ), R(X j ), and R(X k ) be different sets of treatment assignment vectors for subjects i, j, and k, with T i = t 1, T j = t 2, and T k = t 3, where R(X i ) = (0.33, 0.33, 0.33) R(X j ) = (0.05, 0.05, 0.90) R(X k ) = (0.49, 0.02, 0.49) Conditional on receiving either treatments t 1 or t 2, subjects i and j are equally likely to receive t 1, P (T = t 1 X i, T i {t 1, t 2 }) = P (T = t 1 X j, T j {t 1, t 2 }) = (11) Similar results hold for subjects i and k, conditional on receiving treatment t 1 or t 3. P (T = t 1 X i, T i {t 1, t 3 }) = P (T = t 1 X k, T k {t 1, t 3 }) = 0.50 (12)

7 Lopez & Gutman/Matching with multiple treatments 7 As a result, CRM could form a triplet using subjects i, j, and k, and SBC could match subjects i with j and i with k for pairwise comparisons. However, these individuals should not be grouped together because R(X i ) R(X j ) R(X k ), and as a result, the estimation of their unit-level causal effects would be biased and inappropriate (Rubin, 1973b). While such differences could average out when aggregating across several matched sets, it would only take a few poorly matched triplets to yield biased causal effects in small samples. Moreover, for even moderately non-linear relationships between Y and X, closely matched individuals, in addition to closely matched groups, are needed in order to estimate unbiased causal effects (Rubin, 1973b) Inverse probability weighting A common approach to estimate the causal effects from multiple treatments uses the inverse probability of treatment assignment as weights (IPW) (Imbens, 2000; Feng et al., 2011; McCaffrey et al., 2013). In this setting, only the assumption of weak unconfoundedness is needed (Imbens, 2000). Let I(t) = {1 if T = t, 0 otherwise}. Weak unconfoundedness states that t T, P (I(t) = 1 Y (t), X) = P (I(t) = 1 X), and allows the estimation of average outcomes within subpopulations defined by pretreatment status (Imbens, 2000). While weak unconfoundedness is a relaxed form of strong unconfoundedness, Imbens (2000) acknowledges that the contrast between the two in substantive terms...is not very different (p. 707). Feng et al. (2011) implemented IP W to estimate P AT E s between each pair of treatments, such that to contrast t 1 and t 2, P AT E t1,t 2 = E[Y (t1 )] E[Y (t 2 )] where (13) ( N )( N ) 1 E[Y (t 1 )] = and E[Y (t 2 )] = i=1 ( N i=1 I(T i = t 1 )Y i r(t 1, X i ) I(T i = t 2 )Y i r(t 2, X i ) i=1 )( N i=1 I(T i = t 1 ) r(t 1, X i ) I(T i = t 2 ) r(t 2, X i ) To estimate weights, the ordinal logistic model (Dearing, McCartney and Taylor, 2009; Crosnoe, Augustine and Huston, 2012), the multinomial probit model (Penson et al., 2003), the multinomial logistic model (Toh, García Rodríguez and Hernán, 2012), and generalized boosted models (McCaffrey et al., 2013) have all been used. A plausible issue with IP W is that extreme weights could yield erratic causal estimates with large sample variances (Little, 1988; Kang and Schafer, 2007; Stuart and Rubin, 2008), an issue which is increasingly likely for Z>2, where treatment assignment probabilities generally decrease. As in the binary treatment setting, it is possible to trim or eliminate subjects with extreme weights. However, Kilpatrick et al. (2012) found that inclusion of high weights was necessary to maintain covariate balance in the multiple treatment setting, and that removal of extreme weights could result in bias in an unknown direction. ) Matching for multiple treatments More recently, attempts have been made to more formally match on the GPS, with the goal of grouping several subjects together with similar R(X) s, including at least one subject receiving each treatment. With Z = 3, Rassen et al. (2013) proposed withintrio matching (W ithint rio) to form triplets of subjects. The goal of W ithint rio is to optimize triplet similarities based on a subject s GPS s for treatments t 1 and treatments t 2, done using a distance function between all possible pairs of triplets via the KD-tree algorithm (Hott, Brunelle and Myers, 2012). In simulations, Rassen et al. (2013) found triplets produced using W ithint rio yielded both higher, equivalent, and lower standardized covariate bias when compared to CRM and SBC. As currently constructed, W ithint rio generates n 1 matched sets in random order, for n 1 = min {n 1, n 2, n 3 }. As a result, because all subjects receiving t 1 are matched, there is the potential to form dissimilar triplets, if, for example, all close matches to a subject who received t 1 were already taken. Moreover, P AT T s generalizable to those receiving treatment t 2 or t 3 cannot be estimated using W ithint rio. Lastly, W ithint rio cannot currently handle scenarios involving more than three treatment types, and widespread public software for this algorithm s implementation remains unavailable. Lastly, to estimate P AT E s from multiple treatments, Tu, Jiao and Koh (2012) used different clustering algorithms to bin subjects into subclasses based on their R(X) s. The authors found the K means clustering (KMC) on the logit transformation of the GPS vector, where logit ( R(X) ) ( = log( r(t1, X)/(1 r(t 1, X))),..., log( r(tz, X)/(1 r(t Z, X))), generally provided the highest within subclass covariate similarity between those receiving different treatments. Although the authors do not provide guidelines regarding which subjects should be included in generating the clusters (e.g., a common support), if all subjects were subclassified, causal effects could be estimated within each subclass and then aggregated across subclasses using a weighted average to estimate either P AT E s or P AT T s. There are no known implementations on real data of K means clustering (KMC) to estimate causal effects for a nominal exposure.

8 Lopez & Gutman/Matching with multiple treatments 8 2. Matching on a vector of generalized propensity scores Problems associated with binary PS methods, limited availability and adaptability of current multivariate strategies, and the potential for extreme weights to result in erratic causal estimates motivates the need for an improved, easy to implement algorithm which can match subjects with similar GPS vectors. We propose vector matching (V M) to form matched sets for inference under multiple, nominal treatments to estimate causal effects among subjects eligible for all all treatments Eligibility criterion Whereas SBC and CRM methods may be of interest for specific research questions, in general, these effects generalize to specific subsets of a population, making them insufficient for clinicians and policy makers, who generally are looking to compare three or more active treatments at once (Rassen et al., 2011; Hott, Brunelle and Myers, 2012). Those sufficiently eligible for all treatments simultaneously, however, are more representative of the clinical trial we are hoping to replicate, in which each subject has a realistic chance of receiving all treatments. Rassen et al. (2013) refer to this balance with respect to measured covariates and trial eligibility as clinical equipoise. In this sense, both a decision to use only subjects eligible for all treatments and this definition of eligibility represent important distinctions from pairwise algorithms. We expand the common support for a binary treatment (Dehejia and Wahba, 1998) to one for multiple treatments as follows. First, estimate R(X) using a multinomial model. For each treatment t, t T, define r(t, X) (low) and r(t, X) (high) as follows r(t, X) (low) = max(min(r(t, X T = t 1 ), min(r(t, X T = t 2 ),..., min(r(t, X T = t Z ) (14) r(t, X) (high) = min(max(r(t, X T = t 1 ), max(r(t, X T = t 2 ),..., max(r(t, X T = t Z ) (15) Because subjects with r(t, X) ( r(t, X) (low), (r(t, X) (high)) may not be equivalent to the remaining study population on X, these subjects would be discarded from V M. This exclusion criteria is used t T, after which it is recommended to re-fit the GPS model, to ensure estimated GPS s are not disproportionately impacted by those dropped (suggested for binary treatment in Imbens and Rubin (2013)). Let E 4 be the indicator for all treatment eligibility, where E 4 = { 1 if r(t, X) ( r(t, X) (low), (r(t, X) (high)) t T 0 if otherwise Our goal is to estimate P AT T s among subjects eligible for all treatments. For example, with reference group t 1, we have P AT T E4 (t 1 t 1,t 2) = E[Y (t 1 ) Y (t 2 ) T = t 1, E 4 = 1] (16) P AT T E4 (t 1 t 1,t 3) = E[Y (t 1 ) Y (t 3 ) T = t 1, E 4 = 1] (17)... =... P AT T E4 (t 1 t 1,t Z ) = E[Y (t 1 ) Y (t Z ) T = t 1, E 4 = 1] (18) Because estimands (16)-(18) are transitive, P AT T E4 (t 1 t 1,t 2) and P AT T E4 (t 1 t 1,t 3), for example, could be contrasted to compare t 2 and t 3 in the population of subjects who received t 1. An additional benefit of the eligibility criterion is a lower need for extrapolation with respect to ineligible subjects. The shaded region in Figure 2 (Scenario 2b) depicts the subset of those eligible for all three treatments Vector Matching The primary downside of applying binary PS tools to match for nominal treatments is that there is no guarantee that subjects paired together are similar across their GPS vectors. If we were able to ensure that subjects could only be matched on one GPS if they were approximately equivalent to one another on remaining GPS s, binary matching tools could prove useful with multiple treatments. One common approach for generating groups of like subjects is to place them into subclasses, where subjects in the same subclass are homogeneous with respect to background covariates. Subclassification on a binary PS, for example, can remove upwards of 90% of the bias caused by heterogeneity in the pre-matched cohort under strong unconfoundedness (Rosenbaum and Rubin, 1984). With multiple treatments and GPS s, we propose subclassifying subjects using K means clustering, and matching subjects only if they appear in the same subclass. With three treatments, for example, if subjects matched on r(t 1, X) were also similar on r(t 3, X), their GPS vectors would roughly be the same. This strategy can be implemented by exact matching within each subclass (available in, among other options, the M atching package in R statistical software (Sekhon, 2008)). By exact matching, V M ensures that subjects paired to the reference treatment are both nearly perfect matches on one GPS and roughly similar on

9 Lopez & Gutman/Matching with multiple treatments 9 other GPS scores. In this sense, with a goal of obtaining perfect matches within subsets of other important scores, exact matching is similar to blocking in a randomized experiment, which has been shown to lead to large reductions in bias for binary treatment (Stuart and Rubin, 2008). In our proposed algorithm, we also match with replacement. Compared to matching on a scalar without replacement, matching with replacement has been shown to yield lower bias in matched sets (Abadie and Imbens, 2006). Moreover, because individuals not receiving the reference treatment can be matched more than once, matching with replacement allows estimation of P AT T s generalizable to each treatment group, and not just the group with the smallest sample size. With any t T = {t 1,...t Z }, vector matching (V M) estimates (16) to (18), using t as a reference, as follows 1. Model T X using a multinomial logistic model. After dropping those outside the common support (e.g., those with E 4 = 0), re-fit the model. 2. Extract R(Xi ) for each eligible subject i, i = 1,...N E4 =1, where N E4 =1 is the number of overall subjects with E 4 = t t (a) Cluster subjects receiving all other treatments besides t and t using K-means clustering (KMC) on the logit transform of remaining estimated GPS scores. This forms K strata of subjects, with roughly equivalent Z 2 GPS scores in each k K. For example, with Z = 5, T = {t 1,..t 5 }, reference treatment t 1 and letting t = t 2, V M would use KMC on logit( r(t3, X i ), r(t4, X i ), r(t5, X i )) (b) Match those receiving t to those receiving t on logit( r(t, Xi )) within each k. We match with replacement using a caliper of In our example, this matches subjects receiving t 1 to those receiving t 2 within each of the strata produced by KMC 4. Extract sets of subjects using subjects receiving t who were matched to subjects receiving all other treatments, along with their matches receiving other treatments. Up to n t1,e 4 =1 triplets are possible using VM, where n t1,e 4 =1 is the number of subjects receiving t 1 with E 4 = 1. V M is uniquely strait-forward for Z = 3: 1. Match those receiving t 1 to those receiving t 2 on logit( r(t1, X i )) within K means strata of logit( r(t3, X i )) 2. Match those receiving t 1 to those receiving t 3 on logit( r(t1, X i )) within K means strata of logit( r(t2, X i )) 3. Extract the subjects receiving t 1 who were matched to both subjects receiving t 2 and t 3 3. Simulations The goal of matching is to replicate the covariate balance which would have occurred in a completely randomized design, and success should be measured by the similarity among those matched (Stuart and Rubin, 2008). While extensive diagnostics are generally recommended in real data analyses, with binary treatment, a common standard is to check for differences in the covariates between each treatment group among the matched set. Only after the groups are judged to be balanced on the distributions of covariates can the matched cohort be used to estimate causal effects using outcome Y (Rubin, 2001). The benefits of well matched sets are plentiful; regression adjustment, for example can remove nearly 100% of the bias in causal effect estimation when used on well matched sets (Cochran and Rubin, 1973). However, when the distributions of X are asymmetric, regression adjustment on matched sets may yield erratic estimates (Cochran and Rubin, 1973). Moreover, when differences in the mean covariates between those at different treatments are a quarter of a standard deviation apart and the true outcome model is moderately non-linear, standard regression adjustment can lead to both increased bias (from 100% to 298%) and decreased bias (from 100% to -304%), the latter of which moves the bias in the other direction (see Table 1, Rubin (2001)). As a result, our primary focus lies in the balance between those matched, as measured by similarity in X, across a variety of covariate settings, and we purposely exclude Y. Matching procedures V M, CRM, and IP W are contrasted in the sections that follow; SBC is not included because (i) as a binary matching tool, it will generate similar matches to CRM, if not the identical ones, and (ii) because SBC cannot generate matched vectors, it cannot be used to contrast three or more treatments simultaneously, as in (16) to (18) Evaluating balance of matched sets by simulation We begin by generating X for different sets of N subjects receiving one of three treatments, T {t 1, t 2, t 3 }, with n 1, n 2 = γn 1, and n 3 = γ 2 n 1 the sample size of subjects receiving treatments t 1,t 2,and t 3, respectively, where X = [X 1,..., X P ] is distributed symmetrically such that:

10 Factor Lopez & Gutman/Matching with multiple treatments 10 Levels of factor n 1 {500, 2000} γ = n 2 n 1 = n 3 n 2 {1, 2} f {t 7, Normal} b B = b τ {0, 0.25} σ 2 2 {0.5, 1, 2} σ 2 3 {0.5, 1, 2} P {3, 6} Table 1 Simulation factors 1+σ 2 2 +σ2 3 3 takes levels {0, 0.25, 0.50, 0.75, 1.00} T i = t 1, i = 1,..., n 1 (19) T i = t 2, i = n 1 + 1,..., n 1 + γn 1 (20) T i = t 3, i = n 1 + γn 1,..., n 1 + γn 1 + γ 2 n 1 (21) X i {T i = t 1 } f(µ 1, Σ 1 ), i = 1,..., n 1 (22) X i {T i = t 2 } f(µ 2, Σ 2 ), i = n 1 + 1,..., n 1 + γn 1 (23) X i {T i = t 3 } f(µ 3, Σ 3 ), i = n 1 + γn 1,..., n 1 + γn 1 + γ 2 n 1 (24) µ 1 = ((b, 0, 0),..., (b, 0, 0)) T (25) µ 2 = ((0, b, 0),..., (0, b, 0)) T (26) µ 3 = ((0, 0, b),..., (0, 0, b)) T (27) 1 τ... τ τ 1... τ Σ 1 = (28) τ τ... 1 σ 2 τ... τ τ σ 2... τ Σ 2 = (29) τ τ... σ 2 σ 3 τ... τ τ σ 3... τ Σ 3 = (30) τ τ... σ 3 The means chosen in settings were chosen to mimic simulations in bivariate treatment settings, where for increasing b, subjects are increasingly likely to receive their respective treatments (as in Rubin (1979) and Cangul et al. (2009), among others). The distribution of X is varied by eight factors (Table 1). The first two factors are design features which an investigator has some control over, while the last six, for the most part, are unknown data attributes. The distance between treated and control group means is stated in terms of standardized bias B, where B = b 1+σ 2 2 +σ2 3 3 (31) in order to evaluate the effect of bias on match quality independent of variance ratios σ 2 2 and σ 2 3. Due to the small number of eligible subjects remaining when P = 6 and n 1 = 500, these simulations are discarded, leaving 1080 simulation factors. For each simulation condition, 200 data sets are generated, and on each data set, V M (using K = 5 strata), CRM (matching on binary PS with caliper 0.05), and IP W (weights generated using multinomial logistic regression) are used to fit matched sets.

11 Lopez & Gutman/Matching with multiple treatments Simulation metrics While several metrics have been proposed for evaluating the success of matching with binary treatments (see Austin, Grootendorst and Anderson (2007); Austin (2009), for example), assessments for multiple treatments are not as well formalized (Stuart, 2010). For V M or CRM, let w i be the number of times subject i is part of a triplet, and for IP W, let w i be each subject s weight, as estimated using the inverse GPS at the treatment they received (see Section 1.6.4). Let X pt be the weighted mean of covariate p, p = 1,...P, at treatment t, such that X pt = N i=1 X pii(t i = t)w i n trip (32) For each p, our interest lies in the similarity of X p1, X p2 and X p3. With binary treatment, Rubin and Thomas (1996) and Rubin (2001) suggest that SB p12 standardized bias between X p in the treatment (t 1 ) and control groups (t 2 ) should be less than 0.25, where SB k12 = X p1 X p2 δ p1 (33) and δ p1 is the standard deviation of X p in the treatment group. With multiple treatments, McCaffrey et al. (2013) uses a bias cutoff of Z! There are 2(Z 2)! pairwise standardized biases to calculate in a multiple treatment setting, which can be computationally extensive for large Z. For Z = 3, however, as in our simulations, we only have three such biases for each p, SB p12, SB p13, and SB p23. As in Hade (2012), we extract the maximum absolute standardized pairwise bias at each covariate, Max2SB p, such that Max2SB p = max( SB p12, SB p13, SB p23 ) (34) With three treatment pairs, the maximum absolute biases reflect a worst-case scenario, representing the largest discrepancy in estimated covariate means between any two treatment groups. For all matching algorithms and at each p, δ p1, the standard deviation of X p in the full sample among those receiving reference t 1, is used for standardization, to ensure that observed differences in the similarity of those matched are easily contrasted (as in Stuart and Rubin (2008)). Also extracted is %Matched, where %Matched is the fraction of subjects who received t 1 and were eligible to receive the other two treatments which were included in the final matched set. Inclusion of this metric can provide us a sense of both how many triplets V M and CRM can form relative to those eligible for all treatments, and, moreover, how similar those matched are to the population we are interested in studying. Simulations with %Matched 1 and relatively low Max2SB p p, for example, are the optimal result, where all eligible subjects who received t 1 would have been matched, with the matched subjects receiving t 2 and t 3 roughly identical to those receiving t 1 with respect to each pairwise GPS comparison. Under these results, the sets would be well-balanced, and inference would reflect the population of those receiving t 1. There are 200 independent covariate draws from each of the 1080 simulation settings. At each draw and for each each of the matching algorithms, V M, IP W, and CRM, Max2SB p p and %Matched are obtained, with these results averaged across simulation sets at each factor (%Matched is not relevant for IP W ). We simplify Max2SB p p by averaging over p, such that Max2SB = p=1,...p Max2SB p/p Determinants of matching performance ANOVA ANOVA is performed on Max2SB and %Matched using the set of results for from CRM, IP W, and V M, across all simulation factors in Table 1, as well as all of their pairwise and three-way interactions. The purposes of ANOVA are to see which factors are relevant to V M, CRM, and IP W performance, and to identify the success of each method across all conditions. High Max2SB s will suggest poorly matched triplets, while low %M atched will infer that the matched sets may not generalize to the eligible subject cohort. Initial covariate bias B drives the majority of variation in the bias remaining after matching, accounting for roughly 85%, 70%, and 45% of the variability in Max2SB for CRM, V M, and IP W, respectively (Table 2). Bias also drove nearly 100% of the variability in %Matched for V M and IP W (not shown). Compared to V M and CRM, IP W biases are substantially driven by the degrees of freedom (df) and the variance terms σ 2 and σ 3 ; skewed and highly variable covariates, which are more likely to yield extreme weights, appear to be driving covariate dissimilarity after weighting using IP W.

12 Lopez & Gutman/Matching with multiple treatments 12 Table 2 ANOVA for V M, CRM and IP W Max2SB: most influential factors V M IP W CRM Variable MSE Variable MSE Variable MSE B B B p 3092 (df) γ 4089 γ 1063 B:(df) σ t 2271 σ t 792 σ s σ s 469 σ s 680 p B:σ t 243 τ 517 n B:γ 233 B:p 490 σ t B:σ s 147 B:σ t 182 B:σ s 5280 p 64 B:σ s 158 B:p 3838 B:n 1 48 σ s:p 142 B:n (df) 28 Fig 3: Max2SB for pre-matched cohort and by matching algorithms (left), and %Matched for V M and CRM

arxiv: v2 [stat.me] 19 Jan 2017

arxiv: v2 [stat.me] 19 Jan 2017 Estimation of causal effects with multiple treatments: a review and new ideas Michael J. Lopez, and Roee Gutman Department of Mathematics, 815 N. Broadway, Saratoga Springs, NY 12866 e-mail: mlopez1@skidmore.edu.

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke

More information

Propensity Score Methods for Causal Inference

Propensity Score Methods for Causal Inference John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good

More information

Propensity Score Weighting with Multilevel Data

Propensity Score Weighting with Multilevel Data Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative

More information

Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity. Propensity Score Analysis. Concepts and Issues. Chapter 1. Wei Pan Haiyan Bai

Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity. Propensity Score Analysis. Concepts and Issues. Chapter 1. Wei Pan Haiyan Bai Chapter 1 Propensity Score Analysis Concepts and Issues Wei Pan Haiyan Bai Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity score analysis, research using propensity score analysis

More information

Propensity Score for Causal Inference of Multiple and Multivalued Treatments

Propensity Score for Causal Inference of Multiple and Multivalued Treatments Virginia Commonwealth University VCU Scholars Compass Theses and Dissertations Graduate School 2016 Propensity Score for Causal Inference of Multiple and Multivalued Treatments Zirui Gu Virginia Commonwealth

More information

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk Causal Inference in Observational Studies with Non-Binary reatments Statistics Section, Imperial College London Joint work with Shandong Zhao and Kosuke Imai Cass Business School, October 2013 Outline

More information

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Kosuke Imai Department of Politics Princeton University November 13, 2013 So far, we have essentially assumed

More information

What s New in Econometrics. Lecture 1

What s New in Econometrics. Lecture 1 What s New in Econometrics Lecture 1 Estimation of Average Treatment Effects Under Unconfoundedness Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Potential Outcomes 3. Estimands and

More information

Covariate Balancing Propensity Score for General Treatment Regimes

Covariate Balancing Propensity Score for General Treatment Regimes Covariate Balancing Propensity Score for General Treatment Regimes Kosuke Imai Princeton University October 14, 2014 Talk at the Department of Psychiatry, Columbia University Joint work with Christian

More information

arxiv: v1 [stat.me] 15 May 2011

arxiv: v1 [stat.me] 15 May 2011 Working Paper Propensity Score Analysis with Matching Weights Liang Li, Ph.D. arxiv:1105.2917v1 [stat.me] 15 May 2011 Associate Staff of Biostatistics Department of Quantitative Health Sciences, Cleveland

More information

Propensity Score Matching

Propensity Score Matching Methods James H. Steiger Department of Psychology and Human Development Vanderbilt University Regression Modeling, 2009 Methods 1 Introduction 2 3 4 Introduction Why Match? 5 Definition Methods and In

More information

Matching. Quiz 2. Matching. Quiz 2. Exact Matching. Estimand 2/25/14

Matching. Quiz 2. Matching. Quiz 2. Exact Matching. Estimand 2/25/14 STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University Frequency 0 2 4 6 8 Quiz 2 Histogram of Quiz2 10 12 14 16 18 20 Quiz2

More information

Vector-Based Kernel Weighting: A Simple Estimator for Improving Precision and Bias of Average Treatment Effects in Multiple Treatment Settings

Vector-Based Kernel Weighting: A Simple Estimator for Improving Precision and Bias of Average Treatment Effects in Multiple Treatment Settings Vector-Based Kernel Weighting: A Simple Estimator for Improving Precision and Bias of Average Treatment Effects in Multiple Treatment Settings Jessica Lum, MA 1 Steven Pizer, PhD 1, 2 Melissa Garrido,

More information

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers: Use of Matching Methods for Causal Inference in Experimental and Observational Studies Kosuke Imai Department of Politics Princeton University April 27, 2007 Kosuke Imai (Princeton University) Matching

More information

The propensity score with continuous treatments

The propensity score with continuous treatments 7 The propensity score with continuous treatments Keisuke Hirano and Guido W. Imbens 1 7.1 Introduction Much of the work on propensity score analysis has focused on the case in which the treatment is binary.

More information

An Introduction to Causal Analysis on Observational Data using Propensity Scores

An Introduction to Causal Analysis on Observational Data using Propensity Scores An Introduction to Causal Analysis on Observational Data using Propensity Scores Margie Rosenberg*, PhD, FSA Brian Hartman**, PhD, ASA Shannon Lane* *University of Wisconsin Madison **University of Connecticut

More information

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014 Assess Assumptions and Sensitivity Analysis Fan Li March 26, 2014 Two Key Assumptions 1. Overlap: 0

More information

The problem of causality in microeconometrics.

The problem of causality in microeconometrics. The problem of causality in microeconometrics. Andrea Ichino University of Bologna and Cepr June 11, 2007 Contents 1 The Problem of Causality 1 1.1 A formal framework to think about causality....................................

More information

Gov 2002: 5. Matching

Gov 2002: 5. Matching Gov 2002: 5. Matching Matthew Blackwell October 1, 2015 Where are we? Where are we going? Discussed randomized experiments, started talking about observational data. Last week: no unmeasured confounders

More information

Potential Outcomes Model (POM)

Potential Outcomes Model (POM) Potential Outcomes Model (POM) Relationship Between Counterfactual States Causality Empirical Strategies in Labor Economics, Angrist Krueger (1999): The most challenging empirical questions in economics

More information

Estimating the Marginal Odds Ratio in Observational Studies

Estimating the Marginal Odds Ratio in Observational Studies Estimating the Marginal Odds Ratio in Observational Studies Travis Loux Christiana Drake Department of Statistics University of California, Davis June 20, 2011 Outline The Counterfactual Model Odds Ratios

More information

OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS Outcome regressions and propensity scores

OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS Outcome regressions and propensity scores OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS 776 1 15 Outcome regressions and propensity scores Outcome Regression and Propensity Scores ( 15) Outline 15.1 Outcome regression 15.2 Propensity

More information

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i. Weighting Unconfounded Homework 2 Describe imbalance direction matters STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University

More information

Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments

Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary reatments Shandong Zhao Department of Statistics, University of California, Irvine, CA 92697 shandonm@uci.edu

More information

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census

More information

Gov 2002: 4. Observational Studies and Confounding

Gov 2002: 4. Observational Studies and Confounding Gov 2002: 4. Observational Studies and Confounding Matthew Blackwell September 10, 2015 Where are we? Where are we going? Last two weeks: randomized experiments. From here on: observational studies. What

More information

Causal modelling in Medical Research

Causal modelling in Medical Research Causal modelling in Medical Research Debashis Ghosh Department of Biostatistics and Informatics, Colorado School of Public Health Biostatistics Workshop Series Goals for today Introduction to Potential

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Marginal, crude and conditional odds ratios

Marginal, crude and conditional odds ratios Marginal, crude and conditional odds ratios Denitions and estimation Travis Loux Gradute student, UC Davis Department of Statistics March 31, 2010 Parameter Denitions When measuring the eect of a binary

More information

Balancing Covariates via Propensity Score Weighting

Balancing Covariates via Propensity Score Weighting Balancing Covariates via Propensity Score Weighting Kari Lock Morgan Department of Statistics Penn State University klm47@psu.edu Stochastic Modeling and Computational Statistics Seminar October 17, 2014

More information

Combining multiple observational data sources to estimate causal eects

Combining multiple observational data sources to estimate causal eects Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,

More information

Propensity Score Analysis with Hierarchical Data

Propensity Score Analysis with Hierarchical Data Propensity Score Analysis with Hierarchical Data Fan Li Alan Zaslavsky Mary Beth Landrum Department of Health Care Policy Harvard Medical School May 19, 2008 Introduction Population-based observational

More information

The Impact of Measurement Error on Propensity Score Analysis: An Empirical Investigation of Fallible Covariates

The Impact of Measurement Error on Propensity Score Analysis: An Empirical Investigation of Fallible Covariates The Impact of Measurement Error on Propensity Score Analysis: An Empirical Investigation of Fallible Covariates Eun Sook Kim, Patricia Rodríguez de Gil, Jeffrey D. Kromrey, Rheta E. Lanehart, Aarti Bellara,

More information

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai Weighting Methods Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Weighting Methods Stat186/Gov2002 Fall 2018 1 / 13 Motivation Matching methods for improving

More information

Causal Sensitivity Analysis for Decision Trees

Causal Sensitivity Analysis for Decision Trees Causal Sensitivity Analysis for Decision Trees by Chengbo Li A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Computer

More information

Imbens/Wooldridge, IRP Lecture Notes 2, August 08 1

Imbens/Wooldridge, IRP Lecture Notes 2, August 08 1 Imbens/Wooldridge, IRP Lecture Notes 2, August 08 IRP Lectures Madison, WI, August 2008 Lecture 2, Monday, Aug 4th, 0.00-.00am Estimation of Average Treatment Effects Under Unconfoundedness, Part II. Introduction

More information

Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects

Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects Guanglei Hong University of Chicago, 5736 S. Woodlawn Ave., Chicago, IL 60637 Abstract Decomposing a total causal

More information

CompSci Understanding Data: Theory and Applications

CompSci Understanding Data: Theory and Applications CompSci 590.6 Understanding Data: Theory and Applications Lecture 17 Causality in Statistics Instructor: Sudeepa Roy Email: sudeepa@cs.duke.edu Fall 2015 1 Today s Reading Rubin Journal of the American

More information

Summary and discussion of The central role of the propensity score in observational studies for causal effects

Summary and discussion of The central role of the propensity score in observational studies for causal effects Summary and discussion of The central role of the propensity score in observational studies for causal effects Statistics Journal Club, 36-825 Jessica Chemali and Michael Vespe 1 Summary 1.1 Background

More information

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable

More information

Discussion: Can We Get More Out of Experiments?

Discussion: Can We Get More Out of Experiments? Discussion: Can We Get More Out of Experiments? Kosuke Imai Princeton University September 4, 2010 Kosuke Imai (Princeton) Discussion APSA 2010 (Washington D.C.) 1 / 9 Keele, McConnaughy, and White Question:

More information

Propensity score matching for multiple treatment levels: A CODA-based contribution

Propensity score matching for multiple treatment levels: A CODA-based contribution Propensity score matching for multiple treatment levels: A CODA-based contribution Hajime Seya *1 Graduate School of Engineering Faculty of Engineering, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe

More information

Matching with Multiple Control Groups, and Adjusting for Group Differences

Matching with Multiple Control Groups, and Adjusting for Group Differences Matching with Multiple Control Groups, and Adjusting for Group Differences Donald B. Rubin, Harvard University Elizabeth A. Stuart, Mathematica Policy Research, Inc. estuart@mathematica-mpr.com KEY WORDS:

More information

Econ 673: Microeconometrics Chapter 12: Estimating Treatment Effects. The Problem

Econ 673: Microeconometrics Chapter 12: Estimating Treatment Effects. The Problem Econ 673: Microeconometrics Chapter 12: Estimating Treatment Effects The Problem Analysts are frequently interested in measuring the impact of a treatment on individual behavior; e.g., the impact of job

More information

Causal Analysis in Social Research

Causal Analysis in Social Research Causal Analysis in Social Research Walter R Davis National Institute of Applied Statistics Research Australia University of Wollongong Frontiers in Social Statistics Metholodogy 8 February 2017 Walter

More information

Balancing Covariates via Propensity Score Weighting

Balancing Covariates via Propensity Score Weighting Balancing Covariates via Propensity Score Weighting Fan Li Kari Lock Morgan Alan M. Zaslavsky 1 ABSTRACT Covariate balance is crucial for unconfounded descriptive or causal comparisons. However, lack of

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

PROPENSITY SCORE MATCHING. Walter Leite

PROPENSITY SCORE MATCHING. Walter Leite PROPENSITY SCORE MATCHING Walter Leite 1 EXAMPLE Question: Does having a job that provides or subsidizes child care increate the length that working mothers breastfeed their children? Treatment: Working

More information

Is My Matched Dataset As-If Randomized, More, Or Less? Unifying the Design and. Analysis of Observational Studies

Is My Matched Dataset As-If Randomized, More, Or Less? Unifying the Design and. Analysis of Observational Studies Is My Matched Dataset As-If Randomized, More, Or Less? Unifying the Design and arxiv:1804.08760v1 [stat.me] 23 Apr 2018 Analysis of Observational Studies Zach Branson Department of Statistics, Harvard

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2015 Paper 334 Targeted Estimation and Inference for the Sample Average Treatment Effect Laura B. Balzer

More information

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. The Sharp RD Design 3.

More information

Causal inference in multilevel data structures:

Causal inference in multilevel data structures: Causal inference in multilevel data structures: Discussion of papers by Li and Imai Jennifer Hill May 19 th, 2008 Li paper Strengths Area that needs attention! With regard to propensity score strategies

More information

Propensity Score Methods for Estimating Causal Effects from Complex Survey Data

Propensity Score Methods for Estimating Causal Effects from Complex Survey Data Propensity Score Methods for Estimating Causal Effects from Complex Survey Data Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School

More information

Discussion of Papers on the Extensions of Propensity Score

Discussion of Papers on the Extensions of Propensity Score Discussion of Papers on the Extensions of Propensity Score Kosuke Imai Princeton University August 3, 2010 Kosuke Imai (Princeton) Generalized Propensity Score 2010 JSM (Vancouver) 1 / 11 The Theme and

More information

Optimal Blocking by Minimizing the Maximum Within-Block Distance

Optimal Blocking by Minimizing the Maximum Within-Block Distance Optimal Blocking by Minimizing the Maximum Within-Block Distance Michael J. Higgins Jasjeet Sekhon Princeton University University of California at Berkeley November 14, 2013 For the Kansas State University

More information

Observational Studies and Propensity Scores

Observational Studies and Propensity Scores Observational Studies and s STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University Makeup Class Rather than making you come

More information

Implementing Matching Estimators for. Average Treatment Effects in STATA

Implementing Matching Estimators for. Average Treatment Effects in STATA Implementing Matching Estimators for Average Treatment Effects in STATA Guido W. Imbens - Harvard University West Coast Stata Users Group meeting, Los Angeles October 26th, 2007 General Motivation Estimation

More information

Balancing Covariates via Propensity Score Weighting: The Overlap Weights

Balancing Covariates via Propensity Score Weighting: The Overlap Weights Balancing Covariates via Propensity Score Weighting: The Overlap Weights Kari Lock Morgan Department of Statistics Penn State University klm47@psu.edu PSU Methodology Center Brown Bag April 6th, 2017 Joint

More information

Conditional Standard Errors of Measurement for Performance Ratings from Ordinary Least Squares Regression

Conditional Standard Errors of Measurement for Performance Ratings from Ordinary Least Squares Regression Conditional SEMs from OLS, 1 Conditional Standard Errors of Measurement for Performance Ratings from Ordinary Least Squares Regression Mark R. Raymond and Irina Grabovsky National Board of Medical Examiners

More information

CEPA Working Paper No

CEPA Working Paper No CEPA Working Paper No. 15-06 Identification based on Difference-in-Differences Approaches with Multiple Treatments AUTHORS Hans Fricke Stanford University ABSTRACT This paper discusses identification based

More information

Measuring Social Influence Without Bias

Measuring Social Influence Without Bias Measuring Social Influence Without Bias Annie Franco Bobbie NJ Macdonald December 9, 2015 The Problem CS224W: Final Paper How well can statistical models disentangle the effects of social influence from

More information

Section 10: Inverse propensity score weighting (IPSW)

Section 10: Inverse propensity score weighting (IPSW) Section 10: Inverse propensity score weighting (IPSW) Fall 2014 1/23 Inverse Propensity Score Weighting (IPSW) Until now we discussed matching on the P-score, a different approach is to re-weight the observations

More information

Controlling for overlap in matching

Controlling for overlap in matching Working Papers No. 10/2013 (95) PAWEŁ STRAWIŃSKI Controlling for overlap in matching Warsaw 2013 Controlling for overlap in matching PAWEŁ STRAWIŃSKI Faculty of Economic Sciences, University of Warsaw

More information

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers: Use of Matching Methods for Causal Inference in Experimental and Observational Studies Kosuke Imai Department of Politics Princeton University April 13, 2009 Kosuke Imai (Princeton University) Matching

More information

Ignoring the matching variables in cohort studies - when is it valid, and why?

Ignoring the matching variables in cohort studies - when is it valid, and why? Ignoring the matching variables in cohort studies - when is it valid, and why? Arvid Sjölander Abstract In observational studies of the effect of an exposure on an outcome, the exposure-outcome association

More information

Comparison of Three Approaches to Causal Mediation Analysis. Donna L. Coffman David P. MacKinnon Yeying Zhu Debashis Ghosh

Comparison of Three Approaches to Causal Mediation Analysis. Donna L. Coffman David P. MacKinnon Yeying Zhu Debashis Ghosh Comparison of Three Approaches to Causal Mediation Analysis Donna L. Coffman David P. MacKinnon Yeying Zhu Debashis Ghosh Introduction Mediation defined using the potential outcomes framework natural effects

More information

studies, situations (like an experiment) in which a group of units is exposed to a

studies, situations (like an experiment) in which a group of units is exposed to a 1. Introduction An important problem of causal inference is how to estimate treatment effects in observational studies, situations (like an experiment) in which a group of units is exposed to a well-defined

More information

Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments

Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary reatments Shandong Zhao David A. van Dyk Kosuke Imai July 3, 2018 Abstract Propensity score methods are

More information

Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses

Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses ISQS 5349 Final Spring 2011 Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses 1. (10) What is the definition of a regression model that we have used throughout

More information

arxiv: v1 [stat.me] 8 Jun 2016

arxiv: v1 [stat.me] 8 Jun 2016 Principal Score Methods: Assumptions and Extensions Avi Feller UC Berkeley Fabrizia Mealli Università di Firenze Luke Miratrix Harvard GSE arxiv:1606.02682v1 [stat.me] 8 Jun 2016 June 9, 2016 Abstract

More information

Multiple linear regression S6

Multiple linear regression S6 Basic medical statistics for clinical and experimental research Multiple linear regression S6 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/42 Introduction Two main motivations for doing multiple

More information

6.3 How the Associational Criterion Fails

6.3 How the Associational Criterion Fails 6.3. HOW THE ASSOCIATIONAL CRITERION FAILS 271 is randomized. We recall that this probability can be calculated from a causal model M either directly, by simulating the intervention do( = x), or (if P

More information

Matching for Causal Inference Without Balance Checking

Matching for Causal Inference Without Balance Checking Matching for ausal Inference Without Balance hecking Gary King Institute for Quantitative Social Science Harvard University joint work with Stefano M. Iacus (Univ. of Milan) and Giuseppe Porro (Univ. of

More information

Extending causal inferences from a randomized trial to a target population

Extending causal inferences from a randomized trial to a target population Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh

More information

AFFINELY INVARIANT MATCHING METHODS WITH DISCRIMINANT MIXTURES OF PROPORTIONAL ELLIPSOIDALLY SYMMETRIC DISTRIBUTIONS

AFFINELY INVARIANT MATCHING METHODS WITH DISCRIMINANT MIXTURES OF PROPORTIONAL ELLIPSOIDALLY SYMMETRIC DISTRIBUTIONS Submitted to the Annals of Statistics AFFINELY INVARIANT MATCHING METHODS WITH DISCRIMINANT MIXTURES OF PROPORTIONAL ELLIPSOIDALLY SYMMETRIC DISTRIBUTIONS By Donald B. Rubin and Elizabeth A. Stuart Harvard

More information

University of Michigan School of Public Health

University of Michigan School of Public Health University of Michigan School of Public Health The University of Michigan Department of Biostatistics Working Paper Series Year 2016 Paper 121 A Weighted Instrumental Variable Estimator to Control for

More information

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 An Introduction to Causal Mediation Analysis Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 1 Causality In the applications of statistics, many central questions

More information

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007)

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007) Double Robustness Bang and Robins (2005) Kang and Schafer (2007) Set-Up Assume throughout that treatment assignment is ignorable given covariates (similar to assumption that data are missing at random

More information

A Sampling of IMPACT Research:

A Sampling of IMPACT Research: A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Statistical Models for Causal Analysis

Statistical Models for Causal Analysis Statistical Models for Causal Analysis Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 Three Modes of Statistical Inference 1. Descriptive Inference: summarizing and exploring

More information

Learning Objectives for Stat 225

Learning Objectives for Stat 225 Learning Objectives for Stat 225 08/20/12 Introduction to Probability: Get some general ideas about probability, and learn how to use sample space to compute the probability of a specific event. Set Theory:

More information

Matching Techniques. Technical Session VI. Manila, December Jed Friedman. Spanish Impact Evaluation. Fund. Region

Matching Techniques. Technical Session VI. Manila, December Jed Friedman. Spanish Impact Evaluation. Fund. Region Impact Evaluation Technical Session VI Matching Techniques Jed Friedman Manila, December 2008 Human Development Network East Asia and the Pacific Region Spanish Impact Evaluation Fund The case of random

More information

Implementing Matching Estimators for. Average Treatment Effects in STATA. Guido W. Imbens - Harvard University Stata User Group Meeting, Boston

Implementing Matching Estimators for. Average Treatment Effects in STATA. Guido W. Imbens - Harvard University Stata User Group Meeting, Boston Implementing Matching Estimators for Average Treatment Effects in STATA Guido W. Imbens - Harvard University Stata User Group Meeting, Boston July 26th, 2006 General Motivation Estimation of average effect

More information

Bayesian Inference for Sequential Treatments under Latent Sequential Ignorability. June 19, 2017

Bayesian Inference for Sequential Treatments under Latent Sequential Ignorability. June 19, 2017 Bayesian Inference for Sequential Treatments under Latent Sequential Ignorability Alessandra Mattei, Federico Ricciardi and Fabrizia Mealli Department of Statistics, Computer Science, Applications, University

More information

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes Kosuke Imai Department of Politics Princeton University July 31 2007 Kosuke Imai (Princeton University) Nonignorable

More information

Quantitative Economics for the Evaluation of the European Policy

Quantitative Economics for the Evaluation of the European Policy Quantitative Economics for the Evaluation of the European Policy Dipartimento di Economia e Management Irene Brunetti Davide Fiaschi Angela Parenti 1 25th of September, 2017 1 ireneb@ec.unipi.it, davide.fiaschi@unipi.it,

More information

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review

More information

Introduction to Statistical Inference

Introduction to Statistical Inference Introduction to Statistical Inference Kosuke Imai Princeton University January 31, 2010 Kosuke Imai (Princeton) Introduction to Statistical Inference January 31, 2010 1 / 21 What is Statistics? Statistics

More information

The Role of the Propensity Score in Observational Studies with Complex Data Structures

The Role of the Propensity Score in Observational Studies with Complex Data Structures The Role of the Propensity Score in Observational Studies with Complex Data Structures Fabrizia Mealli mealli@disia.unifi.it Department of Statistics, Computer Science, Applications University of Florence

More information

Causal Inference with Big Data Sets

Causal Inference with Big Data Sets Causal Inference with Big Data Sets Marcelo Coca Perraillon University of Colorado AMC November 2016 1 / 1 Outlone Outline Big data Causal inference in economics and statistics Regression discontinuity

More information

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS Donald B. Rubin Harvard University 1 Oxford Street, 7th Floor Cambridge, MA 02138 USA Tel: 617-495-5496; Fax: 617-496-8057 email: rubin@stat.harvard.edu

More information

Causal Inference Lecture Notes: Selection Bias in Observational Studies

Causal Inference Lecture Notes: Selection Bias in Observational Studies Causal Inference Lecture Notes: Selection Bias in Observational Studies Kosuke Imai Department of Politics Princeton University April 7, 2008 So far, we have studied how to analyze randomized experiments.

More information

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015 Introduction to causal identification Nidhiya Menon IGC Summer School, New Delhi, July 2015 Outline 1. Micro-empirical methods 2. Rubin causal model 3. More on Instrumental Variables (IV) Estimating causal

More information

Logistic Regression: Regression with a Binary Dependent Variable

Logistic Regression: Regression with a Binary Dependent Variable Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression

More information

Propensity score modelling in observational studies using dimension reduction methods

Propensity score modelling in observational studies using dimension reduction methods University of Colorado, Denver From the SelectedWorks of Debashis Ghosh 2011 Propensity score modelling in observational studies using dimension reduction methods Debashis Ghosh, Penn State University

More information

A Theory of Statistical Inference for Matching Methods in Causal Research

A Theory of Statistical Inference for Matching Methods in Causal Research A Theory of Statistical Inference for Matching Methods in Causal Research Stefano M. Iacus Gary King Giuseppe Porro October 4, 2017 Abstract Researchers who generate data often optimize efficiency and

More information

A Theory of Statistical Inference for Matching Methods in Applied Causal Research

A Theory of Statistical Inference for Matching Methods in Applied Causal Research A Theory of Statistical Inference for Matching Methods in Applied Causal Research Stefano M. Iacus Gary King Giuseppe Porro April 16, 2015 Abstract Matching methods for causal inference have become a popular

More information

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3 University of California, Irvine 2017-2018 1 Statistics (STATS) Courses STATS 5. Seminar in Data Science. 1 Unit. An introduction to the field of Data Science; intended for entering freshman and transfers.

More information