Causal modelling in Medical Research

Size: px
Start display at page:

Download "Causal modelling in Medical Research"

Transcription

1 Causal modelling in Medical Research Debashis Ghosh Department of Biostatistics and Informatics, Colorado School of Public Health Biostatistics Workshop Series

2 Goals for today Introduction to Potential outcomes framework Exposure to the causal inference analytic pipeline

3 Causal Inference Asks the question of how a person-specific outcome changes under different scenarios To understand this pop culturally, and

4 Causal Inference The fundamental notion: counterfactual Also called potential outcome

5 Father of the Counterfactual

6 Hume s definition of causal effect We may define a cause to be an object followed by another, and where all the objects, similar to the first, are followed by objects similar to the second. Or, in other words, where, if the first object had not been, the second never had existed. (from Section 7 of An Enquiry concerning Human Understanding, 1748)

7 Causality is a topic researched in many fields Philosophy Computer Science Economics Sociology Education Statistics

8 Motivating example

9 Motivating example (cont d.) Question asked by investigators: Is exposure to a statin in the early perioperative period associated with reduced postoperative complications after noncardiac surgery? The causal effect studied in the paper: causal relative risk in 30-day mortality associated with use of statin on the day of or the day after surgery Question: Ideal way to estimate this effect?

10 Motivating example (cont d.) Here, the investigators used data from an observational database on 180,478 patients undergoing noncardiac surgery who were admitted within 7 days of surgery and sampled by the Veterans Affairs Surgical Quality Improvement Program (VASQIP) Investigators used propensity scores for modelling the treatment effect and matching to estimate causal effects

11 A recipe for causal inference 1 Define the exposure 2 Define the causal parameter of interest 3 Model the assignment mechanism using propensity scores 4 Estimate the causal effect conditioning on the propensity score in some way 5 Get a variance estimate 6 Perform sensitivity analyses for unobserved confounding

12 A recipe for causal inference 1 Define the exposure 2 Define the causal parameter of interest 3 Model the assignment mechanism using propensity scores 4 Estimate the causal effect conditioning on the propensity score in some way 5 Get a variance estimate 6 Perform sensitivity analyses for unobserved confounding

13 Causal effects Let Y be the outcome in the movie Back to example from The Matrix Then one causal effect: Y(Neo takes blue pill) - Y(Neo takes red pill) From Sliding Doors, a causal effect is Y(Helen catches subway) - Y(Helen doesn t catch subway) Why are these causal effects?

14 Causal effect definition from potential outcomes Comparison of potential outcomes within the same individuals Requires envisioning a Y for a given person under each scenario The observed data will be Y under only one scenario Fundamental Problem of Causal Inference

15 Causal effects (cont d.) Suppose a person can get a treatment (T = 1) or control (T = 0) Then the data for causal inference can be visualized as Y (0) Y (1) T? y 1 1 y 2? 0? y 3 1? y y n? 0 The question marks define a missing data mechanism If we can fill in question marks with y-values, then we calculate causal effects.

16 Potential Outcomes Setup: let T denote a binary treatment that takes values 0 and 1 1 Let {Y (0), Y (1)} denote the potential outcomes for an individual 2 If we want to have a probability model, then we need to formulate a bivariate distribution 3 This is two-dimensional, in contrast to a one-dimensional distribution

17 Terminology Causality is tied to an action (also called manipulation, treatment or intervention) Determine a list of possible actions (equivalent to T so that we have a list of two actions) The action is applied to a unit The presumption is that at any point in time a unit could have been subject to either action (potential outcomes) We associate each unit-action pair with a potential outcome

18 Ideal data The full data for causal inference can be visualized as Unit # Y (0) Y (1) 1 y 10 y 11 2 y 20 y 21 3 y 30 y 31 4 y 40 y n y n0 y n1

19 Treatments Lots of controversy as to how to define this Examples: race, gender Rubin/Holland: No intervention without manipulation Think carefully about mechanisms that would lead to manipulation

20 Causal effects By definition, these are comparisons of potential outcomes made within a unit Regression models compare subpopulations and thus do not enjoy causal interpretations without further assumptions Example Correlation, not causation E(SBP) = β 0 + β 1 I (Age > 45)

21 One causal effect definition Unit # Y (0) Y (1) Y (1) Y (0) 1 y 10 y 11 y 11 y 10 2 y 20 y 21 y 21 y 20 3 y 30 y 31 y 31 y 30 4 y 40 y 41 y 41 y n y n0 y n1 y n1 y n0

22 Another causal effect definition Unit # Y (0) Y (1) Y (1)/Y (0) 1 y 10 y 11 y 11 /y 10 2 y 20 y 21 y 21 /y 20 3 y 30 y 31 y 31 /y 30 4 y 40 y 41 y 41 /y n y n0 y n1 y n1 /y n0

23 Causal effects What were shown on the previous page were individual-level causal effects A natural thing to do is average the unit-level causal effects across units What assumption are we making when we do this???

24 Subgroup-specific causal effects We can also define causal effects conditional on subgroups The subgroups could be based on observed covariates or even on potential outcomes What assumption are we making when we do this???

25 Causal inference assumptions Strongly ignorable treatment assumption Stable unit treatment value assumption Treatment Positivity assumption Consistency Assumption

26 Causal inference assumptions Strongly ignorable treatment assumption Math: {Y (0), Y (1)} T X where X denotes a vector of confounders In words: there exist a rich set of confounders such that the potential outcomes can be effectively treated as baseline variables.

27 SITA implication Data visualization Y (0) Y (1) T X 1 X p? y 11 1 x 11 x 1p y 20? 0 x 21 x 2p? y 31 1 x 31 x 3p? y 41 1 x 41 x 4p y n0? 0 x n1 x np SITA also means that we can use x s to fill in question marks

28 SUTVA Data visualization How does SUTVA show up here?? Y (0) Y (1) T X 1 X p? y 11 1 x 11 x 1p y 20? 0 x 21 x 2p? y 31 1 x 31 x 3p? y 41 1 x 41 x 4p y n0? 0 x n1 x np

29 Treatment Positivity Assumption This means that 0 < P(T = 1 X) < 1 for all subjects and for all X values This is needed to have well-defined counterfactuals for defining causal effects When is this violated???

30 How does violating SUTVA affect Y (0) Y (1) T X 1 X p? y 11 1 x 11 x 1p y 20? 0 x 21 x 2p? y 31 1 x 31 x 3p? y 41 1 x 41 x 4p y n0? 0 x n1 x np

31 Consistency Assumption We formulate Y (0) and Y (1) for everybody, but we only observe one of these. Consistency Assumption means that the potential outcome and observed outcome are the same.

32 Assumptions There are a lot of assumptions needed to identify causal effects from observational data These assumptions cannot be tested empirically

33 London et al., 2017, JAMA Internal Medicine Treatment: T = exposure to statin on day of or day after surgery (1 if received, 0 if not) Outcome: Y = death within 30 days Causal parameter: Y (1)/Y (0)

34 A recipe for causal inference 1 Define the exposure 2 Define the causal parameter of interest 3 Model the assignment mechanism using propensity scores 4 Estimate the causal effect conditioning on the propensity score in some way 5 Get a variance estimate 6 Perform sensitivity analyses for unobserved confounding

35 Assignment Mechanism We have shown how to conceptualize data for causal inference problems Let Yi obs Then denote the observed response for subject i Y obs i = T i Y i (1) + (1 T i )Y i (0) If T were the result of a coin flip, then P(T = 1) = P(T = 0) = 1/2 This is what a randomized clinical trial does! This is not true for an observational study

36 Assignment Mechanism (cont d.) Definition: An assignment mechanism refers to how each unit came to receive the treatment level actually received The propensity score is one attempt at modelling the assignment mechanism. It is defined as e(x) = P(T = 1 X) Conditional on propensity score, one achieves covariate balance on observed covariates

37 Potential Outcomes (cont d.) Use of propensity score leads to a three-step approach 1 Estimate propensity score 2 Check for balance and adjust accordingly 3 Based on estimated propensity score, estimate the average causal effect

38 Variable selection for propensity score Recommendation is to include all possible treatment confounders and interactions and quadratics by Rubin One variable to avoid: instruments Intuition: want to include as many variables that are strongly predictive of the potential outcomes We will need to check for balance

39 Balance Density Figure: Distribution of covariate X for treatment and control groups. The blue line denotes the kernel density estimation for X in the T = 1 group, while the magenta line represents the kernel density estimate for X in the T = 0 group. The bars represent the histogram of X regardless of the treatment groups.

40 Recipe for causal analysis (Stuart, 2010, Stat Sci) 1 Decide on covariates for which balance must be achieved 2 Estimate the distance measure (e.g., propensity score) 3 Condition on the distance measure (e.g., using matching, weighting, or subclassification) 4 Assess balance on the covariates of interest; if poor, repeat steps Estimate the treatment effect in the conditioned sample

41 Covariate balance Let T {0, 1} denote a binary treatment Let X denote confounders Ideally, we have that X T = 1 and X T = 0 are equal in distribution If this holds in original dataset, the dataset looks like what you might find in a randomized study

42 Covariate balance (cont d.) Typical ways to assess: report standardized mean difference, Kolmogorov-Smirnov statistics, or t-tests Covariate balance corresponds to smaller values or less significant p-values

43 Covariate balance stats within a matched dataset Start with n observations Assume n d observations get discarded, there remains n n d observations Let M denote the indices of the observations in the matched dataset

44 Covariate balance stats within a matched dataset (cont d.) Comparison of means of covariate k n i=1 I (i M)T ix ik n i=1 I (i M)T i n i=1 I (i M)(1 T i)x ik n i=1 I (i M)(1 T i) k = 1,..., p Can standardize by the variance in the controls Rule of thumb: Want the standardized difference to be < 0.25

45 London et al., 2017, JAMA Internal Medicine Propensity score-matching (1:1) was conducted by matching pairs of patients with statin exposure and no exposure with a greedy matching algorithm and a caliper width of 0.2 SD of the log odds of the estimated propensity score.covariate balance between matched pairs for continuous and dichotomous categorical variables was assessed using the standardized difference, with values equal to or less than 10% indicating minimal imbalance.

46 A recipe for causal inference 1 Define the exposure 2 Define the causal parameter of interest 3 Model the assignment mechanism using propensity scores 4 Estimate the causal effect conditioning on the propensity score in some way 5 Get a variance estimate 6 Perform sensitivity analyses for unobserved confounding

47 Inverse weighting estimators Use the observation that under SITA and treatment positivity [ ] Ti Y i E = E[Y i (1)] e(x i ) and E [ ] (1 Ti )Y i = E[Y i (0)] 1 e(x i )

48 Inverse weighting estimators (cont d.) The inverse-weighted estimator of average causal effect is given by ˆ ACE = n 1 n i=1 T i Y i ê(x i ) (1 T i)y i 1 ê(x i ) These estimators can be sensitive to the presence of extreme weights, i.e, ê(x i ) is close to zero or one

49 One hybrid approach: double-robust estimation Combine inverse weighting and regression adjustment The double-robust estimator is given by n ACE ˆ DR = n 1 T i Y i ê(x i ) T i ê(x i ) ˆm 1 (X i ) ê(x i ) n 1 i=1 n i=1 (1 T i )Y i 1 ê(x i ) + T i ê(x i ) 1 ê(x i ) ˆm 0(X i ), where ˆm 1 (X) and ˆm 0 (X) are the estimated functions for E(Y T = 1, X) and E(Y T = 0, X), respectively Asymptotic theory can give variance (or bootstrap)

50 Double-robust estimation Very popular in statistical research Note that ˆm t (X) (t = 0, 1) are the fitted values from a regression model of Y on X in the T = 0 and T = 1 subgroups ˆm t are used in the regression estimators for causal effects Pro: Double-robust estimator can be consistent if either propensity or outcome regression is correctly specifiied. Con:Double-robust estimator can perform poorly if both are misspecified.

51 Matching estimators Based on the estimated propensity score, for each subject with T = 1, try to find one (or more) subjects with similar propensity score with T = 0 and compare their responses; call this the matched set for subject i Estimate average causal effect as n 1 t (Y i (m i ) 1 i:t i =1 j M i :T j =0 Y j ), where M i is the matched set for subject i, and n t is the number of subjects with T = 1 A more robust approach is to coarsen the propensity score using coarsening (subclassification)

52 Matching Matching to balance the covariate distribution Goal: To make the treated and control subjects look alike before treatment Equivalently, to produce a study regime which resembles a completely randomized design, in terms of the observed covariates

53 Matching (cont d.) Need a notion of distance Typically idea: define distance between covariates, propensity scores or function thereof Example: ê(x) log 1 ê(x) For discrete covariates, could use exact matching, which corresponds to the Hamming distance

54 Matching (cont d.) Nearest Neighbor algorithm 1 Iteratively find the pair of subjects with the shortest distance between cases and controls 2 Easy to understand and implement; 3 Offers good results in practice; 4 Fast running time 5 Rarely offers the best matching results (compared to optimal matching)

55 Matching (cont d.) Phrased often as a bipartite matching problem Conceptually, have a graph with two sets of vertices: one set represents controls and one set represents treated Algorithm needs to find edges from one set to the other, where the presence of an edge denotes a matched set

56 Other matching schemes Bipartite matching: 1 Pair matching: used when the numbers of the treated and control are comparable 2 1:K matching: used when control group is huge compared to the treated 3 Variable matching: more flexible than 1:K, the matched control is set to be between a and b for each treated subject 4 Optimal matching: Phrase the bipartite algorithm as a minimizing a so-called max-flow/min-cut problem in graph theory

57 Matching options Mahalanobis metric matching: d M (x, x ) = (x x ) T ( n c ˆΣ c + n t ˆΣ t n c + n t ) 1 (x x ), where n c and n t are the number of observations in the T = 0 and T = 1 groups, and ˆΣ c and ˆΣ t are the estimated variance-covariance covariance matrices for the covariates/confounders Propensity score matching: Exact matching ( d l (x, x ê(x) ) = log 1 ê(x) log ê(x ) 1 ê(x ) Combine these into a hybrid ) 2

58 Matching options cont d. Sometimes will exclude matches of poor quality (relevance to causal effect estimation) For continuous covariates, might do what is known as caliper matching: define a match as having where d cal is a preset number d(x, x ) d cal,

59 Matching estimators: variance No consensus Some options: 1 Adjust for matching status as a fixed effect 2 Adjust for matching status as a random effect 3 Asymptotic theory (Abadie and Imbens, 2006) 4 Modified bootstrap

60 Subclassification Idea: categorize propensity score into K strata and to estimate causal effect within each stratum separately If there exists no stratum interaction with the causal effect, then can combine stratum-specific causal effect estimators to get an overall average causal effect

61 Subclassification: # of strata Pick a certain number for K Cochran (1965): set K = 5 A more data-driven choice of K given in the book (Chapter 13): 1 Start with K = 1 2 Do a pooled two-sample t-test to test for a difference in propensity score tests 3 If there is a significant difference, split up into more strata, subject to a mininum sample size for each group (treated and control) in the strata 4 Iterate over the 3 steps, stopping if there is nonsignificance

62 Subclassification: computation Given strata of the form where b 0 = 0 and b K = 1 (b j 1, b j ], j = 1,..., K, Define B ij (i = 1,..., n; j = 1,..., K) by { 1 ifb j 1 ê(x i ) b j, B ij = 0 otherwise Define for i = 1,..., n; j = 1,..., K n 1j = B ij, n 0j = B ij, i:t i =1 i:t i =0 n j = N 1j + N 0j

63 Subclassification: computation (cont d.) Then average causal effect in strata j is ˆτ j = N 1 1j B ij Y i N 1 0j i:t i =1 i:t i =0 B ij Y i. The subclassification or blocking average causal effect is ˆτ block = K j=1 n j n ˆτ j

64 London et al., 2017, JAMA Internal Medicine After creating a matched pair dataset using propensity scores, the authors estimated effects using matched pair methods Propensity score used to match observations Reduction in size (from >180,000 to 96,486)

65 A recipe for causal inference 1 Define the exposure 2 Define the causal parameter of interest 3 Model the assignment mechanism using propensity scores 4 Estimate the causal effect conditioning on the propensity score in some way 5 Get a variance estimate 6 Perform sensitivity analyses for unobserved confounding

66 Inference for surveys (finite-sample) versus superpopulation Recall earlier data structure: Y (0) Y (1) T X 1 X p? y 11 1 x 11 x 1p y 20? 0 x 21 x 2p? y 31 1 x 31 x 3p? y 41 1 x 41 x 4p y n0? 0 x n1 x np In this setup, the question marks occur randomly, but other than that, there is no randomness This is the finite-sample setup.

67 Finite-sample inferences Based on randomization distribution of T (i.e., shuffle the label of T ) Use estimated standard deviations conditional on propensity score

68 From finite-sample to superpopulation Everything discussed so far has been a finite-sample mode of inference Generalize to the superpopulation Idea now is that the observed units are a random sample from a superpopulation Recall τ fs = n 1 n i=1 {Y i(1) Y i (0)} Expectations are now taken with respect to the superpopulation sampling mechanism, and not T

69 Superpopulation inferences Extensive derivations can be found in the Appendix of Chapter 6 of Imbens and Rubin (2015) Key results: E(ˆτ dif ) = E(Y (1) Y (0)) Var(ˆτ dif ) = σ2 c n c + σ2 t n t, where σ 2 c is the variance of Y (0) with respect to the superpopulation sampling, and σ 2 t is the variance of Y (1) with respect to the superpopulation sampling

70 Other variance formulae Subclassification: Assume J strata based on the propensity score Use least squares of Y on T and X within each subclass Coefficient for T is ˆτ j and get ˆσ j 2 from sum of squares of residuals Assuming no stratum by treatment interaction, ˆτ s = J j=1 n c (j) + n t (j) ˆτ(j). n Then the estimated variance is given by V (ˆτ s ) = J ( ) nc (j) + n t (j) 2 ˆσ j 2. n j=1

71 To bootstrap or not to bootstrap? The form ˆτ = ˆτ(Y, T, X) suggests that one could do a bootstrap Algorithm 1 Resample bootstrapped dataset from observed data (Y i, T i, X i ), i = 1,..., n 2 Compute τ b = ˆτ(Y b, T b, X b ) for bth bootstrapped dataset 3 Repeat steps 1) and 2) B times 4 Use empirical distribution of τ 1,..., τ B to compute standard errors and confidence intervals

72 To bootstrap or not to bootstrap? (cont d.) Might be problematic for matching estimators Example of a nonregular problem, so the bootstrap might fail For inverse weighting problems, it works fine. Counterpoint arguing for the bootstrap: Samworth (2003, Biometrika)

73 London et al., 2017, JAMA Internal Medicine After creating a matched pair dataset using propensity scores, the authors estimated variances of effects using matched pair methods Random effects models are superpopulation in spirit Conditional methods are finite-sample in spirit.

74 A recipe for causal inference 1 Define the exposure 2 Define the causal parameter of interest 3 Model the assignment mechanism using propensity scores 4 Estimate the causal effect conditioning on the propensity score in some way 5 Get a variance estimate 6 Perform sensitivity analyses for unobserved confounding

75 Sensitivity Analysis: basic setup Assume that T is not conditionally independent of {Y (0), Y (1)} given X (i.e., SITA violated) Assume that T {Y (0), Y (1)} X, U, where U has a random variable Treat U as a synthetic covariate (i.e., simulated and under the control of the user. Write down a model for Y X, U and T X, U Let the coefficient for U vary and estimate the coefficients for X in the two models. Hard to implement, as it requires a grid search

76 Sensitivity Analysis: simplification One simplification is due to Lin et al. (1998, Biometrics) Idea: use model misspecification theory Recall that we have invoked consistency for parameter estimates: ˆθ p θ, or in words theta hat (estimate) converges in probability to θ (true value of the parameter) This assumes that the model for the data is correctly specified What happens when this is not true?

77 Lin et al. (1998) When model being fit to data is false ˆθ p θ, where θ is called the least false parameter θ minimizes the Kullback-Leibler divergence:

78 Lin et al. (1998, cont d.) Cast the violation of SITA as one of fitting the incorrect model: True model: Y depends on T, X and U False model: Y depends on T and X This approach allows for some simple approximate sensitivity analysis formulae

79 London et al., 2017, JAMA Internal Medicine Authors performed several sensitivity analyses Interestingly, nothing for unmeasured confounding

80 Summary Causal inference aims to use observational data in order to mimic a randomized experimental design using observed covariates The propensity score plays the role of the coin flip (probability based on observed covariates) We have given a six-step pipeline for analysis Note that many unverifiable assumptions are needed for performing causal inference, and thus, sensitivity analysis step is a key.

81 References Abadie, A. and Imbens, G. W. (2006). Large sample properties of matching estimators for average treatment effects. Econometrica 74, Imbens and Rubin. (2015). Causal Inference for Statistics, Social and Biomedical Sciences Cambridge: Cambridge University Press. Lin, D. Y., Psaty, B. M., and Kronmal, R. A. (1998). Assessing the Sensitivity of Regression Results to Unmeasured Confounders in Observational Studies. Biometrics, 54(3), London, M. J., Schwartz, G. G., Hur, K. and Henderson, W. G. Association of Perioperative Statin Use With Mortality and Morbidity After Major Noncardiac Surgery. JAMA Intern Med Feb 1;177(2): Stuart, E. A. (2010). Matching methods for causal inference: a review and a look forward. Statistical Science 25,1 21.

Variable selection and machine learning methods in causal inference

Variable selection and machine learning methods in causal inference Variable selection and machine learning methods in causal inference Debashis Ghosh Department of Biostatistics and Informatics Colorado School of Public Health Joint work with Yeying Zhu, University of

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke

More information

Gov 2002: 4. Observational Studies and Confounding

Gov 2002: 4. Observational Studies and Confounding Gov 2002: 4. Observational Studies and Confounding Matthew Blackwell September 10, 2015 Where are we? Where are we going? Last two weeks: randomized experiments. From here on: observational studies. What

More information

Propensity Score Matching

Propensity Score Matching Methods James H. Steiger Department of Psychology and Human Development Vanderbilt University Regression Modeling, 2009 Methods 1 Introduction 2 3 4 Introduction Why Match? 5 Definition Methods and In

More information

An Introduction to Causal Analysis on Observational Data using Propensity Scores

An Introduction to Causal Analysis on Observational Data using Propensity Scores An Introduction to Causal Analysis on Observational Data using Propensity Scores Margie Rosenberg*, PhD, FSA Brian Hartman**, PhD, ASA Shannon Lane* *University of Wisconsin Madison **University of Connecticut

More information

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census

More information

arxiv: v1 [stat.me] 15 May 2011

arxiv: v1 [stat.me] 15 May 2011 Working Paper Propensity Score Analysis with Matching Weights Liang Li, Ph.D. arxiv:1105.2917v1 [stat.me] 15 May 2011 Associate Staff of Biostatistics Department of Quantitative Health Sciences, Cleveland

More information

Dynamics in Social Networks and Causality

Dynamics in Social Networks and Causality Web Science & Technologies University of Koblenz Landau, Germany Dynamics in Social Networks and Causality JProf. Dr. University Koblenz Landau GESIS Leibniz Institute for the Social Sciences Last Time:

More information

Gov 2002: 5. Matching

Gov 2002: 5. Matching Gov 2002: 5. Matching Matthew Blackwell October 1, 2015 Where are we? Where are we going? Discussed randomized experiments, started talking about observational data. Last week: no unmeasured confounders

More information

Causal Sensitivity Analysis for Decision Trees

Causal Sensitivity Analysis for Decision Trees Causal Sensitivity Analysis for Decision Trees by Chengbo Li A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Computer

More information

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Kosuke Imai Department of Politics Princeton University November 13, 2013 So far, we have essentially assumed

More information

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 An Introduction to Causal Mediation Analysis Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 1 Causality In the applications of statistics, many central questions

More information

What s New in Econometrics. Lecture 1

What s New in Econometrics. Lecture 1 What s New in Econometrics Lecture 1 Estimation of Average Treatment Effects Under Unconfoundedness Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Potential Outcomes 3. Estimands and

More information

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis. Estimating the Mean Response of Treatment Duration Regimes in an Observational Study Anastasios A. Tsiatis http://www.stat.ncsu.edu/ tsiatis/ Introduction to Dynamic Treatment Regimes 1 Outline Description

More information

SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS

SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS TOMMASO NANNICINI universidad carlos iii de madrid UK Stata Users Group Meeting London, September 10, 2007 CONTENT Presentation of a Stata

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers: Use of Matching Methods for Causal Inference in Experimental and Observational Studies Kosuke Imai Department of Politics Princeton University April 27, 2007 Kosuke Imai (Princeton University) Matching

More information

Matching. Quiz 2. Matching. Quiz 2. Exact Matching. Estimand 2/25/14

Matching. Quiz 2. Matching. Quiz 2. Exact Matching. Estimand 2/25/14 STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University Frequency 0 2 4 6 8 Quiz 2 Histogram of Quiz2 10 12 14 16 18 20 Quiz2

More information

Propensity Score Methods for Causal Inference

Propensity Score Methods for Causal Inference John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good

More information

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers: Use of Matching Methods for Causal Inference in Experimental and Observational Studies Kosuke Imai Department of Politics Princeton University April 13, 2009 Kosuke Imai (Princeton University) Matching

More information

CompSci Understanding Data: Theory and Applications

CompSci Understanding Data: Theory and Applications CompSci 590.6 Understanding Data: Theory and Applications Lecture 17 Causality in Statistics Instructor: Sudeepa Roy Email: sudeepa@cs.duke.edu Fall 2015 1 Today s Reading Rubin Journal of the American

More information

Summary and discussion of The central role of the propensity score in observational studies for causal effects

Summary and discussion of The central role of the propensity score in observational studies for causal effects Summary and discussion of The central role of the propensity score in observational studies for causal effects Statistics Journal Club, 36-825 Jessica Chemali and Michael Vespe 1 Summary 1.1 Background

More information

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Overview In observational and experimental studies, the goal may be to estimate the effect

More information

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk Causal Inference in Observational Studies with Non-Binary reatments Statistics Section, Imperial College London Joint work with Shandong Zhao and Kosuke Imai Cass Business School, October 2013 Outline

More information

Job Training Partnership Act (JTPA)

Job Training Partnership Act (JTPA) Causal inference Part I.b: randomized experiments, matching and regression (this lecture starts with other slides on randomized experiments) Frank Venmans Example of a randomized experiment: Job Training

More information

Measuring Social Influence Without Bias

Measuring Social Influence Without Bias Measuring Social Influence Without Bias Annie Franco Bobbie NJ Macdonald December 9, 2015 The Problem CS224W: Final Paper How well can statistical models disentangle the effects of social influence from

More information

Stratified Randomized Experiments

Stratified Randomized Experiments Stratified Randomized Experiments Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Stratified Randomized Experiments Stat186/Gov2002 Fall 2018 1 / 13 Blocking

More information

Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1

Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1 Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1 Gary King GaryKing.org April 13, 2014 1 c Copyright 2014 Gary King, All Rights Reserved. Gary King ()

More information

Sensitivity analysis and distributional assumptions

Sensitivity analysis and distributional assumptions Sensitivity analysis and distributional assumptions Tyler J. VanderWeele Department of Health Studies, University of Chicago 5841 South Maryland Avenue, MC 2007, Chicago, IL 60637, USA vanderweele@uchicago.edu

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Combining multiple observational data sources to estimate causal eects

Combining multiple observational data sources to estimate causal eects Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,

More information

Propensity Score for Causal Inference of Multiple and Multivalued Treatments

Propensity Score for Causal Inference of Multiple and Multivalued Treatments Virginia Commonwealth University VCU Scholars Compass Theses and Dissertations Graduate School 2016 Propensity Score for Causal Inference of Multiple and Multivalued Treatments Zirui Gu Virginia Commonwealth

More information

Extending causal inferences from a randomized trial to a target population

Extending causal inferences from a randomized trial to a target population Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh

More information

Propensity Score Methods for Estimating Causal Effects from Complex Survey Data

Propensity Score Methods for Estimating Causal Effects from Complex Survey Data Propensity Score Methods for Estimating Causal Effects from Complex Survey Data Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School

More information

Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity. Propensity Score Analysis. Concepts and Issues. Chapter 1. Wei Pan Haiyan Bai

Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity. Propensity Score Analysis. Concepts and Issues. Chapter 1. Wei Pan Haiyan Bai Chapter 1 Propensity Score Analysis Concepts and Issues Wei Pan Haiyan Bai Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity score analysis, research using propensity score analysis

More information

Propensity score modelling in observational studies using dimension reduction methods

Propensity score modelling in observational studies using dimension reduction methods University of Colorado, Denver From the SelectedWorks of Debashis Ghosh 2011 Propensity score modelling in observational studies using dimension reduction methods Debashis Ghosh, Penn State University

More information

Estimation of Optimal Treatment Regimes Via Machine Learning. Marie Davidian

Estimation of Optimal Treatment Regimes Via Machine Learning. Marie Davidian Estimation of Optimal Treatment Regimes Via Machine Learning Marie Davidian Department of Statistics North Carolina State University Triangle Machine Learning Day April 3, 2018 1/28 Optimal DTRs Via ML

More information

Advanced Statistical Methods for Observational Studies L E C T U R E 0 1

Advanced Statistical Methods for Observational Studies L E C T U R E 0 1 Advanced Statistical Methods for Observational Studies L E C T U R E 0 1 introduction this class Website Expectations Questions observational studies The world of observational studies is kind of hard

More information

A Sampling of IMPACT Research:

A Sampling of IMPACT Research: A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Robustness to Parametric Assumptions in Missing Data Models

Robustness to Parametric Assumptions in Missing Data Models Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice

More information

Telescope Matching: A Flexible Approach to Estimating Direct Effects

Telescope Matching: A Flexible Approach to Estimating Direct Effects Telescope Matching: A Flexible Approach to Estimating Direct Effects Matthew Blackwell and Anton Strezhnev International Methods Colloquium October 12, 2018 direct effect direct effect effect of treatment

More information

Covariate Balancing Propensity Score for General Treatment Regimes

Covariate Balancing Propensity Score for General Treatment Regimes Covariate Balancing Propensity Score for General Treatment Regimes Kosuke Imai Princeton University October 14, 2014 Talk at the Department of Psychiatry, Columbia University Joint work with Christian

More information

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall 1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Dept. of Biostatistics, Duke University Medical Joint work

More information

Causal Inference with Big Data Sets

Causal Inference with Big Data Sets Causal Inference with Big Data Sets Marcelo Coca Perraillon University of Colorado AMC November 2016 1 / 1 Outlone Outline Big data Causal inference in economics and statistics Regression discontinuity

More information

ESTIMATION OF TREATMENT EFFECTS VIA MATCHING

ESTIMATION OF TREATMENT EFFECTS VIA MATCHING ESTIMATION OF TREATMENT EFFECTS VIA MATCHING AAEC 56 INSTRUCTOR: KLAUS MOELTNER Textbooks: R scripts: Wooldridge (00), Ch.; Greene (0), Ch.9; Angrist and Pischke (00), Ch. 3 mod5s3 General Approach The

More information

Section 7: Local linear regression (loess) and regression discontinuity designs

Section 7: Local linear regression (loess) and regression discontinuity designs Section 7: Local linear regression (loess) and regression discontinuity designs Yotam Shem-Tov Fall 2015 Yotam Shem-Tov STAT 239/ PS 236A October 26, 2015 1 / 57 Motivation We will focus on local linear

More information

Potential Outcomes Model (POM)

Potential Outcomes Model (POM) Potential Outcomes Model (POM) Relationship Between Counterfactual States Causality Empirical Strategies in Labor Economics, Angrist Krueger (1999): The most challenging empirical questions in economics

More information

Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments

Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary reatments Shandong Zhao David A. van Dyk Kosuke Imai July 3, 2018 Abstract Propensity score methods are

More information

Technical Track Session I: Causal Inference

Technical Track Session I: Causal Inference Impact Evaluation Technical Track Session I: Causal Inference Human Development Human Network Development Network Middle East and North Africa Region World Bank Institute Spanish Impact Evaluation Fund

More information

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai Weighting Methods Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Weighting Methods Stat186/Gov2002 Fall 2018 1 / 13 Motivation Matching methods for improving

More information

Discussion of Papers on the Extensions of Propensity Score

Discussion of Papers on the Extensions of Propensity Score Discussion of Papers on the Extensions of Propensity Score Kosuke Imai Princeton University August 3, 2010 Kosuke Imai (Princeton) Generalized Propensity Score 2010 JSM (Vancouver) 1 / 11 The Theme and

More information

review session gov 2000 gov 2000 () review session 1 / 38

review session gov 2000 gov 2000 () review session 1 / 38 review session gov 2000 gov 2000 () review session 1 / 38 Overview Random Variables and Probability Univariate Statistics Bivariate Statistics Multivariate Statistics Causal Inference gov 2000 () review

More information

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison Treatment Effects Christopher Taber Department of Economics University of Wisconsin-Madison September 6, 2017 Notation First a word on notation I like to use i subscripts on random variables to be clear

More information

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS Donald B. Rubin Harvard University 1 Oxford Street, 7th Floor Cambridge, MA 02138 USA Tel: 617-495-5496; Fax: 617-496-8057 email: rubin@stat.harvard.edu

More information

WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh)

WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh) WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, 2016 Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh) Our team! 2 Avi Feller (Berkeley) Jane Furey (Abt Associates)

More information

Matching for Causal Inference Without Balance Checking

Matching for Causal Inference Without Balance Checking Matching for ausal Inference Without Balance hecking Gary King Institute for Quantitative Social Science Harvard University joint work with Stefano M. Iacus (Univ. of Milan) and Giuseppe Porro (Univ. of

More information

Standardization methods have been used in epidemiology. Marginal Structural Models as a Tool for Standardization ORIGINAL ARTICLE

Standardization methods have been used in epidemiology. Marginal Structural Models as a Tool for Standardization ORIGINAL ARTICLE ORIGINAL ARTICLE Marginal Structural Models as a Tool for Standardization Tosiya Sato and Yutaka Matsuyama Abstract: In this article, we show the general relation between standardization methods and marginal

More information

Propensity Score Weighting with Multilevel Data

Propensity Score Weighting with Multilevel Data Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative

More information

Estimation of the Conditional Variance in Paired Experiments

Estimation of the Conditional Variance in Paired Experiments Estimation of the Conditional Variance in Paired Experiments Alberto Abadie & Guido W. Imbens Harvard University and BER June 008 Abstract In paired randomized experiments units are grouped in pairs, often

More information

A SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS

A SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS A SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS TOMMASO NANNICINI universidad carlos iii de madrid North American Stata Users Group Meeting Boston, July 24, 2006 CONTENT Presentation of

More information

Two-sample Categorical data: Testing

Two-sample Categorical data: Testing Two-sample Categorical data: Testing Patrick Breheny April 1 Patrick Breheny Introduction to Biostatistics (171:161) 1/28 Separate vs. paired samples Despite the fact that paired samples usually offer

More information

Propensity Score Matching and Analysis TEXAS EVALUATION NETWORK INSTITUTE AUSTIN, TX NOVEMBER 9, 2018

Propensity Score Matching and Analysis TEXAS EVALUATION NETWORK INSTITUTE AUSTIN, TX NOVEMBER 9, 2018 Propensity Score Matching and Analysis TEXAS EVALUATION NETWORK INSTITUTE AUSTIN, TX NOVEMBER 9, 2018 Schedule and outline 1:00 Introduction and overview 1:15 Quasi-experimental vs. experimental designs

More information

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable

More information

arxiv: v5 [stat.me] 13 Feb 2018

arxiv: v5 [stat.me] 13 Feb 2018 arxiv: arxiv:1602.07933 BOOTSTRAP INFERENCE WHEN USING MULTIPLE IMPUTATION By Michael Schomaker and Christian Heumann University of Cape Town and Ludwig-Maximilians Universität München arxiv:1602.07933v5

More information

Balancing Covariates via Propensity Score Weighting

Balancing Covariates via Propensity Score Weighting Balancing Covariates via Propensity Score Weighting Kari Lock Morgan Department of Statistics Penn State University klm47@psu.edu Stochastic Modeling and Computational Statistics Seminar October 17, 2014

More information

Modeling Log Data from an Intelligent Tutor Experiment

Modeling Log Data from an Intelligent Tutor Experiment Modeling Log Data from an Intelligent Tutor Experiment Adam Sales 1 joint work with John Pane & Asa Wilks College of Education University of Texas, Austin RAND Corporation Pittsburgh, PA & Santa Monica,

More information

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes Kosuke Imai Department of Politics Princeton University July 31 2007 Kosuke Imai (Princeton University) Nonignorable

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

Combining Non-probability and Probability Survey Samples Through Mass Imputation

Combining Non-probability and Probability Survey Samples Through Mass Imputation Combining Non-probability and Probability Survey Samples Through Mass Imputation Jae-Kwang Kim 1 Iowa State University & KAIST October 27, 2018 1 Joint work with Seho Park, Yilin Chen, and Changbao Wu

More information

BIOS 312: Precision of Statistical Inference

BIOS 312: Precision of Statistical Inference and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample

More information

Ignoring the matching variables in cohort studies - when is it valid, and why?

Ignoring the matching variables in cohort studies - when is it valid, and why? Ignoring the matching variables in cohort studies - when is it valid, and why? Arvid Sjölander Abstract In observational studies of the effect of an exposure on an outcome, the exposure-outcome association

More information

Front-Door Adjustment

Front-Door Adjustment Front-Door Adjustment Ethan Fosse Princeton University Fall 2016 Ethan Fosse Princeton University Front-Door Adjustment Fall 2016 1 / 38 1 Preliminaries 2 Examples of Mechanisms in Sociology 3 Bias Formulas

More information

Measures of Association and Variance Estimation

Measures of Association and Variance Estimation Measures of Association and Variance Estimation Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 35

More information

Sections 2.3, 2.4. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis 1 / 21

Sections 2.3, 2.4. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis 1 / 21 Sections 2.3, 2.4 Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 21 2.3 Partial association in stratified 2 2 tables In describing a relationship

More information

6.3 How the Associational Criterion Fails

6.3 How the Associational Criterion Fails 6.3. HOW THE ASSOCIATIONAL CRITERION FAILS 271 is randomized. We recall that this probability can be calculated from a causal model M either directly, by simulating the intervention do( = x), or (if P

More information

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University Statistica Sinica 27 (2017), 000-000 doi:https://doi.org/10.5705/ss.202016.0155 DISCUSSION: DISSECTING MULTIPLE IMPUTATION FROM A MULTI-PHASE INFERENCE PERSPECTIVE: WHAT HAPPENS WHEN GOD S, IMPUTER S AND

More information

ANALYTIC COMPARISON. Pearl and Rubin CAUSAL FRAMEWORKS

ANALYTIC COMPARISON. Pearl and Rubin CAUSAL FRAMEWORKS ANALYTIC COMPARISON of Pearl and Rubin CAUSAL FRAMEWORKS Content Page Part I. General Considerations Chapter 1. What is the question? 16 Introduction 16 1. Randomization 17 1.1 An Example of Randomization

More information

Introduction to Propensity Score Matching: A Review and Illustration

Introduction to Propensity Score Matching: A Review and Illustration Introduction to Propensity Score Matching: A Review and Illustration Shenyang Guo, Ph.D. School of Social Work University of North Carolina at Chapel Hill January 28, 2005 For Workshop Conducted at the

More information

Statistical Models for Causal Analysis

Statistical Models for Causal Analysis Statistical Models for Causal Analysis Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 Three Modes of Statistical Inference 1. Descriptive Inference: summarizing and exploring

More information

A Theory of Statistical Inference for Matching Methods in Causal Research

A Theory of Statistical Inference for Matching Methods in Causal Research A Theory of Statistical Inference for Matching Methods in Causal Research Stefano M. Iacus Gary King Giuseppe Porro October 4, 2017 Abstract Researchers who generate data often optimize efficiency and

More information

Causality II: How does causal inference fit into public health and what it is the role of statistics?

Causality II: How does causal inference fit into public health and what it is the role of statistics? Causality II: How does causal inference fit into public health and what it is the role of statistics? Statistics for Psychosocial Research II November 13, 2006 1 Outline Potential Outcomes / Counterfactual

More information

PROPENSITY SCORE MATCHING. Walter Leite

PROPENSITY SCORE MATCHING. Walter Leite PROPENSITY SCORE MATCHING Walter Leite 1 EXAMPLE Question: Does having a job that provides or subsidizes child care increate the length that working mothers breastfeed their children? Treatment: Working

More information

Technical Track Session I:

Technical Track Session I: Impact Evaluation Technical Track Session I: Click to edit Master title style Causal Inference Damien de Walque Amman, Jordan March 8-12, 2009 Click to edit Master subtitle style Human Development Human

More information

ECO Class 6 Nonparametric Econometrics

ECO Class 6 Nonparametric Econometrics ECO 523 - Class 6 Nonparametric Econometrics Carolina Caetano Contents 1 Nonparametric instrumental variable regression 1 2 Nonparametric Estimation of Average Treatment Effects 3 2.1 Asymptotic results................................

More information

causal inference at hulu

causal inference at hulu causal inference at hulu Allen Tran July 17, 2016 Hulu Introduction Most interesting business questions at Hulu are causal Business: what would happen if we did x instead of y? dropped prices for risky

More information

Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments

Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary reatments Shandong Zhao Department of Statistics, University of California, Irvine, CA 92697 shandonm@uci.edu

More information

Causal Inference Basics

Causal Inference Basics Causal Inference Basics Sam Lendle October 09, 2013 Observed data, question, counterfactuals Observed data: n i.i.d copies of baseline covariates W, treatment A {0, 1}, and outcome Y. O i = (W i, A i,

More information

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment

More information

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014 Assess Assumptions and Sensitivity Analysis Fan Li March 26, 2014 Two Key Assumptions 1. Overlap: 0

More information

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i. Weighting Unconfounded Homework 2 Describe imbalance direction matters STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University

More information

Causal inference in epidemiological practice

Causal inference in epidemiological practice Causal inference in epidemiological practice Willem van der Wal Biostatistics, Julius Center UMC Utrecht June 5, 2 Overview Introduction to causal inference Marginal causal effects Estimating marginal

More information

THE DESIGN (VERSUS THE ANALYSIS) OF EVALUATIONS FROM OBSERVATIONAL STUDIES: PARALLELS WITH THE DESIGN OF RANDOMIZED EXPERIMENTS DONALD B.

THE DESIGN (VERSUS THE ANALYSIS) OF EVALUATIONS FROM OBSERVATIONAL STUDIES: PARALLELS WITH THE DESIGN OF RANDOMIZED EXPERIMENTS DONALD B. THE DESIGN (VERSUS THE ANALYSIS) OF EVALUATIONS FROM OBSERVATIONAL STUDIES: PARALLELS WITH THE DESIGN OF RANDOMIZED EXPERIMENTS DONALD B. RUBIN My perspective on inference for causal effects: In randomized

More information

Bootstrapping Sensitivity Analysis

Bootstrapping Sensitivity Analysis Bootstrapping Sensitivity Analysis Qingyuan Zhao Department of Statistics, The Wharton School University of Pennsylvania May 23, 2018 @ ACIC Based on: Qingyuan Zhao, Dylan S. Small, and Bhaswar B. Bhattacharya.

More information

You have 3 hours to complete the exam. Some questions are harder than others, so don t spend too long on any one question.

You have 3 hours to complete the exam. Some questions are harder than others, so don t spend too long on any one question. Data 8 Fall 2017 Foundations of Data Science Final INSTRUCTIONS You have 3 hours to complete the exam. Some questions are harder than others, so don t spend too long on any one question. The exam is closed

More information

Authors and Affiliations: Nianbo Dong University of Missouri 14 Hill Hall, Columbia, MO Phone: (573)

Authors and Affiliations: Nianbo Dong University of Missouri 14 Hill Hall, Columbia, MO Phone: (573) Prognostic Propensity Scores: A Method Accounting for the Correlations of the Covariates with Both the Treatment and the Outcome Variables in Matching and Diagnostics Authors and Affiliations: Nianbo Dong

More information

Targeted Maximum Likelihood Estimation in Safety Analysis

Targeted Maximum Likelihood Estimation in Safety Analysis Targeted Maximum Likelihood Estimation in Safety Analysis Sam Lendle 1 Bruce Fireman 2 Mark van der Laan 1 1 UC Berkeley 2 Kaiser Permanente ISPE Advanced Topics Session, Barcelona, August 2012 1 / 35

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2015 Paper 334 Targeted Estimation and Inference for the Sample Average Treatment Effect Laura B. Balzer

More information

arxiv: v1 [math.st] 28 Feb 2017

arxiv: v1 [math.st] 28 Feb 2017 Bridging Finite and Super Population Causal Inference arxiv:1702.08615v1 [math.st] 28 Feb 2017 Peng Ding, Xinran Li, and Luke W. Miratrix Abstract There are two general views in causal analysis of experimental

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University

More information