CompSci Understanding Data: Theory and Applications

Similar documents
An Introduction to Causal Analysis on Observational Data using Propensity Scores

Selection on Observables: Propensity Score Matching.

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS

Statistical Models for Causal Analysis

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

The 2004 Florida Optical Voting Machine Controversy: A Causal Analysis Using Matching

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

Section 10: Inverse propensity score weighting (IPSW)

Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity. Propensity Score Analysis. Concepts and Issues. Chapter 1. Wei Pan Haiyan Bai

Alternative Balance Metrics for Bias Reduction in. Matching Methods for Causal Inference

Optimal Blocking by Minimizing the Maximum Within-Block Distance

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Gov 2002: 4. Observational Studies and Confounding

Propensity Score Matching

Matching. Quiz 2. Matching. Quiz 2. Exact Matching. Estimand 2/25/14

Potential Outcomes and Causal Inference I

ANALYTIC COMPARISON. Pearl and Rubin CAUSAL FRAMEWORKS

Propensity Score Weighting with Multilevel Data

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison

Research Design: Causal inference and counterfactuals

Econometric Causality

Propensity Score Methods for Causal Inference

Covariate selection and propensity score specification in causal inference

Lecture 1 January 18

Technical Track Session I: Causal Inference

Quantitative Economics for the Evaluation of the European Policy

Propensity Score for Causal Inference of Multiple and Multivalued Treatments

PEARL VS RUBIN (GELMAN)

Estimating the Marginal Odds Ratio in Observational Studies

Introduction to Statistical Inference

WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh)

Causal inference in multilevel data structures:

Genetic Matching for Estimating Causal Effects:

A Decision Theoretic Approach to Causality

causal inference at hulu

The problem of causality in microeconometrics.

Geoffrey T. Wodtke. University of Toronto. Daniel Almirall. University of Michigan. Population Studies Center Research Report July 2015

Variable selection and machine learning methods in causal inference

Causal Analysis in Social Research

Composite Causal Effects for. Time-Varying Treatments and Time-Varying Outcomes

Causal Mechanisms Short Course Part II:

Lecture 3: Causal inference

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

Propensity score modelling in observational studies using dimension reduction methods

Front-Door Adjustment

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

Predicting the Treatment Status

Potential Outcomes and Causal Inference

Principles Underlying Evaluation Estimators

Causal Inference from Experimental Data

Job Training Partnership Act (JTPA)

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

What s New in Econometrics. Lecture 1

Flexible Estimation of Treatment Effect Parameters

THE DESIGN (VERSUS THE ANALYSIS) OF EVALUATIONS FROM OBSERVATIONAL STUDIES: PARALLELS WITH THE DESIGN OF RANDOMIZED EXPERIMENTS DONALD B.

ECON Introductory Econometrics. Lecture 17: Experiments

Observational Studies and Propensity Scores

The Causal Inference Problem and the Rubin Causal Model

A Theory of Statistical Inference for Matching Methods in Causal Research

Causal Sensitivity Analysis for Decision Trees

Technical Track Session I:

Research Note: A more powerful test statistic for reasoning about interference between units

Potential Outcomes Model (POM)

Causal modelling in Medical Research

Bounding the Probability of Causation in Mediation Analysis

Matching Techniques. Technical Session VI. Manila, December Jed Friedman. Spanish Impact Evaluation. Fund. Region

Econ 673: Microeconometrics Chapter 12: Estimating Treatment Effects. The Problem

Partially Identified Treatment Effects for Generalizability

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha

The Econometrics of Randomized Experiments

Notes on causal effects

EMERGING MARKETS - Lecture 2: Methodology refresher

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

Measuring Social Influence Without Bias

14.74 Lecture 10: The returns to human capital: education

Sensitivity analysis and distributional assumptions

Combining Difference-in-difference and Matching for Panel Data Analysis

The decision theoretic approach to causal inference OR Rethinking the paradigms of causal modelling

Advanced Statistical Methods for Observational Studies L E C T U R E 0 6

Matching. James J. Heckman Econ 312. This draft, May 15, Intro Match Further MTE Impl Comp Gen. Roy Req Info Info Add Proxies Disc Modal Summ

arxiv: v1 [stat.me] 15 May 2011

Estimating Causal Effects from Observational Data with the CAUSALTRT Procedure

Matching with Multiple Control Groups, and Adjusting for Group Differences

1 Impact Evaluation: Randomized Controlled Trial (RCT)

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk

Section 1 : Introduction to the Potential Outcomes Framework. Andrew Bertoli 4 September 2013

MATCHING FOR EE AND DR IMPACTS

Omitted Variables, Countervailing Effects, and The Possibility of Overadjustment

Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

A Comparison of Methods for Estimating the Causal Effect of a Treatment in Randomized. Clinical Trials Subject to Noncompliance.

The propensity score with continuous treatments

Summary and discussion of The central role of the propensity score in observational studies for causal effects

The Econometric Evaluation of Policy Design: Part I: Heterogeneity in Program Impacts, Modeling Self-Selection, and Parameters of Interest

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai

PSC 504: Instrumental Variables

SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS

University of California, Berkeley

Causality II: How does causal inference fit into public health and what it is the role of statistics?

Transcription:

CompSci 590.6 Understanding Data: Theory and Applications Lecture 17 Causality in Statistics Instructor: Sudeepa Roy Email: sudeepa@cs.duke.edu Fall 2015 1

Today s Reading Rubin Journal of the American Statistical Association, 2005 Causal Inference Using Potential Outcomes: Design, Modeling, Decisions Rosenbaum- Rubin Biometrika, 1983 The Central Role of the Propensity Score in Observational Studies for Causal Effects 2

Potential Outcome Model Referred to as Neyman- Rubin model or Rubin s model First proposed in Neyman s Ph.D. thesis (1923) A model for Randomized Experiments by Fisher (1920s- 30s) Further developed by Rubin (1978) and others Establish a causal relationship between a potential cause (treatment) and its effect (outcome) 3

Widely used in Medicine Potential Outcome Model Christakis and Iwashyna 2003; Rubin 1997 Economics Abadie and Imbens 2006; Galiani, Gertler, and Schargrodsky 2005; Dehejia and Wahba 2002, 1999 Political science Bowers and Hansen 2005; Imai 2005; Sekhon 2004b Sociology Morgan and Harding 2006; Diprete and Engelhardt 2004; Winship and Morgan 1999; Smith 1997 Law Rubin 2001 References in [Sekhon 2007] 4

Units N units physical objects at particular points in time e.g. individual people, one person at different points of time, plots of lands Units Covariates Treatment assignment Potential Outcome: Treatment Potential Outcome: Control Unit- level causal effects Summary of causal effects 1 X 1 T 1 Y 11 Y 01 Y 11 - Y 01 2 X 2 T 2 Y 12 Y 02 Y 12 Y 02 N X n T N Y 1N Y 0N Y 1N Y 0N E[Y 1 Y 0 ] 5

Treatment and Control Each unit i can be exposed or not to a treatment T i e.g. individuals taking an Aspirin vs. placebo, Active Treatment or Treatment (T i = 1) if exposed Control Treatment or Control (T i = 0) if not exposed Units Covariates Treatment assignment Potential Outcome: Treatment Potential Outcome: Control Unit- level causal effects Summary of causal effects 1 X 1 T 1 Y 11 Y 01 Y 11 - Y 01 2 X 2 T 2 Y 12 Y 02 Y 12 Y 02 N X n T N Y 1N Y 0N Y 1N Y 0N E[Y 1 Y 0 ] 6

Covariates Variables that take their values before the treatment assignment Cannot be affected by the treatment e.g. pre- aspirin headache pain, gender, blood- pressure Units Covariates Treatment assignment Potential Outcome: Treatment Potential Outcome: Control Unit- level causal effects Summary of causal effects 1 X 1 T 1 Y 11 Y 01 Y 11 - Y 01 2 X 2 T 2 Y 12 Y 02 Y 12 Y 02 N X n T N Y 1N Y 0N Y 1N Y 0N E[Y 1 Y 0 ] 7

Potential Outcome Y 1 (for treatment, T i = 1) Y 0 (for control, T i = 0) for i- th unit : Y 1i and Y 0i Observed outcome Y = T i Y 1i + (1 - T i )Y 0i Units Covariates Treatment assignment Potential Outcome: Treatment Potential Outcome: Control Unit- level causal effects Summary of causal effects 1 X 1 T 1 Y 11 Y 01 Y 11 - Y 01 2 X 2 T 2 Y 12 Y 02 Y 12 Y 02 N X n T N Y 1N Y 0N Y 1N Y 0N E[Y 1 Y 0 ] 8

Unit- level causal effect The comparisons of Y 1i and Y 0i difference or ratio Typically Y 1i - Y 0i For any unit i, only one of them can be observed we cannot go back in time and expose it to the other treatment Fundamental problem of causal inference Units Covariates Treatment assignment Potential Outcome: Treatment Potential Outcome: Control Unit- level causal effects Summary of causal effects 1 X 1 T 1 Y 11 Y 01 Y 11 - Y 01 2 X 2 T 2 Y 12 Y 02 Y 12 Y 02 E[Y 1 Y 0 ] N X n T N Y 1N Y 0N Y 1N Y 0N 9

Summary of causal effect Defined for a collection of units e.g. the mean (or expected) unit- level causal effect - - standard the median unit- level causal effect for all males the difference between the median Y 1i and Y 0i for all females Units Covariates Treatment assignment Potential Outcome: Treatment Potential Outcome: Control Unit- level causal effects Summary of causal effects 1 X 1 T 1 Y 11 Y 01 Y 11 - Y 01 2 X 2 T 2 Y 12 Y 02 Y 12 Y 02 E[Y 1 Y 0 ] N X n T N Y 1N Y 0N Y 1N Y 0N 10

Remark.. To be a causal effect, the comparisons of Y 1 and Y 0 should be for a common set of units e.g. females we cannot apply control to males and treatment to females Units Covariates Treatment assignment Potential Outcome: Treatment Potential Outcome: Control Unit- level causal effects Summary of causal effects 1 X 1 T 1 Y 11 Y 01 Y 11 - Y 01 2 X 2 T 2 Y 12 Y 02 Y 12 Y 02 E[Y 1 Y 0 ] N X n T N Y 1N Y 0N Y 1N Y 0N 11

Average Treatment Effect (ATE) ATE = E[Y 1 Y 0 ] Recall observed outcome Y = T Y 1 + (1- T) Y 0 Suppose Treatment Assignment (T) is independent of Y 1, Y 0 Then E[Y 1 Y 0 ] = E[Y 1 ] E[Y 0 ] = E[Y 1 T = 1] E[Y 0 T = 0] = E[Y T = 1] E[Y T = 0] e.g. in a Randomized Experiment (Fisher 1920-30), when each unit is randomly assigned to a Treatment or Control Group Still need additional assumptions 12

SUTVA Stable Unit Treatment Value Assumptions Cox 1958, Rubin 1978 1. No interference or spill- over effect among units For unit i, Y 1i and Y 10 are NOT affected by what action any other unit j received 2. Unique Treatment Level or Dose There are no hidden versions of treatments No matter how (mechanism) unit i received treatment 1, the outcome that would be observed would be Y 1i - - similarly for treatment 0 13

Violations of SUTVA 1. No interference (wiki) Two units Joe and Mary for effect of a drug for high blood pressure They share the same household Mary cooks Mary got drug (treatment) her pressure reduces cooks salty food In practice, Mary may not know if she got the drug or placebo Joe s pressure increases 2. Unique Treatment Level or Dose Different doses of the medicine for drug pressure 14

More assumptions Compliance issue People assigned to treatment may refuse it People assigned to control may try to get treatment Barnard, Frangakis, Hill, and Rubin 2003 People started taking a medicine, then stopped in the middle because it made them too sick to work 15

Notes on Neyman- Rubin Model At least half of the potential outcomes are missing Still it is important to explicitly represent both potential outcomes Considered to be a significant contribution by Neyman (Rubin 2005) Assumptions are critical without them the causal inferences are meaningless 16

Recall The Power of Randomized Experiments Covariates (X) represent the set of variables that take their values before the assignment of the units into treatment or control groups e.g., the gender of a human subject cannot be affected by treatments What do we get by randomly assigning units to treatment/control groups? 17

The Power of Randomized Experiments The assigned treatment is statistically independent of any (measured or unmeasured) covariate in the population before the experiment has been started The distribution of any covariate is the same in the treatment and control groups Any difference in outcomes is due to the treatment and not any other pre- existing differences The average of control/treatment group outcomes is an unbiased estimate of average outcome under control/treatment for whole population ATE = E[Y 1 Y 0 ] = E[Y T = 1] E[Y T = 0] 18

But, Randomized Experiments are not always feasible 1. Infeasibility or high cost e.g., how allocation of government funding in different research areas will affect the number of academic jobs in these areas 2. Ethical reasons e.g., effect of availability to better resources during childhood on higher education in the future 3. Prohibitive delay e.g., effect of childhood cholesterol on teen obesity) 4. In some scenarios randomization may not estimate effects for the groups we are interested in 5. Experiments can be on a small population, may have a large variance 19

Observational Study Alternative to true randomized experiments Tries to simulate the ideal situation Create treatment and control groups that appear to be random at least on observed/measured variables by choosing individuals with similar covariate values do not use the outcome while selecting the groups 20

Balancing Scores A balancing score b(x) is a function of the observed covariates X such that the conditional distributions of X given b(x) are the same on the treatment (T = 1) and the control groups (T = 0), i.e., X T b(x) Example: b(x) = X The finest balancing score Propensity scoree(x) The coarsest balancing score 21

Propensity- Score Methods Rosenbaum- Rubin 1983 Make coarse (bigger) groups May not match on all measured covariates But the distributions of covariates are the same for treatment and control Cannot say anything about unmeasured/unobserved covariates 22

Propensity Score The conditional probability of assignment to treatment given the covariates e(x) = Pr(T = 1 X) Known for Randomized Experiments Not known for Observational Study 23

Strongly Ignorable Treatment Assignment Treatment assignment is strongly ignorable given a vector of covariates V if for all V 1. (Y1, Y0) T V 2. 0 < Pr[T = 1 V ] < 1 Simply strongly ignorable when V = X [Rosenbaum- Rubin 1983] 1. If treatment assignment is strongly ignorable given X, then it is strongly ignorable given any balancing score b(x) 2. For any function b(x) of X, b(x) is a balancing score if and only if e(x) = f(b(x)) for some function f In particular, X T e(x) 24

ATE in Observational Study Recall, ATE = E[Y 1 Y 0 ] Consider a two- phase sampling approach 1. Suppose a specific value of the vector of covariates X = x is randomly sampled from the entire population (both treated and control groups) 2. Then a treated and a control units are sampled with this value X = x The expected difference in response is E X [ E[Y 1 X, T = 1] E[Y 0 X, T = 0] ] If the treatment assignment is strongly ignorable, then E X [ E[Y 1 X, T = 1] E[Y 0 X, T = 0] ] = E X [ E[Y 1 X] E[Y 0 X] ] = E[Y 1 - Y 0 ] (why?) Challenge: Too many (measured) covariates, individual groups will be too sparse 25

Three methods for using balancing score on observational data 1. Pair matching on balancing scores 2. Sub- classification on balancing scores 3. Covariance adjustment on balancing scores 26

Pair- matching on balancing score Sample b(x) at random Then sample one treated and one control units with this value of b(x). The expected difference in response equals the ATE at this b(x) the mean of matches pair differences in this two- step process is an unbiased estimator of the ATE 27

Sub- classification on balancing scores Sample a group of units using b(x) such that b(x) is constant for all units in this group at least one unit in the group received each treatment (T = 1, 0). The expected difference in treatment means equals the ATE at this b(x) the weighted average of such differences (weight = fraction of population at b(x)) is an unbiased estimator of the ATE. 28

Covariance adjustment on balancing scores Assumes that the conditional expectation of Y t given b(x) is linear E[Y t b(x), S = t] = α t + β t b(x) for t = 0, 1 Gives an unbiased estimator of the treatment effect at b(x) = E[Y 1 Y 0 b(x)] in terms of unbiased estimators of α 1, β 1, α 0, β 0 29

Neyman- Rubin vs. Pearl s Model Potential Outcome (Neyman- Rubin) = Counterfactuals (Pearl) Treatment (Neyman- Rubin) intervention (Pearl) Structural causal graph on variables assumed by Pearl Causal inference is on (variable- value) pairs No causal structure assumed in Neyman- Rubin s model Infers causal relationships by experiments or from evidence Some authors (e.g., Greenland, Pearl, and Robins 1999; Dawid 2000) call the potential outcomes counterfactuals, borrowing the term from philosophy (e.g., Lewis 1973). I much prefer Neyman s implied term potential outcomes, because these values are not counterfactual until after treatments are assigned, and calling all potential outcomes counterfactuals certainly confuses quantities that can never be observed (e.g., your height at age 3 if you were born yesterday in the Arctic) and so are truly a priori counterfactual, with unobserved potential outcomes that are not a priori counterfactual - - Rubin 2005 30

Other References 1. [Holland 1986]: Holland, Paul W. 1986. Statistics and Causal Inference. Journal of the American Statistical Association 81(396): 945-960 2. [Sekhon 2007]: Sekhon, Jasjeet S. 2007. The Neyman- Rubin Model of Causal Inference and Estimation via Matching Methods Next Topic: Exploring Data with Humans in the Loop 31