An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

Similar documents
Causal mediation analysis: Definition of effects and common identification assumptions

Casual Mediation Analysis

Flexible mediation analysis in the presence of non-linear relations: beyond the mediation formula.

Comparison of Three Approaches to Causal Mediation Analysis. Donna L. Coffman David P. MacKinnon Yeying Zhu Debashis Ghosh

Revision list for Pearl s THE FOUNDATIONS OF CAUSAL INFERENCE

Causal Mechanisms Short Course Part II:

Ratio-of-Mediator-Probability Weighting for Causal Mediation Analysis. in the Presence of Treatment-by-Mediator Interaction

Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects

New Weighting Methods for Causal Mediation Analysis

Guanglei Hong. University of Chicago. Presentation at the 2012 Atlantic Causal Inference Conference

Modeling Mediation: Causes, Markers, and Mechanisms

Conceptual overview: Techniques for establishing causal pathways in programs and policies

Mediation Analysis: A Practitioner s Guide

A Unification of Mediation and Interaction. A 4-Way Decomposition. Tyler J. VanderWeele

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha

Mediation: Background, Motivation, and Methodology

Statistical Analysis of Causal Mechanisms

A Distinction between Causal Effects in Structural and Rubin Causal Models

Mediation analyses. Advanced Psychometrics Methods in Cognitive Aging Research Workshop. June 6, 2016

SEM REX B KLINE CONCORDIA D. MODERATION, MEDIATION

Causal Mediation Analysis in R. Quantitative Methodology and Causal Mechanisms

Modern Mediation Analysis Methods in the Social Sciences

Introduction. Consider a variable X that is assumed to affect another variable Y. The variable X is called the causal variable and the

Parametric and Non-Parametric Weighting Methods for Mediation Analysis: An Application to the National Evaluation of Welfare-to-Work Strategies

Statistical Methods for Causal Mediation Analysis

Mediation and Interaction Analysis

Help! Statistics! Mediation Analysis

Flexible Mediation Analysis in the Presence of Nonlinear Relations: Beyond the Mediation Formula

Statistical Analysis of Causal Mechanisms

A comparison of 5 software implementations of mediation analysis

New developments in structural equation modeling

An Introduction to Causal Analysis on Observational Data using Propensity Scores

New developments in structural equation modeling

Causal Mediation Analysis Short Course

Discussion of Papers on the Extensions of Propensity Score

Controlling for latent confounding by confirmatory factor analysis (CFA) Blinded Blinded

Unpacking the Black-Box: Learning about Causal Mechanisms from Experimental and Observational Studies

Identification, Inference, and Sensitivity Analysis for Causal Mediation Effects

Harvard University. Harvard University Biostatistics Working Paper Series. Sheng-Hsuan Lin Jessica G. Young Roger Logan

Statistical Models for Causal Analysis

arxiv: v2 [math.st] 4 Mar 2013

CompSci Understanding Data: Theory and Applications

Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity. Propensity Score Analysis. Concepts and Issues. Chapter 1. Wei Pan Haiyan Bai

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Department of Biostatistics University of Copenhagen

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Identification and Estimation of Causal Mediation Effects with Treatment Noncompliance

Causal Mechanisms and Process Tracing

Assessing In/Direct Effects: from Structural Equation Models to Causal Mediation Analysis

Identification and Inference in Causal Mediation Analysis

Harvard University. Harvard University Biostatistics Working Paper Series. Semiparametric Estimation of Models for Natural Direct and Indirect Effects

Some challenges and results for causal and statistical inference with social network data

Identification and Estimation of Causal Effects from Dependent Data

Bayesian Causal Mediation Analysis for Group Randomized Designs with Homogeneous and Heterogeneous Effects: Simulation and Case Study

Unpacking the Black-Box: Learning about Causal Mechanisms from Experimental and Observational Studies

Identification and Sensitivity Analysis for Multiple Causal Mechanisms: Revisiting Evidence from Framing Experiments

A Longitudinal Look at Longitudinal Mediation Models

Causal mediation analysis: Multiple mediators

Statistical Analysis of Causal Mechanisms for Randomized Experiments

Estimating the Marginal Odds Ratio in Observational Studies

Non-parametric Mediation Analysis for direct effect with categorial outcomes

Experimental Designs for Identifying Causal Mechanisms

Quantitative Economics for the Evaluation of the European Policy

Harvard University. Inverse Odds Ratio-Weighted Estimation for Causal Mediation Analysis. Eric J. Tchetgen Tchetgen

Bootstrapping Sensitivity Analysis

University of California, Berkeley

Causal Directed Acyclic Graphs

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Ignoring the matching variables in cohort studies - when is it valid, and why?

arxiv: v1 [stat.me] 15 May 2011

Methods for inferring short- and long-term effects of exposures on outcomes, using longitudinal data on both measures

Propensity Score Matching

Estimating direct effects in cohort and case-control studies

Notes on causal effects

Mplus Code Corresponding to the Web Portal Customization Example

Natural direct and indirect effects on the exposed: effect decomposition under. weaker assumptions

The Causal Mediation Formula A Guide to the Assessment of Pathways and Mechanisms

Abstract Title Page. Title: Degenerate Power in Multilevel Mediation: The Non-monotonic Relationship Between Power & Effect Size

Published online: 19 Jun 2015.

Causal Inference for Mediation Effects

The Causal Mediation Formula A Guide to the Assessment of Pathways and Mechanisms

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS

Simultaneous Equation Models

Propensity Score Weighting with Multilevel Data

This paper revisits certain issues concerning differences

Bounding the Probability of Causation in Mediation Analysis

Identification of Causal Parameters in Randomized Studies with Mediating Variables

Research Design: Causal inference and counterfactuals

Selection on Observables: Propensity Score Matching.

CAUSAL MEDIATION ANALYSIS FOR NON-LINEAR MODELS WEI WANG. for the degree of Doctor of Philosophy. Thesis Adviser: Dr. Jeffrey M.

Introduction to Statistical Inference

Estimating Post-Treatment Effect Modification With Generalized Structural Mean Models

Front-Door Adjustment

Technical Track Session I: Causal Inference

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Introduction to Propensity Score Matching: A Review and Illustration

Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Published online: 25 Sep 2014.

ANALYTIC COMPARISON. Pearl and Rubin CAUSAL FRAMEWORKS

Statistical Analysis of Causal Mechanisms

Transcription:

An Introduction to Causal Mediation Analysis Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 1

Causality In the applications of statistics, many central questions are related to causality rather than simply association. Sociology: Does divorce affect children s education? Health: Is a new drug effective against a disease? Economy: Does a job training program improve participants earnings? Business: Does a sales reward program boost a company s profits? People care about not only the causal effect itself, but also how and why an intervention affects the outcome. Treatment? Black Box Outcome 2

What Is Mediation? It uncovers the black box. Baron and Kenny (1986): it represents the generative mechanism through which the focal independent variable (treatment) is able to influence the dependent variable of interest (outcome). It decomposes the total treatment effect into an indirect effect transmitted through the hypothesized mediator and a direct effect representing the contribution of other unspecified pathways. Mediator Treatment Outcome 3

Example Maternal Speech (Mediator: M) SES (Treatment: T) Child Vocabulary (Outcome: Y) Indirect Effect: The improvement in child vocabulary attributable to the SES-induced difference in maternal speech. Direct Effect: The impact of SES on child vocabulary without changing maternal speech. 4

Conventional Estimation Method e 1 Y = d 0 + ct + e 0 M = d 1 + at + e 1 Y = d 2 + c T + bm + e 2 SES (Treatment: T) Maternal Speech (Mediator: M) Child Vocabulary (Outcome: Y) (Wright, 1934; Baron and Kenny, 1986; Judd and Kenny, 1981) a c b e 2 Total treatment effect: c Direct Effect: c Indirect Effect: ab or c c Significance test: Sobel test (Sobel, 1982); Bootstrapping (Bollen & Stine, 1990; Shrout & Bolger, 2002); Monte Carlo Method (MacKinnon, Lockwood, and Williams, 2004) 5

Limitations a Maternal Speech (Mediator: M) b SES (Treatment: T) The path coefficients represent the causal effects of interest only when the functional form of each of the models is correctly specified no confounding of the T-Y relation (no covariates associated with both T and Y) no confounding of the T-M relation Child Vocabulary (Outcome: Y) no confounding of the M-Y relation (Either pre-treatment or post-treatment) c no interaction exists between T and M affecting Y. However, this typically overlooks the fact that a treatment may generate an impact on the outcome through not only changing the mediator value but also changing the mediatoroutcome relationship (Judd & Kenny, 1981). 6

Potential Outcomes Framework Rubin (1978, 1986) Book recommendation: Imbens and Rubin (2015) 7

Movie: It s a Wonderful Life Example Source: Imbens and Rubin (2015) 8

9

10

11

Potential Outcomes Framework Observed Counterfactual 12

13

Definition of Causal Effects Individual i s potential outcome under T = 1: Y i (1) Individual i s potential outcome under T = 0: Y i (0) Treatment effect for individual i: Y i 1 Y i (0) Population average treatment effect: δ E[Y 1 ] E[Y 0 ] ID Treatment Potential outcomes Causal effect i T i Y i 1 Y i 0 Y i 1 Y i 0 1 1 16 12 4 True averages E[Y(1)] = 15 E[Y(0)] = 8 δ = 7 14

SUTVA (Stable Unit Treatment Value Assumption) No Interference The potential outcomes for any unit do not vary with the treatments assigned to other units. No Hidden Variations of Treatments For each unit, there are no different forms or versions of each treatment level, which lead to different potential outcomes. 15

Identification of Causal Effects ID Treatment Potential outcomes Causal effect i T i Y i 1 Y i 0 Y i 1 Y i 0 1 1 16?? 2 1 14?? 3 1 15?? 4 0? 10? 5 0? 6? True averages E[Y(1)] = 15 E[Y(0)] = 8 δ = 7 Observed averages E Y T = 1 = 15 E Y T = 0 = 8 E Y T = 1 E Y T = 0 = 7 Identification relates counterfactual quantities to observable population data. In a randomized design, ignorability assumption holds: Under Y i t T i for t = 0,1, E Y t = E Y t T = t. Hence δ = E Y T = 1 E Y T = 0 16

Identification of Causal Effects In observational studies, we are able to identify the causal effect under strong ignorability assumption: Y i t T i X i = x where 0 < P T i = 1 X i = x < 1 Population average treatment effect can be identified by δ = E{E Y T = 1, X } E{E Y T = 0, X } 17

Estimation of Causal Effects Propensity-score based methods (Rosenbaum and Rubin, 1983): Matching Subclassification Covariance adjustment Inverse weighting Sensitivity Analysis (Rosenbaum, 1986) The goal is to quantify the degree to which the key identification assumption must be violated for a researcher s original conclusion to be reversed. Software list (including R packages) on Prof. Elizabeth Stuart's webpage: http://www.biostat.jhsph.edu/~estuart/propensityscoresoftware.html 18

Causal Mediation Analysis Book recommendation: Hong (2015); VanderWeele (2015) 19

Research Question I SES (Treatment: T) Child Vocabulary (Outcome: Y) How much is the average SES impact on child vocabulary? 20

Research Question II Maternal Speech (Mediator: M) SES (Treatment: T) Child Vocabulary (Outcome: Y) How would the SES-induced change in maternal speech exert an impact on child vocabulary? 21

Definition (Pearl, 2001) Maternal Speech (Mediator: M) SES (Treatment: T) Child Vocabulary (Outcome: Y) High SES T i = 1 Low SES T i = 0 Maternal speech if SES is high M i (1) Maternal speech if SES low M i (0) Y i 1, 1, M i 11 Y i 1, 1, M i 00 Y i 0, M i 0 Population Average Natural Indirect Effect: E Y 1, M 1 E[Y 1, M 0 ] 22

Research Question III Maternal Speech (Mediator: M) SES (Treatment: T) Child Vocabulary (Outcome: Y) How much is the average causal effect of SES on child vocabulary without changing maternal speech? 23

Definition (Pearl, 2001) Maternal Speech (Mediator: M) SES (Treatment: T) Child Vocabulary (Outcome: Y) High SES T i = 1 Low SES T i = 0 Maternal speech if SES is high M i (1) Maternal speech if SES is high M i (0) Y i 1, M i 1 Y i 1, 1, M i 00 Y i 0, 0, M i 00 Population Average Natural Direct Effect: E Y 1, M 0 E[Y 0, M 0 ] 24

Alternative Definitions (Pearl, 2001) Manipulations Controlled direct effect: E[Y t, m ] E[Y t, m ] Causal effect of directly manipulating the mediator under T = t Natural Mechanisms Natural Indirect effect: E Y 1, M 1 E[Y 1, M 0 ] Counterfactuals about treatment-induced mediator values The following discussions will be focused on this definition. 25

Identification of Causal Effects We face an identification problem since we don t observe Y i 1, M i 0 Sequential Ignorability (Imai et al., 2010a, 2010b) {Y i t, m, M i (t)} T i X i = x Y i t, m M i (t) T i = t, X i = x, for t, t = 0,1 where 0 < Pr T i = t X i = x < 1, 0 < Pr M i (t) = m T i = t, X i = x < 1 Within levels of pretreatment confounders, the treatment is ignorable. Within levels of pretreatment confounders, the mediator is ignorable given the observed treatment. 26

Existing Analytic Methods Instrumental Variable Method (Angrist, Imbens and Rubin, 1996) Exclusion restriction: a constant zero direct effect Assumes no T-by-M interaction Marginal Structural Model For controlled direct effect: Robins, Hernan, and Brumback (2000) For natural direct and indirect effects: VanderWeele (2009) Assumes no T-by-M interaction Modified Regression Approach (Valeri & VanderWeele, 2013) M = d 1 + at + β 1 X + e 1 Y = d 2 + c T + bm + dtm + β 2 X + e 2 Resampling Method Weighting Method 27

Resampling Method (Imai et al., 2010a, 2010b) Algorithm 1 (Parametric) Step 1: Fit models for the observed outcome and mediator variables. Step 2: Simulate model parameters from their sampling distribution. Step 3: Repeat the following three steps for each draw of model parameters: 1. Simulate the potential values of the mediator. 2. Simulate potential outcomes given the simulated values of the mediator. 3. Compute quantities of interest (NDE, NIE, or average total effect). Step 4: Compute summary statistics, such as point estimates (average) and confidence intervals. Sensitivity analysis Algorithm 2 (Nonparametric/Semiparametric) Combine Algorithm 1 with bootstrap R package: mediation http://imai.princeton.edu/software/mediation.html http://web.mit.edu/teppei/www/research/mediationr.pdf 28

Weighting Method Maternal speech is SES is high M i (1) Maternal speech is SES is high M i (0) High SES T i = 1 E[Y(1, M 1 )] = = E[Y T = 1] Low SES T i = 0 E[Y(1, M 0 )] E[Y(0, M 0 )] = E[WY T = 1] E[Y T = 0] W = Pr(M = m T = 0, X = x) Pr(M = m T = 1, X = x) Hong (2010, 2015); Hong et al. (2011, 2015); Hong and Nomi, 2012; Huber (2014); Lange et al. (2012); Lange et al. (2014); Tchetgen Tchetgen and Shpitser (2012); Tchetgen Tchetgen (2013) Software RMPW could be downloaded from: hlmsoft.net/ghong 29

References Angrist, J. D., Imbens, G. W., & Rubin, D. B. (1996). Identification of causal effects using instrumental variables. Journal of the American statistical Association, 91(434), 444-455 Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182. Bollen, K. A., & Stine, R., (1990). Direct and indirect effects: Classical and bootstrap estimates of variability. Sociological Methodology, 20, 115-40. Hong, G. (2010). Ratio of mediator probability weighting for estimating natural direct and indirect effects. In Proceedings of the american statistical association, biometrics section (pp. 2401 2415). American Statistical Association. Hong, G. (2015). Causality in a social world: Moderation, mediation and spill-over. John Wiley & Sons. Hong, G., Deutsch, J., & Hill, H. D. (2011). Parametric and non-parametric weighting methods for estimating mediation effects: An application to the national evaluation of welfare-to-work strategies. In Proceedings of the american statistical association, social statistics section (pp. 3215 3229). American Statistical Association. Hong, G., Deutsch, J., & Hill, H. D. (2015). Ratio-of-mediator-probability weighting for causal mediation analysis in the presence of treatment-by-mediator interaction. Journal of Educational and Behavioral Statistics, 40 (3), 307 340. Hong, G., & Nomi, T. (2012). Weighting methods for assessing policy effects mediated by peer change. Journal of Research on Educational Effectiveness, 5 (3), 261 289. Huber, M. (2014). Identifying causal mechanisms (primarily) based on inverse probability weighting. Journal of Applied Econometrics, 29 (6), 920 943. 30

Imai, K., Keele, L. and Tingley, D. (2010a) A general approach to causal mediation analysis. Psychological methods, 15(4), 309. Imai, K., Keele, L. and Yamamoto, T. (2010b) Identification, inference and sensitivity analysis for causal mediation effects. Statistical Science, 25(1), 51-71. Imbens, G. W., & Rubin, D. B. (2015). Causal inference in statistics, social, and biomedical sciences. Cambridge University Press. Judd, C. M., & Kenny, D. A. (1981). Process analysis: Estimating mediation in treatment evaluations. Evaluation Review, 5, 602-619. Lange, T., Rasmussen, M., & Thygesen, L. (2014). Assessing natural direct and indirect effects through multiple pathways. American journal of epidemiology, 179 (4), 513. Lange, T., Vansteelandt, S., & Bekaert, M. (2012). A simple unified approach for estimating natural direct and indirect effects. American journal of epidemiology, 176 (3), 190 195. MacKinnon, D. P., Lockwood, C. M., & Williams, J. (2004). Confidence limits for the indirect effect: Distribution of the product and resampling methods. Multivariate Behavioral Research, 39, 99-128. Pearl, J. (2001) Direct and indirect effects. In Proceedings of the seventeenth conference on uncertainty in artificial intelligence, pp. 411-420. Morgan Kaufmann Publishers Inc. Robins, J. M., Hernan, M. A., & Brumback, B. (2000). Marginal structural models and causal inference in epidemiology. Epidemiology, 550-560. Rosenbaum, P. R. (1986). Dropping out of high school in the United States: An observational study. Journal of Educational and Behavioral Statistics, 11(3), 207-224. Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41 55. 31

Rubin, D. B. (1978) Bayesian Inference for Causal Effects: The Role of Randomization. The Annals of Statistics, 6(1), 34-58. Rubin, D. B. (1986) Statistics and causal inference: Comment: Which ifs have causal answers. Journal of the American Statistical Association, 81(396), 961-962. Shrout, P. E., & Bolger, N. (2002). Mediation in experimental and nonexperimental studies: New procedures and recommendations. Psychological Methods, 7, 422-445. Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. In S. Leinhardt (Ed.), Sociological Methodology 1982 (pp. 290-312). Washington DC: American Sociological Association. Tchetgen, E. J. T., Shpitser, I., et al. (2012). Semiparametric theory for causal mediation analysis: efficiency bounds, multiple robustness and sensitivity analysis. The Annals of Statistics, 40 (3), 1816 1845. Tchetgen Tchetgen, E. J. (2013). Inverse odds ratio-weighted estimation for causal mediation analysis. Statistics in medicine, 32 (26), 4567 4580. Valeri, L., & VanderWeele, T. J. (2013). Mediation analysis allowing for exposure mediator interactions and causal interpretation: Theoretical assumptions and implementation with sas and spss macros. Psychological methods, 18 (2), 137 VanderWeele, T. J. (2009). Marginal structural models for the estimation of direct and indirect effects. Epidemiology, 20(1), 18-26. VanderWeele, T. (2015). Explanation in causal inference: methods for mediation and interaction. Oxford University Press. Wright, S. (1934). The method of path coefficients. Annals of Mathematical Statistics, 5, 161-215. 32

Thank you! Contact: xuqin@uchicago.edu 33