Causal Mechanisms Short Course Part II:

Similar documents
Unpacking the Black-Box: Learning about Causal Mechanisms from Experimental and Observational Studies

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Unpacking the Black-Box: Learning about Causal Mechanisms from Experimental and Observational Studies

Experimental Designs for Identifying Causal Mechanisms

Statistical Analysis of Causal Mechanisms

Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Statistical Analysis of Causal Mechanisms

Identification and Inference in Causal Mediation Analysis

Causal Mediation Analysis in R. Quantitative Methodology and Causal Mechanisms

Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Statistical Analysis of Causal Mechanisms

Identification, Inference, and Sensitivity Analysis for Causal Mediation Effects

Identification and Sensitivity Analysis for Multiple Causal Mechanisms: Revisiting Evidence from Framing Experiments

Statistical Analysis of Causal Mechanisms for Randomized Experiments

mediation: R Package for Causal Mediation Analysis

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

Experimental Designs for Identifying Causal Mechanisms

Causal mediation analysis: Definition of effects and common identification assumptions

Bayesian Causal Mediation Analysis for Group Randomized Designs with Homogeneous and Heterogeneous Effects: Simulation and Case Study

Casual Mediation Analysis

Identification, Inference and Sensitivity Analysis for Causal Mediation Effects

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

Published online: 19 Jun 2015.

Flexible mediation analysis in the presence of non-linear relations: beyond the mediation formula.

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Revision list for Pearl s THE FOUNDATIONS OF CAUSAL INFERENCE

Statistical Models for Causal Analysis

Telescope Matching: A Flexible Approach to Estimating Direct Effects

Help! Statistics! Mediation Analysis

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

R and Stata for Causal Mechanisms

Conceptual overview: Techniques for establishing causal pathways in programs and policies

Identification and Estimation of Causal Mediation Effects with Treatment Noncompliance

Telescope Matching: A Flexible Approach to Estimating Direct Effects *

Discussion of Papers on the Extensions of Propensity Score

Introduction to Statistical Inference

Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Department of Biostatistics University of Copenhagen

Assessing In/Direct Effects: from Structural Equation Models to Causal Mediation Analysis

Ratio-of-Mediator-Probability Weighting for Causal Mediation Analysis. in the Presence of Treatment-by-Mediator Interaction

Introduction to Causal Inference. Solutions to Quiz 4

Difference-in-Differences Methods

Combining multiple observational data sources to estimate causal eects

arxiv: v3 [stat.me] 4 Apr 2018

Estimating direct effects in cohort and case-control studies

Statistical Methods for Causal Mediation Analysis

Propensity Score Weighting with Multilevel Data

Mediation: Background, Motivation, and Methodology

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

An Introduction to Causal Analysis on Observational Data using Propensity Scores

A Unification of Mediation and Interaction. A 4-Way Decomposition. Tyler J. VanderWeele

ONLINE APPENDICES Emotions and the Micro-Foundations of Commitment Problems

Mediation analyses. Advanced Psychometrics Methods in Cognitive Aging Research Workshop. June 6, 2016

13.1 Causal effects with continuous mediator and. predictors in their equations. The definitions for the direct, total indirect,

Non-parametric Mediation Analysis for direct effect with categorial outcomes

Comparison of Three Approaches to Causal Mediation Analysis. Donna L. Coffman David P. MacKinnon Yeying Zhu Debashis Ghosh

New developments in structural equation modeling

Mediation and Interaction Analysis

Causal mediation analysis in the context of clinical research

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha

Estimating the Marginal Odds Ratio in Observational Studies

arxiv: v2 [math.st] 4 Mar 2013

mediation: R Package for Causal Mediation Analysis

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

A Decision Theoretic Approach to Causality

University of California, Berkeley

When Should We Use Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Causal Inference for Mediation Effects

Causal Interaction in Factorial Experiments: Application to Conjoint Analysis

Gov 2002: 4. Observational Studies and Confounding

Flexible Mediation Analysis in the Presence of Nonlinear Relations: Beyond the Mediation Formula

The Stata Journal. Editor Nicholas J. Cox Department of Geography Durham University South Road Durham DH1 3LE UK

CAUSAL MEDIATION ANALYSIS FOR NON-LINEAR MODELS WEI WANG. for the degree of Doctor of Philosophy. Thesis Adviser: Dr. Jeffrey M.

Mediation Analysis: A Practitioner s Guide

Parametric and Non-Parametric Weighting Methods for Mediation Analysis: An Application to the National Evaluation of Welfare-to-Work Strategies

On the Use of Linear Fixed Effects Regression Models for Causal Inference

Quantitative Economics for the Evaluation of the European Policy

Flexible Estimation of Treatment Effect Parameters

Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1

Causality II: How does causal inference fit into public health and what it is the role of statistics?

The International Journal of Biostatistics

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Causal Mediation Analysis Using R

Selection on Observables: Propensity Score Matching.

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

CompSci Understanding Data: Theory and Applications

Covariate Balancing Propensity Score for General Treatment Regimes

Causal Mediation Analysis Short Course

Gov 2002: 13. Dynamic Causal Inference

SEM REX B KLINE CONCORDIA D. MODERATION, MEDIATION

Harvard University. Harvard University Biostatistics Working Paper Series. Semiparametric Estimation of Models for Natural Direct and Indirect Effects

arxiv: v1 [math.st] 28 Feb 2017

Front-Door Adjustment

The Causal Mediation Formula A Guide to the Assessment of Pathways and Mechanisms

Sensitivity checks for the local average treatment effect

Econometric Analysis of Cross Section and Panel Data

What s New in Econometrics. Lecture 1

Transcription:

Causal Mechanisms Short Course Part II: Analyzing Mechanisms with Experimental and Observational Data Teppei Yamamoto Massachusetts Institute of Technology March 24, 2012 Frontiers in the Analysis of Causal Mechanisms Pre-Conference Short Course

Overview Our central question: How can we learn about causal mechanisms from data? In this tutorial, you will learn: A quantitative definition of causal mechanisms Assumptions needed to identify a causal mechanism from data A general procedure to estimate a causal mechanism (given the assumptions) Methods for analyzing sensitivity to the violation of the assumptions Experimental designs to identify mechanisms with weaker assumptions

Companion Materials This tutorial is based on: Imai, Keele, Tingley and Yamamoto (2011), Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies, APSR. All methods can be implemented by the R package mediation Further details are given in the (more technical) papers listed at the end

What Is a Causal Mechanism? Mechanisms as causal pathways Example: Soil fumigants increase farm crops by reducing eel-worms (Cochran) Causal mediation analysis Mediator, M Treatment, T Outcome, Y Quantities of interest: Direct and indirect effects Fast growing literature in the past 10 20 years

Potential Outcomes Framework Framework: Potential outcomes (Neyman 1923; Rubin 1974) Binary treatment: T i {0, 1} Mediator: M i M Outcome: Y i Y Observed pre-treatment covariates: X i X Potential mediators: M i (t), where M i = M i (T i ) observed Potential outcomes: Y i (t, m), where Y i = Y i (T i, M i (T i )) observed In a standard experiment, only one potential outcome can be observed for each i Moreover, some potential outcomes can never be observed: Y i (t, M i (t )) where t t

Causal Mediation Effects Total causal effect: τ i Y i (1, M i (1)) Y i (0, M i (0)) Causal mediation effects (a.k.a. natural indirect effects): δ i (t) Y i (t, M i (1)) Y i (t, M i (0)) Definition originates in Robins and Greenland (1992) and Pearl (2001) Causal effect of the change in M i on Y i that would be induced by treatment Represents the mechanism through M i

Total Effect = Indirect Effect + Direct Effect Natural direct effects: ζ i (t) Y i (1, M i (t)) Y i (0, M i (t)) Causal effect of T i on Y i, holding mediator constant at its potential value that would realize when T i = t Represents all mechanisms other than through M i Cf. Controlled direct effect: ξ i (m) Y i (1, m) Y i (0, m) Causal effect of T i on Y i when mediator is manipulated at a fixed value m (regardless of unit i s natural response to T i ) Total effect = mediation (indirect) effect + direct effect: τ i = δ i (t) + ζ i (1 t) = 1 2 {δ i(0) + δ i (1) + ζ i (0) + ζ i (1)}

What Do the Observed Data Tell Us? Quantity of Interest: Average causal mediation effects δ(t) E(δ i (t)) = E{Y i (t, M i (1)) Y i (t, M i (0))} Average direct effects ( ζ(t)) are defined similarly Problem: Y i (t, M i (t)) is observed but Y i (t, M i (t )) can never be observed We have an identification problem = Need additional assumptions to make progress

Identification under Standard Research Design Standard experiment: Randomize T i and measure M i and Y i An identification assumption: Sequential Ignorability {Y i (t, m), M i (t)} T i X i = x (1) Y i (t, m) M i (t) T i = t, X i = x (2) (1) is guaranteed to hold in standard experiments (2) does not hold if there exist: unobserved pre-treatment M Y confounders, or any post-treatment M Y confounding, even if observed In observational studies, neither (1) nor (2) is guaranteed Alternative assumptions: Robins, Pearl, Petersen et al., VanderWeele, etc. Theorem (Imai et al. 2010): Under sequential ignorability, ACME and average direct effects are nonparametrically identified (= estimable from data)

A General Estimation Algorithm 1 Model outcome and mediator Outcome model: p(y i T i, M i, X i ) Mediator model: p(m i T i, X i ) These models can be of any form (linear or nonlinear, semior nonparametric, with or without interactions) 2 Predict mediator for both treatment values (M i (1), M i (0)) 3 Predict outcome by first setting T i = 1 and M i = M i (0), and then T i = 1 and M i = M i (1) 4 Compute the average difference between two outcomes to obtain a consistent estimate of ACME 5 Monte-Carlo or bootstrapping to estimate uncertainty (Alternative procedures: Inverse probability weighting, conditional mean models, etc.)

Relationship to the Traditional Mediation Analysis The simplest example: Linear structural equation model: M i = α 2 + β 2 T i + ξ 2 X i + ɛ i2, Y i = α 3 + β 3 T i + γm i + ξ 3 X i + ɛ i3. Traditional procedure (Baron & Kenny 1986): 1 Fit two least squares regressions separately 2 Use product of coefficients ( ˆβ 2ˆγ) to estimate ACME 3 Use asymptotic variance to test significance (Sobel test) Under SI and the no-interaction assumption ( δ(1) δ(0)), this procedure consistently estimates ACME But only if the model is correct! (linearity, unit homogeneity) The proposed algorithm generalizes this to any types of models

Example: Anxiety, Group Cues and Immigration Brader, Valentino & Suhat (2008, AJPS) How and why do ethnic cues affect immigration attitudes? Theory: Anxiety transmits the effect of cues on attitudes Anxiety, M Media Cue, T Immigration Attitudes, Y ACME = Average difference in immigration attitudes due to the change in anxiety induced by the media cue treatment Sequential ignorability = No unobserved covariate affecting both anxiety and immigration attitudes

Reanalysis: Estimates under Sequential Ignorability Original method: Product of coefficients with the Sobel test Valid only when both models are linear w/o T M interaction (which they are not) Our method: Calculate ACME using our general algorithm Product of Average Causal Outcome variables Coefficients Mediation Effect (δ) Decrease Immigration.347.105 δ(1) [0.146, 0.548] [0.048, 0.170] Support English Only Laws.204.074 δ(1) [0.069, 0.339] [0.027, 0.132] Request Anti-Immigration Information.277.029 δ(1) [0.084, 0.469] [0.007, 0.063] Send Anti-Immigration Message.276.086 δ(1) [0.102, 0.450] [0.035, 0.144]

Beyond Sequential Ignorability Assumption (2) is too strong in most scenarios Can we go beyond just making this assumption? Sensitivity analysis: Assess the robustness of the estimates to the violation of sequential ignorability How large a departure from the key assumption must occur for the conclusions to no longer hold? Parametric sensitivity analysis by assuming but not {Y i (t, m), M i (t)} T i X i = x Y i (t, m) M i (t) T i = t, X i = x Addresses the possible existence of unobserved pre-treatment confounders But not post-treatment confounders An alternative approach: VanderWeele (2010)

Parametric Sensitivity Analysis Assume a linear structural equations model: M i = α 2 + β 2 T i + ξ2 X i + ɛ i2, Y i = α 3 + β 3 T i + γm i + ξ3 X i + ɛ i3. Sensitivity parameter: ρ Corr(ɛ i2, ɛ i3 ) Theorem (Imai et al. 2010): ACME is identified given ρ Set ρ to different values and see how ACME changes Sequential ignorability implies ρ = 0 Extension to nonlinear models (e.g. logistic regression for binary outcomes)

Anxiety Example: Sensitivity Analysis w.r.t. ρ Average Mediation Effect: δ(1) 0.4 0.2 0.0 0.2 0.4 1.0 0.5 0.0 0.5 1.0 Sensitivity Parameter: ρ ACME > 0 as long as the error correlation is less than 0.39 (0.30 with 95% CI)

Alternative Formulation for Easier Interpretation Interpreting ρ: how small is too small? An unobserved (pre-treatment) confounder formulation: ɛ i2 = λ 2 U i + ɛ i2 and ɛ i3 = λ 3 U i + ɛ i3 How much does U i have to explain the variances of M i and Y i for our results to go away? Reparameterize ACME using coefficients of determination (R 2 and R 2 )

Anxiety Example: Sensitivity Analysis w.r.t. R 2 M and R 2 Y Proportion of Total Variance in Y Explained by Confounder 0.0 0.1 0.2 0.3 0.4 0.5 0.15 0.1 0.05 0 0.05 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Proportion of Total Variance in M Explained by Confounder An unobserved confounder can account for up to 26.5% of the variation in both Y i and M i before ACME becomes zero

Experimental Designs for Identifying Mechanisms Sensitivity analysis may be unsatisfactory What if we get rid of the assumption altogether? Under a standard design, even the sign of ACME is unidentified (Sjölander) Can we do any better? Use alternative experimental designs for more credible yet powerful inference Designs feasible when the mediator can be directly or indirectly manipulated Experiments also serve as templates for observational studies

Parallel Design Must assume consistency (i.e. no direct effect of manipulation on outcome) More informative than standard single experiment If we assume no T M interaction, ACME is point identified

Parallel Encouragement Design Direct manipulation of the mediator is often infeasible Even if feasible, more subtle form of intervention may be preferred to assure consistency Parallel encouragement design: Randomly encourage subjects to take particular values of the mediator Standard instrumental variable assumptions (Angrist et al.)

Crossover Designs Recall ACME can be identified if we observe Y i (t, M i (t)) Get M i (t), then switch T i to t while holding M i = M i (t) Crossover design: 1 Round 1: Conduct a standard experiment 2 Round 2: Change the treatment to the opposite status but fix the mediator to the value observed in the first round Crossover encouragement design: 1 Round 1: Conduct a standard experiment 2 Round 2: Same as crossover, except encourage subjects to take the mediator values Both must assume no carryover effect

Summary Even in a randomized experiment, a strong assumption is needed to identify causal mechanisms Analyzing mechanisms is, therefore, not so easy! Under the identification assumption, a general estimation procedure is available for various types of statistical models The violation of the assumption can be addressed by: Analyzing sensitivity with respect to key assumptions Creative research designs to avoid strong assumptions Therefore, progress can still be made! Several software implementations to make things easy: R and Stata (next session) SAS and SPSS (session after next)

References Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies. American Political Science Review, 2011. Identification, Inference, and Sensitivity Analysis for Causal Mediation Effects. Statistical Science, 2010. A General Approach to Causal Mediation Analysis. Psychological Methods, 2010. Experimental Designs for Identifying Causal Mechanisms. Journal of the Royal Statistical Society, Series A, 2012. Identification and Sensitivity Analysis for Multiple Causal Mechanisms: Revisiting Evidence from Framing Experiments. Manuscript. Causal Mediation Analysis Using R. Advances in Social Science Research Using R