Causality II: How does causal inference fit into public health and what it is the role of statistics?

Similar documents
Introduction to Causal Calculus

Probabilistic Causal Models

CAUSAL INFERENCE IN THE EMPIRICAL SCIENCES. Judea Pearl University of California Los Angeles (

Causal Inference. Prediction and causation are very different. Typical questions are:

Estimating direct effects in cohort and case-control studies

Mediation for the 21st Century

ANALYTIC COMPARISON. Pearl and Rubin CAUSAL FRAMEWORKS

A Decision Theoretic Approach to Causality

OUTLINE THE MATHEMATICS OF CAUSAL INFERENCE IN STATISTICS. Judea Pearl University of California Los Angeles (

Revision list for Pearl s THE FOUNDATIONS OF CAUSAL INFERENCE

Single World Intervention Graphs (SWIGs):

Statistical Models for Causal Analysis

CAUSALITY. Models, Reasoning, and Inference 1 CAMBRIDGE UNIVERSITY PRESS. Judea Pearl. University of California, Los Angeles

OUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS. Judea Pearl University of California Los Angeles (

The decision theoretic approach to causal inference OR Rethinking the paradigms of causal modelling

Causal mediation analysis: Definition of effects and common identification assumptions

Causality. Pedro A. Ortega. 18th February Computational & Biological Learning Lab University of Cambridge

Estimation of direct causal effects.

The identification of synergism in the sufficient-component cause framework

Casual Mediation Analysis

Mediation analysis for different types of Causal questions: Effect of Cause and Cause of Effect

Causal Directed Acyclic Graphs

Graphical Representation of Causal Effects. November 10, 2016

Statistical Methods for Causal Mediation Analysis

From Causality, Second edition, Contents

Help! Statistics! Mediation Analysis

MINIMAL SUFFICIENT CAUSATION AND DIRECTED ACYCLIC GRAPHS 1. By Tyler J. VanderWeele and James M. Robins. University of Chicago and Harvard University

Front-Door Adjustment

Single World Intervention Graphs (SWIGs): A Unification of the Counterfactual and Graphical Approaches to Causality

Econometric Causality

A Distinction between Causal Effects in Structural and Rubin Causal Models

Models of Causality. Roy Dong. University of California, Berkeley

Confounding, mediation and colliding

The Effects of Interventions

Gov 2002: 4. Observational Studies and Confounding

MINIMAL SUFFICIENT CAUSATION AND DIRECTED ACYCLIC GRAPHS. By Tyler J. VanderWeele and James M. Robins University of Chicago and Harvard University

Bounding the Probability of Causation in Mediation Analysis

Ignoring the matching variables in cohort studies - when is it valid, and why?

PSC 504: Instrumental Variables

A Unification of Mediation and Interaction. A 4-Way Decomposition. Tyler J. VanderWeele

Sensitivity analysis and distributional assumptions

Structure learning in human causal induction

arxiv: v2 [math.st] 4 Mar 2013

OF CAUSAL INFERENCE THE MATHEMATICS IN STATISTICS. Department of Computer Science. Judea Pearl UCLA

Counterfactual Reasoning in Algorithmic Fairness

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS

Causal Models. Macartan Humphreys. September 4, Abstract Notes for G8412. Some background on DAGs and questions on an argument.

Mendelian randomization as an instrumental variable approach to causal inference

Lecture Notes 22 Causal Inference

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Introduction to Statistical Inference

Graphical models and causality: Directed acyclic graphs (DAGs) and conditional (in)dependence

The identi cation of synergism in the su cient-component cause framework

Comments on The Role of Large Scale Assessments in Research on Educational Effectiveness and School Development by Eckhard Klieme, Ph.D.

A proof of Bell s inequality in quantum mechanics using causal interactions

Research Note: A more powerful test statistic for reasoning about interference between units

Automatic Causal Discovery

Carnegie Mellon Pittsburgh, Pennsylvania 15213

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

Mediation analyses. Advanced Psychometrics Methods in Cognitive Aging Research Workshop. June 6, 2016

CMPT Machine Learning. Bayesian Learning Lecture Scribe for Week 4 Jan 30th & Feb 4th

The Doubly-Modifiable Structural Model (DMSM) The Modifiable Structural Model (MSM)

Causal inference in statistics: An overview

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

Using Instrumental Variables to Find Causal Effects in Public Health

Part IV Statistics in Epidemiology

Statistical Analysis of Causal Mechanisms

Relations in epidemiology-- the need for models

Statistical Analysis of Causal Mechanisms

Conditional probabilities and graphical models

Statistical Analysis of Causal Mechanisms for Randomized Experiments

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014

Geoffrey T. Wodtke. University of Toronto. Daniel Almirall. University of Michigan. Population Studies Center Research Report July 2015

Causal Mechanisms Short Course Part II:

Intervention and Causality. Volker Tresp 2015

Single World Intervention Graphs (SWIGs):

Causal Discovery by Computer

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

Causal Analysis After Haavelmo: Definitions and a Unified Analysis of Identification of Recursive Causal Models

DAGS. f V f V 3 V 1, V 2 f V 2 V 0 f V 1 V 0 f V 0 (V 0, f V. f V m pa m pa m are the parents of V m. Statistical Dag: Example.

Journal of Biostatistics and Epidemiology

A Linear Microscope for Interventions and Counterfactuals

Instrumental Sets. Carlos Brito. 1 Introduction. 2 The Identification Problem

The distinction between a biologic interaction or synergism

Causal analysis after Haavelmo. 8th/last Lecture - Hedibert Lopes

Causal discovery from big data : mission (im)possible? Tom Heskes Radboud University Nijmegen The Netherlands

This paper revisits certain issues concerning differences

Causal exposure effect on a time-to-event response using an IV.

Causal Models with Hidden Variables

CAUSAL INFERENCE AS COMPUTATIONAL LEARNING. Judea Pearl University of California Los Angeles (

Using Descendants as Instrumental Variables for the Identification of Direct Causal Effects in Linear SEMs

The International Journal of Biostatistics

Learning in Bayesian Networks

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Variable selection and machine learning methods in causal inference

Mediation Analysis for Health Disparities Research

STA6938-Logistic Regression Model

Lecture 7: Interaction Analysis. Summer Institute in Statistical Genetics 2017

Exploratory Causal Analysis in Bivariate Time Series Data Abstract

Psych 10 / Stats 60, Practice Problem Set 5 (Week 5 Material) Part 1: Power (and building blocks of power)

Transcription:

Causality II: How does causal inference fit into public health and what it is the role of statistics? Statistics for Psychosocial Research II November 13, 2006 1

Outline Potential Outcomes / Counterfactual Judea Pearl Statistics and Causal Inference Aalen and Frigessi Causal Inference in Epidemiology Parascandola and Weed 2

Some history 1910 s & 1920 s: Sewall Wright Introduced path analysis in biology and agricultural sciences Path analysis conceptualized as a method using a sequence of regression models to make inferences about correlations and effects 1930 s & 1950 s: Economists (e.g., Haavelmo) and Bollen SEMs evolve which generalize path models to allow correlated errors. 1950 s: Turner and Stevens Introduce path analysis and SEMs to statistics literature 1960 s: Duncan Introduces path analysis to psychological literature Path analysis and SEM used interchangeably 1970 s: Rubin Introduces concept of potential outcomes. Opens a whole new can of worms. 3

The setting Y = γ + X + τ ε 1 1 Bland Statistical Interpretation: A one unit change in X is associated with an expected change in Y of τ units. We would really LIKE to say: If I am nature, and I change an X in the population from 0 to 1, then Y will change by τ units on average. 4

Causal Inference Relatively new field Uses statistics to make causal inference Interplay between science and statistics Science dictates model Statistics measures magnitude of effect Hypothesis: X Y 5

Example: pregnancy smoking and childhood conduct problems X = pregnancy smoking (binary) Y = childhood conduct (continuous) Data Structure: Y i (0) = childhood conduct for child i if mom does not smoke. Y i (1) = childhood conduct for child i if mom does smoke. Each child has the potential to have an outcome under either scenario. But, we only observe the outcome under the observed smoking status Maughan et al, J. Child Psychol. Psychiat. Vol. 42, No. 8, pp. 1021-1028, 2001 6

Causal Estimation Observed data: Y = α + β X + ε 0 Observed effect: β 0 Full data: EY [ ( 1)] EY [ ( 0)] = βc Causal effect: β C 7

Causal Estimation When does β 0 = β C? Main problem: Counterfactual E[Y(1) X = 1] is observed E[Y(1) X = 0] is counterfactual How do we identify counterfactual outcomes? 8

Randomization Under randomization, Y i (0) and Y i (1) are INDEPENDENT of X i That is, E[Y i (0) X i = 1] = E[Y i (0) X i = 0] In words, counterfactual and observed outcomes are exchangeable 9

Observational Study In an observational study, we cannot assume exchangeability But, if we can control for confounders, we can regain exchangeability 10

Mediation Model α Parenting (Z) β Pregnancy Smoking (X) τ Childhood Conduct (Y) Y Z = τx + = αx + ε ε Y = τ' X + βz + ε 1 2 3 Total effect of X on Y is τ 11

Mediation Model The indirect effect is the effect of X on Y that is mediated by Z One measure of the mediated effect is αβ. Under the potential outcomes framework, a mediated effect might be E[ Y(X = 1,Z = z 1 ) Y(X = 1,Z = z 2 )] Identification of the potential outcomes is tricky 12

Mediation Model Intuition when Y is binary: E[Y(X = 1, Z = z 1 )] = P[Y(X = 1, Z = z 1 )] We can talk about odds and odds ratios: OR C = oddsp[ Y( X = 1, Z = 1)] oddsp[ Y( X = 1, Z = 0)] 13

Mediation Model Complete Mediation α Parenting (Z) β Pregnancy Smoking (X) Childhood Conduct (Y) P[Y(Z = 1 X = 1)] = P[Y(Z = 1 X = 0)] P[Y(X = 1 Z = 1)] P[Y(X = 1 Z = 0)] 14

Mediation Model Under complete mediation, P[Y(X = 1,Z = 1)] = P[Y(X = 0, Z = 1)] P[Y(X = 1,Z = 1)] P[Y(X = 1, Z = 0)] In words, For kids with good parenting, pregnancy smoking IS NOT associated with conduct disorder For kids whose mom s smoked during pregnancy, their conduct is still associated with parenting. 15

Effect Modification? Parenting (Z) Pregnancy Smoking (X) Childhood Conduct (Y) Pregnancy Smoking Z =1 τ 1 Childhood Conduct (Y) Pregnancy Smoking Z =0 τ 0 Childhood Conduct (Y) 16

Historical Problem: Path Analysis Traditional statistical formulation of direct and indirect effects Weakness: absence of time Models of variables versus stochastic processes In our mind, we have time in mind, but not necessarily in our data 17

Historical Problem: Path Analysis Associational versus causal relationships Distinction has not been made clear Statisticians tend to caution people about interpretation But, there has tended to be no formal distinction in modeling 18

Causal Notation do : forcing a change Judea Pearl et al. P[Y(X = 1) do(z=1)] do indicates changing the state of nature CAUSAL versus OBSERVED conditioning Difference is P[Y X] and P[Y do(x)] P[Y do(x)]: what is the change in the expected value of Y if we were to intervene and change the value of X from x to x+1? P[Y X]: what would be the difference in the expected value of Y is we were to FIND X at level x+1 instead of x? Recent work in causal inference forces us to explicitly state assumptions Failing to do so may lead to incorrect inferences 19

Causal Notation Directed Acyclic Graphs (DAGS) Important concepts in understanding causality D-separation Blocking Colliders Non-colliders Descendants 20

So far. Potential Outcomes (i.e., counterfactual) framework Changes the way statisticians think about structural modeling Application? Used appropriately in medicine and clinical trials. What about observational studies? Does it still fit? 21

What can statistics contribute to causal understanding? Many problems in statistics applied to causal inference Pearl, in response to avoidance of causality in statistics literature : This position of caution and avoidance has paralyzed many fields that look to statistics for guidance, especially economics and social science. 22

What can statistics contribute to causal understanding? Various types of causal thinking Experience based Statistics specializes in this If you take the medicine, you will be cured Why? Doesn t necessarily matter Analysis of randomized clinical trial does not need to understand mechanism for treatment effect black box causality Mechanistic based Looks into the black box to understand mechanism Validity of mechanism varies substantially in medical research (e.g., heart function versus cancer versus psychiatric disorders) Can think of a hierarchy between these (more later). 23

New Developments in Statistics (Pearl, 2000) Precise definitions are given of what one should mean by a causal effect (e.g., counterfactual) This has clarified causal thinking New methods for approximating counterfactual comparison (e.g. marginal structural model (Robins, 1986)) Sensitivity studies can be made to see whether confounders or other factors can explain differences 24

Mechanistic Causality These developments apply to experience based causality Mechanistic insights driving science play little role in analysis Randomized trial ignores mechanism Counterfactual causality typically related to action being taken (e.g. pregnancy smoking) Mechanistic causality aims at understanding mechanisms or processes. May be 2ndary to understand whether or not mechanisms can be influenced 25

Mechanistic Causality Statistics is generally most helpful when mechanism is very poorly understood Mechanisms unfold over time. Need sequence of events Granger causality Measurements taken over time How they influence each other Present and past influencing future 26

Levels of Mechanistic Understanding Can be studied at many levels Example: genetic studies Can derive genetic versus environment component Can say genetic cause But, wouldn t it be more detailed to know which genes were the cause? Statistics: causality tends to be thought of absolute Better to think in less definite terms A study can make a step towards mechanistic understanding But, understanding may still be superficial These concepts depend on the level of detail A direct effect may become an indirect effect if new intermediate variables are observable 27

Causation in Epidemiology Essential in epidemiology But, no agreed upon definition Five categories can be delineated Production Necessary and sufficient Sufficient-component Counterfactual Probabilistic 28

Production A cause is something that produces or creates an effect Definition of production and creation are not well-defined Rejected due to this ambiguity 29

Necessary & Sufficient Causes Necessary: must be present for effect to occur Sufficient: in its presence effect must occur 4 combinations Few epidemiologists believe that cause should be limited to necessary conditions Support is based on scientific determinism and one cause model Requires one-to-one correspondence. No role for chance. Too many neither necessary nor sufficient to make this practical. 30

Sufficient-Component Rothman: widely cited. Causes can be neither sufficient nor necessary Made up of a number of components, no one of which is sufficient on its own. Still assumes determinism: no variation or chance allowed. Must assume existence of countless hidden effects: big assumption Rejected because unwieldy. 31

Probabilistic Causation A cause increases the probability that an effect will occur Offers alternative to determinism Makes fewer biological assumptions Little discussion of their strengths and weaknesses in epi literature Fails to explain, for example, why some smokers develop cancer and others do not. Unclear about what it means to increase the probability Cox and Holland object to this idea: how can causal and non-causal associations be differentiated? On its own, probabilistic causation is not enough 32

Counterfactuals Compares outcomes under different conditions Can be either deterministic or probabilistic Counterfactuals are not inconsistent with 4 previous definitions They articulate additional attribute by strengthening distinction between cause and correlation Counterfactual alone is not sufficient for causation 33

Probabilistic + Counterfactual This combination is suggested as the best option for epidemiology Consistent with both deterministic and probabilistic models. Makes few assumptions about unobservables Probabilistic is implicit in practical reasoning: What does physician mean when she tells her patient that he can reduce risk of lung cancer by giving up smoking? Does she mean he MIGHT be an individual for whom smoking tips the balance? Implication: if other component causes are known, he might not need to give up smoking. Trivializes the nature of public health advice By quitting smoking, the probability of lung cancer is lowered. Deterministic account does not allow this approach. 34