OUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS. Judea Pearl University of California Los Angeles (www.cs.ucla.

Similar documents
OUTLINE THE MATHEMATICS OF CAUSAL INFERENCE IN STATISTICS. Judea Pearl University of California Los Angeles (

CAUSAL INFERENCE IN THE EMPIRICAL SCIENCES. Judea Pearl University of California Los Angeles (

CAUSAL INFERENCE IN STATISTICS. A Gentle Introduction. Judea Pearl University of California Los Angeles (

OUTLINE CAUSES AND COUNTERFACTUALS IN THE EMPIRICAL SCIENCES. Judea Pearl University of California Los Angeles (

OF CAUSAL INFERENCE THE MATHEMATICS IN STATISTICS. Department of Computer Science. Judea Pearl UCLA

CAUSAL INFERENCE AS COMPUTATIONAL LEARNING. Judea Pearl University of California Los Angeles (

PANEL ON MEDIATION AND OTHER MIRACLES OF THE 21 ST CENTURY

CAUSES AND COUNTERFACTUALS: CONCEPTS, PRINCIPLES AND TOOLS

From Causality, Second edition, Contents

CAUSALITY. Models, Reasoning, and Inference 1 CAMBRIDGE UNIVERSITY PRESS. Judea Pearl. University of California, Los Angeles

ANALYTIC COMPARISON. Pearl and Rubin CAUSAL FRAMEWORKS

THE MATHEMATICS OF CAUSAL INFERENCE With reflections on machine learning and the logic of science. Judea Pearl UCLA

CAUSALITY CORRECTIONS IMPLEMENTED IN 2nd PRINTING

Gov 2002: 4. Observational Studies and Confounding

Introduction to Causal Calculus

Statistical Models for Causal Analysis

Causality II: How does causal inference fit into public health and what it is the role of statistics?

Causal inference in statistics: An overview

An Introduction to Causal Inference

Causality in the Sciences

The Effects of Interventions

A Decision Theoretic Approach to Causality

The Mathematics of Causal Inference

The Foundations of Causal Inference

Confounding Equivalence in Causal Inference

Ignoring the matching variables in cohort studies - when is it valid, and why?

UCLA Department of Statistics Papers

Comments on The Role of Large Scale Assessments in Research on Educational Effectiveness and School Development by Eckhard Klieme, Ph.D.

Taming the challenge of extrapolation: From multiple experiments and observations to valid causal conclusions. Judea Pearl and Elias Bareinboim, UCLA

Local Characterizations of Causal Bayesian Networks

5.3 Graphs and Identifiability

Technical Track Session I: Causal Inference

Causal and Counterfactual Inference

Recovering Causal Effects from Selection Bias

Job Training Partnership Act (JTPA)

The International Journal of Biostatistics

The Science and Ethics of Causal Modeling

Lecture Discussion. Confounding, Non-Collapsibility, Precision, and Power Statistics Statistical Methods II. Presented February 27, 2018

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Identification of Conditional Interventional Distributions

6.3 How the Associational Criterion Fails

causal inference at hulu

Front-Door Adjustment

A Unification of Mediation and Interaction. A 4-Way Decomposition. Tyler J. VanderWeele

On the Identification of Causal Effects

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

Counterfactual Reasoning in Algorithmic Fairness

Introduction. Abstract

Confounding Equivalence in Causal Inference

THE FOUNDATIONS OF CAUSAL INFERENCE

Probabilities of causation: Bounds and identification

On Measurement Bias in Causal Inference

Mediation for the 21st Century

A Linear Microscope for Interventions and Counterfactuals

An Introduction to Causal Analysis on Observational Data using Propensity Scores

Single World Intervention Graphs (SWIGs): A Unification of the Counterfactual and Graphical Approaches to Causality

Revision list for Pearl s THE FOUNDATIONS OF CAUSAL INFERENCE

Complete Identification Methods for the Causal Hierarchy

PEARL VS RUBIN (GELMAN)

Causal Analysis After Haavelmo

Casual Mediation Analysis

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Identification and Overidentification of Linear Structural Equation Models

Effect Modification and Interaction

STA 4273H: Statistical Machine Learning

Bounding the Probability of Causation in Mediation Analysis

arxiv: v2 [cs.ai] 26 Sep 2018

Informally, the consistency rule states that an individual s

The Do-Calculus Revisited Judea Pearl Keynote Lecture, August 17, 2012 UAI-2012 Conference, Catalina, CA

Causal Analysis After Haavelmo: Definitions and a Unified Analysis of Identification of Recursive Causal Models

Causal analysis after Haavelmo. 8th/last Lecture - Hedibert Lopes

Causal Directed Acyclic Graphs

Estimating direct effects in cohort and case-control studies

Reflections, Elaborations, and Discussions with Readers

Econometric Causality

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015

A Linear Microscope for Interventions and Counterfactuals

Models of Causality. Roy Dong. University of California, Berkeley

Causal Mechanisms Short Course Part II:

Empirical approaches in public economics

On Identifying Causal Effects

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

Single World Intervention Graphs (SWIGs):

Naïve Bayes Classifiers

Using Descendants as Instrumental Variables for the Identification of Direct Causal Effects in Linear SEMs

Causal Analysis in Social Research

Mediation analysis for different types of Causal questions: Effect of Cause and Cause of Effect

Estimating the Marginal Odds Ratio in Observational Studies

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Graphical Representation of Causal Effects. November 10, 2016

What Counterfactuals Can Be Tested

Causal Discovery. Beware of the DAG! OK??? Seeing and Doing SEEING. Properties of CI. Association. Conditional Independence

Controlling for Time Invariant Heterogeneity

Advances in Cyclic Structural Causal Models

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014

Causal mediation analysis: Definition of effects and common identification assumptions

The decision theoretic approach to causal inference OR Rethinking the paradigms of causal modelling

Identifying Linear Causal Effects

Research Design: Causal inference and counterfactuals

Transcription:

OUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea/) Statistical vs. Causal vs. Counterfactual inference: syntax and semantics The logical equivalence of Structural Equation Models (SEM) and potential outcomes (Neyman-Rubin) Where do they differ? 1. Define, 2. Assume, 3. identif 4. Estimate, 5. Test Which covariates should be measured? Causation without manipulation The Mediation Formula and its ramifications Measurement bias and effect restoration. 1 2 3-LEVEL HIERARCH OF CAUSAL MODELS TRADITIONAL STATISTICAL INFERENCE PARADIGM 1. Probabilistic Knowledge y x) Bayesian networks, graphical models P Q(P) (Aspects of P) 2. Interventional Knowledge y do(x)) Causal Bayesian Networks (CBN) (Agnostic graphs, manipulation graphs) Inference 3. Counterfactual Knowledge x = x =y ) Structural equation models, physics, functional graphs, Treatment assignment mechanism 3 e.g., Infer whether customers who bought product A would also buy product B. Q = B A) 4 FROM STATISTICAL TO CAUSAL ANALSIS: 1. THE DIFFERENCES Probability and statistics deal with static relations P change Inference P Q(P ) (Aspects of P ) What happens when P changes? e.g., Infer whether customers who bought product A would still buy A if we were to double the price. 5 FROM STATISTICAL TO CAUSAL ANALSIS: 1. THE DIFFERENCES What remains invariant when P changes sa to satisfy P (price=2)=1 P change P Inference Note: P (v) P (v price = 2) P does not tell us how it ought to change Causal knowledge: what remains invariant Q(P ) (Aspects of P ) 6 1

FROM STATISTICAL TO CAUSAL ANALSIS: 1. THE DIFFERENCES (CONT) 1. Causal and statistical concepts do not mix. CAUSAL STATISTICAL Spurious correlation Regression Randomiation / Intervention Association / Independence Confounding / Effect Controlling for / Conditioning Instrumental variable Odd and risk ratios Exogeneity / Ignorability Collapsibility / Granger causality Mediation Propensity score 2. 3. 4. FROM STATISTICAL TO CAUSAL ANALSIS: 2. MENTAL BARRIERS 1. Causal and statistical concepts do not mix. CAUSAL STATISTICAL Spurious correlation Regression Randomiation / Intervention Association / Independence Confounding / Effect Controlling for / Conditioning Instrumental variable Odd and risk ratios Exogeneity / Ignorability Collapsibility / Granger causality Mediation Propensity score 2. No causes in no causes out (Cartwright, 1989) statistical assumptions + data causal assumptions causal conclusions 3. Causal assumptions cannot be expressed in the mathematical language of standard statistics. 4. } 7 8 FROM STATISTICAL TO CAUSAL ANALSIS: 2. MENTAL BARRIERS BAESIAN NETWORKS 1. Causal and statistical concepts do not mix. CAUSAL STATISTICAL Spurious correlation Regression Randomiation / Intervention Association / Independence Confounding / Effect Controlling for / Conditioning Instrumental variable Odd and risk ratios Exogeneity / Ignorability Collapsibility / Granger causality Mediation Propensity score 2. No causes in no causes out (Cartwright, 1989) statistical assumptions + data causal assumptions causal conclusions 3. Causal assumptions cannot be expressed in the mathematical language of standard statistics. 4. Non-standard mathematics: a) Structural equation models (Wright, 1920; Simon, 1960) b) Counterfactuals (Neyman-Rubin ( x ), Lewis (x )) } 9 Definition P is Markov relative to G iff: V i NDi PAi Algebraic Representation P ( vi,..., vn) = vi pai ) i Graphical Representation PA i ( ) G ( ) P ( d - separation conditional independence) V i G 10 CAUSAL BAESIAN NETWORK (CBN) (Agnostic graphs, manipulation graphs) Definition Let P x (v) stand for v do(=x)), and x arbitrary G is a CBN relative to P x * iff: i. P x (v) is Markov relative to G; ii. P x (v i )=1 for all V i if v i agrees with =x. iii. P x( vi pai ) = vi pai ) for all V i whenever pa i agrees with = i.e., each v i pa i ) remains INVARIANT to interventions not involving V i. Algebraic representation Px ( v) = vi pai ) for all v consistent with x. { i V i } (Truncated product, G-computation, manipulation theorem) Graphical representation Surgery: remove incoming arrows to. 11 12 THE STRUCTURAL MODEL PARADIGM Inference Generating Model M Q(M) (Aspects of M) M Invariant strategy (mechanism, recipe, law, protocol) by which Nature assigns values to variables in the analysis. Think Nature, not experiment! 2

STRUCTURAL CAUSAL MODELS Definition: A structural causal model is a 4-tuple V,U, F,, where V = {V 1,...,V n } are endogeneas variables U ={U 1,...,U m } are background variables F = {f 1,..., f n } are functions determining V, v i = f i (v, e.g., y = α + βx + u is a distribution over U and F induce a distribution v) over observable variables CAUSAL MODELS AND COUNTERFACTUALS Definition: The sentence: would be y (in unit, had been denoted x ( = means: The solution for in a mutilated model M x, (i.e., the equations for replaced by = x) with input U=u, is equal to y. ( U M ( = x U M x ( 13 14 CAUSAL MODELS AND COUNTERFACTUALS Definition: The sentence: would be y (in unit, had been denoted x ( = means: The solution for in a mutilated model M x, (i.e., the equations for replaced by = x) with input U=u, is equal to y. The Fundamental Equation of Counterfactuals: x ( = M ( x 15 CAUSAL MODELS AND COUNTERFACTUALS Definition: The sentence: would be y (in situation, had been denoted x ( = means: The solution for in a mutilated model M x, (i.e., the equations for replaced by = x) with input U=u, is equal to y. probabilities of counterfactuals: x = w = = u: x ( = w( = In particular: y do(x) ) Δ = x = = u: x ( = y PN( x' = y' = u u: x ' ( = y' 16 TWO PARADIGMS FOR CAUSAL INFERENCE Observed:,,,...) Conclusions needed: x =, y =x =... ARE THE TWO PARADIGMS EQUIVALENT? es (Galles and Pearl, 1998; Halpern 1998) How do we connect observables,,,, to counterfactuals x,, y,? N-R model Counterfactuals are primitives, new variables Super-distribution P *(,,...,,...),, constrain... Structural model Counterfactuals are derived quantities Subscripts modify the model and distribution x = = PM ( = x 17 In the N-R paradigm, x is defined by consistency: = x1 + ( 1 x) 0 In SCM, consistency is a theorem. Moreover, a theorem in one approach is a theorem in the other. 18 3

THE FIVE NECESSAR STEPS OF CAUSAL ANALSIS THE FIVE NECESSAR STEPS FOR EFFECT ESTIMATION Define: Express the target quantity Q as a function Q(M) that can be computed from any model M. Assume: Formulate causal assumptions A using some formal language. Identify: Determine if Q is identifiable given A. Estimate: Estimate Q if it is identifiable; approximate it, if it is not. Test: Test the testable implications of A (if an. 19 Define: Express the target quantity Q as a function Q(M) that can be computed from any model M. ATE Δ = E( do( x1)) do( x0)) Assume: Formulate causal assumptions A using some formal language. Identify: Determine if Q is identifiable given A. Estimate: Estimate Q if it is identifiable; approximate it, if it is not. Test: Test the testable implications of A (if an. 20 FORMULATING ASSUMPTIONS THREE LANGUAGES IDENTIFING CAUSAL EFFECTS IN POTENTIAL-OUTCOME FRAMEWORK 1. English: Smoking (), Cancer (), Tar (), Genotypes (U) 2. Counterfactuals: Not too friendly: x consistent? complete? redundant? arguable? 3. Structural: x( = yx(, y( = y( = ( = (, ( = x(, {, } 21 Define: Assume: Identify: Estimate: Express the target quantity Q as a counterfactual formula, e.g., E((1) (0)) Formulate causal assumptions using the distribution: P * =,, (1), (0)) Determine if Q is identifiable using P* and =x (1) + (1 x) (0). Estimate Q if it is identifiable; approximate it, if it is not. 22 GRAPHICAL COUNTERFACTUALS SMBIOSIS Every causal graph expresses counterfactuals assumptions, e.g., 1. Missing arrows x, ( = x( 2. Missing arcs x y consistent, and readable from the graph. IDENTIFICATION IN SCM Find the effect of on, y do(x)), given the causal assumptions shown in G, where 1,..., k are auxiliary variables. 3 1 4 G 2 5 Express assumption in graphs Derive estimands by graphical or algebraic methods 23 6 Can y do(x)) be estimated if only a subset,, can be measured? 24 4

ELIMINATING CONFOUNDING BIAS THE BACK-DOOR CRITERION y do(x)) is estimable if there is a set of variables such that d-separates from in G x. 1 G 2 1 G x 2 EFFECT OF WARM-UP ON INJUR (After Shrier & Platt, 2008) Watch out! Front??? Door 3 4 6 3 5 5 4 6 Moreover, y do( x)) = y = ( adjusting for ) Ignorability x 25 No, no! Warm-up Exercises () Injury () 26 CONFOUNDING EQUIVALENCE EQUALL VALUABLE CONFOUNDING EQUIVALENCE EQUALL VALUABLE L W 1 W 4 T W 2 W 3 V 1 V 2? T 27 Definition: T and are c-equivalent if P ( y t) t) = y y t Definition (Markov boundar: Markov boundary S m of S (relative to ) is the minimal subset of S that d-separates from all other members of S. Theorem (Pearl and Pa, 2009) and T are c-equivalent iff 1. m =T m, or 2. and T are admissible (i.e., satisfy the back-door condition) 28 CONFOUNDING EQUIVALENCE EQUALL VALUABLE CONFOUNDING EQUIVALENCE EQUALL VALUABLE L W 1 W 4 T W 1 W 4 T W 2 W 3 T W 2 W 3 T V 1 V 2 V 1 V 2 29 30 5

CONFOUNDING EQUIVALENCE EQUALL VALUABLE BIAS AMPLIFICATION B INSTRUMENTAL VARIABLES T W 2 W 1 W 1 W 4 W 1 {W 1, W 2 } W 2 W 3 T U V 1 V 2 31 Adding W 2 to Propensity Score increases bias (if such exists) (Wooldridge, 2009) In linear systems always In non-linear systems almost always (Pearl, 2010) Outcome predictors are safer than treatment predictors 32 EFFECT DECOMPOSITION (direct vs. indirect effects) WH DECOMPOSE EFFECTS? 1. Why decompose effects? 2. What is the definition of direct and indirect effects? 3. What are the policy implications of direct and indirect effects? 4. When can direct and indirect effect be estimated consistently from experimental and nonexperimental data? 33 1. To understand how Nature works 2. To comply with legal requirements 3. To predict the effects of new type of interventions: Signal routing, rather than variable fixing 34 (Gender) LEGAL IMPLICATIONS OF DIRECT EFFECT Can data prove an employer guilty of hiring discrimination? (Hiring) What is the direct effect of on? (Qualifications) E ( do( x1 ), do( ) do( x0), do( ) (averaged over Adjust for? No! No! 35 NATURAL INTERPRETATION OF AVERAGE DIRECT EFFECTS Robins and Greenland (1992) Pure = f ( y = g (, Natural Direct Effect of on : DE( x0, x1; ) The expected change in, when we change from x 0 to x 1 and, for each u, we keep constant at whatever value it attained before the change. E[ x ] 1 x x 0 0 In linear models, DE = Controlled Direct Effect = β ( x1 x0 ) 36 6

DEFINITION OF INDIRECT EFFECTS POLIC IMPLICATIONS OF INDIRECT EFFECTS = f ( y = g (, Indirect Effect of on : IE( x0, x1; ) The expected change in when we keep constant, say at x 0, and let change to whatever value it would have attained had changed to x 1. E[ x ] 0 x1 x0 In linear models, IE = TE - DE 37 What is the indirect effect of on? The effect of Gender on Hiring if sex discrimination is eliminated. GENDER IGNORE f HIRING QUALIFICATION Deactivating a link a new type of intervention 38 MEDIATION FORMULAS 1. The natural direct and indirect effects are identifiable in Markovian models (no confounding), 2. And are given by: DE = [ E( do( x1, ) do( x0, )] do( x0)). IE = E( do( x0, )[ do( x1 )) do( x0))] 3. Applicable to linear and non-linear models, continuous and discrete variables, regardless of distributional form. 39 MEDIATION FORMULAS IN UNCONFOUNDED MODELS DE = [ E( x1, x0, ] x0) IE = [ E( x0, [ x1) x0)] TE = E( x1) x0) TE DE = Fraction of responses owed to mediation IE = Fraction of responses explained by mediation 40 n 1 0 0 0 n 2 0 0 1 n 3 0 1 0 n 4 0 1 1 n 5 1 0 0 n 6 1 0 1 n 7 1 1 0 n 8 1 1 1 COMPUTING THE MEDIATION FORMULA E( =g x n2 = g00 n1 + n2 n4 = g01 n3 + n4 n6 = g10 n5 + n6 n8 = g11 n7 + n8 E( x)=h x n3 + n4 = h0 n1 + n2 + n3 + n4 n7 + n8 = h1 n5 + n6 + n7 + n8 DE = ( g10 g00)(1 h0 ) + ( g11 g01) h0 IE = ( h1 h0 )( g01 g00) 41 RAMIFICATION OF THE MEDIATION FORMULA DE should be averaged over mediator levels, IE should NOT be averaged over exposure levels. TE-DE need not equal IE TE-DE = proportion for whom mediation is necessary IE = proportion for whom mediation is sufficient TE-DE informs interventions on indirect pathways IE informs intervention on direct pathways. 42 7

MEASUREMENT BIAS AND EFFECT RESTORATION Unobserved w y do(x)) is identifiable from measurement of W, if w is given (Selen, 1986; Greenland W & Lash, 2008) w = w (local independence) w) =, w) = w = w P ( = I(, w) w) w 43 W EFFECT RESTORATION IN BINAR MODELS 1 To cell ( = 0) 1 undefined undefined w 1 W = 0 = 1) = ε δ To cell 1 ε ( = 1) 1 W = 1 = 0) = δ w 1 1 ( ) ( ( )) 1 ) P x P y do x 1 ( 1 ) δ P x w w1 w1 ) w + 0 ) 1 1 x) 1 0 ) ε x w w0 w0 ) Weight distribution from cell (W = 1) 44 EFFECT RESTORATION IN LINEAR MODELS c 1 c c 2 3 W c 0 (a) c 1 c c c 2 4 3 W V c 0 cov( ) cov( V ) cov( W ) cov( WV ) c0 = cov( V ) var( ) cov( W ) cov( WV ) (b) σ xy σ xwσ yw / k c k c 2 0 = = 3 σ σ xxσ 2 xw / k CONCLUSIONS I He TOLD is wise OU who CAUSALIT bases causal IS inference SIMPLE on an explicit causal structure that is Formal basis for causal and counterfactual defensible on scientific grounds. inference (complete) Unification of the graphical, potential-outcome and structural equation approaches Friendly and formal solutions to From Charlie Poole (Aristotle 384-322 B.C.) century-old problems and confusions. Correlated proxies (Cai & Kuroki, 2008) 45 46 QUESTIONS??? They will be answered 47 8