OUTLINE THE MATHEMATICS OF CAUSAL INFERENCE IN STATISTICS. Judea Pearl University of California Los Angeles (www.cs.ucla.

Similar documents
CAUSAL INFERENCE IN THE EMPIRICAL SCIENCES. Judea Pearl University of California Los Angeles (

OF CAUSAL INFERENCE THE MATHEMATICS IN STATISTICS. Department of Computer Science. Judea Pearl UCLA

CAUSAL INFERENCE AS COMPUTATIONAL LEARNING. Judea Pearl University of California Los Angeles (

OUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS. Judea Pearl University of California Los Angeles (

CAUSAL INFERENCE IN STATISTICS. A Gentle Introduction. Judea Pearl University of California Los Angeles (

OUTLINE CAUSES AND COUNTERFACTUALS IN THE EMPIRICAL SCIENCES. Judea Pearl University of California Los Angeles (

From Causality, Second edition, Contents

CAUSES AND COUNTERFACTUALS: CONCEPTS, PRINCIPLES AND TOOLS

CAUSALITY. Models, Reasoning, and Inference 1 CAMBRIDGE UNIVERSITY PRESS. Judea Pearl. University of California, Los Angeles

PANEL ON MEDIATION AND OTHER MIRACLES OF THE 21 ST CENTURY

THE MATHEMATICS OF CAUSAL INFERENCE With reflections on machine learning and the logic of science. Judea Pearl UCLA

ANALYTIC COMPARISON. Pearl and Rubin CAUSAL FRAMEWORKS

Introduction to Causal Calculus

Causality in the Sciences

Causality II: How does causal inference fit into public health and what it is the role of statistics?

Taming the challenge of extrapolation: From multiple experiments and observations to valid causal conclusions. Judea Pearl and Elias Bareinboim, UCLA

Bounding the Probability of Causation in Mediation Analysis

Causal inference in statistics: An overview

The Foundations of Causal Inference

Technical Track Session I: Causal Inference

The Mathematics of Causal Inference

The Effects of Interventions

The Science and Ethics of Causal Modeling

Probabilities of causation: Bounds and identification

An Introduction to Causal Inference

A Decision Theoretic Approach to Causality

Gov 2002: 4. Observational Studies and Confounding

Identification of Conditional Interventional Distributions

Introduction. Abstract

Causal Analysis After Haavelmo

CAUSALITY CORRECTIONS IMPLEMENTED IN 2nd PRINTING

Causal Analysis After Haavelmo

Reflections, Elaborations, and Discussions with Readers

Counterfactual Reasoning in Algorithmic Fairness

THE FOUNDATIONS OF CAUSAL INFERENCE

Revision list for Pearl s THE FOUNDATIONS OF CAUSAL INFERENCE

Stability of causal inference: Open problems

Complete Identification Methods for the Causal Hierarchy

5.3 Graphs and Identifiability

External Validity and Transportability: A Formal Approach

Causal and Counterfactual Inference

Identification and Overidentification of Linear Structural Equation Models

UCLA Department of Statistics Papers

1 Introduction The research questions that motivate most studies in statistics-based sciences are causal in nature. For example, what is the ecacy of

Casual Mediation Analysis

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

Causality. Pedro A. Ortega. 18th February Computational & Biological Learning Lab University of Cambridge

here called \descriptive" (see Section 2.2), which concerns the action of causal forces under natural, rather than experimental conditions, and provid

Mediation for the 21st Century

Bounds on Direct Effects in the Presence of Confounded Intermediate Variables

REASONING WITH CAUSE AND EFFECT

THE SCIENCE OF CAUSE AND EFFECT

Estimating complex causal effects from incomplete observational data

arxiv: v2 [math.st] 4 Mar 2013

causal inference at hulu

Statistical Models for Causal Analysis

Comments on The Role of Large Scale Assessments in Research on Educational Effectiveness and School Development by Eckhard Klieme, Ph.D.

ASA Section on Health Policy Statistics

Local Characterizations of Causal Bayesian Networks

(or death). This paper deals with the question of estimating the probability of causation from statistical data. Causation has two faces, necessary an

Causal Effect Evaluation and Causal Network Learning

Causes of Effects and Effects of Causes

What Counterfactuals Can Be Tested

The decision theoretic approach to causal inference OR Rethinking the paradigms of causal modelling

Biometrics Section. x = v y = x + u β β

Technical Track Session I:

Identifiability in Causal Bayesian Networks: A Gentle Introduction

Mediation analysis for different types of Causal questions: Effect of Cause and Cause of Effect

Identifiability in Causal Bayesian Networks: A Sound and Complete Algorithm

Front-Door Adjustment

On Identifying Causal Effects

Identification and Estimation of Causal Effects from Dependent Data

Confounding Equivalence in Causal Inference

Instructor: Sudeepa Roy

Causal Effect Identification in Alternative Acyclic Directed Mixed Graphs

Causal Analysis in Social Research

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Generalized Do-Calculus with Testable Causal Assumptions

Single World Intervention Graphs (SWIGs):

The Doubly-Modifiable Structural Model (DMSM) The Modifiable Structural Model (MSM)

Direct and Indirect Effects

The cover page of the Encyclopedia of Health Economics (2014) Introduction to Econometric Application in Health Economics

Tutorial: Causal Model Search

SEM REX B KLINE CONCORDIA D. MODERATION, MEDIATION

Transportability of Causal and Statistical Relations: A Formal Approach

Ignoring the matching variables in cohort studies - when is it valid, and why?

Generalized Quantifiers Logical and Linguistic Aspects

A Unification of Mediation and Interaction. A 4-Way Decomposition. Tyler J. VanderWeele

Applications of Exponential Functions Group Activity 7 STEM Project Week #10

A Linear Microscope for Interventions and Counterfactuals

Computational Systems Biology: Biology X

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015

On the Identification of a Class of Linear Models

Models of Causality. Roy Dong. University of California, Berkeley

Path analysis for discrete variables: The role of education in social mobility

Identifying Linear Causal Effects

Causal Structure Learning and Inference: A Selective Review

Help! Statistics! Mediation Analysis

Mediation analyses. Advanced Psychometrics Methods in Cognitive Aging Research Workshop. June 6, 2016

Single World Intervention Graphs (SWIGs): A Unification of the Counterfactual and Graphical Approaches to Causality

Transcription:

THE MATHEMATICS OF CAUSAL INFERENCE IN STATISTICS Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea) OUTLINE Modeling: Statistical vs. Causal Causal Models and Identifiability to three types of claims: 1. Effects of potential interventions 2. Claims about attribution (responsibilit 3. Claims about direct and indirect effects TRADITIONAL STATISTICAL INFERENCE PARADIGM P Q(P) (Aspects of P) e.g., Infer whether customers who bought product A would also buy product B. Q = B A) FROM STATISTICAL TO CAUSAL ANALSIS: 1. THE DIFFERENCES Probability and statistics deal with static relations P change P Q(P ) (Aspects of P ) What happens when P changes? e.g., Infer whether customers who bought product A would still buy A if we were to double the price. FROM STATISTICAL TO CAUSAL ANALSIS: 1. THE DIFFERENCES What remains invariant when P changes say, to satisfy P (price=2)=1 P change P Note: P (v) P (v price = 2) P does not tell us how it ought to change e.g. Curing symptoms vs. curing diseases e.g. Analogy: mechanical deformation Q(P ) (Aspects of P ) FROM STATISTICAL TO CAUSAL ANALSIS: 1. THE DIFFERENCES (CONT) 1. Causal and statistical concepts do not mix. CAUSAL STATISTICAL Spurious correlation Regression Randomization Association / Independence Confounding / Effect Controlling for / Conditioning Instrument Odd and risk ratios Holding constant Collapsibility Explanatory variables Propensity score 2. 3. 4. 1

FROM STATISTICAL TO CAUSAL ANALSIS: 1. THE DIFFERENCES (CONT) 1. Causal and statistical concepts do not mix. CAUSAL STATISTICAL Spurious correlation Regression Randomization Association / Independence Confounding / Effect Controlling for / Conditioning Instrument Odd and risk ratios Holding constant Collapsibility Explanatory variables Propensity score 2. No causes in no causes out (Cartwright, 1989) statistical assumptions + data causal assumptions causal conclusions 3. Causal assumptions cannot be expressed in the mathematical language of standard statistics. 4. } FROM STATISTICAL TO CAUSAL ANALSIS: 1. THE DIFFERENCES (CONT) 1. Causal and statistical concepts do not mix. CAUSAL STATISTICAL Spurious correlation Regression Randomization Association / Independence Confounding / Effect Controlling for / Conditioning Instrument Odd and risk ratios Holding constant Collapsibility Explanatory variables Propensity score 2. No causes in no causes out (Cartwright, 1989) statistical assumptions + data causal assumptions causal conclusions 3. Causal assumptions cannot be expressed in the mathematical language of standard statistics. 4. Non-standard mathematics: a) Structural equation models (Wright, 1920; Simon, 1960) b) Counterfactuals (Neyman-Rubin ( x ), Lewis (x )) } WH CAUSALIT NEEDS SPECIAL MATHEMATICS FROM STATISTICAL TO CAUSAL ANALSIS: 2. THE MENTAL BARRIERS SEM Equations are Non-algebraic: = 2 = 1 Process information Had been 3, would be 6. If we raise to 3, would be 6. Must wipe out = 1. = 1 = 2 Static information 1. Every exercise of causal analysis must rest on untested, judgmental causal assumptions. 2. Every exercise of causal analysis must invoke non-standard mathematical notation. TWO PARADIGMS FOR CAUSAL INFERENCE Observed:,,,...) Conclusions needed: x =, y =x =z)... How do we connect observables,,,, to counterfactuals x, z, y,? THE STRUCTURAL MODEL PARADIGM Generating Model Q(M) (Aspects of M) N-R model Counterfactuals are primitives, new variables Super-distribution P*(,,, x, z, ),, constrain x, y, Structural model Counterfactuals are derived quantities Subscripts modify a data-generating model M Oracle for computing answers to Q s. e.g., Infer whether customers who bought product A would still buy A if we were to double the price. 2

FAMILIAR CAUSAL MODEL ORACLE FOR MANIPILATION STRUCTURAL CAUSAL MODELS INPUT OUTPUT Definition: A structural causal model is a 4-tuple V,U, F,, where V = {V 1,...,V n } are observable variables U ={U 1,...,U m } are background variables F = {f 1,..., f n } are functions determining V, v i = f i (v, is a distribution over U and F induce a distribution v) over observable variables CAUSAL MODELS AND COUNTERFACTUALS Definition: The sentence: would be y (in situation, had been x, denoted x ( = y, means: The solution for in a mutilated model M x, (i.e., the equations for replaced by = x) with input U=u, is equal to y. probabilities of counterfactuals: x = y, w = z) = u: x( = y, w( = z The super-distribution P* is derived from M. Parsimonious, consistent, and transparent APPLICATIONS 1. Predicting effects of actions and policies 2. Learning causal relationships from assumptions and data 3. Troubleshooting physical systems and plans 4. Finding explanations for reported events 5. Generating verbal explanations 6. Understanding causal talk 7. Formulating theories of causal thinking AIOMS OF CAUSAL COUNTERFACTUALS x ( = y : would be y, had been x (in state U = 1. Definiteness x s. t. y ( = x 2. Uniqueness ( y ( = x) & ( y( = x') x = x' 3. Effectiveness xw ( = x 4. Composition W x ( = w xw( = x( 5. Reversibility ( xw ( = y & ( Wxy( = w) x( = y RULES OF CAUSAL CALCULUS Rule 1: Ignoring observations y do{x}, z, w) = y do{x}, w) if (,W ) G Rule 2: Action/observation exchange y do{x}, do{z}, w) = y do{x},z,w) if (,W ) Rule 3: Ignoring actions y do{x}, do{z}, w) = y do{x}, w) G if (,W ) G (W) 3

DERIVATION IN CAUSAL CALCULUS Genotype (Unobserved) Smoking Tar Cancer P (c do{s}) = Σ t P (c do{s}, t) P (t do{s}) Probability Axioms = Σ t P (c do{s}, do{t}) P (t do{s}) = Σ t P (c do{s}, do{t}) P (t s) = Σ t P (c do{t}) P (t s) Rule 2 Rule 2 Rule 3 = Σ s Σ t P (c do{t}, s ) P (s do{t}) t s) Probability Axioms = Σ s Σ t P (c t, s ) P (s do{t}) t s) Rule 2 = Σ s Σ t P (c t, s ) P (s ) t s) Rule 3 THE BACK-DOOR CRITERION Graphical test of identification y do(x)) is identifiable in G if there is a set of variables such that d-separates from in G x. 3 1 4 6 G Moreover, y do(x)) = y x,z) z) z ( adjusting for ) 2 5 3 1 4 6 G x 2 5 RECENT RESULTS ON IDENTIFICATION do-calculus is complete DETERMINING THE CAUSES OF EFFECTS (The Attribution Problem) our Honor! My client (Mr. A) died BECAUSE he used that drug. Complete graphical criterion for identifying causal effects (Shpitser and Pearl, 2006). Complete graphical criterion for empirical testability of counterfactuals (Shpitser and Pearl, 2007). DETERMINING THE CAUSES OF EFFECTS (The Attribution Problem) our Honor! My client (Mr. A) died BECAUSE he used that drug. Semantical Problem: THE PROBLEM 1. What is the meaning of PN(x,: Probability that event y would not have occurred if it were not for event x, given that x and y did in fact occur. Court to decide if it is MORE PROBABLE THAN NOT that A would be alive BUT FOR the drug! PN =? A is dead, took the drug) > 0.50 4

Semantical Problem: THE PROBLEM 1. What is the meaning of PN(x,: Probability that event y would not have occurred if it were not for event x, given that x and y did in fact occur. Answer: Computable from M PN( x, = x ' = y' x, Semantical Problem: THE PROBLEM 1. What is the meaning of PN(x,: Probability that event y would not have occurred if it were not for event x, given that x and y did in fact occur. Analytical Problem: 2. Under what condition can PN(x, be learned from statistical data, i.e., observational, experimental and combined. TPICAL THEOREMS (Tian and Pearl, 2000) Bounds given combined nonexperimental and experimental data 0 1 y ) ( ) max x' P y' PN min x' x, x, Identifiability under monotonicity (Combined data) y x) y x' ) y x' ) y ) PN = + x' y x) x, corrected Excess-Risk-Ratio CAN FREQUENC DATA DECIDE LEGAL RESPONSIBILIT? Experimental Nonexperimental do(x) do(x ) x x Deaths ( 16 14 2 28 Survivals (y ) 984 986 998 972 1,000 1,000 1,000 1,000 Nonexperimental data: drug usage predicts longer life Experimental data: drug has negligible effect on survival Plaintiff: Mr. A is special. 1. He actually died 2. He used the drug by choice Court to decide (given both data): Is it more probable than not that A would be alive but for the drug? PN Δ = x' = y' x, > 0.50 SOLUTION TO THE ATTRIBUTION PROBLEM EFFECT DECOMPOSITION What is the semantics of direct and indirect effects? What are their policy-making implications? WITH PROBABILIT ONE 1 y x x, 1 Combined data tell more that each study alone Can we estimate them from data? Experimental data? 5

WH DECOMPOSE EFFECTS? 1. Direct (or indirect) effect may be more transportable. 2. Indirect effects may be prevented or controlled. Pill Pregnancy + + Thrombosis 3. Direct (or indirect) effect may be forbidden Gender Hiring Qualification SEMANTICS BECOMES NONTRIVIAL IN NONLINEAR MODELS (even when the model is completely specified) TE Δ = E( do( x)) x DE Δ = E( do( x), do( z)) x IE Δ =???? z = f (x, ε 1 ) y = g (x, z, ε 2 ) Dependent on z? Void of operational meaning? THE OPERATIONAL MEANING OF DIRECT EFFECTS THE OPERATIONAL MEANING OF INDIRECT EFFECTS z = f (x, ε 1 ) y = g (x, z, ε 2 ) z = f (x, ε 1 ) y = g (x, z, ε 2 ) Natural Direct Effect of on : The expected change in per unit change of, when we keep constant at whatever value it attains before the change. E[ x ] 1 x x 0 0 In linear models, NDE = Controlled Direct Effect Natural Indirect Effect of on : The expected change in when we keep constant, say at x 0, and let change to whatever value it would have under a unit change in. E[ x ] 0 x1 x0 In linear models, NIE = TE - DE POLIC IMPLICATIONS OF INDIRECT EFFECTS indirect What is the direct effect of on? The effect of Gender on Hiring if sex discrimination is eliminated. GENDER IGNORE f HIRING QUALIFICATION SEMANTICS AND IDENTIFICATION OF NESTED COUNTERFACTUALS Consider the quantity Q Δ = Eu [ x ( ] x*( Given M,, Q is well defined Given u, x* ( is the solution for in M x*, call it z x x ( ) is the solution for in M * ( u xz experimental Can Q be estimated from data? nonexperimental 6

W GENERAL PATH-SPECIFIC EFFECTS (Def.) z * = x* ( Form a new model, M g *, specific to active subgraph g fi * ( pai, u; g) = fi( pai( g), pai*( g ), u ) Definition: g-specific effect Nonidentifiable even in Markovian models W x * E g ( x, x* ; ) M = TE( x, x* ; ) M * g EFFECT DECOMPOSITION SUMMAR Graphical conditions for estimability from experimental / nonexperimental data. Graphical conditions hold in Markovian models Useful in answering new type of policy questions involving mechanism blocking instead of variable fixing. CONCLUSIONS Structural-model semantics, enriched with logic and graphs, provides: Complete formal basis for causal reasoning Powerful and friendly causal calculus Lays the foundations for asking more difficult questions: What is an action? What is free will? Should robots be programmed to have this illusion? 7