Taming the challenge of extrapolation: From multiple experiments and observations to valid causal conclusions. Judea Pearl and Elias Bareinboim, UCLA

Size: px

Start display at page:

Download "Taming the challenge of extrapolation: From multiple experiments and observations to valid causal conclusions. Judea Pearl and Elias Bareinboim, UCLA"

Winfred Adams
6 years ago
Views:

1 Taming the challenge of extrapolation: From multiple experiments and observations to valid causal conclusions Judea Pearl and Elias Bareinboim, UCLA One distinct feature of big data applications, setting them apart from traditional causal analyses, is that data are pooled together from multiple sources. Each represents a different population, studied under different conditions, some experimental and some observational, and each contaminated by different confounding factors and different sample selection mechanisms. All of these studies need be pooled together to answer causal questions in yet a new environment, unmatched by any of those studied. The problem, as described above, appears utterly hopeless, almost as hopeless as confronting physics problems in the 16th century, before the advent of algebra. My talk will demonstrate that the solutions to the two types of problems, data pooling on the one hand and algebraic equations on the other, are strikingly similar; they can be unraveled by symbolic operations and managed at ease by anyone who masters the two basic tools of causal inference: graphical models and the logic of causation. The net result is the emergence of algorithms that decide what piece of information must be extracted from each of the available data sources and how to combine those pieces together so as to provide valid answers to a variety of causal questions: hat if? How? and hy? To gain an appreciation for the task, the audience may wish to consider three toy problems described in Figure 3 of this introductory paper All three problems involve the same task: e gather causal information in one environment and we wish to generalize it to another, which differs from the first in a set of characteristics. How do we adjust the available information so as to account for differences between the two populations? It turns out, as readers will immediately discover, that there is no universal adjustment formula for this task; the adjustment should vary from case to case depending on the location of the set in the causal scheme of things. Traditional schemes like post stratification or re weighting would not work, except in special cases. Fortunately, however, once the set is allocated in the causal graph, the proper adjustment can be generated automatically using symbolic operations, in much the same way that we solve algebraic equations. The talk will describe how we go from two to multiple environments, how we characterize the idiosyncratic features of each population or environment, and how we can generalize from unrepresentative samples to population level effects, Related papers can be viewed here: Tutorials reprint.pdf Extrapolation problem

2 FUION + ETRAPOLATION: FROM MULTIPLE EPERIMENT AND OBERVATION TO VALID CAUAL CONCLUION Judea Pearl Elias Bareinboim University of California, Los Angeles ( THE DATA POOLING PROBLEM The problem How to combine results of several experimental and observational studies, each conducted on a different population and under a different set of conditions, so as to construct a valid estimate of effect size in yet a new population, unmatched by any of those studied. 1

3 THE PROBLEM IN REAL LIFE Target population * Query of interest: Q = P*(y do(x)) (a) Arkansas urvey data available (b) New ork urvey data Resembling target (c) Los Angeles urvey data ounger population (d) Boston Age not recorded Mostly successful lawyers (g) Toronto Randomized trial College students (e) an Francisco High post-treatment blood pressure (h) Utah RCT, paid volunteers, unemployed (f) Texas Mostly panish subjects High attrition (i) yoming RCT, young athletes THE PROBLEM IN MATHEMATIC Target population * Query of interest: Q = P*(y do(x)) (a) (b) (c) (d) (e) (f) (g) (h) (i) 2

4 FIVE LEON FROM THE THEATRE OF CAUAL INFERENCE 1. Every causal inference task must rely on judgmental, extra-data assumptions. 2. e have ways of encoding those assumptions mathematically. 3. e have a mathematical machinery to take those assumptions, combine them with data and derive answers to questions of interest. 4. e have a way of doing (2) and (3) in a language that permits us to judge the scientific plausibility of our assumptions and to derive their ramifications swiftly and transparently. 5. Items (2)-(4) make the fusion problem manageable. OUTLINE Three useful tools Reading a diagram Representing interventions Do-calculus: evaluating interventions Transportability: The two-population problem Three toy examples and their solutions hy transportation requires causal calculus The mathematics of solving extrapolation problems The inference engine Examples 6 3

5 READING A DIAGRAM Graph (G) (prinkler) C (Climate) (etness) R (Rain) Model (M) C f C (U C ) f (C,U ) R f R (C,U R ) f (, R,U ) Miracles do happen If the U 's are independent, the observed distribution P(C,R,,) satisfies constraints that are: (1) independent of the f 's and of P(U), (2) readable from the graph. READING A DIAGRAM (Cont.) Graph (G) C (Climate) Model (M) C f C (U C ) R f (C,U ) (prinkler) (Rain) R f R (C,U R ) (etness) f (, R,U ) Every missing arrow advertises an independency, conditional on a separating set. e.g., C (, R) R C Applications: 1. Model testing 2. tructure learning 3. Reducing "what if I do" questions to symbolic calculus 4. Reducing extrapolation to symbolic calculus 4

6 REPREENTING INTERVENTION Definition: Given a model M, the effect of setting to x, P( = y do (=x)), is equal to the probability of = y in a mutilated model M x, in which the equation for is replaced by = x. M U M x U (u) (u) = x x (u) The Fundamental Equation of Interventions: P( y do( x)) P M x ( y) P( x y) THE TO FUNDAMENTAL LA OF CAUAL INFERENCE 1. The Law of Counterfactuals (and Interventions) x (u) M x (u) (M generates and evaluates all counterfactuals.) 2. The Law of Conditional Independence (d-separation) ( sep ) G(M ) ( ) P(v) (eparation in the model independence in the distribution.) 5

7 ELIMINATING CONFOUNDING BIA THE BACK-DOOR CRITERION P(y do(x)) is estimable if there is a set of variables that d-separates from in Gx ("conditional ignorability ) x G G x Moreover, P(y do((x)) ( adjusting for z) 6 P(y x,z)p(z) z GOING BEOND ADJUTMENT Goal: Find the effect of moking on Cancer, P(c do(s)), given samples from P(, T, C), when latent variables confound the relationship -C. Query Genotype (Unobserved) Data moking Tar Cancer 6

8 IDENTIFICATION REDUCED TO CALCULU Query Genotype (Unobserved) moking Tar Cancer P(c do(s)) t P(c do(s),t)p(t do(s)) Probability Axioms t P(c do(s),do(t))p(t do(s)) Rule 2 t P(c do(s),do(t))p(t s) Rule 2 t P(c do(t)p(t s) Rule 3 s' t P(c do(t),s')p(s' do(t))p(t s) Probability Axioms s' t P(c t,s')p(s' do(t))p(t s) Rule 2 s' t P(c t,s')p(s')p(t s) Rule 3 Estimand DO-CALCULU (IDENTIFICATION REDUCED TO CALCULU) The following transformations are valid for every interventional distribution generated by a structural causal model M: Rule 1: Ignoring observations P(y do(x),z,w) P(y do(x),w), Rule 2: Action/observation exchange P(y do(x),do(z),w) P(y do(x), z,w), Rule 3: Ignoring actions P(y do(x),do(z),w) P(y do(x),w), 7

9 UMMAR: THE IDENTIFICATION PROBLEM I OLVED (NONPARAMETRICALL) The estimability of any expression of the form Q P(y 1, y 2,..., y n do(x 1, x 2,..., x m ), z 1,z 2,...,z k ) can be decided in polynomial time. If Q is estimable, then its estimand can be derived in polynomial time. The algorithm is complete. ame for ETT (hpitser 2008). THE TO POPULATION PROBLEM HAT CAN EPERIMENT IN LA TELL U ABOUT NC? (Age) (LA) *(N) (Intervention) (Outcome) Experimental study in LA Measured: Needed: Transport Formula: P(x, y,z) P(y do(x), z) Observational study in NC Measured: P*(x, y,z) P*(z) P(z) Q P*(y do(x))? P(y do(x), z)p*(z) z Q F(P,P do,p*) 8

10 TRANPORT FORMULA DEPEND ON THE CAUAL TOR Lesson: Not every dissimilarity deserves re-weighting. (a) a) represents age (b) z b) represents language skill? c) represents a bio-marker? (c) P *(y do(x)) P(y do(x), z)p *(z) P *(y do(x)) P(y do(x)) P *(y do(x)) P(y do(x), z)p *(z x) TRANPORT FORMULA DEPEND ON THE CAUAL TOR Lesson: There are limits to -ignorability. (a) (b) (c) a) represents age P *(y do(x)) P(y do(x), z)p *(z) z b) represents language skill P *(y do(x)) P(y do(x)) c) represents a bio-marker P *(y do(x)) P(y do(x), z)p *(z x) 9

11 TRANPORTABILIT REDUCED TO CALCULU Theorem A causal relation R is transportable from to * if and only if it is reducible, using the rules of do-calculus, to an expression in which is separated from do( ). Query R * P*(y do(x)) P(y do(x),s) P(y do(x), s,w)p(w do(x), s) w Estimand P(y do(x),w)p(w s) w P(y do(x),w)p*(w) w 19 REULT: ALGORITHM TO DETERMINE IF AN EFFECT I TRANPORTABLE U T ' V INPUT: Annotated Causal Graph Factors creating differences OUTPUT: 1. Transportable or not? 2. Measurements to be taken in the experimental study 3. Measurements to be taken in the target population 4. A transport formula 5. Completeness (Bareinboim, 2012) P*(y do(x)) P(y do(x), z) P *(z w) P(w do(w),t)p *(t) z w t 20 10

12 HICH MODEL LICENE THE TRANPORT OF THE CAUAL EFFECT External factors creating disparities es No es (a) (b) (c) es es No (d) (e) (f) FROM TO TO MULTIPLE ENVIRONMENT Target population * (a) (b) R = P*(y do(x)) (c) (d) (e) (f) (g) (h) (i) 11

13 UMMAR OF TRANPORTABILIT REULT Nonparametric transportability of experimental results from multiple environments can be determined provided that commonalities and differences are encoded in selection diagrams. hen transportability is feasible, the transport formula can be derived in polynomial time. The algorithm is complete. TRANPORTABILIT VERU AMPLING ELECTION BIA Transportability (Beach proximity) (Age) election Bias (Age) (ads) (Outcome) = disparity-producing factors P *(y do(x)) P(y do(x), 1) P(y do(x), z)p(z 1) z (ads) (Outcome) (Army service) = sampling mechanism P *(y do(x)) P(y do(x)) P(y do(x), z, 1) P(z) z 12

14 THE INFERENCE ENGINE Query (The thing I want estimated) Calculus Model (My perception of reality) Design (Type of data collected and how) Goal olution (Estimand) A problem is solved if the model permits the calculus to transform the Query into a format deliverable by the design. BAB INFERENCE ENGINE Query P(, ) (The thing I want estimated) Data {P(), P()} Calculus Model (My perception of reality) Design (Type of data collected and how) Goal olution (Estimand) License: P(, ) P( )P( ) P()P( ) A problem is solved if the model permits the calculus to transform the Query into a format deliverable by the design. 13

15 JUVENILE INFERENCE ENGINE Query P(y do(x)) (The thing I want estimated) Data P(, ) Calculus Model (My perception of reality) Design (Type of data collected and how) Goal License: No back-door olution (Estimand) P( ) A problem is solved if the model permits the calculus to transform the Query into a format deliverable by the design. THE TRANPORT ENGINE Query P*(y do(x)) = P(y do(x), = s) (The thing I want estimated) Data P(y do(x), ) P(,, = s) Design (Type of data collected and how) Calculus Goal Model (My perception of reality) P(y do(x),z)p(z x,s) z olution (Estimand) = s A problem is solved if the model permits the calculus to transform the Query into a format deliverable by the design. 14

16 THE ELECTION-BIA ENGINE Data P(y do(x),, = 1) P(,, ) Design (Type of data collected and how) Query P*(y do(x)) = P(y do(x)) (The thing I want estimated) Calculus Goal Model (My perception of reality) P(y do(x), z, 1)P(z x) z olution (Estimand) A problem is solved if the model permits the calculus to transform the Query into a format deliverable by the design. FIVE LEON FROM THE THEATRE OF CAUAL INFERENCE 1. Every causal inference task must rely on judgmental, extra-data assumptions. 2. e have ways of encoding those assumptions mathematically. 3. e have a mathematical machinery to take those assumptions, combine them with data and derive answers to questions of interest. 4. e have a way of doing (2) and (3) in a language that permits us to judge the scientific plausibility of our assumptions and to derive their ramifications swiftly and transparently. 5. Items (2)-(4) make causal inference easy, fun, and extremely productive. 15

17 CONCLUION More has been learned about causal inference in the last few decades than the sum total of everything that had been learned about it in all prior recorded history. (Gary King, Harvard, 2014) More will be learned about causal inference in the next decade than most of us imagine today. Thank you 16

18 More:

THE MATHEMATICS OF CAUSAL INFERENCE With reflections on machine learning and the logic of science. Judea Pearl UCLA

THE MATHEMATICS OF CAUSAL INFERENCE With reflections on machine learning and the logic of science Judea Pearl UCLA OUTLINE 1. The causal revolution from statistics to counterfactuals from Babylon to Athens