MINIMAL SUFFICIENT CAUSATION AND DIRECTED ACYCLIC GRAPHS 1. By Tyler J. VanderWeele and James M. Robins. University of Chicago and Harvard University
Summary. Notions of minimal sufficient causation are incorporated within the directed acyclic graph causal framework. Doing so allows for the graphical representation of sufficient causes and minimal sufficient causes on causal directed acyclic graphs whilst maintaining all of the properties of causal directed acyclic graphs. This in turn provides a clear theoretical link between two major conceptualizations of causality: one counterfactual-based and the other based on a more mechanistic understanding of causation. The d-separation criteria can be used to detect conditional independencies within particular strata of the conditioning variable which are not evident without the minimal sufficient causation structures. These minimal sufficient cause representations are further used to derive conditions that imply the existence of monotonic effects and to derive rules governing minimal sufficient causation and the signs of the conditional covariances amongst variables.

Footnote 1. Abbreviated title: Minimal Sufficient Causation. Tyler VanderWeele was supported by a predoctoral fellowship from the Howard Hughes Medical Institute. AMS 2000 subject classifications. Primary 62A01, 62M45; secondary 62G99, 68T30, 68R10, 05C20. Key words and phrases. Causal inference; conditional independence; directed acyclic graphs; graphical models; interactions; sufficient causation; synergism.

1. Introduction. Two broad conceptualizations of causality can be discerned in the literature, both within philosophy and within statistics and epidemiology. The first conceptualization may be characterized as giving an account of the effects
of certain causes; the approach addresses the question, "Given a particular cause or intervention, what are its effects?" In the contemporary philosophical literature this approach is most closely associated with Lewis's work [18, 19] on counterfactuals. In the contemporary statistics literature, this first approach is closely associated with the work of Rubin [33, 34] on potential outcomes; of Robins [27, 28] on the use of counterfactual variables in the context of time-varying treatment; and of Pearl [23] on the graphical representation of various counterfactual relations on directed acyclic graphs. This counterfactual approach has been used extensively in statistics both in the development of theory and in application. The second conceptualization of causality may be characterized as giving an account of the causes of particular effects; this approach attempts to address the question, "Given a particular effect, what are the various events which might have been its cause?" In the contemporary philosophical literature this second approach is most notably associated with Mackie's work [20] on insufficient but necessary components of unnecessary but sufficient conditions (INUS conditions) for an effect. In the epidemiologic literature this approach is most closely associated with Rothman's work [32] on sufficient-component causes. The work is more closely related to the various mechanisms for a particular effect than is the counterfactual approach. However, with perhaps only one notable major exception in the statistics literature [1, comments relating Aickin's work to the present work are available from the authors upon request], Rothman's work on sufficient-component causes has not been developed, extended or applied, though the basic framework is routinely taught in introductory epidemiology courses. In this paper we incorporate notions of minimal sufficient causes, corresponding to Rothman's sufficient-component causes, within the directed acyclic graph causal
framework [23]. Doing so essentially unites the mechanistic and the counterfactual approaches into a single framework. By incorporating minimal sufficient causation into the directed acyclic graph framework it is possible to graphically represent sufficient causes and minimal sufficient causes on a causal directed acyclic graph whilst maintaining all of the properties of a causal directed acyclic graph. Various extensions to the directed acyclic graph causal framework follow concerning conditional independence, monotonic effects and conditional covariance.

Incorporating minimal sufficient causes into the directed acyclic graph causal framework essentially gives rise to the graphical representation of AND and OR nodes on the directed acyclic graph corresponding to what will be defined below as individual sufficient conjunctions and determinative sets of sufficient conjunctions. These AND and OR nodes could potentially be incorporated into more general graphical models such as summary graphs [3], MC-graphs [12], chain graph models [16, 5, 37, 17, 2, 15, 25, 41] and ancestral graph models [26]. However, directed acyclic graphs have proven particularly useful in representing causal relationships since the directed and acyclic nature of these graphs assures that causes precede effects and that a variable cannot be its own cause. In fact, with very few exceptions [15], the use of graphical models in the field of causal inference has been restricted to these directed acyclic graphs, which include graphs allowing for bidirected edges that represent unobserved common causes. The directed acyclic graph framework, as formulated by Pearl [23] in terms of non-parametric structural equations, will therefore be the focus in this paper of the graphical representation of minimal sufficient causation. Note that Pearl's non-parametric structural equation theory is deterministic, rather than stochastic, at the individual level.
It follows then that our theory, as a refinement of Pearl's, will also be deterministic at the
individual level. In what follows we will provide rigorous definitions for the concepts of a sufficient cause and minimal sufficient causes within the directed acyclic graph framework. It will be seen below that corresponding to these mathematical definitions are informal philosophical notions such as those of a causal mechanism and of synergism. It is the philosophical ideas that provide some of the motivation for the development of the mathematical and statistical theory presented in this paper and, as such, these philosophical issues receive some attention in certain examples and also in the discussion of the various results presented. However, the reader uninterested in the philosophy can ignore this material as none of the definitions, propositions, lemmas, theorems or corollaries make reference to these more informal philosophical notions.

The theory developed in this paper is motivated by several other considerations. It is now standard practice to use graphs to represent and characterize conditional independence relationships amongst variables [13]. Various criteria have been developed to identify these conditional independence relations. The incorporation of minimal sufficient cause nodes allows for these criteria to be applied in order to detect certain conditional independencies within particular strata of the conditioning variable which were not evident without the minimal sufficient causation structures. These "asymmetric conditional independencies" have been represented elsewhere using Bayesian multinets [6]. Another motivation for the development of the theory in this paper concerns the notion of interaction. Product terms are frequently included in regression models to assess interactions amongst variables; these statistical interactions, however, even if present, need not imply the existence of an actual mechanism in which two distinct causes both participate. Interactions which do concern the actual mechanisms are sometimes referred to as "synergism" [32],
"biologic interactions" [35] or "conjunctive causes" [21] and the development of minimal sufficient cause theory provides a useful framework to characterize mechanistic interactions. Incorporating minimal sufficient cause nodes into the directed acyclic framework also allows in certain cases for the determination of the sign of the conditional covariance of various nodes on the graph.

As yet further motivation, we conclude this introduction by describing how the methods we develop in this paper clarified and helped resolve an analytic puzzle faced by psychiatric epidemiologists. Consider the following somewhat simplified version of a study reported in Hudson et al. [10]. Three hundred pairs of obese siblings living in an ethnically homogeneous upper middle class suburb of Boston are recruited and cross classified by the presence or absence of two psychiatric disorders: manic-depressive disorder $P$ and binge eating disorder $B$. The question of scientific interest is whether these two disorders have a common genetic cause, because, if so, studies to search for a gene or genes that cause both disorders would be useful. Consider two analyses. The first analysis estimates the covariance between $P_{2i}$ and $B_{1i}$, while the second analysis estimates the conditional covariance between $P_{2i}$ and $B_{1i}$ among subjects with $P_{1i} = 1$, where $B_{ki}$ is 1 if the $k$th sibling in the $i$th family has disorder $B$ and is zero otherwise, with $P_{ki}$ defined analogously. It was found that the estimates of both the covariance and the conditional covariance were positive with 95% confidence intervals that excluded zero. Hudson et al.'s substantive prior knowledge is summarized in the directed acyclic graph of Figure 1 in which the $i$ index denoting family is suppressed. In what follows we will make reference to some standard results concerning directed acyclic graphs; these results are reviewed in detail in the following section.
[Figure 1 not reproduced; nodes: $G_B$, $G_P$, $E_1$, $E_2$, $F$, $P_1$, $P_2$, $B_1$, $B_2$.]

Figure 1. Causal directed acyclic graph under the alternative hypothesis of familial coaggregation.

In Figure 1, $G_B$ and $G_P$ represent the genetic causes of $B$ and $P$ respectively that are not common causes of both $B$ and $P$. The variables $E_1$ and $E_2$ represent the environmental exposures of siblings 1 and 2 respectively that are common causes of both diseases, perhaps such as exposure to a particularly stressful teacher. The variables $G_B$ and $G_P$ are assumed independent as would typically be the case if, as is highly likely, they are not genetically linked. Furthermore, as is common in genetic epidemiology, the environmental exposures $E_1$ and $E_2$ are assumed independent of the genetic factors. The causal arrows from $P_1$ to $B_1$ and $P_2$ to $B_2$ represent the investigators' beliefs that manic-depressive disorder may be a cause of binge eating disorder but not vice-versa. The node $F$ represents the common genetic causes of both $P$ and $B$ as well as any environmental causes of both $P$ and $B$ that are correlated within families. There are no data available for $G_B$, $G_P$, $E_1$, $E_2$ or $F$. The reason for grouping the common genetic causes with the correlated environmental causes is that based on the available data $\{P_{ki}, B_{ki};\ i = 1, \ldots, 300;\ k = 1, 2\}$, we can only hope to test the null hypothesis that $F$ so defined is absent, which is referred to as the hypothesis of no familial coaggregation. If this null hypothesis is rejected, we cannot determine from the available data whether $F$ is present due to a common genetic cause or a correlated common environmental cause. Thus $E_1$ and $E_2$ are
independent on the graph because, by definition, they represent the environmental common causes of $B$ and $P$ that are independently distributed between siblings. Now, under the null hypothesis that $F$ is absent, we note that $P_2$ and $B_1$ are still correlated due to the unblocked path $P_2 \leftarrow G_P \rightarrow P_1 \rightarrow B_1$, so we would expect the covariance to be nonzero, as found. Furthermore $P_2$ and $B_1$ are still expected to be correlated given $P_1 = 1$ due to the unblocked path $P_2 \leftarrow G_P \rightarrow P_1 \leftarrow E_1 \rightarrow B_1$, so we would expect the conditional covariance to be nonzero as well, as found. Thus we cannot test the null hypothesis that $F$ is absent without further substantive assumptions beyond those encoded in the causal directed acyclic graph of Figure 1.

Now Hudson et al. were also willing to assume that for no subset of the population did the genetic causes $G_P$ and $G_B$ of $P$ and $B$ prevent disease. Similarly they assumed there was no subset of the population for whom the environmental causes $E_1$ and $E_2$ of $B$ and $P$ prevented either disease. We will show in Section 5 that under these additional assumptions, the null hypothesis that $F$ is absent implies that the conditional covariance must be less than or equal to zero, provided that there is no interaction, in the sufficient cause sense, between $E$ and $G_P$. Hudson et al. thought it plausible that no sufficient cause interaction between $E$ and $G_P$ existed and thus rejected the null hypothesis that $F$ is absent because the estimate of the conditional covariance was positive with a 95% confidence interval that did not include zero. Thus the conclusion of Hudson et al. that familial aggregation of diseases $B$ and $P$ was present depended critically on the existence of (i) a formal definition of a sufficient cause interaction, (ii) a substantive understanding of what the assumption of no sufficient cause interaction entailed, and (iii) a sound mathematical theory that related assumptions about the absence of sufficient cause interactions to testable restrictions on the distribution of the observed data. In this paper we provide a
theory that offers (i)-(iii).

The remainder of the paper is organized as follows. The second section reviews the directed acyclic graph causal framework and provides some basic definitions; the third section presents the theory which allows for the graphical representation of minimal sufficient causes within the directed acyclic graph causal framework; the fourth section describes certain equivalences between minimal sufficient causation and the notion of a monotonic effect; the fifth section considers the relation between minimal sufficient causation and the sign of conditional covariances; the sixth section provides some discussion concerning possible extensions to the present work.

2. Basic Definitions and Concepts. In this section we review the directed acyclic graph causal framework and give a number of definitions regarding sufficient conjunctions and related concepts. Following Pearl [23], a causal directed acyclic graph is a set of nodes $(X_1, \ldots, X_n)$ corresponding to variables and directed edges amongst nodes such that the graph has no cycles and such that for each node $X_i$ on the graph the corresponding variable is given by its non-parametric structural equation $X_i = f_i(pa_i, \epsilon_i)$ where $pa_i$ are the parents of $X_i$ on the graph and the $\epsilon_i$ are mutually independent. These non-parametric structural equations can be seen as a generalization of the path analysis and linear structural equation models [23, 24] developed by Wright [42] in the genetics literature and Haavelmo [9] in the econometrics literature. Robins [29, 30] discusses the close relationship between these non-parametric structural equation models and fully randomized causally interpreted structured tree graphs [27, 28]. Spirtes et al. [36] present a causal interpretation of directed acyclic graphs outside the context of non-parametric structural equations and counterfactual variables.
The non-parametric structural equations encode counterfactual relationships amongst the variables represented on the graph.
The equations themselves represent one-step ahead counterfactuals with other counterfactuals given by recursive substitution. A node $E$ will be a parent of $D$ if there is some level of all variables that precede $D$ such that intervening to set $E$ to different levels will allow $D$ to vary even after intervening to fix all other variables that precede $D$. If there exists some level of $A$ such that intervening to set $C$ to different levels will allow $B$ to vary even after fixing $A$ and there exists some level of $B$ such that intervening to set $C$ to different levels will allow $A$ to vary even after fixing $B$ then $C$ is said to be a common cause of $A$ and $B$. The requirement that the $\epsilon_i$ be mutually independent is essentially a requirement that there is no variable absent from the graph which, if included on the graph, would be a parent of two or more variables [23, 24].

A path is a sequence of nodes connected by edges regardless of arrowhead direction; a directed path is a path which follows the edges in the direction indicated by the graph's arrows; a collider is a particular node on a path such that both the preceding and subsequent nodes on the path have directed edges going into that node, i.e. both the edge to and the edge from that node have arrowheads into the node. A path between $A$ and $B$ is said to be blocked given some set of variables $Z$ if either there is a variable in $Z$ on the path that is not a collider or if there is a collider on the path such that neither the collider itself nor any of its descendants are in $Z$. If all paths between $A$ and $B$ are blocked given $Z$ then $A$ and $B$ are said to be d-separated given $Z$. It has been shown that if all paths between $A$ and $B$ are blocked given $Z$ then $A$ and $B$ are conditionally independent given $Z$ [40, 7, 14]. The directed acyclic graph causal framework has proven to be particularly useful in determining whether conditioning on a given set of variables, or none at all, is sufficient to control for confounding.
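The d-separation criterion described above can be checked mechanically. The following is a minimal sketch only, not the authors' method: instead of enumerating paths it uses the moralization criterion (restrict to the ancestors of the variables involved, connect parents that share a child, drop arrowheads, and test ordinary graph separation), which is known to be equivalent to the path-blocking definition. The function names, the node labels and the edge list `FIGURE_1_NULL` are assumptions for illustration; the edge list is one reading of the textual description of Figure 1 with the node $F$ removed.

```python
from collections import deque


def ancestral_set(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(edges, a, b, given=()):
    """True if a and b are d-separated given `given` in the DAG `edges`.

    `edges` is an iterable of (parent, child) pairs; a and b must not
    belong to `given`.
    """
    given = set(given)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    keep = ancestral_set(parents, {a, b} | given)
    # Moralize the ancestral subgraph: undirected parent-child links,
    # plus a link between every pair of parents sharing a child.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = sorted(parents.get(child, ()))
        for i, p in enumerate(ps):
            adj[p].add(child)
            adj[child].add(p)
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # Breadth-first search from a that never passes through `given`.
    seen, queue = {a}, deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return False
        for nxt in adj[node]:
            if nxt not in seen and nxt not in given:
                seen.add(nxt)
                queue.append(nxt)
    return True


# Assumed reading of Figure 1 under the null hypothesis (F absent).
FIGURE_1_NULL = [
    ("GB", "B1"), ("GB", "B2"), ("GP", "P1"), ("GP", "P2"),
    ("E1", "P1"), ("E1", "B1"), ("E2", "P2"), ("E2", "B2"),
    ("P1", "B1"), ("P2", "B2"),
]

print(d_separated(FIGURE_1_NULL, "P2", "B1"))          # False
print(d_separated(FIGURE_1_NULL, "P2", "B1", {"P1"}))  # False
print(d_separated(FIGURE_1_NULL, "E1", "E2"))          # True
```

Under this reading of the graph, $P_2$ and $B_1$ are d-connected both marginally and given $P_1$, in agreement with the discussion of Figure 1 in the introduction.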
Let $D_{E=e}$ denote the counterfactual value of $D$ intervening to set
$E = e$. Pearl [23] showed that for intervention variable $E$ and outcome $D$, if a set of variables $Z$ such that no variable in $Z$ is a descendant of $E$ blocks all "back-door paths" from $E$ to $D$ (i.e. all paths with directed edges into $E$) then conditioning on $Z$ suffices to control for confounding for the estimation of the causal effect of $E$ on $D$ and this causal effect is then given by $E(D_{E=e}) = \sum_z E(D \mid E = e, Z = z)\,\mathrm{pr}(Z = z)$. Note that this is a graphical generalization of Theorem 4 of Rosenbaum and Rubin [31] and of the g-formula [27, 28, 36, 22].

In giving definitions for sufficient conjunctions and related concepts, we will use the following notation. An event is a binary variable taking values in $\{0, 1\}$. The complement of some event $E$ we will denote by $\overline{E}$. A conjunction or product of the events $X_1, \ldots, X_n$ will be written as $X_1 \cdots X_n$. The associative OR operator, $\vee$, is defined by $A \vee B = A + B - AB$. For a random variable $A$ with sample space $\Omega$ we will use the notation $A \equiv 0$ to denote that $A(\omega) = 0$ for all $\omega \in \Omega$. We will use the notation $1_{A=a}$ to denote the indicator function for the random variable $A$ taking the value $a$; for some subset $S$ of the sample space $\Omega$ we will use $1_S$ to denote the indicator that $\omega \in S$. We will use the notation $A \perp\!\!\!\perp B \mid C$ to denote that $A$ is conditionally independent of $B$ given $C$. We begin with the definitions of a sufficient conjunction and a minimal sufficient conjunction. These basic definitions make no reference to directed acyclic graphs or causation.

Definition 1. A set of events $X_1, \ldots, X_n$ is said to constitute a sufficient conjunction for event $D$ if $X_1 \cdots X_n = 1 \Rightarrow D = 1$.

Definition 2. A set of events $X_1, \ldots, X_n$ is said to constitute a minimal sufficient conjunction for an event $D$ if $X_1 \cdots X_n = 1 \Rightarrow D = 1$ and there is no proper subset $X_{i_1}, \ldots, X_{i_k}$ of $X_1, \ldots, X_n$ such that $X_{i_1} \cdots X_{i_k} = 1 \Rightarrow D = 1$.
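Definitions 1 and 2 can be illustrated by brute force over a finite truth table. The sketch below rests on a simplifying assumption that is not part of the definitions: the sample space is taken to be every 0/1 assignment to a fixed list of named events, and $D$ is a deterministic function of those events. The function names and the event $D = X_1X_2 \vee X_3$ are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def is_sufficient(conjunction, d, names):
    """Definition 1: whenever every event in `conjunction` is 1, D is 1."""
    for values in product((0, 1), repeat=len(names)):
        world = dict(zip(names, values))
        if all(world[x] == 1 for x in conjunction) and d(world) != 1:
            return False
    return True


def is_minimal_sufficient(conjunction, d, names):
    """Definition 2: sufficient, and no non-empty proper subset is."""
    if not is_sufficient(conjunction, d, names):
        return False
    for k in range(1, len(conjunction)):
        for subset in combinations(conjunction, k):
            if is_sufficient(subset, d, names):
                return False
    return True


NAMES = ("X1", "X2", "X3")


def d_example(world):
    # Hypothetical event D = X1 X2 v X3, using A v B = A + B - AB.
    a = world["X1"] * world["X2"]
    b = world["X3"]
    return a + b - a * b


print(is_sufficient(("X1", "X2"), d_example, NAMES))                # True
print(is_minimal_sufficient(("X1", "X2"), d_example, NAMES))        # True
print(is_minimal_sufficient(("X1", "X2", "X3"), d_example, NAMES))  # False
print(is_sufficient(("X1",), d_example, NAMES))                     # False
```

The third call returns False because the proper subset consisting of $X_3$ alone is already sufficient for the hypothetical $D$.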
Sufficient conjunctions for a particular event need not be causes for an event. Suppose a particular sound is produced when and only when an individual blows a whistle. This particular sound the whistle makes is a sufficient conjunction for the whistle's having been blown but the sound does not cause the blowing of the whistle. The converse rather is true: the blowing of the whistle causes the sound to be produced. Corresponding then to these notions of a sufficient conjunction and a minimal sufficient conjunction are those of a sufficient cause and a minimal sufficient cause which will be defined in Section 3.

Definition 3. A set of events $M_1, \ldots, M_n$, each of which may be some product of events, is said to be determinative for some event $D$ if $D = M_1 \vee M_2 \vee \cdots \vee M_n$.

Definition 4. If $M_1, \ldots, M_n$ is a determinative set of (minimal) sufficient conjunctions for $D$ such that there is no proper subset $M_{i_1}, \ldots, M_{i_k}$ of $M_1, \ldots, M_n$ that is also a determinative set of (minimal) sufficient conjunctions for $D$ then $M_1, \ldots, M_n$ is said to constitute a non-redundant determinative set of (minimal) sufficient conjunctions for $D$.

Example 1. Suppose $A = B \vee CE$ and $D = EF$. If we consider all the minimal sufficient conjunctions for $A$ among the events $\{B, C, D\}$ we can see that $B$ and $CD$ are the only minimal sufficient conjunctions but it is not the case that $A = B \vee CD$. Clearly then a complete list of minimal sufficient conjunctions for $A$ generated by a particular collection of events may not be a determinative set of sufficient conjunctions for $A$. If we consider all minimal sufficient conjunctions for $A$ among the events $\{B, C, D, E\}$ we see that $B$ and $CD$ and $CE$ are all minimal sufficient conjunctions. In this example, $B \vee CD \vee CE$ is a determinative set of minimal sufficient conjunctions for $A$ but is not non-redundant. We see then
that even when a complete list of minimal sufficient conjunctions generated by a particular collection of events constitutes a determinative set of minimal sufficient conjunctions it may not be a non-redundant determinative set of minimal sufficient conjunctions.

3. Minimal Sufficient Causation and Directed Acyclic Graphs. Causal directed acyclic graphs provide a useful framework in which to make use of these ideas of sufficient conjunctions and minimal sufficient conjunctions. With our basic definitions in place we can develop theory concerning minimal sufficient causation by stating and proving a number of results relating sufficient conjunctions to directed acyclic graphs.

Theorem 1. Consider a causal directed acyclic graph $G$ with some node $D$ such that $D$ and all its parents are binary. Suppose that there exists a set of binary variables $A_0, \ldots, A_u$ such that a determinative set of sufficient conjunctions for $D$, say $M_1, \ldots, M_S$, can be formed from conjunctions of $A_0, \ldots, A_u$ along with the parents of $D$ on $G$ and the complements of these variables. Suppose further that there exists a causal directed acyclic graph $H$ such that the parents of $D$ on $H$ that are not on $G$ consist of the nodes $A_0, \ldots, A_u$ and such that $G$ is the marginalization of $H$ over the set of variables which are on the graph for $H$ but not $G$. Then the directed acyclic graph $J$ formed by adding to $H$ the nodes $M_1, \ldots, M_S$, removing the directed edges into $D$ from the parents of $D$ on $H$, adding directed edges from each $M_i$ into $D$ and adding directed edges into each $M_i$ from every parent of $D$ on $H$ which appears in the conjunction for $M_i$ is itself a causal directed acyclic graph.

Proof. To prove that the directed acyclic graph $J$ is a causal directed acyclic graph it is necessary to show that each of the nodes on the directed acyclic graph can
be represented by a non-parametric structural equation involving only the parents on $J$ of that node and a random term $\epsilon_i$ which is independent of all other random terms $\epsilon_j$ in the non-parametric structural equations for the other variables on the graph. The non-parametric structural equation for $M_i$ may be defined as the product of events in the conjunction for $M_i$. The non-parametric structural equation for $D$ can be given by $D = M_1 \vee \cdots \vee M_S$. The non-parametric structural equations for all other nodes on $J$ can be taken to be the same as those defining the causal directed acyclic graph $H$. Because the non-parametric structural equations for $D$ and for each $M_i$ on $J$ are deterministic, they have no random error term. Thus, for the non-parametric structural equations defining $D$ and each $M_i$ on $J$, the requirement that the non-parametric structural equation's random term $\epsilon_i$ is independent of all the other random terms $\epsilon_j$ in the non-parametric structural equations for the other variables on the graph is trivially satisfied. That this requirement is satisfied for the non-parametric structural equations for the other variables on $J$ follows from the fact that it is satisfied on $H$.

In Theorem 1 sufficient conjunctions for $D$ are constructed from some set of variables that, on some causal directed acyclic graph $H$, are all parents of $D$ and thus, within the directed acyclic graph causal framework, it makes sense to speak of sufficient causes and minimal sufficient causes.

Definition 5. If on a causal directed acyclic graph some node $D$ with non-parametric structural equation $D = f_D(pa_D, \epsilon_D)$ is such that $D$ and all its parents are binary then $X_1, \ldots, X_n$ is said to constitute a sufficient cause for $D$ if $X_1, \ldots, X_n$ are all parents of $D$ or complements of the parents of $D$ and are such
that $f_D(pa_D, \epsilon_D) = 1$ for all $\epsilon_D$ whenever $pa_D$ is such that $X_1 \cdots X_n = 1$; if no proper subset of $X_1, \ldots, X_n$ also constitutes a sufficient cause for $D$ then $X_1, \ldots, X_n$ is said to constitute a minimal sufficient cause for $D$. A set of (minimal) sufficient causes, $M_1, \ldots, M_n$, each of which is a product of the parents of $D$ and their complements, is said to be determinative for some event $D$ if for all $\epsilon_D$, $f_D(pa_D, \epsilon_D) = 1$ if and only if $pa_D$ is such that $M_1 \vee M_2 \vee \cdots \vee M_n = 1$; if no proper subset of $M_1, \ldots, M_n$ is also determinative for $D$ then $M_1, \ldots, M_n$ is said to constitute a non-redundant determinative set of (minimal) sufficient causes for $D$.

Definition 6. If for some directed acyclic graph $G$ there exist $A_0, \ldots, A_u$ which satisfy the conditions of Theorem 1 for some node $D$ on $G$ so that a determinative set of sufficient causes for $D$ can be constructed from $A_0, \ldots, A_u$ along with the parents of $D$ on $G$ and their complements then $D$ is said to admit a sufficient causation structure.

In the examples below we will use the following notation. First, we will in general replace the $M_i$ nodes with the conjunctions that constitute them. Second, the directed edges from the $A_i$ nodes and the parents of $D$ into the $M_i$ nodes and from the $M_i$ nodes into $D$ represent deterministic dependencies. The node $D$ with directed edges from the $M_i$ nodes is effectively an OR node. The $M_i$ nodes with the directed edges from the $A_i$ nodes and the parents of $D$ on $G$ are effectively AND nodes. To indicate these deterministic dependencies we add to the diagram an ellipse around the $M_i$ nodes. We call this resulting diagram a causal directed acyclic graph with a sufficient causation structure (or a minimal sufficient causation structure if the determinative set of sufficient conjunctions for $D$ are each minimal sufficient conjunctions). If a determinative set of sufficient causes for $D$ can be constructed simply from
the parents of $D$ on $G$ then $H$ can be taken to be $G$. If a set of variables $A_0, \ldots, A_u$ satisfying Theorem 1 can be constructed from functions of the random term $U = \epsilon_D^G$ of the non-parametric structural equation for $D$ on $G$ and their complements so that $A_i = f_i(U)$ then $H$ can be chosen to be the graph $G$ with the additional nodes $U, A_0, \ldots, A_u$ and with directed edges from $U$ into each $A_i$ and from each $A_i$ into $D$. Our first several examples will be cases in which no additional nodes $A_0, \ldots, A_u$ are needed to form a determinative set of sufficient causes for $D$ but rather in which a determinative set of sufficient causes can be formed just from the parents of $D$ on the original causal directed acyclic graph $G$.

Example 2. Consider a causal directed acyclic graph given in Figure 2(i) and suppose $E_1E_2$ and $E_3E_4$ constituted a determinative set of sufficient causes for $D$. Then by Theorem 1, the graph in Figure 2(ii) is also a causal directed acyclic graph. Similarly if Figure 2(iii) represents a causal directed acyclic graph and if $E_1E_2$ and $\overline{E_2}E_3$ constitute a determinative set of sufficient causes for $D$ then by Theorem 1, the graph in Figure 2(iv) is also a causal directed acyclic graph. Note that in Figure 2(iv) there are directed edges from $E_2$ into those sufficient cause nodes involving $E_2$ and into those involving $\overline{E_2}$.

[Figure 2, panels (i)-(iv), not reproduced.]

Fig. 2. Causal directed acyclic graphs with sufficient causation structures.
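The edge surgery that Theorem 1 applies to the graph $H$ can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' construction in full: graphs are represented as sets of (parent, child) pairs, the function name and labels are hypothetical, and the code checks neither that $H$ is a causal directed acyclic graph nor that the supplied conjunctions are determinative for $D$. The edge set `FIGURE_2_I` assumes that Figure 2(i) contains only the edges from $E_1, \ldots, E_4$ into $D$, since the figure itself is not available here.

```python
def add_sufficient_cause_nodes(edges, d, conjunctions):
    """Return the edge set of J given the edge set of H.

    `conjunctions` maps the label of each sufficient cause node M_i to
    the parents of `d` whose value or complement appears in M_i.
    """
    parents_of_d = {u for u, v in edges if v == d}
    # Remove the directed edges into d from its parents on H.
    expanded = {(u, v) for u, v in edges if v != d}
    for label, members in conjunctions.items():
        missing = set(members) - parents_of_d
        if missing:
            raise ValueError(f"not parents of {d}: {sorted(missing)}")
        # Add an edge from each M_i into d (d acts as an OR node) ...
        expanded.add((label, d))
        # ... and from each constituent parent into M_i (an AND node).
        for parent in members:
            expanded.add((parent, label))
    return expanded


# Assumed edge set for Figure 2(i).
FIGURE_2_I = {("E1", "D"), ("E2", "D"), ("E3", "D"), ("E4", "D")}

FIGURE_2_II = add_sufficient_cause_nodes(
    FIGURE_2_I, "D", {"E1E2": ("E1", "E2"), "E3E4": ("E3", "E4")}
)
for edge in sorted(FIGURE_2_II):
    print(edge)
```

A complemented parent such as $\overline{E_2}$ is passed as the parent itself, because the directed edge into the sufficient cause node comes from $E_2$ whether the conjunction involves $E_2$ or its complement.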
Theorem 1 provides a link between all four of the causal model frameworks discussed by Greenland and Brumback [8]: graphical models, potential outcome (counterfactual) models, sufficient-component cause models and structural equation models. The four are linked through non-parametric structural equations. Graphical models as developed by Pearl [23] are diagrammatic shorthand for non-parametric structural equations. Non-parametric structural equations can be interpreted as sets of counterfactual relations. Theorem 1 provides the final link by relating sufficient-component cause models to non-parametric structural equations and thereby also graphical models. Non-parametric structural equations may thus be seen as a framework encompassing all four of these approaches to representing causal relations.

Because a causal directed acyclic graph with a sufficient causation structure is itself a causal directed acyclic graph, the d-separation criterion applies and allows one to determine independencies and conditional independencies. A minimal sufficient causation structure will often make apparent conditional independencies within strata which were not apparent on the original causal directed acyclic graph. Two corollaries to Theorem 1 are particularly useful in this regard.

Corollary 1. If some node $D$ on a causal directed acyclic graph admits a sufficient causation structure then conditioning on $D = 0$ conditions also on all sufficient cause nodes for $D$ on the causal directed acyclic graph with the sufficient causation structure.

Corollary 2. Suppose that all the parents of some node $D$ on a causal directed acyclic graph are binary and independent and that $D$ admits a sufficient causation structure, then the parents of $D$ on the causal directed acyclic graph with the sufficient causation structure can be broken into equivalence classes where two
elements share an equivalence class if on the causal directed acyclic graph with the sufficient causation structure there exists a path between them involving only the set of parents of $D$ and the sufficient cause nodes. Any two causes not in the same equivalence class are conditionally independent given $D = 0$.

Example 2 (continued). Consider the causal directed acyclic graph with the minimal sufficient causation structure given in Figure 2(ii). Conditioning on $D = 0$ also conditions on $E_1E_2 = 0$ and $E_3E_4 = 0$ and thus by the d-separation criteria $E_i$ is conditionally independent of $E_j$ given $D = 0$ for $i \in \{1, 2\}$, $j \in \{3, 4\}$. In the causal directed acyclic graph with the minimal sufficient causation structure in Figure 2(iv) no similar conditional independence relations within the $D = 0$ stratum hold. Although conditioning on $D = 0$ conditions also on $E_1E_2 = 0$ and $\overline{E_2}E_3 = 0$ there still remains an unblocked path $E_1 \rightarrow E_1E_2 \leftarrow E_2 \rightarrow \overline{E_2}E_3 \leftarrow E_3$ between $E_1$ and $E_3$ and so $E_1$ and $E_3$ are not conditionally independent given $D = 0$ and similarly there are unblocked paths between $E_1$ and $E_2$ and also between $E_2$ and $E_3$ given $D = 0$.

The additional nodes $A_0, \ldots, A_u$ required to form a determinative set of sufficient conjunctions for $D$ will generally not be unique. For example, if $D = A_0 \vee A_1E$ then it is also the case that $D = B_0 \vee B_1E$ where $B_0 = A_0$ and $B_1 = \overline{A_0}A_1$. Similarly, there will in general be no unique set of sufficient causes that is determinative for $D$. For example if $E_1$ and $E_2$ constitute a set of sufficient causes for $D$ so that $D = E_1 \vee E_2$ then it is also the case that $E_1\overline{E_2}$, $\overline{E_1}E_2$, and $E_1E_2$ constitute a set of sufficient causes for $D$ and so we could also write $D = E_1\overline{E_2} \vee \overline{E_1}E_2 \vee E_1E_2$. It can be shown that not even non-redundant determinative sets of minimal sufficient causes are unique.

Corresponding to the definition of a sufficient cause is the more philosophical
notion of a causal mechanism. A causal mechanism can be conceived of as a set of events or conditions which, if all present, inevitably bring about the outcome under consideration in a particular manner. A causal mechanism thus provides a particular description of how the outcome comes about. Suppose for instance that an individual were exposed to two poisons, $E_1$ and $E_2$, such that in the absence of $E_2$, the poison $E_1$ would lead to heart failure resulting in death; and that in the absence of $E_1$, the poison $E_2$ would lead to respiratory failure resulting in death; but such that when $E_1$ and $E_2$ are both present, they interact and lead to a failure of the nervous system again resulting in death. In this case there are three distinct causal mechanisms for death each corresponding to a sufficient cause for $D$: death by heart failure corresponding to $E_1\overline{E_2}$, death by respiratory failure corresponding to $\overline{E_1}E_2$, and death due to a failure of the nervous system corresponding to $E_1E_2$. It is interesting to note that in this case none of the sufficient causes corresponding to the causal mechanisms is minimally sufficient. Each of $E_1\overline{E_2}$, $\overline{E_1}E_2$, and $E_1E_2$ is sufficient for $D$ but none is minimally sufficient as either $E_1$ or $E_2$ alone is sufficient for death.

The last example shows that the existence of a particular set of determinative sufficient causes does not guarantee that there are actual causal mechanisms corresponding to these sufficient causes; it only implies that a set of causal mechanisms corresponding to these sufficient causes cannot be ruled out by a complete knowledge of counterfactual outcomes. In particular, in the previous example, the set $\{E_1, E_2\}$ is a determinative set of sufficient causes that does not correspond to the actual set of causal mechanisms $\{E_1\overline{E_2}, \overline{E_1}E_2, E_1E_2\}$.
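The two-poison example can be tabulated directly. The short sketch below, in which the helper name `or_` and the list `rows` are assumptions introduced for illustration, evaluates both determinative sets with the OR operator $A \vee B = A + B - AB$ and shows that they give the same value of $D$ for every combination of exposures, so that the observable outcomes alone cannot distinguish them.

```python
from itertools import product


def or_(*events):
    """Associative OR of 0/1 events, built from A v B = A + B - AB."""
    out = 0
    for e in events:
        out = out + e - out * e
    return out


rows = []
for e1, e2 in product((0, 1), repeat=2):
    # Three mechanisms: heart failure, respiratory failure, nervous system.
    mechanisms = or_(e1 * (1 - e2), (1 - e1) * e2, e1 * e2)
    # The two minimal sufficient causes E1 and E2.
    minimal = or_(e1, e2)
    rows.append((e1, e2, mechanisms, minimal))

for row in rows:
    print(row)
```

The last two entries of each printed row agree, which is the sense in which the two sets of sufficient causes are equivalent for prediction.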
If there are two or more sets of sufficient causes that are determinative for some outcome $D$ then although the two sets of determinative sufficient causes are logically equivalent for prediction, we
nevertheless view them as distinct. In such cases, some knowledge of the subject matter in question will in general be needed to discern which of the sets of determinative sufficient causes actually corresponds to the true causal mechanisms. For instance, in the previous example, we needed biological knowledge of how poisons brought about death in the various scenarios. We will, in the interpretation of our results, assume that there always exists some set of true causal mechanisms which forms a determinative set of sufficient causes for the outcome.

The concept of synergism is closely related to that of a causal mechanism and is often found in the epidemiologic literature [32, 35, 11]. We will say that there is synergism between the effects of $E_1$ and $E_2$ on $D$ if there exists a sufficient cause for $D$ which represents some causal mechanism and such that this sufficient cause has $E_1$ and $E_2$ in its conjunction. In related work, we have developed tests for synergism, i.e. tests for the joint presence of two or more causes in a single sufficient cause [39]. As noted in the introduction, in some of our examples and in our discussion of the various results in the paper we will sometimes make reference to the concepts of a causal mechanism and synergism. However, all definitions, propositions, lemmas, theorems and corollaries will be given in terms of sufficient causes for which we have a precise definition.

The graphical representation of sufficient causes on a causal directed acyclic graph does not require that the determinative set of sufficient causes for $D$ be minimally sufficient, nor does it require that the set of determinative sufficient causes for $D$ be non-redundant. To expand a directed acyclic graph into another directed acyclic graph with sufficient cause nodes, all that is required is that the set of sufficient causes constitutes a determinative set of sufficient causes for $D$. However, a set of events that constitutes a sufficient cause can be reduced to a set of events that
constitutes a minimal sufficient cause by iteratively excluding unnecessary events from the set until a minimal sufficient cause is obtained. Also a set of determinative sufficient causes that is redundant can be reduced to one that is non-redundant by excluding those sufficient causes or minimal sufficient causes that are redundant. It is sometimes an advantage to reduce a redundant set of sufficient causes to a non-redundant set of minimal sufficient causes. This is so because allowing sufficient causes that are not minimally sufficient or allowing redundant sufficient causes or redundant minimal sufficient causes can obscure the conditional independence relations implied by the structure of the causal directed acyclic graph. This is made evident in Example 3.

Example 3. Consider a causal directed acyclic graph with the minimal sufficient causation structure indicated in Figure 3(i).

[Figure 3: panel (i) shows $A$, $B$, $C$ and $D$ with sufficient cause nodes $AB$ and $C$; panel (ii) shows the same variables with sufficient cause nodes $AB$, $\overline{B}C$ and $BC$.]
Fig. 3. Example illustrating that non-minimal sufficient causes can obscure conditional independence relations.

Conditioning on $D = 0$ conditions also on $AB = 0$ and $C = 0$ and by the d-separation criterion, $A$ is conditionally independent of $C$ given $D = 0$. However, consider a logically equivalent sufficient causation structure for this causal directed acyclic graph which allows sufficient causes that are not minimal sufficient causes as given in Figure 3(ii). Here $BC$ and $\overline{B}C$ are sufficient causes but not minimal
sufficient causes. Conditioning on $D = 0$ conditions also on $AB = 0$, $\overline{B}C = 0$ and $BC = 0$ but the d-separation criterion no longer implies that $A$ and $C$ are conditionally independent given $D = 0$ because on the causal directed acyclic graph given in Figure 3(ii), there is an unblocked path between $A$ and $C$ conditioning on $D = 0$, namely $A \to AB \leftarrow B \to BC \leftarrow C$. Thus from the causal directed acyclic graph given in Figure 3(i) it was possible to use the d-separation criterion to identify the conditional independence of $A$ and $C$ given $D = 0$. But from the causal directed acyclic graph given in Figure 3(ii) the d-separation criterion would not identify this conditional independence relation even though the two directed acyclic graphs describe the same causal structure. Allowing sufficient causes that are not minimal sufficient causes obscures the conditional independence relation. Similar examples can be constructed to show that allowing redundant sufficient causes or redundant minimal sufficient causes can also obscure conditional independence relations.

Although allowing sufficient causes that are not minimally sufficient or allowing redundant sufficient causes or redundant minimal sufficient causes can obscure the conditional independence relations implied by the structure of the causal directed acyclic graph, it may sometimes be desirable to include non-minimal sufficient causes or redundant sufficient causes. For example, as noted above, non-minimal sufficient cause nodes or redundant sufficient cause nodes may represent separate causal mechanisms upon which it might be possible to intervene. Further discussion of conditional independence relations in sufficient causation structures with non-minimally sufficient causes and redundant sufficient causes is given in Section 6.

Theorem 1 gives rise to several further definitions presented below.

Definition 7. If some node $D$ on a causal directed acyclic graph $G$ admits a sufficient causation structure then the parents of $D$ are said to be the main causes
of $D$.

Definition 8. The conjunction of main causes and their complements in a particular sufficient cause for $D$ is said to be a principal cause for $D$.

Definition 9. If some node $D$ admits a sufficient causation structure, the additional variables $A_0, \ldots, A_u$ needed to form a set of sufficient causes for $D$ are said to be the co-causes of $D$. If a co-cause appears in a sufficient conjunction without any main cause in its conjunction then it is said to be a residual co-cause (and will generally be denoted by $A_0$). Otherwise the co-cause is said to be a non-residual co-cause.

A couple of additional comments merit attention. First, if the main causes of $D$ are labeled $E_1, \ldots, E_m$ then each sufficient cause $M_j$ must either include the main cause $E_i$ in its conjunction or include $\overline{E_i}$ in its conjunction or include neither $E_i$ nor $\overline{E_i}$ in its conjunction; clearly it cannot include both. There are thus $3^m$ possible combinations of the $E_i$s and their complements that may appear as principal causes. Second, a sufficient cause need only involve one co-cause $A_i$ in its conjunction because if it involved $A_{i_1}, \ldots, A_{i_k}$ then $A_{i_1}, \ldots, A_{i_k}$ could be replaced by the product $A'_i = A_{i_1} \cdots A_{i_k}$. In certain cases though, it may be desirable to include more than one $A_i$ in a sufficient cause if this corresponds to the actual causal mechanisms. However, if only one $A_i$ is included in each sufficient cause, then there will need to be at most $3^m$ sufficient causes each of which will involve in its conjunction only one co-cause $A_i$ and one of the $3^m$ possible combinations of the $E_i$s and $\overline{E_i}$s that may be included.

It was noted above that if a set of variables $A_0, \ldots, A_u$ satisfying Theorem 1 can be constructed from functions of the random term $U = \varepsilon_D$ of the non-parametric
structural equation for $D$ on $G$ and their complements so that $A_i = f_i(U)$ then $H$ can be chosen to be the graph $G$ with the additional nodes $U, A_0, \ldots, A_u$ and with directed edges from $U$ into each $A_i$ and from each $A_i$ into $D$. This gives rise to the definition, given below, of a representation for $D$.

Definition 10. If $D$ and all of its parents on the causal directed acyclic graph $G$ are binary and there exists some set $\{A_i, P_i\}$ such that each $P_i$ is some conjunction of the parents of $D$ and their complements and such that there exist functions $f_i$ for which $A_i = f_i(\varepsilon_D)$ where $\varepsilon_D$ is the random term in the non-parametric structural equation for $D$ on $G$ and such that $D = \bigvee_i A_iP_i$ then $\{A_i, P_i\}$ is said to constitute a representation for $D$.

The non-parametric structural equation for $D$ is given by $D = f(pa_D, \varepsilon_D)$. Suppose $D$ has $m$ parents on the original causal directed acyclic graph $G$. Since these parents are binary there are $2^m$ values which $pa_D$ can take. Since $f$ maps $(pa_D, \varepsilon_D)$ to $\{0, 1\}$ each value of $\varepsilon_D$ assigns to every possible realization of $pa_D$ either 0 or 1 through $f$. There are $2^{2^m}$ such assignments. Thus without loss of generality we may assume that $\varepsilon_D$ takes on some finite number of distinct values $N \le 2^{2^m}$ and so we may write the sample space for $\varepsilon_D$ as $\Omega_D = \{\omega_1, \ldots, \omega_N\}$ and we may use $\omega = \omega_i$ and $\varepsilon_D = \varepsilon_D(\omega_i)$ interchangeably.

If, in a representation for $D$, for some principal cause $P_j$ the corresponding co-cause is such that $A_j \equiv 0$ then we will suppress $A_jP_j$ from the disjunction $\bigvee_i A_iP_i$. In other words, we will use $\bigvee_i A_iP_i$ as shorthand for $\bigvee_{i : A_i \neq 0} A_iP_i$ and we will refer to $A_i$ as a co-cause and to $P_i$ as a principal cause only if $A_i$ is not identically 0. Furthermore, if in a representation for $D$, $\bigvee_i A_iP_i$, a co-cause $A_j$ itself constitutes a sufficient cause, $M_j$, without any main cause in its conjunction then the principal cause for this sufficient cause will be suppressed; we will write $M_j = A_j$. As noted
above, we will typically denote this co-cause with the subscript 0, i.e. $M_j = A_0$. If, on the other hand, a principal cause is such that the principal cause itself, without any co-cause, constitutes a sufficient cause, $M_j$, for $D$ then the co-cause for this sufficient cause will be suppressed; we will write $M_j = P_j$. The sufficient causes in all of the examples we have considered thus far have consisted of principal causes without co-causes; in these examples it was possible to construct a determinative set of sufficient causes for $D$ from the parents of $D$ alone; no co-causes were necessary.

If the $A_i$ variables are constructed from functions of the random term $\varepsilon_D$ in the non-parametric structural equation for $D$ on $G$ then these $A_i$ variables may or may not allow for interpretation and they may or may not be such that an intervention on these $A_i$ variables is conceivable. In certain cases the $A_i$ variables may simply be logical constructs for which no intervention is conceivable. Although in certain cases it may not be possible to intervene on the $A_i$ variables, we will still refer to conjunctions of the form $A_iP_i$ as sufficient causes for $D$ as it is assumed that it is possible to intervene on the parents of $D$ which constitute the conjunction for $P_i$.

Suppose that for some node $D$ on a causal directed acyclic graph $G$, a set of variables $A_0, \ldots, A_u$ satisfying Theorem 1 can be constructed from functions of the random term $U = \varepsilon_D$ in the non-parametric structural equation for $D$ on $G$ so that a representation for $D$ is given by $D = \bigvee_i A_iP_i$. Then, in order to simplify the diagram, instead of adding to $G$ the variable $U$ and directed edges from $U$ into each $A_i$ so as to form the minimal sufficient causation structure, we will sometimes suppress $U$ and simply add an asterisk next to each $A_i$ indicating that the $A_i$ variables have a common cause.

Proposition 1.
For any representation for $D$, the co-causes $A_i$ will be independent of the parents of $D$ on the original directed acyclic graph $G$.
Proof. This follows immediately from the fact that for any representation for $D$, the co-causes are functions of the random term in the non-parametric structural equation for $D$.

The examples considered in Figures 2 and 3 have all been such that it was possible to construct determinative sets of sufficient causes for $D$ from the parents of $D$ on the original directed acyclic graph $G$ and the complements of such parents; no co-causes have been necessary. If it is not possible to construct a determinative set of sufficient causes from the parents of $D$ on $G$ or if some of the sufficient causes for $D$ are unknown then it is not obvious how one might make use of Theorem 1. The theorem allowed for a sufficient causation structure on a causal directed acyclic graph provided there existed some set of co-causes $A_0, \ldots, A_u$. Theorem 2 complements Theorem 1 in that it essentially states that when $D$ and all of its parents are binary such a set of co-causes always exists. The variables $A_0, \ldots, A_u$ are constructed from functions of the random term $\varepsilon_D$ in the non-parametric structural equation for $D$ on $G$.

Theorem 2. Consider a causal directed acyclic graph $G$ on which there exists some node $D$ such that $D$ and all its parents are binary then there exist variables $A_0, \ldots, A_u$ that satisfy the conditions of Theorem 1 and such that the sufficient causes constructed from $A_0, \ldots, A_u$ along with the parents of $D$ on $G$ and their complements are in fact minimal sufficient causes.

Proof. The co-causes $A_0, \ldots, A_u$ can then be constructed as follows. Let $W_i$ be the indicator $1_{\varepsilon_D = \varepsilon_D(\omega_i)}$. Let $P_i$ be some conjunction of the main causes and their complements, i.e. $P_i = F^i_1 \cdots F^i_{n_i}$ where each $F^i_k$ is either a parent of $D$, say $E_j$, or its complement $\overline{E_j}$. For each potential principal cause $P_i$, let $A_i \equiv 1$ if $F^i_1 \cdots F^i_{n_i}$
is a minimal sufficient cause for $D$ and $A_i = \bigvee_j \{W_j : W_jF^i_1 \cdots F^i_{n_i}$ is a minimal sufficient cause for $D\}$ otherwise. Let $M_i = P_i$ if $A_i \equiv 1$ and $M_i = A_iP_i$ otherwise. It must be shown that each $M_i = A_iF^i_1 \cdots F^i_{n_i}$ is a minimal sufficient cause and that the set of $M_i$s constitutes a minimal sufficient cause representation for $D$ (or more precisely, the set of $M_i$s for which $A_i$ is not identically 0 constitutes a minimal sufficient cause representation for $D$).

We first show that each $M_i = A_iF^i_1 \cdots F^i_{n_i}$ is a minimal sufficient cause for $D$. Clearly this is the case if $A_i \equiv 1$. Now consider those $A_i$ such that $A_i$ is not identically 0 and not identically 1 and suppose $A_i = W^i_1 \vee \cdots \vee W^i_{\nu_i}$. If $A_iF^i_1 \cdots F^i_{n_i}$ is not a minimal sufficient cause then either $F^i_1 \cdots F^i_{n_i} = 1 \Rightarrow D = 1$ or there exists $j$ such that $A_iF^i_1 \cdots F^i_{j-1}F^i_{j+1} \cdots F^i_{n_i} = 1 \Rightarrow D = 1$. Suppose first that $F^i_1 \cdots F^i_{n_i} = 1 \Rightarrow D = 1$; then there does not exist a $W_j$ such that $W_jF^i_1 \cdots F^i_{n_i}$ is a minimal sufficient cause for $D$; but this contradicts $A_i$ is not identically 1. On the other hand, if there exists $j$ such that $A_iF^i_1 \cdots F^i_{j-1}F^i_{j+1} \cdots F^i_{n_i} = 1 \Rightarrow D = 1$ then it is also the case that $W^i_1F^i_1 \cdots F^i_{j-1}F^i_{j+1} \cdots F^i_{n_i} = 1 \Rightarrow D = 1$ and so $W^i_1F^i_1 \cdots F^i_{n_i}$ is not a minimal sufficient cause for $D$; but this contradicts $A_i = W^i_1 \vee \cdots \vee W^i_{\nu_i}$. Thus $A_iF^i_1 \cdots F^i_{n_i}$ must be a minimal sufficient cause for $D$.

It remains to be shown that the set of $M_i$s for which $A_i$ is not identically 0 constitutes
a minimal sufficient cause representation for $D$. We must show that if $D = 1$ then there exists an $M_i = A_iP_i$ for which $M_i = 1$. Now $D$ is a function of $(\varepsilon_D, E_1, \ldots, E_m)$ so let $(\varepsilon_D^*, E_1^*, \ldots, E_m^*)$ be any particular value of $(\varepsilon_D, E_1, \ldots, E_m)$ for which $D = 1$. Consider the set $\{E_1, \ldots, E_m\}$. If for any $j$,
$$\varepsilon_D = \varepsilon_D^*,\ E_1 = E_1^*,\ \ldots,\ E_{j-1} = E_{j-1}^*,\ E_{j+1} = E_{j+1}^*,\ \ldots,\ E_m = E_m^* \Rightarrow D = 1$$
remove $E_j$ from $\{E_1, \ldots, E_m\}$. Continue to remove those $E_j$ from this set which are not needed to maintain the implication $D = 1$. Suppose the set that remains is $\{E_{h_1}, \ldots, E_{h_S}\}$; then either
$$E_{h_1} = E_{h_1}^*,\ \ldots,\ E_{h_S} = E_{h_S}^* \Rightarrow D = 1$$
or
$$E_{h_1} = E_{h_1}^*,\ \ldots,\ E_{h_S} = E_{h_S}^* \not\Rightarrow D = 1 \quad\text{and}\quad \varepsilon_D = \varepsilon_D^*,\ E_{h_1} = E_{h_1}^*,\ \ldots,\ E_{h_S} = E_{h_S}^* \Rightarrow D = 1.$$
If $E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^* \Rightarrow D = 1$ then if we define $F_j$ as the indicator $F_j = 1_{(E_{h_j} = E_{h_j}^*)}$, then $F_1 \cdots F_S$ is a minimal sufficient cause for $D$ and there thus exists an $i$ such that $P_i = F_1 \cdots F_S$ and $M_i = P_i$, and when $E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^*$ we have $M_i = 1$. If $E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^* \not\Rightarrow D = 1$ but $\varepsilon_D = \varepsilon_D^*, E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^* \Rightarrow D = 1$ then if we define $F_j$ as the indicator $1_{(E_{h_j} = E_{h_j}^*)}$, $1_{\varepsilon_D = \varepsilon_D^*}F_1 \cdots F_S$ is a minimal sufficient cause for $D$ and there exists an $i$ such that $M_i = A_iP_i$ and $P_i = F_1 \cdots F_S$ and $\varepsilon_D = \varepsilon_D^* \Rightarrow A_i = 1$ and such that
$$\varepsilon_D = \varepsilon_D^*,\ E_{h_1} = E_{h_1}^*,\ \ldots,\ E_{h_S} = E_{h_S}^* \Rightarrow M_i = 1.$$
We have thus shown that when $D = 1$ there exists an $M_i$ such that $M_i = 1$ and so the $M_i$s constitute a minimal sufficient cause representation for $D$.
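The construction used in the proof can be carried out by exhaustive search when the sample space of $\varepsilon_D$ is small. The following is a minimal sketch under stated assumptions, not the authors' procedure: the structural function `f`, the three-point sample space, and the helper names `literal_sets`, `sufficient`, `canonical_representation` and `reproduces_outcome` are all hypothetical. Each co-cause $A_i$ is represented by the set of $\omega$ values on which it equals 1, and each principal cause by a tuple of (parent index, required value) pairs.

```python
from itertools import product

def literal_sets(m):
    # All 3**m conjunctions: each parent required to be 0, required to be 1, or absent.
    for choice in product((None, 0, 1), repeat=m):
        yield {i: v for i, v in enumerate(choice) if v is not None}

def sufficient(conj, omegas, f, m):
    # True if conj, with omega restricted to `omegas`, forces D = 1.
    return all(
        f(pa, w) == 1
        for w in omegas
        for pa in product((0, 1), repeat=m)
        if all(pa[i] == v for i, v in conj.items())
    )

def canonical_representation(f, m, omega_space):
    rep = {}
    for conj in literal_sets(m):
        key = tuple(sorted(conj.items()))
        subs = [{k: v for k, v in conj.items() if k != d} for d in conj]
        if sufficient(conj, omega_space, f, m):
            # P_i is sufficient by itself; keep it (A_i identically 1) only if minimal.
            if all(not sufficient(s, omega_space, f, m) for s in subs):
                rep[key] = set(omega_space)
            continue
        # Otherwise collect the omega_j for which W_j P_i is minimal sufficient.
        a = {
            w for w in omega_space
            if sufficient(conj, [w], f, m)
            and all(not sufficient(s, [w], f, m) for s in subs)
        }
        if a:  # suppress co-causes that are identically 0
            rep[key] = a
    return rep

def reproduces_outcome(rep, f, m, omega_space):
    # Checks D = OR_i A_i P_i at every (parents, omega) combination.
    return all(
        int(any(w in a and all(pa[i] == v for i, v in key)
                for key, a in rep.items())) == f(pa, w)
        for w in omega_space
        for pa in product((0, 1), repeat=m)
    )

def f(pa, w):
    # Hypothetical structural equation: omega selects one of three response patterns.
    e1, e2 = pa
    return (e1 & e2, e1, 1)[w]

omega_space = (0, 1, 2)
rep = canonical_representation(f, 2, omega_space)
print(rep, reproduces_outcome(rep, f, 2, omega_space))
```

For this assumed `f` the search returns three sufficient causes: the residual co-cause $A_0 = W_2$ (the third sample point), $A_1E_1$ with $A_1 = W_1$ (the second sample point), and the principal cause $E_1E_2$ with co-cause identically 1. The search is exponential in the number of parents, so it is only meant to illustrate the definitions on toy cases.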
The variables $A_i$ constructed in Theorem 2 along with their corresponding principal causes $P_i$ we define below as the canonical representation for $D$.

Definition 11. Consider a causal directed acyclic graph $G$ such that some node $D$ and all of its parents are binary. Let $\Omega_D$ be the sample space for the random term $\varepsilon_D$ in the non-parametric structural equation for $D$ on $G$. The principal causes $P_i = F^i_1 \cdots F^i_{n_i}$, where each $F^i_k$ is either a parent of $D$ or the complement of a parent of $D$, along with the variables $A_i$ constructed by $A_i \equiv 1$ if $F^i_1 \cdots F^i_{n_i}$ is a minimal sufficient cause for $D$ and $A_i = \bigvee_{\omega_j \in \Omega_D} \{1_{\varepsilon_D = \varepsilon_D(\omega_j)} : 1_{\varepsilon_D = \varepsilon_D(\omega_j)}F^i_1 \cdots F^i_{n_i}$ is a minimal sufficient cause for $D\}$ otherwise are said to be the canonical representation for $D$.

As noted above there will in general exist more than one set of co-causes $A_0, \ldots, A_u$ which together with the main causes and their complements can be used to construct a sufficient cause representation for $D$. The set of $A_i$s in the canonical representation constitutes only one particular set of variables which can be used to construct a sufficient cause representation. The canonical representation in a sense "favors" principal causes with fewer terms in their conjunction. The canonical representation will never have $A_i = 1$ for some principal cause $P_i$ when there is a principal cause $P_j$ with $A_j = 1$ and such that the components of $P_j$ are a subset of those in the conjunction for $P_i$. This is made more precise below in Proposition 2 and Corollary 3.

As stated and proved in Theorem 2, the canonical representation will always consist of a determinative set of minimal sufficient causes. This determinative set of minimal sufficient causes will sometimes but not always be a non-redundant set of minimal sufficient causes. That the canonical representation may have redundant minimal sufficient causes is demonstrated in Example 4.

Example 4. Consider a binary variable $D$ with three binary parents, $E_1$, $E_2$
uzuki et al. BM Medical Research Methodology 2013, 13:101 RARH ARTIL Open Access A counterfactual approach to bias and effect modification in terms of response types tsuji uzuki 1*, Toshiharu Mitsuhashi
More information1. what conditional independencies are implied by the graph. 2. whether these independecies correspond to the probability distribution
NETWORK ANALYSIS Lourens Waldorp PROBABILITY AND GRAPHS The objective is to obtain a correspondence between the intuitive pictures (graphs) of variables of interest and the probability distributions of
More information2 Interval-valued Probability Measures
Interval-Valued Probability Measures K. David Jamison 1, Weldon A. Lodwick 2 1. Watson Wyatt & Company, 950 17th Street,Suite 1400, Denver, CO 80202, U.S.A 2. Department of Mathematics, Campus Box 170,
More informationWhat Causality Is (stats for mathematicians)
What Causality Is (stats for mathematicians) Andrew Critch UC Berkeley August 31, 2011 Introduction Foreword: The value of examples With any hard question, it helps to start with simple, concrete versions
More informationIdenti cation of Positive Treatment E ects in. Randomized Experiments with Non-Compliance
Identi cation of Positive Treatment E ects in Randomized Experiments with Non-Compliance Aleksey Tetenov y February 18, 2012 Abstract I derive sharp nonparametric lower bounds on some parameters of the
More informationOUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS. Judea Pearl University of California Los Angeles (www.cs.ucla.
OUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea/) Statistical vs. Causal vs. Counterfactual inference: syntax and semantics
More informationConditional Independence
Conditional Independence Sargur Srihari srihari@cedar.buffalo.edu 1 Conditional Independence Topics 1. What is Conditional Independence? Factorization of probability distribution into marginals 2. Why
More informationProbabilistic Graphical Models (I)
Probabilistic Graphical Models (I) Hongxin Zhang zhx@cad.zju.edu.cn State Key Lab of CAD&CG, ZJU 2015-03-31 Probabilistic Graphical Models Modeling many real-world problems => a large number of random
More information5.3 Graphs and Identifiability
218CHAPTER 5. CAUSALIT AND STRUCTURAL MODELS IN SOCIAL SCIENCE AND ECONOMICS.83 AFFECT.65.23 BEHAVIOR COGNITION Figure 5.5: Untestable model displaying quantitative causal information derived. method elucidates
More informationDirected and Undirected Graphical Models
Directed and Undirected Davide Bacciu Dipartimento di Informatica Università di Pisa bacciu@di.unipi.it Machine Learning: Neural Networks and Advanced Models (AA2) Last Lecture Refresher Lecture Plan Directed
More informationTime Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley
Time Series Models and Inference James L. Powell Department of Economics University of California, Berkeley Overview In contrast to the classical linear regression model, in which the components of the
More informationFoundations of Mathematics MATH 220 FALL 2017 Lecture Notes
Foundations of Mathematics MATH 220 FALL 2017 Lecture Notes These notes form a brief summary of what has been covered during the lectures. All the definitions must be memorized and understood. Statements
More informationExternal validity, causal interaction and randomised trials
External validity, causal interaction and randomised trials Seán M. Muller University of Cape Town Evidence and Causality in the Sciences Conference University of Kent (Canterbury) 5 September 2012 Overview
More informationSIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011
SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER By Donald W. K. Andrews August 2011 COWLES FOUNDATION DISCUSSION PAPER NO. 1815 COWLES FOUNDATION FOR RESEARCH IN ECONOMICS
More informationCHAPTER 10. Gentzen Style Proof Systems for Classical Logic
CHAPTER 10 Gentzen Style Proof Systems for Classical Logic Hilbert style systems are easy to define and admit a simple proof of the Completeness Theorem but they are difficult to use. By humans, not mentioning
More informationGMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails
GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,
More informationCausal Inference. Prediction and causation are very different. Typical questions are:
Causal Inference Prediction and causation are very different. Typical questions are: Prediction: Predict Y after observing X = x Causation: Predict Y after setting X = x. Causation involves predicting
More informationLearning Multivariate Regression Chain Graphs under Faithfulness
Sixth European Workshop on Probabilistic Graphical Models, Granada, Spain, 2012 Learning Multivariate Regression Chain Graphs under Faithfulness Dag Sonntag ADIT, IDA, Linköping University, Sweden dag.sonntag@liu.se
More informationCAUSAL INFERENCE IN TIME SERIES ANALYSIS. Michael Eichler
CAUSAL INFERENCE IN TIME SERIES ANALYSIS Michael Eichler Department of Quantitative Economics, Maastricht University P.O. Box 616, 6200 MD Maastricht, The Netherlands November 11, 2011 1. ÁÒØÖÓ ÙØ ÓÒ The
More informationCMPT Machine Learning. Bayesian Learning Lecture Scribe for Week 4 Jan 30th & Feb 4th
CMPT 882 - Machine Learning Bayesian Learning Lecture Scribe for Week 4 Jan 30th & Feb 4th Stephen Fagan sfagan@sfu.ca Overview: Introduction - Who was Bayes? - Bayesian Statistics Versus Classical Statistics
More informationRecall from last time. Lecture 3: Conditional independence and graph structure. Example: A Bayesian (belief) network.
ecall from last time Lecture 3: onditional independence and graph structure onditional independencies implied by a belief network Independence maps (I-maps) Factorization theorem The Bayes ball algorithm
More informationTree sets. Reinhard Diestel
1 Tree sets Reinhard Diestel Abstract We study an abstract notion of tree structure which generalizes treedecompositions of graphs and matroids. Unlike tree-decompositions, which are too closely linked
More informationThe decision theoretic approach to causal inference OR Rethinking the paradigms of causal modelling
The decision theoretic approach to causal inference OR Rethinking the paradigms of causal modelling A.P.Dawid 1 and S.Geneletti 2 1 University of Cambridge, Statistical Laboratory 2 Imperial College Department
More informationStrong Lifting Splits
M. Alkan Department of Mathematics Akdeniz University Antalya 07050, Turkey alkan@akdeniz.edu.tr Strong Lifting Splits A.Ç. Özcan Department of Mathematics Hacettepe University Ankara 06800, Turkey ozcan@hacettepe.edu.tr
More informationThe Effects of Interventions
3 The Effects of Interventions 3.1 Interventions The ultimate aim of many statistical studies is to predict the effects of interventions. When we collect data on factors associated with wildfires in the
More informationStructure learning in human causal induction
Structure learning in human causal induction Joshua B. Tenenbaum & Thomas L. Griffiths Department of Psychology Stanford University, Stanford, CA 94305 jbt,gruffydd @psych.stanford.edu Abstract We use
More informationConfounding Equivalence in Causal Inference
In Proceedings of UAI, 2010. In P. Grunwald and P. Spirtes, editors, Proceedings of UAI, 433--441. AUAI, Corvallis, OR, 2010. TECHNICAL REPORT R-343 July 2010 Confounding Equivalence in Causal Inference
More informationOn the Identification of Causal Effects
On the Identification of Causal Effects Jin Tian Department of Computer Science 226 Atanasoff Hall Iowa State University Ames, IA 50011 jtian@csiastateedu Judea Pearl Cognitive Systems Laboratory Computer
More informationConfounding Equivalence in Causal Inference
Revised and submitted, October 2013. TECHNICAL REPORT R-343w October 2013 Confounding Equivalence in Causal Inference Judea Pearl Cognitive Systems Laboratory Computer Science Department University of
More informationIdentification and Overidentification of Linear Structural Equation Models
In D. D. Lee and M. Sugiyama and U. V. Luxburg and I. Guyon and R. Garnett (Eds.), Advances in Neural Information Processing Systems 29, pre-preceedings 2016. TECHNICAL REPORT R-444 October 2016 Identification
More informationOnline Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access
Online Appendix to: Marijuana on Main Street? Estating Demand in Markets with Lited Access By Liana Jacobi and Michelle Sovinsky This appendix provides details on the estation methodology for various speci
More informationUsing Descendants as Instrumental Variables for the Identification of Direct Causal Effects in Linear SEMs
Using Descendants as Instrumental Variables for the Identification of Direct Causal Effects in Linear SEMs Hei Chan Institute of Statistical Mathematics, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan
More informationStochastic dominance with imprecise information
Stochastic dominance with imprecise information Ignacio Montes, Enrique Miranda, Susana Montes University of Oviedo, Dep. of Statistics and Operations Research. Abstract Stochastic dominance, which is
More informationNotes on Time Series Modeling
Notes on Time Series Modeling Garey Ramey University of California, San Diego January 17 1 Stationary processes De nition A stochastic process is any set of random variables y t indexed by t T : fy t g
More informationEffect Modification and Interaction
By Sander Greenland Keywords: antagonism, causal coaction, effect-measure modification, effect modification, heterogeneity of effect, interaction, synergism Abstract: This article discusses definitions
More informationProbabilistic Causal Models
Probabilistic Causal Models A Short Introduction Robin J. Evans www.stat.washington.edu/ rje42 ACMS Seminar, University of Washington 24th February 2011 1/26 Acknowledgements This work is joint with Thomas
More informationLearning in Bayesian Networks
Learning in Bayesian Networks Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Berlin: 20.06.2002 1 Overview 1. Bayesian Networks Stochastic Networks
More informationarxiv: v2 [math.st] 4 Mar 2013
Running head:: LONGITUDINAL MEDIATION ANALYSIS 1 arxiv:1205.0241v2 [math.st] 4 Mar 2013 Counterfactual Graphical Models for Longitudinal Mediation Analysis with Unobserved Confounding Ilya Shpitser School
More informationAn Introduction to Causal Inference, with Extensions to Longitudinal Data
An Introduction to Causal Inference, with Extensions to Longitudinal Data Tyler VanderWeele Harvard Catalyst Biostatistics Seminar Series November 18, 2009 Plan of Presentation Association and Causation
More informationRings, Integral Domains, and Fields
Rings, Integral Domains, and Fields S. F. Ellermeyer September 26, 2006 Suppose that A is a set of objects endowed with two binary operations called addition (and denoted by + ) and multiplication (denoted
More informationPDF hosted at the Radboud Repository of the Radboud University Nijmegen
PDF hosted at the Radboud Repository of the Radboud University Nijmegen The following full text is an author's version which may differ from the publisher's version. For additional information about this
More informationImplementing Proof Systems for the Intuitionistic Propositional Logic
Implementing Proof Systems for the Intuitionistic Propositional Logic Veronica Zammit Supervisor: Dr. Adrian Francalanza Faculty of ICT University of Malta May 27, 2011 Submitted in partial fulfillment
More informationm-transportability: Transportability of a Causal Effect from Multiple Environments
m-transportability: Transportability of a Causal Effect from Multiple Environments Sanghack Lee and Vasant Honavar Artificial Intelligence Research Laboratory Department of Computer Science Iowa State
More informationarxiv: v2 [cs.lg] 9 Mar 2017
Journal of Machine Learning Research? (????)??-?? Submitted?/??; Published?/?? Joint Causal Inference from Observational and Experimental Datasets arxiv:1611.10351v2 [cs.lg] 9 Mar 2017 Sara Magliacane
More informationGeneralized Do-Calculus with Testable Causal Assumptions
eneralized Do-Calculus with Testable Causal Assumptions Jiji Zhang Division of Humanities and Social Sciences California nstitute of Technology Pasadena CA 91125 jiji@hss.caltech.edu Abstract A primary
More information