Causal Regression Models II: Unconfoundedness and Causal Unbiasedness 1


Methods of Psychological Research Online 2000, Vol. 5, No. 3
Institute for Science Education 2000 IPN Kiel

Causal Regression Models II: Unconfoundedness and Causal Unbiasedness 1

By Rolf Steyer 2, Siegfried Gabler 3, Alina A. von Davier 4, and Christof Nachtigall 4

Abstract

We consider regression models with discrete units and a discrete treatment variable. In this framework, individual and average causal effects as well as causal unbiasedness of conditional expected values E(Y X = x) and of their differences were defined in a previous paper, where it was also noted that a hypothesis of causal unbiasedness is not empirically testable outside the randomized experiment. Therefore, we study a stronger causality criterion, which we call unconfoundedness. To our knowledge, this is the weakest empirically testable condition implying causal unbiasedness of the conditional expected values E(Y X = x). Unconfoundedness holds in randomized experiments, but it may hold in nonrandomized experiments as well. We derive theorems about sufficient and necessary conditions, about sufficient conditions, and about necessary conditions for unconfoundedness. The latter identify the hypotheses to be tested in nonrandomized experiments when it comes to testing the weakest empirically testable sufficient condition for conditional expected values E(Y X = x) to be causally unbiased.

Keywords: Causality; Confounding; Regression Models; Randomization; Rubin's Approach to Causality

1 We would like to thank Albrecht Iseler (Free University Berlin) for a discussion in which the idea for some parts of this paper was developed. We would also like to extend our gratitude to Angelika von der Linde (University of Edinburgh), Donald B. Rubin (Harvard University), and Peter Kischka (FSU Jena), who all gave helpful critical comments on previous versions of this paper.
2 Address for correspondence: Prof. Dr. Rolf Steyer, Friedrich-Schiller-Universität Jena, Am Steiger 3 Haus 1, D Jena, Germany. rolf.steyer@uni-jena.de
3 Center for Surveys, Methods, and Analyses (ZUMA), Mannheim, Germany
4 Friedrich-Schiller-University Jena, Germany

In a previous paper (Steyer, Gabler, von Davier, Nachtigall, & Buhl, 2000) we reformulated the theory of individual and average causal effects in terms of classical probability theory and illustrated it by some examples. This theory goes back to Neyman (1923/1990; 1935) and has been adopted and enriched by Rubin (1973a, b, 1974, 1977, 1978, 1985, 1986, 1990), Holland and Rubin (1983), Holland (1986, 1988a, b), Rosenbaum and Rubin (1983a, b, 1984, 1985a, b), Rosenbaum (1984a, b, c), and Sobel (1994, 1995), for instance. More specifically, we introduced a probabilistic framework: a potential causal regression model with discrete units and a discrete treatment variable,5 which consists of three components: a probability space (Ω, A, P), which represents the random experiment, i.e., the empirical phenomenon considered; a regression E(Y X), whose potential causal interpretation is at issue, where X represents a discrete (but not necessarily univariate) treatment variable and Y a real-valued (and not necessarily continuous) response variable; and a nonnumeric random variable U, the value of which is the observational unit drawn. The probability space is chosen such that X, Y, and U are random variables on (Ω, A, P), i.e., X, Y, and U have a joint distribution. In this framework and notation, we presented the basic concepts of the theory of individual and average causal effects: the individual conditional expected values E(Y X = x, U = u), the individual causal effects ICE_u(i, j) := E(Y X = x_i, U = u) − E(Y X = x_j, U = u) of a treatment x_i vs. a treatment x_j, the causally unbiased expected value [denoted CUE(Y X = x)], and the average causal effect ACE(i, j) (see Figure 1 of Steyer et al., 2000). We also showed how these concepts are related to the conditional expected values E(Y X = x) and the prima facie effects PFE(i, j) = E(Y X = x_i) − E(Y X = x_j) that are usually the focus of statistical estimation and hypothesis testing.
Specifically, it has been shown that both (a) stochastic independence of U and X (which can be created via random assignment of units to treatment conditions) and (b) unit-treatment homogeneity [i.e., E(Y X, U) = E(Y X)] imply the equations E(Y X = x) = CUE(Y X = x) and PFE(i, j) = ACE(i, j) [i.e., causal unbiasedness of E(Y X = x) and of PFE(i, j)]. We argued that the theory of individual and average causal effects reviewed above has some major limitations. The first one is that a proposition about causal unbiasedness is not empirically falsifiable: postulating that the prima facie effect PFE(i, j) is equal to the average causal effect ACE(i, j), or that the conditional expected values E(Y X = x) are equal to the causally unbiased expected values CUE(Y X = x) of Y given x, does not imply anything one could show to be wrong in an empirical application. The reason is that the computation of ACE(i, j) and CUE(Y X = x) involves all individual conditional expected values E(Y X = x, U = u) in the population and that it is not possible to estimate all of them (see the fundamental problem of causal inference in Holland, 1986, or Steyer et al., 2000). The second limitation is that even if both PFE(i, j) = ACE(i, j) and E(Y X = x) = CUE(Y X = x) hold in the total population, the corresponding equations may not hold in the subpopulations, for instance, in the subpopulations of males and females. Hence, causal unbiasedness of the prima facie effects might fail in the subpopulations even though it holds in the total population. Because of this nonfalsifiability we argued that, although the theory of individual and average causal effects provides indispensable concepts, it is not complete as a methodological basis for causal modeling outside the randomized experiment, i.e., when treatment assignment may not be random. Completing this methodological basis is the central goal of this paper. Specifically, we aim at enriching the theory of individual and average causal effects by introducing the concept of unconfoundedness of a regression E(Y X). Unconfoundedness will be defined such that (a) the proposition that a regression E(Y X) is unconfounded is empirically falsifiable, and (b) unconfoundedness of E(Y X) implies that the regressions E_W=w(Y X) in the subpopulations are causally unbiased as well.6 Among all criteria fulfilling these requirements we prefer the criterion which is logically the weakest, i.e., which is implied by the others but does not imply any of them. If requirement (a) were not fulfilled, anybody could make causal propositions without anyone having a chance to falsify them.

5 For a more general formal framework, see Steyer (1992).
Requirement (b) means that causal interpretations such as causal unbiasedness should be transferable from the total population into its subpopulations and in this sense be stable or invariant. Hence, the goal of this paper is to introduce a causality criterion that fulfills the requirements (a) and (b) mentioned above, compare it to some other criteria, illustrate it by an example, and study its sufficient and necessary conditions. More specifically, we will present and discuss different causality criteria in section 1, including strong ignorability and unconfoundedness. The latter is, according to requirements (a) and (b), the most favorable one. In section 2 we will illustrate the different criteria by an example. Section 3 is devoted to a necessary and sufficient condition of unconfoundedness. Next we will study sufficient conditions of unconfoundedness (section 4), and then sufficient conditions of strong ignorability (section 5). Then we turn to the necessary conditions of unconfoundedness (section 6). One of these necessary conditions is causal unbiasedness of E(Y X). Section 7 focuses on two general procedures to falsify unconfoundedness, which will be illustrated by an example in section 8. Section 9 deals with necessary conditions for unconfoundedness under special assumptions such as linearity and W-conditional linearity of E(Y X, W), where W is a variable each value of which represents a subpopulation. Finally, the merits and limits of the theory presented are discussed in the concluding section 10.

6 Each value w of W represents a subpopulation.

1. Some Causality Criteria

The weakest criterion we consider is causal unbiasedness of the treatment regression E(Y X) in the total population, i.e.,

E(Y X = x) = CUE(Y X = x), for each value x of X, (1)

with

CUE(Y X = x) := Σ_u E(Y X = x, U = u) P(U = u), (2)

where the summation is across all observational units u in the (total) population Ω_U (i.e., the set of all units considered). This term has been defined to be the causally unbiased expected value of Y given x (Steyer et al., 2000). Equation (1) means that each E(Y X = x) is equal to the expected value of the individual expected values E(Y X = x, U = u) across the distribution of the observational units. Equation (1) is sufficient to imply PFE(i, j) = ACE(i, j), i.e., causal unbiasedness of the prima facie effects in the total population (see Steyer et al., 2000). The equation E(Y X = x) = CUE(Y X = x) defines causal unbiasedness of a conditional expected value E(Y X = x). From the perspective of the theory of individual and average causal effects, there is no doubt that Equation (1), and with it the concept of causal unbiasedness, is a desirable and indispensable property of the conditional expected values E(Y X = x) in meaningful substantive applications.
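Equation (2) is just a weighted average of the individual conditional expected values. The following sketch computes it for a tiny hypothetical population of two units; all numbers are invented for illustration.

```python
# Hypothetical illustration of Equations (1) and (2); all numbers are invented.
# For each unit u we store P(U = u) and the individual conditional expected
# values E(Y | X = x, U = u); CUE(Y | X = x) is their P(U = u)-weighted average.

p_u = {"u1": 0.5, "u2": 0.5}                      # P(U = u)
e_y = {                                            # E(Y | X = x, U = u)
    ("x1", "u1"): 100.0, ("x1", "u2"): 130.0,
    ("x2", "u1"): 90.0,  ("x2", "u2"): 110.0,
}

def cue(x):
    """Causally unbiased expected value CUE(Y | X = x), Equation (2)."""
    return sum(e_y[(x, u)] * p_u[u] for u in p_u)

print(cue("x1"))  # 115.0
print(cue("x2"))  # 100.0
```

Note that nothing in this computation involves the observed conditional expected values E(Y X = x); this is exactly why a hypothesis of causal unbiasedness cannot be checked from observable quantities alone.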
However, as mentioned before, Equation (1) neither implies anything that could be falsified in an empirical application [see requirement (a)] nor does it imply unbiasedness of the regressions E_W=w(Y X) in the subpopulations represented by W = w [see requirement (b)]. Hence, Equation (1) [causal unbiasedness of the regression E(Y X)] is too weak to qualify as a satisfactory causality criterion applicable to experimental studies with nonrandom assignment.7 The same argument applies, of course, to the criterion PFE(i, j) = ACE(i, j), causal unbiasedness of the prima facie effects. Both criteria lack empirical falsifiability.

7 In a similar vein, Pearl (1998) argues that unbiasedness might be incidental, which creates the need for stable unbiasedness (see also Example III in Steyer et al., 2000).

As a second criterion we consider strong ignorability, described, e.g., by Rosenbaum and Rubin (1983a). They defined potential response variables Y_i : Ω → IR for each treatment condition x_i. In our notation we could define these variables for each value x_i by Y_i(ω) = E(Y X = x_i, U = u), for all ω with U(ω) = u. Defined in this way, it is obvious that the variables Y_i, i = 1, …, n_x, are functions of U and therefore have a joint distribution with X, because U and X have a joint distribution.8 Hence, stochastic independence of U and X implies that the variables Y_1, …, Y_n_x and X are stochastically independent (denoted Y_1, …, Y_n_x ⊥ X). The condition (a) Y_1, …, Y_n_x ⊥ X and (b) 0 < P(X = x_i) < 1, for each value x_i of X, has been called strong ignorability (see, e.g., Rosenbaum & Rubin, 1983a, p. 213).9 Strong ignorability implies causal unbiasedness of the conditional expected values and the prima facie effects. However, it is not empirically testable, and it is unknown whether or not it implies causal unbiasedness in the subpopulations (see Note 1 in Appendix A). Let us now discuss several other sufficient conditions for causal unbiasedness that are empirically falsifiable. A third criterion we might consider is stochastic independence of X and U. This criterion, in fact, fulfills both requirements discussed above. It is empirically falsifiable and it implies causal unbiasedness of the conditional expected values E(Y X = x, W = w) in the subpopulations represented by W = w (see Th. 3 and Th. 6). However, Theorem 2 of Steyer et al.
(2000) shows that there is a fourth criterion that is also falsifiable and implies causal unbiasedness of the conditional expected values E(Y X = x, W = w): unit-treatment homogeneity, i.e., E(Y X, U) = E(Y X). Hence, independence of U and X would be unnecessarily strong. The fifth criterion to be considered is: X and U are independent or E(Y X, U) = E(Y X).10 This criterion would also fulfill both requirements discussed above. In fact, it is already very close to our favorite one, which is somewhat weaker and still fulfills the two requirements (a) and (b). The sixth criterion will be called unconfoundedness of the regression E(Y X).11 It is defined as follows. Note that throughout the paper we presume that ⟨(Ω, A, P), E(Y X), U⟩ is a potential causal regression model.

8 In Rubin's notation, the causally unbiased expected values would be denoted E(Y_i), i.e., E(Y_i) = CUE(Y X = x_i).
9 For simplicity, we restrict our discussion to the case where there is no concomitant variable or covariate. Rosenbaum and Rubin consider the case with a covariate. Also note that strong ignorability implies E(Y_i) = E(Y_i X = x) for each pair (i, x) of indices i ∈ {1, …, n_x} and values x of X.
10 This criterion is very close to what has been proposed by Pearl (1998) in his definitions 2 and 3.
11 In previous papers (see, e.g., Steyer, Gabler, & Rucai, 1996, and Steyer, von Davier, Gabler, & Schuster, 1997), we have defined unconfoundedness in a different but equivalent way (see Th. 2).

Definition 1. E(Y X) is called unconfounded if for each value x of X:

P(X = x U = u) = P(X = x) for each value u of U (3)

or

E(Y X = x, U = u) = E(Y X = x) for each value u of U. (4)

This sixth criterion differs from the fifth one by the fact that Equation (3) or Equation (4) is postulated within each value x of X. (For an example, see Table 2.) Hence, within each treatment condition we require equal treatment assignment probabilities for each unit u [see Eq. (3)] or homogeneity of the units [see Eq. (4)]. Only one of these equations has to be true within each treatment condition x. In contrast, the fifth criterion requires Equation (3) to be true for all treatment conditions x or Equation (4) to be true for all treatment conditions x. Whereas from a substantive point of view this difference seems negligible, it is important from a mathematical point of view: the two criteria are not equivalent to each other. In fact, the fifth criterion implies the sixth criterion, but not vice versa. Table 1 displays a summary of the causality criteria discussed above.

Table 1. Summary of the causality criteria discussed in the paper
1. Causal unbiasedness of E(Y X): E(Y X = x) = Σ_u E(Y X = x, U = u) P(U = u), for each value x of X
2. Strong ignorability: Y_1, …, Y_n_x ⊥ X and 0 < P(X = x_i) < 1
3. Independence of X and U: X ⊥ U
4. Unit-treatment homogeneity: E(Y X, U) = E(Y X)
5. Independence of X and U or unit-treatment homogeneity: X ⊥ U or E(Y X, U) = E(Y X)
6. Unconfoundedness of E(Y X): for each value x of X, P(X = x U = u) = P(X = x) for each value u of U, or E(Y X = x, U = u) = E(Y X = x) for each value u of U

2. An Example

Table 2 gives an example illustrating the criteria discussed above. The example is constructed in such a way that unconfoundedness and causal unbiasedness hold but none of the other causality criteria do. We consider three treatment conditions x_1 to x_3 and a

population of eight units. We assume that each unit has the probability P(U = u) = 1/8 of being sampled. Let us first look at treatment x_1. Column 4 displays the individual assignment probabilities for treatment x_1. They are all the same, namely 1/2, for each individual unit. For this specific treatment condition x_1 we have different individual conditional expected values E(Y X = x_1, U = u) (see column 5). For treatments x_2 and x_3 things are different. Column 6 contains different treatment assignment probabilities P(X = x_2 U = u) for treatment condition x_2, and column 7 displays equal individual conditional expected values E(Y X = x_2, U = u). The same is true for treatment condition x_3: again, we have different treatment assignment probabilities P(X = x_3 U = u) but equal individual conditional expected values E(Y X = x_3, U = u). In order to check causal unbiasedness, we may first compute the three conditional expected values E(Y X = x_1) = 115, E(Y X = x_2) = 105, and E(Y X = x_3) = 110 from the equation

E(Y X = x) = Σ_u E(Y X = x, U = u) P(U = u X = x), (5)

which always holds for a discrete random variable U. Using

P(U = u X = x) = P(X = x U = u) P(U = u) / P(X = x) (6)

yields P(U = u X = x_1) = (1/2) (1/8) / (1/2) = 1/8 for each unit u and E(Y X = x_1) = 82 · 1/8 + 89 · 1/8 + … = 115. For the second treatment condition x_2 the corresponding formulas lead to P(U = u X = x_2) = (1/10) (1/8) / (1/4) = 1/20 for u_1 and u_2, to P(U = u X = x_2) = (2/10) (1/8) / (1/4) = 2/20 for u_3 and u_4, to P(U = u X = x_2) = (3/10) (1/8) / (1/4) = 3/20 for u_5 and u_6, and to P(U = u X = x_2) = (4/10) (1/8) / (1/4) = 4/20 for u_7 and u_8. Hence, Equation (5) now yields E(Y X = x_2) = 105 · (1/20 + 1/20 + 2/20 + 2/20 + 3/20 + 3/20 + 4/20 + 4/20) = 105. Analogously, we can compute the conditional expected value E(Y X = x_3) = 110.

Table 2. An example in which the treatment regression E(Y X) is unconfounded and causally unbiased but none of the other causality criteria discussed in this paper hold

u | P(U = u) | W (gender) | P(X = x_1 U = u) | E(Y X = x_1, U = u) | P(X = x_2 U = u) | E(Y X = x_2, U = u) | P(X = x_3 U = u) | E(Y X = x_3, U = u)
u_1 | 1/8 | m | 1/2 | 82 | 1/10 | 105 | 4/10 | 110
u_2 | 1/8 | m | 1/2 | 89 | 1/10 | 105 | 4/10 | 110
u_3 | 1/8 | m | 1/2 | … | 2/10 | 105 | 3/10 | 110
u_4 | 1/8 | m | 1/2 | … | 2/10 | 105 | 3/10 | 110
u_5 | 1/8 | f | 1/2 | … | 3/10 | 105 | 2/10 | 110
u_6 | 1/8 | f | 1/2 | … | 3/10 | 105 | 2/10 | 110
u_7 | 1/8 | f | 1/2 | … | 4/10 | 105 | 1/10 | 110
u_8 | 1/8 | f | 1/2 | … | 4/10 | 105 | 1/10 | 110

Note: The (unconditional) probabilities for the three treatments are P(X = x_1) = 1/2, P(X = x_2) = P(X = x_3) = 1/4.

In order to check causal unbiasedness, the conditional expected values computed above have to be compared to the causally unbiased conditional expected values, which may be computed by Equation (2). For the treatment condition x_1 this equation yields

CUE(Y X = x_1) := Σ_u E(Y X = x_1, U = u) P(U = u) = (82 + 89 + …) · 1/8 = 115,

for treatment condition x_2

CUE(Y X = x_2) := Σ_u E(Y X = x_2, U = u) P(U = u) = (105 + … + 105) · 1/8 = 105,

and for treatment condition x_3

CUE(Y X = x_3) := Σ_u E(Y X = x_3, U = u) P(U = u) = (110 + … + 110) · 1/8 = 110.

Hence, in this example, the conditional expected values are unbiased, i.e., E(Y X = x) = CUE(Y X = x) for each value x of X, and the prima facie effects PFE(i, j) = E(Y X = x_i) − E(Y X = x_j) are unbiased as well, i.e., PFE(i, j) = ACE(i, j), for each pair of treatment conditions x_i and x_j.
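The computation via Equations (5) and (6) can be sketched numerically. Since the individual expected values under x_1 for u_3 to u_8 are not fully given in the table, the x_1 column in this sketch is hypothetical except for the values 82 and 89; the assignment probabilities for x_2 are the ones stated in the text.

```python
from fractions import Fraction as F

# Sketch of the Table 2 computation. P(U = u) = 1/8 throughout; the assignment
# probabilities P(X = x2 | U = u) are those given in the text; the individual
# expected values under x2 are all 105, as in Table 2.
p_u = F(1, 8)
p_x2_given_u = [F(1, 10), F(1, 10), F(2, 10), F(2, 10),
                F(3, 10), F(3, 10), F(4, 10), F(4, 10)]
e_y_x2 = [105] * 8                                 # E(Y | X = x2, U = u)

p_x2 = sum(p * p_u for p in p_x2_given_u)          # P(X = x2) = 1/4
# Equation (6): P(U = u | X = x2) = P(X = x2 | U = u) P(U = u) / P(X = x2)
p_u_given_x2 = [p * p_u / p_x2 for p in p_x2_given_u]
# Equation (5): E(Y | X = x2) = sum_u E(Y | X = x2, U = u) P(U = u | X = x2)
e_y_given_x2 = sum(e * w for e, w in zip(e_y_x2, p_u_given_x2))

print(p_x2)          # 1/4
print(e_y_given_x2)  # 105
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the weights 1/20, 2/20, 3/20, 4/20 come out exactly as in the text.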

We now check the other causality criteria discussed in the last section. The second causality criterion, strong ignorability, does not hold in this example. The independence condition Y_1, …, Y_n_x ⊥ X implies that the conditional expected values E(Y_1 X = x_1) and E(Y_1 X = x_2) are identical. However, in our example:

E(Y_1 X = x_1) = Σ_u Y_1(u) P[Y_1 = Y_1(u) X = x_1] = Σ_u Y_1(u) P(U = u X = x_1) = Σ_u Y_1(u) [P(X = x_1 U = u) P(U = u) / P(X = x_1)] = (82 + 89 + …) · 1/8 = 115

and

E(Y_1 X = x_2) = Σ_u Y_1(u) P[Y_1 = Y_1(u) X = x_2] = Σ_u Y_1(u) P(U = u X = x_2) = Σ_u Y_1(u) [P(X = x_2 U = u) P(U = u) / P(X = x_2)] = (82 + 89) · 1/20 + (…) · 2/20 + (…) · 3/20 + (…) · 4/20 = 125.

Hence, we can conclude that strong ignorability does not hold in this example. The third criterion, stochastic independence of X and U, does not hold because, unlike for treatment condition x_1, the individual assignment probabilities P(X = x_2 U = u) and P(X = x_3 U = u) are not the same for each observational unit u. The fourth causality criterion, unit-treatment homogeneity, does not hold either. Although the individual conditional expected values E(Y X = x, U = u) are equal within treatment conditions x_2 and x_3, they differ in treatment condition x_1. The fifth causality criterion, stochastic independence of X and U or unit-treatment homogeneity, does not hold either. However, the sixth causality criterion, unconfoundedness, does hold: in each treatment condition x we have either equal assignment probabilities P(X = x U = u) (for x_1) or equal individual conditional expected values E(Y X = x, U = u) (for x_2 and x_3).

3. Some Equivalent Formulations of Unconfoundedness

We will now study several conditions which are equivalent to unconfoundedness as defined above. These conditions will provide the theoretical basis for empirical falsifiability.
All these conditions involve random variables W for which there exists a mapping f such that W = f(U) is the composition of U with f. As can easily be seen, such a variable W partitions the population Ω_U into subpopulations (such as the set of male and the set of female persons). Presuming a potential causal regression model ⟨(Ω, A, P), E(Y X), U⟩ and using these variables, we may formulate the following theorem.12

Theorem 1. E(Y X) is unconfounded if and only if for each value x of X and for each W = f(U):

P(X = x W = w) = P(X = x) for each value w of W (7)

or

E(Y X = x, W = w) = E(Y X = x) for each value w of W. (8)

According to this theorem, unconfoundedness postulates, within each treatment condition x, equal treatment probabilities for each subpopulation w or equal conditional expected values across all subpopulations (represented by the different values w of W). Note that, in empirical applications, this theorem already provides a basis for falsification of unconfoundedness, because, given a specific W = f(U), each term in these two equations is empirically estimable. At least one of the two equations above has to be true for each W = f(U) and a given value x of X. If neither Equation (7) nor Equation (8) holds for a given W = f(U), we can conclude that E(Y X) is confounded. The next theorem will show the surprising fact that unconfoundedness as defined by the sixth criterion (see Def. 1) is equivalent to unconfoundedness as defined by Steyer, Gabler, and Rucai (1996).13

Theorem 2. E(Y X) is unconfounded if and only if for each W = f(U):

E(Y X = x) = Σ_w E(Y X = x, W = w) P(W = w), for each value x of X. (9)

Note that the summation is across all values w of W. According to this theorem, postulating Equation (9) for each variable W = f(U) is equivalent to (general) unconfoundedness of the regression E(Y X). Also note that Equation (9) is not equivalent to Equation (7) or Equation (8). Instead, Theorem 2 formulates (via Th.
1) the equivalence between Equation (9) for each W = f(U) and Equation (7) or Equation (8) for each W = f(U). According to Equation (9), unconfoundedness implies that each conditional expected value E(Y X = x) is the average of the corresponding expected values E(Y X = x, W = w) in the subpopulations represented by the values w of W.

12 For proofs see Appendix B.
13 This formulation of unconfoundedness has also been called weak causality by Steyer (1992). There one will also find a formulation which is not restricted to discrete treatments and a discrete unit variable.

Corollary 1. If the treatment regression E(Y X) is unconfounded, then for each W = f(U):

PFE(i, j) = Σ_w [E(Y X = x_i, W = w) − E(Y X = x_j, W = w)] P(W = w). (10)

According to this corollary, unconfoundedness implies that each prima facie effect, i.e., each difference E(Y X = x_i) − E(Y X = x_j) between two values of the regression E(Y X), is the average of the differences E(Y X = x_i, W = w) − E(Y X = x_j, W = w) across all subpopulations represented by different values w of W. Also note that Theorem 2 holds irrespective of any specific parameterization of the regression E(Y X), be it linear or not. For instance, if Y is dichotomous with values 0 and 1 and X is numerical, Theorem 2 also holds for a logistic regression E(Y X) = P(Y = 1 X) = exp(γ_0 + γ_1 X) / [1 + exp(γ_0 + γ_1 X)]. One might expect that Equation (9) is always true if W is discrete. However, Example Ia of Steyer et al. (2000) (with W = U) shows that this expectation is wrong. An equation for the conditional expected values E(Y X = x) that is always true if W is discrete is:

E(Y X = x) = Σ_w E(Y X = x, W = w) P(W = w X = x), for each value x of X. (11)

Note that Equation (9) also provides a basis for empirical falsification of unconfoundedness, because each term in this equation is empirically estimable. Hence, both Theorems 1 and 2 provide possibilities for falsifying unconfoundedness. It will be useful to supplement the general concept of unconfoundedness by the concept of unconfoundedness with respect to a specific variable W. The definition will be such that general unconfoundedness is equivalent to unconfoundedness with respect to each W = f(U).

Definition 2. E(Y X) is called unconfounded with respect to W = f(U) if Equation (9) holds for each value x of X.

What has been achieved so far?
We have three different but equivalent ways of formulating unconfoundedness (see Table 3). This gives us a deeper understanding of this concept. Specifically, we have shown that, in contrast to causal unbiasedness and strong ignorability, unconfoundedness is empirically falsifiable (see versions 2 and 3 in Table 3).
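The contrast between Equation (11), which always holds, and Equation (9), which holds only under unconfoundedness with respect to W, can be made concrete. The following sketch uses invented numbers for a confounded situation: the joint distribution of X and W and the values E(Y X = x, W = w) are hypothetical.

```python
from fractions import Fraction as F

# Sketch (all numbers invented) of the falsification check licensed by
# Theorem 2: compare E(Y | X = x), computed via Equation (11), which always
# holds, with the Equation (9) expression sum_w E(Y | X = x, W = w) P(W = w).
# A discrepancy falsifies unconfoundedness of E(Y | X) with respect to W.
p_xw = {("x1", "m"): F(4, 10), ("x1", "f"): F(1, 10),   # joint P(X = x, W = w)
        ("x2", "m"): F(1, 10), ("x2", "f"): F(4, 10)}
e_y  = {("x1", "m"): 100, ("x1", "f"): 120,             # E(Y | X = x, W = w)
        ("x2", "m"): 100, ("x2", "f"): 120}
xs, ws = ("x1", "x2"), ("m", "f")

def p_w(w): return sum(p_xw[(x, w)] for x in xs)
def p_x(x): return sum(p_xw[(x, w)] for w in ws)

def eq11(x):   # Equation (11): always true for a discrete W
    return sum(e_y[(x, w)] * p_xw[(x, w)] / p_x(x) for w in ws)

def eq9(x):    # Equation (9): holds iff E(Y | X) is unconfounded w.r.t. W
    return sum(e_y[(x, w)] * p_w(w) for w in ws)

for x in xs:
    print(x, eq11(x), eq9(x))   # x1: 104 vs 110 -> Eq. (9) fails
```

In this invented example there is no X-effect within either subpopulation, yet the two sides of Equation (9) differ, so E(Y X) would be judged confounded with respect to W.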

Table 3. Three equivalent formulations of unconfoundedness
1. For each value x of X: P(X = x U = u) = P(X = x) for each value u of U, or E(Y X = x, U = u) = E(Y X = x) for each value u of U
2. For each variable W = f(U) and for each value x of X: P(X = x W = w) = P(X = x) for each value w of W, or E(Y X = x, W = w) = E(Y X = x) for each value w of W
3. For each variable W = f(U) and for each value x of X: E(Y X = x) = Σ_w E(Y X = x, W = w) P(W = w)

4. Sufficient Conditions for Unconfoundedness

Contrasting unconfoundedness with other criteria, we will now show more formally that unconfoundedness is a comparatively weak sufficient condition for causal unbiasedness. This will be achieved by studying some sufficient conditions for a treatment regression E(Y X) to be unconfounded. The sufficient conditions will also help to understand the concept and how it is related to the experiment with random assignment of units to experimental conditions.

Theorem 3. Each of the following conditions is sufficient for unconfoundedness of the regression E(Y X):
(i) U and X are stochastically independent;
(ii) (Unit-treatment homogeneity) E(Y X, U) = E(Y X);
(iii) (Strong causality) For each W = f(U) there exists a function h such that

E(Y X, W) = E(Y X) + h(W). (12)

Note that each of these conditions is sufficient for unconfoundedness of E(Y X) but not necessary, i.e., each of them implies unconfoundedness of E(Y X), but unconfoundedness of E(Y X) implies neither (i), nor (ii), nor (iii). Condition (i), independence of X and U, is important for understanding the role of randomization, i.e., random assignment of units to treatment conditions. Randomization is the only way for the experimenter to deliberately create one of the sufficient conditions of unconfoundedness.
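The force of the additive decomposition in condition (iii) can be seen in a minimal numerical sketch: with invented functions g and h, the difference between two treatment conditions is the same in every subpopulation, because the h(w) term cancels.

```python
# Sketch (invented numbers) of Theorem 3 (iii): if E(Y | X, W) decomposes
# additively as g(X) + h(W), the effect of x1 vs. x2 is the same in every
# subpopulation w, i.e. the h(w) term cancels in the difference.
g = {"x1": 10, "x2": 4}           # hypothetical treatment part g(X)
h = {"w1": 0, "w2": 7, "w3": -3}  # hypothetical subpopulation part h(W)

def e_y(x, w):                    # E(Y | X = x, W = w) = g(x) + h(w)
    return g[x] + h[w]

effects = {w: e_y("x1", w) - e_y("x2", w) for w in h}
print(effects)  # {'w1': 6, 'w2': 6, 'w3': 6}
```

The constant difference 6 = g(x_1) − g(x_2) is exactly the cancellation exploited in the discussion of strong causality below.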

Equation (12) postulates the additive decomposability of the regression E(Y X, W) into a function g(X) and a function h(W), i.e.,

E(Y X, W) = g(X) + h(W), (13)

with the additional requirement g(X) = E(Y X), which, if Equation (13) holds, is equivalent to E[h(W) X] = E[h(W)]. For W = U, Equation (13) may be called unit-treatment additivity. It postulates the additive decomposability of the regression E(Y X, U) into a function g(X) and a function h(U), with the additional requirement g(X) = E(Y X). Equation (13) does not allow for interactions between X and W in the sense of analysis of variance (or moderator effects in terms of moderator regression models). Strong causality as defined by condition (iii) requires invariance of the individual causal effects across all observational units, because, for W = U, Equation (13) implies

E(Y X = x_1, U = u) − E(Y X = x_2, U = u) = g(x_1) + h(u) − [g(x_2) + h(u)] = g(x_1) − g(x_2). (14)

It should be noted that there is no sufficient condition for such an invariance of individual causal effects that could deliberately be created by the experimenter. Specifically, Equation (13) is not necessarily (and in fact seldom) true in the randomized experiment, in contrast to unconfoundedness, which always holds if there is random assignment of units to treatment conditions. Obviously, unit-treatment homogeneity is a special case of unit-treatment additivity with h(W) = 0.

5. Sufficient Conditions for Strong Ignorability

In this section we state some sufficient conditions for strong ignorability. According to our next theorem, the first two sufficient conditions of unconfoundedness mentioned in Theorem 3 are also sufficient for strong ignorability.

Theorem 4.
Each of the following conditions is sufficient for strong ignorability:
(i) U and X are stochastically independent;
(ii) (Unit-treatment homogeneity) E(Y X, U) = E(Y X).

Note that each of these conditions is sufficient for strong ignorability but not necessary, i.e., each of them implies strong ignorability, but strong ignorability implies neither (i) nor (ii).

6. Necessary Conditions of Unconfoundedness

The first necessary condition formulated in this section relates unconfoundedness to causal unbiasedness (Th. 5). Next we show that unconfoundedness and, therefore, causal unbiasedness can be transferred to each subpopulation. Other necessary conditions treated are important for causal modeling in nonrandomized experiments, because they are the logical basis for falsifying a hypothesis of unconfoundedness.

Theorem 5. If the treatment regression E(Y X) is unconfounded, then each of the following propositions holds:
(i) E(Y X) is causally unbiased, i.e., E(Y X = x) = CUE(Y X = x) for each value x of X;
(ii) the prima facie effects are causally unbiased, i.e., PFE(i, j) = ACE(i, j) for all pairs (x_i, x_j) of values of X.

We now turn to a very powerful theorem according to which unconfoundedness of a regression E(Y X) allows one to conclude that unconfoundedness and, therefore, causal unbiasedness (see Th. 5) can be transferred to each subpopulation. This is not only important for the interpretation of expected values and their differences in subpopulations, but also allows one to conclude that all properties of an unconfounded regression also hold for the corresponding regressions within subpopulations.

Theorem 6. If E(Y X) is unconfounded and W = f(U), then, for each value w of W, the regression E_W=w(Y X) of Y on X in the subpopulation represented by W = w is unconfounded as well.

As mentioned before, this theorem implies that all properties of an unconfounded regression also apply to the regressions E_W=w(Y X) within the subpopulations. Instead of rewriting all propositions for the subpopulations, we just consider the property of causal unbiasedness of the regressions E_W=w(Y X) of Y on X in the subpopulations represented by W = w.

Corollary 2.
If E(Y X) is unconfounded and W = f(U), then the regressions E_W=w(Y X) of Y on X in the subpopulations represented by W = w are causally unbiased, i.e.,

E(Y X = x, W = w) = Σ_u E(Y X = x, U = u, W = w) P(U = u W = w), (15)

for each pair (x, w) of values of X and W.14

14 Note that E(Y X = x, U = u) = E(Y X = x, U = u, W = w), because we presuppose that W is a measurable function of U (see the definitions of U in Steyer et al., 2000 and of W above).
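Theorem 6 can be illustrated with data structured like Table 2, restricted to the male subpopulation W = m (units u_1 to u_4). The unit-level expected values under x_1 below are hypothetical where the table does not determine them; the check applies Definition 1 condition by condition within the subpopulation.

```python
from fractions import Fraction as F

# Sketch of Theorem 6 on Table-2-like data (the unit-level values under x1 are
# partly invented, since the table does not fully determine them). Within the
# subpopulation W = m (units u1..u4) we check Definition 1 condition by
# condition: equal assignment probabilities OR equal expected values.
p_x_given_u = {                                    # P(X = x | U = u), u1..u4
    "x1": [F(1, 2)] * 4,
    "x2": [F(1, 10), F(1, 10), F(2, 10), F(2, 10)],
    "x3": [F(4, 10), F(4, 10), F(3, 10), F(3, 10)],
}
e_y = {                                            # E(Y | X = x, U = u)
    "x1": [82, 89, 100, 110],                      # u3, u4 hypothetical
    "x2": [105] * 4,
    "x3": [110] * 4,
}

def unconfounded_in_subpopulation():
    ok = {}
    for x in p_x_given_u:
        equal_probs  = len(set(p_x_given_u[x])) == 1   # Eq. (3) within W = m
        equal_values = len(set(e_y[x])) == 1           # Eq. (4) within W = m
        ok[x] = equal_probs or equal_values
    return ok

print(unconfounded_in_subpopulation())  # {'x1': True, 'x2': True, 'x3': True}
```

As Theorem 6 predicts, the same either-or pattern that makes E(Y X) unconfounded in the total population survives the restriction to the subpopulation.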

Steyer, Gabler, von Davier & Nachtigall: Causal Regression Models II

Theorem 6 and Corollary 2 show that unconfoundedness is a stable property in the sense that it carries over from the total population to all subpopulations. In contrast, causal unbiasedness may be incidentally true in the total population but fail in any subpopulation (see Example III of Steyer et al., 2000).

Table 4. Necessary conditions of unconfoundedness

1. The regressions E_{W=w}(Y | X) in each subpopulation W = w are unconfounded and, therefore, causally unbiased.

2. The prima facie effects PFE(i, j) in the total population as well as the prima facie effects PFE_{W=w}(i, j) in each subpopulation are causally unbiased.

3. For each variable W = f(U) and for each value x of X:
   P(X = x | W = w) = P(X = x) for each value w of W, or
   E(Y | X = x, W = w) = E(Y | X = x) for each value w of W.

4. For each variable W = f(U) and for each value x of X:
   E(Y | X = x) = Σ_w E(Y | X = x, W = w) P(W = w).

Table 4 gives a summary of the most important necessary conditions of unconfoundedness. The first two necessary conditions are important for substantive interpretations, because they add meaning to the concept of unconfoundedness, which is otherwise not easily seen. The second two necessary conditions are also sufficient. Conditions 1, 3, and 4 may be used in falsification trials of the hypothesis of unconfoundedness. It should be noticed that, for a given variable W, Equation (7) or (8) is stronger than Equation (9), i.e., Equation (7) or (8) implies Equation (9), but not vice versa.

7. General Procedures to Falsify Unconfoundedness

An important goal of this paper is to present a theory that may serve as a methodological foundation of causal modeling also outside the randomized experiment.
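Condition 3 of Table 4 can be checked mechanically once the joint distribution of (X, W) and the cell means E(Y | X = x, W = w) are known or estimated. The sketch below, with invented probabilities and cell means, tests for each value x of X whether at least one of the two clauses holds.

```python
# Hypothetical discrete joint distribution P(X = x, W = w) and cell means
# E(Y | X = x, W = w); in this example X is independent of W.
p_xw = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.3}
ey_xw = {(0, 0): 100.0, (0, 1): 110.0, (1, 0): 105.0, (1, 1): 115.0}

xs = sorted({x for x, _ in p_xw})
ws = sorted({w for _, w in p_xw})
p_x = {x: sum(p_xw[(x, w)] for w in ws) for x in xs}
p_w = {w: sum(p_xw[(x, w)] for x in xs) for w in ws}
ey_x = {x: sum(ey_xw[(x, w)] * p_xw[(x, w)] for w in ws) / p_x[x] for x in xs}

def condition3_holds(tol=1e-12):
    for x in xs:
        # Clause (a): P(X = x | W = w) = P(X = x) for every w.
        a = all(abs(p_xw[(x, w)] / p_w[w] - p_x[x]) < tol for w in ws)
        # Clause (b): E(Y | X = x, W = w) = E(Y | X = x) for every w.
        b = all(abs(ey_xw[(x, w)] - ey_x[x]) < tol for w in ws)
        if not (a or b):
            return False
    return True

print(condition3_holds())  # True here, since X and W are independent
```

If the function returned False for some W = f(U), unconfoundedness of E(Y | X) would be falsified for that population.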
Since a hypothesis that the conditional expected values E(Y | X = x) are causally unbiased is not falsifiable, we introduced unconfoundedness of E(Y | X) as the weakest empirically falsifiable sufficient condition for causal unbiasedness. How should we proceed if we want to try to falsify the hypothesis of unconfoundedness? A first general principle is to choose a variable W = f(U) and compare the estimates on the right-hand side of Equation (9)

to the estimates on the right-hand side of Equation (11), i.e., to compare, for each value x of X,

E_adj for W(Y | X = x) := Σ_w E(Y | X = x, W = w) P(W = w)   (16)

to

E(Y | X = x) = Σ_w E(Y | X = x, W = w) P(W = w | X = x).   (17)

Note that the last equation is always true if W is discrete. If, in a specific empirical application, the estimate of E_adj for W(Y | X = x) cannot be assumed to be an estimate of E(Y | X = x), then unconfoundedness cannot be assumed to hold either.

According to the considerations above, falsifying the hypothesis of unconfoundedness of E(Y | X) in a specific application can be achieved by falsifying the null hypothesis of unconfoundedness of E(Y | X) with respect to a specific W = f(U), i.e., by falsifying the hypothesis

E_adj for W(Y | X = x) − E(Y | X = x) = 0, for each value x of X.   (18)

This hypothesis is equivalent to Equation (9). It may also equivalently be written

Σ_w E(Y | X = x, W = w) [P(W = w) − P(W = w | X = x)] = 0, for each value x of X.   (19)

If the unconditional probabilities P(W = w) and the conditional probabilities P(W = w | X = x) are known, Equation (19) is a linear hypothesis about the parameters E(Y | X = x, W = w).

A second general procedure can be based on condition 3 in Table 4. Again, choose a variable W = f(U) and check whether, for each given value x of X, condition

P(X = x | W = w) = P(X = x) for each value w of W   (20)

holds, or

E(Y | X = x, W = w) = E(Y | X = x) for each value w of W.   (21)

If there is a value x of X for which neither of these conditions holds, then unconfoundedness does not hold either.

8. Another Example

The example displayed in Table 5 serves to illustrate the general procedures to falsify unconfoundedness outlined in the last section. Of course, a simple look at treatment
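The comparison of Equations (16) and (17) can be carried out directly on sample data: estimate the cell means E(Y | X = x, W = w) and reweight them once by the marginal relative frequencies of W [Eq. (16)] and once by the frequencies of W within X = x [Eq. (17)]. The sketch below does this with numpy on an invented data-generating process; names such as `adjusted_mean` are ours, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
# Invented data-generating process in which W confounds X and Y.
w = rng.integers(0, 2, size=n)                       # W = f(U), e.g. gender
p_x1 = np.where(w == 1, 0.7, 0.3)                    # assignment depends on W
x = (rng.random(n) < p_x1).astype(int)
y = 100 + 10 * w + 5 * x + rng.normal(0, 1, size=n)  # Y depends on W and X

def raw_mean(xv):
    """Estimate of E(Y | X = x), the right-hand side of Eq. (17)."""
    return y[x == xv].mean()

def adjusted_mean(xv):
    """Estimate of E_adj for W(Y | X = x), Eq. (16): cell means weighted by P(W = w)."""
    return sum(y[(x == xv) & (w == wv)].mean() * np.mean(w == wv) for wv in (0, 1))

for xv in (0, 1):
    print(xv, round(raw_mean(xv), 2), round(adjusted_mean(xv), 2))
# The two estimates differ markedly for each x, speaking against
# unconfoundedness of E(Y | X) with respect to W [Eq. (18)].
```

In a real application the difference in Equation (18) would of course be judged by a statistical test rather than by inspection.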

condition x₂ reveals that the requirements formulated in Definition 1 do not hold in this example. For x₂ we have neither constant assignment probabilities across units nor constant individual conditional expected values E(Y | X = x₂, U = u) across units. As a consequence, the conditional expected value E(Y | X = x₂) is causally biased: whereas the causally unbiased expected value is CUE(Y | X = x₂) = 105, the conditional expected value E(Y | X = x₂) is equal to 115. Hence, the prima facie effect PFE(1, 2) = E(Y | X = x₁) − E(Y | X = x₂) = 115 − 115 = 0 is considerably different from the average causal effect ACE(1, 2) = 115 − 105 = 10. Whereas the average causal effect is positive, the prima facie effect is zero.

In the last paragraph we used Definition 1 to check unconfoundedness. However, in empirical applications, data on the individual unit level are neither available nor estimable. Hence, Definition 1 cannot be used in empirical falsification trials. What is estimable, however, are the data within subpopulations such as gender (see column 3 of Table 5). The conditional expected values E(Y | X = x, W = w) may be computed by the formula

E(Y | X = x, W = w) = Σ_u E(Y | X = x, U = u, W = w) P(U = u | X = x, W = w),

where E(Y | X = x, W = w, U = u) = E(Y | X = x, U = u), because W = f(U). Hence, for this purpose we have to compute the conditional probabilities P(U = u | X = x, W = w). Using the data of Table 5 we obtain, for X = x₂ and W = m,

E(Y | X = x₂, W = m) = 88.5,

and, for X = x₂ and W = f,

E(Y | X = x₂, W = f) ≈ 126.36

[the latter value also follows from 115 = (3/10) · 88.5 + (7/10) · E(Y | X = x₂, W = f)]. The unconditional probabilities P(W = w) are P(W = f) = P(W = m) = 1/2, whereas the conditional probabilities P(W = w | X = x₂) are P(W = m | X = x₂) = 3/10 and P(W = f | X = x₂) = 7/10. Inserting these results into the left-hand side of Equation (19) yields

Σ_w E(Y | X = x₂, W = w) [P(W = w) − P(W = w | X = x₂)]
= 88.5 (1/2 − 3/10) + 126.36 (1/2 − 7/10) ≈ 17.70 − 25.27 = −7.57 ≠ 0.
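The arithmetic of this example can be checked in a few lines; here the female cell mean is recovered from the decomposition E(Y | X = x₂) = Σ_w E(Y | X = x₂, W = w) P(W = w | X = x₂) rather than read from Table 5.

```python
# Values stated in the example of Table 5.
e_x2 = 115.0                 # E(Y | X = x2)
e_x2_m = 88.5                # E(Y | X = x2, W = m)
p_m_x2, p_f_x2 = 0.3, 0.7    # P(W = w | X = x2)
p_m, p_f = 0.5, 0.5          # P(W = w)

# Recover E(Y | X = x2, W = f) from the law of total expectation.
e_x2_f = (e_x2 - e_x2_m * p_m_x2) / p_f_x2
print(round(e_x2_f, 2))  # 126.36

# Left-hand side of Eq. (19) for x = x2: nonzero, so E(Y | X) is
# confounded with respect to W in this population.
lhs = e_x2_m * (p_m - p_m_x2) + e_x2_f * (p_f - p_f_x2)
print(round(lhs, 2))  # -7.57
```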

Hence, this is a contradiction to the hypothesis that the regression E(Y | X) is unconfounded in the example of Table 5 [see Eq. (19)]. The second general procedure outlined in Section 7 is easier to follow in this example: it is easily seen from Table 5 that neither Equation (20) nor Equation (21) holds for X = x₂ and W = m, for instance. It should be emphasized again that all parameters necessary for the general procedures to falsify unconfoundedness illustrated in this example are estimable in real empirical applications. In this sense, unconfoundedness is empirically falsifiable.

Table 5. An example in which the treatment regression E(Y | X) is confounded

Unit u | P(U = u) | W (gender) | P(X = x₁|U = u) | E(Y|X = x₁, U = u) | P(X = x₂|U = u) | E(Y|X = x₂, U = u) | P(X = x₃|U = u) | E(Y|X = x₃, U = u)
u₁ | 1/8 | m | 1/2 | 82 | … | … | … | …
u₂ | 1/8 | m | 1/2 | 89 | … | … | … | …
u₃ | 1/8 | m | … | … | … | … | … | …
u₄ | 1/8 | m | … | … | … | … | … | …
u₅ | 1/8 | f | … | … | … | … | … | …
u₆ | 1/8 | f | … | … | … | … | … | …
u₇ | 1/8 | f | … | … | … | … | … | …
u₈ | 1/8 | f | … | … | … | … | … | …

Note: The (unconditional) probabilities for the three treatments are P(X = x₁) = 1/2, P(X = x₂) = P(X = x₃) = 1/4.

9. Special Procedures to Falsify Unconfoundedness

In the last two sections, we did not presuppose anything about the functional form of the regression E(Y | X, W). In empirical applications, the procedures outlined in these sections can be used if the sample is big enough to estimate each conditional expected value E(Y | X = x, W = w) sufficiently well and if the probabilities P(X = x | W = w) are known or can also be estimated sufficiently well. If, however,

compared to the sample size, the number of different combinations (x, w) of values of X and W is too large, then there might even be pairs (x, w) for which there are no observations at all. If, however, we are able to make an assumption about the functional form of the regression E(Y | X, W), sparseness of data is less of a problem. We will consider two such assumptions: linearity [Eq. (22)] and conditional linearity [Eq. (32)]. The case of linearity has already been considered by Pratt and Schlaifer (1988), for instance. In the next theorem, X = (X₁, …, X_p)′ denotes a p-dimensional regressor and W = (W₁, …, W_q)′ a q-dimensional function of U. Both X and W are column vectors.¹⁵

Theorem 7. If (a) W = f(U), (b) the treatment regression E(Y | X) is unconfounded (or at least unconfounded with respect to W), (c) the covariance matrix Σ of the (p + q)-dimensional row vector (X′, W′) is regular, and (d)

E(Y | X, W) = β₀ + X′β_X + W′β_W,  β₀ ∈ ℝ, β_X ∈ ℝᵖ, β_W ∈ ℝ^q,   (22)

then

E(Y | X) = α₀ + X′α_X,  α₀ ∈ ℝ, α_X ∈ ℝᵖ,   (23)

α_X = β_X,   (24)

E(W′ | X) β_W = E(W′) β_W,   (25)

Σ_XW β_W = 0.   (26)

In Equation (26), Σ_XW denotes the covariance matrix of X and W. Hence, under the linearity assumption [Eq. (22)], unconfoundedness of E(Y | X) can be falsified in a specific empirical application if it is shown that α_X = β_X does not hold. Alternatively, one might try to show that Σ_XW β_W = 0 does not hold. Hence, tests of α_X = β_X and tests of Σ_XW β_W = 0 are tests of unconfoundedness, provided that W = f(U) and Equation (22) is true. Usually, a test of linearity or goodness of fit of Equation (22) should precede a test of unconfoundedness. Also note that, given the presuppositions of Theorem 7 [including Eq. (22)], Equations (24) and (26) are equivalent.
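Theorem 7 can be illustrated by simulation: when X is randomized (hence unconfounded) and Equation (22) holds, regressing Y on X alone recovers the same slope as regressing Y on X and W jointly. The data-generating values below are invented; with a confounded assignment the two slopes would separate.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
w = rng.normal(0, 1, n)            # W = f(U)
x = rng.normal(0, 1, n)            # randomized: X independent of W (and U)
y = 2.0 + 3.0 * x + 4.0 * w + rng.normal(0, 1, n)  # Eq. (22) with beta_X = 3

def ols(design, target):
    """Least-squares coefficients with an intercept column prepended."""
    a = np.column_stack([np.ones(len(target))] + list(design))
    return np.linalg.lstsq(a, target, rcond=None)[0]

alpha = ols([x], y)      # E(Y | X) = alpha_0 + alpha_X X          [Eq. (23)]
beta = ols([x, w], y)    # E(Y | X, W) = beta_0 + beta_X X + beta_W W  [Eq. (22)]
print(round(alpha[1], 2), round(beta[1], 2))  # alpha_X and beta_X nearly coincide [Eq. (24)]
```

Replacing the second line of the data-generating process with, say, `x = 0.5 * w + rng.normal(0, 1, n)` would make Σ_XW β_W ≠ 0 and drive α_X away from β_X, in line with Equations (24) and (26).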
Observe that the vectors of regression coefficients β_X and β_W can be computed by

β_X = (Σ_XX⁻¹ + Γ E⁻¹ Γ′) Σ_XY − Γ E⁻¹ Σ_WY,   (27)

¹⁵ Note that all concepts treated in this paper are not restricted to a univariate treatment variable X. If X is p-variate, a value x of X consists of p values x₁, …, x_p. Since we are now formulating the models in terms of matrix algebra, we will denote the treatment variable by the boldface letter X. The same holds true, of course, for W.
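Equations (27) and (28) are just the block form of the joint least-squares solution. As a sanity check, the sketch below (invented covariance values, p = 2, q = 1) computes β_X and β_W from the blocks and compares them with directly solving the joint normal equations Σ (β_X′, β_W′)′ = (Σ_XY′, Σ_WY′)′.

```python
import numpy as np

# Invented covariance blocks for (X1, X2, W, Y); the joint 3x3 matrix must be regular.
s_xx = np.array([[2.0, 0.3], [0.3, 1.5]])   # Sigma_XX
s_xw = np.array([[0.4], [0.2]])             # Sigma_XW
s_ww = np.array([[1.0]])                    # Sigma_WW
s_xy = np.array([1.1, 0.7])                 # Sigma_XY
s_wy = np.array([0.9])                      # Sigma_WY

gamma = np.linalg.solve(s_xx, s_xw)              # Gamma = Sigma_XX^{-1} Sigma_XW
e = s_ww - s_xw.T @ np.linalg.solve(s_xx, s_xw)  # E = Sigma_WW - Sigma_WX Sigma_XX^{-1} Sigma_XW
e_inv = np.linalg.inv(e)

beta_x = (np.linalg.inv(s_xx) + gamma @ e_inv @ gamma.T) @ s_xy - gamma @ e_inv @ s_wy  # Eq. (27)
beta_w = e_inv @ (s_wy - gamma.T @ s_xy)                                                # Eq. (28)

# Direct solution of the joint normal equations for comparison.
sigma = np.block([[s_xx, s_xw], [s_xw.T, s_ww]])
beta = np.linalg.solve(sigma, np.concatenate([s_xy, s_wy]))
assert np.allclose(np.concatenate([beta_x, beta_w]), beta)
print("block formulas agree with the joint solution")
```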

and

β_W = E⁻¹ (Σ_WY − Γ′ Σ_XY),   (28)

where E := Σ_WW − Σ_WX Σ_XX⁻¹ Σ_XW and Γ := Σ_XX⁻¹ Σ_XW (see, e.g., Seber, 1984). Also note that, for p = q = 1, Equations (27) and (28) simplify to the well-known formulas for partial regression coefficients:

β_X = (σ²_W σ_XY − σ_WX σ_WY) / (σ²_X σ²_W − σ²_WX)   (29)

and

β_W = (σ²_X σ_WY − σ_WX σ_XY) / (σ²_X σ²_W − σ²_WX).   (30)

In Theorem 7 we presuppose that the regression E(Y | X, W) is additive. In particular, Equation (22) does not allow for multiplicative terms such as X·W, which occur, e.g., in the following equation:

E(Y | X, W) = β₀ + β₁X + β₂W + β₃X·W
            = (β₀ + β₂W) + (β₁ + β₃W) X
            = g₀(W) + g₁(W) X.   (31)

Note that E(Y | X, W) can always be parameterized in the form of Equation (31) if both X and W are dichotomous. In other cases, however, Equation (31) may not hold for the regression E(Y | X, W). The values of the slope coefficient function g₁(W) are the slope coefficients of the (W = w)-conditional linear regressions E_{W=w}(Y | X) = g₀(w) + g₁(w) X. If, e.g., W is gender, E_{W=m}(Y | X) may represent the regression of Y on X in the subpopulation of males (m).

In the next theorem, we consider a special case which allows for a p-dimensional regressor X = (X₁, …, X_p)′. The basic idea behind Equation (32) is that, for each given value w of W, the regression of Y on X is an additive linear multiple regression with the p-variate numerical regressor X = (X₁, …, X_p)′.

Theorem 8. If (a) the treatment regression E(Y | X) is unconfounded, (b) X: Ω → ℝᵖ, (c) W = f(U), and (d) there are functions g₀, …, g_p such that

E(Y | X, W) = g₀(W) + g₁(W) X₁ + … + g_p(W) X_p,   (32)

then

E(Y | X) = α₀ + α₁X₁ + … + α_p X_p   (33)

with

α₀ = E[g₀(W)], α₁ = E[g₁(W)], …, α_p = E[g_p(W)].   (34)

According to this theorem, if unconfoundedness holds, the partial regression coefficients of the regressors X₁, …, X_p are the expected values of the corresponding (W = w)-conditional partial regression coefficients. Furthermore, this result may also be used for falsification trials of the hypothesis that E(Y | X) is unconfounded: if Equations (33) and (34) do not hold although Equation (32) does, then we can reject the hypothesis that E(Y | X) is unconfounded. For p = 1, this theorem shows how the slope coefficient α₁ of a simple linear regression E(Y | X) = α₀ + α₁X is related to the slope coefficient function g₁(W) if unconfoundedness holds. A special case of Equation (32) results if X₁ := X¹, …, X_p := Xᵖ, i.e., if the regression of Y on X is a polynomial function of the one-dimensional numerical regressor X.

10. Discussion

In this paper we introduced the concept of unconfoundedness of a treatment regression E(Y | X) and showed its relationship to Neyman's and Rubin's concepts of individual and average causal effects. We discussed several criteria that are relevant in causal modeling. The weakest is causal unbiasedness of the conditional expected values E(Y | X = x) [see Eq. (1)], which has the disadvantage of being nonfalsifiable. Other conditions were independence of U and X, unit-treatment homogeneity [E(Y | X, U) = E(Y | X)], and the and/or-combination of the two. All these conditions are sufficient conditions for causal unbiasedness, but they are unnecessarily strong. We finally introduced unconfoundedness as the weakest empirically falsifiable sufficient condition for causal unbiasedness. We defined a regression E(Y | X) to be unconfounded if, for each value x of X:

(a) P(X = x | U = u) = P(X = x) for each value u of U, or

(b) E(Y | X = x, U = u) = E(Y | X = x) for each value u of U.
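Theorem 8 can be illustrated on a small exact population, constructed here for illustration: with X randomized (so X is independent of W), the slope of E(Y | X) equals the P(W = w)-weighted average of the conditional slopes g₁(w).

```python
# Invented conditionally linear population: E(Y | X, W) = g0(W) + g1(W) * X  [Eq. (32), p = 1]
g0 = {0: 100.0, 1: 110.0}
g1 = {0: 2.0, 1: 6.0}
p_w = {0: 0.4, 1: 0.6}
p_x = {0: 0.5, 1: 0.3, 2: 0.2}   # randomized: X independent of W

# E(Y | X = x) = sum_w E(Y | X = x, W = w) P(W = w | X = x); independence
# makes P(W = w | X = x) = P(W = w).
e_y_x = {x: sum((g0[w] + g1[w] * x) * p_w[w] for w in p_w) for x in p_x}

# Under Eqs. (33)/(34) this function of x is linear with slope E[g1(W)].
alpha1 = e_y_x[1] - e_y_x[0]
expected = sum(g1[w] * p_w[w] for w in p_w)   # E[g1(W)] = 0.4*2 + 0.6*6 = 4.4
assert abs(alpha1 - expected) < 1e-9
assert abs((e_y_x[2] - e_y_x[1]) - alpha1) < 1e-9   # E(Y | X) is linear in x
print(round(alpha1, 2))  # 4.4
```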
This concept goes beyond the observational level and still has many implications, some of which are empirically testable. Unconfoundedness implies that the conditional expected values E(Y | X = x) and their differences E(Y | X = x_i) − E(Y | X = x_j), the prima facie effects, are causally unbiased, i.e., E(Y | X = x_i) − E(Y | X = x_j) = ACE(i, j). This provides the link to the concept of the average causal effect. Furthermore, we showed that unconfoundedness of E(Y | X) is sufficient to imply that the conditional expected values E(Y | X = x, W = w) and their differences E(Y | X = x_i, W = w) − E(Y | X = x_j, W = w) are causally unbiased in every subpopulation W = w (see Corollary 2).

The sufficient conditions for unconfoundedness of E(Y | X) may guide us in the design of experiments. Most importantly, however, they make clear that random assignment of units to experimental conditions serves to secure stochastic independence of X and U, which implies unconfoundedness and causal unbiasedness. Such an experiment with random assignment of observational units to experimental conditions is in fact the only way to deliberately create the circumstances in which average causal effects can be estimated. Although unit-treatment homogeneity [i.e., E(Y | X, U) = E(Y | X)] is also sufficient for unconfoundedness (see Th. 3), this condition cannot deliberately be created through techniques of experimental design. Nevertheless, both sufficient conditions for unconfoundedness, independence of X and U and unit-treatment homogeneity, show that causal modeling in randomized and nonrandomized experiments aims at the same goal: to secure causal unbiasedness. Hence, if it were possible to generalize the theory presented such that it could also cover causal modeling in observational (i.e., nonexperimental) studies, in which the X-variables do not represent experimental conditions, the two traditions of theorizing about causality in statistics identified by Cox (1992), the experimental and the observational, could be integrated into a single theory.

The necessary conditions for unconfoundedness of E(Y | X) may guide us in nonrandomized experimental causal modeling, because they provide empirically testable consequences. The theorems presented show which statistical hypotheses have to be tested and rejected in order to falsify the hypothesis of unconfoundedness of E(Y | X) in a specific application. What happens if a W has been identified with respect to which there is confounding? First, we may look at the regressive dependencies of Y on X within each subpopulation defined by W = w, i.e., we may look at the conditional effects.
In fact, this provides more detailed information about the conditional regressive dependence of Y on X given W = w. Note that this procedure is in full accordance with the framework presented in this paper; one would just replace the total population Ω_U by the subpopulation defined by W = w. This procedure is, in principle, equivalent to statistically controlling for the covariate vector W, provided that the equation for the regression E(Y | X, W) is adequately specified. Second, if we are willing to assume that the conditional regressions E_{W=w}(Y | X) are unbiased, we may also compute the average causal effects via Equation (16), because under this assumption ACE(i, j) = E_adj for W(Y | X = x_i) − E_adj for W(Y | X = x_j) (for details, see Wüthrich-Martone, Steyer, Nachtigall & Suhl, 1999).

A limitation requiring a more general formulation of our approach follows from assuming P(X = x, U = u) > 0. Whereas this assumption has been useful for a simple presentation, it restricts generality because it is not compatible with the assumption

that X has a normal distribution, for instance, which is often made in nonexperimental causal modeling (see, e.g., Bollen, 1989). Since normality is not really necessary in structural equation models (see, e.g., Browne & Arminger, 1995), this restriction is not too serious. Nevertheless, we should look for a more general theory which is not restricted to discrete treatments and units.

Another open problem is how the theory presented relates to statistical sampling models. In particular, statistical tests and procedures have to be developed to help us decide whether or not unconfoundedness holds in a specific application. For some cases, these tests and procedures have already been presented (see, e.g., Allison, 1995; Clogg et al., 1992; Clogg et al., 1995; Steyer, Gabler, Rucai & Schuster, 1997; von Davier, 2000); for others they still need to be developed. A significance test of unconfoundedness in very large samples may not really be meaningful in many empirical applications, because the exact null hypothesis will rarely hold in nonrandomized experiments. Instead, one might rather be interested in estimating a coefficient reflecting the strength of confounding of a regression E(Y | X) with respect to a specific variable W. Such a coefficient has been proposed by Steyer, Gabler, and Rucai (1996). More research and experience will be necessary to work out principles dealing with the practical relevance of confounding in concrete empirical studies.

References

[1] Allison, P. D. (1995). The impact of random predictors on comparisons of coefficients between models: Comment on Clogg, Petkova, and Haritou. American Journal of Sociology, 100,

[2] Bollen, K. A. (1989). Structural equation models with latent variables. New York: Wiley.

[3] Browne, M. W. & Arminger, G. (1995). Specification and estimation of mean- and covariance-structure models. In G. Arminger, C. C. Clogg & M. E.
Sobel (eds.), Handbook of Statistical Modeling for the Social and Behavioral Sciences (pp. ). New York: Plenum.

[4] Clogg, C. C., Petkova, E., & Shihadeh, E. S. (1992). Statistical methods for analyzing collapsibility in regression models. Journal of Educational Statistics, 17,

[5] Clogg, C. C., Petkova, E., & Haritou, A. (1995). Statistical methods for comparing regression coefficients between models. American Journal of Sociology, 100,


More information

Answers to Problem Set #4

Answers to Problem Set #4 Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2

More information

Dealing with the assumption of independence between samples - introducing the paired design.

Dealing with the assumption of independence between samples - introducing the paired design. Dealing with the assumption of independence between samples - introducing the paired design. a) Suppose you deliberately collect one sample and measure something. Then you collect another sample in such

More information

Key Algebraic Results in Linear Regression

Key Algebraic Results in Linear Regression Key Algebraic Results in Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 30 Key Algebraic Results in

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

Approximate analysis of covariance in trials in rare diseases, in particular rare cancers

Approximate analysis of covariance in trials in rare diseases, in particular rare cancers Approximate analysis of covariance in trials in rare diseases, in particular rare cancers Stephen Senn (c) Stephen Senn 1 Acknowledgements This work is partly supported by the European Union s 7th Framework

More information

Logistic Regression: Regression with a Binary Dependent Variable

Logistic Regression: Regression with a Binary Dependent Variable Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression

More information

1 Correlation between an independent variable and the error

1 Correlation between an independent variable and the error Chapter 7 outline, Econometrics Instrumental variables and model estimation 1 Correlation between an independent variable and the error Recall that one of the assumptions that we make when proving the

More information

Introduction to Machine Learning Fall 2017 Note 5. 1 Overview. 2 Metric

Introduction to Machine Learning Fall 2017 Note 5. 1 Overview. 2 Metric CS 189 Introduction to Machine Learning Fall 2017 Note 5 1 Overview Recall from our previous note that for a fixed input x, our measurement Y is a noisy measurement of the true underlying response f x):

More information

A Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts

A Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts A Study of Statistical Power and Type I Errors in Testing a Factor Analytic Model for Group Differences in Regression Intercepts by Margarita Olivera Aguilar A Thesis Presented in Partial Fulfillment of

More information

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review

More information

Analysing data: regression and correlation S6 and S7

Analysing data: regression and correlation S6 and S7 Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

Lecture 6: Univariate Volatility Modelling: ARCH and GARCH Models

Lecture 6: Univariate Volatility Modelling: ARCH and GARCH Models Lecture 6: Univariate Volatility Modelling: ARCH and GARCH Models Prof. Massimo Guidolin 019 Financial Econometrics Winter/Spring 018 Overview ARCH models and their limitations Generalized ARCH models

More information

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n = Hypothesis testing I I. What is hypothesis testing? [Note we re temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,

More information

Causality II: How does causal inference fit into public health and what it is the role of statistics?

Causality II: How does causal inference fit into public health and what it is the role of statistics? Causality II: How does causal inference fit into public health and what it is the role of statistics? Statistics for Psychosocial Research II November 13, 2006 1 Outline Potential Outcomes / Counterfactual

More information

Popper s Measure of Corroboration and P h b

Popper s Measure of Corroboration and P h b Popper s Measure of Corroboration and P h b Darrell P. Rowbottom This paper shows that Popper s measure of corroboration is inapplicable if, as Popper also argued, the logical probability of synthetic

More information

Test Yourself! Methodological and Statistical Requirements for M.Sc. Early Childhood Research

Test Yourself! Methodological and Statistical Requirements for M.Sc. Early Childhood Research Test Yourself! Methodological and Statistical Requirements for M.Sc. Early Childhood Research HOW IT WORKS For the M.Sc. Early Childhood Research, sufficient knowledge in methods and statistics is one

More information

The decision theoretic approach to causal inference OR Rethinking the paradigms of causal modelling

The decision theoretic approach to causal inference OR Rethinking the paradigms of causal modelling The decision theoretic approach to causal inference OR Rethinking the paradigms of causal modelling A.P.Dawid 1 and S.Geneletti 2 1 University of Cambridge, Statistical Laboratory 2 Imperial College Department

More information

REVIEW 8/2/2017 陈芳华东师大英语系

REVIEW 8/2/2017 陈芳华东师大英语系 REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p

More information

A Rejoinder to Mackintosh and some Remarks on the. Concept of General Intelligence

A Rejoinder to Mackintosh and some Remarks on the. Concept of General Intelligence A Rejoinder to Mackintosh and some Remarks on the Concept of General Intelligence Moritz Heene Department of Psychology, Ludwig Maximilian University, Munich, Germany. 1 Abstract In 2000 Nicholas J. Mackintosh

More information

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014 Assess Assumptions and Sensitivity Analysis Fan Li March 26, 2014 Two Key Assumptions 1. Overlap: 0

More information

A Note on Bootstraps and Robustness. Tony Lancaster, Brown University, December 2003.

A Note on Bootstraps and Robustness. Tony Lancaster, Brown University, December 2003. A Note on Bootstraps and Robustness Tony Lancaster, Brown University, December 2003. In this note we consider several versions of the bootstrap and argue that it is helpful in explaining and thinking about

More information

OUTLINE THE MATHEMATICS OF CAUSAL INFERENCE IN STATISTICS. Judea Pearl University of California Los Angeles (www.cs.ucla.

OUTLINE THE MATHEMATICS OF CAUSAL INFERENCE IN STATISTICS. Judea Pearl University of California Los Angeles (www.cs.ucla. THE MATHEMATICS OF CAUSAL INFERENCE IN STATISTICS Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea) OUTLINE Modeling: Statistical vs. Causal Causal Models and Identifiability to

More information

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module

More information

Introduction to Econometrics. Review of Probability & Statistics

Introduction to Econometrics. Review of Probability & Statistics 1 Introduction to Econometrics Review of Probability & Statistics Peerapat Wongchaiwat, Ph.D. wongchaiwat@hotmail.com Introduction 2 What is Econometrics? Econometrics consists of the application of mathematical

More information

Imbens/Wooldridge, IRP Lecture Notes 2, August 08 1

Imbens/Wooldridge, IRP Lecture Notes 2, August 08 1 Imbens/Wooldridge, IRP Lecture Notes 2, August 08 IRP Lectures Madison, WI, August 2008 Lecture 2, Monday, Aug 4th, 0.00-.00am Estimation of Average Treatment Effects Under Unconfoundedness, Part II. Introduction

More information

Estimating the Marginal Odds Ratio in Observational Studies

Estimating the Marginal Odds Ratio in Observational Studies Estimating the Marginal Odds Ratio in Observational Studies Travis Loux Christiana Drake Department of Statistics University of California, Davis June 20, 2011 Outline The Counterfactual Model Odds Ratios

More information

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels.

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. Contingency Tables Definition & Examples. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. (Using more than two factors gets complicated,

More information

Relative Improvement by Alternative Solutions for Classes of Simple Shortest Path Problems with Uncertain Data

Relative Improvement by Alternative Solutions for Classes of Simple Shortest Path Problems with Uncertain Data Relative Improvement by Alternative Solutions for Classes of Simple Shortest Path Problems with Uncertain Data Part II: Strings of Pearls G n,r with Biased Perturbations Jörg Sameith Graduiertenkolleg

More information

Three-Level Modeling for Factorial Experiments With Experimentally Induced Clustering

Three-Level Modeling for Factorial Experiments With Experimentally Induced Clustering Three-Level Modeling for Factorial Experiments With Experimentally Induced Clustering John J. Dziak The Pennsylvania State University Inbal Nahum-Shani The University of Michigan Copyright 016, Penn State.

More information

Linear Regression and Its Applications

Linear Regression and Its Applications Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

1 Least Squares Estimation - multiple regression.

1 Least Squares Estimation - multiple regression. Introduction to multiple regression. Fall 2010 1 Least Squares Estimation - multiple regression. Let y = {y 1,, y n } be a n 1 vector of dependent variable observations. Let β = {β 0, β 1 } be the 2 1

More information

Bayesian Inference for Sequential Treatments under Latent Sequential Ignorability. June 19, 2017

Bayesian Inference for Sequential Treatments under Latent Sequential Ignorability. June 19, 2017 Bayesian Inference for Sequential Treatments under Latent Sequential Ignorability Alessandra Mattei, Federico Ricciardi and Fabrizia Mealli Department of Statistics, Computer Science, Applications, University

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

Summary and discussion of The central role of the propensity score in observational studies for causal effects

Summary and discussion of The central role of the propensity score in observational studies for causal effects Summary and discussion of The central role of the propensity score in observational studies for causal effects Statistics Journal Club, 36-825 Jessica Chemali and Michael Vespe 1 Summary 1.1 Background

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Linear Model Under General Variance

Linear Model Under General Variance Linear Model Under General Variance We have a sample of T random variables y 1, y 2,, y T, satisfying the linear model Y = X β + e, where Y = (y 1,, y T )' is a (T 1) vector of random variables, X = (T

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Distribution-Free Procedures (Devore Chapter Fifteen)

Distribution-Free Procedures (Devore Chapter Fifteen) Distribution-Free Procedures (Devore Chapter Fifteen) MATH-5-01: Probability and Statistics II Spring 018 Contents 1 Nonparametric Hypothesis Tests 1 1.1 The Wilcoxon Rank Sum Test........... 1 1. Normal

More information

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Kosuke Imai Department of Politics Princeton University November 13, 2013 So far, we have essentially assumed

More information

WHEN ARE PROBABILISTIC EXPLANATIONS

WHEN ARE PROBABILISTIC EXPLANATIONS PATRICK SUPPES AND MARIO ZANOTTI WHEN ARE PROBABILISTIC EXPLANATIONS The primary criterion of ade uacy of a probabilistic causal analysis is causal that the variable should render simultaneous the phenomenological

More information

OUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS. Judea Pearl University of California Los Angeles (www.cs.ucla.

OUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS. Judea Pearl University of California Los Angeles (www.cs.ucla. OUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea/) Statistical vs. Causal vs. Counterfactual inference: syntax and semantics

More information

Lecture 10: F -Tests, ANOVA and R 2

Lecture 10: F -Tests, ANOVA and R 2 Lecture 10: F -Tests, ANOVA and R 2 1 ANOVA We saw that we could test the null hypothesis that β 1 0 using the statistic ( β 1 0)/ŝe. (Although I also mentioned that confidence intervals are generally

More information

, (1) e i = ˆσ 1 h ii. c 2016, Jeffrey S. Simonoff 1

, (1) e i = ˆσ 1 h ii. c 2016, Jeffrey S. Simonoff 1 Regression diagnostics As is true of all statistical methodologies, linear regression analysis can be a very effective way to model data, as along as the assumptions being made are true. For the regression

More information

Research Note: A more powerful test statistic for reasoning about interference between units

Research Note: A more powerful test statistic for reasoning about interference between units Research Note: A more powerful test statistic for reasoning about interference between units Jake Bowers Mark Fredrickson Peter M. Aronow August 26, 2015 Abstract Bowers, Fredrickson and Panagopoulos (2012)

More information

Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems

Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Functional form misspecification We may have a model that is correctly specified, in terms of including

More information

Midterm 2 - Solutions

Midterm 2 - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put

More information

Bias Variance Trade-off

Bias Variance Trade-off Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]

More information

Significance Testing with Incompletely Randomised Cases Cannot Possibly Work

Significance Testing with Incompletely Randomised Cases Cannot Possibly Work Human Journals Short Communication December 2018 Vol.:11, Issue:2 All rights are reserved by Stephen Gorard FRSA FAcSS Significance Testing with Incompletely Randomised Cases Cannot Possibly Work Keywords:

More information

Review of the General Linear Model

Review of the General Linear Model Review of the General Linear Model EPSY 905: Multivariate Analysis Online Lecture #2 Learning Objectives Types of distributions: Ø Conditional distributions The General Linear Model Ø Regression Ø Analysis

More information

Potential Outcomes and Causal Inference I

Potential Outcomes and Causal Inference I Potential Outcomes and Causal Inference I Jonathan Wand Polisci 350C Stanford University May 3, 2006 Example A: Get-out-the-Vote (GOTV) Question: Is it possible to increase the likelihood of an individuals

More information

Evaluating sensitivity of parameters of interest to measurement invariance using the EPC-interest

Evaluating sensitivity of parameters of interest to measurement invariance using the EPC-interest Evaluating sensitivity of parameters of interest to measurement invariance using the EPC-interest Department of methodology and statistics, Tilburg University WorkingGroupStructuralEquationModeling26-27.02.2015,

More information

Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1

Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1 Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1 Gary King GaryKing.org April 13, 2014 1 c Copyright 2014 Gary King, All Rights Reserved. Gary King ()

More information

Reconciling factor-based and composite-based approaches to structural equation modeling

Reconciling factor-based and composite-based approaches to structural equation modeling Reconciling factor-based and composite-based approaches to structural equation modeling Edward E. Rigdon (erigdon@gsu.edu) Modern Modeling Methods Conference May 20, 2015 Thesis: Arguments for factor-based

More information

Stochastic dominance with imprecise information

Stochastic dominance with imprecise information Stochastic dominance with imprecise information Ignacio Montes, Enrique Miranda, Susana Montes University of Oviedo, Dep. of Statistics and Operations Research. Abstract Stochastic dominance, which is

More information

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Overview In observational and experimental studies, the goal may be to estimate the effect

More information

Alternatives to Difference Scores: Polynomial Regression and Response Surface Methodology. Jeffrey R. Edwards University of North Carolina

Alternatives to Difference Scores: Polynomial Regression and Response Surface Methodology. Jeffrey R. Edwards University of North Carolina Alternatives to Difference Scores: Polynomial Regression and Response Surface Methodology Jeffrey R. Edwards University of North Carolina 1 Outline I. Types of Difference Scores II. Questions Difference

More information

y response variable x 1, x 2,, x k -- a set of explanatory variables

y response variable x 1, x 2,, x k -- a set of explanatory variables 11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate

More information

Rerandomization to Balance Covariates

Rerandomization to Balance Covariates Rerandomization to Balance Covariates Kari Lock Morgan Department of Statistics Penn State University Joint work with Don Rubin University of Minnesota Biostatistics 4/27/16 The Gold Standard Randomized

More information

Machine Learning Linear Regression. Prof. Matteo Matteucci

Machine Learning Linear Regression. Prof. Matteo Matteucci Machine Learning Linear Regression Prof. Matteo Matteucci Outline 2 o Simple Linear Regression Model Least Squares Fit Measures of Fit Inference in Regression o Multi Variate Regession Model Least Squares

More information