Effect Modification and Interaction


By Sander Greenland
University of California, Los Angeles, CA, USA

This article was originally published online in 2008 in Encyclopedia of Quantitative Risk Analysis and Assessment, © John Wiley & Sons, Ltd and republished in Wiley StatsRef: Statistics Reference Online. Copyright © 2008 John Wiley & Sons, Ltd. All rights reserved.

Keywords: antagonism, causal coaction, effect-measure modification, effect modification, heterogeneity of effect, interaction, synergism

Abstract: This article discusses definitions and concepts of effect modification, interaction, synergism, and related concepts and terms.

The term effect modification has been applied to two distinct phenomena. For the first phenomenon, effect modification simply means that some chosen measure of effect varies across levels of background variables. This phenomenon is thus more precisely termed effect-measure modification, and in the statistics literature is more often termed heterogeneity or interaction [1]. For the second phenomenon, effect modification means that the mechanism of effect differs with background variables, which is known in the biomedical literature as dependent action or (again) interaction. The two phenomena are often confused, as reflected by the use of the same terms (effect modification and interaction) for both. In fact they have only limited points of contact.

1 Effect-Measure Modification (Heterogeneity of Effect)

To make the concepts and distinctions precise, suppose we are studying the effects that change in a variable X will have on a subsequent variable Y, in the presence of a background variable Z that precedes X and Y. For example, X might be treatment level such as dose or treatment arm, Y might be a health outcome variable such as life expectancy following treatment, and Z might be sex (1 = female, 0 = male). To measure effects, write $Y_x$ for the outcome one would have if administered treatment level $x$ of X; for example, if X = 1 for active treatment and X = 0 for placebo, then $Y_1$ is the outcome a subject will have if X = 1 is administered, and $Y_0$ is the outcome a subject will have if X = 0 is administered. The $Y_x$ are often called potential outcomes (see Causality/Causation).

One measure of the effect of changing X from 0 to 1 on the outcome is the difference $Y_1 - Y_0$; for example, if Y were life expectancy, $Y_1 - Y_0$ would be the change in life expectancy. If this difference varied with sex in a systematic fashion, one could say that the difference was modified by sex, or that there was heterogeneity of the difference across sex. Another common measure of effect is the ratio $Y_1/Y_0$; if this ratio varied with sex in a systematic fashion, one could say that the ratio was modified by sex.

For purely algebraic reasons, two measures may be modified in very different ways by the same variable. Furthermore, if both X and Z affect Y, absence of modification of the difference implies modification of the ratio, and vice versa.

For example, suppose for the subjects under study $Y_1 = 20$ and $Y_0 = 10$ for all the males, but $Y_1 = 30$ and $Y_0 = 15$ for all the females. Then $Y_1 - Y_0 = 10$ for males but $Y_1 - Y_0 = 15$ for females, so there is a 5-year modification of the difference measure by sex. However, suppose we measured the effects by expectancy ratios $Y_1/Y_0$ instead of differences. Then $Y_1/Y_0 = 20/10 = 2$ for males and $Y_1/Y_0 = 30/15 = 2$ for females as well, so there is no modification of the ratio measure by sex.

Consider next an example in which $Y_1 = 20$ and $Y_0 = 10$ for males, and $Y_1 = 30$ and $Y_0 = 20$ for females. Then $Y_1 - Y_0 = 10$ for both males and females, so there is no modification of the difference by sex. However, $Y_1/Y_0 = 20/10 = 2$ for males and $Y_1/Y_0 = 30/20 = 1.5$ for females, so there is modification of the ratio by sex.

Finally, suppose $Y_1 = 20$ and $Y_0 = 10$ for males, and $Y_1 = 60$ and $Y_0 = 40$ for females. Then $Y_1 - Y_0 = 10$ for males and $Y_1 - Y_0 = 20$ for females, so the Y difference is smaller among males than among females. However, $Y_1/Y_0 = 20/10 = 2$ for males and $Y_1/Y_0 = 60/40 = 1.5$ for females, so the Y ratio is larger among males than among females. Thus, modification can be in the opposite direction for different measures of effect.
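The three numerical examples above can be checked with a short sketch. The function and variable names here are illustrative only and do not come from the article; the potential-outcome values are those given in the text.

```python
# Potential outcomes (Y1, Y0) by sex for the three examples in the text.
examples = {
    "example 1": {"male": (20, 10), "female": (30, 15)},
    "example 2": {"male": (20, 10), "female": (30, 20)},
    "example 3": {"male": (20, 10), "female": (60, 40)},
}


def effect_measures(y1, y0):
    """Return the difference Y1 - Y0 and the ratio Y1 / Y0."""
    return y1 - y0, y1 / y0


def summarize(by_sex):
    """Compute both measures for each sex and flag modification by sex."""
    measures = {sex: effect_measures(y1, y0) for sex, (y1, y0) in by_sex.items()}
    diffs = {d for d, _ in measures.values()}
    ratios = {r for _, r in measures.values()}
    return {
        "measures": measures,
        "difference_modified": len(diffs) > 1,  # difference varies with sex
        "ratio_modified": len(ratios) > 1,      # ratio varies with sex
    }


results = {name: summarize(by_sex) for name, by_sex in examples.items()}

for name, res in results.items():
    print(name, res)
```

In the first example only the difference is modified, in the second only the ratio, and in the third both are modified but in opposite directions.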
2 Biologic Interaction

The preceding examples show that one should not, in general, equate the presence or absence of effect-measure modification to the presence or absence of interactions in the biologic (mechanistic) sense, because effect-measure modification depends entirely on what measure one chooses to examine, whereas the mechanism is the same regardless of that choice. Nonetheless, it is possible to formulate mechanisms of action that imply homogeneity (no modification) of a particular measure. For such a mechanism, the observation of heterogeneity in that measure can be taken as evidence against the mechanism (assuming of course that the observations are valid). It would be fallacious, however, to infer the mechanism is correct if homogeneity was observed, for the usual reason that many other mechanisms (some unimagined) would imply the observation.

A classic example is the simple independent-action model for the effect of X and Z on Y, in which subjects affected by changes in X are disjoint from subjects affected by changes in Z [2, 3]. This model implies homogeneity (absence of modification by Z) of the average X effect on Y when that effect is measured by the difference in the average Y. In particular, suppose Y is a disease indicator (1 if disease occurs, 0 if not). Then the average of Y is the proportion getting disease (the incidence proportion, often called the risk) and the average Y difference is the risk difference. Thus, in this context, the independent-action model implies that the risk difference for the effect of X on Y will be constant across levels of Z; in other words, the risk difference will be homogeneous across Z, or unmodified by Z. If X and Z both have effects, this homogeneity of the difference forces ratio measures of the effect of X on Y to be heterogeneous across Z. When additional factors are present in the model (such as confounders), homogeneity of the risk differences can also lead to heterogeneity of the excess risk ratios [4].
Thus, under the simple independent-action model, the independence of the X and Z effects will cause the measures other than the risk difference to be heterogeneous, or modified, across Z. Biologic models for the mechanism of X and Z interactions can lead to other patterns. For example, certain multistage models in which X and Z act at completely separate stages of a multistage mechanism can lead to homogeneity of ratios rather than differences, as well as particular dose-response patterns. Special caution is needed in interpreting observed patterns, however, because converse relations do not hold. Many different plausible biologic models will imply identical patterns in the effect measures [5].
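A minimal sketch of the simple independent-action model may make the pattern concrete. It assumes a population made up of four disjoint response types with purely causative effects; the type proportions and all names are hypothetical values chosen for illustration, not figures from the article.

```python
# Hypothetical proportions of response types (assumed values for illustration).
p_doomed = 0.05  # Y = 1 whatever X and Z are
p_x_only = 0.10  # Y = 1 if and only if X = 1 (affected by X, not by Z)
p_z_only = 0.20  # Y = 1 if and only if Z = 1 (affected by Z, not by X)
# The remaining 0.65 are unaffected: Y = 0 whatever X and Z are.


def risk(x, z):
    """Average of Y (incidence proportion) when everyone has X = x and Z = z."""
    return p_doomed + x * p_x_only + z * p_z_only


def x_effect(z):
    """Risk difference and risk ratio for X = 1 versus X = 0 at level z of Z."""
    r1, r0 = risk(1, z), risk(0, z)
    return r1 - r0, r1 / r0


rd_z0, rr_z0 = x_effect(0)
rd_z1, rr_z1 = x_effect(1)

print(f"Z = 0: risk difference = {rd_z0:.2f}, risk ratio = {rr_z0:.2f}")
print(f"Z = 1: risk difference = {rd_z1:.2f}, risk ratio = {rr_z1:.2f}")
```

With these assumed proportions the risk difference is 0.10 at both levels of Z, while the risk ratio is 3.0 at Z = 0 and 1.4 at Z = 1, so the ratio is modified by Z even though the effects of X and Z are independent.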

2.1 Synergism and Antagonism

Taking the simple independent-action model as a baseline, one may offer the following dependent-action definitions for an outcome indicator Y as a function of the causal antecedents X and Z. Synergism of X = 1 and Z = 1 in causing Y = 1 is defined as necessity and sufficiency of X = 1 and Z = 1 for causing Y = 1, i.e., Y = 1 if and only if X = 1 and Z = 1. We may also say that Y = 1 in a given individual would be a synergistic response to X = 1 and Z = 1 if Y = 0 would have occurred instead if either X = 0 or Z = 0 or both. In potential-outcome notation where $Y_{xz}$ is the outcome when X = x and Z = z, this definition says synergistic responders have $Y_{11} = 1$ and $Y_{10} = Y_{01} = Y_{00} = 0$.

Antagonism of X = 1 by Z = 1 in causing Y = 1 is defined as necessity and sufficiency of Z = 0 in order for X = 1 to cause Y = 1. This definition says antagonistic responders to X or Z have $Y_{10} = 1$ or $Y_{01} = 1$ or both, and $Y_{11} = Y_{00} = 0$.

With these definitions, synergism and antagonism are not logically distinct concepts, but depend on the coding of X and Z. For example, switching the labels of exposed and unexposed for one factor can change apparent synergy to apparent antagonism, and vice versa [1, 2]. The only label-invariant property is whether the effect of X on a given person is altered by the level of Z, i.e., the action of X is dependent on Z. If so, by definition we have biologic interaction.

Absence of any synergistic or antagonistic interaction among levels of X and Z implies homogeneity (absence of modification by Z) of the average X effect across levels of Z when the X effect is measured by the differences in Y across levels of X [2, 6]. The converse is false, however.
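The response-type definitions above, their dependence on coding, and one way in which homogeneous differences can coexist with dependent action can be sketched as follows. The tuple layout, function names, and the 50/50 population are assumptions made for illustration.

```python
from statistics import mean

# A response type is the tuple of potential outcomes (Y11, Y10, Y01, Y00),
# where Yxz is the outcome when X = x and Z = z.
SYNERGISTIC = (1, 0, 0, 0)     # Y = 1 if and only if X = 1 and Z = 1
ANTAGONISTIC_X = (0, 1, 0, 0)  # X = 1 causes Y = 1 only when Z = 0


def classify(response_type):
    """Label a response type using the dependent-action definitions."""
    y11, y10, y01, y00 = response_type
    if (y11, y10, y01, y00) == (1, 0, 0, 0):
        return "synergistic"
    if y11 == 0 and y00 == 0 and (y10 == 1 or y01 == 1):
        return "antagonistic"
    return "other"


def recode_z(response_type):
    """Swap the labels Z = 1 and Z = 0: the new Yxz equals the old Yx(1-z)."""
    y11, y10, y01, y00 = response_type
    return (y10, y11, y00, y01)


def average_x_effect(population, z):
    """Average of Y1z - Y0z over the population at level z of Z."""
    if z == 1:
        return mean(y11 - y01 for y11, _, y01, _ in population)
    return mean(y10 - y00 for _, y10, _, y00 in population)


# Relabeling Z turns a synergistic responder into an antagonistic one.
print(classify(SYNERGISTIC), "->", classify(recode_z(SYNERGISTIC)))

# Hypothetical population: half synergistic, half antagonistic responders.
population = [SYNERGISTIC] * 50 + [ANTAGONISTIC_X] * 50
effect_z1 = average_x_effect(population, 1)
effect_z0 = average_x_effect(population, 0)
print("average X effect at Z = 1:", effect_z1)
print("average X effect at Z = 0:", effect_z0)
```

In this hypothetical population every individual shows dependent action, yet the average difference is the same at both levels of Z because the synergistic and antagonistic responses offset each other.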
Homogeneity of the difference measures (e.g., lack of modification of the risk difference) does not imply absence of synergy or antagonism, because such homogeneity can arise through other means (e.g., averaging out of the synergistic and antagonistic effects across the population being examined).

A more restrictive set of definitions is based on the sufficient-component cause model of causation [1, 7]. Here, synergism of the indicators X and Z is defined as the presence of X = 1 and Z = 1 in the same sufficient cause of Y = 1, i.e., the sufficient cause cannot act without both X = 1 and Z = 1. Similarly, antagonism of X = 1 by Z = 1 is defined as the presence of X = 1 and Z = 0 in the same sufficient cause of Y = 1. These definitions are also coding dependent.

2.2 Extensions to Continuous Outcomes

The use of indicators in the above definitions may appear restrictive but is not. For example, to subsume a continuous outcome T such as death time, we may define $Y_t$ as the indicator for $T \le t$ and apply the above definitions to each $Y_t$. Similar devices can be applied to incorporate continuous exposure variables [6]. The resulting set of indicators is of course unwieldy, and in application has to be simplified by modeling constraints (e.g., proportional hazards for T).

3 Noncausal (Statistical) Interaction

Both the preceding usages of effect modification and interaction refer to causal phenomena (see Causality/Causation). In the statistics literature, interaction is often used without explicit reference to causality. For example, in the context of regression modeling, an interaction term is usually nothing more than a term involving the product of two or more variables. Consider a logistic regression (see Logistic Regression in Practice) to predict a man's actual sexual preference A (A = 1 for men, 0 for women) from his self-reported preference R and the interviewer's gender G (G = 1 for male, 0 for female),

$$P(A = 1 \mid R, G) = \operatorname{expit}(\alpha + \beta R + \gamma G + \delta R G) \tag{1}$$

where $\operatorname{expit}(x) = e^x/(1 + e^x)$ is the logistic function (see Logistic distribution). Such a model can be useful in correcting for misreporting. The term $\delta R G$ (or sometimes just $R G$ or just $\delta$) is often called an interaction term. It is, however, more accurately called a product term, for presumably neither self-report nor interviewer status has any causal effect on actual preference, and thus they cannot interact causally or modify each other's effect (since there is no effect to modify).

If $\delta \neq 0$, the product term implies that the regression of A on R depends on G: for male interviewers the regression of A on R is

$$P(A = 1 \mid R, G = 1) = \operatorname{expit}(\alpha + \beta R + \gamma \cdot 1 + \delta R \cdot 1) = \operatorname{expit}(\alpha + \gamma + (\beta + \delta) R) \tag{2}$$

whereas for female interviewers the regression of A on R is

$$P(A = 1 \mid R, G = 0) = \operatorname{expit}(\alpha + \beta R + \gamma \cdot 0 + \delta R \cdot 0) = \operatorname{expit}(\alpha + \beta R) \tag{3}$$

Thus we can say that the gender of the interviewer affects or modifies the logistic regression of actual preference on self-report. Nonetheless, since neither interviewer gender nor self-report affects actual preference (biologically or otherwise), they have no biologic interaction.

When both the factors in the regression do causally affect the outcome, it is common to take the presence of a product term in a model as implying biologic interaction, and conversely to take absence of a product term as implying no biologic interaction. Neither inference is correct. The size and even direction of the product term can change with the choice of regression model (e.g., linear versus logistic), whereas biologic interaction is a natural phenomenon oblivious to our choice of model for analysis [1, 8].
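The dependence of the product term on the choice of model can be illustrated with a small sketch. It first checks that under model (1) the log-odds slope in R is $\beta$ for G = 0 and $\beta + \delta$ for G = 1, and then computes the product-term coefficient of a saturated linear model and of a saturated logistic model from the same four outcome probabilities. All coefficient values, probabilities, and function names are hypothetical and chosen only for illustration.

```python
import math


def expit(x):
    """Logistic function e^x / (1 + e^x)."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Inverse of expit: the log odds of p."""
    return math.log(p / (1.0 - p))


def model_1(r, g, alpha, beta, gamma, delta):
    """P(A = 1 | R = r, G = g) under model (1)."""
    return expit(alpha + beta * r + gamma * g + delta * r * g)


def product_terms(p00, p10, p01, p11):
    """Product-term coefficients of the saturated linear and logistic models
    for four outcome probabilities indexed by two binary regressors."""
    linear = p11 - p10 - p01 + p00
    logistic = logit(p11) - logit(p10) - logit(p01) + logit(p00)
    return linear, logistic


# Hypothetical coefficient values for model (1).
alpha, beta, gamma, delta = -2.0, 1.5, 0.3, 0.8
coefs = (alpha, beta, gamma, delta)

# Change in log odds per unit of R, separately for each interviewer gender.
slope_g0 = logit(model_1(1, 0, *coefs)) - logit(model_1(0, 0, *coefs))
slope_g1 = logit(model_1(1, 1, *coefs)) - logit(model_1(0, 1, *coefs))
print("slope for G = 0:", round(slope_g0, 6))  # equals beta
print("slope for G = 1:", round(slope_g1, 6))  # equals beta + delta

# Hypothetical probabilities for which the two product terms differ in sign.
linear, logistic = product_terms(p00=0.10, p10=0.20, p01=0.30, p11=0.45)
print("linear product term:", round(linear, 4))
print("logistic product term:", round(logistic, 4))
```

For these assumed probabilities the linear product term is positive while the logistic product term is negative, even though both models describe the same four probabilities exactly.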
Assuming no bias is present, however, a product term in a linear statistical model for a causal dependency (see Linear Statistical Models for Causation: A Critical Review) can only arise from the presence of biologic interaction in the dependent-action sense [1, 2].

4 Related Articles

Linear Statistical Models for Causation: A Critical Review Interactions with examples Interaction Model: Overview Interaction Effects Interaction Effect modification Compliance with Treatment Allocation

References

[1] Greenland, S., Lash, T.L. & Rothman, K.J. (2008). In Modern Epidemiology, 3rd Edition, K.J. Rothman, S. Greenland & T.L. Lash, eds, Lippincott, Philadelphia, Chapter 5.
[2] Greenland, S. & Poole, C. (1988). Invariants and noninvariants in the concept of interdependent effects, Scandinavian Journal of Work, Environment, and Health 14.
[3] Weinberg, C.R. (1986). Applicability of simple independent-action model to epidemiologic studies involving two factors and a dichotomous outcome, American Journal of Epidemiology 123.
[4] Greenland, S. (1993). Additive-risk versus additive relative-risk models, Epidemiology 4.
[5] Moolgavkar, S. (1986). Carcinogenesis modeling: from molecular biology to epidemiology, Annual Review of Public Health 7.
[6] Greenland, S. (1993). Basic problems in interaction assessment, Environmental Health Perspectives 101(Suppl. 4).
[7] VanderWeele, T.J. & Robins, J.M. (2007). The identification of synergism in the sufficient-component cause framework, Epidemiology 18.
[8] Rothman, K.J. (1976). Causes, American Journal of Epidemiology 104.


Marginal Structural Cox Model for Survival Data with Treatment-Confounder Feedback University of South Carolina Scholar Commons Theses and Dissertations 2017 Marginal Structural Cox Model for Survival Data with Treatment-Confounder Feedback Yanan Zhang University of South Carolina Follow

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

Subgroup analysis using regression modeling multiple regression. Aeilko H Zwinderman

Subgroup analysis using regression modeling multiple regression. Aeilko H Zwinderman Subgroup analysis using regression modeling multiple regression Aeilko H Zwinderman who has unusual large response? Is such occurrence associated with subgroups of patients? such question is hypothesis-generating:

More information

Sampling approaches in animal health studies. Al Ain November 2017

Sampling approaches in animal health studies. Al Ain November 2017 Sampling approaches in animal health studies Al Ain 13-14 November 2017 Definitions Populations and Samples From a statistical point of view: Population is a set of measurements, that may be finite or
