Effect Modification and Interaction


By Sander Greenland
University of California, Los Angeles, CA, USA

Keywords: antagonism, causal coaction, effect-measure modification, effect modification, heterogeneity of effect, interaction, synergism

Abstract: This article discusses definitions and concepts of effect modification, interaction, synergism, and related terms. The term effect modification has been applied to two distinct phenomena. For the first phenomenon, effect modification simply means that some chosen measure of effect varies across levels of background variables. This phenomenon is thus more precisely termed effect-measure modification, and in the statistics literature is more often termed heterogeneity or interaction [1]. For the second phenomenon, effect modification means that the mechanism of effect differs with background variables, which is known in the biomedical literature as dependent action or (again) interaction. The two phenomena are often confused, as reflected by the use of the same terms (effect modification and interaction) for both. In fact they have only limited points of contact.

This article was originally published online in 2008 in Encyclopedia of Quantitative Risk Analysis and Assessment, © John Wiley & Sons, Ltd, and republished in Wiley StatsRef: Statistics Reference Online. Copyright © 2008 John Wiley & Sons, Ltd. All rights reserved.

1 Effect-Measure Modification (Heterogeneity of Effect)

To make the concepts and distinctions precise, suppose we are studying the effects that a change in a variable X will have on a subsequent variable Y, in the presence of a background variable Z that precedes X and Y. For example, X might be treatment level such as dose or treatment arm, Y might be a health outcome variable such as life expectancy following treatment, and Z might be sex (1 = female, 0 = male). To measure effects, write $Y_x$ for the outcome one would have if administered treatment level x of X; for example, if X = 1 for active treatment and X = 0 for placebo, then $Y_1$ is the outcome a subject will have if X = 1 is administered, and $Y_0$ is the outcome a subject will have if X = 0 is administered. The $Y_x$ are often called potential outcomes (see Causality/Causation).

One measure of the effect of changing X from 0 to 1 on the outcome is the difference $Y_1 - Y_0$; for example, if Y were life expectancy, $Y_1 - Y_0$ would be the change in life expectancy. If this difference varied with sex in a systematic fashion, one could say that the difference was modified by sex, or that there was heterogeneity of the difference across sex. Another common measure of effect is the ratio $Y_1 / Y_0$; if this ratio varied with sex in a systematic fashion, one could say that the ratio was modified by sex. For purely algebraic reasons, two measures may be modified in very different ways by the same variable. Furthermore, if both X and Z affect Y, absence of modification of the difference implies modification of the ratio, and vice versa.

For example, suppose for the subjects under study $Y_1 = 20$ and $Y_0 = 10$ for all the males, but $Y_1 = 30$ and $Y_0 = 15$ for all the females. Then $Y_1 - Y_0 = 10$ for males but $Y_1 - Y_0 = 15$ for females, so there is a 5-year modification of the difference measure by sex. However, suppose we measured the effects by the expectancy ratios $Y_1 / Y_0$ instead of differences. Then $Y_1 / Y_0 = 20/10 = 2$ for males and $Y_1 / Y_0 = 30/15 = 2$ for females as well, so there is no modification of the ratio measure by sex.

Consider next an example in which $Y_1 = 20$ and $Y_0 = 10$ for males, and $Y_1 = 30$ and $Y_0 = 20$ for females. Then $Y_1 - Y_0 = 10$ for both males and females, so there is no modification of the difference by sex. However, $Y_1 / Y_0 = 20/10 = 2$ for males and $Y_1 / Y_0 = 30/20 = 1.5$ for females, so there is modification of the ratio by sex.

Finally, suppose $Y_1 = 20$ and $Y_0 = 10$ for males, and $Y_1 = 60$ and $Y_0 = 40$ for females. Then $Y_1 - Y_0 = 10$ for males and $Y_1 - Y_0 = 20$ for females, so the Y difference is smaller among males than among females. However, $Y_1 / Y_0 = 20/10 = 2$ for males and $Y_1 / Y_0 = 60/40 = 1.5$ for females, so the Y ratio is larger among males than among females. Thus, modification can be in opposite directions for different measures of effect.
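The three numerical examples above can be checked with a short script. This is a minimal sketch: the function name `effect_measures` and the scenario labels are illustrative assumptions and do not appear in the original article, while the potential-outcome values are those given in the text.

```python
def effect_measures(y1, y0):
    """Return (difference, ratio) for potential outcomes Y_1 and Y_0."""
    return y1 - y0, y1 / y0

# (Y_1, Y_0) by sex for the three examples in the text; labels are illustrative.
scenarios = {
    "ratio homogeneous": {"male": (20, 10), "female": (30, 15)},
    "difference homogeneous": {"male": (20, 10), "female": (30, 20)},
    "opposite directions": {"male": (20, 10), "female": (60, 40)},
}

# Compute both effect measures within each sex for every scenario.
results = {
    name: {sex: effect_measures(*y) for sex, y in by_sex.items()}
    for name, by_sex in scenarios.items()
}

for name, by_sex in results.items():
    for sex, (diff, ratio) in by_sex.items():
        print(f"{name:24s} {sex:6s} difference={diff:3d} ratio={ratio:.2f}")
```

Comparing the two rows printed for each scenario shows whether the difference, the ratio, or both vary with sex.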
2 Biologic Interaction

The preceding examples show that one should not, in general, equate the presence or absence of effect-measure modification to the presence or absence of interactions in the biologic (mechanistic) sense, because effect-measure modification depends entirely on what measure one chooses to examine, whereas the mechanism is the same regardless of that choice. Nonetheless, it is possible to formulate mechanisms of action that imply homogeneity (no modification) of a particular measure. For such a mechanism, the observation of heterogeneity in that measure can be taken as evidence against the mechanism (assuming of course that the observations are valid). It would be fallacious, however, to infer that the mechanism is correct if homogeneity was observed, for the usual reason that many other mechanisms (some unimagined) would imply the same observation.

A classic example is the simple independent-action model for the effect of X and Z on Y, in which subjects affected by changes in X are disjoint from subjects affected by changes in Z [2, 3]. This model implies homogeneity (absence of modification by Z) of the average X effect on Y when that effect is measured by the difference in the average Y. In particular, suppose Y is a disease indicator (1 if disease occurs, 0 if not). Then the average of Y is the proportion getting disease (the incidence proportion, often called the risk) and the average Y difference is the risk difference. Thus, in this context, the independent-action model implies that the risk difference for the effect of X on Y will be constant across levels of Z; in other words, the risk difference will be homogeneous across Z, or unmodified by Z. If X and Z both have effects, this homogeneity of the difference forces ratio measures of the effect of X on Y to be heterogeneous across Z. When additional factors are present in the model (such as confounders), homogeneity of the risk differences can also lead to heterogeneity of the excess risk ratios [4].
Thus, under the simple independent-action model, the independence of the X and Z effects will cause the measures other than the risk difference to be heterogeneous, or modified, across Z.

Biologic models for the mechanism of X and Z interactions can lead to other patterns. For example, certain multistage models in which X and Z act at completely separate stages of a multistage mechanism can lead to homogeneity of ratios rather than differences, as well as particular dose-response patterns. Special caution is needed in interpreting observed patterns, however, because converse relations do not hold. Many different plausible biologic models will imply identical patterns in the effect measures [5].
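The consequence of the simple independent-action model for difference and ratio measures can be sketched numerically. The population fractions below are hypothetical values chosen only for illustration, and the function name `risk` and its parameters are assumptions, not notation from the article. The sketch assumes three disjoint groups of susceptible subjects and treats everyone else as unaffected.

```python
def risk(x, z, p_background=0.05, p_x_only=0.10, p_z_only=0.20):
    """Risk of Y = 1 under a simple independent-action model.

    Hypothetical population fractions (assumed for illustration):
      p_background: subjects with Y = 1 regardless of X and Z
      p_x_only:     subjects with Y = 1 if and only if X = 1
      p_z_only:     subjects with Y = 1 if and only if Z = 1
    The three groups are disjoint; everyone else has Y = 0 always.
    """
    return p_background + p_x_only * x + p_z_only * z

# Effect of X on Y within each level of Z, on the two scales.
risk_difference = {z: risk(1, z) - risk(0, z) for z in (0, 1)}
risk_ratio = {z: risk(1, z) / risk(0, z) for z in (0, 1)}

for z in (0, 1):
    print(f"Z={z}: risk difference={risk_difference[z]:.2f}, "
          f"risk ratio={risk_ratio[z]:.2f}")
```

With these assumed fractions the risk difference equals the X-only fraction at both levels of Z, while the risk ratio shrinks when Z = 1 because the baseline risk is higher there.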

2.1 Synergism and Antagonism

Taking the simple independent-action model as a baseline, one may offer the following dependent-action definitions for an outcome indicator Y as a function of the causal antecedents X and Z. Synergism of X = 1 and Z = 1 in causing Y = 1 is defined as necessity and sufficiency of X = 1 and Z = 1 for causing Y = 1, i.e., Y = 1 if and only if X = 1 and Z = 1. We may also say that Y = 1 in a given individual would be a synergistic response to X = 1 and Z = 1 if Y = 0 would have occurred instead if either X = 0 or Z = 0 or both. In potential-outcome notation, where $Y_{xz}$ is the outcome when X = x and Z = z, this definition says synergistic responders have $Y_{11} = 1$ and $Y_{10} = Y_{01} = Y_{00} = 0$.

Antagonism of X = 1 by Z = 1 in causing Y = 1 is defined as necessity and sufficiency of Z = 0 in order for X = 1 to cause Y = 1. This definition says antagonistic responders to X or Z have $Y_{10} = 1$ or $Y_{01} = 1$ or both, and $Y_{11} = Y_{00} = 0$.

With these definitions, synergism and antagonism are not logically distinct concepts, but depend on the coding of X and Z. For example, switching the labels of exposed and unexposed for one factor can change apparent synergy to apparent antagonism, and vice versa [1, 2]. The only label-invariant property is whether the effect of X on a given person is altered by the level of Z, i.e., whether the action of X is dependent on Z. If so, by definition we have biologic interaction.

Absence of any synergistic or antagonistic interaction among levels of X and Z implies homogeneity (absence of modification by Z) of the average X effect across levels of Z when the X effect is measured by the differences in Y across levels of X [2, 6]. The converse is false, however.
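The response-type definitions above, their dependence on coding, and the failure of the converse can be illustrated with a small sketch. The tuple encoding, the function names, and the two-person population are all hypothetical choices made for illustration; only the definitions of synergistic and antagonistic responders are taken from the text.

```python
from fractions import Fraction

# A response type lists the potential outcomes (Y_00, Y_01, Y_10, Y_11).
SYNERGISTIC = (0, 0, 0, 1)

def is_synergistic(t):
    """Y_11 = 1 and Y_10 = Y_01 = Y_00 = 0."""
    y00, y01, y10, y11 = t
    return y11 == 1 and y10 == y01 == y00 == 0

def is_antagonistic(t):
    """Y_11 = Y_00 = 0, and Y_10 = 1 or Y_01 = 1 or both."""
    y00, y01, y10, y11 = t
    return y11 == 0 and y00 == 0 and (y10 == 1 or y01 == 1)

def recode_z(t):
    """Swap the labels Z = 0 and Z = 1, so the new Y_xz is the old Y_x(1-z)."""
    y00, y01, y10, y11 = t
    return (y01, y00, y11, y10)

def average_risk(population, x, z):
    """Average Y_xz over a list of response types."""
    index = 2 * x + z  # position of Y_xz in the tuple
    return Fraction(sum(t[index] for t in population), len(population))

# Recoding Z turns a synergistic responder into an antagonistic one.
recoded = recode_z(SYNERGISTIC)

# Hypothetical population: one synergistic responder and one responder
# who gets Y = 1 only when X = 1 and Z = 0 (an antagonistic responder).
population = [SYNERGISTIC, (0, 0, 1, 0)]
rd = {z: average_risk(population, 1, z) - average_risk(population, 0, z)
      for z in (0, 1)}
print(recoded, is_antagonistic(recoded), rd)
```

In this assumed population every individual shows dependent action, yet the average risk difference for X is the same at both levels of Z, so a homogeneous difference cannot by itself rule out synergism or antagonism.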
Homogeneity of the difference measures (e.g., lack of modification of the risk difference) does not imply absence of synergy or antagonism, because such homogeneity can arise through other means (e.g., averaging out of the synergistic and antagonistic effects across the population being examined).

A more restrictive set of definitions is based on the sufficient-component cause model of causation [1, 7]. Here, synergism of the indicators X and Z is defined as the presence of X = 1 and Z = 1 in the same sufficient cause of Y = 1, i.e., the sufficient cause cannot act without both X = 1 and Z = 1. Similarly, antagonism of X = 1 by Z = 1 is defined as the presence of X = 1 and Z = 0 in the same sufficient cause of Y = 1. These definitions are also coding dependent.

2.2 Extensions to Continuous Outcomes

The use of indicators in the above definitions may appear restrictive but is not. For example, to subsume a continuous outcome T such as death time, we may define $Y_t$ as the indicator for $T \le t$ and apply the above definitions to each $Y_t$. Similar devices can be applied to incorporate continuous exposure variables [6]. The resulting set of indicators is of course unwieldy, and in application has to be simplified by modeling constraints (e.g., proportional hazards for T).

3 Noncausal (Statistical) Interaction

Both the preceding usages of effect modification and interaction refer to causal phenomena (see Causality/Causation). In the statistics literature, interaction is often used without explicit reference to causality. For example, in the context of regression modeling, an interaction term is usually nothing more than a term involving the product of two or more variables. Consider a logistic regression (see Logistic Regression in Practice) to predict a man's actual sexual preference A (A = 1 for men, 0 for women) from his self-reported preference R and the interviewer's gender G (G = 1 for male, 0 for female),

$$P(A = 1 \mid R, G) = \operatorname{expit}(\alpha + \beta R + \gamma G + \delta R G) \tag{1}$$

where $\operatorname{expit}(x) = e^{x} / (1 + e^{x})$ is the logistic function (see Logistic distribution). Such a model can be useful in correcting for misreporting. The term $\delta R G$ (or sometimes just $R G$ or just $\delta$) is often called an interaction term. It is, however, more accurately called a product term, for presumably neither self-report nor interviewer status has any causal effect on actual preference, and thus they cannot interact causally or modify each other's effect (since there is no effect to modify).

If $\delta \neq 0$, the product term implies that the regression of A on R depends on G: for male interviewers the regression of A on R is

$$P(A = 1 \mid R, G = 1) = \operatorname{expit}(\alpha + \beta R + \gamma \cdot 1 + \delta R \cdot 1) = \operatorname{expit}(\alpha + \gamma + (\beta + \delta) R) \tag{2}$$

whereas for female interviewers the regression of A on R is

$$P(A = 1 \mid R, G = 0) = \operatorname{expit}(\alpha + \beta R + \gamma \cdot 0 + \delta R \cdot 0) = \operatorname{expit}(\alpha + \beta R) \tag{3}$$

Thus we can say that the gender of the interviewer affects or modifies the logistic regression of actual preference on self-report. Nonetheless, since neither interviewer gender nor self-report affects actual preference (biologically or otherwise), they have no biologic interaction.

When both the factors in the regression do causally affect the outcome, it is common to take the presence of a product term in a model as implying biologic interaction, and conversely to take absence of a product term as implying no biologic interaction. Neither inference is correct. The size and even direction of the product term can change with the choice of regression model (e.g., linear versus logistic), whereas biologic interaction is a natural phenomenon oblivious to our choice of model for analysis [1, 8].
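The dependence of the product term on the choice of model can be sketched for two binary factors. In a saturated model for four risks, the product-term coefficient is the interaction contrast computed on the scale of the model. The risks below are hypothetical values assumed only for illustration, and the function names are not from the article.

```python
import math

def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1 - p))

def product_terms(r00, r10, r01, r11):
    """Product-term coefficients of saturated models fitted to four risks.

    r_xz is the risk when the two binary factors equal x and z.
    Returns (linear, logistic): the interaction contrast on the
    risk scale and on the log-odds scale.
    """
    linear = r11 - r10 - r01 + r00
    logistic = logit(r11) - logit(r10) - logit(r01) + logit(r00)
    return linear, logistic

# Hypothetical risks, assumed for illustration only.
additive = product_terms(0.10, 0.20, 0.30, 0.40)  # exactly additive risks
mixed = product_terms(0.10, 0.20, 0.30, 0.42)     # slightly more than additive

print(f"additive risks: linear={additive[0]:+.3f}, logistic={additive[1]:+.3f}")
print(f"mixed risks:    linear={mixed[0]:+.3f}, logistic={mixed[1]:+.3f}")
```

For the first set of risks the linear product term is zero while the logistic one is negative; for the second set the two product terms have opposite signs, although the underlying risks are the same under either model.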
Assuming no bias is present, however, a product term in a linear statistical model for a causal dependency (see Linear Statistical Models for Causation: A Critical Review) can only arise from the presence of biologic interaction in the dependent-action sense [1, 2].

4 Related Articles

Linear Statistical Models for Causation: A Critical Review
Interactions with examples
Interaction Model: Overview
Interaction Effects
Interaction
Effect modification
Compliance with Treatment Allocation

References

[1] Greenland, S., Lash, T.L. & Rothman, K.J. (2008). In Modern Epidemiology, 3rd Edition, K.J. Rothman, S. Greenland & T.L. Lash, eds, Lippincott, Philadelphia, Chapter 5.
[2] Greenland, S. & Poole, C. (1988). Invariants and noninvariants in the concept of interdependent effects, Scandinavian Journal of Work, Environment, and Health 14.
[3] Weinberg, C.R. (1986). Applicability of simple independent-action model to epidemiologic studies involving two factors and a dichotomous outcome, American Journal of Epidemiology 123.
[4] Greenland, S. (1993). Additive-risk versus additive relative-risk models, Epidemiology 4.
[5] Moolgavkar, S. (1986). Carcinogenesis modeling: from molecular biology to epidemiology, Annual Review of Public Health 7.
[6] Greenland, S. (1993). Basic problems in interaction assessment, Environmental Health Perspectives 101 (Suppl. 4).
[7] VanderWeele, T.J. & Robins, J.M. (2007). The identification of synergism in the sufficient-component cause framework, Epidemiology 18.
[8] Rothman, K.J. (1976). Causes, American Journal of Epidemiology 104.
