Flexible Mediation Analysis in the Presence of Nonlinear Relations: Beyond the Mediation Formula
Multivariate Behavioral Research

To cite this article: Tom Loeys, Beatrijs Moerkerke, Olivia De Smet, Ann Buysse, Johan Steen & Stijn Vansteelandt (2013) Flexible Mediation Analysis in the Presence of Nonlinear Relations: Beyond the Mediation Formula, Multivariate Behavioral Research, 48:6.

Published online: 11 Dec
Multivariate Behavioral Research, 48, 2013. Copyright Taylor & Francis Group, LLC.

Tom Loeys and Beatrijs Moerkerke
Department of Data Analysis, Ghent University, Belgium

Olivia De Smet and Ann Buysse
Department of Experimental-Clinical and Health Psychology, Ghent University, Belgium

Johan Steen and Stijn Vansteelandt
Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium

Correspondence concerning this article should be addressed to Tom Loeys, Department of Data Analysis, Ghent University, Henri Dunantlaan 1, 9000 Gent, Belgium. tom.loeys@ugent.be

In the social sciences, mediation analysis has typically been formulated in the context of linear models using the Baron and Kenny (1986) approach. Extensions to nonlinear models have been considered but lack formal justification. By placing mediation analysis within the counterfactual framework of causal inference, one can define causal mediation effects in a way that is not tied to a specific statistical model and identify them under certain no-unmeasured-confounding assumptions. Corresponding estimation procedures using parametric or nonparametric models, based on the so-called mediation formula, have recently been proposed in the psychological literature and made accessible through the R-package mediation. A number of limitations of the latter approach are discussed, and a more flexible approach using natural effects models is proposed as an alternative. The latter builds on the same counterfactual framework but enables interpretable and parsimonious modeling of direct and mediated effects and facilitates tests of hypotheses that would otherwise be difficult or impossible to test. We illustrate the approach in a
study of individuals who ended a romantic relationship and explore whether the effect of attachment anxiety during the relationship on unwanted pursuit behavior after the breakup is mediated by negative affect during the breakup.

This article discusses the expansion of traditional mediation analysis (Baron & Kenny, 1986; MacKinnon, 2008) to settings where the mediator and/or outcome are not measured at the interval level but are categorical (i.e., binary, ordinal, or count). As an illustrating example throughout this article we consider data from the Interdisciplinary Project for the Optimization of Separation Trajectories conducted in Flanders, a cooperation of psychologists, lawyers, and economists from Ghent University and the University of Leuven in Belgium. This research project carried out a large-scale recruitment of formerly married people. We focus on a sample of 385 individuals (i.e., individuals rather than both ex-partners were targeted) who responded to an adapted version of the Relational Pursuit-Pursuer Short Form (RP-PSF; Cupach & Spitzberg, 2004), used to assess the extent of unwanted pursuit behaviors (UPBs) the participant showed toward the ex-partner since the breakup. The sum of 28 RP-PSF items (ranging from "leaving unwanted gifts" to "threatening to hurt yourself"), each measured on a 5-point Likert scale from 0 (never) to 4 (more than 5 times), was used as an overall index of perpetration (with higher scores indicating higher levels of perpetration). As about 67% of the participants did not show any UPB, we mainly focus on a dichotomized (yes or no) UPB outcome throughout this article.
Although many relationship characteristics were evaluated as predictors for the UPB outcome (De Smet, Loeys, & Buysse, 2012), we assess the effect on showing UPB or not of the level of anxious attachment in the relationship with the ex-partner before the breakup. Anxious attachment was measured using a total of five anxious attachment items (e.g., "My desire to be very close sometimes scared my ex-partner away") from an adapted Experience in Close Relationships scale-short form (Wei, Russell, Mallinckrodt, & Vogel, 2007). Furthermore, respondents rated on a 9-point Likert scale (from 0 = not at all to 8 = very much) how strongly they experienced 10 negative emotions when reflecting on their breakup (anxious, angry, frustrated, sad, jealous, ashamed, guilty, hurt, depressed, unhappy). The sum of these items was considered as a measure of negative affect. Here, researchers are interested in knowing whether the effect of anxious relationship attachment before the breakup (denoted as $X$ and referred to as the "independent variable" or "exposure") on showing UPB toward the ex-partner after the breakup or not (denoted as the outcome $Y$) is mediated by negative affect during the breakup (denoted by mediator $M$). Negative affect is considered a mediator if a change in anxious attachment level causes a change in negative affect, which in turn causes a change in showing UPB (MacKinnon, 2008). The corresponding indirect or mediated effect is represented
by the path from $X$ to $Y$ through $M$ in the path diagram in Figure 1. All of the effect that is not mediated by $M$ will be termed the direct effect and is depicted by the arrow from $X$ to $Y$ in Figure 1.

FIGURE 1. Mediational causal model with baseline covariates $C$, independent variable $X$, mediator $M$, and outcome $Y$.

In this setting, we have further measured the following baseline covariates $C$: age, gender, and education level of the respondent. Careful consideration of which baseline covariates to measure is an important task at the design stage. Indeed, given these observed baseline covariates, the following four assumptions are typically required in mediation analyses (Pearl, 2001; VanderWeele & Vansteelandt, 2009): (A1) no unmeasured confounding of the $X$-$M$ relationship, (A2) no unmeasured confounding of the $X$-$Y$ relationship, (A3) no unmeasured confounding of the $M$-$Y$ relationship, and (A4) no confounders of the $M$-$Y$ relationship that are affected by $X$. Assumptions (A1) through (A4) are required in standard approaches for mediation analysis but are often not explicitly expressed. Violations of each of these assumptions can lead to considerably biased estimates of the direct and indirect effects of interest.

Although assumptions (A1) and (A2) are met in simple randomized experiments, this is no longer true in observational studies like our illustration. For example, if the number of previously failed relationships predicted both anxious attachment in the relationship before the breakup and negative affect during the breakup but was unmeasured, assumption (A1) would be violated; if it predicted both anxious attachment and UPB but was unmeasured, assumption (A2) would be violated. Assumptions (A3) and (A4) are never guaranteed to hold, even if the independent variable were randomly assigned. Assumption (A3) would be violated if, for example, a family history of divorce predicted both negative affect and UPB but was unmeasured.
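The consequence of violating assumption (A3) can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the data-generating model, the variable `u` (an unmeasured mediator-outcome confounder), and all coefficients are hypothetical and not taken from the UPB data. The exposure is randomized and the mediator has no effect on the outcome, yet the naive product-of-coefficients estimate is clearly nonzero.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def ols_two_predictors(y, x1, x2):
    """Slopes of y on x1 and x2 (model with intercept), from covariances."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

rng = random.Random(1)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]   # randomized exposure: (A1), (A2) hold
u = [rng.gauss(0, 1) for _ in range(n)]   # hypothetical unmeasured M-Y confounder
m = [0.5 * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
# The outcome depends on x and u only: the true effect of m on y is zero.
y = [0.3 * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

a_hat = cov(x, m) / cov(x, x)                      # slope of m on x
c_prime_hat, b_hat = ols_two_predictors(y, x, m)   # y on x and m, ignoring u
naive_indirect = a_hat * b_hat                     # spurious "mediated effect"
print(round(b_hat, 2), round(naive_indirect, 2))
```

Because `u` drives both `m` and `y`, the regression of `y` on `x` and `m` attributes part of the confounder's influence to the mediator, which produces an apparent mediated effect where none exists.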
In the earlier mediation literature, this point about controlling for the mediator-outcome confounders was already made by Judd and Kenny (1981) but was not pointed out by Baron and Kenny (1986) and was subsequently ignored by much of the social science literature. Ignoring such confounders may induce a spurious correlation (Holland, 1986) between negative affect and UPB and lead to an apparent mediation effect even if in reality there
is none. Finally, if the individual of the couple who initiated the breakup ("the ex-partner," "myself," or "both") is causally affected by the attachment level and predicts both negative affect and UPB, assumption (A4) would be violated. In the remainder of this article, we assume that the aforementioned set of measured baseline covariates $C$ is sufficient for assumptions (A1) through (A4) to hold.

The article is organized as follows: We first describe traditional mediation analysis in linear models and show how this same approach is typically adapted for nonlinear associations in the psychological literature. As the causal mediation literature has pointed out over the last few years (Muthén, 2011; Pearl, 2012; VanderWeele & Vansteelandt, 2009), the latter lacks formal justification and interpretability and may yield biased estimates of the causal effects of interest. Using the mediation formula (Pearl, 2012), Imai, Keele, and Tingley (2010) recently proposed a general framework for causal mediation analysis in the presence of nonlinear associations. We highlight the pros and cons of their approach and show how natural effects models (Lange, Vansteelandt, & Bekaert, 2012) may overcome some of the limitations by offering more modeling flexibility and simplifying testing for direct and indirect effects. We end with a discussion.

STANDARD MEDIATION ANALYSIS

In the standard mediation analysis model with linear associations, the widely known Baron and Kenny (1986) approach focuses on the following three key parameters: (a) the effect of the independent variable on the mediator conditional on baseline covariates (denoted by coefficient $a$ in Figure 1), (b) the effect of the mediator on the outcome conditional on the independent variable and baseline covariates (denoted by $b$), and (c) the effect of the independent variable on the outcome conditional on the mediator and baseline covariates (denoted by $c'$).
The latter reflects a direct effect, whereas a measure of indirect effect is obtained as a product of the two effects $a$ and $b$ or alternatively as the difference between the total effect (denoted by $c$) and the direct effect (MacKinnon, 2008). Assuming linear relationships for the variables depicted in Figure 1, this can be formally expressed by the following three equations:

$$E[M_i \mid X_i, C_i] = i_1 + a X_i + d C_i \tag{1}$$

$$E[Y_i \mid X_i, C_i] = i_2 + c X_i + e C_i \tag{2}$$

$$E[Y_i \mid X_i, C_i, M_i] = i_3 + b M_i + c' X_i + f C_i. \tag{3}$$

Note that in these equations we also need to adjust for baseline confounders $C_i$, if present, as no adjustment for such measured confounders would result in biased causal effect estimates. This approach yields direct and indirect effects
that can be causally interpreted under the assumptions (A1) through (A4) as long as linear relationships can be assumed and there are no independent variable-by-mediator interactions. VanderWeele and Vansteelandt (2009) relaxed the latter assumption and derived closed-form expressions for the direct and indirect effect in the presence of interactions in linear models.

When the mediator and/or outcome are not measured on the interval level, linear relationships such as Equations (1), (2), and (3) are typically no longer appropriate. In our illustration, the mediator, negative affect, is measured at the interval level but the (dichotomized) outcome, showing UPB or not, is binary. Logistic or probit regression is typically used for the latter, and Equations (2) and (3) may be respectively replaced by

$$g(E[Y_i \mid X_i, C_i]) = i_2 + c X_i + e C_i \tag{4}$$

$$g(E[Y_i \mid X_i, C_i, M_i]) = i_3 + b M_i + c' X_i + f C_i \tag{5}$$

with $g$ the logit or probit link. MacKinnon (2008) clearly described how estimation of the mediation effect based on the product of the effects $a$ and $b$ and on the difference between $c$ and $c'$ can yield very different results in such models. The reason relates to the different scales that are being used. MacKinnon and Dwyer (1993) provided details for a method based on standardization that yields much closer estimates. Coxe and MacKinnon (2010) performed a similar exploration for count outcomes and reached similar conclusions about the effect of standardization.

As an illustration, we return to our example where interest lies in estimating the effect of anxious attachment on showing UPB that is mediated by negative affect. We fit a logistic (and probit) regression model for the outcome and a linear regression model for the mediator (Table 1) conditional on the aforementioned potential baseline confounders (gender, age, and education level).
To facilitate the interpretation of the regression coefficients, both the anxious attachment score and negative affect scores were standardized. We find that at fixed levels of gender, age, and education, a standard deviation increase in anxious attachment increases the odds of showing UPB by a factor $\exp(0.497) = 1.64$ (95% CI from 1.32 to 2.05, $p < .001$). Further adjustment for negative affect leads to an attenuated but still significant effect of attachment anxiety on showing UPB (OR $= \exp(0.316) = 1.37$ with 95% CI from 1.08 to 1.74, $p = .009$). Similarly, a standard deviation increase in negative affect increases the odds of showing UPB by a factor $\exp(0.618) = 1.86$ (95% CI from 1.45 to 2.36, $p < .001$) at fixed levels of baseline covariates and anxious attachment. A standard deviation increase in anxious attachment is associated with an average increase in negative affect of 0.34 standard deviations (95% CI from 0.25 to 0.43, $p < .001$). The product-of-coefficients estimate of the mediated effect is not very different here
from the difference-of-coefficients estimate. For completeness we also present the standardized coefficients (used to equate the scale across models) but reach similar conclusions. The results from probit analyses are also presented, but the coefficients are more difficult to interpret.

TABLE 1
Direct and Indirect Effects of Attachment Anxiety on Unwanted Pursuit Behavior
Columns: Logit-Link (Unstandardized: Est., SE; Standardized: Est., SE) and Probit-Link (Unstandardized: Est., SE; Standardized: Est., SE).
Rows: $c$, $c'$, $c - c'$, $a$, $b$, $ab$.
Note. Estimates (with standard errors) obtained by applying the Baron and Kenny approach in the presence of nonlinear associations with and without standardization.

Regardless of the link function that is used, is it valid to use the product-of-coefficients or difference-in-coefficients method here? And what is the interpretation that can be given to these mediated effects? VanderWeele and Vansteelandt (2010) showed that under assumptions (A1) through (A4), normally distributed error in Equation (1), no interaction between the independent variable and mediator in Model (5), and a rare outcome of interest, both of these estimators are nearly identical and approximate a well-defined measure of a mediated effect, but not generally otherwise. In more general cases, Imai, Keele, and Tingley (2010) and Pearl (2012), among others, pointed out that the nonlinearity of those logistic or probit models implies that, unlike the linear case, neither the (standardized) product nor the (standardized) difference method consistently estimates the average causal mediation effect and that the bias can be substantial in some situations. In the next section we more clearly define the notion of the average causal mediation effect and show how the mediation formula can be used to estimate the latter.
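The scale problem behind the disagreement between the two methods can be made concrete for the probit link, where the marginal model has a closed form. The sketch below is an illustration under stated assumptions, not the authors' analysis: it takes a probit version of Equation (5) with zero intercepts and no covariates, a mediator model (1) with normal residual of standard deviation `sigma_m`, and hypothetical values for `a`, `b`, and `c_prime`. Under these assumptions the coefficient $c$ in Equation (4) equals $(c' + ab)/\sqrt{1 + b^2\sigma_M^2}$, so $c - c'$ and $ab$ live on different scales.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# Hypothetical parameter values, chosen only for illustration.
a, b, c_prime = 0.34, 0.60, 0.30   # paths in Equations (1) and (5)
sigma_m = 1.0                      # residual SD of the mediator model (1)
eta_dist = NormalDist(0.0, sigma_m)

def p_y_given_x_numeric(x, n_grid=2001, width=8.0):
    """P(Y = 1 | X = x): average the probit outcome model over the
    mediator residual by midpoint quadrature."""
    lower = -width * sigma_m
    step = 2 * width * sigma_m / n_grid
    total = 0.0
    for k in range(n_grid):
        eta = lower + (k + 0.5) * step
        prob = std_normal.cdf(c_prime * x + b * (a * x + eta))
        total += prob * eta_dist.pdf(eta) * step
    return total

# Closed-form coefficient of x in the marginal probit model (4).
scale = sqrt(1 + b ** 2 * sigma_m ** 2)
c = (c_prime + a * b) / scale

product = a * b              # product-of-coefficients estimate
difference = c - c_prime     # difference-in-coefficients estimate
rescaled_difference = c * scale - c_prime   # difference on a common scale
print(product, difference, rescaled_difference)
```

The quadrature function is included only to check the closed form numerically. Once $c$ is put back on the scale of Model (5), the difference agrees with the product, which is the intuition behind rescaling the coefficients before comparing them.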
THE COUNTERFACTUAL FRAMEWORK AND THE MEDIATION FORMULA

To address questions concerning mediation, we use the counterfactual framework (Imai, Booil, & Stuart, 2011; Rubin, 1978) and first introduce some notation.
Let $M_i(x)$ and $Y_i(x)$ denote the mediator and outcome that would have been observed for participant $i$ had the independent variable $X_i$ been set at the value $x$. The mediator $M_i(x)$ and outcome $Y_i(x)$ may not be the mediator and outcome that are observed and are therefore possibly counterfactual. For example, although the observed anxious attachment level of participant $i$ might be one standard deviation above the sample mean (i.e., $X_i = 1$), and the value of negative affect $M_i(1)$ and UPB $Y_i(1)$ under this value are measured, one may ask what the values of the mediator and the outcome would have been if his or her anxious attachment level were at the sample mean. In other words, we would like to know the counterfactual outcomes $M_i(0)$ and $Y_i(0)$. Similarly, let $Y_i(x, M_i(x^*))$ denote the counterfactual outcome that would have been observed if the level of the independent variable $X_i$ for participant $i$ were set to $x$ and $M_i$ to the value it would have taken if $X_i$ were set to $x^*$. Such nested counterfactuals are very useful to formulate direct and indirect effects. Indeed, the direct effect (Pearl, 2001; Robins & Greenland, 1992) can be expressed as

$$E[Y_i(x, M_i(x^*)) \mid C_i] - E[Y_i(x^*, M_i(x^*)) \mid C_i]. \tag{6}$$

Because the first argument (i.e., the level of the independent variable) changes but the second (i.e., the mediator) does not, this expression can be seen as the difference in outcomes when the independent variable is changed from some reference level $x^*$ to some level of interest $x$ while the mediator is held constant at the values it would obtain for the reference level. In our example, $E[Y_i(1, M_i(0))] - E[Y_i(0, M_i(0))]$ denotes the direct effect on showing UPB of a standard deviation increase in anxious attachment from its mean level while fixing the negative affect at the level observed at average anxious attachment levels.
To contrast it with the so-called controlled direct effect (i.e., $E[Y_i(x, m) \mid C_i] - E[Y_i(x^*, m) \mid C_i]$), the quantity in Equation (6) is sometimes referred to as the natural direct effect (Pearl, 2001; VanderWeele & Vansteelandt, 2009). The controlled direct effect expresses the effect of the independent variable that would be realized if the mediator were fixed at level $m$ uniformly in the population. This measure of direct effect is less natural because it is often not realistic to imagine scenarios where one would consider forcing the mediator to be the same in the whole population. The natural direct effect overcomes this limitation because the level $M(x^*)$ at which the mediator is controlled allows for variation between participants (Pearl, 2001). The indirect effect or causal mediation effect (Pearl, 2001; Robins & Greenland, 1992) is defined as

$$E[Y_i(x^*, M_i(x)) \mid C_i] - E[Y_i(x^*, M_i(x^*)) \mid C_i]. \tag{7}$$

Now the first argument is held fixed at some reference level $x^*$ of the independent variable while the mediator is changed from values that would be
obtained under the level of interest $x$ versus the reference level $x^*$ of the independent variable. Interestingly, the difference between the total causal effect (i.e., $E[Y_i(x) \mid C_i] - E[Y_i(x^*) \mid C_i]$) and the natural direct effect of Equation (6) equals Equation (7) and thus carries the interpretation of a natural indirect effect. The latter property may not always hold for controlled direct effects (Robins & Greenland, 1992; VanderWeele & Vansteelandt, 2009).

Under assumptions (A1) to (A4), the (conditional) expectation of the counterfactual outcomes $Y_i(x, M_i(x^*))$, that is, $E[Y_i(x, M_i(x^*)) \mid C_i = c]$, can be calculated using the mediation formula (Pearl, 2012):

$$\sum_m E(Y_i \mid X_i = x, M_i = m, C_i = c) \Pr(M_i = m \mid X_i = x^*, C_i = c). \tag{8}$$

As long as the linearity assumptions as expressed in Equations (1) and (3) hold, it can easily be shown using Expressions (7) and (8) that the causal mediation effect is consistently estimated by the product-of-coefficients method and the direct effect by the coefficient $c'$ (Imai, Keele, & Yamamoto, 2010; VanderWeele & Vansteelandt, 2009). When the linearity assumptions do not hold, straightforward closed-form expressions for the causal mediation effect can sometimes, but often not readily, be obtained (VanderWeele & Vansteelandt, 2010); alternatively one can, for example, rely on Monte Carlo sampling to derive causal direct and indirect effects (Imai, Keele, Tingley, & Yamamoto, 2010). Specifically, given a statistical model for the mediator and outcome, one can first sample $M_i(x^*)$ from the mediator model and next $Y_i(x, M_i(x^*))$ from the outcome model. Once draws for these counterfactual outcomes are obtained, one can use Equation (8) to derive direct and indirect effects of interest. Imai, Keele, Tingley, et al. (2010) proposed two possible algorithms and implemented their approaches in the R-library mediation.
This package is very convenient to use: (a) it can deal with a wide range of types of mediators and outcomes, (b) one simply has to specify a model for the mediator and outcome, (c) it provides a nice summary table with direct and indirect effect estimates, and (d) it can be complemented with an intuitive sensitivity analysis addressing the impact of violation of assumption (A3). Using this package to analyze the UPB data and assuming a linear model for the mediator (i.e., negative affect) with predictor anxious attachment; baseline covariates gender, age, and education; and a logistic (probit, respectively) regression model for the probability of showing UPB with anxious attachment, negative affect, and the baseline covariates as predictors but no interactions, one finds both a significant direct and mediation effect (Table 2). Note that these effects cannot be compared with the ones obtained in Table 1 because the causal effects from the mediation package should be interpreted on an additive scale (i.e., as risk differences) rather than the logit/probit scale in the previous analyses.
TABLE 2
Direct and Indirect Effects of Attachment Anxiety on Unwanted Pursuit Behavior (With 95% CI)

Effect                                   Logit-Link        Probit-Link
No Interactions
  Direct    E[Y(1, M(0)) - Y(0, M(0))]   (0.023, 0.122)    (0.020, 0.125)
  Mediated  E[Y(0, M(1)) - Y(0, M(0))]   (0.028, 0.069)    (0.027, 0.070)
Independent Variable-by-Mediator Interaction
  Direct    E[Y(1, M(0)) - Y(0, M(0))]   (0.020, 0.125)    (0.020, 0.126)
            E[Y(1, M(1)) - Y(0, M(1))]   (0.034, 0.147)    (0.035, 0.152)
  Mediated  E[Y(0, M(1)) - Y(0, M(0))]   (0.023, 0.065)    (0.022, 0.066)
            E[Y(1, M(1)) - Y(1, M(0))]   (0.037, 0.089)    (0.036, 0.088)

Note. Results based on causal mediation analysis using the mediation package.

For example, fixing the mediator at the value observed under a zero value for the independent variable (i.e., at an average anxious attachment level), the effect of a standard deviation increase in attachment anxiety that is not mediated by negative affect amounts to a 7% increase in the probability of showing UPB. Similarly, the causal mediation effect amounts to an increase of about 5%.

This example also illustrates limitations of direct application of the mediation formula. First, the direct and indirect effect estimates from the mediation package were reported at the default levels of 1 and 0 of the independent variable but may be quite different for a unit increase at other exposure levels. In our illustration, the different estimated effects for a unit increase in the independent variable (e.g., the direct effect when changing the level of the independent variable from -1 to 0 or from 1 to 2, ... instead of the default 0 to 1) change only slightly within the relevant range of the independent variable.
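The Monte Carlo use of the mediation formula, and the dependence of the resulting risk differences on the chosen exposure levels, can be sketched as follows. This is a minimal illustration and not the algorithm implemented in the mediation package: it ignores baseline covariates and parameter uncertainty, the slopes are set to the logit estimates quoted in the text, and the intercept `I3` and residual standard deviation `SD_M` are assumed values, so the output does not reproduce Table 2.

```python
import math
import random

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Mediator model (1): M | X ~ N(I1 + A * X, SD_M^2).  I1 and SD_M are assumed.
I1, A, SD_M = 0.0, 0.34, 0.94
# Logistic outcome model (5): logit P(Y = 1 | X, M).  I3 is assumed.
I3, B, C_PRIME = -0.8, 0.618, 0.316

def counterfactual_mean(x, x_star, n_draws=20000, seed=0):
    """Monte Carlo version of the mediation formula (8): E[Y(x, M(x*))]."""
    rng = random.Random(seed)   # same seed in every call: common random numbers
    total = 0.0
    for _ in range(n_draws):
        m = rng.gauss(I1 + A * x_star, SD_M)       # draw M(x*) from the mediator model
        total += expit(I3 + C_PRIME * x + B * m)   # E[Y | X = x, M = m]
    return total / n_draws

# Natural direct and indirect effects (risk differences) for x = 1 versus x* = 0.
nde_0_to_1 = counterfactual_mean(1, 0) - counterfactual_mean(0, 0)   # Equation (6)
nie_0_to_1 = counterfactual_mean(0, 1) - counterfactual_mean(0, 0)   # Equation (7)

# The same unit increase at another reference level gives a different direct effect.
nde_2_to_3 = counterfactual_mean(3, 2) - counterfactual_mean(2, 2)
print(nde_0_to_1, nie_0_to_1, nde_2_to_3)
```

On the risk-difference scale the direct effect of a unit increase is not a single number: it changes with the reference level because the logistic curve is steeper near a probability of one half.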
In Appendix A we provide an artificial example illustrating that the estimated direct and indirect effects for such a unit increase may depend heavily on the choice of the reference level and thereby mislead the practitioner to naively conclude, based on the default settings, that there is no direct effect of the independent variable on the outcome. As the number of such choices is infinite for a continuous independent variable, the practitioner simply cannot provide a single answer on this scale to the question of whether there is a direct or indirect effect. This also makes testing for the presence of a direct effect impractical and calls for a more parsimonious parametrization. Second, the effects obtained via the mediation package are marginalized over the baseline covariates that are included as predictors in the model for the mediator and outcome, thereby ruling out the possibility to perform moderated mediation tests, which are often of interest (Preacher, Rucker, &
Hayes, 2007). Although one can request conditional or stratum-specific effects, this is impractical for continuous predictors like age because of the sparsity of the resulting strata. Moreover, the stratum-specific effects obtained from the mediation package have a more limited utility because the package does not use the stratum-specific (but rather the overall) covariate distribution to marginalize over other covariates. As covariate distributions may vary substantially over strata, such marginalized stratum-specific effects may lack interpretation (see Appendix B for an artificial example).

Adding an interaction between the independent variable and mediator in the logistic/probit model for the outcome in our example reveals some evidence of an independent variable-by-mediator interaction effect on showing UPB ($p < .05$). Using the mediation package, we find that the direct effect of anxious attachment on showing UPB is increasing with higher levels of negative affect (Table 2). Similarly, there is a significant difference of (bootstrap 95% CI to 0.035) between the mediation effect at the mean level of anxious attachment (i.e., at value 0) and at levels one standard deviation above the average. Figure 2 presents the estimated mediation effect for various levels of anxious attachment. In Table 2 mediation effects are merely presented at two specific levels of the independent variable. The default choices are 0 and 1, but these choices are rather arbitrary. Other choices can straightforwardly be specified in the mediation package. Because the conclusion of a significant independent variable-by-mediator interaction should not depend on such arbitrary choices, a single test for an independent variable-by-mediator interaction would be an important asset.

FIGURE 2. The estimated mediated effect of negative affect (on an additive scale) with 95% confidence interval at various levels of attachment anxiety, that is, $E[Y(x, M(1))] - E[Y(x, M(0))]$ versus $x$. Results obtained by the mediation package.

Third, a further concern of direct application of the mediation formula approach is that nonlinear models for the outcome (e.g., a logistic regression model) induce nonadditivity. Even if a moderation test could be developed, it would therefore likely have an inflated Type I error rate.

In summary, direct application of the mediation formula (e.g., the Monte Carlo integration method by Imai, Keele, Tingley, et al., 2010, in the mediation package) is very helpful to make the complex calculation of direct and indirect effects from the mediation formula manageable. It produces easy-to-interpret effect estimates for the practitioner, but its simplicity may be somewhat deceiving. Indeed, the user is left with arbitrary choices on the levels of the independent variable and on population-averaged or stratum-specific effects in the final reporting, making it difficult to quantify and test the direct and indirect effects, with or without interactions. The practitioner is therefore in need of a more flexible approach that can directly model the effects of interest such that results become easier to report and hypotheses of interest become easier to test. In the remainder of this article, we discuss natural effects models that can accommodate most of these limitations by directly parameterizing the causal effects of interest.

NATURAL EFFECTS MODELS

Natural effects models, first introduced by Lange et al. (2012) and Vansteelandt, Bekaert, and Lange (2012) in the epidemiological literature, are conditional mean models for nested counterfactuals $Y_i(x, M_i(x^*))$:

$$g[E\{Y_i(x, M_i(x^*)) \mid C_i\}] = \beta' W_i(x, x^*, C_i), \tag{9}$$

where $g(\cdot)$ is a link function (e.g., the identity link or probit link) and $W_i(x, x^*, C_i)$ is a known vector whose components may depend on $x$, $x^*$, and $C_i$. These models are termed natural effects models because their parameters encode both natural direct and indirect effects.
For example, from Expressions (6) and (7), it directly follows that in a linear model for a continuous counterfactual outcome $Y_i(x, M_i(x^*))$ without any interactions,

$$E\{Y_i(x, M_i(x^*)) \mid C_i\} = \beta_0 + \beta_1 x + \beta_2 x^* + \beta_3 C_i,$$

$\beta_1$ and $\beta_2$ capture the (natural) direct and indirect effect of a unit increase in the independent variable, respectively. Indeed, under the aforementioned linear natural effects model, the natural direct effect $E[Y_i(x, M_i(x^*)) - Y_i(x^*, M_i(x^*)) \mid C_i]$, as defined by Pearl (2001), equals $\beta_1 (x - x^*)$, whereas the indirect effect $E[Y_i(x^*, M_i(x)) - Y_i(x^*, M_i(x^*)) \mid C_i]$ equals $\beta_2 (x - x^*)$. The mediation package estimates these same quantities.
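As a worked check of how this linear natural effects model relates to the standard linear mediation model, one can substitute Equations (1) and (3) into the mediation formula (8). The derivation below is a sketch that assumes (A1) through (A4), correct specification of both linear models, and no independent variable-by-mediator interaction.

```latex
\begin{aligned}
E\{Y_i(x, M_i(x^*)) \mid C_i\}
  &= i_3 + b\,E[M_i \mid X_i = x^*, C_i] + c' x + f C_i \\
  &= i_3 + b\,(i_1 + a x^* + d C_i) + c' x + f C_i \\
  &= (i_3 + b\,i_1) + c' x + ab\,x^* + (f + bd)\,C_i .
\end{aligned}
```

Matching terms with the linear natural effects model gives $\beta_0 = i_3 + b\,i_1$, $\beta_1 = c'$, $\beta_2 = ab$, and $\beta_3 = f + bd$, so in the linear case the direct effect parameter is the coefficient $c'$ and the indirect effect parameter is the product of coefficients.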
Under the following logistic regression model for a binary counterfactual outcome $Y_i(x, M_i(x^*))$,

$$\text{logit}[E\{Y_i(x, M_i(x^*)) \mid C_i\}] = \beta_0 + \beta_1 x + \beta_2 x^* + \beta_3 C_i, \tag{10}$$

we find that the natural direct effect odds ratio is

$$\frac{\text{odds}\{Y_i(x, M_i(x^*)) = 1 \mid C_i\}}{\text{odds}\{Y_i(x^*, M_i(x^*)) = 1 \mid C_i\}} = \exp\{\beta_1 (x - x^*)\} \tag{11}$$

and that the natural indirect effect odds ratio is

$$\frac{\text{odds}\{Y_i(x^*, M_i(x)) = 1 \mid C_i\}}{\text{odds}\{Y_i(x^*, M_i(x^*)) = 1 \mid C_i\}} = \exp\{\beta_2 (x - x^*)\}. \tag{12}$$

Natural effects models for binary outcomes thus allow these effects to be quantified on a more natural scale (VanderWeele & Vansteelandt, 2010) than the additive scale used in the mediation package. This makes it possible to capture each of these effects by a single parameter, thereby avoiding the aforementioned arbitrariness with respect to the choices of the level of the independent variable. As a final example, consider a Poisson regression model for a count outcome $Y_i(x, M_i(x^*))$ allowing for moderated effects:

$$\log[E\{Y_i(x, M_i(x^*)) \mid C_i\}] = \beta_0 + \beta_1 x + \beta_2 x^* + \beta_3 C_i + \beta_4 x C_i + \beta_5 x^* C_i.$$

The natural direct and indirect rate ratios (i.e., $E[Y_i(x, M_i(x^*))] / E[Y_i(x^*, M_i(x^*))]$ and $E[Y_i(x^*, M_i(x))] / E[Y_i(x^*, M_i(x^*))]$, respectively) equal $\exp\{(\beta_1 + \beta_4 C_i)(x - x^*)\}$ and $\exp\{(\beta_2 + \beta_5 C_i)(x - x^*)\}$, respectively. Moderation of the direct or mediated effect can now easily be assessed by a simple test of $\beta_4 = 0$ or $\beta_5 = 0$, respectively.

Because the nested counterfactual $Y_i(x, M_i(x^*))$ is only observed when $x$ equals $x^*$, and $x^*$ corresponds to the observed level of the independent variable $X_i$, Model (9) cannot directly be fitted. A possible estimation strategy relies on the notion that even when $x$ differs from $x^*$, this counterfactual can still be predicted from $E(Y_i \mid X_i = x, M_i, C_i)$, that is, a model for the outcome given the desired level of the independent variable, the observed level of the mediator, and the observed baseline confounders.
This can be seen upon noting that $M(x^*)$ equals $M$ among participants with $x^*$ equaling the observed value of $X$. Specifically, the following procedure may be adopted (Vansteelandt et al., 2012):

1. Fit imputation model: Using the observed data, build an appropriate model for the outcome conditional on the independent variable $X$, mediator $M$, and baseline variables $C$. This model is referred to as the imputation model.
2. Impute nested counterfactuals: Create a new data set by repeating the observed data $K$ times and adding two variables: (a) $x$, which is equal to the original level of the independent variable for the first replication and equal to a different random draw from the conditional distribution of the independent variable, given $C$, for all $K - 1$ remaining replications, and (b) $x^*$, which is equal to the observed level of the independent variable. The nested counterfactual $Y(x, M(x^*))$ is predicted by the observed $Y$ when $x = x^*$ and by $E(Y \mid X = x, M, C)$ when $x \neq x^*$.

3. Fit natural effects model: Parameters of the natural effects model (9) can be estimated by regressing all imputed (observed and counterfactual) outcomes from Step (2) on $x$, $x^*$, and $C$. Standard errors and confidence intervals can be obtained using the bootstrap.

The estimators from this procedure are referred to as the simple (regression mean) imputation estimators (Vansteelandt et al., 2012). When assumptions (A1) to (A4) hold, and the imputation and natural effects model are correctly specified, these estimators are shown to be unbiased in large samples for the parameters indexing the natural effects model (Vansteelandt et al., 2012). Lange et al. (2012) and Vansteelandt et al. (2012) discussed alternative estimators based on inverse probability weighting, but these are less appealing when the mediator and/or independent variable are continuous.

Like the approach taken by the mediation package, which relies on the same set of assumptions, the natural effects model approach has advantages over traditional approaches: (a) it can deal with a wide range of mediator and outcome types and (b) it is easily implemented in standard software packages.
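The three-step imputation procedure described above can be sketched in a few lines for the simplest case. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it uses simulated data with hypothetical path values `A`, `B`, and `C_PRIME`, an identity link, no baseline covariates (so new exposure levels are drawn from the marginal distribution of the exposure), and small hand-written least-squares helpers. Bootstrap standard errors are omitted.

```python
import random

def solve_linear_system(a, b):
    """Solve a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    beta = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][k] * beta[k] for k in range(r + 1, n))
        beta[r] = (aug[r][n] - s) / aug[r][r]
    return beta

def ols(rows, y):
    """Least-squares coefficients; each row already contains a leading 1."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

rng = random.Random(2013)
n, K = 3000, 4
A, B, C_PRIME = 0.5, 0.8, 0.3   # hypothetical true paths; indirect effect A * B = 0.4
X = [rng.gauss(0, 1) for _ in range(n)]
M = [A * x + rng.gauss(0, 1) for x in X]
Y = [C_PRIME * x + B * m + rng.gauss(0, 1) for x, m in zip(X, M)]

# Step 1: fit the imputation model E(Y | X, M), here a linear model.
g0, g1, g2 = ols([[1.0, x, m] for x, m in zip(X, M)], Y)

# Step 2: replicate the data K times and impute the nested counterfactuals.
mean_x = sum(X) / n
sd_x = (sum((x - mean_x) ** 2 for x in X) / (n - 1)) ** 0.5
rows, outcomes = [], []
for x_obs, m_obs, y_obs in zip(X, M, Y):
    rows.append([1.0, x_obs, x_obs])       # first copy: x = x*, observed outcome
    outcomes.append(y_obs)
    for _ in range(K - 1):
        x_new = rng.gauss(mean_x, sd_x)    # new exposure level x
        rows.append([1.0, x_new, x_obs])   # x* stays at the observed exposure
        outcomes.append(g0 + g1 * x_new + g2 * m_obs)   # predicted Y(x, M(x*))

# Step 3: fit the natural effects model E[Y(x, M(x*))] = beta0 + beta1 x + beta2 x*.
beta0, beta1, beta2 = ols(rows, outcomes)
print(round(beta1, 2), round(beta2, 2))    # direct and indirect effect estimates
```

In this linear setting the estimates should land close to `C_PRIME` and `A * B`. For a binary outcome the same steps apply with a logistic or probit fit in Steps 1 and 3 and with predicted probabilities as imputed outcomes.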
Moreover, and in contrast to direct application of the mediation formula, the natural effects model approach (a) does not require a model for the mediator; (b) yields direct and indirect effects on the preferred (or most natural) scale; (c) renders conditional effects rather than marginal effects; and (d) allows direct effects, indirect effects, moderation effects, and independent variable-by-mediator interactions to be captured by a single or low-dimensional parameter, so that these are readily interpretable and hypotheses concerning these effects become easy to test.

The proposed imputation estimator has close connections to imputation-based strategies for G-computation (Snowden, Rose, & Mortimer, 2011) and shares their virtues and limitations. Its attractiveness lies in its simplicity and avoidance of inverse probability weights that can make alternative proposals unstable in certain situations. However, in some cases, the simplicity of the imputation estimator for the natural effects model may come at the price of so-called model incongeniality. That is, certain combinations of models for the outcome in Step (2) and natural effects models in Step (3) may be impossible. For example, if the model for a binary outcome is of the form $E(Y_i \mid X_i, M_i, C_i) = \Phi(\theta_0 + \theta_1 X_i + \theta_2 M_i + \theta_3 C_i)$, and the natural effects model $E[Y_i(x, M_i(x^*)) \mid C_i] = \Phi(\beta_0 + \beta_1 x + \beta_2 x^* + \beta_3 C_i + \beta_4 x C_i)$ is used to explore a moderation effect, the imputation model would preclude the existence of such moderation and thus bias $\beta_4$ toward zero. Similar problems are common for typical missing data procedures (Meng, 1994). Although one may typically want the natural effects model to be as parsimonious as possible to facilitate the interpretation of its parameters, it is recommended to include the terms from the natural effects model as a minimal set of predictor terms for the imputation model (with $x^*$ replaced by $M$).

ILLUSTRATING EXAMPLE

As an illustration we estimate the direct and indirect effect of anxious attachment style on perpetration of UPB using the natural effects model approach. First we assume no moderated mediation, that is,

$g[E\{Y_i(x, M_i(x^*)) \mid C_i\}] = \beta_0 + \beta_1 x + \beta_2 x^* + \beta_3 C_i$, (13)

with $g$ the logit or probit link, respectively, and where $C$ includes gender, age, and education. Following the aforementioned estimation procedure we specify an imputation model in Step (1) and use here the same predictors as in Model (13) but with $x^*$ replaced by $M$, that is, $g[E\{Y_i \mid X_i, M_i, C_i\}] = \theta_0 + \theta_1 X_i + \theta_2 M_i + \theta_3 C_i$. For Step (2) of the estimation procedure, we take for each individual three additional values of $X$ that are randomly drawn from the conditional distribution of $X_i$, given $C_i$, that is, $N(\delta_0 + \delta_1 C_i, \sigma^2)$. Corresponding R-code for the logistic natural effects model is provided in Appendix C, and results are presented in Table 3. From Model (13) with the logit link, we find that the direct and indirect effect odds ratios are given by Equations (11) and (12), respectively.
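For a continuous independent variable, Step (2) requires random draws from the conditional distribution of $X$ given $C$. The sketch below shows one way the expanded data set could be built, assuming (for illustration only) a single baseline covariate, a normal linear model for $X$ given $C$ fitted by least squares, and made-up input values. It is not the R-code of Appendix C, and the names `fit_exposure_model` and `expand` are hypothetical.

```python
import random


def fit_exposure_model(c, x):
    """Simple linear regression of X on a single covariate C (an assumption)."""
    n = len(x)
    c_bar, x_bar = sum(c) / n, sum(x) / n
    slope = (sum((ci - c_bar) * (xi - x_bar) for ci, xi in zip(c, x))
             / sum((ci - c_bar) ** 2 for ci in c))
    intercept = x_bar - slope * c_bar
    resid = [xi - intercept - slope * ci for ci, xi in zip(c, x)]
    sd = (sum(e * e for e in resid) / (n - 2)) ** 0.5  # residual standard deviation
    return intercept, slope, sd


def expand(c, x, n_draws=3, seed=2013):
    """Return rows (unit index, x, x_star) with 1 + n_draws rows per unit."""
    rng = random.Random(seed)
    intercept, slope, sd = fit_exposure_model(c, x)
    rows = []
    for i, (ci, xi) in enumerate(zip(c, x)):
        rows.append((i, xi, xi))  # first replication: x equals the observed X
        for _ in range(n_draws):
            draw = rng.gauss(intercept + slope * ci, sd)
            rows.append((i, draw, xi))  # x_star always stays at the observed X
    return rows


# Made-up illustrative values, not data from the study.
c_values = [0.0, 1.0, 2.0, 3.0, 4.0]
x_values = [0.1, 0.9, 2.2, 2.8, 4.1]
expanded = expand(c_values, x_values)
print(len(expanded))
```

Each row with $x \neq x^*$ would then receive an outcome predicted from the imputation model at $X = x$ and the unit's observed $M$ and $C$, while rows with $x = x^*$ keep the observed outcome.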
The estimates in Table 3 hence suggest that the direct and indirect effect of a standard deviation increase in attachment anxiety on the odds of showing UPB amount to odds ratios of $\exp(0.298) \approx 1.35$ (95% CI: 1.08 to 1.68) and $\exp(0.202) \approx 1.22$ (95% CI: 1.11 to 1.35), respectively. Results using the probit link are presented in Table 3. Interestingly, the estimated direct and indirect effects under the naive approach (Table 1) and the natural effects model estimator are very close here for both link functions. As noted before by Vansteelandt et al. (2012) and Muthén (2011), when a linear model is assumed for the mediator and a probit model for the outcome, that is,

$E[M_i \mid X_i, C_i] = \alpha_0 + \alpha_1 X_i + \alpha_2 C_i$
$E[Y_i \mid X_i, M_i, C_i] = \Phi(\theta_0 + \theta_1 X_i + \theta_2 M_i + \theta_3 C_i)$,

with $\Phi$ the cumulative normal distribution function, it follows from Expressions (6) and (7) that the causal direct and indirect effect on the probit scale equal $\theta_1 / \sqrt{1 + \theta_2^2 \sigma_M^2}$ and $\alpha_1 \theta_2 / \sqrt{1 + \theta_2^2 \sigma_M^2}$, with $\sigma_M^2$ the (residual) variance of $M$. Hence, when $\theta_2$ and/or $\sigma_M^2$ are small, one will indeed find close estimators (in the illustration $\hat{\theta}_2 \approx 0.375$ and $\hat{\sigma}_M^2 \approx 0.884$). Such closed form expressions are not available when the logit link is used, however.

TABLE 3
The Direct and Indirect Effect of Attachment Anxiety on Unwanted Pursuit Behavior (With 95% CI Based on Bootstrap)

                                                Logit link          Probit link
No interactions
  $\beta_1$                                     (0.077, 0.520)      (0.032, 0.307)
  $\beta_2$                                     (0.102, 0.302)      (0.068, 0.189)
Independent variable-by-mediator interaction
  $\beta'_1$                                    (0.067, 0.502)      (0.045, 0.323)
  $\beta'_2$                                    (0.100, 0.295)      (0.054, 0.173)
  $\beta'_4$                                    (-0.041, 0.136)     (-0.028, 0.080)

Note. Analysis based on natural effects models: parameter estimates with 95% CI.

In other circumstances the natural direct and indirect effects may differ substantially from the estimated effects under the naive approach. For instance, when the mediator is dichotomized in our illustrating example at the median value of attachment level, a logistic regression analysis shows that a standard deviation increase in anxious attachment increases the odds of high versus low negative affect by a factor $\exp(0.63) = 1.88$ (95% CI from 1.50 to 2.34, $p < .001$). A logistic regression for the outcome now shows that a standard deviation increase in anxious attachment increases the odds of showing UPB by a factor $\exp(0.37) = 1.45$ (95% CI from 1.15 to 1.83, $p < .001$), whereas the odds of showing UPB are $\exp(0.92) = 2.52$ (95% CI from 1.60 to 3.96, $p < .001$) times higher in the high negative affect group than in the low negative affect group. The product of coefficients method hence amounts to an increase of 0.59 on the log odds ratio scale, in contrast to 0.13 (95% CI from 0.06 to 0.22) under the logistic natural effects model.
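The arithmetic behind these comparisons is easy to reproduce. The sketch below evaluates the probit-scale denominator $\sqrt{1 + \theta_2^2 \sigma_M^2}$ at the values reported for the illustration, back-transforms the logit-scale estimates to odds ratios, and forms the product of coefficients for the dichotomized mediator from the rounded coefficients quoted in the text. The function names are hypothetical, and the symbols $\alpha_1$, $\theta_1$, $\theta_2$ simply label the mediator and outcome model coefficients as assumed here.

```python
import math


def probit_scale_factor(theta2, sigma2_m):
    """Denominator sqrt(1 + theta2^2 * sigma_M^2) of the probit-scale effects."""
    return math.sqrt(1.0 + theta2 ** 2 * sigma2_m)


def probit_natural_effects(alpha1, theta1, theta2, sigma2_m):
    """Natural direct and indirect effects on the probit scale."""
    scale = probit_scale_factor(theta2, sigma2_m)
    return theta1 / scale, alpha1 * theta2 / scale


# Values reported in the text: theta2 about 0.375, residual variance about 0.884.
scale = probit_scale_factor(0.375, 0.884)
print(round(scale, 3))

# Log odds ratios from the logistic natural effects model, back-transformed.
print(round(math.exp(0.298), 2), round(math.exp(0.202), 2))

# Product of coefficients for the dichotomized mediator, from rounded values.
print(round(0.63 * 0.92, 2))
```

The scale factor is about 1.06, so the naive probit coefficients are shrunk by only about 6%, which is why the two sets of estimates nearly agree. The product of the rounded coefficients is about 0.58, in line with the reported 0.59 up to rounding of the inputs.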
Next we explore the independent variable-by-mediator interaction, that is,

$g[E\{Y_i(x, M_i(x^*)) \mid C_i\}] = \beta'_0 + \beta'_1 x + \beta'_2 x^* + \beta'_3 C_i + \beta'_4 x x^*$, (14)

with $g$ the logit or probit link. We now let the imputation model in Step (1) of the estimation procedure have an independent variable-by-mediator interaction term. The natural effects Model (14) naturally enables one to assess whether the effect of attachment anxiety on showing UPB that is mediated by negative affect (now again considered as a continuous variable) depends on the level of attachment anxiety. This can be done by simply testing the hypothesis $\beta'_4 = 0$. Interestingly, based on Equation (14), there is no evidence for a significant independent variable-by-mediator interaction on showing UPB (Table 3). Although previously a significant difference in mediation effect at two different levels of the independent variable (i.e., the average and one standard deviation above the average, respectively) was found using the mediation package, note that this was testing for interaction on the additive (i.e., risk difference) scale rather than the logistic scale. The lack of interaction on the logistic scale may point toward the greater aptness of this scale.

As in most mediation analysis approaches, assumptions (A1) to (A4) are also critical for the natural effects model approach. One must control not only for variables that confound the independent variable-mediator and independent variable-outcome relationships but also for the variables that confound the mediator-outcome relationship. Even when the independent variable is randomized, the mediator-outcome relationship may be confounded, and so variables that potentially confound the latter deserve special attention. In practice, it is often not feasible to collect data on all these variables, and it is thus important to develop tools to assess the sensitivity of one's results to unmeasured confounding variables. Under the setting of Figure 3, where $U$ and $X$ are uncorrelated given $C$, VanderWeele (2010) derived easy-to-calculate bias formulas for the natural direct effect risk ratio.
FIGURE 3 Mediational causal model underlying the sensitivity analysis with baseline covariates $C$, independent variable $X$, mediator $M$, outcome $Y$, and unmeasured confounder $U$.

More specifically, it can be shown (Theorem 6, VanderWeele, 2010) that under the simplifying assumptions that $U$ is binary, and if $E[Y \mid X = x, M = m, C = c, U = 1] / E[Y \mid X = x, M = m, C = c, U = 0]$ is constant across strata for $X$ and $C$ at fixed $M$ and equal to $\gamma$, and $E[Y \mid X = x, M = m, C = c, U = 0] / E[Y \mid X = x, M = m', C = c, U = 0] = 1$, then with $P(U = 1 \mid X = x, C = c) = \pi_x$ and $P(U = 1 \mid X = x^*, C = c) = \pi_{x^*}$, the bias for the natural direct effect risk ratio reduces to

$$\frac{1 + (\gamma - 1)\pi_x}{1 + (\gamma - 1)\pi_{x^*}}.$$

When the outcome is rare at all levels of the independent variable $X$, the mediator $M$, the baseline covariates $C$, and the unmeasured confounder $U$, this bias formula for the risk ratio can also be used for the odds ratio. When the outcome is not rare, such an approximation may deviate considerably from the true bias, and additional assumptions on unobserved quantities need to be invoked, making the sensitivity analysis more cumbersome (VanderWeele & Arah, 2011).

This sensitivity analysis furthermore assumes that there are no effects of the independent variable $X$ that affect both $M$ and $Y$ (i.e., assumption (A4) holds). A straightforward way to explore this assumption is (a) to regress such a possible measured confounder $L$ on $X$ and $C$ to examine the conditional association between $L$ and $X$ given $C$; (b) to regress the mediator $M$ on $L$, $X$, and $C$ and to conduct a test to see whether $M$ and $L$ are associated even after conditioning on $X$ and $C$; and (c) to regress the outcome $Y$ on $X$, $M$, $L$, and $C$ and to conduct a test to see whether $Y$ and $L$ are associated even after conditioning on $X$, $M$, and $C$. However, it should be noted that the failure to reject these null hypotheses of no association does not necessarily imply the absence of such a confounder.
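The bias formula for the natural direct effect risk ratio lends itself to a simple sensitivity table. The sketch below evaluates the bias factor over a small grid of sensitivity parameters and, assuming the bias is defined on the ratio scale as the observed estimate divided by the true effect, divides an estimate by it. All numerical inputs are hypothetical values chosen for illustration, and the function names `nde_rr_bias` and `corrected_rr` are not taken from any package.

```python
def nde_rr_bias(gamma, pi_x, pi_x_star):
    """Bias factor (1 + (gamma - 1) * pi_x) / (1 + (gamma - 1) * pi_x_star)."""
    if gamma <= 0:
        raise ValueError("gamma is a ratio of expectations and must be positive")
    for p in (pi_x, pi_x_star):
        if not 0.0 <= p <= 1.0:
            raise ValueError("prevalences of U must lie in [0, 1]")
    return (1.0 + (gamma - 1.0) * pi_x) / (1.0 + (gamma - 1.0) * pi_x_star)


def corrected_rr(observed_rr, gamma, pi_x, pi_x_star):
    """Observed risk ratio divided by the bias factor (ratio-scale assumption)."""
    return observed_rr / nde_rr_bias(gamma, pi_x, pi_x_star)


# Hypothetical sensitivity grid: effect of U on Y and prevalence of U at x.
observed = 1.35  # illustrative value only; valid as an odds ratio if Y is rare
for gamma in (1.5, 2.0, 3.0):
    for pi_x in (0.4, 0.5, 0.6):
        rr = corrected_rr(observed, gamma, pi_x, pi_x_star=0.4)
        print(gamma, pi_x, round(rr, 3))
```

When the prevalence of $U$ is the same at $x$ and $x^*$, or when $\gamma = 1$, the bias factor equals 1 and the estimate is left unchanged, which matches the intuition that $U$ then does not confound the comparison.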
When assumption (A4) is violated, the natural direct effect is typically no longer identified (Avin, Shpitser, & Pearl, 2005), but one could rely on G-computation (Robins, 1986), G-estimation (Goetgeluk, Vansteelandt, & Goetghebeur, 2009; Vansteelandt, 2012; Loeys et al., in press), or inverse probability weighting (Coffman & Zhong, 2012) for estimation of the controlled direct effect.

DISCUSSION

Statistical mediation forms an important tool in the behavioral sciences but has primarily been confined to continuous mediators and outcomes. Although variables are often inherently categorical, mediation analysis with such data has been largely ignored (Hayes & Preacher, 2010). For a binary mediator or outcome, MacKinnon and Dwyer (1993) addressed the complication that the scale in logistic regression is not constant across models, unlike in the case of linear models. Although their work is frequently applied, the causal interpretability of their estimator based on standardized coefficients can be problematic in some situations. The recent work of Imai, Keele, and Tingley (2010) undoubtedly forms an important milestone for the mediation literature in the social sciences. Under the aforementioned assumptions (A1) through (A4), causal direct and indirect effects can now be easily obtained in settings with essentially any type of mediator and outcome. Moreover, the availability of the mediation package (Imai, Keele, Tingley, et al., 2010) makes the approach very accessible to practitioners.

Although this approach has many strengths, we also highlighted a few weaknesses in this article. Some of these limitations can easily be overcome through the proposed natural effects model approach (Vansteelandt et al., 2012), which enables parsimonious modeling of the same direct and indirect effect measures without relying on the existence of closed-form expressions as in other software (Muthén, 2011). Although we focused the illustration on the setting where the mediator is continuous and the outcome is binary, the approach outlined in this article is much more generally applicable. It can easily be implemented in standard software packages, and an R-package is currently under development for the practitioner's convenience.

The utility of the approach we described here is predicated on proper model specification. As with any statistical model, improper model specification can lead to spurious results and misleading conclusions, and our proposal forms no exception. The imputation-based estimator that we presented here relies on two models: the imputation model and the natural effects model for the nested counterfactual outcome.
As for the widely adopted multiple imputation strategies for the analysis of incomplete data, the model for the observed data ("the imputation model") is not the model of scientific interest ("the analysis model"), and this can lead to so-called model incongeniality (Meng, 1994). This is most likely to occur if the imputation model is simpler than the natural effects model and, moreover, misspecified. Sensibly using all available information has therefore been a key guideline in practice for constructing imputation models, and this has been repeatedly emphasized in the multiple imputation literature (Meng, 1994). We thus favor here a rich imputation model and a parsimonious natural effects model that allows answering the researcher's main questions in a transparent way. We believe that by following this guideline, model incongeniality may have limited impact and is much less of a concern than in typical missing data settings where the imputation relies on the high-dimensional distribution of outcomes and covariates rather than just the outcome mean. Model incongeniality can in particular be avoided when the independent variable is dichotomous and randomly assigned, for then covariates $C$ may be excluded from the natural effects model so that a saturated model can be used.

An aspect of the estimation procedure that deserves further exploration is the type and number of imputations $K$ in Step (2) that lead to efficient estimators within reasonable computation time. When the number of possible levels of the independent variable is small, one can simply repeat the observed data as many times as the number of levels. Such an approach does not work for a continuous independent variable, but preliminary simulation studies (Vansteelandt et al., 2012) do not show major differences between possible sampling strategies (e.g., percentile-based or random draws from the conditional distribution). Note that misspecification of the model for the exposure given the baseline covariates does not induce bias but at most a loss of precision.

Finally, the natural effects model approach outlined in this article relies on assumptions (A1) to (A4). Although assumptions (A1) and (A2) are met in randomized studies, sensitivity analyses have mostly focused so far on violations of assumption (A3). Note, however, that several different sensitivity analysis approaches to handle violations of assumption (A4) have started to develop too. For a fairly simple setting, Imai and Yamamoto (2013) proposed parametric sensitivity analyses for linear models, which require data on the exposure-induced mediator-outcome confounder. Tchetgen Tchetgen and Shpitser (2012) and VanderWeele and Chiba (in press) proposed more general nonparametric techniques that do not require data on the exposure-induced mediator-outcome confounder but require specifying a large number of unidentified sensitivity analysis parameters. Vansteelandt and VanderWeele (2012) described a technique that requires data on the exposure-induced mediator-outcome confounder and moreover requires specifying an unidentified selection bias function. Although this function can be difficult to interpret in practice, their approach does have the advantage that the selection bias function is zero in the absence of three-way interactions between the exposure, mediator, and exposure-induced confounder.
ACKNOWLEDGMENT

Tom Loeys, Beatrijs Moerkerke, Johan Steen, and Stijn Vansteelandt thank the Flemish Research Council for financial support (Grant G ).

REFERENCES

Avin, C., Shpitser, I., & Pearl, J. (2005). Identifiability of path-specific effects. In L. P. Kaelbling & A. Saffiotti (Eds.), Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence IJCAI-05 (pp ). Edinburgh, UK: Morgan-Kaufmann.

Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic and statistical considerations. Journal of Personality and Social Psychology, 51,
More informationCausal Inference for Mediation Effects
Causal Inference for Mediation Effects by Jing Zhang B.S., University of Science and Technology of China, 2006 M.S., Brown University, 2008 A Dissertation Submitted in Partial Fulfillment of the Requirements
More informationParametric and Non-Parametric Weighting Methods for Mediation Analysis: An Application to the National Evaluation of Welfare-to-Work Strategies
Parametric and Non-Parametric Weighting Methods for Mediation Analysis: An Application to the National Evaluation of Welfare-to-Work Strategies Guanglei Hong, Jonah Deutsch, Heather Hill University of
More informationCAUSAL MEDIATION ANALYSIS FOR NON-LINEAR MODELS WEI WANG. for the degree of Doctor of Philosophy. Thesis Adviser: Dr. Jeffrey M.
CAUSAL MEDIATION ANALYSIS FOR NON-LINEAR MODELS BY WEI WANG Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Thesis Adviser: Dr. Jeffrey M. Albert Department
More informationRatio-of-Mediator-Probability Weighting for Causal Mediation Analysis. in the Presence of Treatment-by-Mediator Interaction
Ratio-of-Mediator-Probability Weighting for Causal Mediation Analysis in the Presence of Treatment-by-Mediator Interaction Guanglei Hong Jonah Deutsch Heather D. Hill University of Chicago (This is a working
More informationSIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS
SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS TOMMASO NANNICINI universidad carlos iii de madrid UK Stata Users Group Meeting London, September 10, 2007 CONTENT Presentation of a Stata
More informationStructural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall
1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Dept. of Biostatistics, Duke University Medical Joint work
More informationMediation: Background, Motivation, and Methodology
Mediation: Background, Motivation, and Methodology Israel Christie, Ph.D. Presentation to Statistical Modeling Workshop for Genetics of Addiction 2014/10/31 Outline & Goals Points for this talk: What is
More informationLogistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015
Logistic regression: Why we often can do what we think we can do Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 1 Introduction Introduction - In 2010 Carina Mood published an overview article
More informationExtending causal inferences from a randomized trial to a target population
Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2004 Paper 155 Estimation of Direct and Indirect Causal Effects in Longitudinal Studies Mark J. van
More informationStatistical Analysis of Causal Mechanisms
Statistical Analysis of Causal Mechanisms Kosuke Imai Princeton University April 13, 2009 Kosuke Imai (Princeton) Causal Mechanisms April 13, 2009 1 / 26 Papers and Software Collaborators: Luke Keele,
More informationSelection on Observables: Propensity Score Matching.
Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017
More informationA Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,
A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type
More informationEPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7
Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review
More informationAbstract Title Page. Title: Degenerate Power in Multilevel Mediation: The Non-monotonic Relationship Between Power & Effect Size
Abstract Title Page Title: Degenerate Power in Multilevel Mediation: The Non-monotonic Relationship Between Power & Effect Size Authors and Affiliations: Ben Kelcey University of Cincinnati SREE Spring
More informationWooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems
Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Functional form misspecification We may have a model that is correctly specified, in terms of including
More informationNew developments in structural equation modeling
New developments in structural equation modeling Rex B Kline Concordia University Montréal Set A: SCM A1 UNL Methodology Workshop A2 A3 A4 Topics o Graph theory o Mediation: Design Conditional Causal A5
More informationDiscussion of Papers on the Extensions of Propensity Score
Discussion of Papers on the Extensions of Propensity Score Kosuke Imai Princeton University August 3, 2010 Kosuke Imai (Princeton) Generalized Propensity Score 2010 JSM (Vancouver) 1 / 11 The Theme and
More informationy response variable x 1, x 2,, x k -- a set of explanatory variables
11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate
More informationSemiparametric Generalized Linear Models
Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student
More informationTelescope Matching: A Flexible Approach to Estimating Direct Effects *
Telescope Matching: A Flexible Approach to Estimating Direct Effects * Matthew Blackwell Anton Strezhnev August 4, 2018 Abstract Estimating the direct effect of a treatment fixing the value of a consequence
More information8 Configural Moderator Models
This is a chapter excerpt from Guilford Publications. Advances in Configural Frequency Analysis. By Alexander A. von Eye, Patrick Mair, and Eun-Young Mun. Copyright 2010. 8 Configural Moderator Models
More informationUnpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies
Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies Kosuke Imai Princeton University January 23, 2012 Joint work with L. Keele (Penn State)
More informationHarvard University. Harvard University Biostatistics Working Paper Series
Harvard University Harvard University Biostatistics Working Paper Series Year 2015 Paper 192 Negative Outcome Control for Unobserved Confounding Under a Cox Proportional Hazards Model Eric J. Tchetgen
More informationChapter 1 Statistical Inference
Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations
More informationMixed- Model Analysis of Variance. Sohad Murrar & Markus Brauer. University of Wisconsin- Madison. Target Word Count: Actual Word Count: 2755
Mixed- Model Analysis of Variance Sohad Murrar & Markus Brauer University of Wisconsin- Madison The SAGE Encyclopedia of Educational Research, Measurement and Evaluation Target Word Count: 3000 - Actual
More informationAssessing In/Direct Effects: from Structural Equation Models to Causal Mediation Analysis
Assessing In/Direct Effects: from Structural Equation Models to Causal Mediation Analysis Part 1: Vanessa Didelez with help from Ryan M Andrews Leibniz Institute for Prevention Research & Epidemiology
More informationA Distinction between Causal Effects in Structural and Rubin Causal Models
A istinction between Causal Effects in Structural and Rubin Causal Models ionissi Aliprantis April 28, 2017 Abstract: Unspecified mediators play different roles in the outcome equations of Structural Causal
More informationMULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS Page 1 MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level
More informationResearch Design: Causal inference and counterfactuals
Research Design: Causal inference and counterfactuals University College Dublin 8 March 2013 1 2 3 4 Outline 1 2 3 4 Inference In regression analysis we look at the relationship between (a set of) independent
More informationWhen Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?
When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University
More informationUnpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies
Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies Kosuke Imai Princeton University February 23, 2012 Joint work with L. Keele (Penn State)
More informationInterpreting and using heterogeneous choice & generalized ordered logit models
Interpreting and using heterogeneous choice & generalized ordered logit models Richard Williams Department of Sociology University of Notre Dame July 2006 http://www.nd.edu/~rwilliam/ The gologit/gologit2
More informationTargeted Maximum Likelihood Estimation in Safety Analysis
Targeted Maximum Likelihood Estimation in Safety Analysis Sam Lendle 1 Bruce Fireman 2 Mark van der Laan 1 1 UC Berkeley 2 Kaiser Permanente ISPE Advanced Topics Session, Barcelona, August 2012 1 / 35
More informationComparing IRT with Other Models
Comparing IRT with Other Models Lecture #14 ICPSR Item Response Theory Workshop Lecture #14: 1of 45 Lecture Overview The final set of slides will describe a parallel between IRT and another commonly used
More informationTelescope Matching: A Flexible Approach to Estimating Direct Effects
Telescope Matching: A Flexible Approach to Estimating Direct Effects Matthew Blackwell and Anton Strezhnev International Methods Colloquium October 12, 2018 direct effect direct effect effect of treatment
More informationPropensity Score Matching
Methods James H. Steiger Department of Psychology and Human Development Vanderbilt University Regression Modeling, 2009 Methods 1 Introduction 2 3 4 Introduction Why Match? 5 Definition Methods and In
More informationJournal of Biostatistics and Epidemiology
Journal of Biostatistics and Epidemiology Methodology Marginal versus conditional causal effects Kazem Mohammad 1, Seyed Saeed Hashemi-Nazari 2, Nasrin Mansournia 3, Mohammad Ali Mansournia 1* 1 Department
More information8 Nominal and Ordinal Logistic Regression
8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on
More informationIdentification and Estimation of Causal Mediation Effects with Treatment Noncompliance
Identification and Estimation of Causal Mediation Effects with Treatment Noncompliance Teppei Yamamoto First Draft: May 10, 2013 This Draft: March 26, 2014 Abstract Treatment noncompliance, a common problem
More informationEXAMINATION: QUANTITATIVE EMPIRICAL METHODS. Yale University. Department of Political Science
EXAMINATION: QUANTITATIVE EMPIRICAL METHODS Yale University Department of Political Science January 2014 You have seven hours (and fifteen minutes) to complete the exam. You can use the points assigned
More informationSEM REX B KLINE CONCORDIA D. MODERATION, MEDIATION
ADVANCED SEM REX B KLINE CONCORDIA D1 D. MODERATION, MEDIATION X 1 DY Y DM 1 M D2 topics moderation mmr mpa D3 topics cpm mod. mediation med. moderation D4 topics cma cause mediator most general D5 MMR
More information