Estimating Post-Treatment Effect Modification With Generalized Structural Mean Models


Alisa Stephens, Luke Keele, Marshall Joffe
December 5, 2013

Abstract

In randomized controlled trials, the evaluation of an overall treatment effect is often followed by effect modification or subgroup analyses, where the possibility of a different magnitude or direction of effect for varying values of a covariate is explored. While studies of effect modification are typically restricted to pretreatment covariates, longitudinal experimental designs permit the examination of treatment effect modification by intermediate outcomes, where intermediates are measured after treatment but before the final outcome. We present a generalized structural mean model (GSMM) for analyzing treatment effect modification by posttreatment covariates. The model can accommodate posttreatment effect modification with both full compliance and noncompliance to assigned treatment status. The methods are evaluated using a simulation study that demonstrates that our approach retains unbiased estimation of effect modification by intermediate variables which are affected by treatment and also predict outcomes. We illustrate the method using a randomized trial designed to promote re-employment through teaching skills to enhance self-esteem and inoculate job seekers against setbacks in the job search process. Our analysis provides some evidence that the intervention was much less successful among subjects that displayed higher levels of depression at intermediate posttreatment waves of the study.

For helpful comments and suggestions, we thank Stijn Vansteelandt, Eric Tchetgen Tchetgen, and Teppei Yamamoto.
Department of Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, 624 Blockley Hall, 423 Guardian Drive, Philadelphia, PA 19104, alisaste@mail.med.upenn.edu
Associate Professor, 211 Pond Lab, Penn State University, University Park, PA 16802, ljk20@psu.edu
Department of Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, 602 Blockley Hall, 423 Guardian Drive, Philadelphia, PA 19104

1 Treatment Effects That Vary With Covariates

1.1 Posttreatment Effect Modification and Adaptive Treatment Strategies

In a randomized study of treatment effects, effects may be heterogeneous due to effect modification. Effect modification is an interaction between a treatment and a pretreatment covariate such that the average treatment effect varies across values of the prerandomization covariate. While analyses with effect modification by a pretreatment covariate are relatively common, it is also possible for effect modification to occur as a function of a posttreatment covariate. In many randomized studies, data on intermediate covariates, defined as variables measured post-intervention but prior to the study endpoint, are collected from subjects. That is, after treatment, intermediate measures are collected at time intervals such as six months, one year, and 18 months later, whereas the outcome may be measured at two years. In such designs, we may suspect that the treatment effect varies across levels of a covariate measured after the treatment but before the final outcome.

Identification of posttreatment effect modification may be a useful tool in the development of adaptive treatment strategies. Under an adaptive treatment strategy, the treatment level and type are adjusted according to individual-level characteristics (Murphy 2005; Lavori and Dawson 2000; Almirall et al. 2012). The design of adaptive treatment strategies requires choosing tailoring variables, that is, variables that are used to decide how to adapt the treatment to specific individuals. Posttreatment effect modification provides one method for identification of tailoring variables. If the effect of a treatment varies across levels of a postrandomization variable, this would suggest that this covariate may be a good tailoring variable.
Thus models where posttreatment covariates are allowed to modify causal effect estimates could be used for further actions within a study or to tailor clinical decision-making. Posttreatment effect modification may also be useful for generating hypotheses that can be tested in future trials. Before considering how one might identify and estimate models with posttreatment effect modification, it is helpful to consider a motivating example.

1.2 Motivating Example: The Job Search Intervention Study

The Job Search Intervention Study (Vinokur et al. 1995) was a randomized trial conducted in the U.S. to study job training effectiveness and the prevention of poor mental health. The goal was to promote re-employment among unemployed workers through training that focused not only on specific career skills but also on teaching active learning skills to enhance self-esteem and inoculate job seekers against setbacks in the employment search process. For the approximately 1,800 unemployed participants who enrolled, the study investigators collected baseline characteristics on education, income, sex, age, occupation, race, risk for failure, level of economic hardship, and depression level, which was measured with a subscale of 11 items based on the Hopkins Symptom Checklist. Participants were randomized either to participate in five four-hour sessions in groups of 15 to 20 led by two group leaders or to be mailed a booklet that contained information on job-search methods. In the study, the treatment condition was only available to individuals assigned to the intervention. A substantial percentage (46%) of those assigned to the intervention failed to participate in the training seminars. After the training sessions, outcomes were measured at six weeks, six months, and two years. One outcome of interest is a binary indicator for whether subjects were employed 20 or more hours per week at the two-year follow-up. The intent-to-treat analysis reveals that the odds ratio of success in the treatment arm as compared to the control arm is 1.43 with a 95% confidence interval of (1.06, 1.95).
An IV analysis based on the double-logistic GSMM (Vansteelandt and Goetghebeur 2003), which accounts for noncompliance among the treated, shows that the odds ratio of success for participating in the job training seminars versus not participating is 1.73 with corresponding 95% confidence interval (1.11, 2.73). This analysis indicates that for subjects assigned to the intervention the odds of re-employment increased, and for those subjects who complied with the intervention the odds of re-employment increased further.

In this paper, we investigate the possibility of causal effect modification by a posttreatment covariate in the JOBS II study. Specifically, our aim in this analysis is to assess whether the causal effect of the JOBS II intervention was modified by depression levels at the initial two follow-up periods. Why might we expect the treatment effect to differ at levels of a posttreatment covariate in the JOBS II study? While the intervention was designed to decrease depression through teaching coping skills, it also made participants aware of the mental health aspects of the job search process, which may alter the effect of the treatment. Specifically, the treatment encouraged participants to avoid depressive feelings in the job search process. However, this awareness combined with early failure at re-employment may only serve to increase feelings of failure and depression, which may then lower the odds of re-employment at later follow-up periods.

Under an adaptive treatment strategy, investigators may wish to identify post-randomization factors early in the follow-up period to allow for the tailoring of subsequent interventions. For example, if at an early stage the job training appears to be ineffective among subjects with high levels of posttreatment depression, investigators can design future interventions for this sub-population. Thus the identification of posttreatment effect modifiers can be used to alter the course of future treatments.

In this paper, we modify generalized structural mean models (GSMMs) for binary outcomes to estimate the posttreatment modification of causal treatment effects, where the causal quantity of interest is the causal odds ratio. GSMMs for binary outcomes are semiparametric two-stage models composed of a structural model and an association model (Vansteelandt and Goetghebeur 2003). The structural model is of primary concern as it is used to estimate the causal effect and, as we demonstrate, the amount of effect modification.
The association model, which is a model for expected outcomes given observed exposure and covariates, is needed for estimation of the structural model parameters. Such models are robust to model misspecification through special weight functions in the estimating equations.

Our paper has the following structure. Section 2 outlines our notation, and Section 3 describes the generalized structural mean model (GSMM) for the causal odds ratio and introduces the two-parameter GSMM that allows for posttreatment effect modification. In Section 4, we detail the estimation procedure. Section 5 evaluates the properties of the two-parameter logistic GSMM for posttreatment effect modification through a simulation study. Section 6 presents estimates of posttreatment effect modification of causal effects in JOBS II. Section 7 includes discussion and concluding remarks.

2 Notation for Treatment Effects

In this section, we begin with a basic causal model before outlining extensions to both posttreatment effect modification and noncompliance. Since the primary outcome of interest in JOBS II is binary, we focus on estimation of the causal odds ratio.

2.1 Simple Treatment Effects

In JOBS II, subjects ($i = 1, \ldots, n$) are randomly assigned to either treatment ($R_i = 1$) or control ($R_i = 0$). Covariates $X_i = (X_1, \ldots, X_k)$ are measured at baseline prior to randomization. As stated in Section 1.2, investigators measured a binary outcome $Y_i$, which was an indicator of employment of 20 hours or more per week. One popular way to define causal effects is in terms of counterfactual or potential outcomes (Neyman 1923; Rubin 1978; Holland 1986). Under the potential outcomes framework, we augment observed outcomes with a set of potential outcomes $Y_{i,r}$, where $Y_{i,0}$ indicates a treatment-free response that would have been observed if, perhaps counter to fact, subject $i$ had not received treatment. We can define the treatment effect of $R_i$ on binary $Y_i$ through the contrast

$$\operatorname{logit} P[Y_i = 1 \mid R_i = 1, X_i] - \operatorname{logit} P[Y_{i,0} = 1 \mid R_i = 1, X_i].$$

Next we parametrize this contrast as

$$\operatorname{logit} P[Y_i = 1 \mid R_i = 1, X_i] - \operatorname{logit} P[Y_{i,0} = 1 \mid R_i = 1, X_i] = \eta_s(X_i; \psi_0) = \psi_0,$$

so that $\exp(\psi_0)$ is the causal odds ratio.

Identification of the causal parameter $\psi_0$ requires three assumptions. First, there must be independence of the treatment assignment and potential outcomes. In JOBS II, this assumption is justified by the randomized design. Formally, this restriction can be expressed as

$$Y_{i,r} \perp\!\!\!\perp R_i. \tag{1}$$

Second, we require consistency of the observed outcome $Y_i$ for the counterfactual outcome $Y_{i,r}$ under $R_i = r$. Third, we require that the stable unit treatment value assumption (SUTVA) holds, which implies that a subject's potential outcome is not affected by other subjects' exposures (Rubin 1986). Under these assumptions, standard logistic regression conditioning on $R_i$ and $X_i$, with some minor adjustments, may be used to obtain an unbiased estimate of $\psi_0$ (Freedman 2008). In this model, the log odds of success for subjects who were assigned to the training program ($R_i = 1$) would increase by $\psi_0$ relative to subjects who were assigned to control ($R_i = 0$).

We can alter the parameterization so that $\eta_s(X_i; \psi_0) = \psi_{01} + \psi_{02} X_i$, which allows for the possibility that the causal odds ratio varies over levels of $X_i$. Such a parameterization allows for effect modification by pretreatment covariates. Logistic regression via maximum likelihood may still be used to estimate $\psi_0 = (\psi_{01}, \psi_{02})$ when $\eta_s(X_i; \psi_0)$ is restricted to pretreatment covariates. In the evaluation of effect modification by posttreatment covariates, one might be tempted to enter posttreatment variables into $\eta_s(X_i; \psi_0)$ and again estimate $\psi_0$ through logistic regression. Rosenbaum (1984) demonstrates, however, that conditioning on posttreatment variables may result in biased estimates of the causal parameter $\psi_0$ in standard regression models. Estimation of posttreatment effect modification therefore necessitates the use of methods that properly account for conditioning on posttreatment quantities.
In the next section, we define a GSMM for unbiased estimation of the causal odds ratio under effect modification by posttreatment variables.
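
As a small illustration of the pretreatment parameterization $\eta_s(X_i; \psi_0) = \psi_{01} + \psi_{02} X_i$ described above, the following Python sketch shifts a treatment-free success probability on the logit scale and reports the implied causal odds ratio at each covariate level. The function names and the numerical values of $\psi$ and the baseline probability are hypothetical, chosen only for illustration; they are not estimates from JOBS II.

```python
import math


def expit(v: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))


def logit(p: float) -> float:
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def causal_odds_ratio(psi1: float, psi2: float, x: float) -> float:
    """Odds ratio exp(psi1 + psi2 * x) implied by eta_s(x; psi) = psi1 + psi2 * x."""
    return math.exp(psi1 + psi2 * x)


def treated_probability(p0: float, psi1: float, psi2: float, x: float) -> float:
    """Shift a treatment-free probability p0 by eta_s on the logit scale."""
    return expit(logit(p0) + psi1 + psi2 * x)


# Hypothetical values, for illustration only (not estimates from the paper).
psi1, psi2, p0 = 0.5, -0.5, 0.35
for x in (0, 1):
    p1 = treated_probability(p0, psi1, psi2, x)
    print(f"x={x}: odds ratio={causal_odds_ratio(psi1, psi2, x):.3f}, "
          f"P(Y=1) under treatment={p1:.3f}")
```

With these illustrative values the effect is confined to the stratum with $x = 0$, because $\psi_{01} + \psi_{02} = 0$ gives an odds ratio of one when $x = 1$.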

3 Causal models for posttreatment effect modification

3.1 Modeling posttreatment effect modification under full compliance

To evaluate modification of the causal odds ratio of obtaining employment of 20 hours or more for having received versus not receiving the JOBS training intervention by intermediate variables, we use the GSMM (Vansteelandt and Goetghebeur 2003). For the moment, we consider modeling posttreatment effect modification without accounting for noncompliance. We define potential posttreatment effect modifiers, $S_i$, as the set of intermediate covariates observed after treatment but prior to the outcome; $S_i$ may be multivariate. The structural model is defined by

$$\operatorname{logit} P[Y_i = 1 \mid S_i, R_i = 1, X_i] - \operatorname{logit} P[Y_{i,0} = 1 \mid S_i, R_i = 1, X_i] = \eta_s(S_i, X_i; \psi_0) = f_1(\psi_{01}) + f_2(S_i; \psi_{02}) \tag{2}$$

for arbitrary functions $f_1(\cdot)$ and $f_2(\cdot)$ that are known up to an unknown $p$-dimensional parameter $\psi_0 = (\psi_{01}, \psi_{02})$, where $p = \dim(\psi_{01}) + \dim(\psi_{02})$. For the structural model in (2), when $\psi_0 = 0$ or $R_i = 0$ we set $f_1(\psi_{01}) = f_2(S_i; \psi_{02}) = 0$, and for $S_i = 0$ or $\psi_{02} = 0$ we set $f_2(S_i; \psi_{02}) = 0$. Equation (2) is an example of a retrospective structural mean model, which is characterized by exposure effects defined conditional on variables observed subsequent to treatment. Unlike standard structural mean models (Robins 1994), which follow a decision-theoretic approach of defining causal effects as a function of covariates observed prior to treatment, retrospective structural mean models allow effects to be defined in subsets of the data not identifiable at baseline. For example, Vansteelandt (2010) considered retrospective models for assessing mediation when outcomes are binary and modeled using a logit link.

Under our causal model, the second parameter $\psi_{02}$ quantifies the differential in the effect of intervention $R_i$ at varying levels of $S_i$. When $\psi_{02} = 0$, homogeneous causal effects are estimated across values of $S_i$. One model for a binary treatment $R_i$ and scalar intermediate variable $S_i$ might assume

$$\operatorname{logit} P(Y_i = 1 \mid S_i, R_i = 1, X_i) - \operatorname{logit} P(Y_{i,0} = 1 \mid S_i, R_i = 1, X_i) = \psi_{01} + \psi_{02} S_i \tag{3}$$

for scalars $\psi_{01}, \psi_{02}$. The causal odds ratio

$$\frac{P(Y_{i,1} = 1 \mid S_i, R_i = 1, X_i) \,/\, P(Y_{i,1} = 0 \mid S_i, R_i = 1, X_i)}{P(Y_{i,0} = 1 \mid S_i, R_i = 1, X_i) \,/\, P(Y_{i,0} = 0 \mid S_i, R_i = 1, X_i)}$$

is quantified by $\exp(\psi_{01})$ for all subjects when $\psi_{02} = 0$, or for the set of subjects with $S_i = 0$ in the case that $\psi_{02} \neq 0$. For $\psi_{02} \neq 0$ and $S_i \neq 0$, the causal odds ratio among subjects with $S_i = s$ is captured by $\exp(\psi_{01} + \psi_{02} s)$, which allows the effect of $R_i$ to vary with the observed value $s$.

3.2 Identifiability

We next consider the identifiability of SMMs with posttreatment effect modification. We do not present a theorem, but instead address identifiability under a theorem originally presented by Vansteelandt and Goetghebeur (2004) in the context of Strong Structural Mean Models. The restriction in (1) along with SUTVA and the assumption of consistency is sufficient to nonparametrically identify the effect of treatment in subgroups defined by baseline covariates. For example, consider the following model with a pretreatment binary covariate $x_i$ and treatment $R_i$:

$$E(Y_i - Y_{i,0} \mid R_i, x_i) = \psi_{01} R_i + \psi_{02} R_i x_i. \tag{4}$$

This model is nonparametrically identified, given that under ignorability (1) the quantities in $E(Y_i \mid R_i, x_i)$ are defined in terms of observable quantities and the conditional expectations $E(Y_{i,0} \mid R_i, x_i)$ and $E(Y_i \mid R_i = 0, x_i)$ are equivalent and defined in terms of observable quantities. Our model in (3) is analogous to the following linear model:

$$E(Y_i - Y_{i,0} \mid S_i, R_i, x_i) = \psi_{01} R_i + \psi_{02} R_i S_i. \tag{5}$$

Nonparametric identification for the above model does not hold, since it imposes additional restrictions on the subgroup-specific causal effects. There are four such possible effects (one for each subgroup defined by joint levels of $S_i$ and $x_i$), whereas (5) involves only two parameters. The restrictions in (5) allow us to write

$$\Delta_x \equiv E(Y_i - Y_{i,0} \mid R_i = 1, x_i) = \sum_s \Pr(S_i = s \mid R_i = 1, x_i)\, E(Y_i - Y_{i,0} \mid S_i = s, R_i = 1, x_i) \tag{6}$$

$$= \sum_s \pi_{sx} (\psi_{01} + \psi_{02} s) = \pi_{1x}(\psi_{01} + \psi_{02}) + \pi_{0x}\psi_{01} = \psi_{01} + \pi_{1x}\psi_{02},$$

where $\pi_{sx} \equiv \Pr(S_i = s \mid R_i = 1, x_i)$. Equation (6) involves two equations in two unknown quantities, $\psi_{01}$ and $\psi_{02}$, and has a unique solution so long as $x_i$ predicts the effect of treatment $R_i$ on $S_i$ (i.e., that $\pi_{1x}$ varies with $x_i$). A model that is subtly different from (4) is

$$E(Y_i - Y_{i,0} \mid S_i, R_i, x_i) = \psi_{01} R_i + \psi_{02} R_i x_i. \tag{7}$$

Unlike (4), (7) assumes that the effect of $R_i$ does not change across the strata of $S_i$. Like (5), (7) allows us to express $\Delta_x$ in terms of two quantities: $\Delta_x = \psi_{01} + \psi_{02} x_i$. Models (5) and (7) are indistinguishable in the data, as each can be used to fit or explain the observed data under ignorability (1) equally well. More complex models such as

$$E(Y_i - Y_{i,0} \mid S_i, R_i, x_i) = \psi_{01} R_i + \psi_{02} R_i x_i + \psi_{03} R_i S_i \tag{8}$$

are not identifiable from these data, as they involve more parameters than the number of restrictions imposed by (1). More generally, nonparametric identifiability does not hold for models like these, as the number of strata defined jointly by baseline covariates $x_i$ and posttreatment covariates $S_i$ is greater than the number of restrictions imposed by ignorable treatment assignment (1). Identifiability under the model in (8), which is parameterized by $\psi = (\psi_{01}, \psi_{02}, \psi_{03})$, depends on the parameter $\psi$ being of no larger dimension than the number of restrictions imposed by (1). As such, we must restrict some subset of the $R_i x_i$ interactions to identify effect modification by $S_i$. Thus identifiability holds under a no-interaction assumption between $R_i$ and $x_i$. This no-interaction assumption does allow for model-based identification, but no set of assumptions allows for nonparametric identification.

No-interaction assumptions are often used for identification of causal effects with instrumental variable analysis (Hernán and Robins 2006), in the estimation of direct and indirect effects (Robins and Greenland 1992; Ten Have et al. 2007; Vansteelandt 2010), and for other causal analyses (Vansteelandt and Goetghebeur 2004). In many cases no-interaction assumptions are untestable, but in the context here the identifying assumption does have a testable implication. The analyst can restrict the posttreatment interactions to be zero and estimate the model with pretreatment effect modification. If there is considerable treatment heterogeneity, then the no-interaction assumption is implausible.

One might argue that absent nonparametric identification, the model should be abandoned. We think that would be premature. The model can still serve a number of useful purposes so long as the estimates are not given a strong causal interpretation. First, posttreatment effect modification can be used for intermediate decision making if the trial is ongoing.
Analysts could use the model to identify subgroups where the treatment is particularly ineffective and a new intervention might be implemented. Second, results from an analysis of this form could also be used as a method for hypothesis generation and the design of future interventions. Third, the model could also be used as a form of sensitivity analysis in conjunction with other causal models.

3.3 An Alternative Definition of the Estimand

Next, we develop an alternative characterization of $S_i$, the posttreatment modifier, which more clearly reveals linkages between our model and the method of principal stratification (PS) (Frangakis and Rubin 2002; Mealli and Rubin 2003). Given that treatment status $R_i$ is allowed to affect $S_i$, our causal model might treat it in two different ways. Thus far the causal model describes stratification on an observed posttreatment variable. Alternatively, we might write $S_i$ as a potential outcome such that its counterfactual value is denoted $S_{i,r}$ under $R_i = r$. In both cases, treatment effects are defined for subsets of the population using variables that are not observable at the time of treatment assignment (Joffe et al. 2007). First, using potential outcomes for the posttreatment effect modifier alters the causal model little. Under our stated assumptions, the following set of equalities holds:

$$P(Y_i = 1 \mid S_i, R_i = 1, X_i) = P(Y_{i,1} = 1 \mid S_{i,1}, R_i = 1, X_i) = P(Y_{i,1} = 1 \mid S_{i,1}, X_i). \tag{9}$$

The consistency assumption justifies the first equality, while the second holds due to ignorability in (1). Similarly, due to ignorability, $P(Y_{i,0} = 1 \mid S_{i,1}, R_i = 1, X_i) = P(Y_{i,0} = 1 \mid S_{i,1}, X_i)$. Under this notation, we would write the causal model as

$$\operatorname{logit}\{P(Y_{i,1} = 1 \mid S_{i,1}, X_i)\} - \operatorname{logit}\{P(Y_{i,0} = 1 \mid S_{i,1}, X_i)\} = \psi_{01} + \psi_{02} S_{i,1}. \tag{10}$$

Under this causal model, we condition on the potential auxiliary $S_{i,1}$ rather than the observed value of $S_i$, but do so for the population as a whole. For a binary treatment, stratification on an observed posttreatment variable is equivalent to conditioning on a single potential outcome and so is in general a coarsening of the PS estimand

$$\operatorname{logit}\{P(Y_{i,1} = 1 \mid S_{i,1}, S_{i,0}, X_i)\} - \operatorname{logit}\{P(Y_{i,0} = 1 \mid S_{i,1}, S_{i,0}, X_i)\}.$$

When $S_{i,0}$ has the same value for all subjects, our estimand is essentially the same as the PS estimand (see Gilbert et al. (2011)).
Thus our estimand can be characterized either as one articulated under principal stratification or as stratification on an observed auxiliary variable (Joffe et al. 2007). Hereafter, we retain the notation of $S_i$ as an observed auxiliary variable.

3.4 Noncompliance

As we noted in Section 1.2, JOBS II was subject to noncompliance. We first review the double-logistic SMM for settings with noncompliance and then adapt the structural model to allow for posttreatment effect modification of exposure to treatment rather than assignment to treatment. We denote actual exposure to the treatment by $A_i$. As we noted above, in JOBS II many subjects did not comply, failing to attend the training sessions. In JOBS II, those assigned to control did not have access to the training sessions. Therefore when $R_i = 0$, $A_i = 0$. Conversely, if $R_i = 1$, then $A_i = 1$ if subject $i$ complies with treatment assignment and $A_i = 0$ if the subject does not comply. The experimental design therefore justifies what is known as the no-contamination restriction (Cuzick et al. 2007), so that $\Pr(A_i = 0 \mid R_i = 0) = 1$.

One causal parameter that may be of interest is the causal effect of $A_i$ on the response. To estimate this quantity, we would like to contrast how each subject in the treatment group would respond if he or she failed to attend the training sessions instead of selecting to attend the intervention. We now denote such potential treatment-free outcomes as $Y_{ir,0}$. Here, we assume the exclusion restriction holds (Angrist et al. 1996), which implies that $Y_{i1,0}$ is equal to $Y_{i0,0}$, and we denote $Y_{i0} = Y_{i1,0} = Y_{i0,0}$. Given the exclusion restriction, the contrast of interest compares $P(Y_i = 0 \mid A_i = 1)$ to $P(Y_{i0} = 1 \mid A_i = 0)$, which is the total effect of $A_i$, self-selection into the job-training sessions, on the outcome re-employment status.
We use the term total effect to reflect that the effect of $A_i$ may be mediated through $S_i$, since under our structural model $S_i$ is not manipulated. Finally, we assume that $R_i$ has a nonzero causal effect on $A_i$ (Angrist et al. 1996). The double-logistic SMM (Vansteelandt and Goetghebeur 2003) models differences in $P(Y_i = 0)$ to $P(Y_{i0} = 1)$ through the following model:

$$\operatorname{logit}\{E[Y_i \mid A_i, X_i, R_i = 1]\} - \operatorname{logit}\{E[Y_{i0} \mid A_i, X_i, R_i = 1]\} = \eta_s(A_i, X_i; \psi_0). \tag{11}$$

In this model, the odds ratio of success for subjects who chose to be exposed to the training program ($A_i = 1$) versus not exposed ($A_i = 0$) is determined by $\psi_0$, depending on the functional form of $\eta_s(A_i, X_i; \psi_0)$. Here $\psi_0$ corresponds to the log odds ratio of success for participating versus not participating in the intervention. Allowing for posttreatment effect modification of $A_i$ instead of $R_i$ requires minimal alteration of the causal model. To allow for posttreatment effect modification of $A_i$, model (2) becomes

$$\operatorname{logit}(E[Y_i \mid S_i, A_i, R_i = 1, X_i]) - \operatorname{logit}(E[Y_{i0} \mid S_i, A_i, R_i = 1, X_i]) = f_1(A_i; \psi_{01}) + f_2(A_i, S_i; \psi_{02}). \tag{12}$$

Interpretation of the model parameters largely parallels the case with full compliance.

3.5 Identifiability

Basic issues of identifiability change little. While noncompliance entails additional assumptions for identification, posttreatment effect modification requires a no-interaction assumption in the same form as the case with full compliance. We must also maintain an exclusion restriction for $S_i$ as well. While noncompliance itself requires untestable assumptions for identification, some of those assumptions are justified by the design in JOBS II, and the effect of $R_i$ on $A_i$ is a testable condition. The design by itself does not provide justification for the no-interaction assumption.
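
The linear identification argument of Section 3.2, on which the no-interaction assumption rests, can be made concrete with a short numerical sketch. Writing $\Delta_x$ for the effect of assignment on the outcome within stratum $x$ of a binary baseline covariate, model (5) with binary $S_i$ gives one linear equation per stratum, $\Delta_x = \psi_{01} + \pi_{1x}\psi_{02}$, and the pair has a unique solution only when $\pi_{1x} = \Pr(S_i = 1 \mid R_i = 1, x_i)$ differs between strata. The function name and all numerical inputs below are hypothetical; in practice $\Delta_x$ and $\pi_{1x}$ would be estimated from the randomized arms.

```python
from typing import Tuple


def solve_psi(delta: Tuple[float, float], pi1: Tuple[float, float]) -> Tuple[float, float]:
    """Solve delta[x] = psi1 + pi1[x] * psi2 for x in {0, 1}.

    delta[x]: effect of assignment on the outcome within stratum x.
    pi1[x]:   P(S = 1 | R = 1, x).
    """
    gap = pi1[1] - pi1[0]
    if abs(gap) < 1e-12:
        # Both equations share the same coefficients, so psi is not identified.
        raise ValueError("pi_1x must vary with x to identify (psi1, psi2)")
    psi2 = (delta[1] - delta[0]) / gap
    psi1 = delta[0] - pi1[0] * psi2
    return psi1, psi2


# Hypothetical stratum-specific effects and modifier prevalences.
psi1, psi2 = solve_psi(delta=(0.10, 0.04), pi1=(0.3, 0.6))
print(f"psi1={psi1:.3f}, psi2={psi2:.3f}")
```

The error branch mirrors the condition stated in the text: if the baseline covariate does not predict the effect of treatment on $S_i$, the two equations coincide and no unique solution exists.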

4 Estimation

4.1 Estimation under full compliance

The Vansteelandt and Goetghebeur (2003) approach to the estimation of causal effects under generalized structural mean models of binary outcomes using the logit link was developed in response to Robins (1999), who showed that the causal odds ratio could not be estimated using the same G-estimation procedure as used for identity and log links. To facilitate the definition of mean treatment-free outcomes to be used in a modified version of G-estimation, the first of two stages defines an association model among subjects receiving the intervention. A detailed argument motivating the need for the association model is described in Vansteelandt and Goetghebeur (2003) and largely follows from the noncollapsibility of the logit link. When the association model is nonsaturated, it can be uncongenial to the logistic SMM (Robins and Rotnitzky 2004). Vansteelandt et al. (2011) argue that the biases from uncongenial estimators are small compared with other assumption failures. Moreover, alternatives are computationally demanding. The double-logistic SMM has similar asymptotic properties to G-estimators when the association model is correctly specified. Under the null hypothesis $\psi_0 = 0$, one may construct locally robust estimating equations that yield consistent inference under misspecified association models (Vansteelandt and Goetghebeur 2003).

For a binary randomization assignment $R_i$, the first stage of estimation defines the model

$$\operatorname{logit}\{E(Y_i \mid S_i, R_i = 1, X_i)\} = \eta_a(S_i, X_i; \beta) \tag{13}$$

for a known function $\eta_a$ and unknown finite-dimensional parameter vector $\beta$. The subscript $a$ distinguishes the association model from the structural, causal model. As in generalized linear regression models, components of the linear predictor $\eta_a(S_i, X_i; \beta)$ may take any functional form, including main effects, interactions, and higher-order terms.
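
When $S_i$ and $X_i$ are discrete, a saturated version of the association model (13) reduces to empirical cell proportions among subjects with $R_i = 1$. The sketch below assumes binary $S_i$, $X_{i1}$, and $X_{i2}$ and a hypothetical record layout `(x1, x2, r, s, y)`; the function name is illustrative, and a nonsaturated $\eta_a$ would instead be fit by ordinary logistic regression in the treated arm.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Row = Tuple[int, int, int, int, int]  # hypothetical layout: (x1, x2, r, s, y)


def fit_saturated_association(rows: Iterable[Row]) -> Dict[Tuple[int, int, int], float]:
    """Return eta_a(s, x1, x2): logit of the mean of Y among treated subjects per cell."""
    successes = defaultdict(int)
    counts = defaultdict(int)
    for x1, x2, r, s, y in rows:
        if r == 1:  # the association model is fit in the treated arm only
            counts[(s, x1, x2)] += 1
            successes[(s, x1, x2)] += y
    eta = {}
    for cell, n in counts.items():
        p = successes[cell] / n
        if 0.0 < p < 1.0:  # the logit is undefined for degenerate cells
            eta[cell] = math.log(p / (1.0 - p))
    return eta


# Tiny made-up dataset: four treated subjects in one cell and one control subject.
rows = [(0, 1, 1, 1, 1), (0, 1, 1, 1, 1), (0, 1, 1, 1, 0), (0, 1, 1, 1, 1), (0, 1, 0, 0, 1)]
print(fit_saturated_association(rows))
```

Cells with an observed proportion of exactly zero or one are skipped here for simplicity; how to handle them in a real analysis is a separate modeling decision.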
For subjects randomized to treatment, we define the pseudo treatment-free outcome $H_i(\psi) = h(S_i, X_i, R_i; \psi)$ by

$$H_i(\psi) = \operatorname{expit}\left[\eta_a(S_i, X_i; \beta) - \{f_1(\psi_1) + f_2(S_i; \psi_2)\}\right].$$

For subjects with $R_i = 0$, the observed outcome $Y_i$ equals the treatment-free outcome $Y_{i,0}$ following the consistency assumption. In the control arm, it is therefore unnecessary to estimate $H_i(\psi)$ by removing the treatment effect from a model-predicted mean outcome; $H_i(\psi)$ is simply set to $H_i(\psi) = Y_i$. The second stage of estimation then defines the estimating function

$$U(S_i, R_i, X_i, H_i(\psi)) = d(X_i, R_i)\,[H_i(\psi) - q(X_i)], \tag{14}$$

for $d(X_i, R_i)$, a $p$-dimensional weight function defined such that its elements $d_1(X_i, R_i), \ldots, d_p(X_i, R_i)$ are non-collinear, and $q(X_i)$, a function of baseline covariates. The causal parameter $\psi$ is estimated as the solution to $\sum_{i=1}^{n} U(S_i, R_i, X_i, H_i(\psi)) = 0$. By the randomization assumption, under the true $\psi$ and $\beta$, $E\left[\sum_{i=1}^{n} U\{S_i, R_i, X_i, H_i(\psi)\}\right] = 0$ for $U$ as defined in (14). The chosen $d(X_i, R_i)$ and $q(X_i)$ affect efficiency but not bias in the resulting estimate $\hat\psi$ when the association model is correctly specified. Under a misspecified association model, the robust estimating equations referenced above are constructed through strategic selection of $d(X_i, R_i)$ and guarantee that type I error is preserved. The term $q(X_i)$ does not affect bias under correct or incorrect specification of the association model. Vansteelandt and Goetghebeur (2003) recommend the choice

$$d(X_i, R_i) = \frac{d^*(X_i)\,(-1)^{R_i}}{R_i P(R_i = 1 \mid X_i) + (1 - R_i)\{1 - P(R_i = 1 \mid X_i)\}}, \qquad d^*(X_i) = E\left[\frac{\partial H_i(\psi)}{\partial \psi} \,\Big|\, X_i\right],$$

following semiparametric optimality arguments for their GSMM under known $\beta$. Under the logit model,

$$\frac{\partial H_i(\psi)}{\partial \psi} = -H_i(\psi)\{1 - H_i(\psi)\}\,[R_i,\; R_i S_i],$$

where $H_i(\psi)$ is evaluated at an initial estimate of $\psi$, $\hat\psi_b$.

4.2 Estimation under non-compliance

The two-stage approach may also be used to estimate post-intervention effect modification amid non-compliance.
Estimation considers the observed treatment variable $A_i$ in the association model and predicted treatment-free mean outcomes $H_i(\psi)$. The first-stage model is defined as

$$\operatorname{logit}\{E(Y_i \mid S_i, A_i, R_i = 1, X_i)\} = \eta_a(S_i, A_i, X_i; \beta), \tag{15}$$

for a known function $\eta_a$ and unknown finite-dimensional parameter vector $\beta$, and predicted mean treatment-free outcomes are constructed as

$$H_i(\psi) = \operatorname{expit}\left[\eta_a(S_i, A_i, X_i; \beta) - \{f_1(A_i; \psi_1) + f_2(A_i, S_i; \psi_2)\}\right]. \tag{16}$$

For subjects randomized to control and hence unable to access treatment, we again set $H_i(\psi) = Y_i$. Estimation then proceeds as described in Section 4.1, setting

$$\sum_{i=1}^{n} U(S_i, A_i, R_i, X_i, H_i(\psi)) = \sum_{i=1}^{n} d(X_i, R_i)\,[H_i(\psi) - q(X_i)] = 0, \tag{17}$$

noting the dependence of our estimator on $A_i$.

5 Simulation Study

A simulation study was conducted to evaluate the proposed estimator for assessing effect modification by posttreatment variables. Each simulated dataset consisted of $n = 5{,}000$ subjects. In simulation set 1, we generate data under perfect compliance. For each subject, independent baseline covariates $X_{i1} \sim \text{Bernoulli}(p = 0.4)$ and $X_{i2} \sim \text{Bernoulli}(p = 0.7)$ were generated. Binary treatment $R_i$ was simulated following an unstratified randomization design, with $R_i \sim \text{Bernoulli}(p = 0.5)$. The posttreatment variable $S_i$ was simulated from the model

$$\operatorname{logit}(P(S_i = 1 \mid R_i, X_i)) = \gamma_0 + \gamma_1 X_{i1} + \gamma_2 X_{i2} + \gamma_3 R_i,$$

with $\gamma = (\operatorname{logit}(0.2), 1.0, 1.0, \gamma_3)$ and $\gamma_3 = 0$, $0.3$, or $1.2$ for $S_i$ not associated, weakly associated, or strongly associated with $R_i$. The data design contained binary covariates to ensure congeniality among the various stages of conditional mean treatment-free potential outcome models. Outcomes were generated by first defining

$$\operatorname{logit}(E[Y_{i,0} \mid X_{i1}, X_{i2}]) = \rho^X_0 + \rho^X_1 X_{i1} + \rho^X_2 X_{i2} + \rho^X_3 X_{i1} X_{i2},$$

with $\rho^X = (\operatorname{logit}(0.35), 0.8, 0.8, 1.5)$. To adjust mean treatment-free potential outcomes for $S_i$, we followed a strategy similar to that described in Robins and Scharfstein (1999). We set

$$E[Y_{i,0} \mid S_i, R_i, X_i] = \operatorname{expit}\left[\operatorname{logit}(E[Y_{i,0} \mid X_{i1}, X_{i2}]) + \rho^S_1 S_i + \rho^S_{0j}(1 - S_i)\right],$$

where $\rho^S_1 = 0.4$ and, for given $\rho^X$, $\gamma$, and values of the covariates $R$, $X_{i1}$, $X_{i2}$, $\rho^S_{0j}$ was the solution to $E_S[E[Y_{i,0} \mid S_i, R_i, X_i]] = E[Y_{i,0} \mid X_i]$ for $j = 1, \ldots, 8$, indexing unique profiles of $(R, X_{i1}, X_{i2})$. Observed outcomes were then simulated as $\text{Bernoulli}(p_Y)$, where $\operatorname{logit}(p_Y) = \operatorname{logit}(E[Y_{i,0} \mid S_i, R_i, X_i]) + \psi_1 R_i + \psi_2 R_i S_i$ for $\psi = (0.5, -0.5)$.

In applying the two-stage G-estimation to each simulated dataset, a fully saturated association model for $E[Y_i \mid S_i, R_i, X_i]$ was considered, followed by estimation under a misspecified association model containing only main effects of $S_i$, $X_{i1}$, $X_{i2}$ and the two-way interaction $X_{i1} X_{i2}$. Standard logistic regression was also applied in two forms, both of which included $R_i$ and $R_i S_i$ as covariates. For the application of the GSMM with a saturated association model, the first application of logistic regression was fully saturated for $S_i$, $X_{i1}$, $X_{i2}$, whereas the second application excluded all terms containing $S_i$ other than the interaction $R_i S_i$ of interest. For the GSMM with a misspecified association model, potentially comparable logistic regression models included 1) $R_i$, $R_i S_i$, $S_i$, $X_{i1}$, $X_{i2}$, $X_{i1} X_{i2}$ and 2) $R_i$, $R_i S_i$, $X_{i1}$, $X_{i2}$, $X_{i1} X_{i2}$.

Simulation set 2 considers estimation of posttreatment effect modification in the presence of non-compliance. Variables $X_{i1}$, $X_{i2}$, $R_i$, and $S_i$ were generated as in simulation set 1. To simulate non-compliance, the data-generating mechanism and estimation procedure were modified in several ways. First, a compliance variable $A_i$ was generated following the model $\operatorname{logit}(P(A_i = 1 \mid X_{i1})) = \operatorname{logit}(0.9) - 3X_{i1}$.
For subjects with $R_i = 0$, we set $A_i = 0$, reflecting the setting where subjects randomized to control are unable to access active treatment. Under this design, compliance was approximately 66% among those randomized to treatment. Under ignorable non-compliance, mean treatment-free outcomes were generated identically to simulation set 1, setting $E[Y_{i,0} \mid S_i, A_i, R_i, X_i] = E[Y_{i,0} \mid S_i, R_i, X_i]$. Under non-ignorable non-compliance, $E[Y_{i,0} \mid S_i, A_i, R_i, X_i]$ was generated by adjusting the conditional treatment-free mean $E[Y_{i,0} \mid S_i, R_i, X_i]$ for compliance $A_i$ and then $S_i$ while maintaining congeniality across conditional means. Observed outcomes $Y_i$ under noncompliance were distributed $\text{Bernoulli}(p_{Y,nc})$, where $\operatorname{logit}(p_{Y,nc}) = \operatorname{logit}(E[Y_{i,0} \mid S_i, A_i, R_i, X_i]) + \psi_1 A_i + \psi_2 A_i S_i$, also with $\psi = (0.5, -0.5)$. The application of the two-stage GSMM considered the association model fully saturated for $S_i$, $A_i$, $X_{i1}$, $X_{i2}$. Two logistic regression models were fit: model 1 was similar to an intent-to-treat analysis with the addition of post-intervention covariates and contained terms $R_i$ and $R_i S_i$ with full saturation for $S_i$, $X_{i1}$, $X_{i2}$; model 2 was an as-treated approach, including $A_i$ and $A_i S_i$, also with full saturation for $S_i$, $X_{i1}$, $X_{i2}$. All results were based on 1,000 replicated datasets.

Tables 1-4 contain detailed results from the simulations across several different scenarios. Table 1 shows the simulation study results under full compliance when saturated association and logistic regression models are applied. Figure 1 graphically compares percent bias between GSMM and logistic regression estimates as shown numerically in Table 1. We only include results for the logistic regression model containing terms involving $S_i$ beyond the $R_i S_i$ interaction of interest, as this was the less biased of the two logistic regression estimators. For settings in which $S_i$ was associated with $Y_{i,0}$ and $S_i$ was affected by $R_i$, bias was observed for standard logistic regression coefficients. When $S_i$ and $R_i$ were weakly associated, the bias in logistic regression coefficients was approximately 20%. For a strong association between treatment and the posttreatment variable, bias was more severe. Considering only the estimated magnitude of effect, logistic regression estimates were 34%-50% biased; bias increased further once the reversed sign of the estimated effect relative to the true value was taken into account.
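The distinction between bias in magnitude and bias that accounts for a reversed sign can be made concrete with a small calculation. This is a sketch under stated assumptions: the text does not give the exact formula used for percent bias, so both definitions below are assumed, the function names are hypothetical, and the replicate estimates are illustrative values rather than numbers from Table 1.

```python
import statistics


def percent_bias(estimates, truth):
    """Signed percent bias of the Monte Carlo mean, relative to |truth|."""
    return 100.0 * (statistics.mean(estimates) - truth) / abs(truth)


def percent_bias_magnitude(estimates, truth):
    """Percent bias comparing absolute values only, ignoring sign reversal."""
    return 100.0 * (abs(statistics.mean(estimates)) - abs(truth)) / abs(truth)


def monte_carlo_sd(estimates):
    """Standard deviation of the estimates across replicated datasets."""
    return statistics.stdev(estimates)


true_psi2 = -0.5                       # true interaction parameter in the design
hypothetical = [0.25, 0.30, 0.35]      # illustrative replicate estimates, wrong sign

magnitude_bias = percent_bias_magnitude(hypothetical, true_psi2)
signed_bias = percent_bias(hypothetical, true_psi2)
mc_sd = monte_carlo_sd(hypothetical)
```

With these illustrative inputs the magnitude-only bias is 40% in absolute value, while the signed bias is four times larger, which is the pattern the paragraph describes for estimates whose sign is reversed.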
When one of the two characterizing dependencies did not hold, logistic regression estimates were unbiased and more efficient than GSMM estimates, which showed slightly more bias (2-3%) that was nonetheless within a reasonable range given Monte Carlo variation. G-estimation exhibited the most bias relative to logistic regression when treatment predicted the posttreatment variable but the posttreatment variable was not associated with $Y_{i,0}$. Under this design, bias of G-estimation was 5% to 10%, whereas logistic regression was less than 1% biased. Monte Carlo standard deviations were much larger for estimates of the two-parameter GSMM compared to standard logistic regression.

Table 2 compares GSMM and logistic regression estimates when terms in the logistic association model are misspecified as described in the simulation design. Bias in logistic regression estimates was similar to that observed in scenario 1. For the GSMM, the use of robust weights did not protect against bias when the treatment was strongly predictive of the post-intervention variable. For simulation settings with $\gamma_3 = 1.2$ and $S_i$ associated with the outcome, the magnitude of causal parameters was overestimated by 38-63%, and Monte Carlo standard deviations were substantially larger than under the correct model. For a more modest association between the treatment and the intermediate variable ($\gamma_3 = 0.3$), or when at least one of the characterizing dependencies did not hold, bias and standard errors were similar to those observed under saturated, correct association models. Theoretically, robust weights guarantee protection against inflation of type I error under the null hypothesis of no treatment effect. Moving away from the null value in the parameter space may result in biased estimation, as was observed in some scenarios. An additional simulation study was conducted to evaluate the behavior of robust weights as the true parameter value moved away from the null. Results are shown in the appendix and illustrate the trend toward increased bias as $\psi$ moves further from 0.

The behavior of GSMM and logistic regression for estimating posttreatment effect modification under non-compliance is displayed in Tables 3 and 4, with Table 3 featuring ignorable non-compliance and Table 4 demonstrating non-ignorable non-compliance.
The intent-to-treat style analysis using logistic regression demonstrated considerably greater bias than in the case of full compliance investigated in scenario 1, with at least 28% bias observed in all settings and up to nearly 200% bias when treatment strongly predicted the intermediate variable. When the intermediate was not affected by treatment or associated with the outcome, estimates were generally attenuated compared to the true value for both ignorable and non-ignorable non-compliance. Moderate bias (9-18%) was observed for the GSMM when treatment only weakly predicted the intermediate variable. Additional simulations showed that this bias is removed by considering larger sample sizes or by increasing the predictiveness of baseline covariates for the intermediate variable. Results of these additional simulations are shown in the appendix. The as-treated analytic approach was unbiased and more efficient than the GSMM under ignorable non-compliance when the intermediate was not affected by treatment or not associated with the outcome, but it exhibited substantial bias under all scenarios with non-ignorable non-compliance.

6 Data Analysis

We now turn to the analysis of the data from the JOBS II study. We restrict the analysis to the subset of subjects for which depression levels and the re-employment outcome are fully observed at all follow-up periods. We compare the GSMM estimates to estimates from logistic regression. In all cases, we condition on a large set of pre-treatment covariates that were measured in the JOBS II study. We use pre-treatment covariates to specify the association model, which models the observed outcomes. The pre-treatment covariates include binary indicators for seven categories of occupation type, sex, marital status, whether the subject was nonwhite, years of education, income, age, a measure of financial strain, and depression at baseline. We use the same covariates in the specification of the logistic regression model. In the JOBS II study, depression levels were measured six weeks and six months after subjects completed the training sessions that comprised the intervention. We conduct separate analyses for the two intermediate follow-up periods. In the first analysis, the causal effect of being exposed to the treatment is modified by depression levels at six weeks, and in the second analysis the effect of the intervention is modified by depression levels at six months.
The two separate analyses allow us to understand whether the magnitude of effect modification varies over time. We re-scaled the depression scale to range from 0 to 100, which aids interpretation. We found that model convergence was somewhat sensitive to specification of the association model. In particular, when we failed to condition on depressive symptoms at baseline, estimates either became so large as to signal a lack of convergence or convergence failed outright. Richer specifications also did little to aid precision of the model estimates.

Table 5 contains estimates for the two causal parameters, $\psi_1$ and $\psi_2$, under two-stage G-estimation and logistic regression. The GSMM causal estimates for the $\psi_2$ parameter (robust standard errors are in parentheses, with p-values appearing below) are: (0.05) at the six-week follow-up and (0.04) at the six-month follow-up. Estimates of the $\psi_2$ parameter from logistic regression are much smaller in comparison: (0.008) at six weeks and (0.007) at six months. The bias we observe in the simulations when logistic regression is applied appears to be present in this application as well.

The separate parameter estimates in Table 5 do not readily convey the conditional nature of the causal effect implied by the model. We next explore in more detail how posttreatment levels of depression modify the effect of the JOBS II intervention. Here, we use the measure of depressive symptoms from the six-week follow-up with the parameter estimates from the GSMM. We calculate the causal odds ratio and associated confidence interval for the intervention conditional on three levels of the depression scale. Recall that if we ignore treatment effect modification, the estimated causal odds ratio for compliers is 1.73, which suggests the intervention was effective at increasing employment. How does this estimate change if we consider effect modification by depression at six weeks? In the sample, approximately ten percent of subjects recorded no depressive symptoms. The estimated causal odds ratio for subjects with a score of zero is 4.26 with an associated 95% confidence interval of (1.01, 18.06).
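To make the conditional nature of the effect explicit, the sketch below computes a causal odds ratio at a given depression score, assuming the linear blip form $\psi_1 A_i + \psi_2 A_i S_i$ used in the simulations and a Wald-type interval on the log odds ratio scale. Both assumptions go beyond what the text states about the data analysis, and the function names are hypothetical. The only numerical inputs are back-calculated from the odds ratio and interval reported at a score of zero; they are not the Table 5 estimates.

```python
import math


def conditional_odds_ratio(psi1, psi2, s):
    """Causal odds ratio for treatment at posttreatment score s."""
    return math.exp(psi1 + psi2 * s)


def wald_interval(psi1, psi2, s, var1, var2, cov12, z=1.96):
    """Wald interval for the odds ratio at score s from the covariance of (psi1, psi2)."""
    log_or = psi1 + psi2 * s
    se = math.sqrt(var1 + s * s * var2 + 2.0 * s * cov12)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# At s = 0 only psi1 and its variance matter, so both can be backed out
# from the reported odds ratio 4.26 and interval (1.01, 18.06).
psi1_implied = math.log(4.26)
se1_implied = (math.log(18.06) - math.log(1.01)) / (2 * 1.96)

# psi2, var2 and cov12 are placeholders here; they do not enter at s = 0.
or_at_zero = conditional_odds_ratio(psi1_implied, psi2=0.0, s=0)
lower, upper = wald_interval(psi1_implied, 0.0, 0, se1_implied ** 2, 0.0, 0.0)
```

At nonzero scores the interval also depends on the variance of $\hat{\psi}_2$ and on the covariance between the two estimates, neither of which can be recovered from the reported summaries.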
Although the value one is just excluded from the confidence interval, the interval is quite wide, which reflects the fact that the support in the data is relatively low at that level of depression. Next we calculate the causal odds ratio for subjects with a score of seven on the depression scale, which represents the 25th percentile. The causal odds ratio is about one-third smaller at 2.84, with a corresponding 95% confidence interval of (1.22, 6.58). When depressive symptoms increase to a score of 16, the median of the depression scale, the causal odds ratio decreases further to 1.68 with 95% confidence interval (0.96, 2.96). The magnitude of the treatment effect is further reduced to the point that the intervention appears to be completely without effect for those with higher levels of depression at six weeks. We plot the pattern of effect modification in Figure 2. In Figure 2, we also include estimates and corresponding confidence intervals for two levels of depression above the median. Here, the point estimate falls below 1.

Given the sensitivity of estimates to association model specification, we caution against treating these estimates as conclusive. The estimates do suggest that, in the future, an intervention might be developed or applied to target subjects with higher depression levels observed at six weeks. Given the observed pattern of effect modification, investigators may decide that an additional intervention is necessary for subjects with greater than median levels of depression. Our extension of SMMs to the estimation of posttreatment effect modification allows for the tailoring of future interventions when it appears that the treatment is performing poorly among some subpopulations.

Under the identification assumption, we must assume that the effect of exposure to the training intervention did not vary across levels of baseline covariates. For example, this implies that the effect of the intervention among compliers did not vary across the seven occupation types measured at baseline. Whether such an assumption is realistic depends largely on subject matter knowledge. While such an assumption might appear implausible, the original study did not place any weight on the possibility of interactions between baseline covariates and exposure to the intervention. In fact, the set of baseline covariates is rich enough that there are more possible sub-populations with varying treatment effects than could be fully specified.
As such, the original analysis implicitly invoked the restriction needed for identifiability in our analysis.

7 Discussion

We have demonstrated the use of GSMMs to estimate causal effects that may be modified by posttreatment variables. Our work complements existing literature on noncompliance and mediation where conditioning occurs on posttreatment variables. It is natural to contrast the model here with these other causal models that condition on posttreatment quantities. One natural comparison is to causal mediation analysis. The analysis we have proposed differs substantially in purpose from a causal mediation analysis. In mediation, the goal is to decompose a treatment effect into direct and indirect components (Imai et al. 2010). The indirect treatment effect is an effect mediated by a third variable which transmits the treatment effect to the outcome. Mediation effects were of key interest in the original analysis (Vinokur et al. 1995). For example, in the original study it was thought that exposure to the training intervention had a direct effect on re-employment but also had an indirect effect through a measure of self-confidence. In contrast, we stipulate only a total effect of the treatment that is conditional on levels of $S$. The analysis here also differs from mediation in that the causal contrast involves only the effect of the treatment and does not attempt to parse out the separate pathways by which treatment influences the outcome or the role of the mediator in those pathways, whereas mediation analysis attempts to do those things. The mediation analysis is, in principle, more ambitious and thus potentially less subject to testing. To see this, consider a (structural nested) model for a mediation analysis in which the effects of the treatment and the mediator are both modeled conditional on levels of the treatment $A$ and the mediator $S$. By modeling the effects of treatment and the mediator jointly, conditional on the same quantities, we introduce even more parameters into our problem.
Under the assumption (often warranted in studies where the main intervention is randomly assigned but the mediator is not) that initial randomization holds but sequential ignorability does not, we have the same number of identifying restrictions. In such circumstances, the models of joint effects involved in mediation analysis will be even less subject to testing than our models of effect modification.

Our analysis has focused on a binary treatment. For treatments with more than two levels, the analysis may be extended by fitting a separate association model for each level of treatment and defining $H_i(\psi)$ for each subject by subtracting off the parametrized effect of the subject's observed treatment according to the proposed structural model. Restrictions on the effects of various treatment levels may be enforced through the parametrization of $\psi$ in the blip-down function from each treatment arm.

One weakness of this approach is its dependence on the specification of the association model. Although robust weights provide valid testing in the absence of treatment effects, estimation of treatment effects may be subject to bias under the alternative. Moreover, in the data analysis, nonconvergence was observed when baseline depression, a covariate that was highly predictive of the intermediate variable (6-week or 6-month depression), was omitted from the auxiliary model. The implication for practitioners is that the association model should be fit carefully, with close attention to functional form and the potential presence of interactions. Additional methodology to enhance robustness is one potential area for further research.

A second drawback is the increased variability of estimates obtained through G-estimation versus standard logistic regression. This additional variability may be explained by a number of factors, including the semiparametric versus parametric nature of the estimator, as well as the two-stage nature of the estimation strategy. The variance of G-estimation estimates is reduced when baseline covariates contain strong predictors of intermediate variables. For studies using these types of analyses, it is therefore beneficial to record the baseline value of potential intermediate modifying variables and other baseline correlates of intermediates when intending to use this methodology. In this context, efficiency is secondary to bias, and G-estimation yields unbiased estimates when intermediates are affected by treatment and predict the outcome, whereas logistic regression does not.
Unless it is known that intermediate variables are not affected by treatment, bias considerations support the use of logistic G-estimation despite its decreased efficiency compared to logistic regression.
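The unbiasedness property invoked here rests on the fact that, at the true $\psi$, the blipped-down outcome $H_i(\psi)$ has the same conditional mean given $X_i$ in both randomization arms. The following is a minimal population-level sketch of that identity under the full-compliance design of Section 5, using exact cell probabilities rather than sampled data. It is not the authors' implementation and omits the fitting of the association model and the solving of the estimating equation; the coefficient values are taken as printed in the text, $\gamma_3 = 1.2$ is one of the reported scenarios, and all names are hypothetical.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


# Design values as printed in the text (assumption: no signs were lost).
GAMMA = (logit(0.2), 1.0, 1.0, 1.2)     # intercept, X1, X2, R in the model for S
RHO_X = (logit(0.35), 0.8, 0.8, 1.5)    # intercept, X1, X2, X1*X2 for E[Y0 | X]
RHO_S1 = 0.4
PSI_TRUE = (0.5, -0.5)


def cell(r, x1, x2):
    """Return P(S=1 | r, x) and the congenial treatment-free logits by S."""
    ps = expit(GAMMA[0] + GAMMA[1] * x1 + GAMMA[2] * x2 + GAMMA[3] * r)
    m = expit(RHO_X[0] + RHO_X[1] * x1 + RHO_X[2] * x2 + RHO_X[3] * x1 * x2)
    q1 = expit(logit(m) + RHO_S1)
    q0 = (m - ps * q1) / (1.0 - ps)     # chosen so that averaging over S gives m
    return ps, {0: logit(q0), 1: logit(q1)}


def mean_h(psi, r, x1, x2):
    """E[H(psi) | R = r, X] computed from exact cell probabilities."""
    ps, base = cell(r, x1, x2)
    total = 0.0
    for s, weight in ((0, 1.0 - ps), (1, ps)):
        # Logit of the observed-outcome mean under the data-generating model.
        observed = base[s] + PSI_TRUE[0] * r + PSI_TRUE[1] * r * s
        # Blip down by the candidate effect; controls (r = 0) are unchanged.
        total += weight * expit(observed - psi[0] * r - psi[1] * r * s)
    return total


def max_gap(psi):
    """Largest between-arm difference in E[H(psi) | R, X] over the X profiles."""
    return max(
        abs(mean_h(psi, 1, x1, x2) - mean_h(psi, 0, x1, x2))
        for x1 in (0, 1)
        for x2 in (0, 1)
    )


gap_true = max_gap(PSI_TRUE)                 # zero up to rounding error
gap_no_modification = max_gap((0.5, 0.0))    # ignores modification by S
```

The gap vanishes only at the true parameter value; a candidate that drops the interaction with $S_i$ leaves a between-arm difference, which is what an estimating equation of the form in (17) detects.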
