G-ESTIMATION OF STRUCTURAL NESTED MODELS (CHAPTER 14) BIOS G-Estimation


Morgan Todd
5 years ago


2 G-Estimation of Structural Nested Models (Chapter 14)
Outline
14.1 The causal question revisited
14.2 Exchangeability revisited
14.3 Structural nested mean models
14.4 Rank preservation
14.5 G-estimation
14.6 Structural nested models with two or more parameters

3 14.2 Exchangeability revisited
Recall conditional exchangeability, defined to be
$Y^{a} \perp A \mid L$ for $a = 0, 1$
For binary A this is equivalent to
$\Pr[A = 1 \mid Y^{a}, L] = \Pr[A = 1 \mid L]$
Consider the following parametric logistic regression model
$\text{logit}\,\Pr[A = 1 \mid Y^{a=0}, L] = \alpha_0 + \alpha_1 Y^{a=0} + \alpha_2 L$
Fitting such a model to a real data set is not possible because $Y^{a=0}$ is not observed for all individuals.
Thought experiment: Suppose $Y^{a=0}$ were observed for all individuals so that we could fit this model. If conditional exchangeability holds and the model is correctly specified, what would you expect $\hat{\alpha}_1$ to equal?
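The thought experiment can be mimicked on simulated data, where $Y^{a=0}$ is known by construction. The sketch below is only an illustration: the sample size, the single covariate `L`, and all coefficients are assumed values, not taken from the NHEFS data. Treatment depends on `L` alone, so conditional exchangeability holds and the fitted coefficient on `Y0` should be close to zero.

```r
set.seed(14)
n  <- 5000
L  <- rnorm(n)                              # single confounder (assumed)
Y0 <- 2 + 1.5 * L + rnorm(n)                # counterfactual outcome under a = 0
A  <- rbinom(n, 1, plogis(-1 + 0.8 * L))    # treatment depends on L only

# Logistic model for treatment given Y^{a=0} and L
fit <- glm(A ~ Y0 + L, family = binomial())
alpha1.hat <- unname(coef(fit)["Y0"])
alpha1.hat                                  # expected to be near 0
```

If `A` were instead generated as a function of `Y0` as well, the same fit would be expected to return a coefficient away from zero.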

4 14.3 Structural nested mean models
Consider the model
$E[Y^{a} - Y^{a=0} \mid A = a, L] = \beta_1 a + \beta_2 a L$
such that $\beta_1 + \beta_2 l$ equals the average causal effect (RD) within stratum $L = l$.
Below we discuss using g-estimation to draw inference about $\beta_1$ and $\beta_2$.
Note this model is semiparametric in the sense that we are not specifying a model for $E[Y^{a=0} \mid L]$, i.e., there is no intercept $\beta_0$ or term $\beta_3 L$ in the model.
This is in contrast to the parametric g-formula from Chapter 13. Thus we expect g-estimation to be more robust to model misspecification than the parametric g-formula.

5 14.4 Rank Preservation
Suppose, contrary to fact, that for the NHEFS data we knew $Y^{a=1}$ and $Y^{a=0}$ for all participants, i.e., each individual's potential weight gain if they quit smoking and if they did not quit smoking.
Imagine we sorted individuals according to $Y^{a=1}$ from largest value to smallest value.
Imagine we sorted individuals according to $Y^{a=0}$ from largest value to smallest value.
Suppose in either case individuals end up in the same order: rank preservation.

6 14.4 Rank Preservation
When the effect of treatment A on the outcome Y is exactly the same, on the additive scale, for all individuals in the study population, we say that additive rank preservation holds.
For example, if smoking cessation increases each individual's body weight by exactly 3 kg, then the ranking of individuals according to $Y^{a=0}$ would be equal to the ranking according to $Y^{a=1}$.
A particular case of additive rank preservation occurs when the sharp null hypothesis is true (Chapter 1), i.e., treatment has no effect on the outcomes of any individual.
For the purposes of structural nested mean models, we will care about additive rank preservation within levels of L. This conditional additive rank preservation holds if the effect of treatment A on the outcome Y is exactly the same for all individuals with the same values of L.

7 14.4 Rank Preservation
An example of an (additive conditional) rank-preserving structural model is
$Y_i^{a} - Y_i^{a=0} = \psi_1 a + \psi_2 a L_i$ for all subjects $i$
where $\psi_1 + \psi_2 l$ is the constant causal effect for all individuals with covariate values $L = l$.
For every individual $i$ with $L_i = l$
$Y_i^{a=1} = Y_i^{a=0} + \psi_1 + \psi_2 l$
The potential outcome under no treatment $Y_i^{a=0}$ is shifted by $\psi_1 + \psi_2 l$ to obtain the potential outcome under treatment $Y_i^{a=1}$.
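A minimal sketch of what the rank-preserving model implies, using simulated potential outcomes. The values of `psi1` and `psi2`, the binary covariate `L`, and the outcome distribution are hypothetical choices for illustration only. Because every individual in a stratum is shifted by the same constant, ordering by `Y0` and ordering by `Y1` agree within each stratum.

```r
set.seed(1)
n    <- 200
L    <- rbinom(n, 1, 0.5)                 # binary covariate (assumed)
psi1 <- 3; psi2 <- 1                      # hypothetical constant-effect parameters
Y0   <- rnorm(n, mean = 70 + 5 * L, sd = 10)
Y1   <- Y0 + psi1 + psi2 * L              # rank-preserving model evaluated at a = 1

# Within each stratum of L, compare the ranking by Y0 with the ranking by Y1
same.rank <- sapply(split(seq_len(n), L), function(idx) {
  identical(rank(Y0[idx]), rank(Y1[idx]))
})
same.rank                                 # one entry per stratum of L
```

Adding individual-level noise to the shift (for example `Y1 <- Y0 + psi1 + rnorm(n)`) would generally break this agreement, which is the situation the slides describe as more realistic.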

8 14.4 Rank Preservation
Figs 14.1 and 14.2 show examples of additive rank preservation within two strata $L = l$ and $L = l'$.
[Figures 14.1 and 14.2: bell-shaped curves of the counterfactual outcomes $Y^{a=0}$ (left curve) and $Y^{a=1}$ within each stratum]
For most treatments and outcomes, the individual causal effect is not expected to be constant across individuals with the same covariate values, and thus (additive conditional) rank preservation is scientifically implausible.
E.g., we do not expect that smoking cessation affects equally the body weight of all individuals with the same values of L.

9 14.4 Rank Preservation
Reality is probably closer to Fig 14.3.
[Figure 14.3]
Here not only are the shifts from $Y^{a=0}$ to $Y^{a=1}$ different between individuals, but also the ranks are not preserved.
A structural nested mean model is well defined in the absence of rank preservation. For example, one could propose a structural nested mean model for the setting depicted in Figure 14.3 to estimate the average causal effect within strata of L. Such an average causal effect will generally differ from the individual-level causal effects.
Because rank preservation is implausible, causal methods that rely on it are not recommended.
It is used in 14.5 to introduce g-estimation because g-estimation is easier to understand for rank-preserving models, and because the g-estimation procedure is actually the same for rank-preserving and non-rank-preserving models.

10 14.5 G-Estimation
Suppose the goal is estimating the parameters of the structural nested mean model
$E[Y^{a} - Y^{a=0} \mid A = a, L] = \beta_1 a$
For simplicity we consider only the model with one parameter, effectively assuming the average causal effect is constant across strata of L.
Assume the additive rank-preserving model
$Y_i^{a} - Y_i^{a=0} = \psi_1 a$
such that $\psi_1 = \beta_1$. Equivalently
$Y_i^{a=0} = Y_i^{a} - \psi_1 a$
or, by causal consistency,
$Y_i^{a=0} = Y_i - \psi_1 A_i$

11 14.5 G-Estimation
If the model is correct and we knew $\psi_1$, then we could calculate $Y_i^{a=0}$ for all individuals.
We don't know $\psi_1$. Moreover, drawing inference about $\psi_1$ is our goal.
Thought experiment: Your friend (an oracle) knows the value of $\psi_1$. She tells you it equals one of the following three values: $\psi^{\dagger} = -20$, $\psi^{\dagger} = 0$ or $\psi^{\dagger} = 10$. She then challenges you to determine the true value based on the observed data.
You accept the challenge. For each individual compute
$H(\psi^{\dagger}) = Y - \psi^{\dagger} A$
for each of the three possible values of $\psi^{\dagger}$.
The three newly created random variables $H(-20)$, $H(0)$ and $H(10)$ are candidate potential outcomes. Only one of the three is the correct potential outcome $Y^{a=0}$. How do you choose which one?

12 14.5 G-Estimation
Remember from 14.2 that the assumption of conditional exchangeability can be expressed as a logistic model for treatment given the counterfactual outcome and the covariates L. When conditional exchangeability holds, the coefficient for the counterfactual outcome should be zero.
This suggests we fit three separate logistic regression models
$\text{logit}\,\Pr[A = 1 \mid H(\psi^{\dagger}), L] = \alpha_0 + \alpha_1 H(\psi^{\dagger}) + \alpha_2 L$
The candidate $H(\psi^{\dagger})$ with $\hat{\alpha}_1 \approx 0$ is the counterfactual $Y^{a=0}$, and the corresponding $\psi^{\dagger}$ is the estimate of the true $\psi_1$.
E.g., suppose for $H(\psi^{\dagger} = 10)$ that $\hat{\alpha}_1 \approx 0$. Then $\hat{\psi}_1 = 10$. This is g-estimation.
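The oracle challenge can be sketched on simulated data. Everything below is an assumption made for illustration (one covariate `L`, a true effect of 10, the noise level, and plain `glm` without censoring weights); it is not the NHEFS analysis. For each candidate value the code builds $H(\psi^{\dagger})$, fits the logistic model for treatment, and picks the candidate whose coefficient on $H$ is closest to zero.

```r
set.seed(2)
n  <- 2000
L  <- rnorm(n)
A  <- rbinom(n, 1, plogis(-1 + 0.5 * L))   # treatment depends on L only
Y0 <- 5 + 2 * L + rnorm(n, sd = 10)        # counterfactual outcome under a = 0
psi.true <- 10                             # assumed true effect (hypothetical)
Y  <- Y0 + psi.true * A                    # observed outcome under rank preservation

candidates <- c(-20, 0, 10)                # the oracle's short list
alpha1 <- sapply(candidates, function(psi) {
  H <- Y - psi * A                         # candidate counterfactual H(psi)
  unname(coef(glm(A ~ H + L, family = binomial()))["H"])
})
data.frame(psi = candidates, alpha1.hat = alpha1)

# Candidate whose coefficient on H(psi) is closest to zero
psi.hat <- candidates[which.min(abs(alpha1))]
psi.hat
```

The wrong candidates leave part of the treatment effect inside $H$, so $H$ still predicts treatment and its coefficient moves away from zero.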

13 14.5 G-Estimation
Important note: G-estimation does not test whether conditional exchangeability holds; it assumes it holds in order to draw inference about the causal effect of interest.
In reality we do not have an oracle friend supplying a short list of possible values of $\psi_1$.
Therefore we need to search over all possible values of $\psi^{\dagger}$ until we find one where the corresponding $\hat{\alpha}_1 = 0$.
Operationally this is done by a search over a fine grid (e.g., -20 to 20 by 0.01).
NHEFS example: consider 31 possible candidates $H(2.0), H(2.1), H(2.2), \ldots, H(4.9), H(5.0)$. Fit 31 separate logistic regression models of the probability of smoking cessation $A = 1$ just as in Chapter 12 (with the same L), but include $H(\psi^{\dagger})$ as an additional covariate.

14 14.5 G-Estimation
The coefficient estimate $\hat{\alpha}_1$ for $H(\psi^{\dagger})$ was closest to zero for $H(3.4)$ and $H(3.5)$.
A finer search locates the value of $\psi^{\dagger}$ between these two at which $\hat{\alpha}_1$ is essentially zero.
Thus the g-estimate of the average causal effect of smoking cessation on weight gain is 3.4 kg.
The Wald test of $H_0: \alpha_1 = 0$ at that value of $\psi^{\dagger}$ yields a p-value $p \approx 1$.
To find a 95% confidence interval for $\psi_1$, find the subset of $\psi^{\dagger}$ values where $p > 0.05$ (this is the standard approach of constructing a CI by inverting a hypothesis test).
For the NHEFS data and the 31 logistic models, this yields the 95% CI [2.5, 4.5] (essentially the same as IP weighting and the parametric g-formula).
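The test-inversion interval can be sketched as follows. This is an illustration on simulated data with an assumed true effect of 3.5, a single covariate, no censoring, and model-based Wald p-values from `glm`; the grid spacing of 0.05 is also an arbitrary choice. The point estimate is the grid value with the smallest $|\hat{\alpha}_1|$, and the interval collects the grid values whose Wald test is not rejected at the 0.05 level.

```r
set.seed(3)
n <- 2000
L <- rnorm(n)
A <- rbinom(n, 1, plogis(-1 + 0.5 * L))
Y <- 1 + 2 * L + rnorm(n, sd = 2) + 3.5 * A     # assumed true psi = 3.5

grid <- seq(2, 5, by = 0.05)
res <- t(sapply(grid, function(psi) {
  H   <- Y - psi * A                            # candidate counterfactual
  tab <- summary(glm(A ~ H + L, family = binomial()))$coefficients
  c(psi = psi, alpha1 = tab["H", "Estimate"], p = tab["H", "Pr(>|z|)"])
}))
res <- as.data.frame(res)

psi.hat <- res$psi[which.min(abs(res$alpha1))]  # point estimate
ci      <- range(res$psi[res$p > 0.05])         # 95% CI by inverting the Wald test
c(estimate = psi.hat, lower = ci[1], upper = ci[2])
```

The resolution of both the estimate and the interval limits is bounded by the grid step, so a finer grid is needed for more decimal places.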

15 G-Estimation: chapter14.r
##################################################################
# G-estimation: Checking multiple possible values of psi
##################################################################
library(geepack)

data <- nhefs.g.est
grid <- seq(from = 2, to = 5, by = 0.01)  # decrease `by` for a finer search
store.hpsi.coefs <- double(length(grid))

for (j in seq_along(grid)) {
  psi <- grid[j]
  # candidate counterfactual H(psi) = Y - psi * A
  data$hpsi <- data$wt82_71 - psi * data$qsmk
  # weighted logistic model for treatment, with H(psi) as an extra covariate
  gee.obj <- geeglm(qsmk ~ as.factor(sex) + as.factor(race) + age + I(age^2) +
                      as.factor(education) + smokeintensity + I(smokeintensity^2) +
                      smokeyrs + I(smokeyrs^2) + as.factor(exercise) +
                      as.factor(active) + wt71 + I(wt71^2) + hpsi,
                    data = data, weights = w.cens, id = id,
                    corstr = "independence", family = binomial(logit))
  store.hpsi.coefs[j] <- coef(gee.obj)["hpsi"]
  cat("iteration", j, "completed\n")
}

# grid value whose coefficient on H(psi) is closest to zero
store.results <- data.frame(grid = grid, hpsi.est = abs(store.hpsi.coefs))
store.results[store.results$hpsi.est == min(store.results$hpsi.est), ]

16 G-Estimation: chapter14.r
[Figure: estimated coefficient $\hat{\alpha}_1$ plotted against $\psi$]

17 G-Estimation: chapter14.r
[Figure: P value plotted against $\psi$]

18 14.5 G-Estimation: Comments
Other tests of $H_0: \alpha_1 = 0$ aside from the Wald test, such as the score test or likelihood ratio test, could be used instead.
If we assume $Y^{a} \perp \{A, C\} \mid L$, there is no need to adjust for censoring.
Otherwise, if we make the weaker assumption $Y \perp C \mid \{A, L\}$, we need to construct inverse probability of censoring weights $W^{C} = 1/\Pr[C = 0 \mid A, L]$ as in Chapter 12.
With IP censoring weights and standard software, one can (conservatively) use the robust variance estimate to construct Wald tests of $H_0: \alpha_1 = 0$; expect 95% CIs to be wider than if a non-conservative variance estimate or the bootstrap were used instead.
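A minimal sketch of constructing IP censoring weights, assuming a logistic model for remaining uncensored given treatment and a single covariate. The data are simulated and the censoring model's form and coefficients are hypothetical; in the NHEFS code the resulting weights play the role of `w.cens`.

```r
set.seed(4)
n <- 1000
L <- rnorm(n)
A <- rbinom(n, 1, plogis(-1 + 0.5 * L))
C <- rbinom(n, 1, plogis(-2 + 0.4 * A + 0.3 * L))   # 1 = censored (assumed model)

uncens   <- as.integer(C == 0)                      # indicator of being uncensored
cens.fit <- glm(uncens ~ A + L, family = binomial())
p.uncens <- predict(cens.fit, type = "response")    # estimated Pr[C = 0 | A, L]
w.cens   <- 1 / p.uncens                            # IP censoring weight

# Only uncensored individuals contribute to the g-estimation models
summary(w.cens[C == 0])
```

Each uncensored individual is weighted up to stand in for similar individuals (same `A` and `L`) who were censored, so every weight is at least 1.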

19 14.5 G-Estimation: Comments
Back to non-rank-preserving models.
The g-estimation estimator $\hat{\psi}_1$ is consistent for the parameter $\beta_1$ of the structural nested mean model, assuming the mean model is correctly specified (i.e., if the average treatment effect is equal in all levels of L).
This is true regardless of whether the individual treatment effect is constant.
I.e., it is not necessary that $H(\beta_1) = Y^{a=0}$ for all subjects. Rather, it is sufficient for $H(\beta_1)$ and $Y^{a=0}$ to have the same conditional mean given L.

20 14.6 SNM with 2 or more parameters
The one-parameter structural nested model
$E[Y^{a} - Y^{a=0} \mid L] = \beta_1 a$
assumes the same average treatment effect in all levels of L.
If this model is misspecified, i.e., there is effect modification by some components V of L, inferences will be wrong.
We expect effect modification to be the case in general.
[Note, in contrast, that effect modification does not invalidate the MSM methods described in Chapter 12]
Relax this assumption by considering instead the two-parameter SNM
$E[Y^{a} - Y^{a=0} \mid L] = \beta_1 a + \beta_2 a V$

21 14.6 SNM with 2 or more parameters
For g-estimation, the corresponding rank-preserving model is
$Y_i^{a} - Y_i^{a=0} = \psi_1 a + \psi_2 a V_i$
and now let
$H(\psi^{\dagger}) = Y - \psi_1^{\dagger} A - \psi_2^{\dagger} A V$
To estimate $\psi_1$ and $\psi_2$, fit the logistic model
$\text{logit}\,\Pr[A = 1 \mid H(\psi^{\dagger}), L] = \alpha_0 + \alpha_1 H(\psi^{\dagger}) + \alpha_2 H(\psi^{\dagger}) V + \alpha_3 L$
Find the combination of $\psi_1^{\dagger}$ and $\psi_2^{\dagger}$ where $H(\psi^{\dagger}) \perp A \mid L$.
I.e., search for the combination $(\psi_1^{\dagger}, \psi_2^{\dagger})$ that yields $\hat{\alpha}_1 = \hat{\alpha}_2 = 0$.
In general, the solution does not have a closed form and therefore numerical search algorithms (e.g., Nelder-Mead simplex) must be used.
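One way the numerical search could look, sketched with base R's `optim` and its Nelder-Mead method on simulated data. The data-generating values, the binary effect modifier `V`, and the choice of objective (the sum of squared coefficients $\hat{\alpha}_1^2 + \hat{\alpha}_2^2$) are assumptions for illustration; the slides do not prescribe a particular objective function.

```r
set.seed(5)
n <- 3000
L <- rnorm(n)
V <- rbinom(n, 1, 0.5)                       # effect modifier, treated as part of the covariates
A <- rbinom(n, 1, plogis(-1 + 0.5 * L + 0.5 * V))
psi.true <- c(3, 1.5)                        # hypothetical (psi1, psi2)
Y <- 1 + 2 * L + V + rnorm(n, sd = 2) + (psi.true[1] + psi.true[2] * V) * A

# Objective: squared size of (alpha1.hat, alpha2.hat) at a candidate psi
objective <- function(psi) {
  H  <- Y - psi[1] * A - psi[2] * A * V      # two-parameter H(psi)
  HV <- H * V
  a  <- coef(glm(A ~ H + HV + L + V, family = binomial()))
  unname(a["H"]^2 + a["HV"]^2)
}

fit <- optim(c(0, 0), objective, method = "Nelder-Mead",
             control = list(abstol = 1e-10, maxit = 1000))
psi.hat <- fit$par                           # estimates of (psi1, psi2)
psi.hat
```

Driving the objective to zero is the same as requiring both coefficients to vanish. Nelder-Mead needs no derivatives, which is convenient here because each evaluation of the objective is itself a model fit.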

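22 14.6 NHEFS Data Revisited
Consider the two-parameter SNM
$E[Y^{a} - Y^{a=0} \mid L] = \beta_1 a + \beta_2 a V$
where Y is the change in weight between follow-up and baseline, and V is baseline smoking intensity.
Numerical 2-d grid search; fit the logistic model
$\text{logit}\,\Pr[A = 1 \mid H(\psi^{\dagger}), L] = \alpha_0 + \alpha_1 H(\psi^{\dagger}) + \alpha_2 H(\psi^{\dagger}) V + \alpha_3 L$
for $\psi_1^{\dagger} \in \{2, 2.05, \ldots, 5\}$ and $\psi_2^{\dagger} \in \{-1, -0.95, \ldots, 1\}$.
Find the values of $\psi_1^{\dagger}$ and $\psi_2^{\dagger}$ where $\hat{\alpha}_1 \approx \hat{\alpha}_2 \approx 0$.
This yields $\hat{\beta}_1 = \hat{\psi}_1$ and $\hat{\beta}_2 = \hat{\psi}_2$.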

23 [Figure: contour plot of $\alpha_1 + \alpha_2$ over $\psi_1$ (horizontal axis) and $\psi_2$ (vertical axis)]

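24 Tech Pt 14.2
In certain settings, the g-estimator has a closed form.
E.g., consider the one-parameter SNM
$E[Y^{a} - Y^{a=0} \mid L] = \beta_1 a$
Suppose g-estimation is based on the score test of $H_0: \alpha_1 = 0$ in
$\text{logit}\,\Pr[A = 1 \mid H(\psi^{\dagger}), L] = \alpha_0 + \alpha_1 H(\psi^{\dagger}) + \alpha_2 L$
Then this is equivalent to finding the parameter value $\psi^{\dagger}$ that solves the estimating equation (EE)
$\sum_i H_i(\psi^{\dagger}) (A_i - \hat{E}[A_i \mid L_i]) = 0$
Using the fact that $H_i(\psi^{\dagger}) = Y_i - \psi^{\dagger} A_i$, the closed-form solution is
$\hat{\psi}_1 = \dfrac{\sum_i Y_i (A_i - \hat{E}[A_i \mid L_i])}{\sum_i A_i (A_i - \hat{E}[A_i \mid L_i])}$
What if there is censoring, or if we fit a two-parameter SNM? See Tech Pt 14.2.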
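The step from the estimating equation to the closed form is a one-line rearrangement, written out here for completeness. It uses only the definition of $H_i(\psi^{\dagger})$ from the slide and assumes the denominator is nonzero.

```latex
\begin{align*}
0 &= \sum_i H_i(\psi^{\dagger})\,\bigl(A_i - \hat{E}[A_i \mid L_i]\bigr)
   = \sum_i \bigl(Y_i - \psi^{\dagger} A_i\bigr)\bigl(A_i - \hat{E}[A_i \mid L_i]\bigr) \\
  &= \sum_i Y_i \bigl(A_i - \hat{E}[A_i \mid L_i]\bigr)
     - \psi^{\dagger} \sum_i A_i \bigl(A_i - \hat{E}[A_i \mid L_i]\bigr) \\
\Longrightarrow\quad
\hat{\psi}_1 &= \frac{\sum_i Y_i \bigl(A_i - \hat{E}[A_i \mid L_i]\bigr)}
                    {\sum_i A_i \bigl(A_i - \hat{E}[A_i \mid L_i]\bigr)}
\end{align*}
```

Because the equation is linear in $\psi^{\dagger}$, no grid search is needed in this case.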

25 chapter14.r
##################################################################
# G-estimation: Closed form estimator linear mean models
##################################################################
# weighted logistic model for treatment given the covariates L
logit.est <- glm(as.factor(qsmk) ~ as.factor(sex) + as.factor(race) + age + I(age^2) +
                   as.factor(education) + smokeintensity + I(smokeintensity^2) +
                   smokeyrs + I(smokeyrs^2) + as.factor(exercise) +
                   as.factor(active) + wt71 + I(wt71^2),
                 data = nhefs0, weights = w.cens, family = binomial("logit"))
# predicted probability of treatment, E[qsmk | L]
nhefs0$qsmk.pred <- predict(logit.est, nhefs0, type = "response")

# solve sum(w_c * H(psi) * (qsmk - E[qsmk | L])) = 0
# for a single psi and H(psi) = wt82_71 - psi * qsmk
# this can be solved as
# psi = sum(w_c * wt82_71 * (qsmk - qsmk.pred)) / sum(w_c * qsmk * (qsmk - qsmk.pred))
with(nhefs0, sum(w.cens * wt82_71 * (qsmk - qsmk.pred)) /
       sum(w.cens * qsmk * (qsmk - qsmk.pred)))
# prints the closed-form estimate of psi
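For the two-parameter model, the same idea gives two estimating equations that are linear in $(\psi_1, \psi_2)$. The sketch below assumes those equations are $\sum_i H_i(\psi)(A_i - \hat{E}[A_i \mid L_i]) = 0$ and $\sum_i H_i(\psi) V_i (A_i - \hat{E}[A_i \mid L_i]) = 0$, and solves them as a 2 x 2 linear system on simulated data without censoring. The data-generating values and variable names are hypothetical.

```r
set.seed(6)
n <- 3000
L <- rnorm(n)
V <- rbinom(n, 1, 0.5)                          # effect modifier (assumed binary here)
A <- rbinom(n, 1, plogis(-1 + 0.5 * L + 0.5 * V))
Y <- 1 + 2 * L + V + rnorm(n, sd = 2) + (3 + 1.5 * V) * A   # assumed psi = (3, 1.5)

# Predicted probability of treatment and the treatment residual
p.A <- predict(glm(A ~ L + V, family = binomial()), type = "response")
r   <- A - p.A

# Linear system lhs %*% psi = rhs from the two estimating equations
lhs <- matrix(c(sum(A * r),     sum(A * V * r),
                sum(A * V * r), sum(A * V^2 * r)), nrow = 2, byrow = TRUE)
rhs <- c(sum(Y * r), sum(Y * V * r))
psi.hat <- solve(lhs, rhs)                      # estimates of (psi1, psi2)
psi.hat
```

With censoring, each sum would additionally carry the IP censoring weight, in the same way `w.cens` enters the one-parameter code above.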

26 Recap
Chapter 12: Fitting an MSM via IPW requires a correct model of $\Pr[A = a \mid L]$.
Chapter 13: The parametric g-formula requires a correct model of $E[Y \mid A, L]$.
Doubly robust (DR) estimators require (i) a correct model of $\Pr[A = a \mid L]$ or (ii) a correct model of $E[Y \mid A, L]$, but not necessarily both.
Chapter 14: G-estimation requires a correct model of $\Pr[A = a \mid L]$ and a correct (semiparametric) structural mean model $E[Y^{a} - Y^{a=0} \mid V] = \beta_1 a + \beta_2 a V$.
See Tech Pt 14.2 regarding DR g-estimators.
G-estimation is less popular because it is computationally demanding and lacks off-the-shelf software, but it has advantages over other approaches (Vansteelandt and Joffe, Stat Sci 2014).

