G-ESTIMATION OF STRUCTURAL NESTED MODELS (CHAPTER 14)
BIOS 776

Outline
14.1 The causal question revisited
14.2 Exchangeability revisited
14.3 Structural nested mean models
14.4 Rank preservation
14.5 G-estimation
14.6 Structural nested models with two or more parameters

14.2 Exchangeability revisited
Recall that conditional exchangeability is defined to be
$Y^a \perp\!\!\!\perp A \mid L$ for $a = 0, 1$
For binary treatment $A$ this is equivalent to
$\Pr[A = 1 \mid Y^a, L] = \Pr[A = 1 \mid L]$
Consider the following parametric logistic regression model
$\operatorname{logit} \Pr[A = 1 \mid Y^{a=0}, L] = \alpha_0 + \alpha_1 Y^{a=0} + \alpha_2 L$
Fitting such a model to a real data set is not possible because $Y^{a=0}$ is not observed for all individuals.
Thought experiment: Suppose $Y^{a=0}$ were observed for all individuals so that we could fit this model. If conditional exchangeability holds and the model is correctly specified, what would you expect $\hat\alpha_1$ to equal?

14.3 Structural nested mean models
Consider the model
$E[Y^a - Y^{a=0} \mid A = a, L] = \beta_1 a + \beta_2 a L$
such that $\beta_1 + \beta_2 l$ equals the average causal effect (RD) within stratum $L = l$.
Below we discuss using g-estimation to draw inference about $\beta_1$ and $\beta_2$.
Note this model is semi-parametric in the sense that we are not specifying a model for $E[Y^{a=0} \mid L]$, i.e., there is no intercept $\beta_0$ or term $\beta_3 L$ in the model.
This is in contrast to the parametric g-formula from Chapter 13. Thus we expect g-estimation to be more robust to model mis-specification than the parametric g-formula.
However, g-estimation can only be used to adjust for confounding, not for selection bias (e.g., due to censoring).

14.4 Rank Preservation
Suppose, contrary to fact, that for the NHEFS data we knew $Y^{a=1}$ and $Y^{a=0}$ for every individual, i.e., their potential weight gain if they quit smoking and if they did not quit smoking.
Imagine we sorted individuals according to $Y^{a=1}$ from largest value to smallest value.
Imagine we sorted individuals according to $Y^{a=0}$ from largest value to smallest value.
Suppose individuals end up in the same order in both cases: this is rank preservation.

14.4 Rank Preservation
When the effect of treatment $A$ on the outcome $Y$ is exactly the same, on the additive scale, for all individuals in the study population, we say that additive rank preservation holds.
For example, if smoking cessation increases everybody's body weight by exactly 3 kg, then the ranking of individuals according to $Y^{a=0}$ would be equal to the ranking according to $Y^{a=1}$.
A particular case of additive rank preservation occurs when the sharp null hypothesis is true (Chapter 1), i.e., treatment has no effect on the outcomes of any individual.
For the purposes of structural nested mean models, we will care about additive rank preservation within levels of $L$. This conditional additive rank preservation holds if the effect of treatment $A$ on the outcome $Y$ is exactly the same for all individuals with the same values of $L$.

14.4 Rank Preservation
An example of an (additive conditional) rank-preserving structural model is
$Y_i^a - Y_i^{a=0} = \psi_1 a + \psi_2 a L_i$ for all subjects $i$
where $\psi_1 + \psi_2 l$ is the constant causal effect for all individuals with covariate values $L = l$.
For every individual $i$ with $L_i = l$,
$Y_i^{a=1} = Y_i^{a=0} + \psi_1 + \psi_2 l$
The potential outcome under no treatment, $Y_i^{a=0}$, is shifted by $\psi_1 + \psi_2 l$ to obtain the potential outcome under treatment, $Y_i^{a=1}$.
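The shift described on this slide can be checked numerically. The sketch below is only an illustration: the weight gains, the stratum, and the values psi1 = 3.0 and psi2 = 0.5 are made up, and the helper names `shift_outcomes` and `ranks` are hypothetical.

```python
def shift_outcomes(y0, l, psi1, psi2):
    """Y^{a=1} under the rank-preserving model: Y^{a=0} + psi1 + psi2 * l."""
    return [y + psi1 + psi2 * li for y, li in zip(y0, l)]


def ranks(values):
    """Rank position of each individual (0 = largest value), assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    rank = [0] * len(values)
    for position, i in enumerate(order):
        rank[i] = position
    return rank


# Hypothetical untreated weight gains (kg) for five people in the same stratum L = 1
y0 = [4.0, -1.5, 2.2, 7.1, 0.3]
l = [1, 1, 1, 1, 1]
y1 = shift_outcomes(y0, l, psi1=3.0, psi2=0.5)

print(ranks(y0) == ranks(y1))  # True: everyone in the stratum moves by the same 3.5 kg
```

Because every individual in the stratum is shifted by the same amount, sorting by either potential outcome gives the same order.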

14.4 Rank Preservation
Figs 14.1 and 14.2 show examples of additive rank preservation within two strata $L = l$ and $L = l'$
[Figure 14.1]
[Figure 14.2]
For most treatments and outcomes, the individual causal effect is not expected to be constant across individuals with the same covariate values, and thus (additive conditional) rank preservation is scientifically implausible.

14.4 Rank Preservation
E.g., we do not expect smoking cessation to affect equally the body weight of all individuals with the same values of $L$.
Reality is probably closer to Fig 14.3
[Figure 14.3]
Here not only are the shifts from $Y^{a=0}$ to $Y^{a=1}$ different between individuals, but also the ranks are not preserved.
A structural nested mean model is well defined in the absence of rank preservation. For example, one could propose a structural nested mean model for the setting depicted in Figure 14.3 to estimate the average causal effect within strata of $L$. Such an average causal effect will generally differ from the individual-level causal effects.
Because of the implausibility of rank preservation, causal methods that rely on it are not recommended.
Rank preservation is used in 14.5 to introduce g-estimation because g-estimation is easier to understand for rank-preserving models, and because the g-estimation procedure is actually the same for rank-preserving and non-rank-preserving models.

14.5 G-Estimation
Suppose the goal is estimating the parameters of the structural nested mean model
$E[Y^a - Y^{a=0} \mid A = a, L] = \beta_1 a$
For simplicity we consider only a model with one parameter, effectively assuming the average causal effect is constant across strata of $L$.
Assume the additive rank-preserving model
$Y_i^a - Y_i^{a=0} = \psi_1 a$
such that $\psi_1 = \beta_1$. Equivalently,
$Y_i^{a=0} = Y_i^a - \psi_1 a$
or, by causal consistency,
$Y_i^{a=0} = Y_i - \psi_1 A_i$

14.5 G-Estimation
If the model is correct and we knew $\psi_1$, then we could calculate $Y_i^{a=0}$ for all individuals.
We don't know $\psi_1$. Moreover, drawing inference about $\psi_1$ is our goal.
Thought experiment: Your friend (an oracle) knows the value of $\psi_1$. She tells you it equals one of the following three values: $\psi^\dagger = -20$, $\psi^\dagger = 0$ or $\psi^\dagger = 10$. She then challenges you to determine the true value based on the observed data. You accept the challenge.
For each individual compute
$H(\psi^\dagger) = Y - \psi^\dagger A$
for each of the three possible values of $\psi^\dagger$.
The three newly created random variables $H(-20)$, $H(0)$ and $H(10)$ are candidate potential outcomes. Only one of the three is the correct potential outcome $Y^{a=0}$. How do you choose which one?

14.5 G-Estimation
Remember from 14.2 that the assumption of conditional exchangeability can be expressed as a logistic model for treatment given the counterfactual outcome and the covariates $L$. When conditional exchangeability holds, the coefficient for the counterfactual outcome should be zero.
This suggests we fit three separate logistic regression models
$\operatorname{logit} \Pr[A = 1 \mid H(\psi^\dagger), L] = \alpha_0 + \alpha_1 H(\psi^\dagger) + \alpha_2 L$
The candidate $H(\psi^\dagger)$ with $\alpha_1 = 0$ is the counterfactual $Y^{a=0}$, and the corresponding $\psi^\dagger$ equals the true $\psi_1$.
E.g., suppose for $H(\psi^\dagger = 10)$ that $\hat\alpha_1 = 0$. Then $\hat\psi_1 = 10$. This is g-estimation.
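The oracle challenge can be tried on simulated data. The sketch below is an illustration under stated assumptions, not the course's code: the data-generating values are made up, treatment depends on a binary L only (so conditional exchangeability holds by construction), the true effect is set to 10, and `solve` and `fit_logistic` are hypothetical helpers implementing a plain Newton-Raphson logistic fit in place of standard software.

```python
import math
import random


def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def fit_logistic(X, y, iters=15):
    """Maximum-likelihood logistic regression coefficients via Newton-Raphson."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            eta = max(-30.0, min(30.0, eta))  # guard against overflow in exp
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Simulated data (all values hypothetical); A depends on L only
rng = random.Random(776)
TRUE_PSI = 10.0
L, A, Y = [], [], []
for _ in range(1000):
    l = 1 if rng.random() < 0.5 else 0
    a = 1 if rng.random() < 0.2 + 0.3 * l else 0
    y0 = 2.0 + 3.0 * l + rng.gauss(0.0, 10.0)  # counterfactual outcome under a = 0
    L.append(l)
    A.append(a)
    Y.append(y0 + TRUE_PSI * a)  # rank-preserving model plus consistency

# One logistic model per candidate; keep the coefficient of H(psi)
alpha1 = {}
for psi in (-20.0, 0.0, 10.0):
    H = [y - psi * a for y, a in zip(Y, A)]
    X = [[1.0, h, float(l)] for h, l in zip(H, L)]
    alpha1[psi] = fit_logistic(X, A)[1]

best = min(alpha1, key=lambda psi: abs(alpha1[psi]))
print(alpha1, best)
```

In this setup the candidate whose coefficient is closest to zero should be the true value, because only H(10) reproduces the simulated counterfactual outcome under no treatment.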

14.5 G-Estimation
Important note: g-estimation does not test whether conditional exchangeability holds; it assumes it holds in order to draw inference about the causal effect of interest.
In reality we do not have an oracle friend supplying a short list of possible values of $\psi_1$.
Therefore we need to search over all possible values of $\psi^\dagger$ until we find one where the corresponding $\hat\alpha_1 = 0$.
Operationally this is done by a search over a fine grid (e.g., -20 to 20 by 0.01).
NHEFS example: consider 31 possible candidates $H(2.0), H(2.1), H(2.2), \ldots, H(4.9), H(5.0)$.
Fit 31 separate logistic regression models of the probability of smoking cessation $A = 1$ just as in Chapter 12, but include $H(\psi^\dagger)$ as an additional covariate.

14.5 G-Estimation
The coefficient estimate $\hat\alpha_1$ for $H(\psi^\dagger)$ was closest to zero for $H(3.4)$ and $H(3.5)$.
A finer search reveals $\hat\alpha_1$ essentially zero for $\psi^\dagger = 3.446$.
Thus the g-estimate of the average causal effect of smoking cessation on weight gain is 3.4 kg.
The Wald test of $H_0: \alpha_1 = 0$ at $\psi^\dagger = 3.446$ yields p-value $p \approx 1$.
To find a 95% confidence interval for $\psi_1$, find the subset of $\psi^\dagger$ values where $p > 0.05$ (this is the standard approach of constructing a CI by inverting a hypothesis test).
For the NHEFS data and the 31 logistic models, this yields the 95% CI [2.5, 4.5] (essentially the same as IP weighting and the parametric g-formula).
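The grid search and test inversion can be sketched on simulated data. Several assumptions are swapped in here and should be read as such: the data are made up (not NHEFS), L is binary, Pr[A = 1 | L] is estimated by stratum proportions, and a score test of alpha_1 = 0 replaces the Wald test from fitted logistic models. The function `score_chi2` is a hypothetical helper.

```python
import random


def score_chi2(psi, Y, A, L):
    """Score-test chi-square (1 df) for alpha_1 = 0 with H(psi) = Y - psi * A.

    Pr[A = 1 | L] is the treated proportion within each level of L
    (a saturated treatment model), which assumes L is discrete.
    """
    H = [y - psi * a for y, a in zip(Y, A)]
    U, V = 0.0, 0.0
    for level in set(L):
        idx = [i for i, l in enumerate(L) if l == level]
        p = sum(A[i] for i in idx) / len(idx)
        h_bar = sum(H[i] for i in idx) / len(idx)
        U += sum(H[i] * (A[i] - p) for i in idx)  # score for alpha_1
        V += p * (1.0 - p) * sum((H[i] - h_bar) ** 2 for i in idx)  # its variance
    return U * U / V


# Simulated data (all values hypothetical); A depends on L only
rng = random.Random(14)
TRUE_PSI = 3.5
L, A, Y = [], [], []
for _ in range(5000):
    l = 1 if rng.random() < 0.5 else 0
    a = 1 if rng.random() < 0.2 + 0.3 * l else 0
    L.append(l)
    A.append(a)
    Y.append(1.0 + 2.0 * l + rng.gauss(0.0, 8.0) + TRUE_PSI * a)

grid = [round(2.0 + 0.1 * k, 1) for k in range(31)]  # 2.0, 2.1, ..., 5.0
chi2 = {psi: score_chi2(psi, Y, A, L) for psi in grid}
psi_hat = min(grid, key=lambda psi: chi2[psi])  # test statistic closest to zero
inside = [psi for psi in grid if chi2[psi] < 3.841]  # p > 0.05 on 1 df
print(psi_hat, (min(inside), max(inside)))
```

If the interval reaches the edge of the grid, the grid is too narrow and should be widened before reporting the limits.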

14.5 G-Estimation: Comments
Other tests of $H_0: \alpha_1 = 0$ aside from the Wald test, such as the score test or likelihood ratio test, could be used instead.
If we assume $Y^a \perp\!\!\!\perp \{A, C\} \mid L$, there is no need to adjust for censoring.
Otherwise, if we make the weaker assumption $Y \perp\!\!\!\perp C \mid \{A, L\}$, we need to construct inverse probability of censoring weights
$W^C = 1 / \Pr[C = 0 \mid A = a, L]$
as in Chapter 12.
With IP censoring weights and standard software, we can (conservatively) use the robust variance estimate to construct Wald tests of $H_0: \alpha_1 = 0$; expect the 95% CIs to be wider than if a non-conservative variance estimate or the bootstrap were used instead.
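A minimal sketch of the censoring weights follows. It assumes discrete A and L and estimates Pr[C = 0 | A, L] by the uncensored proportion in each (A, L) cell rather than by a fitted logistic model; the records and the helper name `censoring_weights` are hypothetical.

```python
from collections import defaultdict


def censoring_weights(records):
    """records: list of (a, l, c) with c = 0 for uncensored.

    Returns one weight per record: 1 / Pr[C = 0 | A, L] for uncensored
    individuals and 0 for censored individuals.
    """
    total = defaultdict(int)
    uncensored = defaultdict(int)
    for a, l, c in records:
        total[(a, l)] += 1
        if c == 0:
            uncensored[(a, l)] += 1
    weights = []
    for a, l, c in records:
        if c == 0:
            # inverse of the estimated probability of remaining uncensored
            weights.append(total[(a, l)] / uncensored[(a, l)])
        else:
            weights.append(0.0)
    return weights


# Hypothetical (A, L, C) records
records = [
    (0, 0, 0), (0, 0, 0), (0, 0, 1), (0, 0, 0),
    (1, 0, 0), (1, 0, 1),
    (0, 1, 0), (0, 1, 0),
    (1, 1, 0), (1, 1, 1), (1, 1, 0), (1, 1, 1),
]
w_c = censoring_weights(records)
print(sum(w_c))  # equals the number of records, up to floating-point rounding
```

The weighted uncensored individuals stand in for everyone in their (A, L) cell, which is why the weights sum to the full sample size.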

14.5 G-Estimation: Comments
Back to non-rank-preserving models:
The g-estimation algorithm (i.e., the computer code implementing the procedure) for estimating $\psi_1$ produces a consistent estimate of the parameter $\beta_1$ of the mean model, assuming the mean model is correctly specified (i.e., if the average treatment effect is equal in all levels of $L$).
This is true regardless of whether the individual treatment effect is constant.
I.e., it is not necessary that $H(\beta_1) = Y^{a=0}$ for all subjects. Rather, it is sufficient for $H(\beta_1)$ and $Y^{a=0}$ to have the same conditional mean given $L$.
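This claim can be illustrated by simulation. In the sketch below, individual effects vary (so ranks are not preserved) but their mean is 3.5 in every level of L. The estimate uses the closed-form solution of Tech Pt 14.2 without censoring, with E[A | L] estimated by stratum proportions. All numerical values are made up for illustration.

```python
import random

rng = random.Random(2024)
BETA1 = 3.5  # average causal effect, equal in every level of L
L, A, Y = [], [], []
for _ in range(5000):
    l = 1 if rng.random() < 0.5 else 0
    a = 1 if rng.random() < 0.2 + 0.3 * l else 0
    y0 = 1.0 + 2.0 * l + rng.gauss(0.0, 5.0)
    effect = BETA1 + rng.gauss(0.0, 4.0)  # individual effect varies; some are negative
    L.append(l)
    A.append(a)
    Y.append(y0 + effect * a)

# E[A | L] estimated by the treated proportion in each level of L
p_hat = {
    lev: sum(a for a, l in zip(A, L) if l == lev) / L.count(lev)
    for lev in (0, 1)
}

# Closed-form one-parameter g-estimate (no censoring, so all weights are 1)
num = sum(y * (a - p_hat[l]) for y, a, l in zip(Y, A, L))
den = sum(a * (a - p_hat[l]) for a, l in zip(A, L))
psi_hat = num / den
print(round(psi_hat, 2))
```

The estimate should land near 3.5 even though no individual's effect equals 3.5 exactly, in line with the statement that only the conditional means of H and the untreated potential outcome need to agree.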

14.6 SNM with 2 or more parameters
The one-parameter structural nested model
$E[Y^a - Y^{a=0} \mid L] = \beta_1 a$
assumes the same average treatment effect in all levels of $L$.
If this model is mis-specified, i.e., there is effect modification by some components $V$ of $L$, inferences will be wrong.
We expect effect modification to be the case in general.
[Note, in contrast, that effect modification does not invalidate the MSM methods described in Chapter 12]
Relax this assumption by considering instead the two-parameter SNM
$E[Y^a - Y^{a=0} \mid L] = \beta_1 a + \beta_2 a V$
and, for g-estimation, the corresponding rank-preserving model
$Y_i^a - Y_i^{a=0} = \psi_1 a + \psi_2 a V_i$

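14.6 SNM with 2 or more parameters
To estimate $\psi_1$ and $\psi_2$, fit the logistic model
$\operatorname{logit} \Pr[A = 1 \mid H(\psi^\dagger), L] = \alpha_0 + \alpha_1 H(\psi^\dagger) + \alpha_2 H(\psi^\dagger) V + \alpha_3 L$
where, from the rank-preserving model, $H(\psi^\dagger) = Y - \psi_1^\dagger A - \psi_2^\dagger A V$.
Find the combination of $\psi_1^\dagger$ and $\psi_2^\dagger$ that results in $H(\psi^\dagger) \perp\!\!\!\perp A \mid L$.
I.e., search for the combination $(\psi_1^\dagger, \psi_2^\dagger)$ that yields $\hat\alpha_1 = \hat\alpha_2 = 0$.
In general, the solution does not have a closed form and therefore numerical search algorithms (e.g., Nelder-Mead simplex) must be used.
For a linear mean model, like the ones discussed thus far, the estimator does have a closed form (Tech Pt 14.2).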

Tech Pt 14.2
Consider the one-parameter SNM
$E[Y^a - Y^{a=0} \mid L] = \beta_1 a$
Suppose g-estimation is based on the score test of $H_0: \alpha_1 = 0$.
Then this is equivalent (HW) to finding the parameter value $\psi^\dagger$ that solves the estimating equation
$\sum_i I[C_i = 0] \, \hat W_i^C \, H_i(\psi^\dagger) \, (A_i - E[A_i \mid L_i]) = 0$
Using the fact that $H_i(\psi^\dagger) = Y_i - \psi^\dagger A_i$, the closed-form solution is
$\hat\psi_1 = \dfrac{\sum_i I[C_i = 0] \, \hat W_i^C \, Y_i \, (A_i - E[A_i \mid L_i])}{\sum_i I[C_i = 0] \, \hat W_i^C \, A_i \, (A_i - E[A_i \mid L_i])}$
What if we fit a two-parameter SNM? (HW)
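The one-parameter closed form translates directly into code. The sketch below is an illustration only: the function name `g_estimate_closed_form` and the toy data are hypothetical, the values of E[A | L] are supplied as the treated proportions in two strata, and nobody is censored, so every weight is 1.

```python
def g_estimate_closed_form(y, a, e_a, w, c):
    """Closed-form one-parameter g-estimate.

    y: outcomes, a: treatment (0/1), e_a: estimates of E[A | L],
    w: censoring weights, c: censoring indicators (0 = uncensored).
    """
    num = den = 0.0
    for yi, ai, ei, wi, ci in zip(y, a, e_a, w, c):
        if ci == 0:  # I[C_i = 0]: only uncensored individuals contribute
            num += wi * yi * (ai - ei)
            den += wi * ai * (ai - ei)
    return num / den


# Toy data: two strata of four people; the untreated outcome is constant within
# each stratum and treatment adds exactly 3.5 for everyone
a = [0, 0, 1, 0, 1, 1, 0, 1]
e_a = [0.25] * 4 + [0.75] * 4  # treated proportion in each stratum
y0 = [1.0] * 4 + [2.0] * 4
y = [y0_i + 3.5 * a_i for y0_i, a_i in zip(y0, a)]
w = [1.0] * 8
c = [0] * 8

psi_hat = g_estimate_closed_form(y, a, e_a, w, c)
print(psi_hat)  # 3.5
```

With these toy values the numerator is 5.25 and the denominator is 1.5, so the estimate recovers the constant effect of 3.5 exactly.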