IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS IPW and MSM


Ashley Maxwell
5 years ago

2 IP weighting and marginal structural models (Chapter 12)
Outline
12.1 The causal question
12.2 Estimating IP weights via modeling
12.3 Stabilized IP weights
12.4 Marginal structural models
12.5 Effect modification and marginal structural models
12.6 Censoring and missing data

3 12.1 The causal question
Goal: Estimate effect of smoking cessation ($A$) on weight gain ($Y$) using data from the National Health and Nutrition Examination Survey Data I Epidemiologic Follow-up Study (NHEFS)
$n = 1566$ smokers w/ baseline visit and follow-up visit 10 years later
Outcome $Y$: body weight (kg) at 10 years minus weight at baseline
$A = 1$ if reported quitting smoking before 10 yr visit, $A = 0$ o/w
Average weight gain $\hat{E}(Y \mid A = 1) = 4.5$ among quitters, $\hat{E}(Y \mid A = 0) = 2.0$ in non-quitters
Estimated difference 2.5 (95% CI 1.7, 3.4), p <

4 12.1 The causal question
Causal estimand of interest: $E(Y^{a=1}) - E(Y^{a=0})$
Difference in mean weight gain that would have been observed if all individuals in the population had quit smoking before the follow-up visit versus if all individuals in the population had not quit smoking
Because exposure $A$ is not randomly assigned, there may be confounding. Thus we are not willing to assume
$$E(Y \mid A = 1) - E(Y \mid A = 0) = E(Y^{a=1}) - E(Y^{a=0})$$
Let $L$ denote vector of 9 baseline covariates: sex (0 male, 1 female), age (yrs), race (0 white, 1 other), education (5 categories), intensity and duration of smoking (cigarettes/day and yrs of smoking), physical activity in daily life (3 categories), recreational exercise (3 categories), and weight (kg)

5 12.2 Estimating IP weights via modeling
Assume conditional exchangeability $Y^a \perp A \mid L$, i.e., covariates $L$ are sufficient to block all backdoor paths from $A$ to $Y$
Use IP weighting to estimate $E(Y^{a=1}) - E(Y^{a=0})$
Recall from Section 2.4 IP estimator has the form
$$\frac{1}{n}\sum_{i=1}^{n} \frac{Y_i A_i}{\Pr[A_i = 1 \mid L_i]} - \frac{1}{n}\sum_{i=1}^{n} \frac{Y_i (1 - A_i)}{\Pr[A_i = 0 \mid L_i]}$$
or
$$\frac{1}{n}\sum_i W_i Y_i A_i - \frac{1}{n}\sum_i W_i Y_i (1 - A_i)$$
where
$$W_i = A_i \Pr[A_i = 1 \mid L_i]^{-1} + (1 - A_i)\Pr[A_i = 0 \mid L_i]^{-1}$$
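The IP estimator above can be illustrated with a minimal Python sketch. The eight records and the propensities below are hypothetical toy values chosen for easy arithmetic (they are not NHEFS data), and the names `data`, `pr_a1_given_l`, and `ip_weight` are illustrative assumptions.

```python
# Toy conditionally randomized trial (hypothetical data, not NHEFS).
# Each record is (L, A, Y); Pr[A = 1 | L] is assumed known by design.
data = [
    (0, 1, 5.0), (0, 0, 1.0), (0, 0, 2.0), (0, 0, 3.0),
    (1, 1, 6.0), (1, 1, 8.0), (1, 0, 2.0), (1, 0, 4.0),
]
pr_a1_given_l = {0: 0.25, 1: 0.5}


def ip_weight(l, a):
    """W_i = A_i / Pr[A_i = 1 | L_i] + (1 - A_i) / Pr[A_i = 0 | L_i]."""
    p1 = pr_a1_given_l[l]
    return 1.0 / p1 if a == 1 else 1.0 / (1.0 - p1)


n = len(data)
# (1/n) * sum of W_i * Y_i * A_i and (1/n) * sum of W_i * Y_i * (1 - A_i)
mean_y1 = sum(ip_weight(l, a) * y * a for l, a, y in data) / n
mean_y0 = sum(ip_weight(l, a) * y * (1 - a) for l, a, y in data) / n
ip_difference = mean_y1 - mean_y0
print(mean_y1, mean_y0, ip_difference)
```

In this toy example the treated fraction within each level of L matches the assumed propensity, so the weighted sums reproduce the standardized means.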

6 12.2 Estimating IP weights via modeling
In a conditionally randomized trial, $W_i$ is a known function of $L_i$
In an observational setting, the assignment mechanism $\Pr[A_i = a \mid L_i]$ is unknown and needs to be estimated
$$\frac{1}{n}\sum_i \hat{W}_i Y_i A_i - \frac{1}{n}\sum_i \hat{W}_i Y_i (1 - A_i)$$
If $L$ were low-dimensional (e.g., univariate and binary) could estimate non-parametrically based on sample means
However, here $L$ is a 9 dimensional covariate, with some covariates taking on more than 2 values and age continuous; need to model

7 12.2 Estimating IP weights via modeling
Logistic regression model of $\Pr[A = 1 \mid L]$ on all nine covariates, linear and quadratic terms for age, weight, intensity and duration of smoking, and no product (interaction) terms between the covariates
Based on fitted model compute
$$\widehat{\Pr}[A = 1 \mid L] = \mathrm{logit}^{-1}(\hat\beta L), \qquad \widehat{\Pr}[A = 0 \mid L] = 1 - \mathrm{logit}^{-1}(\hat\beta L)$$
Inverse weight to create pseudo-population and fit linear model (using MLE/LS)
$$E(Y \mid A) = \theta_0 + \theta_1 A$$
IPW estimator of causal effect: $\hat\theta_1 = 3.44$

8 12.2 Estimating IP weights via modeling

    # R code from chapter12.r
    # Propensity model: logistic regression of quitting (qsmk) on the covariates
    fit <- glm(qsmk ~ as.factor(sex) + as.factor(race) + age + I(age^2) +
                 as.factor(education.code) + smokeintensity + I(smokeintensity^2) +
                 smokeyrs + I(smokeyrs^2) + as.factor(exercise) + as.factor(active) +
                 wt71 + I(wt71^2),
               family = binomial(), data = nhefs0)
    # Estimated probability of the treatment level actually received
    p.qsmk.obs <- ifelse(nhefs0$qsmk == 0,
                         1 - predict(fit, type = "response"),
                         predict(fit, type = "response"))
    # Unstabilized IP weights
    nhefs0$w <- 1/p.qsmk.obs
    # Weighted least squares fit of the outcome model
    glm.obj <- glm(wt82_71 ~ qsmk, data = nhefs0, weights = w)
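The modeling step can also be sketched in plain Python on toy data. This is only an illustration under stated assumptions: a single binary covariate, a hand-written Newton-Raphson fit (the helper `fit_logistic` is hypothetical, not a library call), and made-up records rather than NHEFS.

```python
import math

# Hypothetical toy data (not NHEFS): one binary covariate L and treatment A.
L = [0, 0, 0, 0, 1, 1, 1, 1]
A = [1, 0, 0, 0, 1, 1, 0, 0]


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iterations=25):
    """Newton-Raphson for logit Pr[y = 1 | x] = b0 + b1 * x (one covariate)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = expit(b0 + b1 * x)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            v = p * (1.0 - p)
            h00 += v               # entries of the information matrix
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_logistic(L, A)
p_hat = [expit(b0 + b1 * l) for l in L]                      # Pr-hat[A = 1 | L]
p_obs = [p if a == 1 else 1.0 - p for p, a in zip(p_hat, A)]  # prob. of observed A
w = [1.0 / p for p in p_obs]                                  # unstabilized IP weights
print(p_hat[0], p_hat[4], w[:2])
```

Because the toy model is saturated in the binary covariate, the fitted probabilities coincide with the sample proportions of treated within each level of L.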

9 12.3 Stabilized IP weights
Does weighted LS give IP estimator from Section 2.4? Actually, no.
Weighted LS estimator minimizes
$$\sum_i \hat{W}_i \{Y_i - (\theta_0 + \theta_1 A_i)\}^2$$
where $\hat{W}_i = A_i \widehat{\Pr}[A_i = 1 \mid L_i]^{-1} + (1 - A_i)\widehat{\Pr}[A_i = 0 \mid L_i]^{-1}$
This yields (homework)
$$\hat\theta_1 = \frac{\sum_i \hat{W}_i Y_i A_i}{\sum_i \hat{W}_i A_i} - \frac{\sum_i \hat{W}_i Y_i (1 - A_i)}{\sum_i \hat{W}_i (1 - A_i)}$$
which is slightly different from the original IPW estimator
$$\frac{\sum_i \hat{W}_i Y_i A_i}{n} - \frac{\sum_i \hat{W}_i Y_i (1 - A_i)}{n}$$
The former is known as the stabilized IPW estimator; the latter aka unstabilized
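A small numerical check of the homework claim, as a sketch on hypothetical toy data with propensities assumed known: the weighted LS slope, computed from the normal equations, matches the ratio-form estimator and differs from the divide-by-n form.

```python
# Hypothetical toy records (L, A, Y); not NHEFS. Propensities assumed known.
data = [
    (0, 1, 5.0), (0, 0, 1.0), (0, 0, 2.0), (0, 0, 3.0), (0, 0, 2.0),
    (1, 1, 6.0), (1, 1, 8.0), (1, 0, 2.0), (1, 0, 4.0), (1, 0, 3.0),
]
pr_a1_given_l = {0: 0.25, 1: 0.5}
n = len(data)
w = [1.0 / pr_a1_given_l[l] if a == 1 else 1.0 / (1.0 - pr_a1_given_l[l])
     for l, a, _ in data]

# Weighted sums used by all three estimators.
s_w = sum(w)
s_wa = sum(wi * a for wi, (_, a, _) in zip(w, data))
s_wy = sum(wi * y for wi, (_, _, y) in zip(w, data))
s_way = sum(wi * a * y for wi, (_, a, y) in zip(w, data))
s_waa = sum(wi * a * a for wi, (_, a, _) in zip(w, data))

# Weighted least squares slope from the 2x2 normal equations.
wls_slope = (s_w * s_way - s_wa * s_wy) / (s_w * s_waa - s_wa ** 2)

# Ratio form (stabilized) and divide-by-n form (unstabilized).
ratio_form = s_way / s_wa - (s_wy - s_way) / (s_w - s_wa)
divide_by_n_form = s_way / n - (s_wy - s_way) / n
print(wls_slope, ratio_form, divide_by_n_form)
```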

10 12.3 Stabilized IP weights
In survey sampling nomenclature, stabilized IPW estimator is difference in Hajek estimators and unstabilized IPW estimator is difference in Horvitz-Thompson (HT) estimators
Hajek-type estimators tend to be less variable than HT estimators
Intuition: if $\Pr[A_i = a \mid L_i]$ very small, then $\hat{W}_i$ very large
So unstabilized/HT estimators can be highly variable
Stabilized/Hajek estimators replace denominator $n$ w/ an unbiased estimator of $n$
New denominator tends to be large (small) when numerator is large (small)

11 12.3 Stabilized IP weights
HR describe
$$W_i = A_i \frac{1}{\Pr[A_i = 1 \mid L_i]} + (1 - A_i)\frac{1}{\Pr[A_i = 0 \mid L_i]}$$
as unstabilized weight, and
$$W_i = A_i \frac{\Pr[A_i = 1]}{\Pr[A_i = 1 \mid L_i]} + (1 - A_i)\frac{\Pr[A_i = 0]}{\Pr[A_i = 0 \mid L_i]}$$
as stabilized weight
Using either form of $W_i$ in $\hat\theta_1$ is equivalent
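The equivalence can be checked numerically. In the sketch below (hypothetical toy data, propensities assumed known, marginal Pr[A = 1] taken as the sample fraction treated), the stabilized weight is a constant multiple of the unstabilized weight within each treatment arm, so the constant cancels in each ratio.

```python
# Hypothetical toy records (L, A, Y); not NHEFS. Propensities assumed known.
data = [
    (0, 1, 5.0), (0, 0, 1.0), (0, 0, 2.0), (0, 0, 3.0), (0, 0, 2.0),
    (1, 1, 6.0), (1, 1, 8.0), (1, 0, 2.0), (1, 0, 4.0), (1, 0, 3.0),
]
pr_a1_given_l = {0: 0.25, 1: 0.5}
pr_a1 = sum(a for _, a, _ in data) / len(data)  # marginal Pr[A = 1], sample fraction


def unstabilized(l, a):
    p1 = pr_a1_given_l[l]
    return 1.0 / p1 if a == 1 else 1.0 / (1.0 - p1)


def stabilized(l, a):
    marginal = pr_a1 if a == 1 else 1.0 - pr_a1
    return marginal * unstabilized(l, a)


def theta1_hat(weight):
    """Difference of weighted means: the ratio-form estimator of theta_1."""
    num1 = sum(weight(l, a) * y for l, a, y in data if a == 1)
    den1 = sum(weight(l, a) for l, a, _ in data if a == 1)
    num0 = sum(weight(l, a) * y for l, a, y in data if a == 0)
    den0 = sum(weight(l, a) for l, a, _ in data if a == 0)
    return num1 / den1 - num0 / den0


theta_unstabilized = theta1_hat(unstabilized)
theta_stabilized = theta1_hat(stabilized)
print(theta_unstabilized, theta_stabilized)
```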

12 12.3 Stabilized IP weights
What are large sample properties of stabilized IPW estimator $\hat\theta_1$?
Let $\mu_a = E(Y^a)$ for $a = 0, 1$; $\mu = (\mu_1, \mu_0)$
Assume for now weights $W_i$ known function of $L_i$
Consider vector estimating equation
$$\sum_i \psi(Y_i, A_i, L_i, \mu) = \sum_i \begin{pmatrix} W_i A_i (Y_i - \mu_1) \\ W_i (1 - A_i)(Y_i - \mu_0) \end{pmatrix} = 0$$
Solution
$$\hat\mu = \begin{pmatrix} \hat\mu_1 \\ \hat\mu_0 \end{pmatrix} = \begin{pmatrix} \sum_i W_i Y_i A_i / \sum_i W_i A_i \\ \sum_i W_i Y_i (1 - A_i) / \sum_i W_i (1 - A_i) \end{pmatrix}$$
Note $\hat\mu_1 - \hat\mu_0 = \hat\theta_1$

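13 M-Estimators
Under suitable regularity conditions (Stefanski and Boos TAS 2002)
$$\sqrt{n}(\hat\mu - \mu) \to_d N(0, V(\mu)) \quad \text{as } n \to \infty$$
where
$$V(\mu) = A(\mu)^{-1} B(\mu) \{A(\mu)^{-1}\}^T$$
$$A(\mu) = E[-\dot\psi(Y_i, A_i, L_i, \mu)]$$
$$B(\mu) = E[\psi(Y_i, A_i, L_i, \mu)\,\psi(Y_i, A_i, L_i, \mu)^T]$$
$$\dot\psi(Y_i, A_i, L_i, \mu) = \partial\psi(Y_i, A_i, L_i, \mu)/\partial\mu$$
Empirical sandwich variance estimator consistent for $V(\mu)$
$$\hat{V} = \hat{A}^{-1} \hat{B} [\hat{A}^{-1}]^T$$
where
$$\hat{A} = \frac{1}{n}\sum_i -\dot\psi(Y_i, A_i, L_i, \hat\mu), \qquad \hat{B} = \frac{1}{n}\sum_i \psi(Y_i, A_i, L_i, \hat\mu)\,\psi(Y_i, A_i, L_i, \hat\mu)^T$$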

14 Sandwich Estimator
For problem at hand,
$$-\dot\psi(Y_i, A_i, L_i, \mu) = \begin{pmatrix} W_i A_i & 0 \\ 0 & W_i (1 - A_i) \end{pmatrix}$$
From here calculation of $\hat{A}$ and $\hat{B}$, and therefore $\hat{V}$, straightforward
By delta method
$$\hat\mu_1 - \hat\mu_0 \;\dot\sim\; N(\mu_1 - \mu_0,\; n^{-1} g^T \hat{V} g) \quad \text{where } g = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
For NHEFS data, computing $n^{-1} g^T \hat{V} g$ (homework?) gives the estimated std err and the Wald 95% CI (2.4, 4.5)
In practice, need not compute by hand; can use standard software to compute empirical sandwich variance estimate
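The sandwich calculation can be written out by hand for a toy example. This sketch treats the weights as known, uses hypothetical data (not NHEFS), and exploits the fact that both the A-hat and B-hat matrices are diagonal here because $A_i(1 - A_i) = 0$; all variable names are illustrative.

```python
import math

# Hypothetical toy records (L, A, Y); not NHEFS. Weights treated as known.
data = [
    (0, 1, 5.0), (0, 0, 1.0), (0, 0, 2.0), (0, 0, 3.0), (0, 0, 2.0),
    (1, 1, 6.0), (1, 1, 8.0), (1, 0, 2.0), (1, 0, 4.0), (1, 0, 3.0),
]
pr_a1_given_l = {0: 0.25, 1: 0.5}
n = len(data)
w = [1.0 / pr_a1_given_l[l] if a == 1 else 1.0 / (1.0 - pr_a1_given_l[l])
     for l, a, _ in data]

# Solve the estimating equations for mu_1 and mu_0.
sum_w1 = sum(wi * a for wi, (_, a, _) in zip(w, data))
sum_w0 = sum(wi * (1 - a) for wi, (_, a, _) in zip(w, data))
mu1 = sum(wi * a * y for wi, (_, a, y) in zip(w, data)) / sum_w1
mu0 = sum(wi * (1 - a) * y for wi, (_, a, y) in zip(w, data)) / sum_w0

# psi_i evaluated at the solution.
psi = [(wi * a * (y - mu1), wi * (1 - a) * (y - mu0))
       for wi, (_, a, y) in zip(w, data)]

# A-hat = (1/n) sum of -dpsi/dmu (diagonal); B-hat = (1/n) sum of psi psi^T.
a11, a22 = sum_w1 / n, sum_w0 / n
b11 = sum(p1 * p1 for p1, _ in psi) / n
b22 = sum(p0 * p0 for _, p0 in psi) / n
b12 = sum(p1 * p0 for p1, p0 in psi) / n  # zero because A_i (1 - A_i) = 0

# V-hat = A^{-1} B A^{-T}, written out for diagonal A-hat.
v11, v22, v12 = b11 / a11 ** 2, b22 / a22 ** 2, b12 / (a11 * a22)

# Delta method with g = (1, -1)^T.
var_diff = (v11 - 2.0 * v12 + v22) / n
se_diff = math.sqrt(var_diff)
estimate = mu1 - mu0
wald_ci = (estimate - 1.96 * se_diff, estimate + 1.96 * se_diff)
print(estimate, se_diff, wald_ci)
```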

15 Sandwich Estimator in SAS

    /* from chapter12.sas */
    proc genmod data= nhefs_w;
      class seqn;
      weight w;
      model wt82_71= qsmk;
      repeated subject=seqn / type=ind;
    run;

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

Parameter   Estimate   Standard Error   95% Confidence Limits   Z   Pr > |Z|
Intercept                                                           <.0001
qsmk                                                                <.0001

16 12.3 Stabilized IPW Estimators
But wait! All of that assumed weights $W_i$ known function of $L_i$
What about the fact that in an observational study we don't know the assignment mechanism and therefore have to estimate the weights?
E.g., suppose we use logistic regression
$$\mathrm{logit}(\Pr[A = 1 \mid L]) = \beta L$$
Then we have vector estimating equation
$$\sum_i \psi(Y_i, A_i, L_i, \mu, \beta) = \sum_i \begin{pmatrix} \psi_\beta(A_i, L_i, \beta) \\ W_i A_i (Y_i - \mu_1) \\ W_i (1 - A_i)(Y_i - \mu_0) \end{pmatrix} = 0$$
where $\psi_\beta()$ is vector of score equations from log likelihood corresponding to logistic regression model

17 12.3 Stabilized IPW Estimators
Can show that when the weights are known
$$\sqrt{n}\{(\hat\mu_1 - \hat\mu_0) - (\mu_1 - \mu_0)\} \to_d N(0, \Sigma^*)$$
where
$$\Sigma^* = E\left\{ \frac{(Y^1 - \mu_1)^2}{\Pr[A = 1 \mid L]} + \frac{(Y^0 - \mu_0)^2}{\Pr[A = 0 \mid L]} \right\}$$
Whereas when the weights are estimated (e.g., based on logistic regression), asy var equals $\Sigma^* - c$ where $c \ge 0$ (homework)
Interesting result: Even if $W_i$ known, it is better to estimate!
Unfortunately a consistent estimator of asy var $\Sigma^* - c$ cannot be obtained using standard software. However, if we do use standard software as in previous slides, the above result indicates we are being conservative

18 12.3 Stabilized IPW Estimators
Sketch of proof of first claim on previous slide
First note
$$A(\mu) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad B(\mu) = E\begin{pmatrix} W^2 A (Y^1 - \mu_1)^2 & 0 \\ 0 & W^2 (1 - A)(Y^0 - \mu_0)^2 \end{pmatrix}$$
Therefore
$$\Sigma^* = g^T V(\mu) g = g^T B(\mu) g = E[W^2 A (Y^1 - \mu_1)^2 + W^2 (1 - A)(Y^0 - \mu_0)^2]$$
Finally note
$$E[W^2 A (Y^1 - \mu_1)^2] = E\left[ \frac{A (Y^1 - \mu_1)^2}{\Pr[A = 1 \mid L]^2} \right] = E\left[ \frac{(Y^1 - \mu_1)^2}{\Pr[A = 1 \mid L]} \right]$$
and similarly
$$E[W^2 (1 - A)(Y^0 - \mu_0)^2] = E\left[ \frac{(Y^0 - \mu_0)^2}{\Pr[A = 0 \mid L]} \right]$$

19 12.4 Marginal structural models
Consider the following model
$$E[Y^a] = \beta_0 + \beta_1 a$$
Note outcome variable is a potential outcome (counterfactual)
Models for mean counterfactual outcomes are referred to as structural mean models.
Marginal structural model because modeling the marginal distn of the counterfactual rather than joint distn of $Y^0$ and $Y^1$ (Hernán, Robins, Brumback Epid 2000); or b/c structural mean model does not include any covariates (HR 12.4)
Estimator $\hat\theta_1 = \hat\mu_1 - \hat\mu_0$ from previous section is consistent and asymptotically normal (CAN) for parameter $\beta_1 = E(Y^1) - E(Y^0)$ (causal mean difference)

20 12.4 Marginal structural models
Suppose $A$ takes on many values, e.g., number of cigarettes per day in 1982 (year of follow-up visit) minus number of cigarettes per day at baseline
Each individual has many potential outcomes, e.g., $Y^{a=-25}$ if an individual decreased cigs/day by 25
Consider the MSM
$$E[Y^a] = \beta_0 + \beta_1 a + \beta_2 a^2$$
Suppose interested in effect of increasing smoking by 20 cigs/day compared to no change
$$E[Y^{a=20}] - E[Y^{a=0}] = 400\beta_2 + 20\beta_1$$
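The contrast follows from plugging $a = 20$ and $a = 0$ into the quadratic MSM. A minimal sketch, using made-up coefficient values purely to check the arithmetic (they are not estimates from NHEFS):

```python
def msm_mean(a, beta):
    """E[Y^a] = beta0 + beta1 * a + beta2 * a^2 under the quadratic MSM."""
    beta0, beta1, beta2 = beta
    return beta0 + beta1 * a + beta2 * a ** 2


# Hypothetical coefficients, chosen only for illustration.
beta = (2.0, 0.1, 0.01)

effect = msm_mean(20, beta) - msm_mean(0, beta)
closed_form = 400 * beta[2] + 20 * beta[1]
print(effect, closed_form)
```

The intercept cancels in the difference, which is why only $\beta_1$ and $\beta_2$ appear in the contrast.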

21 12.4 Marginal structural models
As before, can consistently estimate parameters of MSM $(\beta_0, \beta_1, \beta_2)$ using IP weighting
For continuous $A$, stabilized weights are no longer $\Pr[A = a]/\Pr[A = a \mid L]$, but rather of the form $f(A)/f(A \mid L)$ where $f$ denotes the (conditional) density of $A$ (given $L$)
E.g., we might assume the usual linear model $A = \alpha L + \varepsilon$ where $\varepsilon \sim N(0, \sigma^2)$ in order to estimate $f(A \mid L)$; similarly for $f(A)$
Based on estimated stabilized weights, inverse weight to create pseudo-population and fit model
$$E(Y \mid A = a) = \theta_0 + \theta_1 a + \theta_2 a^2$$
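A rough sketch of the density-ratio weights for a continuous treatment, in plain Python. The data are simulated under an assumed normal linear model with one covariate (the coefficient 0.5, the sample size, and the seed are arbitrary choices, not taken from NHEFS), and the residual standard deviations use the usual root-MSE degrees of freedom.

```python
import math
import random

random.seed(0)

# Simulated data under an assumed model A = 0.5 * L + eps, eps ~ N(0, 1).
n = 2000
L = [random.gauss(0.0, 1.0) for _ in range(n)]
A = [0.5 * l + random.gauss(0.0, 1.0) for l in L]


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


# Denominator model f(A | L): ordinary least squares of A on L.
mean_l = sum(L) / n
mean_a = sum(A) / n
slope = (sum((l - mean_l) * (a - mean_a) for l, a in zip(L, A))
         / sum((l - mean_l) ** 2 for l in L))
intercept = mean_a - slope * mean_l
pred = [intercept + slope * l for l in L]
sd_den = math.sqrt(sum((a - p) ** 2 for a, p in zip(A, pred)) / (n - 2))

# Numerator model f(A): intercept-only regression.
sd_num = math.sqrt(sum((a - mean_a) ** 2 for a in A) / (n - 1))

# Stabilized weights: estimated f(A) / f(A | L).
sw = [normal_pdf(a, mean_a, sd_num) / normal_pdf(a, p, sd_den)
      for a, p in zip(A, pred)]
mean_sw = sum(sw) / n
print(mean_sw, min(sw), max(sw))
```

Inspecting the mean and the extremes of `sw` is a sensible first diagnostic before fitting the weighted outcome model.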

22 SAS Program 12.4

    /* estimation of denominator of ip weights */
    proc glm data= nhefs_nmv_s
             outstat= ss_den(keep= _source_ _type_ df ss
                             where=(_source_ in("ERROR") and _type_ in("ERROR")));
      class exercise active education;
      model smkintensity82_71 = sex race age age*age education
            smokeintensity smokeintensity*smokeintensity
            smokeyrs smokeyrs*smokeyrs exercise active
            wt71 wt71*wt71 / solution;
      output out= temp_den p= pred;

    data sd_den;
      set ss_den;
      rootmse_n= sqrt(ss/df);
      match= 1;
      keep rootmse_n match;

    data est_dens_d;
      merge temp_den sd_den;
      by match;
      dens_den = pdf("NORMAL", smkintensity82_71, pred, rootmse_n);

    proc sort; by seqn;

    /* estimation of numerator of ip weights */
    proc glm data= nhefs_nmv_s
             outstat= ss_num(keep= _source_ _type_ df ss
                             where=(_source_ in("ERROR") and _type_ in("ERROR")));
      model smkintensity82_71 = / solution;
      output out= temp_num p= pred;

23 SAS Program 12.4

    data sd_num;
      set ss_num;
      rootmse_n= sqrt(ss/df);
      match= 1;
      keep rootmse_n match;

    data est_dens_n;
      merge temp_num sd_num;
      by match;
      dens_num = pdf("NORMAL", smkintensity82_71, pred, rootmse_n);

    proc sort; by seqn;

    /* stabilized weight: ratio of numerator to denominator density */
    data nhefs_sw_cont;
      merge est_dens_d est_dens_n;
      by seqn;
      sw_a= dens_num / dens_den;

    proc univariate data=nhefs_sw_cont;
      var sw_a;
      id seqn;

    proc genmod data= nhefs_sw_cont;
      class seqn;
      weight sw_a;
      model wt82_71= smkintensity82_71 smkintensity82_71*smkintensity82_71;
      estimate "No change" intercept 1 smkintensity82_71 0;
      estimate "Increase smoking by 20 cig/day" intercept 1 smkintensity82_71 20;
      estimate "Effect of increase smoking by 20 cig/day" intercept 0 smkintensity82_71 20;
      repeated subject=seqn / type=ind;

24 Weight Check
Note proc univariate included to check weights have mean near 1
If the models are correctly specified, then mean should be near 1. Why?
Consider $A$ binary and stabilized weights
$$E(W) = E\left( A \frac{\Pr[A = 1]}{\Pr[A = 1 \mid L]} + (1 - A)\frac{\Pr[A = 0]}{\Pr[A = 0 \mid L]} \right)$$
$$= \Pr[A = 1]\, E\left( \frac{A}{\Pr[A = 1 \mid L]} \right) + \Pr[A = 0]\, E\left( \frac{1 - A}{\Pr[A = 0 \mid L]} \right)$$
$$= \Pr[A = 1]\, E_L\left( E_{A \mid L}\left( \frac{A}{\Pr[A = 1 \mid L]} \right) \right) + \Pr[A = 0]\, E_L\left( E_{A \mid L}\left( \frac{1 - A}{\Pr[A = 0 \mid L]} \right) \right) = 1$$
Deviations from 1 indicate model misspecification or possible violations, or near violations, of positivity
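The same check can be sketched in Python for binary A. In this toy case (hypothetical data, one binary covariate, all probabilities estimated nonparametrically by sample proportions) the sample mean of the stabilized weights is exactly 1; with parametric models one would expect it to be only approximately 1.

```python
from collections import Counter

# Hypothetical toy records (L, A); not NHEFS.
data = [
    (0, 1), (0, 0), (0, 0), (0, 0), (0, 0),
    (1, 1), (1, 1), (1, 0), (1, 0), (1, 0),
]
n = len(data)
n_a = Counter(a for _, a in data)    # counts by treatment level
n_l = Counter(l for l, _ in data)    # counts by covariate level
n_la = Counter(data)                 # counts by (L, A) cell

# Stabilized weight: Pr-hat[A = a] / Pr-hat[A = a | L = l] for the observed (l, a).
sw = [(n_a[a] / n) / (n_la[(l, a)] / n_l[l]) for l, a in data]
mean_sw = sum(sw) / n
print(mean_sw)
```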

25 12.4 Marginal structural models
What if outcome $Y$ is dichotomous?
E.g., suppose $A = 1$ quit smoking, $A = 0$ o/w and $Y = 1$ dead by 1982, $Y = 0$ alive
Marginal structural logistic model
$$\mathrm{logit}\,\Pr[Y^a = 1] = \alpha_0 + \alpha_1 a$$
where $\exp(\alpha_1)$ is causal odds ratio of death for quitting versus not quitting smoking
Parameters $\alpha_0$ and $\alpha_1$ can be consistently estimated by fitting logistic model
$$\mathrm{logit}\,\Pr[Y = 1 \mid A] = \theta_0 + \theta_1 A$$
to pseudo-population created by IP weighting
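For binary A the logistic model is saturated, so the weighted fit reproduces the weighted proportions in each arm and $\exp(\hat\theta_1)$ is their odds ratio. A sketch on hypothetical toy data with propensities assumed known (not NHEFS):

```python
# Hypothetical toy records (L, A, Y) with binary outcome Y; not NHEFS.
data = [
    (0, 1, 1), (0, 1, 0),
    (0, 0, 1), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0),
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 1, 0),
    (1, 0, 1), (1, 0, 0), (1, 0, 0), (1, 0, 0),
]
pr_a1_given_l = {0: 0.25, 1: 0.5}  # assumed known propensities


def ip_weight(l, a):
    p1 = pr_a1_given_l[l]
    return 1.0 / p1 if a == 1 else 1.0 / (1.0 - p1)


def weighted_risk(arm):
    """Weighted proportion with Y = 1 among individuals with A = arm."""
    num = sum(ip_weight(l, a) * y for l, a, y in data if a == arm)
    den = sum(ip_weight(l, a) for l, a, _ in data if a == arm)
    return num / den


risk1 = weighted_risk(1)
risk0 = weighted_risk(0)
# Odds ratio in the pseudo-population, i.e. exp(theta_1) for the saturated model.
odds_ratio = (risk1 / (1.0 - risk1)) / (risk0 / (1.0 - risk0))
print(risk1, risk0, odds_ratio)
```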

26 12.5 Effect modification and MSM
Covariates can be included in MSM to assess effect modification
E.g., suppose $V$ encodes sex (1 male, 0 female) and consider MSM
$$E[Y^a \mid V] = \beta_0 + \beta_1 a + \beta_2 V a + \beta_3 V$$
Additive effect modification if $\beta_2 \ne 0$
Consistently estimate parameters of MSM by fitting via weighted LS
$$E[Y \mid A, V] = \gamma_0 + \gamma_1 A + \gamma_2 A V + \gamma_3 V$$
Weights based on covariates $L$ that include $V$ and any other variables sufficient to ensure exchangeability within level of $V$

27 12.6 Censoring and missing data
When estimating effect of smoking cessation $A$ on weight gain $Y$, we restricted analysis to $n = 1566$ individuals with a body weight measurement at end of follow-up in 1982
There were, however, 63 additional individuals excluded from the analysis because their weight in 1982 was not known
Selecting only individuals with non-missing outcome values may introduce selection bias (Chapter 8)
Let $C = 1$ if body weight missing (censored), $C = 0$ otherwise
Unstabilized weights $W$ now equal
$$A(1 - C)\frac{1}{\Pr[C = 0 \mid A, L]\Pr[A = 1 \mid L]} + (1 - A)(1 - C)\frac{1}{\Pr[C = 0 \mid A, L]\Pr[A = 0 \mid L]}$$
and stabilized weights are adjusted analogously
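A sketch of the combined treatment and censoring weights on hypothetical toy data (not NHEFS), with both probabilities estimated nonparametrically from cell counts. Censored individuals get weight 0, and each uncensored individual is weighted up to stand in for censored individuals with the same A and L.

```python
from collections import Counter

# Hypothetical toy records (L, A, C, Y); Y is None when censored (C = 1).
data = [
    (0, 1, 0, 5.0), (0, 1, 1, None),
    (0, 0, 0, 1.0), (0, 0, 0, 2.0), (0, 0, 1, None), (0, 0, 0, 3.0),
    (1, 1, 0, 6.0), (1, 1, 0, 8.0), (1, 1, 1, None),
    (1, 0, 0, 2.0), (1, 0, 0, 4.0), (1, 0, 0, 3.0),
]
n = len(data)
n_l = Counter(l for l, _, _, _ in data)
n_la = Counter((l, a) for l, a, _, _ in data)
n_la_uncensored = Counter((l, a) for l, a, c, _ in data if c == 0)


def weight(l, a, c):
    """W = (1 - C) / (Pr[C = 0 | A, L] * Pr[A = a | L]), from sample proportions."""
    if c == 1:
        return 0.0
    pr_a = n_la[(l, a)] / n_l[l]
    pr_uncensored = n_la_uncensored[(l, a)] / n_la[(l, a)]
    return 1.0 / (pr_uncensored * pr_a)


def weighted_mean(arm):
    """Ratio-form weighted mean of Y among uncensored individuals with A = arm."""
    rows = [(weight(l, a, c), y) for l, a, c, y in data if c == 0 and a == arm]
    return sum(wi * y for wi, y in rows) / sum(wi for wi, _ in rows)


sum_w_treated = sum(weight(l, a, c) for l, a, c, _ in data if a == 1)
sum_w_untreated = sum(weight(l, a, c) for l, a, c, _ in data if a == 0)
mu1_hat, mu0_hat = weighted_mean(1), weighted_mean(0)
print(sum_w_treated, sum_w_untreated, mu1_hat - mu0_hat)
```

In this toy setup each arm's weights sum to the full sample size of 12, so each pseudo-population arm represents everyone, including those who were censored.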
