I. Partial identification of causal parameters with multiple proposed instruments



Supplemental Materials

Table of contents of Supplemental Materials:
I. Partial identification of causal parameters with multiple proposed instruments
II. Data example
III. R code

I. Partial identification of causal parameters with multiple proposed instruments

We first consider bounding the global average causal effect of a dichotomous treatment $A$ on a dichotomous outcome $Y$: $E[Y^{a=1}] - E[Y^{a=0}]$, where superscripts denote counterfactuals. We consider $Z_1, \ldots, Z_m$ pre-treatment dichotomous variables as possible instruments. To simplify notation, let $LB_k$ and $UB_k$ denote the lower and upper Balke-Pearl bounds computed using $Z_k$, respectively.

Below we present derivations for bounds for the average causal effect when all proposed instruments are assumed to be valid (Theorem 1) and when at least one proposed instrument is assumed to be valid (Theorem 2). While these marginally-computed summary bounds are valid, they are not sharp for the data generating procedure depicted in the causal diagram in Figure 2. Generally, narrower bounds will be achievable when the proposed instruments jointly satisfy the instrumental conditions: see Richardson and Robins 2014 for expressions for sharp bounds. The difference is in the assumptions being leveraged. The sharp bounds rely on the vector $(Z_1, \ldots, Z_m)$ being independent of the counterfactual outcomes, while the marginally-computed bounds rely on each of $Z_1, Z_2, \ldots, Z_m$ being marginally independent of the counterfactual outcomes.

The logic of the proofs for Theorems 1-2 can readily apply to bounds for any causal parameter that can be bounded under a set of assumptions about a proposed instrument whenever the definition of that causal parameter is not dependent on the proposed instrument.
For example, extensions immediately follow for: partial identification of a counterfactual risk (e.g., $E[Y^{a=1}]$) and the causal risk ratio for dichotomous outcomes; partial identification of a mean counterfactual outcome and causal mean difference for continuous outcomes; and partial identification strategies combining the instrumental conditions with various sets of additional assumptions (e.g., see Swanson et al for further examples). Analogs of these proofs are not applicable to causal parameters for which the definitions are instrument-dependent (e.g., local average treatment effects), as each marginally-computed bound or point estimate is then targeting a different causal parameter.

In principle, bounds can be identified for many permutations of assumed (marginally or jointly defined) instruments. For example, suppose we assume that the true data generating process for our dataset follows one of the four causal diagrams in eFigure 1, but we do not know which one. In that case, we can compute bounds for each scenario assuming that the appropriate subset of proposed instruments jointly satisfies the instrumental conditions, and then take the union as the bounds under our collective assumptions. For a given dataset and research question, investigators could consider all relevant sets of assumptions (perhaps as encoded in a set of causal diagrams) and use the logic presented here to guide analyses.

Swanson Supplementary Materials p.1

Theorem 1. Suppose $Z_1, \ldots, Z_m$ each satisfy the instrumental conditions marginally, and the intersection of their respective Balke-Pearl bounds is non-empty. Then $E[Y^{a=1}] - E[Y^{a=0}] \in [\max_k(LB_k), \min_k(UB_k)]$.

Proof. For each $Z_k$, the average causal effect is contained within $[LB_k, UB_k]$ under our assumption that $Z_k$ satisfies the instrumental conditions marginally (Balke & Pearl 1997). Therefore, the average causal effect is contained within all $m$ sets:

$E[Y^{a=1}] - E[Y^{a=0}] \in [LB_1, UB_1]$
$E[Y^{a=1}] - E[Y^{a=0}] \in [LB_2, UB_2]$
$\vdots$
$E[Y^{a=1}] - E[Y^{a=0}] \in [LB_m, UB_m]$

Taking the intersection of these sets, $E[Y^{a=1}] - E[Y^{a=0}]$ is contained within $[\max_k(LB_k), \min_k(UB_k)]$.

Note: If the intersection of $[LB_1, UB_1], [LB_2, UB_2], \ldots,$ and $[LB_m, UB_m]$ is an empty set, then we have evidence that at least one of the proposed instruments $Z_1, \ldots, Z_m$ does not satisfy the instrumental conditions. This could be viewed as an alternative to the standard overidentification tests that rely on homogeneity conditions. However, this test would only detect very extreme violations of the instrumental conditions.

Theorem 2. If at least one of $Z_1, \ldots, Z_m$ satisfies the instrumental conditions, then $E[Y^{a=1}] - E[Y^{a=0}] \in [\min_k(LB_k), \max_k(UB_k)]$.

Proof. For an arbitrary $Z_k$, if $Z_k$ satisfies the instrumental conditions, then the average causal effect lies within $[LB_k, UB_k]$ (Balke & Pearl 1997). Therefore, under our assumption that at least one of $Z_1, \ldots, Z_m$ satisfies the instrumental conditions, the average causal effect is contained within at least one of these $m$ Balke-Pearl bounds.
That is, at least one of the following expressions is true:

$E[Y^{a=1}] - E[Y^{a=0}] \in [LB_1, UB_1]$
$E[Y^{a=1}] - E[Y^{a=0}] \in [LB_2, UB_2]$
$\vdots$
$E[Y^{a=1}] - E[Y^{a=0}] \in [LB_m, UB_m]$

If the average causal effect must be in at least one of these sets, then it must be in the union of the sets. Because every one of these intervals lies within $[\min_k(LB_k), \max_k(UB_k)]$, so does their union. Thus, $E[Y^{a=1}] - E[Y^{a=0}]$ is contained within $[\min_k(LB_k), \max_k(UB_k)]$.

II. Data example

In the main text, bounds and point estimates from a toy dataset were presented for which the various methods relying on homogeneity conditions were biased, while assuming all proposed instruments jointly satisfied the instrumental conditions (without homogeneity) led to point identification. The data example was generated as follows.

$Z_1 \sim B(0.4)$
$Z_2 \sim B(0.6)$
$Z_3 \sim B(0.6)$
$Z_4 \sim B(0.4)$
$U \sim \text{Multinom}(0.1, 0.2, 0.3, 0.4)$
$A = Z_1 U_1 + Z_2 U_2 + Z_3 U_3 + Z_4 U_4$
$Y^{a=0} \sim B(0.2 U_1 + 0.1 U_2 + 0.3 U_3 + 0.1 U_4)$
$Y^{a=1} \sim B(0.1 U_1 + 0.2 U_2 + 0.2 U_3 + 0.9 U_4)$

Note that the four instruments are perfect predictors of treatment in four mutually-exclusive subpopulations with different effect sizes. As such, the homogeneity conditions are not satisfied for any proposed instrument. However, the bounds computed assuming $Z_1$, $Z_2$, $Z_3$, and $Z_4$ jointly satisfy the instrumental conditions collapse to a single value because it suffices to compare subjects with $Z_1 = Z_2 = Z_3 = Z_4 = 1$ (who all will have $A=1$) to subjects with $Z_1 = Z_2 = Z_3 = Z_4 = 0$ (who all will have $A=0$).

Obviously, it is unlikely that human biology functions as simply as in the data example presented here. (It is also unlikely that it functions as homogeneously as presumed by point-identification methods.) That said, the simplicity of the data generating procedure was not important for deriving valid bounds. It is, however, the reason the bounds in this particular dataset are so informative.

Of note, the data example was used to illustrate different point and partial identification results, and for didactic purposes we did not consider the role of sampling variability. In practice, each point estimate or set of bounds should be accompanied by confidence intervals. See Imbens and Manski 2004 for some discussion on confidence intervals for partially identified parameters.

III. R code

## Generating data
## sample size n, number of IVs k
## n very large to minimize role of sampling variability in illustration
n <- 1e6  ## ASSUMED placeholder: the original value of n was lost in extraction
k <- 4
set.seed(66)

## generating data with four IVs (Z), treatment (A),
## [counterfactual and observed] outcome (Y), and unmeasured confounder (U)
z <- matrix(nrow=n,ncol=k)
z[1:n,1] <- rbinom(n=n,size=1,prob=0.6)
z[1:n,2] <- rbinom(n=n,size=1,prob=0.4)
z[1:n,3] <- rbinom(n=n,size=1,prob=0.4)
z[1:n,4] <- rbinom(n=n,size=1,prob=0.6)
u <- matrix(nrow=n,ncol=4)
u <- t(rmultinom(n=n,size=1,prob=c(0.1,0.2,0.3,0.4)))

a <- matrix(nrow=n,ncol=1)
y <- matrix(nrow=n,ncol=3)
for(i in 1:n) {
  ## treatment equals the instrument matching the subject's U stratum
  a[i,1] <- min(z[i,1]*u[i,1] + z[i,2]*u[i,2] + z[i,3]*u[i,3] + z[i,4]*u[i,4],1)
  ## counterfactual outcomes Y(a=0), Y(a=1), then the observed outcome
  y[i,1] <- rbinom(n=1,size=1,prob= 0.20*u[i,1]+ 0.10*u[i,2] + 0.3*u[i,3]+ 0.1*u[i,4])
  y[i,2] <- rbinom(n=1,size=1,prob= 0.10*u[i,1]+ 0.20*u[i,2] + 0.2*u[i,3]+ 0.9*u[i,4])
  y[i,3] <- ifelse(a[i,1]==1,y[i,2],y[i,1])
}
dat <- cbind(z,u,a,y)
dat <- as.data.frame(dat)
colnames(dat) <- c(paste('z',1:k,sep=''),paste('u',1:ncol(u),sep=''),'a','y0','y1','y')

## Computing bounds and point estimates
## summary dataset for storing single IV point estimates and bounds
summarydat <- matrix(nrow=k,ncol=5)
colnames(summarydat) <- c('lb','ub','ivnum','ivden','ivest')

## computing bounds and IV ratio for each Zk
## Pr[Y=y,A=a|Z=z]=pya.z
for(i in 1:k) {
  dat$set <- dat[,i]
  p00.0 <- sum(with(dat,set==0 & a==0 & y==0))/sum(with(dat,set==0))
  p01.0 <- sum(with(dat,set==0 & a==1 & y==0))/sum(with(dat,set==0))
  p10.0 <- sum(with(dat,set==0 & a==0 & y==1))/sum(with(dat,set==0))
  p11.0 <- sum(with(dat,set==0 & a==1 & y==1))/sum(with(dat,set==0))
  p00.1 <- sum(with(dat,set==1 & a==0 & y==0))/sum(with(dat,set==1))
  p01.1 <- sum(with(dat,set==1 & a==1 & y==0))/sum(with(dat,set==1))
  p10.1 <- sum(with(dat,set==1 & a==0 & y==1))/sum(with(dat,set==1))
  p11.1 <- sum(with(dat,set==1 & a==1 & y==1))/sum(with(dat,set==1))
  ## Balke-Pearl lower bound
  summarydat[i,1] <- max(
    p11.1 + p00.0-1,
    p11.0 + p00.1-1,
    p11.0 - p11.1 - p10.1 - p01.0 - p10.0,
    p11.1 - p11.0 - p10.0 - p01.1 - p10.1,
    -p01.1 - p10.1,
    -p01.0 - p10.0,
    p00.1 - p01.1 - p10.1 - p01.0 - p00.0,
    p00.0 - p01.0 - p10.0 - p01.1 - p00.1
  )
  ## Balke-Pearl upper bound
  summarydat[i,2] <- min(
    1 - p01.1 - p10.0,
    1 - p01.0 - p10.1,
    -p01.0 + p01.1 + p00.1 + p11.0 + p00.0,
    -p01.1 + p11.1 + p01.0 + p00.0 + p00.1,
    p11.1 + p00.1,
    p11.0 + p00.0,
    -p10.1 + p11.1 + p00.1 + p11.0 + p10.0,
    -p10.0 + p11.0 + p00.0 + p11.1 + p10.1
  )
  ## numerator, denominator, and ratio of the standard IV estimator
  py1.1 <- sum(with(dat,set==1 & y==1))/sum(with(dat,set==1))
  py1.0 <- sum(with(dat,set==0 & y==1))/sum(with(dat,set==0))
  pa1.1 <- sum(with(dat,set==1 & a==1))/sum(with(dat,set==1))
  pa1.0 <- sum(with(dat,set==0 & a==1))/sum(with(dat,set==0))
  summarydat[i,3] <- (py1.1 - py1.0)
  summarydat[i,4] <- (pa1.1 - pa1.0)
  summarydat[i,5] <- (py1.1 - py1.0)/(pa1.1 - pa1.0)
}

## computing joint IV bounds
## for our contrived data example, it is easy to 'see' the relevant strata
## when Z1=Z2=Z3=Z4=1 and when Z1=Z2=Z3=Z4=0
## in general, see Richardson and Robins 2014 for the full expression
## note: this code is specific to this particular dataset!
dat$nzs <- rowSums(dat[,1:k])

py1.zz1 <- sum(with(dat,y==1 & nzs==k))/sum(with(dat,nzs==k))
py1.zz0 <- sum(with(dat,y==1 & nzs==0))/sum(with(dat,nzs==0))
jointp <- py1.zz1 - py1.zz0

## Table 1
for(i in 1:k) {
  dat$set <- dat[,i]
  print(i)
  print(round(sum(with(dat,set==0 & y==1))/sum(with(dat,set==0)),2))
  print(round(sum(with(dat,set==1 & y==1))/sum(with(dat,set==1)),2))
  print(round(sum(with(dat,set==0 & a==1))/sum(with(dat,set==0)),2))
  print(round(sum(with(dat,set==1 & a==1))/sum(with(dat,set==1)),2))
}

## Figure 1
plot(0,0,type='n',xlim=c(-1,1),ylim=c(16,0),ylab='',xlab='Causal risk difference',yaxt='n')
abline(v=mean(dat$y1) - mean(dat$y0),lty=3)
abline(v=0,lty=1)

## IV ratios
text(-1.1,0,bquote(underline('IV ratios')),pos=4)
for(i in 1:k) {
  points(summarydat[i,5],i,col='gray',pch=19,lwd=2)
}
text(-1.1,1,expression('Z'[1]),pos=4)
text(-1.1,2,expression('Z'[2]),pos=4)
text(-1.1,3,expression('Z'[3]),pos=4)
text(-1.1,4,expression('Z'[4]),pos=4)

## robust methods (simple examples)
## note: see Burgess et al. for further considerations re: use and estimation
## only simplest versions are shown here to demonstrate potential bias due to heterogeneity
## see also Burgess et al. for sensitivity analyses that may inform the use of these methods
text(-1.1,5,bquote(underline('examples of robust methods')),pos=4)

## simple median
points(median(summarydat[1:k,5]),6,col='gray',pch=19,lwd=2)
text(-1.1,6,'median-based',pos=4)

## simple penalization-based method where we leave out the extreme
## (in these data, most obvious outlier to leave out is the IV ratio for Z4)
points(mean(summarydat[1:3,5]),7,col='gray',pch=19,lwd=2)
text(-1.1,7,'penalization-based',pos=4)

## MR-Egger
## see Burgess et al. for weighting given finite sample estimates
## MR-Egger developed with a linear model (no constraints for our dichotomous Y)
points(lm(summarydat[,3]~summarydat[,4])$coeff[2],8,col='gray', pch=19,lwd=2)
text(-1.1,8,'Egger',pos=4)

## bounds
text(-1.1,9,bquote(underline('bounds under sets of assumed IVs')),pos=4)
for(i in 1:k) {
  segments(summarydat[i,1],i+9,summarydat[i,2],i+9,col='black',lwd=2)
}
text(-1.1,10,expression('Z'[1]),pos=4)
text(-1.1,11,expression('Z'[2]),pos=4)
text(-1.1,12,expression('Z'[3]),pos=4)
text(-1.1,13,expression('Z'[4]),pos=4)
## Theorem 1: all proposed instruments marginally valid (intersection)
segments(max(summarydat[,1],na.rm=TRUE),14,min(summarydat[,2],na.rm=TRUE),14,col='black',lwd=2)
text(-1.1,14,expression('Z'[1]*',Z'[2]*',Z'[3]*',Z'[4]*' (marginally)'),pos=4)
## Theorem 2: at least one proposed instrument valid (union)
segments(min(summarydat[,1],na.rm=TRUE),15,max(summarydat[,2],na.rm=TRUE),15,col='black',lwd=2)
text(-1.1,15,expression('at least one: Z'[1]*',Z'[2]*',Z'[3]*',Z'[4]),pos=4)
points(jointp,16,col='black',lwd=2,pch=19)
text(-1.1,16,expression('joint: (Z'[1]*',Z'[2]*',Z'[3]*',Z'[4]*')'),pos=4)
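The summary bounds of Theorems 1 and 2 can also be wrapped in small helper functions, which makes the empty-intersection check in the note to Theorem 1 explicit. The following is a minimal sketch and not part of the original code: the function names bounds_all_valid and bounds_at_least_one are hypothetical, and the example bounds are made-up numbers chosen only for illustration.

```r
## Hypothetical helpers (names are illustrative, not from the original code).
## lb, ub: numeric vectors holding one lower and one upper bound per proposed instrument.

## Theorem 1: all proposed instruments assumed (marginally) valid,
## so the effect lies in the intersection [max(lb), min(ub)].
bounds_all_valid <- function(lb, ub) {
  lower <- max(lb)
  upper <- min(ub)
  if (lower > upper) {
    ## Empty intersection: evidence that at least one proposed
    ## instrument violates the instrumental conditions.
    return(c(lower = NA_real_, upper = NA_real_))
  }
  c(lower = lower, upper = upper)
}

## Theorem 2: at least one proposed instrument assumed valid,
## so the effect lies in [min(lb), max(ub)], which contains the union.
bounds_at_least_one <- function(lb, ub) {
  c(lower = min(lb), upper = max(ub))
}

## Made-up bounds for three proposed instruments (illustration only)
lb <- c(-0.30, -0.10, -0.25)
ub <- c( 0.40,  0.55,  0.35)

bounds_all_valid(lb, ub)     # lower = -0.10, upper = 0.35
bounds_at_least_one(lb, ub)  # lower = -0.30, upper = 0.55
```

With the objects created above, the same helpers could be applied as bounds_all_valid(summarydat[,'lb'], summarydat[,'ub']). Returning NA for an empty intersection is one possible design choice; it avoids reporting an interval whose lower end exceeds its upper end.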


eFigure 1. Causal diagrams depicting examples of data generating procedures in which three of the four proposed instruments are valid. $Z_1$, $Z_2$, $Z_3$, and $Z_4$ are proposed instruments for the effect of treatment $A$ on outcome $Y$; $U$ represents unmeasured confounders. $Z_1$, $Z_2$, $Z_3$, and $Z_4$ do not meet the instrumental conditions in A), B), C), and D), respectively, due to confounding of the proposed instrument.

References

Balke A, Pearl J. Bounds on treatment effects for studies with imperfect compliance. Journal of the American Statistical Association. 1997;92(439):
Imbens GW, Manski CF. Confidence intervals for partially identified parameters. Econometrica. 2004:
Richardson TS, Robins JM. ACE Bounds; SEMs with Equilibrium Conditions. Statistical Science. 2014;29(3):
Swanson SA, Holme Ø, Løberg M, et al. Bounding the per-protocol effect in randomized trials: an application to colorectal cancer screening. Trials. 2015;16(1):1-11.
