Bounds on Causal Effects in Three-Arm Trials with Non-compliance. Jing Cheng Dylan Small

Jing Cheng and Dylan Small
Department of Biostatistics and Department of Statistics, University of Pennsylvania
June 20, 2005

A Three-Arm Randomized Trial for Treating Alcohol Dependence
Three arms:
Control (0): Drug naltrexone plus simple medication management by a research physician.
Active Treatment A: Compliance enhancement therapy (CET) plus drug naltrexone plus simple medication management.
Active Treatment B: Cognitive behavioral therapy (CBT) plus drug naltrexone plus simple medication management.
Outcome: Whether or not the subject relapsed, i.e., had five or more drinks in a day, on any day in the past month (relapse coded as 0, lack of relapse coded as 1).
Noncompliance: Compliance with an active treatment was categorized as a binary variable, whether or not the subject attended at least 80% of scheduled sessions.

Data
Notation: randomization assignment $Z \in \{0, A, B\}$, treatment received $D \in \{0, A, B\}$, observed outcome $Y \in \{0, 1\}$.

D    Z    Y = 1    Y = 0    $\hat{E}(Y \mid D, Z)$
0    0    28       19       0.60
0    A    7        8        0.47
A    A    21       14       0.60
0    B    15       6        0.71
B    B    20       3        0.87

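Standard Two Arm Causal Estimates
ITT estimates:
Treatment A versus control: $\hat{E}(Y \mid Z = A) - \hat{E}(Y \mid Z = 0) = -0.04$.
Treatment B versus control: $\hat{E}(Y \mid Z = B) - \hat{E}(Y \mid Z = 0) = 0.20$.
Complier average causal effects (in each ratio, $D$ denotes the indicator of receiving the active treatment being compared):
Treatment A versus control: $\frac{\hat{E}(Y \mid Z = A) - \hat{E}(Y \mid Z = 0)}{\hat{E}(D \mid Z = A) - \hat{E}(D \mid Z = 0)} = \frac{-0.04}{0.70} = -0.06$.
Treatment B versus control: $\frac{\hat{E}(Y \mid Z = B) - \hat{E}(Y \mid Z = 0)}{\hat{E}(D \mid Z = B) - \hat{E}(D \mid Z = 0)} = \frac{0.20}{0.52} = 0.38$.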
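As a cross-check, the ITT and complier average causal effect estimates can be recomputed from the cell counts in the data table. This is a minimal sketch; the dictionary layout and the function names `arm_mean_y` and `take_up` are illustrative assumptions, not part of the original analysis. Unrounded arithmetic gives a treatment A ratio of about -0.05, whereas dividing the rounded values 0.04 and 0.70 gives the magnitude 0.06 reported on the slide.

```python
# Counts (Y = 1, Y = 0) keyed by (Z, D), copied from the data table.
counts = {
    ("0", "0"): (28, 19),
    ("A", "0"): (7, 8),
    ("A", "A"): (21, 14),
    ("B", "0"): (15, 6),
    ("B", "B"): (20, 3),
}

def arm_mean_y(z):
    """Estimate E(Y | Z = z) by pooling all D cells in arm z."""
    cells = [c for (zz, _), c in counts.items() if zz == z]
    ones = sum(y1 for y1, _ in cells)
    total = sum(y1 + y0 for y1, y0 in cells)
    return ones / total

def take_up(z):
    """Estimate P(D = z | Z = z) for an active arm z."""
    n_arm = sum(y1 + y0 for (zz, _), (y1, y0) in counts.items() if zz == z)
    return sum(counts[(z, z)]) / n_arm

itt = {z: arm_mean_y(z) - arm_mean_y("0") for z in ("A", "B")}
# Nobody in the control arm receives A or B in these data, so the
# denominator reduces to the take-up rate in the active arm.
cace = {z: itt[z] / take_up(z) for z in ("A", "B")}

for z in ("A", "B"):
    print(f"{z} vs control: ITT = {itt[z]:+.3f}, CACE = {cace[z]:+.3f}")
```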

Principal Stratification
Actual treatment received $D$ is a post-randomization variable.
Potential treatment received: $D^z = d \in \{0, A, B\}$. Each subject has three potential treatments received, $D^0$, $D^A$, $D^B$.
Basic principal stratification with respect to $D$ (Frangakis and Rubin, 2002):
$0AB = \{u : (D^0, D^A, D^B) = (0, A, B)\}$, $\pi_{0AB} = P(u \in 0AB)$;
$0A0 = \{u : (D^0, D^A, D^B) = (0, A, 0)\}$, $\pi_{0A0} = P(u \in 0A0)$;
$00B = \{u : (D^0, D^A, D^B) = (0, 0, B)\}$, $\pi_{00B} = P(u \in 00B)$;
$000 = \{u : (D^0, D^A, D^B) = (0, 0, 0)\}$, $\pi_{000} = P(u \in 000)$.

Principal Causal Effects
$E(Y^A - Y^0 \mid 0AB)$, $E(Y^B - Y^0 \mid 0AB)$, $E(Y^A - Y^0 \mid 0A0)$, $E(Y^B - Y^0 \mid 00B)$.
Why are the principal causal effects for the basic principal strata in a three-arm trial of interest?
Personal expectations about treatment effects.
Clinical decision making about which treatment to offer first.
Planner decision making about what will happen if a treatment is introduced into general practice.

Which treatment to offer first?
Should treatment A or treatment B be offered first when the plan is to offer the other treatment if the patient refuses to comply with the first treatment?
The treatment with the higher effect for the 0AB principal stratum should be offered first, because this is the only stratum whose treatment received is affected by which treatment is offered first.
Comparing the complier average causal effect for treatment A to the complier average causal effect for treatment B may be misleading.
Example: Equal proportions in the four principal strata. The causal effect of treatment A is 0.7 for the 0A0 stratum and 0.4 for the 0AB stratum; the causal effect of treatment B is 0.3 for the 00B stratum and 0.6 for the 0AB stratum. Treatment B is better for stratum 0AB, but the complier average causal effect for treatment A, (0.7 + 0.4)/2 = 0.55, is higher than the complier average causal effect for treatment B, (0.3 + 0.6)/2 = 0.45.
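The arithmetic of this example can be reproduced with a short sketch. The stratum proportions and effects are the hypothetical values from the example; the helper name `cace` is an illustrative assumption.

```python
# Hypothetical stratum proportions and effects from the example.
pi = {"0AB": 0.25, "0A0": 0.25, "00B": 0.25, "000": 0.25}
effect_A = {"0AB": 0.4, "0A0": 0.7}  # effect of A versus control, by stratum
effect_B = {"0AB": 0.6, "00B": 0.3}  # effect of B versus control, by stratum

def cace(effect):
    """Proportion-weighted mean effect over the strata that comply."""
    weight = sum(pi[s] for s in effect)
    return sum(pi[s] * e for s, e in effect.items()) / weight

cace_A, cace_B = cace(effect_A), cace(effect_B)
best_by_cace = "A" if cace_A > cace_B else "B"
best_for_0AB = "A" if effect_A["0AB"] > effect_B["0AB"] else "B"
print(cace_A, cace_B, best_by_cace, best_for_0AB)
```

The two rankings disagree: the complier averages favor A, while the 0AB stratum, the only one whose treatment received depends on the order of offers, favors B.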

Planner decision making
An important use of causal modeling in randomized trials with noncompliance is to help a planner anticipate what would happen were the treatment(s) to be introduced into general practice.
After a trial is completed, the pattern of treatment received among people offered a treatment may differ from the pattern observed in the trial.
Causal effects of receiving treatment are important for predicting the effects of introducing a treatment into general practice. Joffe and Brensinger (2003) provide a general framework.
Three-arm trials provide information for sharpening such predictions. For example, if compliance with treatment A is expected to drop in general practice, it might be expected that the subjects whose compliance will change are more likely to be in principal stratum 0A0 than in principal stratum 0AB.

Assumptions
Assumption 1: Stable Unit Treatment Value Assumption (SUTVA): no interference between subjects;
Assumption 2: Random Assignment;
Assumption 3: Exclusion Restriction (ER): no direct effect of randomization;
Assumption 4: Monotonicity I: $P(D^0 = 0) = 1$, $P(D^A = B) = P(D^B = A) = 0$;
Assumption 5: Monotonicity II: $P(D^A = A \mid D^B = B) = 1$, i.e., there is no 00B stratum.

Problem in Point-identifying Parameters
Observed (Z, D) cells and the principal strata they contain:

Z    D    Principal strata
A    A    0AB, 0A0
A    0    00B, 000
B    B    0AB, 00B
B    0    0A0, 000
0    0    0AB, 0A0, 00B, 000

$$E(Y \mid Z = A, D = A) = \frac{\pi_{0AB}}{\pi_{0AB} + \pi_{0A0}} E(Y^A \mid 0AB) + \frac{\pi_{0A0}}{\pi_{0AB} + \pi_{0A0}} E(Y^A \mid 0A0)$$
$$E(Y \mid Z = A, D = 0) = \frac{\pi_{00B}}{\pi_{00B} + \pi_{000}} E(Y^0 \mid 00B) + \frac{\pi_{000}}{\pi_{00B} + \pi_{000}} E(Y^0 \mid 000)$$
$$E(Y \mid Z = B, D = B) = \frac{\pi_{0AB}}{\pi_{0AB} + \pi_{00B}} E(Y^B \mid 0AB) + \frac{\pi_{00B}}{\pi_{0AB} + \pi_{00B}} E(Y^B \mid 00B)$$
$$E(Y \mid Z = B, D = 0) = \frac{\pi_{0A0}}{\pi_{0A0} + \pi_{000}} E(Y^0 \mid 0A0) + \frac{\pi_{000}}{\pi_{0A0} + \pi_{000}} E(Y^0 \mid 000)$$
$$E(Y \mid Z = 0, D = 0) = \pi_{0AB} E(Y^0 \mid 0AB) + \pi_{0A0} E(Y^0 \mid 0A0) + \pi_{00B} E(Y^0 \mid 00B) + \pi_{000} E(Y^0 \mid 000)$$

Problem: The system of equations for expected potential outcomes in terms of expected observed outcomes does not have a unique solution. Principal causal effects for the basic principal strata are not point-identified.
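A small numeric illustration of the non-identification, using the mixture form of $E(Y \mid Z = A, D = A)$. All numbers below are hypothetical and chosen only so that two different pairs of stratum means produce the same observable; the function name is an illustrative assumption.

```python
def observed_mean(w, mean_0AB, mean_0A0):
    """E(Y | Z = A, D = A) as a mixture of the 0AB and 0A0 stratum means."""
    return w * mean_0AB + (1 - w) * mean_0A0

w = 4 / 7  # hypothetical share of 0AB among subjects with Z = A, D = A
scenario_1 = observed_mean(w, mean_0AB=0.5, mean_0A0=0.5)
scenario_2 = observed_mean(w, mean_0AB=0.8, mean_0A0=0.1)
print(scenario_1, scenario_2)  # equal up to floating-point rounding
```

Both scenarios give the same observed cell mean even though $E(Y^A \mid 0AB)$ is 0.5 in one and 0.8 in the other, so the cell mean alone cannot separate them.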

Bounds
Identification region: the set of principal causal effects that are consistent with the distribution of the observables $(Y, D, Z)$.
Although principal causal effects are not point-identified, the data are informative about the bounds of the identification region.
Derive bounds for average causal effects within principal strata under Assumptions 1-4:
I. Determine bounds for the proportions in principal strata: $\pi_{0AB}$, $\pi_{0A0}$, $\pi_{00B}$, and $\pi_{000}$;
II. Find bounds for average potential outcomes within principal strata, $E(Y^A \mid 0AB)$, $E(Y^A \mid 0A0)$, $E(Y^B \mid 0AB)$, $E(Y^B \mid 00B)$, $E(Y^0 \mid 0AB)$, $E(Y^0 \mid 0A0)$, $E(Y^0 \mid 00B)$, $E(Y^0 \mid 000)$, given $\pi_{0AB}$, $\pi_{0A0}$, $\pi_{00B}$, and $\pi_{000}$;

Method for Finding Bounds
III. Find the bounds for the average causal effect of treatment within principal strata, $E(Y^A - Y^0 \mid 0AB)$, $E(Y^B - Y^0 \mid 0AB)$, $E(Y^A - Y^0 \mid 0A0)$, $E(Y^B - Y^0 \mid 00B)$, using the results from Step I and Step II.

Method: Bounds (Assump. 1-4)
Step I: Bounds for the proportions in principal strata
$$P(D = A \mid Z = A) = \pi_{0AB} + \pi_{0A0}$$
$$P(D = 0 \mid Z = A) = \pi_{00B} + \pi_{000}$$
$$P(D = B \mid Z = B) = \pi_{0AB} + \pi_{00B}$$
$$P(D = 0 \mid Z = B) = \pi_{0A0} + \pi_{000}$$
$$1 = P(D = 0 \mid Z = 0) = \pi_{0AB} + \pi_{0A0} + \pi_{00B} + \pi_{000}$$
Furthermore, we have $0 \le \pi_{0AB}, \pi_{0A0}, \pi_{00B}, \pi_{000} \le 1$.

Bounds on proportions of principal strata
$$\max\{0, P(D = A \mid Z = A) - P(D = 0 \mid Z = B)\} \le \pi_{0AB} \le \min\{P(D = A \mid Z = A), P(D = B \mid Z = B)\}$$
$$\max\{0, P(D = A \mid Z = A) - P(D = B \mid Z = B)\} \le \pi_{0A0} \le \min\{P(D = A \mid Z = A), P(D = 0 \mid Z = B)\}$$
$$\max\{0, P(D = B \mid Z = B) - P(D = A \mid Z = A)\} \le \pi_{00B} \le \min\{P(D = B \mid Z = B), P(D = 0 \mid Z = A)\}$$
$$\max\{0, P(D = 0 \mid Z = B) - P(D = A \mid Z = A)\} \le \pi_{000} \le \min\{P(D = 0 \mid Z = A), P(D = 0 \mid Z = B)\}$$
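The Step I inequalities can be evaluated directly from the two compliance rates. The sketch below uses the alcohol-study rates 0.70 and 0.52; the function name and argument names are illustrative assumptions.

```python
def stratum_proportion_bounds(p_AA, p_BB):
    """Step I bounds from P(D=A | Z=A) and P(D=B | Z=B) under Assumptions 1-4."""
    p_0A, p_0B = 1 - p_AA, 1 - p_BB
    return {
        "0AB": (max(0.0, p_AA - p_0B), min(p_AA, p_BB)),
        "0A0": (max(0.0, p_AA - p_BB), min(p_AA, p_0B)),
        "00B": (max(0.0, p_BB - p_AA), min(p_BB, p_0A)),
        "000": (max(0.0, p_0B - p_AA), min(p_0A, p_0B)),
    }

bounds = stratum_proportion_bounds(p_AA=0.70, p_BB=0.52)
for stratum, (lo, hi) in bounds.items():
    print(f"pi_{stratum} in [{lo:.2f}, {hi:.2f}]")
```

Because the four proportions satisfy four linear equations of rank three, fixing $\pi_{0AB}$ inside its interval determines the other three.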

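Method: Bounds (Assump. 1-4)
Step II: Bounds for the average potential outcomes given proportions in principal strata
$$E(Y \mid Z = A, D = A) = \frac{\pi_{0AB}}{\pi_{0AB} + \pi_{0A0}} E(Y^A \mid 0AB) + \frac{\pi_{0A0}}{\pi_{0AB} + \pi_{0A0}} E(Y^A \mid 0A0)$$
$$E(Y \mid Z = A, D = 0) = \frac{\pi_{00B}}{\pi_{00B} + \pi_{000}} E(Y^0 \mid 00B) + \frac{\pi_{000}}{\pi_{00B} + \pi_{000}} E(Y^0 \mid 000)$$
$$E(Y \mid Z = B, D = B) = \frac{\pi_{0AB}}{\pi_{0AB} + \pi_{00B}} E(Y^B \mid 0AB) + \frac{\pi_{00B}}{\pi_{0AB} + \pi_{00B}} E(Y^B \mid 00B)$$
$$E(Y \mid Z = B, D = 0) = \frac{\pi_{0A0}}{\pi_{0A0} + \pi_{000}} E(Y^0 \mid 0A0) + \frac{\pi_{000}}{\pi_{0A0} + \pi_{000}} E(Y^0 \mid 000)$$
$$E(Y \mid Z = 0, D = 0) = \pi_{0AB} E(Y^0 \mid 0AB) + \pi_{0A0} E(Y^0 \mid 0A0) + \pi_{00B} E(Y^0 \mid 00B) + \pi_{000} E(Y^0 \mid 000)$$
$$0 \le E(Y^A \mid 0AB), E(Y^A \mid 0A0), E(Y^B \mid 0AB), E(Y^B \mid 00B), E(Y^0 \mid 0AB), E(Y^0 \mid 0A0), E(Y^0 \mid 00B), E(Y^0 \mid 000) \le 1$$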

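Example: $E(Y^A \mid 0AB)$
$$\left(\min E(Y^A \mid 0AB),\ \max E(Y^A \mid 0AB)\right) = \left(\max\left\{0,\ 1 - \frac{1 - P(Y = 1 \mid Z = A, D = A)}{\pi_{0AB} / P(D = A \mid Z = A)}\right\},\ \min\left\{1,\ \frac{P(Y = 1 \mid Z = A, D = A)}{\pi_{0AB} / P(D = A \mid Z = A)}\right\}\right)$$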
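This example is a bound on one component of a two-component mixture: the upper limit sets the other stratum's mean to 0 and the lower limit sets it to 1. A minimal sketch follows, with illustrative names and with the alcohol-study values 0.60 and 0.70 combined with $\pi_{0AB} = 0.52$, the top of its Step I range.

```python
def stratum_mean_bounds(cell_mean, stratum_share):
    """Bounds on one stratum's mean outcome inside an observed (Z, D) cell.

    cell_mean:      observed P(Y = 1 | Z, D) for the cell
    stratum_share:  the stratum's share of the cell, for example
                    pi_0AB / P(D = A | Z = A); must lie in (0, 1]
    """
    if not 0 < stratum_share <= 1:
        raise ValueError("stratum_share must lie in (0, 1]")
    lower = max(0.0, 1 - (1 - cell_mean) / stratum_share)
    upper = min(1.0, cell_mean / stratum_share)
    return lower, upper

lo, hi = stratum_mean_bounds(cell_mean=0.60, stratum_share=0.52 / 0.70)
print(f"E(Y^A | 0AB) in ({lo:.3f}, {hi:.3f})")
```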

Method: Bounds (Assump. 1-4)
Step III: Bounds for the average causal effect within principal strata
For given $\pi_{0AB}$:
$$E(Y^A - Y^0 \mid 0AB) \in \left(\min E(Y^A \mid 0AB, \pi_{0AB}) - \max E(Y^0 \mid 0AB, \pi_{0AB}),\ \max E(Y^A \mid 0AB, \pi_{0AB}) - \min E(Y^0 \mid 0AB, \pi_{0AB})\right).$$
For varying $\pi_{0AB}$:
$$E(Y^A - Y^0 \mid 0AB) \in \left(\min_{\pi_{0AB} \in I} \left[\min E(Y^A \mid 0AB, \pi_{0AB}) - \max E(Y^0 \mid 0AB, \pi_{0AB})\right],\ \max_{\pi_{0AB} \in I} \left[\max E(Y^A \mid 0AB, \pi_{0AB}) - \min E(Y^0 \mid 0AB, \pi_{0AB})\right]\right)$$
where $I = \left(\max\{0, P(D = A \mid Z = A) - P(D = 0 \mid Z = B)\},\ \min\{P(D = A \mid Z = A), P(D = B \mid Z = B)\}\right)$.
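Steps I to III can be chained in a short grid search over $\pi_{0AB}$. The sketch below is one possible implementation under stated assumptions, not the authors' code: the range of $E(Y^0 \mid 0AB)$ for a fixed $\pi_{0AB}$ is obtained by eliminating variables from the three $D = 0$ equations (an elimination worked out for this sketch, not a formula given on the slides), values of $\pi_{0AB}$ that admit no solution are skipped, and all names are illustrative. The counts come from the alcohol-study data table.

```python
# Counts (Y = 1, Y = 0) keyed by (Z, D), from the alcohol-study data table.
counts = {("0", "0"): (28, 19), ("A", "0"): (7, 8), ("A", "A"): (21, 14),
          ("B", "0"): (15, 6), ("B", "B"): (20, 3)}

def share_and_mean(z, d):
    """Return estimates of P(D = d | Z = z) and P(Y = 1 | Z = z, D = d)."""
    n_arm = sum(sum(c) for (zz, _), c in counts.items() if zz == z)
    y1, y0 = counts[(z, d)]
    return (y1 + y0) / n_arm, y1 / (y1 + y0)

p_AA, m_AA = share_and_mean("A", "A")
p_0A, m_A0 = share_and_mean("A", "0")
p_BB, _ = share_and_mean("B", "B")
p_0B, m_B0 = share_and_mean("B", "0")
_, m_00 = share_and_mean("0", "0")

def y0_bounds_0AB(pi, tol=1e-12):
    """Range of E(Y^0 | 0AB) for a fixed pi_0AB, or None if pi is infeasible."""
    pi_0A0, pi_00B = p_AA - pi, p_BB - pi
    pi_000 = p_0A - pi_00B
    a = m_A0 * p_0A  # equals pi_00B E(Y^0|00B) + pi_000 E(Y^0|000)
    b = m_B0 * p_0B  # equals pi_0A0 E(Y^0|0A0) + pi_000 E(Y^0|000)
    # t = pi_000 * E(Y^0 | 000); keep all four stratum means inside [0, 1].
    t_lo = max(0.0, a - pi_00B, b - pi_0A0, a + b - m_00)
    t_hi = min(pi_000, a, b, a + b - m_00 + pi)
    if t_lo > t_hi + tol:
        return None
    return (m_00 - a - b + t_lo) / pi, (m_00 - a - b + t_hi) / pi

def effect_A_bounds_0AB(n_grid=2001):
    """Grid search over pi_0AB in the Step I interval I."""
    lo_pi, hi_pi = max(0.0, p_AA - p_0B), min(p_AA, p_BB)
    lower, upper = float("inf"), float("-inf")
    for k in range(n_grid):
        pi = lo_pi + (hi_pi - lo_pi) * k / (n_grid - 1)
        y0 = y0_bounds_0AB(pi) if pi > 0 else None
        if y0 is None:
            continue
        share = pi / p_AA  # share of 0AB in the (Z = A, D = A) cell
        ya_lo = max(0.0, 1 - (1 - m_AA) / share)
        ya_hi = min(1.0, m_AA / share)
        lower = min(lower, ya_lo - y0[1])
        upper = max(upper, ya_hi - y0[0])
    return lower, upper

lower, upper = effect_A_bounds_0AB()
print(f"E(Y^A - Y^0 | 0AB) in ({lower:.2f}, {upper:.2f})")
```

On these counts the sketch gives roughly (-0.61, 0.68), close to the estimated bounds reported for the alcohol study. A grid is used for transparency; a linear program over the stratum means would be an alternative.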

Assumption 5 (Monotonicity II)
Method: Bounds (Assump. 1-5)
$P(D^A = A \mid D^B = B) = 1$, so the 00B stratum does not exist.
$\pi_{0AB}$, $\pi_{0A0}$, and $\pi_{000}$: point-identified;
$E(Y^B \mid 0AB)$, $E(Y^0 \mid 0AB)$, $E(Y^0 \mid 0A0)$, $E(Y^0 \mid 000)$: point-identified;
$E(Y^A \mid 0AB)$, $E(Y^A \mid 0A0)$: bounded;
$$E(Y^B - Y^0 \mid 0AB) = E(Y^B \mid 0AB) - E(Y^0 \mid 0AB)$$
$$E(Y^A - Y^0 \mid 0AB) \in \left(\min E(Y^A \mid 0AB) - E(Y^0 \mid 0AB),\ \max E(Y^A \mid 0AB) - E(Y^0 \mid 0AB)\right)$$
$$E(Y^A - Y^0 \mid 0A0) \in \left(\min E(Y^A \mid 0A0) - E(Y^0 \mid 0A0),\ \max E(Y^A \mid 0A0) - E(Y^0 \mid 0A0)\right)$$
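Under Assumption 5 the calculation becomes direct, because setting $\pi_{00B} = 0$ lets the Step I and Step II equations be solved one after another. The sketch below applies this to the observed proportions listed for the hypothetical study; the variable names and the helper `mixture_bounds` are illustrative assumptions.

```python
# Observed quantities for the hypothetical study (observed proportions table).
p_AA, p_BB = 0.95, 0.80
m_AA, m_A0, m_BB, m_B0, m_00 = 0.95, 0.20, 0.70, 0.25, 0.45
p_0A, p_0B = 1 - p_AA, 1 - p_BB

# With no 00B stratum the proportions are point-identified.
pi_0AB = p_BB
pi_0A0 = p_AA - p_BB
pi_000 = p_0A

# Point-identified mean outcomes.
y0_000 = m_A0  # the (Z = A, D = 0) cell contains only stratum 000
yB_0AB = m_BB  # the (Z = B, D = B) cell contains only stratum 0AB
y0_0A0 = (m_B0 * p_0B - pi_000 * y0_000) / pi_0A0
y0_0AB = (m_00 - pi_0A0 * y0_0A0 - pi_000 * y0_000) / pi_0AB

def mixture_bounds(cell_mean, share):
    """Bounds on a stratum mean given its share of an observed cell."""
    return max(0.0, 1 - (1 - cell_mean) / share), min(1.0, cell_mean / share)

yA_0AB = mixture_bounds(m_AA, pi_0AB / p_AA)
yA_0A0 = mixture_bounds(m_AA, pi_0A0 / p_AA)

effect_B_0AB = yB_0AB - y0_0AB
effect_A_0AB = (yA_0AB[0] - y0_0AB, yA_0AB[1] - y0_0AB)
effect_A_0A0 = (yA_0A0[0] - y0_0A0, yA_0A0[1] - y0_0A0)
print(effect_B_0AB, effect_A_0AB, effect_A_0A0)
```

The results, 0.20, about (0.44, 0.50), and about (0.42, 0.73), agree with the estimated values reported for the hypothetical study under Assumptions 1-4 and 5.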

Confidence Intervals for Bounds
We have assumed that the distribution of $(Y, D, Z)$ is known. E.g.,
$$E(Y^A \mid 0AB) \in \left(\max\left\{0,\ 1 - \frac{1 - P(Y = 1 \mid Z = A, D = A)}{\pi_{0AB} / P(D = A \mid Z = A)}\right\},\ \min\left\{1,\ \frac{P(Y = 1 \mid Z = A, D = A)}{\pi_{0AB} / P(D = A \mid Z = A)}\right\}\right)$$
In practice there is sampling uncertainty in the distribution of $(Y, D, Z)$;
Confidence intervals (CIs) are of interest when making inference about the bounds.

Method: CI on bounds
Suppose $(L_n, U_n)$ are estimates of the bounds $(L, U)$.
Bonferroni CI: $(L_n^l, U_n^u)$, where $P(L_n^l \le L) = 1 - \alpha/2$ and $P(U_n^u \ge U) = 1 - \alpha/2$.
Horowitz-Manski (HM) CI (JASA 2002): $(L_n - z_{n\alpha}, U_n + z_{n\alpha})$, where $z_{n\alpha}$ satisfies $P(L_n - z_{n\alpha} \le L,\ U \le U_n + z_{n\alpha}) = 1 - \alpha$ asymptotically.
B-method CI (Beran, JASA 1988): $(\min L_n^l, \max U_n^u)$, taken over the sets
$$\{L_n^l : L_n - L_n^l \le \hat{H}_{n,L}^{-1}[\hat{H}_n^{-1}(1 - \alpha)]\}, \qquad \{U_n^u : U_n^u - U_n \le \hat{H}_{n,U}^{-1}[\hat{H}_n^{-1}(1 - \alpha)]\}.$$
$\hat{H}_{n,L}$ and $\hat{H}_{n,U}$ are the distributions of $(L_n^* - L_n)$ and $(U_n - U_n^*)$ respectively, where $L_n^*$ and $U_n^*$ are bootstrap estimates of $L_n$ and $U_n$ respectively; $\hat{H}_n$ is the distribution of $\max\{\hat{H}_{n,L}(L_n^* - L_n), \hat{H}_{n,U}(U_n - U_n^*)\}$.
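A Bonferroni-style interval can be sketched with a percentile bootstrap: take a one-sided $\alpha/2$ limit for the lower bound estimate and a one-sided $\alpha/2$ limit for the upper bound estimate. The slides do not say how the one-sided limits $L_n^l$ and $U_n^u$ were computed, so the percentile bootstrap, the within-arm resampling, and all function names below are assumptions of this sketch. For brevity it targets the Step I bounds on $\pi_{0AB}$ using the alcohol-study compliance counts.

```python
import random

def pi_0AB_bounds(arm_A, arm_B):
    """Estimated Step I bounds (L_n, U_n) for pi_0AB from lists of D values."""
    p_AA = arm_A.count("A") / len(arm_A)
    p_BB = arm_B.count("B") / len(arm_B)
    return max(0.0, p_AA - (1 - p_BB)), min(p_AA, p_BB)

def quantile(sorted_values, q):
    """Simple empirical quantile by index (no interpolation)."""
    idx = min(len(sorted_values) - 1, max(0, int(q * len(sorted_values))))
    return sorted_values[idx]

def bonferroni_ci(arm_A, arm_B, alpha=0.05, n_boot=2000, seed=0):
    rng = random.Random(seed)
    lows, highs = [], []
    for _ in range(n_boot):
        # Resample subjects with replacement within each randomized arm.
        boot_A = rng.choices(arm_A, k=len(arm_A))
        boot_B = rng.choices(arm_B, k=len(arm_B))
        lo, hi = pi_0AB_bounds(boot_A, boot_B)
        lows.append(lo)
        highs.append(hi)
    lows.sort()
    highs.sort()
    # One-sided alpha/2 limit for each end of the bound interval.
    return quantile(lows, alpha / 2), quantile(highs, 1 - alpha / 2)

arm_A = ["A"] * 35 + ["0"] * 15  # D values for subjects with Z = A
arm_B = ["B"] * 23 + ["0"] * 21  # D values for subjects with Z = B
estimate = pi_0AB_bounds(arm_A, arm_B)
ci = bonferroni_ci(arm_A, arm_B)
print(estimate, ci)
```

The same pattern applies to any pair of bound estimators: replace `pi_0AB_bounds` with a function that returns $(L_n, U_n)$ for the causal effect of interest.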

Method: Check the Plausibility of Assumptions
Exclusion Restriction (ER):
Derive a necessary condition for ER to hold (Pearl, 1995);
If the probability distribution of $(Y, D, Z)$ does not satisfy this condition, the ER assumption cannot hold.
Monotonicity II (Assumption 5):
Derive the constraints that Monotonicity II implies for the observed probability distribution of $(Y, D, Z)$;
Bootstrap from the empirical distribution of $(Y, D, Z)$;
Count the proportion of bootstrapped distributions that do not satisfy these constraints.
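The bootstrap check for Monotonicity II can be sketched as follows. The slides do not list the implied constraints, so this sketch uses one reading of them, stated here as an assumption: with $\pi_{00B} = 0$, the point-identified proportion $\pi_{0A0}$ must be nonnegative and the point-identified means $E(Y^0 \mid 0A0)$ and $E(Y^0 \mid 0AB)$ must lie in $[0, 1]$. Function names and the within-arm resampling scheme are also illustrative. The data are rebuilt from the alcohol-study counts.

```python
import random

def expand(cells):
    """Turn (D, n with Y = 1, n with Y = 0) triples into a list of (D, Y) pairs."""
    return [(d, y) for d, n1, n0 in cells for y in [1] * n1 + [0] * n0]

arms = {
    "0": expand([("0", 28, 19)]),
    "A": expand([("0", 7, 8), ("A", 21, 14)]),
    "B": expand([("0", 15, 6), ("B", 20, 3)]),
}

def cell(arm, d):
    """Return (share of the arm with D = d, mean of Y in that cell)."""
    ys = [y for dd, y in arm if dd == d]
    if not ys:
        return 0.0, 0.0
    return len(ys) / len(arm), sum(ys) / len(ys)

def satisfies_monotonicity_2(arms, tol=1e-9):
    """Check that the quantities point-identified under Assumption 5 are valid."""
    p_AA, _ = cell(arms["A"], "A")
    p_0A, m_A0 = cell(arms["A"], "0")
    p_BB, _ = cell(arms["B"], "B")
    p_0B, m_B0 = cell(arms["B"], "0")
    _, m_00 = cell(arms["0"], "0")
    pi_0AB, pi_0A0, pi_000 = p_BB, p_AA - p_BB, p_0A
    if pi_0A0 < -tol:
        return False
    # pi_s * E(Y^0 | s) for s = 0A0 and 0AB must lie between 0 and pi_s.
    mass_0A0 = m_B0 * p_0B - pi_000 * m_A0
    mass_0AB = m_00 - m_B0 * p_0B
    return (-tol <= mass_0A0 <= pi_0A0 + tol) and (-tol <= mass_0AB <= pi_0AB + tol)

def violation_rate(arms, n_boot=1000, seed=0):
    """Share of bootstrap resamples that violate the implied constraints."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(n_boot):
        boot = {z: rng.choices(arm, k=len(arm)) for z, arm in arms.items()}
        violations += not satisfies_monotonicity_2(boot)
    return violations / n_boot

empirical_ok = satisfies_monotonicity_2(arms)
rate = violation_rate(arms)
print(empirical_ok, rate)
```

A high violation rate is evidence against Monotonicity II; the exact rate depends on the seed and on which constraints are checked.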

Application: Alcohol study
Three arms: usual care, CET, CBT. Outcome: alcohol relapse. Sample size = 141.
Check the plausibility of assumptions:
ER is plausible;
We are fairly confident that Monotonicity II does not hold for the alcohol study (65% of 1000 bootstrapped distributions do not satisfy Monotonicity II).

Causal effects under Assumptions 1-4

Causal effect             Estimated bounds   95% Bonferroni CI   95% Horowitz-Manski CI   95% B-method CI
$E(Y^A - Y^0 \mid 0AB)$   (-0.61, 0.67)      (-1, 1)             (-1, 1)                  (-1, 1)
$E(Y^A - Y^0 \mid 0A0)$   (-1, 0.40)         (-1, 0.79)          NA                       (-1, 0.81)
$E(Y^B - Y^0 \mid 0AB)$   (0.11, 0.67)       (-0.85, 1)          (-0.44, 1)               (-0.43, 1)
$E(Y^B - Y^0 \mid 00B)$   (-0.64, 1)         (-1, 1)             NA                       (-1, 1)

A Hypothetical Study
Three arms: Control (0), Treatment A, Treatment B. Sample size = 1200.
ER is plausible;
Monotonicity II is plausible: the empirical distribution of $(Y, D, Z)$ and 95% of 1000 bootstrapped distributions satisfy the constraints that Monotonicity II implies.

Causal effect             Quantity                  Assumptions 1-4   Assumptions 1-4 and 5
$E(Y^A - Y^0 \mid 0AB)$   Estimated bounds          (0.41, 0.51)      (0.44, 0.50)
                          95% Bonferroni CI         (0.33, 0.58)      (0.36, 0.57)
                          95% Horowitz-Manski CI    (0.34, 0.58)      (0.37, 0.57)
                          95% B-method CI           (0.33, 0.58)      (0.37, 0.57)
$E(Y^A - Y^0 \mid 0A0)$   Estimated bounds          (0.39, 0.79)      (0.42, 0.73)
                          95% Bonferroni CI         (0.13, 0.89)      (0.18, 0.92)
                          95% Horowitz-Manski CI    (0.22, 0.96)      (0.24, 0.91)
                          95% B-method CI           (0.21, 0.91)      (0.22, 0.90)
$E(Y^B - Y^0 \mid 0AB)$   Estimated bounds          (0.16, 0.23)      0.20
                          95% Bonferroni CI         (0.06, 0.32)      (0.11, 0.29)
                          95% Horowitz-Manski CI    (0.01, 0.32)      NA
                          95% B-method CI           (0.07, 0.32)      NA
$E(Y^B - Y^0 \mid 00B)$   Estimated bounds          (-1, 1)           Undefined
                          95% Bonferroni CI         (-1, 1)           NA
                          95% Horowitz-Manski CI    (-1, 1)           NA
                          95% B-method CI           (-1, 1)           NA

Accuracy of Bootstrap CIs
Finite sample accuracy of the 95% bootstrap CIs: coverage probabilities

Method        Alcohol Study   Hypothetical Study
Bonferroni    0.92-0.97       0.94-0.96
HM method     0.65-0.92       0.91-0.96
B-method      0.76-0.98       0.90-0.96

Summary
For a three-arm trial, the principal stratification framework reveals that a more useful set of estimands can be considered by looking at the full structure of the three-arm trial rather than treating it as two two-arm trials.
Goal: To make inferences about causal effects within basic principal strata in three-arm trials with non-compliance.
Method: Derive bounds under the usual Assumptions 1-4; add Assumption 5 to identify parameters or narrow the bounds on parameters; construct confidence intervals for the bounds to take into account the sampling uncertainty in the estimates.
Application: Alcohol study.

Future Work
Goal: To make inferences about average causal effects within basic principal strata in three-arm trials with non-compliance;
In this study, we derive bounds based on the extreme relationship between certain potential outcomes;
E.g., derive bounds on $E(Y^A \mid 0AB)$ by taking $E(Y^A \mid 0A0) = 1$ or $0$ in
$$E(Y \mid Z = A, D = A) = \frac{\pi_{0AB}}{\pi_{0AB} + \pi_{0A0}} E(Y^A \mid 0AB) + \frac{\pi_{0A0}}{\pi_{0AB} + \pi_{0A0}} E(Y^A \mid 0A0);$$

Future Work 1) Sensitivity analysis: Find selection parameters between certain potential outcomes; Check the change of the bounds with the change of the selection parameters; 2) Estimation under Assumption 5 (Monotonicity II): Identify parameters using covariates that predict principal stratum membership.

Observed proportions in the alcohol study and the hypothetical study

Observed proportion             Alcohol Study   Hypothetical Study
$P(D = A \mid Z = A)$           0.70            0.95
$P(D = 0 \mid Z = A)$           0.30            0.05
$P(D = B \mid Z = B)$           0.52            0.80
$P(D = 0 \mid Z = B)$           0.48            0.20
$P(D = 0 \mid Z = 0)$           1.00            1.00
$P(Y = 1 \mid Z = A, D = A)$    0.60            0.95
$P(Y = 1 \mid Z = A, D = 0)$    0.47            0.20
$P(Y = 1 \mid Z = B, D = B)$    0.87            0.70
$P(Y = 1 \mid Z = B, D = 0)$    0.71            0.25
$P(Y = 1 \mid Z = 0, D = 0)$    0.60            0.45