Optimal rejection regions for testing multiple binary endpoints in small samples

Optimal rejection regions for testing multiple binary endpoints in small samples Robin Ristl and Martin Posch Section for Medical Statistics, Center of Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna Köln, June 1

Possible aims of a study with multiple endpoints (1) Show effect in all endpoints Reject all H i Endpoints are co-primary () Reject the global null-hypothesis of no effect in any EP Need not involve conclusions on individual EPs () Identify efficacious endpoints Reject some H i Multiple testing problem (4) Show non-inferiority for some EP and superiority for at least one EP Might be considered as minimal condition for an improved treatment

What is the problem with small sample sizes? Asymptotic distibution may not reflect true distributions with sufficient accuracy Low precision of nuisance parameter estimates Limited model complexity/risk of overfitting Low power

Testing binary endpoints H : p 1 = p or equivalently H : OR = 1 p 1, p proportions in treatment and placebo group n sample size per group Large sample sizes: Asymptotic z-test Under H, Z = ˆp n T ˆp C d N(, 1) ˆp(1 ˆp ) Uses central limit theorem and consistent estimate for p Small sample sizes: Under H, n ˆp 1, n ˆp Binom(n, p ) Can find the exact distribution of ˆp 1 ˆp for given p Difficulty: p is unkown Conditional test Table row margins are a sufficient statistic for p Conditional on the margins the data do not depend on p

Fisher s Exact Test Null hypothesis H : Odds ratio 1 Test statistic T is the number of successes in the treatment group Conditional on margins, T has hypergeometric distribution under H The test may be quite conservative (due to discreteness) Example Data Null distribution of T Trt Ctrl Sum k P(T=k) Succ. 1 4. Fail 4 6 1.1 Sum.476 Observed T =.1 4. Rejection region p-value =.619

Multiple Fisher Tests Consider two endpoints, A and B Four possible outcomes: No success, Success A only, Success B only, Success for A and B Global null-hypothesis: H : OR A 1 and OR B 1 Simple approach: Two marginal Fisher tests at level α/ May be very conservative Example Data Marginal Table EP A Marginal Table EP B Trt Ctrl Sum Trt Ctrl Sum Trt Ctrl Sum AB 9 Succ. 11 16 Succ. 1 6 16 A Fail 4 1 14 Fail 9 14 B Sum 1 1 Sum 1 1 Fail 7 11 T A = 11 T B = 1 Sum 1 1 p =. >.1 p =.16 >.1

Idea: Use conditional joint distribution of the test statistics Fixed sample size per group n, m X Mult(n, p X1, p X, p X, p X4 ), Y Mult(m, p Y1, p Y, p Y, p Y4 ) X, Y independent. Trt Ctrl Sum AB x 1 y 1 a A x y b B x y c Fail x 4 y 4 d Sum n m N P(X 1 = x 1, X = x, X = x, X 4 = x 4 X 1 + Y 1 = a, X + Y = b, X + Y = c, X 4 + Y 4 = d) ( ) xi 4 1 pxi i=1 x i!y i! p Yi Use this to find conditional joint distribution of the test statistics T A = x 1 + x, T B = x 1 + x Consider the point null-hypothesis H : p xi = p yi, i = 1,..., 4 Or any alternative specified through p Xi /p Yi, i = 1,..., 4

Conditional joint distribution under H Joint distribution of T A and T B Probabilities in %, rounded to 1 decimal digit. Entries "" are small but positive..4. 1..1..1 1...4 1 1 14 1 1.1.1.1.1.4 11..7.9.6..1. 1.4 1.. 1..6.1 1. T B 9.1..7 1.. 4.1 6.4 6.4.4. 6.4..9.7.1.1.1. 7.1.9. 6.4 4.1 1...1 6.1.6 1.. 1..4 1..1..6.9.7.. 4.1.1.1.1.4 1 1 4 6 7 9 1 11 1 1 14 1 T A

Under alternative of specific success probabilities Succ. probs. for EP A and B: Treatment.7,.4, control.,., correlation between endpoints..1.6.6 1 7.4 1. 1.6 4.6. 1 1 14 1.1..4..9 1.1.6 1.7.4 1.. 6. 11.4. 7. 1..1 1. 1.1.9.6.4 1.6 6.9..7 T B 9..1 1.1.7.7. 7.4.. 4.1.9.. 1. 7.1..7..4. 6.1.1.1. 4 1 1 4 6 7 9 1 11 1 1 14 1 T A

Definition of valid rejection regions A valid rejection region R N N must satisfy (i) Under H, P((T A, T B ) R) α (ii) If (t 1, t ) R then A = {(s 1, s ) N N : s 1 t 1, t } R A R Condition (i) ensures level α control. Condition (ii) prevents implausible results and ensures larger power for larger effects. Reject H if (T A, T B ) R

Bonferroni rejection region for the example Bonferroni rejection region..4. 1.4.1.4.1 1.4..4. 1 1 14 1.1.1.1. 1...14.1..1.4 11.4.6.69.9.6.4..1. 1.4. 1.46.. 1.4.6.1.1 1.4 T B 9..6 1.46 4.11 6.4.79..9.14.1..69. 6.4.41 6.4..69..1.4 7.1.14.9..79 6.4 4.11 1.46.6..1 6.1.1.6 1.4.. 1.46..4 1.4.1..4.6.9.69.6.4. 4.1..1.14...4.1.1.1. 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level =.99 %; Area = 4

Aims: Use joint distribution to... Find rejection region with maximal power for a specified alternative Also find rejection regions with Maximal alpha exploitation Maximal area Study special cases of exact p-value combination tests C = min(p 1, p ) C = p 1 p Reject H if c greater or equal the α quantile of the exact conditional null-distribution of C. Calculate power for Bonferroni related tests Bonferroni Tarone test: Test at level α/m. m is the number of tests that can become significant at level α/m. Choose m as large as possible. May add unused alpha to some individual tests.

Optimal rejection regions An optimal rejection region R can be found as solution of P((T A, T B ) R) max, under a specified alternative (maximal power) subject to conditions (i) and (ii) Similarly we can optimize either P((T A, T B ) R) max, under H (maximal α) or R max (maximal area) This is a binary optimization problem, since each point (t 1, t ) is either element of R or not element of R.

Optimization preprocessing Size of search space is of order n, preprocessing useful 1 Exclude points (t 1, t ) : P{(s 1, s ) : s 1 t 1, s t } > α >α? For all points B = (t 1, t ) get largest region A meeting condition (ii) without that point A B If P(A) + P(B) α, B R for sure

Preprocessing result example Preprocessing Excluded: Black, Remaining search space: Red, Included: Green.4. 1..1..1 1...4 1 1 14 1 1.1.1.1.1.4 11..7.9.6..1. 1.4 1.. 1..6.1 1. T B 9.1..7 1.. 4.1 6.4 6.4.4. 6.4..9.7.1.1.1. 7.1.9. 6.4 4.1 1...1 6.1.6 1.. 1..4 1..1..6.9.7.. 4.1.1.1.1.4 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level =.91 %; Area = 16

Actual opimization as binary linear program Solution is of type x = (, 1,,...) Objective function can be power, alpha, area γ i contribution of element x i to the objective function Find R, such that γ T x max, s.t. conditions (i) and (ii) Condition (ii) imposes x i x j for certain pairs i, j Constraint matrix for linear program: 1 1 1 1... 1 α 1 1... 1 1... 1 1... x....... Use LP solver (branch and bound algorithm)

Result: Maximal power rejection region Region with maximal power for specified alternative Succ. probs. EP A, EP B: Trtm..7,.4, Contr..,.. Common correlation...4. 1.4.1.4.1 1.4..4. 1 1 14 1.1.1.1. 1...14.1..1.4 11.4.6.69.9.6.4..1. 1.4. 1.46.. 1.4.6.1.1 1.4 T B 9..6 1.46 4.11 6.4.79..9.14.1..69. 6.4.41 6.4..69..1.4 7.1.14.9..79 6.4 4.11 1.46.6..1 6.1.1.6 1.4.. 1.46..4 1.4.1..4.6.9.69.6.4. 4.1..1.14...4.1.1.1. 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level =.44 %; Area = 4

Maximal alpha rejection region Rejection region with maximal alpha (red area)..4. 1.4.1.4.1 1.4..4. 1 1 14 1.1.1.1. 1...14.1..1.4 11.4.6.69.9.6.4..1. 1.4. 1.46.. 1.4.6.1.1 1.4 T B 9..6 1.46 4.11 6.4.79..9.14.1..69. 6.4.41 6.4..69..1.4 7.1.14.9..79 6.4 4.11 1.46.6..1 6.1.1.6 1.4.. 1.46..4 1.4.1..4.6.9.69.6.4. 4.1..1.14...4.1.1.1. 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level =.46 %; Area =

Maximal area rejection region Region with maximal area..4. 1.4.1.4.1 1.4..4. 1 1 14 1.1.1.1. 1...14.1..1.4 11.4.6.69.9.6.4..1. 1.4. 1.46.. 1.4.6.1.1 1.4 T B 9..6 1.46 4.11 6.4.79..9.14.1..69. 6.4.41 6.4..69..1.4 7.1.14.9..79 6.4 4.11 1.46.6..1 6.1.1.6 1.4.. 1.46..4 1.4.1..4.6.9.69.6.4. 4.1..1.14...4.1.1.1. 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level =.4 %; Area = 7

C = p 1 p rejection region C=p 1 p rejection region..4. 1.4.1.4.1 1.4..4. 1 1 14 1.1.1.1. 1...14.1..1.4 11.4.6.69.9.6.4..1. 1.4. 1.46.. 1.4.6.1.1 1.4 T B 9..6 1.46 4.11 6.4.79..9.14.1..69. 6.4.41 6.4..69..1.4 7.1.14.9..79 6.4 4.11 1.46.6..1 6.1.1.6 1.4.. 1.46..4 1.4.1..4.6.9.69.6.4. 4.1..1.14...4.1.1.1. 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level =.4 %; Area = 7

C = min(p 1, p ) rejection region C=min(p 1,p ) rejection region..4. 1.4.1.4.1 1.4..4. 1 1 14 1.1.1.1. 1...14.1..1.4 11.4.6.69.9.6.4..1. 1.4. 1.46.. 1.4.6.1.1 1.4 T B 9..6 1.46 4.11 6.4.79..9.14.1..69. 6.4.41 6.4..69..1.4 7.1.14.9..79 6.4 4.11 1.46.6..1 6.1.1.6 1.4.. 1.46..4 1.4.1..4.6.9.69.6.4. 4.1..1.14...4.1.1.1. 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level =.99 %; Area = 4

Tarone rejection region Tarone rejection region..4. 1.4.1.4.1 1.4..4. 1 1 14 1.1.1.1. 1...14.1..1.4 11.4.6.69.9.6.4..1. 1.4. 1.46.. 1.4.6.1.1 1.4 T B 9..6 1.46 4.11 6.4.79..9.14.1..69. 6.4.41 6.4..69..1.4 7.1.14.9..79 6.4 4.11 1.46.6..1 6.1.1.6 1.4.. 1.46..4 1.4.1..4.6.9.69.6.4. 4.1..1.14...4.1.1.1. 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level =.99 %; Area = 4

Power of Bonferroni rejection region = % Power for Bonferroni region Succ. probs. EP A, EP B: Trtm..7,.4, Contr..,.. Common correlation..6.9.9 1. 7.41 1.79 1.7 4.61. 1 1 14.1... 1.1.1.9.7.16.9 1.1.11.61 1.7.4 1..4 6.76 11..4.1.19 7.1 4.9 1.1.14 1. 1.1.1.9.6. 1.6 6.9 1.9.19.71 T B 9..1 1.6.71 7.4 7.96 4.1.79..14.7.16.1.4.. 1.7 7.1.7.9.69...6 6..6.11.7.7.1.1. 4 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level =.191 %; Area = 4

Power for maximal alpha region = 6% Power for maximal alpha region Succ. probs. EP A, EP B: Trtm..7,.4, Contr..,.. Common correlation..6.9.9 1. 7.41 1.79 1.7 4.61. 1 1 14.1... 1.1.1.9.7.16.9 1.1.11.61 1.7.4 1..4 6.76 11..4.1.19 7.1 4.9 1.1.14 1. 1.1.1.9.6. 1.6 6.9 1.9.19.71 T B 9..1 1.6.71 7.4 7.96 4.1.79..14.7.16.1.4.. 1.7 7.1.7.9.69...6 6..6.11.7.7.1.1. 4 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level = 6.97 %; Area =

Power for maximal area region = 7% Power for maximal area region Succ. probs. EP A, EP B: Trtm..7,.4, Contr..,.. Common correlation..6.9.9 1. 7.41 1.79 1.7 4.61. 1 1 14.1... 1.1.1.9.7.16.9 1.1.11.61 1.7.4 1..4 6.76 11..4.1.19 7.1 4.9 1.1.14 1. 1.1.1.9.6. 1.6 6.9 1.9.19.71 T B 9..1 1.6.71 7.4 7.96 4.1.79..14.7.16.1.4.. 1.7 7.1.7.9.69...6 6..6.11.7.7.1.1. 4 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level = 7.77 %; Area = 7

Maximal power = 79% Power for maximal power region Succ. probs. EP A, EP B: Trtm..7,.4, Contr..,.. Common correlation..6.9.9 1. 7.41 1.79 1.7 4.61. 1 1 14.1... 1.1.1.9.7.16.9 1.1.11.61 1.7.4 1..4 6.76 11..4.1.19 7.1 4.9 1.1.14 1. 1.1.1.9.6. 1.6 6.9 1.9.19.71 T B 9..1 1.6.71 7.4 7.96 4.1.79..14.7.16.1.4.. 1.7 7.1.7.9.69...6 6..6.11.7.7.1.1. 4 1 1 4 6 7 9 1 11 1 1 14 1 T A Alpha =. %; Level = 79.46 %; Area = 4

Introduction.7.7 Binary endpoints 7..4 Joint4.1 distribution4. Optimal 66. rejection 66. regions 6. 1.6 Power 4.1 Discussion 4.1.... 1.9. 1.7.9.9.4.6....7. 1. 4.6 9.4 46. 4. 4. 41.6 1.7 4.1...7. 6. 9.1 7. 9.7 46.1 46.1 4.1 1.9 4.1.6..7.7 4.1.7 1.4 1. 64. 64. 64. 49.4 4.1 4.1 Exact power (%) for different alternatives n=1 per group Succ. prob. treat. Optimize Combine p-values Bonferroni-related Simplifications Corr. EP A EP B Power Area Alpha p1*p minp Tarone Bonf. Collapse Only A Only B...4..4. 1. 1.4...9.9.7. 76. 67.. 69.9 67. 67.7. 9.7 67..9.7. 7. 1.6 77. 4. 7. 7.1 9.. 67. 1.7.7.7 97.7 9.7 9.4 96. 7.4. 79.6 79.6 67. 67.....4..4. 1. 1.4...9.9..7. 7. 66.6 1. 67.4 67. 67.7. 9. 67..9..7.. 79.1 74.6 1.1 7. 71. 9.1 4. 67. 1.7..7.7 96.4 94. 94. 9. 6. 6.9 7.1 76.7 67. 67. Success probability control =. for both EP For these scenarios, maximal power test is.9 to 7. % points better than next best test Maximizing alpha is not sole the key to good power C = p 1 p and maximal area test similar Tarone and minp very similar Simplifications have least power

Non-inferiority - superiority tests What if we want to show non-inferiority for both EP and superiority in the sense of the global null-hypothesis for some EP? Find optimal rejection region for global superiority test within the rejection region of the non-inferiority test. Problem: At the moment the optimal test is designed for a point null-hypothesis Type-I-error rate control for more general null-hypothesis H : OR 1 1 AND OR 1?

Discussion Exact conditional tests useful for small sample inference Example of two binary endpoints Can get optimal rejection region for a specified alternative Generalizations to more endpoints no problem in principle But optimization run-time may be non-polynomial Level of significance close to nominal level Not sole determinant of good power Behavior under other than point null-hypothesis remains to be analyzed Exact joint distribution can be used to study other tests Maxmial power test can be used as benchmark Further restictions can be implemented, e.g. as suggested for non-inferiority - superiority test

Literature Westfall, P. H., Young, S. S. (199). P value adjustments for multiple tests in multivariate binomial models. Journal of the American Statistical Association, 4(47), 7-76 Tarone, R. E. (199). A modified Bonferroni method for discrete data. Biometrics, 1-. Agresti, A. (199). A survey of exact inference for contingency tables. Statistical Science, 11-1. Agresti, A. (). Categorical data analysis. A John Wiley and Sons, Inc. Publication, Hoboken, New Jersey, USA. Wellek, S. (). Statistical Methods for the Analysis of Two Arm Non Inferiority Trials with Binary Outcomes. Biometrical journal, 47(1), 4-61.

Exact power (%) for different alternatives n=1 per group Succ. prob. treat. Optimize Combine p-values Bonferroni-related Simplifications Corr. EP A EP B Power Area Alpha p1*p minp Tarone Bonf. Collapse Only A Only B... 1.9. 1.7.9.9.4.6...7. 4.1 47. 41. 4.4 4. 4. 41.6. 4.1..7. 69.1 6. 61. 6. 46.6 46.6 4.7 4. 4.1.6.7.7 7..4 4.1 4. 66. 66. 6. 1.6 4.1 4.1.... 1.9. 1.7.9.9.4.6....7. 1. 4.6 9.4 46. 4. 4. 41.6 1.7 4.1...7. 6. 9.1 7. 9.7 46.1 46.1 4.1 1.9 4.1.6..7.7 4.1.7 1.4 1. 64. 64. 64. 49.4 4.1 4.1 n=1 per group Succ. prob. treat. Optimize Combine p-values Bonferroni-related Simplifications Corr. EP A EP B Power Area Alpha p1*p minp Tarone Bonf. Collapse Only A Only B...4..4. 1. 1.4...9.9.7. 76. 67.. 69.9 67. 67.7. 9.7 67..9.7. 7. 1.6 77. 4. 7. 7.1 9.. 67. 1.7.7.7 97.7 9.7 9.4 96. 7.4. 79.6 79.6 67. 67.....4..4. 1. 1.4...9.9

Non-inferiority - superiority maximal area non inf. (green), maximal area (red) T A T B 4 6 1 1 14 16 1 4 6 4 6 7 9 1 11 1 1 14 1 16 17 1 19 1 4 6 7 9..1.1...1.1..1..4...4..1..1.4.7 1. 1.4 1...4.1 6... 1..6. 1..9.4.1 1.1.. 1.4.6.6. 1....1 17.9.1.4 1... 4.4.. 1..4.1.4.1.. 1...6.6 1.4.. 17.9.1.4.9 1...6 1... 1.1.1.4. 1. 1.4 1..7.4.1 6..1..4...4..1..1.1...1.1..... 6. 1.1 17.9.4 17.9 1.1 6.... 1 Alpha =. %; Level =.4 % Alpha =. %; Level =.4 %; Area = 1

Non-inferiority - superiority C = p 1 p non inf. (green), combination C=(p1p) (red) T B 9 7 6 4 1 19 1 17 16 1 14 1 1 11 1 9 7 6 4...6.1.17.9.417.91.16... 1...1.1...1.1.1..4...4..1.1.4.1.1.41..7.4.1.1.4.91...6 1....1..1...6.61.4...1.41...4.4..1..4.1..1.4.6.6. 1...1..1..6.1..9.4.1.1.4.71.1.41...4.1.1..4...4..1.1.1...1.1. 6. 1.1 17.9.4 17.9 1.1 6.... 4 6 1 1 14 16 1 4 6 Alpha =. %; Level = 1.9 %; Area = 11 T A