Estimation in Flexible Adaptive Designs

Similar documents
Adaptive Dunnett Tests for Treatment Selection

Adaptive Treatment Selection with Survival Endpoints

Adaptive, graph based multiple testing procedures and a uniform improvement of Bonferroni type tests.

Finding Critical Values with Prefixed Early. Stopping Boundaries and Controlled Type I. Error for A Two-Stage Adaptive Design

Repeated confidence intervals for adaptive group sequential trials

Type I error rate control in adaptive designs for confirmatory clinical trials with treatment selection at interim

Adaptive clinical trials with subgroup selection

Superchain Procedures in Clinical Trials. George Kordzakhia FDA, CDER, Office of Biostatistics Alex Dmitrienko Quintiles Innovation

Confidence intervals and point estimates for adaptive group sequential trials

Testing a secondary endpoint after a group sequential test. Chris Jennison. 9th Annual Adaptive Designs in Clinical Trials

Precision of maximum likelihood estimation in adaptive designs

SAMPLE SIZE RE-ESTIMATION FOR ADAPTIVE SEQUENTIAL DESIGN IN CLINICAL TRIALS

Adaptive designs beyond p-value combination methods. Ekkehard Glimm, Novartis Pharma EAST user group meeting Basel, 31 May 2013

Exact Inference for Adaptive Group Sequential Designs

Adaptive Survival Trials

The Design of Group Sequential Clinical Trials that Test Multiple Endpoints

Group sequential designs for Clinical Trials with multiple treatment arms

Pubh 8482: Sequential Analysis

A Type of Sample Size Planning for Mean Comparison in Clinical Trials

Adaptive Designs: Why, How and When?

Two-Phase, Three-Stage Adaptive Designs in Clinical Trials

Comparing Adaptive Designs and the. Classical Group Sequential Approach. to Clinical Trial Design

Are Flexible Designs Sound?

ADAPTIVE SEAMLESS DESIGNS: SELECTION AND PROSPECTIVE TESTING OF HYPOTHESES

Maximum type 1 error rate inflation in multiarmed clinical trials with adaptive interim sample size modifications

Pubh 8482: Sequential Analysis

AN INVESTIGATION OF TWO-STAGE TESTS. Marc Vandemeulebroecke. Otto-von-Guericke-Universität Magdeburg. and. Schering AG

Multiple Testing in Group Sequential Clinical Trials

Group Sequential Designs: Theory, Computation and Optimisation

Adaptive Extensions of a Two-Stage Group Sequential Procedure for Testing a Primary and a Secondary Endpoint (II): Sample Size Re-estimation

Overrunning in Clinical Trials: a Methodological Review

Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming. and Optimal Stopping

Sequential/Adaptive Benchmarking

Sample size re-estimation in clinical trials. Dealing with those unknowns. Chris Jennison. University of Kyoto, January 2018

Power assessment in group sequential design with multiple biomarker subgroups for multiplicity problem

Generalized Likelihood Ratio Statistics and Uncertainty Adjustments in Efficient Adaptive Design of Clinical Trials

The Limb-Leaf Design:

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping


ROI analysis of pharmafmri data: an adaptive approach for global testing

clinical trials Abstract Multi-arm multi-stage (MAMS) trials can improve the efficiency of the drug

Designing multi-arm multi-stage clinical trials using a risk-benefit criterion for treatment selection

Group Sequential Tests for Delayed Responses

Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs. Christopher Jennison

Pubh 8482: Sequential Analysis

A Mixture Gatekeeping Procedure Based on the Hommel Test for Clinical Trial Applications

4. Issues in Trial Monitoring

On the efficiency of two-stage adaptive designs

arxiv: v2 [stat.me] 24 Aug 2018

Bios 6649: Clinical Trials - Statistical Design and Monitoring

A Gatekeeping Test on a Primary and a Secondary Endpoint in a Group Sequential Design with Multiple Interim Looks

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45

Adaptive Trial Designs

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

The Design of a Survival Study

Monitoring clinical trial outcomes with delayed response: Incorporating pipeline data in group sequential and adaptive designs. Christopher Jennison

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

Pubh 8482: Sequential Analysis

6 Sample Size Calculations

Parametric Techniques Lecture 3

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Review. December 4 th, Review

ROI ANALYSIS OF PHARMAFMRI DATA:

A Gate-keeping Approach for Selecting Adaptive Interventions under General SMART Designs

Parametric Techniques

Group Sequential Trial with a Biomarker Subpopulation

Optimal rejection regions for testing multiple binary endpoints in small samples

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

An Adaptive Futility Monitoring Method with Time-Varying Conditional Power Boundary

Mixtures of multiple testing procedures for gatekeeping applications in clinical trials

Statistical Aspects of Futility Analyses. Kevin J Carroll. nd 2013

Sequential Monitoring of Clinical Trials Session 4 - Bayesian Evaluation of Group Sequential Designs

Testing hypotheses in order

CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity

In Defence of Score Intervals for Proportions and their Differences

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH

Duke University. Duke Biostatistics and Bioinformatics (B&B) Working Paper Series. Randomized Phase II Clinical Trials using Fisher s Exact Test

Group sequential designs for negative binomial outcomes

Spring 2012 Math 541B Exam 1

Group sequential designs with negative binomial data

Lecture 3. Inference about multivariate normal distribution

Blinded sample size reestimation with count data

Group-Sequential Tests for One Proportion in a Fleming Design

On Generalized Fixed Sequence Procedures for Controlling the FWER

Some Notes on Blinded Sample Size Re-Estimation

Basic Concepts of Inference

A simulation study for comparing testing statistics in response-adaptive randomization

Inverse Sampling for McNemar s Test

Bayesian concept for combined Phase 2a/b trials

Practice Problems Section Problems

Testing a Primary and a Secondary Endpoint in a Confirmatory Group Sequential Clinical Trial

BIOS 312: Precision of Statistical Inference

Use of frequentist and Bayesian approaches for extrapolating from adult efficacy data to design and interpret confirmatory trials in children

Confidence Intervals for a Ratio of Binomial Proportions Based on Unbiased Estimators

Reports of the Institute of Biostatistics

Confidence Distribution

Bayesian Optimal Interval Design for Phase I Clinical Trials

Bias Variance Trade-off

An extrapolation framework to specify requirements for drug development in children

Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood

Transcription:

Estimation in Flexible Adaptive Designs Werner Brannath Section of Medical Statistics Core Unit for Medical Statistics and Informatics Medical University of Vienna BBS and EFSPI Scientific Seminar on Adaptive Designs in Drug Development Basel, June 2007

Contents Review of Flexible Two Stage Tests Repeated Confidence Intervals Point Estimation Remarks on Seamless Phase II/III designs Summary 2

Review of Flexible Two Stage Tests 3

Flexible designs Flexible designs allow for mid-trial design modifications based on all internal and external information gathered at interim analyses without compromising the type I error rate. For a control of the type I error rate, the design modifications need not be specified in advance. Examples for mid-trial design modifications: Adaptation of the sample sizes, dropping (or adding) study doses, adapting the number of interim analyses, decision boundaries, test statistics, the endpoints, the study goal (non-inferiority and superiority), the multiple testing strategy,.... 4

Pre-specified Adaptivity versus Flexibility Pre-specified adaptivity = adapting design parameters according to a pre-specified adaptation rule Aims: Increasing efficiency by optimizing specific cost functions. Examples: Group sequential trials, play-the-winner allocation rules,... Flexibility ( unscheduled adaptivity) = adapting design parameters without a (complete) specification of the adaptation rule 5

Flexibility Aims of flexibility: Dealing with the unexpected. Dealing with the expected unpredictability of clinical trials. Improving the quality of the decision process as a whole in an environment where the parameter assumptions and also the weighting of gains and costs are unclear a priori and can change in the course of the trial. 6

Flexible Two Stage Tests Step-wise procedure Consist of two sequential stages: Stage 1 (e.g. Phase II part) and Stage 2 (e.g. Phase III part) Stage 1 and Stage 2 data are from two independent cohorts. Adaptivity The design of Stage 2 (sample sizes, statistical test,... ) can be chosen based on the data of Stage 1 as well as any other internal or external information. Flexibility For a control of the type I error rate, one need not pre-specify how the Stage 1 data determine the design of Stage 2. 7

Flexible two stage combination tests Notation: p and q the p-values from stage 1 and 2 for H 0 : θ 0 versus H 1 : θ > 0 p and q are independent under H 0. Two stage combination test: Prefix a monotone combination function C(p, q) and rejection bounds c and α 1. We reject H 0 if either p α 1 (stage 1) or C(p, q) c (stage 2) Level condition: We must prefix α 1, C(p, q) and c such that P 0 ({p α 1 } {C(p, q) c}) = α 8

Examples for combination functions Fisher s product test: C(p, q) = p q Inverse normal method: C(p, q) = w 1 Φ 1 (p) + w 2 Φ 1 (q), w1 2 + w 2 2 = 1 Gives a two stage GSD with information times t 1 t 2 if w 1 /w 2 = t 1 /(t 2 t 1 ) and no adaptations are done. 9

Estimation Corresponding methods to estimate the size of the treatment effect and to provide confidence intervals with pre-specified coverage probability are additional requirements. Reflection paper on flexible designs (Draft), EMEA 2006 10

Bias of conventional estimates Unblinded sample size adaptations may lead to (mean) biased estimates and invalid confidence intervals. Sample size adaptation rule unkown bias and coverage probabilities unknown. 11

Confidence intervals for flexible designs Duality between confidence sets and significance tests: confidence set = set of values where significance test accepts Flexible confidence interval: Use flexible tests at level α for all parameter values Flexible confidence intervals have coverage probability 1 α independently of the adaptation rule. Overall p-values can be constructed in a similar way. 12

Confidence intervals for flexible designs accounting for stopping rules Two possible approaches: Repeated confidence interval approach: is very simple; flexibility also with regard to stopping rule; inevitable price is strict conservatism. Exact confidence interval via stage wise ordering: is more complicated; no flexibility with regard to stopping rule; exact coverage probability. Exact confidence interval at level 0.5 gives median unbiased point estimate which lies in the interior of the exact 95%-confidence interval. 13

Confidence intervals for flexible designs accounting for stopping rules Two possible approaches: Repeated confidence interval approach: is very simple; flexibility also with regard to stopping rule; inevitable price is strict conservatism. Exact confidence interval via stage wise ordering: is more complicated; no flexibility with regard to stopping rule; exact coverage probability. Exact confidence interval at level 0.5 gives median unbiased point estimate which lies in the interior of the exact 95%-confidence interval. 14

Chapter 2: Repeated Confidence Intervals (LEHMACHER & WASSMER, 1999; BRANNATH ET AL. 2002, LAWRENCE & HUNG, 2003; PROSCHAN ET AL., 2003) 15

Repeated confidence intervals Duality between hypothesis tests and confidence sets: p stage 1 and q stage 2 p-values for H 0, : θ. p and q independent under H 0, and increasing in. Two stage combination test for H 0, : Use for all the same combination test. We reject H 0, if either p α 1 (stage 1) or C(p, q ) c (stage 2) Remark: The rule p α 1 should not be understood as a stopping rule, but as rejection rule which we apply at stage 1. 16

Lower repeated confidence bounds Stage 1: Solve the equation p = α 1 δ 1 such that p α 1 δ 1 (δ 1, ) one-sided confidence interval at first stage. Stage 2: Solve C(p, q ) = c δ 2 such that C(p, q ) c δ 2 (δ 2, ) one-sided confidence interval at second stage. 17

Example I Primary efficacy end point: Infarct size measured by the cumulative release of α-hdbh within 72 hours after administration of the drug (area under the curve, AUC). θ the mean α-hdbh AUC difference between control c and treatment t, H 0 : θ 0 vs. H 1 : θ > 0 Inverse normal combination test: C(p, q) = 0.5 Φ 1 (p) + 0.5 Φ 1 (q) O Brien and Flemming at one sided level α = 0.025 α 1 = 0.0026, c = 0.024. 18

Example I (cont.) Stage 1: sample sizes: n 1c = 88, n 1t = 91, standard deviation: ˆσ 1c = 26.0, ˆσ 1t = 22.5 treatment difference: ˆθ 1 = 4.0, σˆθ 1 = 3.64 p according to t-test for H 0 : θ =. Solving p = 0.0026 classical CI at level 0.0026 δ 1 = ˆθ 1 t ν,0.9974 σˆθ 1 = 6.3 First stage confidence interval is ( 6.3, ) 19

Example I (cont.) Stage 2: sample sizes n 2c = 322, n 2t = 321, standard deviations: ˆσ 2c = 26.1, ˆσ 2t = 28.5, treatment difference: ˆθ 2 = 4.8, σˆθ 2 = 2.16 q according to t-test for H 0 : θ = from second stage data. Solving C(p, q ) = 0.024 numerically δ 2 = 0.71 Second stage confidence interval is (0.71, ) 20

Example I (cont.): Determination of δ 2 C(p( ), q( )) 0.00 0.04 0.08 2 1 0 1 2 δ 2 c 21

Properties of repeated confidence bounds One need not pre-specify the adaptation and stopping rule to keep the nominal coverage probability. Price for the flexibility with regard to stopping rule is strict conservatism: we must control the level for the worst case rule, also when actually not following this rule. H 0 is rejected with the combination test iff δ L > 0. The first stage bound δ 1 is the classical confidence bound at level α 1. 22

Normal approximations and inverse normal method (Lehmacher and Wassmer, 1999) If the stage wise estimates ˆθ i (i = 1, 2) for the treatment effect are asymptotically independent and normal with mean treatment effect and variance σ 2ˆθi = I 1 1, then p( ) = 1 Φ( I 1 (ˆθ 1 )) and q( ) = 1 Φ( I 2 (ˆθ 2 )) are approximate independent p-values for H 0, : θ. Inverse normal combination function δ 2 = θ Φ 1 (1 c) w w 1 I 1 + w 2 I, θw = w 1 I1 ˆθ 1 + w 2 I 2 ˆθ 2 1 w 1 I 1 + w 2 I 2 23

Example I with normal approximation Stage 1: ˆθ1 = 4.0, I 1 = 0.076 δ 1 = 4.0 Φ 1 (0.9974) I 1 = 6.2 (before 6.3) Stage 2: ˆθ 2 = 4.8, I 2 = 0.215, w 1 = w 2 = 0.5 θ w = I1 ˆθ 1 + I 2 ˆθ 2 I1 + I 2 = 4.5 δ 1 = 4.5 Φ 1 (1 0.024) ( I 1 + I 2 ) = 0.70 (before 0.71) 0.5 24

Extensions Repeated confidence intervals can be extended to multistage flexible designs, and can be computed even after adapting the number of interim looks ( LEHMACHER AND WASSMER, 1999; BRANNATH ET AL., 2002) One can incorporate a futility boundary into the dual combination tests. However, one must carefully account for the futility bound in the determination of δ 2 : One must accept all for which stage 1 p-value p falls into stage 1 acceptance region even if the second stage data suggest rejection of H 0,. 25

Extensions We could use different α 1 and c for different, however, one must be careful in our choice in α 1 and c in order to get nested dual rejection regions. (BRANNATH ET AL. 2003) Exact confidence intervals and median unbiased point estimates are available via the stage wise ordering. (BRANNATH ET AL. 2002) Confidence intervals and point estimates for adaptive GSD s following the principle of Müller and Schäfer have been derived only recently. (METHA, BRANNATH, POSCH AND BAUER 2006; SUBMITTED) 26

Two-sided tests and two-sided confidence intervals at level 2α One should not perform combination tests with two-sided p-values for H 0, : θ = : Interpretation problem if the first and the second stage estimates point in conflictive directions. Solution: Intersection of two one-sided combination tests and corresponding repeated confidence intervals (one lower and one upper) each at level α. 27

Two-sided confidence intervals at level 2α At the first stage we get the classical two-sided interval at level 2α 1. With the normal approximation and normal inverse method we get at the second stage the interval ( θ w Φ 1 (1 c) w 1 I 1 + w 2 I, θ Φ 1 (1 c) w + 2 w 1 I 1 + w 2 I ) 2 with θ w = w 1 I 1 ˆθ 1 + w 2 I 2 ˆθ 2 w 1 I 1 + w 2 I 2 28

Confidence intervals for the conditional error function approach Conditional error function approach: Prefix a decreasing conditional error function A(x) and first stage rejection level α 1. Reject H 0 if p α 1 (stage 1) or q A(p) (stage 2). Equivalent combination test (POSCH & BAUER 1999, WASSMER 1999): e.g. α 1, C(p, q) = q A(p), and c = 0 One can use the same estimation methods as for combination tests 29

Chapter 3: Point Estimation 30

Maximum likelihood estimate (MLE) Assuming normal data and balanced treatment groups the MLE can be written as ˆθ mle = I 1 I 1 + I 2 ˆθ 1 + I 2 I 1 + I 2 ˆθ 2 (for small effect sizes approximatively also in other cases) Mean Bias: E (ˆθ mle ) = Cov ( I 1 I 1 +I 2, ˆθ 1 ) (Liu et al. 2002) One can show that always: E (ˆθ mle ) 0.4 σ/ n 1 Variance also depends on (unknown) adaptation/selection rule 31

Maximum likelihood estimate (MLE) Mean bias of MLE for typical examples (qualitatively): Stopping with early rejection: the larger the effect size the smaller the sample size positive mean bias. Stopping for futility: the smaller the effect size the smaller the sample size negative mean bias. Conditional or predictive power control: the smaller the effect size the larger the sample size positive mean bias. Selecting promising treatments: the larger the effect size the larger the sample size negative mean bias. 32

Weighted maximum likelihood estimate Center of a two sided repeated confidence interval: θ w = w 1 I 1 ˆθ 1 + w 2 I 2 ˆθ 2 w 1 I 1 + w 2 I 2 where w 1, w 2 0, w 2 1 + w 2 2 = 1 are the pre-specified weights. Properties: If recruitment is stopped at stage 1 then θ w = ˆθ 1. If recruitment is never stopped at the interim analysis, then ˆθ w is median unbiased, i.e., ˆθ w has median. Median of ˆθ w close to also if recruitment can be stopped at interim analysis. 33

Cases for which the estimates are similar The two estimates are equal or differ only slightly if recruitment is stopped at the interim analysis; recruitment is not stopped at the interim analysis, and or the first and second stage estimates are similar, ˆθ1 ˆθ 2 ; the sample sizes are (almost) as pre-planned: I1 /I 2 w 1 /w 2 = t 1 /(t 2 t 1 ) 34

Numerical example 80% - Predictive power rule, truncated 0.1 I 1 I 2 5 I 1. θ w with w1 2 = 0.5 2 ˆθ 1 first stage mean diffenrence horizontal axis: λ 1 = 1 0 I 1 θ -2-1 0 1 2 3 4 5 4 3 Mean Bias 1 0.8 0.6 0.4 0.2 0-0.2 ˆ θ ˆ θw ˆ θ mle 1 è!!!!!!!!! MSE 1 0.8 0.6 0.4 0.2 0-0.2-2 -1 0 1 2 3 4 λ 1-2 -1 0 1 2 3 4 λ 1 35

Remarks on Seamless Phase II/III designs Start with a number of treatments (doses) and select treatments and reassess sample sizes at an adaptive interim analysis. Selection rule is typically not fully pre-specified. Flexible closed tests provide strong FWER control. Repeated confidence interval approach provides univariate confidence intervals (no multiplicity adjustment). Simultaneous confidence intervals which are consistent with the (multiple) test result may not be available. Mean or median unbiased estimates are currently not available. Selection bias can be an additional issue (ongoing research and discussion). 36

Summary Univariate confidence intervals are, in general, available for flexible adaptive designs. Using the normal approximation of stage wise estimates and the inverse normal combination function, we get explicit (and intuitive) formula for the confidence bounds. Maximum likelihood estimate is biased, however, seems to perform well in terms of the mean square error. The weighted maximum likelihood estimate is, in general, less biased and median unbiased in the case of an administrative interim look. With a stopping rule a median unbiased estimate can be obtained via the stage wise ordering. 37

Selected literature Brannath, König, Bauer (2006). Estimation in flexible two stage designs, Statistics in Medicine 25: 3366 3381. Posch, Koenig, Branson, Brannath, Dunger-Baldauf, Bauer (2005). Testing and estimation in flexible group sequential designs with adaptive treatment selection, Statistics in Medicine 24: 3697 3714. Brannath, Maurer, Posch and Bauer (2003). Sequential tests for non-inferiority and superiority. Biometrics 59:106 114. Brannath, König and Bauer (2003). Improved repeated confidence bounds in trials with a maximal goal. Biometrical Journal 45:311 324. Proschan, Liu, Hunsberger (2003). Practical midcourse sample size modification in clinical trials, Controlled clinical trials 24:4 15. Lawrence and Hung (2003). Estimation and confidence intervals after adjusting the maximum information, Biometrical Journal 45:143 152. Liu, Proschan and Pledger (2002). A unified theory of two-stage adaptive designs, JASA 97:1034 1041. Brannath, Posch and Bauer (2002). Recursive combination tests. JASA 97:236 244. Lehmacher and Wassmer (1999). Adaptive sample size calculations in group sequential trials. Biometrics 55:1286 1290. 38