Adaptive clinical trials with subgroup selection


Nigel Stallard (1), Tim Friede (2), Nick Parsons (1)
(1) Warwick Medical School, University of Warwick, Coventry, UK
(2) University Medical Center Göttingen, Göttingen, Germany
n.stallard@warwick.ac.uk
EFSPI Statistical Meeting on Subgroup Analyses, 30 November 2012

Setting
- Definitive comparison of experimental treatment with control
- Single predefined subgroup of interest
- Research questions:
  - Is the treatment effective in the whole population? Test $H_0^{\{F\}}$
  - Is the treatment effective in the subgroup? Test $H_0^{\{S\}}$
- Confirmatory study: we wish to control the risk of any false positive result
(Wang et al 2007, Song & Chi 2007, Brannath et al 2009, Spiessens & Debois 2010, Jenkins et al 2011)

Adaptive design idea
- We are testing hypotheses in the full population and in the subgroup
- If we knew we could focus on the subgroup, we could recruit only subgroup patients
- We need to adjust for multiple hypotheses to control the error rate; this makes it harder to show efficacy
- If we knew which hypothesis to test, we could focus on that one
- Adaptive design:
  - conduct an interim analysis partway through the trial
  - use the interim data to guide recruitment and hypothesis testing in the remainder of the trial

Adaptive design details
- Two-stage trial
- Stage 1: recruit $n_1$ patients per group from the full population
- On the basis of the interim data, decide to continue to
  (i) test $H_0^{\{F\}}$ and $H_0^{\{S\}}$ at the final analysis
  (ii) test $H_0^{\{F\}}$ only at the final analysis
  (iii) test $H_0^{\{S\}}$ only at the final analysis
- The decision may use interim data on a short-term endpoint or external information

Adaptive design details
- Stage 2: recruit a further $n_2$ patients per group
  - If testing $H_0^{\{F\}}$ and $H_0^{\{S\}}$, or $H_0^{\{F\}}$ alone: continue to recruit from the full population
  - If testing $H_0^{\{S\}}$ alone: recruit from the subgroup only (possibly with enrichment)
- Final analysis:
  - Test the selected hypothesis or hypotheses
  - Control the familywise error rate, allowing for hypothesis selection
(Song & Chi 2007, Brannath et al 2009, Jenkins et al 2011)

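Hypothesis testing
- Test $H_0^{\{F\}}$ (full population) and $H_0^{\{S\}}$ (subpopulation)
- Require strong control of the familywise error rate (FWER):
  $\Pr(\text{erroneously reject either } H_0^{\{F\}} \text{ or } H_0^{\{S\}}) \le \alpha$
  under $H_0^{\{F\}}$, or $H_0^{\{S\}}$, or $H_0^{\{F\}} \cap H_0^{\{S\}}$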

Adaptive design ingredients
We need:
- a hypothesis testing approach that strongly controls the FWER, allowing for multiple hypotheses
- a method for combining the evidence from the two stages
- a decision rule for deciding which hypothesis or hypotheses to consider in stage 2

Test statistics
- Test $H_0^{\{F\}}$ (full population) and $H_0^{\{S\}}$ (subpopulation)
- At stage $i$, let
  - $p_i^{\{F\}}$ be the p-value for testing $H_0^{\{F\}}$ (full population)
  - $p_i^{\{S\}}$ be the p-value for testing $H_0^{\{S\}}$ (subpopulation)
  - $Z_i^{\{F\}} = \Phi^{-1}(1 - p_i^{\{F\}})$ be the test statistic for testing $H_0^{\{F\}}$ (full population)
  - $Z_i^{\{S\}} = \Phi^{-1}(1 - p_i^{\{S\}})$ be the test statistic for testing $H_0^{\{S\}}$ (subpopulation)
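The mapping between one-sided p-values and normal test statistics can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names `p_to_z` and `z_to_p` are hypothetical and do not come from the slides.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
_STD_NORMAL = NormalDist()


def p_to_z(p: float) -> float:
    """Return Z = Phi^{-1}(1 - p) for a one-sided p-value p in (0, 1)."""
    return _STD_NORMAL.inv_cdf(1.0 - p)


def z_to_p(z: float) -> float:
    """Return the one-sided p-value p = 1 - Phi(z)."""
    return 1.0 - _STD_NORMAL.cdf(z)
```

For example, `p_to_z(0.025)` is approximately 1.96, and applying `z_to_p` to the result recovers the original p-value.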

Closed testing procedure
- Extend the family of hypotheses to include $H_0^{\{F\}}$, $H_0^{\{S\}}$ and $H_0^{\{F,S\}} = H_0^{\{F\}} \cap H_0^{\{S\}}$
- Perform a local level $\alpha$ test of each hypothesis
- Reject $H_0^{\{F\}}$ if and only if $H_0^{\{F\}}$ and $H_0^{\{F,S\}}$ are both locally rejected
- Reject $H_0^{\{S\}}$ if and only if $H_0^{\{S\}}$ and $H_0^{\{F,S\}}$ are both locally rejected
- Controls the FWER in the strong sense
- Requires a level $\alpha$ test of the intersection hypothesis $H_0^{\{F,S\}}$
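The closed testing rule for two hypotheses can be sketched in a few lines. This is an illustrative sketch only: the function name `closed_test` is hypothetical, and the default one-sided level of 0.025 is an assumption, since the slides do not state a value for α. The intersection p-value is taken as given; any level-α test of the intersection hypothesis can supply it.

```python
def closed_test(p_full: float, p_sub: float, p_inter: float,
                alpha: float = 0.025) -> tuple[bool, bool]:
    """Closed testing for the full-population and subgroup hypotheses.

    p_full  : p-value for the full-population hypothesis
    p_sub   : p-value for the subgroup hypothesis
    p_inter : p-value for the intersection hypothesis
    alpha   : local significance level (0.025 is an assumed default)

    Returns (reject_full, reject_sub).
    """
    # An elementary hypothesis is rejected only if the intersection
    # hypothesis is also rejected at the same local level.
    reject_inter = p_inter <= alpha
    reject_full = reject_inter and p_full <= alpha
    reject_sub = reject_inter and p_sub <= alpha
    return reject_full, reject_sub
```

For instance, `closed_test(0.01, 0.20, 0.02)` rejects only the full-population hypothesis, while `closed_test(0.01, 0.01, 0.03)` rejects nothing because the intersection test fails.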

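Tests of the intersection hypothesis
- At stage $i$, base the test of $H_0^{\{F,S\}}$ on the p-value $p_i^{\{F,S\}}$
- Bonferroni test: $p_i^{\{F,S\}} = 2 \min\{p_i^{\{F\}}, p_i^{\{S\}}\}$
- Simes test: $p_i^{\{F,S\}} = \min\{2 \min\{p_i^{\{F\}}, p_i^{\{S\}}\}, \max\{p_i^{\{F\}}, p_i^{\{S\}}\}\}$ (Simes 1986)
- These tests ignore the correlation between $p_i^{\{F\}}$ and $p_i^{\{S\}}$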
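The two intersection p-values above can be computed as in the sketch below. The function names are hypothetical. One small addition that is not on the slide: the Bonferroni value is capped at 1 so that it remains a valid p-value.

```python
def bonferroni_p(p_full: float, p_sub: float) -> float:
    """Bonferroni intersection p-value: 2 * min(p_full, p_sub), capped at 1."""
    return min(1.0, 2.0 * min(p_full, p_sub))


def simes_p(p_full: float, p_sub: float) -> float:
    """Simes intersection p-value for two hypotheses."""
    return min(2.0 * min(p_full, p_sub), max(p_full, p_sub))
```

With p-values 0.03 and 0.04 the Bonferroni value is 0.06 and the Simes value is 0.04, which illustrates that the Simes p-value is never larger than the Bonferroni one.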

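Tests of the intersection hypothesis
- Spiessens and Debois test
- Under $H_0^{\{F\}} \cap H_0^{\{S\}}$,
  $$\begin{pmatrix} Z_i^{\{F\}} \\ Z_i^{\{S\}} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \sqrt{\tau} \\ \sqrt{\tau} & 1 \end{pmatrix} \right)$$
  where $\tau$ is the proportion of patients in the subgroup
- Obtain $p_i^{\{F,S\}}$ from the distribution of $\max\{Z_i^{\{F\}}, Z_i^{\{S\}}\}$
(Spiessens and Debois 2010)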
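A p-value based on the maximum of two correlated normal statistics can be computed numerically, as in the following sketch. It assumes the correlation between the full-population and subgroup statistics is $\sqrt{\tau}$, the usual value when the subgroup makes up a proportion $\tau$ of the full-population sample. The function names are hypothetical, and the bivariate normal probability is obtained by Simpson's rule rather than by any particular library routine.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()


def bvn_cdf_equal(m: float, rho: float, steps: int = 2000) -> float:
    """P(X <= m, Y <= m) for a standard bivariate normal with correlation rho.

    Uses P = integral over x <= m of phi(x) * Phi((m - rho*x) / sqrt(1 - rho^2)),
    evaluated by Simpson's rule on [-8, m]. Requires -1 < rho < 1 and even steps.
    """
    lower = -8.0  # mass below -8 is negligible
    if m <= lower:
        return 0.0
    h = (m - lower) / steps
    s = sqrt(1.0 - rho * rho)

    def f(x: float) -> float:
        return _N.pdf(x) * _N.cdf((m - rho * x) / s)

    total = f(lower) + f(m)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(lower + k * h)
    return total * h / 3.0


def spiessens_debois_p(z_full: float, z_sub: float, tau: float) -> float:
    """Intersection p-value P(max(Z_F, Z_S) >= observed max) under the null.

    tau is the subgroup proportion, assumed to satisfy 0 <= tau < 1.
    """
    if not 0.0 <= tau < 1.0:
        raise ValueError("tau must lie in [0, 1)")
    m = max(z_full, z_sub)
    return 1.0 - bvn_cdf_equal(m, sqrt(tau))
```

Because the correlation is non-negative, the resulting p-value lies between the unadjusted p-value of the larger statistic and its Bonferroni-adjusted value, which is why accounting for the correlation can give a small power gain.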

Combining stages 1 and 2
- Combination test
- Obtain an overall p-value for each hypothesis using a pre-specified combination function
- e.g. the inverse normal combination test, based on
  $w_1 \Phi^{-1}(1 - p_1) + w_2 \Phi^{-1}(1 - p_2)$
  for pre-specified $w_1$ and $w_2$ with $w_1^2 + w_2^2 = 1$
- Requirement is that the p-values are independent (or at least that the p-clud condition holds)
(Brannath et al 2002)
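The inverse normal combination of two stage-wise p-values can be sketched as below. The function name is hypothetical, and the default equal weights $w_1 = w_2 = \sqrt{0.5}$ are an assumption that would correspond to equal planned stage sizes; the slides only require that the weights be pre-specified with $w_1^2 + w_2^2 = 1$.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()


def inverse_normal_combination(p1: float, p2: float,
                               w1: float = sqrt(0.5),
                               w2: float = sqrt(0.5)) -> float:
    """Combine independent stage-wise one-sided p-values into one p-value.

    The combined statistic is w1 * Phi^{-1}(1 - p1) + w2 * Phi^{-1}(1 - p2),
    which is standard normal under the null when w1^2 + w2^2 = 1.
    """
    if abs(w1 * w1 + w2 * w2 - 1.0) > 1e-9:
        raise ValueError("weights must satisfy w1^2 + w2^2 = 1")
    z = w1 * _N.inv_cdf(1.0 - p1) + w2 * _N.inv_cdf(1.0 - p2)
    return 1.0 - _N.cdf(z)
```

With equal weights, two stage-wise p-values of 0.05 combine to roughly 0.01. The weights must be fixed before the interim analysis; choosing them after seeing the data would invalidate the null distribution of the combined statistic.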

Combining stages 1 and 2
- Conditional error function (CEF) approach
- Obtain the design assuming both hypotheses are tested
- At the interim analysis, obtain the error rate for the second stage conditional on the stage 1 data
- Can redesign stage 2 so long as the conditional error rate is controlled, e.g. restrict testing to the full population or the subgroup alone
(Müller and Schäfer 2001, Friede et al 2011)
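To illustrate the idea of a conditional error rate, the sketch below computes it for a deliberately simplified case: a single hypothesis tested with a pre-planned weighted two-stage statistic $w_1 Z_1 + w_2 Z_2$ at one-sided level α. This is not the procedure of the cited papers, which work with the multiple testing procedure for both hypotheses; the function name, the equal weights and α = 0.025 are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()


def conditional_error(z1: float, alpha: float = 0.025,
                      w1: float = sqrt(0.5), w2: float = sqrt(0.5)) -> float:
    """Conditional type I error of the pre-planned test, given stage 1.

    The pre-planned test rejects when w1*Z1 + w2*Z2 >= c with c = Phi^{-1}(1 - alpha).
    Given Z1 = z1 and a standard normal Z2 under the null, the probability of
    rejection is 1 - Phi((c - w1*z1) / w2).
    """
    c = _N.inv_cdf(1.0 - alpha)
    return 1.0 - _N.cdf((c - w1 * z1) / w2)
```

Any redesigned second stage whose test has level at most `conditional_error(z1)` keeps the overall type I error rate at α, because the conditional error averages to α over the null distribution of the stage 1 statistic.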

Decision rule


Simulation study
- Model: normally distributed test statistics
- Standardised treatment difference: 0.3 in the subgroup, 0 outside the subgroup
- Subgroup prevalence: 1% to 50%

Simulation study
- Designs:
  - Adaptive designs:
    - CEF test with Spiessens and Debois test
    - combination test with Spiessens and Debois test
    - combination test with Simes test
  - Fixed designs:
    - single study testing both hypotheses
    - two separate studies: second to test the hypothesis selected by the first
- Sample size:
  - $n_1 = 200$, $n_2 = 200$
  - fixed single stage: $n = 400$
  - if enrichment: all $n_2$ from the subgroup if testing $H_0^{\{S\}}$ only
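A rough Monte Carlo sketch of the simplest design in this list, the fixed single-stage study testing both hypotheses, is given below. It is not the authors' simulation code. It assumes unit-variance outcomes, $n$ patients per group, a one-sided level of 0.025, and a closed test with the Simes intersection test; under these assumptions the subgroup statistic has mean $\delta\sqrt{\tau n / 2}$ and the full-population statistic is $\sqrt{\tau}\,Z^{\{S\}} + \sqrt{1-\tau}\,Z^{C}$, where $Z^{C}$ comes from patients outside the subgroup. All names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

_N = NormalDist()


def simulate_fixed_design_power(tau: float, delta_sub: float = 0.3,
                                n_per_group: int = 400, alpha: float = 0.025,
                                n_sims: int = 20000, seed: int = 1) -> float:
    """Estimate P(reject at least one hypothesis) for a single-stage design.

    tau       : subgroup prevalence, 0 < tau <= 1
    delta_sub : standardised treatment difference in the subgroup
                (the effect outside the subgroup is taken to be 0)
    """
    rng = random.Random(seed)
    mean_sub = delta_sub * sqrt(tau * n_per_group / 2.0)
    rejections = 0
    for _ in range(n_sims):
        z_sub = rng.gauss(mean_sub, 1.0)   # subgroup statistic
        z_comp = rng.gauss(0.0, 1.0)       # complement statistic, no effect
        z_full = sqrt(tau) * z_sub + sqrt(1.0 - tau) * z_comp
        p_sub = 1.0 - _N.cdf(z_sub)
        p_full = 1.0 - _N.cdf(z_full)
        # Simes test of the intersection hypothesis.
        p_inter = min(2.0 * min(p_full, p_sub), max(p_full, p_sub))
        # Closed test: reject an elementary hypothesis only if the
        # intersection hypothesis is rejected as well.
        if p_inter <= alpha and (p_full <= alpha or p_sub <= alpha):
            rejections += 1
    return rejections / n_sims
```

Calling the function over a grid of prevalences, for example `[simulate_fixed_design_power(t) for t in (0.1, 0.2, 0.3, 0.4, 0.5)]`, gives a power curve for this one design under the stated assumptions. Setting `delta_sub=0.0` estimates the familywise error rate under the global null.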

Simulation study results: without enrichment
[Figure: power (0.0 to 1.0) against subgroup prevalence (0.0 to 0.5). Adaptive designs: CEF, combination test with Simes, combination test with Spiessens. Fixed designs: separate studies, single study.]

Simulation study results: without enrichment
Key messages:
- Power increases with subgroup size
- Including data from both stages increases power
- Gain from allowing testing of only one hypothesis in the second stage is relatively small
- Adaptive designs are very similar (Simes method slightly less powerful)
[Figure: power (0.0 to 1.0) against subgroup prevalence (0.0 to 0.5).]

Simulation study results: with enrichment
[Figure: power (0.0 to 1.0) against subgroup prevalence (0.0 to 0.5). Adaptive designs: CEF, combination test with Simes, combination test with Spiessens. Fixed designs: separate studies, single study.]

Simulation study results: with enrichment
Key messages:
- Enrichment increases power and reduces the effect of subgroup size
- Single stage design (without enrichment) is less powerful
- Gain from including first stage data is relatively small, particularly for a small subgroup
- Adaptive designs are similar (Simes method slightly less powerful)
[Figure: power (0.0 to 1.0) against subgroup prevalence (0.0 to 0.5).]

Conclusions
- We may want to test a treatment in a subgroup in addition to the full population
- In a confirmatory test we need to control the overall error rate
- An adaptive design allows use of an interim analysis to
  (i) select the subgroup and enable enrichment
  (ii) select the hypothesis or hypotheses for the final analysis
- A number of methods exist to control the overall error rate

Conclusions
- When possible, enrichment increases power considerably
- An adaptive design can increase power by using data from both stages
- Adaptive methods that allow for the correlation between the full-population and subgroup tests are slightly more powerful
- We have assumed a pre-defined subgroup. Allowing for use of interim data to select the subgroup is harder!

References
Brannath et al. Confirmatory adaptive designs with Bayesian decision tools for a targeted therapy in oncology. Stat Med 2009; 28: 1445-63.
Brannath, Posch, Bauer. Recursive combination tests. J Am Stat Assoc 2002; 97: 236-44.
Jenkins, Stone, Jennison. An adaptive seamless phase II/III design for oncology trials with subpopulation selection using correlated survival endpoints. Pharm Stat 2011; 10: 347-56.
Friede, Parsons, Stallard. A conditional error function approach for subgroup selection in adaptive clinical trials. Stat Med. In press.
Müller, Schäfer. Adaptive group sequential designs for clinical trials: Combining the advantages of adaptive and of classical group sequential approaches. Biometrics 2001; 57: 886-91.
Simes. An improved Bonferroni procedure for multiple tests of significance. Biometrika 1986; 73: 751-4.
Song, Chi. A method for testing a prespecified subgroup in clinical trials. Stat Med 2007; 26: 3535-49.
Spiessens, Debois. Adjusted significance levels for subgroup analysis in clinical trials. Cont Clin Trials 2010; 31: 647-56.
Wang, O'Neill, Hung. Approaches to evaluation of treatment effect in randomised clinical trials with genomic subset. Pharm Stat 2007; 6: 227-44.