Issues in Group Sequential/Adaptive Designs


University of Pennsylvania
ScholarlyCommons
Publicly Accessible Penn Dissertations
1-1-2013

Issues in Group Sequential/Adaptive Designs
Hong Wan
University of Pennsylvania, wanhong76@yahoo.com

Recommended Citation
Wan, Hong, "Issues in Group Sequential/Adaptive Designs" (2013). Publicly Accessible Penn Dissertations. 815. http://repository.upenn.edu/edissertations/815

This paper is posted at ScholarlyCommons. http://repository.upenn.edu/edissertations/815

Abstract

In recent years, there has been great interest in the use of adaptive features in clinical trials (i.e., changes in design or analyses guided by examination of the accumulated data at an interim point in the trial) that may make the studies more efficient (e.g., shorter duration, fewer patients). Many statistical methods have been developed to maintain the validity of study results when adaptive designs are used (e.g., control of the Type I error rate). Group sequential designs, which allow early stopping for efficacy in light of compelling evidence of benefit or early stopping for futility when the likelihood of success is low at interim analyses, have been widely used for many years. In this dissertation, we study several aspects of statistical issues in group sequential/adaptive designs.

Sample size re-estimation has drawn a great deal of interest due to its permitting revision of the target treatment difference based on the unblinded interim analysis results from an ongoing trial. A possible risk of unblinded sample size re-estimation is that the exact treatment effect being observed at interim analysis might be back-calculated from the modified sample size, which might jeopardize the integrity of the trial. In the first project, we propose a pre-specified stepwise two-stage sample size adaptation to lessen the information on treatment effect that would be revealed. We minimize expected sample size among a class of these designs and compare efficiency with the fully optimized two-stage design, optimal two-stage group sequential design and designs based on promising conditional power.

In the second project, we define the complete ordering of a group sequential sample space and show that a Wang-Tsiatis boundary family or an exponential spending function family can completely order the sample space. We also propose a simple method to transform a spending function to a completely ordered sample space when using the sequential p-value ordering. This method is also extended to β-spending functions for p-values to reject the alternative hypothesis.

In the third project, we propose a simple approach for controlling the familywise error rate in a group sequential design with multiple testing. We apply sequential p-values at the interim analysis from a group sequential design to the sequentially rejective graphical procedure which is based on the closure principle. We also use simulations to study the operating characteristics of multiple testing in group sequential designs. We show that in terms of expected sample size, using a group sequential design in multiple hypothesis testing is more efficient than fixed sample size designs in many scenarios.

Degree Type: Dissertation
Degree Name: Doctor of Philosophy (PhD)
Graduate Group: Epidemiology & Biostatistics
First Advisor: Susan S. Ellenberg
Subject Categories: Biostatistics

This dissertation is available at ScholarlyCommons: http://repository.upenn.edu/edissertations/815

ISSUES IN GROUP SEQUENTIAL/ADAPTIVE DESIGNS

Hong Wan

A DISSERTATION in Epidemiology and Biostatistics

Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

2013

Supervisor of Dissertation
Susan S. Ellenberg, Ph.D., Professor of Biostatistics

Graduate Group Chairperson
Daniel F. Heitjan, Ph.D., Professor of Biostatistics

Dissertation Committee
Kathleen J. Propert, Sc.D., Professor of Biostatistics
Keaven M. Anderson, Ph.D., Executive Director, Merck Research Lab
David J. Margolis, MD, Ph.D., Professor of Dermatology
Andrea B. Troxel, Sc.D., Professor of Biostatistics

ISSUES IN GROUP SEQUENTIAL/ADAPTIVE DESIGNS
COPYRIGHT 2013 Hong Wan

Acknowledgments

This is really a dream come true. Pursuing a Ph.D. at Penn is probably the biggest project I have done so far. I would like to thank Dr. Susan Ellenberg and Dr. Keaven Anderson for their time, patience, and enthusiastic encouragement in the past five years. I would specially like to thank Dr. Keaven Anderson, who is officially a co-supervisor of this work, for his tremendous effort to guide me through all the challenges in my research. I would like to thank my other committee members Dr. Kathleen Propert, Dr. Andrea Troxel, and Dr. David Margolis for the advice and discussion. I would also like to thank Merck & Co. and Shire for the financial support. Finally, I would like to thank my wife, Yandong, for her support and patience throughout this long process.


Contents

1 Introduction
2 Stepwise two-stage sample size adaptation
  2.1 Introduction
  2.2 A two-stage design with a limited set of stage two sample size possibilities
  2.3 Reparameterizing the design
  2.4 Unrestricted 2-stage designs
  2.5 Results
    2.5.1 Stepwise Adaptive Design Characteristics
    2.5.2 Stepwise Adaptive Design Compares with Designs Based on Promising Conditional Power
  2.6 Discussion
3 Sample Space Ordering and Inference for Group Sequential/Adaptive Designs
  3.1 Introduction
  3.2 Review of Group Sequential Testing
  3.3 Review of Sample Space Ordering
  3.4 Complete Ordering of Sample Space
  3.5 Illustrative Example
  3.6 Sample Space Ordering for β-spending Function
  3.7 Discussion
4 Application of Sequential P-value Methods to Multiplicity Issues for Group Sequential Designs
  4.1 Introduction
  4.2 Methodology
    4.2.1 The closure principle
    4.2.2 Bonferroni-based closed test procedures
    4.2.3 Sequentially rejective graphical procedure
    4.2.4 Our proposal
  4.3 Results
    4.3.1 O'Brien-Fleming-type spending function for both primary and secondary endpoints
    4.3.2 O'Brien-Fleming-type spending function for primary endpoint and Pocock-type spending function for secondary endpoint
  4.4 Discussion
5 Conclusion

List of Tables

2.1 How stage 2 sample size knowledge translates into possible stage 1 results by design type for optimal designs with prior $\theta \sim N(0, (\delta/2)^2)$, 90% power and 5% Type I error, one-sided.

3.1 Sequential Inference for Nosocomial Pneumonia (NP) Study.

3.2 Sequential Inference under Sample Space Ordering by β-spending for Nosocomial Pneumonia (NP) Study.

4.1 Probability $\pi$ for a successful trial, individual power $\pi_i$, and expected sample size for different design options, scenarios and strategies to stop the trial ($\alpha = 0.025$ and $\beta = 0.2$) with 100000 simulations. One primary and one secondary endpoint for each treatment group. O'Brien-Fleming-type spending function for efficacy boundary for all endpoints for case No. 1-17.

4.2 Probability $\pi$ for a successful trial, individual power $\pi_i$ and expected sample size for different design options and scenarios ($\alpha = 0.025$ and $\beta = 0.2$) with 100000 simulations. One primary and one secondary endpoint for each treatment group. O'Brien-Fleming-type spending function for efficacy boundary and Pocock-type spending function for secondary endpoints for case No. 1-17.

List of Figures

2.1 Total sample size $N/N_{fix}$ (top left) and the boundary value (top right) at the second stage for designs optimized for prior $\theta \sim N(\delta/2, (\delta/2)^2)$ with 90% power and 5% Type I error, one-sided; expected sample size (middle left), power (middle right), predictive power (bottom left), probability of maximizing $N$ after first interim analysis (bottom right) over a range of $\theta$ for the design optimized for prior $\theta \sim N(\delta/2, (\delta/2)^2)$.

2.2 Total sample size $N/N_{fix}$ and the boundary value at the second stage for designs optimized for prior $\theta \sim N(0, (\delta/2)^2)$ (top), for prior $\theta \sim N(\delta, (\delta/2)^2)$ (middle), and for prior $\theta \sim N(\delta/2, (2\delta)^2)$ vs. $\theta \sim N(\delta/2, (\delta/2)^2)$ (bottom) with 90% power and 5% Type I error, one-sided.

2.3 Total sample size $N/N_{fix}$ (left) and observed treatment effect at study boundary (right). Stepwise adaptive design and Gao's adaptive designs have $n_1/n_2 = 0.5$, the stepwise adaptive design is optimized for prior $\theta \sim N(\delta/2, (\delta/2)^2)$, and the maximum sample size for Gao's adaptive design can be up to double the size of the sample size for a two-stage group sequential design to have 90% conditional power when the first-stage test statistic falls into the promising zone.

2.4 Power curve (left) and expected sample size (right). Grey line shows the power curve for a stepwise adaptive design which matched the power of Gao's adaptive design at $0.5\delta$.

3.1 Ordering of Sample Space by total Type I error associated with the bound: Pocock design with 5 equally spaced interim analyses.

3.2 Ordering of Sample Space by total Type I error associated with the bound: Power spending function with $\rho = 1$.

3.3 Ordering of Sample Space by total Type I error associated with the bound: O'Brien-Fleming-type spending function.

3.4 Ordering of Sample Space by total Type I error associated with the bound: Exponential Spending Function with $\nu = 0.8$, which approximates O'Brien-Fleming boundary.

3.5 Ordering of Sample Space by total Type I error associated with the bound: Exponential Spending Function with $\nu = 0.2$, which approximates Pocock boundary.

3.6 Boundaries as a function of Type I error: Exponential Spending Function with $\nu = 0.8$, which approximates O'Brien-Fleming boundary.

3.7 Ordering of Sample Space by total Type I error associated with the bound: Power Family with $\rho = 1$ and $\alpha_0 = 0.025$.

3.8 Boundaries as a function of Type I error: Power Family with $\rho = 1$ and $\alpha_0 = 0.025$.

3.9 Boundaries as a function of Type II error: Exponential Spending Function with $\nu = 0.8$. The sample size is fixed as the design with $\alpha = 0.025$ and $\beta = 0.1$.

3.10 Boundaries as a function of Type II error: Power Family with $\rho = 1$ and $\beta_0 = 0.1$. The sample size is fixed as the design with $\alpha = 0.025$ and $\beta = 0.1$.

4.1 Multiple testing strategy for two primary hypotheses $H_1$, $H_2$ and two secondary hypotheses $H_3$, $H_4$.

Chapter 1 Introduction

Clinical trials often take a long time and a lot of resources to conduct. Interim analyses are often performed in clinical trials for ethical and economic reasons. There is an ethical need to ensure that patients are not exposed to unsafe, inferior or ineffective treatments. Early stopping may also allow highly effective medicines to come to market faster for patients who do not have good treatment options. Early completion can also free up resources for studies addressing other pressing medical issues.

In recent years, the potential use of adaptive designs in clinical trials has attracted great interest because of the potential gain in efficiency in drug development processes (e.g., shorter duration, fewer patients). The Pharmaceutical Research and Manufacturers of America (PhRMA) has formed an adaptive design working group to promote the usage of adaptive designs and related methodology (Gallo et al. (2006)). The European Medicines Agency (EMA) published a Reflection paper on methodological issues in confirmatory clinical trials planned with an adaptive design (EMA (2007)). The Food and Drug Administration (FDA) recently released the draft guidance on adaptive design clinical trials and discussed various aspects of the usage, considerations, and challenges of applying adaptive design trials (Food and Drug Administration (2010)). The FDA draft guidance defines an adaptive design clinical study as a study that includes a prospectively planned opportunity for modification of one or more specified aspects of the study design and hypotheses based on analysis of data (usually interim data) from subjects in the study. Various aspects of clinical trials could be modified at interim analysis; these include, but are not limited to, study dose, treatment duration, study endpoints, randomization, study design, study hypotheses, sample size, etc.

Sample size re-estimation based on unblinded interim effect size estimates has drawn a great deal of interest due to its permitting revision of the hypothesized treatment difference from an ongoing trial while preserving the Type I error rate. When there is uncertainty about the assumptions of treatment effect at the design stage, it would be valuable to check these assumptions and make a midcourse adjustment to maintain the study power. Several adaptive design methods have been proposed to re-estimate sample size using the observed treatment effect after an initial stage of a clinical trial while preserving the overall Type I error at the time of the final analysis (Proschan and Hunsberger (1995); Cui et al. (1999); Müller and Schäfer (2001)). One unfortunate property of the algorithms used in some methods is that they can be inverted to reveal the exact treatment effect at the interim analysis (Ellenberg et al. (2006)). In Chapter 2, we propose using a step function with an inverted U-shape of observed treatment difference for sample size re-estimation to lessen the information on treatment effect revealed. This will be referred to as stepwise two-stage sample size adaptation. This method applies calculation methods used for group sequential designs. We minimize expected sample size among a class of these designs and compare efficiency with the fully optimized two-stage design, optimal two-stage group sequential design and designs based on promising conditional power. The tradeoff between efficiency and the improved blinding of the interim treatment effect is also discussed.

Armitage, McPherson, and Rowe (1969) showed numerically that repeated testing at a fixed level at interim analyses inflates the overall Type I error rate. Group sequential designs (Pocock (1977); O'Brien and Fleming (1979); Lan and DeMets (1983); Jennison and Turnbull (2000); etc.) have been developed and are well accepted to control the Type I error rate with possible early stopping to either accept or reject the null hypothesis. P-values are often used to measure the strength of evidence against the null hypothesis in favor of the alternative. An ordered outcome space is required to compute a p-value. Unlike a fixed sample design, a group sequential trial might stop early, and the densities for the group sequential statistics used to stop the trial lack a monotone likelihood ratio. There are several ways to order the sample space for a group sequential design, e.g., stage-wise ordering by Tsiatis, Rosner and Mehta (1984); maximum likelihood estimate (MLE) ordering by Emerson and Fleming (1990); likelihood ratio ordering or z-score ordering by Chang (1989); score test ordering or B-value ordering by Rosner and Tsiatis (1988); and sequential p-value ordering by Liu and Anderson (2008a). In Chapter 3, we review the existing sample space orderings for group sequential designs and we show the advantage of sequential p-value ordering: this method uses the totality of the accumulating data, taking into account the entire sample path, while the other orderings only consider the data where the boundary was crossed or the data at the current analysis. We show that some spending functions cannot completely order the sample space when sequential p-value ordering is used to test the null hypothesis (Type I error). We propose a simple method to transform such a spending function to one which can completely order a group sequential design sample space. We also extend the sequential p-value ordering to test the alternative hypothesis (Type II error). The two one-sided sequential p-values against the null or alternative hypothesis may be useful for a Data Monitoring Committee (DMC) making an appropriate decision.

Much of the work on group sequential methods was developed for a single endpoint. Clinical trials often involve more than one endpoint. It is of interest to extend the group sequential methods to the multiple endpoint/testing context. Less literature is available on this topic. In Chapter 4, we propose to apply sequential p-value methods to closed-test-based multiple testing procedures to control the familywise error rate for a group sequential design with multiple testing. We run simulations to study power and expected sample size of a group sequential design with two primary and two secondary endpoints. We study the operating characteristics of this design under many different scenarios of design parameters and using different spending functions for secondary endpoints.
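The inflation of the overall Type I error rate under repeated testing at a fixed level, noted earlier in this chapter, can be illustrated with a small Monte Carlo sketch. This is not the numerical method of Armitage, McPherson, and Rowe (1969); the function name `repeated_testing_error`, the equally sized groups, the two-sided critical value 1.96, and the number of simulated trials are all assumptions chosen for illustration.

```python
import numpy as np


def repeated_testing_error(num_looks=5, z_crit=1.96, num_trials=200_000, seed=0):
    """Estimate P(|Z_k| >= z_crit at some look k) under the null hypothesis.

    Assumes equally sized groups, so the standardized statistic at look k is
    the sum of k independent N(0, 1) group increments divided by sqrt(k).
    """
    rng = np.random.default_rng(seed)
    increments = rng.standard_normal((num_trials, num_looks))
    looks = np.arange(1, num_looks + 1)
    z_stats = np.cumsum(increments, axis=1) / np.sqrt(looks)
    # A simulated trial counts as a false positive if any look crosses the bound.
    return float((np.abs(z_stats) >= z_crit).any(axis=1).mean())


if __name__ == "__main__":
    for k in (1, 2, 5):
        print(k, repeated_testing_error(num_looks=k))
```

With a single look the estimate should sit near the nominal 0.05, and it grows as looks are added, which is the behavior group sequential boundaries are designed to correct.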

Chapter 2 Stepwise two-stage sample size adaptation

2.1 Introduction

Different adaptive design methods have been proposed to modify sample size based on unblinded results from interim analysis while preserving the Type I error rate. Proschan and Hunsberger (1995) proposed a two-stage adaptive design to re-estimate second-stage sample size based on conditional power assuming the observed interim treatment effect. Liu and Chi (2001) varied this approach based on conditional power computed under the minimum treatment effect of interest. Anderson and Liu (2004) showed that the latter approach improves efficiency compared to the former approach. Cui et al. (1999) preserved the overall Type I error by combining the Wald statistics, obtained before and after sample size adaptation, with pre-specified weights. Müller and Schäfer (2001) showed the overall Type I error can be preserved unconditionally under any general adaptive change given that the conditional Type I error is preserved. Posch et al. (2003) investigated an optimal reassessment rule which minimizes the expected sample size over some set of fixed alternatives with an overall desired power at the minimum treatment effect of interest. They described the optimal second-stage sample size as a polynomial function of the first-stage test statistic given the stopping boundaries and preplanned weights of the group sequential designs. Lokhnygina and Tsiatis (2008) proposed a fully optimized, decision-theoretic two-stage adaptive group sequential design to achieve the minimum expected sample size averaged over a normal prior or some fixed alternatives for the treatment effect. This optimal two-stage design is adaptive in that the sample size at the second stage depends on the data from the first stage. They used a backward induction algorithm to solve a Bayesian sequential decision problem following Schmitz (1993), and Barber and Jennison (2002).

The re-estimated sample size in the second stage from these adaptive designs is a continuous function of the observed test statistic (treatment effect) at the first interim analysis. Given the study design and the second-stage sample size, the treatment effect at the interim analysis might be back-calculated. This is generally considered a poor feature of these designs (Ellenberg et al. (2006)). One way to reduce the information revealed about the treatment effect in the interim analysis is to make the second-stage sample size a step function of interim treatment effect, i.e., to provide a few sample size choices given the interim test results.

In this paper, we outline a pre-specified two-stage design with a limited set of stage two sample size possibilities that minimizes the expected sample size under the assumption of a normal prior for the treatment effect. We compare this design with the fully optimized two-stage adaptive design (Lokhnygina and Tsiatis (2008)), optimal two-stage group sequential designs (Anderson (2007)) and designs based on promising conditional power (Gao et al. (2008), Mehta and Pocock (2011)). We conclude with a discussion in the final section.

2.2 A two-stage design with a limited set of stage two sample size possibilities

Assume $X_1, X_2, \ldots$ are independent and identically distributed with a Normal$(\theta, 1)$ distribution. Let $\theta$ represent the single parameter of interest, which is the treatment effect in our case. Assume $n_1$ is the first-stage sample size and there are $m - 1$ possible stage two sample sizes at the first interim analysis. For $i = 1, 2, \ldots, m$, let $n_i$ be a sequence of positive integers and denote

$$Z_i = \sum_{j=1}^{n_i} X_j / \sqrt{n_i}.$$

We will assume $n_1 < n_i$, $i = 2, 3, \ldots, m$, but otherwise these numbers are not ordered in any particular way. The amount of statistical information about $\theta$ after $n_i$ observations will be denoted by $I_i$, $i = 1, 2, \ldots, m$; because the observations have unit variance, $I_i = n_i$. Under these assumptions the statistics $Z_i$, $i = 1, 2, 3, \ldots, m$, have a multivariate normal distribution where, if $1 \le n_j \le n_i$, we have

$$E\{Z_i\} = \theta \sqrt{I_i}, \tag{2.2.1}$$

$$\mathrm{Cov}(Z_j, Z_i) = \sqrt{I_j / I_i}. \tag{2.2.2}$$

Jennison and Turnbull (2000) refer to this as the canonical form when used with group sequential designs where $n_1 < n_2 < \ldots < n_m$. It is the asymptotic form for a broad variety of group sequential designs with endpoints having different distributions.

We consider two-stage designs both because they should be simple to implement and because they minimize what is revealed about the interim treatment effect. For some initial sample size $n_1$ we compute a test statistic $Z_1$, and for some integer $m > 1$ we consider boundary values $a_1 < a_2 < \ldots < a_m$. The trial is stopped after the analysis of $n_1$ patients for a positive efficacy finding if $Z_1 \ge a_m$, while if $Z_1 < a_1$ the trial is stopped for futility. For $i = 2, 3, \ldots, m$, if $a_{i-1} \le Z_1 < a_i$ the trial continues to the second stage with a sample size of $n_i > n_1$, a test statistic $Z_i$ is computed based on the mean of the entire $n_i$ observations, and for some real value $b_i$ efficacy is established if $Z_i \ge b_i$. In this two-stage design setting, $b_1 = a_m$. Note that for $i = 2, 3, \ldots, m$ there is no restriction on the ordering of the $n_i$ values. If they are all equal or if $m = 2$, this becomes a two-stage group sequential design.

The probability of crossing an upper bound at the first interim analysis with $n_1$ observations is

$$\alpha_1(\theta) = P_\theta\{Z_1 \ge a_m\}. \tag{2.2.3}$$
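The canonical moments (2.2.1)-(2.2.2) and the interim decision rule described above can be sketched in a few lines. The function names `canonical_moments` and `interim_decision`, the tuple return format, and the example boundaries and sample sizes are assumptions made for illustration only; they are not taken from the dissertation.

```python
import numpy as np


def canonical_moments(theta, n):
    """Mean vector and covariance matrix of (Z_1, ..., Z_m) with I_i = n_i."""
    info = np.asarray(n, dtype=float)
    mean = theta * np.sqrt(info)                 # E{Z_i} = theta * sqrt(I_i)
    smaller = np.minimum.outer(info, info)
    larger = np.maximum.outer(info, info)
    cov = np.sqrt(smaller / larger)              # Cov(Z_j, Z_i) = sqrt(I_j / I_i), I_j <= I_i
    return mean, cov


def interim_decision(z1, a, n):
    """Apply the first-stage rule for boundaries a_1 < ... < a_m.

    a = [a_1, ..., a_m], n = [n_1, ..., n_m]. Returns the action and the
    total sample size that the action implies.
    """
    if z1 < a[0]:
        return "futility", n[0]
    if z1 >= a[-1]:
        return "efficacy", n[0]
    # side="right" returns k with a[k-1] <= z1 < a[k]; n[k] is the matching
    # stage two sample size (0-based indexing of n_2, ..., n_m).
    k = int(np.searchsorted(a, z1, side="right"))
    return "continue", n[k]


if __name__ == "__main__":
    print(canonical_moments(0.2, [25, 100]))
    print(interim_decision(0.5, [0.0, 1.0, 2.0], [50, 120, 100]))
```

Because only the chosen $n_i$ is disclosed after the interim analysis, an observer learns the interval containing $Z_1$ rather than its exact value, which is the motivation for the step function.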

For $i = 2, 3, \ldots, m$ the probability of the first interim test statistic being between $a_{i-1}$ and $a_i$ and then crossing the upper bound after $n_i$ observations at the second stage is

$$\alpha_i(\theta) = P_\theta\{\{a_{i-1} \le Z_1 < a_i\} \cap \{Z_i \ge b_i\}\}. \tag{2.2.4}$$

Similarly, the probability of crossing a lower bound at the first interim analysis with $n_1$ observations is

$$\beta_1(\theta) = P_\theta\{Z_1 < a_1\}. \tag{2.2.5}$$

For $i = 2, 3, \ldots, m$ the probability of the first interim test statistic being between $a_{i-1}$ and $a_i$ and then failing to cross the upper boundary at the second stage after $n_i > n_1$ observations is

$$\beta_i(\theta) = P_\theta\{\{a_{i-1} \le Z_1 < a_i\} \cap \{Z_i < b_i\}\}. \tag{2.2.6}$$

These probabilities can be computed using group sequential design computations as outlined in Jennison and Turnbull (2000). The total probability of crossing an upper bound at any time is

$$\alpha(\theta) = \sum_{i=1}^{m} \alpha_i(\theta) \tag{2.2.7}$$

and the Type I error for the design is $\alpha(0)$. The probability of being below a lower boundary ($a_1$ for the first interim analysis and $b_i$ for the stage two analysis after $n_i$ patients, $i = 2, 3, \ldots, m$) is

$$\beta(\theta) = \sum_{i=1}^{m} \beta_i(\theta). \tag{2.2.8}$$

For any given $\theta$,

$$\alpha(\theta) + \beta(\theta) = 1. \tag{2.2.9}$$
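As a rough numerical check of (2.2.3) to (2.2.9), the sketch below evaluates the crossing probabilities by one-dimensional integration over $Z_1$. It assumes the canonical form with $I_i = n_i$, so that given $Z_1 = z$ the second-stage statistic satisfies $P(Z_i < b_i \mid Z_1 = z) = \Phi\big((b_i\sqrt{n_i} - z\sqrt{n_1})/\sqrt{n_i - n_1} - \theta\sqrt{n_i - n_1}\big)$. The helper names `simpson` and `crossing_probs` are introduced here for illustration, Simpson's rule stands in for the group sequential computations of Jennison and Turnbull (2000), and the design values are the rounded numbers reported for the stepwise design in the Results section with $\delta$ set to 1.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def simpson(f, lo, hi, steps=400):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3


def crossing_probs(theta, n, a, b):
    """Return ([alpha_1..alpha_m], [beta_1..beta_m]) at effect size theta.

    n = [n_1..n_m], a = [a_1..a_m], b = [b_2..b_m] (b[i-1] pairs with n[i]).
    """
    drift1 = theta * sqrt(n[0])            # E{Z_1} from (2.2.1)
    alpha = [1 - STD.cdf(a[-1] - drift1)]  # (2.2.3)
    beta = [STD.cdf(a[0] - drift1)]        # (2.2.5)
    for i in range(1, len(n)):
        extra = n[i] - n[0]                # second-stage increment

        def p_below(z, i=i, extra=extra):
            # P(Z_i < b_i | Z_1 = z) under the canonical joint distribution
            crit = (b[i - 1] * sqrt(n[i]) - z * sqrt(n[0])) / sqrt(extra)
            return STD.cdf(crit - theta * sqrt(extra))

        def dens(z):
            return STD.pdf(z - drift1)     # density of Z_1

        lo, hi = a[i - 1], a[i]
        beta.append(simpson(lambda z: dens(z) * p_below(z), lo, hi))         # (2.2.6)
        alpha.append(simpson(lambda z: dens(z) * (1 - p_below(z)), lo, hi))  # (2.2.4)
    return alpha, beta


# Rounded stepwise-design values from the text, with delta = 1 (an assumption)
delta = 1.0
n_fix = ((1.64 + 1.28) / delta) ** 2
n = [f * n_fix for f in (0.52, 1.07, 1.20, 1.07)]
a = [0.48, 0.69, 1.70, 2.01]
b = [1.67, 1.75, 1.67]

alpha0, beta0 = crossing_probs(0.0, n, a, b)
alpha1, beta1 = crossing_probs(delta, n, a, b)
print("Type I error alpha(0):", sum(alpha0))
print("power at delta:", sum(alpha1))
print("alpha(delta) + beta(delta):", sum(alpha1) + sum(beta1))
```

Since the inputs are rounded to two decimals, the printed Type I error and power should only be expected to land near the nominal 0.05 and 0.90, not to match them exactly.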

2.3 Reparameterizing the design

The design can be parameterized by using the sample sizes and boundaries, e.g., $n_i$, $a_i$ and $b_i$, for $i = 1, 2, \ldots, m$. Our goal is to achieve the minimum expected sample size over a range of alternatives. We will reparameterize the design here, beginning with boundary crossing probabilities under the null hypothesis and relative sample sizes at the different stages of the design. The overall Type I error for the design is

$$\alpha \equiv \alpha(0) = \sum_{i=1}^{m} \alpha_i(0). \tag{2.3.1}$$

The probability of a negative finding under the null hypothesis is

$$1 - \alpha = \beta(0) = \sum_{i=1}^{m} \beta_i(0). \tag{2.3.2}$$

Leaving $n_i$ fixed for $i = 1, 2, \ldots, m$, we can map back and forth from a parameterization using $a_1$ and $a_i$, $b_i$, $i = 2, 3, \ldots, m$, to another using $\alpha$ and $\alpha_i(0)$, $\beta_i(0)$, $i = 2, 3, \ldots, m$. We briefly discuss the method for doing this. First, consider the bounds at the first stage. Since $\beta_1(0) = P_0\{Z_1 < a_1\}$ we have $a_1 = \Phi^{-1}(\beta_1(0))$, where $\Phi^{-1}(\cdot)$ represents the inverse of the standard normal cumulative distribution function. Next, note that for $i = 2, \ldots, m$

$$P_0\{Z_1 < a_i\} = \Phi(a_i) = \beta_1(0) + \sum_{j=2}^{i} \big(\alpha_j(0) + \beta_j(0)\big) \tag{2.3.3}$$

and thus

$$a_i = \Phi^{-1}\Big(\beta_1(0) + \sum_{j=2}^{i} \big(\alpha_j(0) + \beta_j(0)\big)\Big). \tag{2.3.4}$$
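The mapping in (2.3.4) from null error probabilities to first-stage boundaries amounts to a cumulative sum followed by an inverse normal CDF. The following sketch illustrates it with a round trip; the function name `first_stage_bounds` and the 10%/90% split of each band's null probability are purely hypothetical choices made for the example.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def first_stage_bounds(alpha0, beta0):
    """Map null error probabilities to [a_1, ..., a_m] via (2.3.4).

    alpha0 = [alpha_1(0), ..., alpha_m(0)], beta0 = [beta_1(0), ..., beta_m(0)].
    alpha0[0] is not used: a_m is fixed by the cumulative sum of the others.
    """
    cum = beta0[0]
    bounds = [STD.inv_cdf(cum)]      # a_1 = Phi^{-1}(beta_1(0))
    for j in range(1, len(beta0)):
        cum += alpha0[j] + beta0[j]  # null mass of the band ending at this boundary
        bounds.append(STD.inv_cdf(cum))
    return bounds


# Round trip: start from known boundaries, build error probabilities, recover them
target = [0.48, 0.69, 1.70, 2.01]
cdf = [STD.cdf(x) for x in target]
band = [cdf[i] - cdf[i - 1] for i in range(1, len(cdf))]
# Hypothetical split of each band's null mass into success and failure parts
alpha0 = [1 - cdf[-1]] + [0.1 * p for p in band]
beta0 = [cdf[0]] + [0.9 * p for p in band]
recovered = first_stage_bounds(alpha0, beta0)
print([round(x, 4) for x in recovered])
```

Only the sums $\alpha_j(0) + \beta_j(0)$ enter the first-stage boundaries; how each band's mass is divided between the two is what later determines the second-stage boundaries $b_i$.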

For $i = 2, 3, \ldots, m$ the value of $b_i$ is a solution to the equation

$$\beta_i(0) = P_0\{\{a_{i-1} \le Z_1 < a_i\} \cap \{Z_i < b_i\}\}, \tag{2.3.5}$$

where $\beta_i(0)$, $a_i$ and $a_{i-1}$ are fixed. This is a standard computation for deriving group sequential designs that is outlined in Jennison and Turnbull (2000). With the reparameterization from $a_i$ and $b_i$ to $\alpha_i(0)$ and $\beta_i(0)$ we now have a method of choosing designs that control Type I error.

Next we consider a sample size parameterization to control power. We let $r_i = n_i / n_1 > 1$ represent the relative increase in sample size at the second stage of the trial based on interim results at stage 1, $i = 2, 3, \ldots, m$. The initial parameters defining the distribution were $n_1, \ldots, n_m$, $a_1, \ldots, a_m$, $b_2, \ldots, b_m$. Note that $b_1 = a_m$ in this two-stage design setting. Thus, there were a total of $3m - 1$ parameters defining the design. The complete reparameterization now consists of $n_1$, $\alpha$, $r_i$, $\alpha_i(0)$ and $\beta_i(0)$, $i = 2, 3, \ldots, m$, which still has $3m - 1$ parameters. Any two designs with all parameters other than $n_1$ equal will have the same Type I error structure. The power to reject $\theta = 0$ when, in truth, $\theta = \delta > 0$, namely $1 - \beta(\delta)$, is strictly increasing as a function of $n_1$ in this case. Here $\delta$ represents the minimal treatment difference of interest. A root-finding algorithm can find the minimum value of $n_1$ that provides a desired power level. Thus, we can replace $n_1$ with $\beta(\delta)$ in the parameterization.
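The root-finding step for $n_1$ can be sketched as a bisection on the power function, holding the boundaries and the ratios $r_i$ fixed. This is a minimal illustration under the canonical form with $I_i = n_i$: the helpers `simpson`, `power` and `solve_n1` are names introduced here, bisection is just one possible root finder, $\delta = 1$ is an arbitrary scaling, and the boundaries are the rounded stepwise-design values reported in the Results section.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def simpson(f, lo, hi, steps=400):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3


def power(n1, delta, r, a, b):
    """1 - beta(delta) for first-stage size n1.

    r = [r_2..r_m] with n_i = r_i * n1, a = [a_1..a_m], b = [b_2..b_m].
    """
    drift1 = delta * sqrt(n1)
    beta = STD.cdf(a[0] - drift1)  # futility stop at stage one
    for i, (ri, bi) in enumerate(zip(r, b), start=1):
        extra = n1 * (ri - 1.0)    # second-stage increment n_i - n_1

        def p_below(z):
            # P(Z_i < b_i | Z_1 = z) when theta = delta
            crit = (bi * sqrt(n1 * ri) - z * sqrt(n1)) / sqrt(extra)
            return STD.cdf(crit - delta * sqrt(extra))

        beta += simpson(lambda z: STD.pdf(z - drift1) * p_below(z), a[i - 1], a[i])
    return 1.0 - beta


def solve_n1(target, delta, r, a, b, lo=1e-3, hi=1e3, iters=60):
    """Bisection for the smallest n1 reaching the target power.

    Relies on power being increasing in n1, as stated in the text.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if power(mid, delta, r, a, b) < target:
            lo = mid
        else:
            hi = mid
    return hi


delta = 1.0
a = [0.48, 0.69, 1.70, 2.01]
b = [1.67, 1.75, 1.67]
r = [1.07 / 0.52, 1.20 / 0.52, 1.07 / 0.52]
n1 = solve_n1(0.90, delta, r, a, b)
n_fix = ((STD.inv_cdf(0.95) + STD.inv_cdf(0.90)) / delta) ** 2
print("n1 =", n1, " n1 / N_fix =", n1 / n_fix)
```

Here $n_1$ is treated as a continuous quantity, and the ratio `n1 / n_fix` should come out in the neighbourhood of the 0.52 reported for the stepwise design, up to the rounding of the inputs.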

2.4 Unrestricted 2-stage designs

An appropriately selected and unrestricted parameter space can make optimization problems particularly tractable. We develop an unrestricted reparameterization of the design. We assume $\alpha$ and $\beta(\delta)$ are fixed at desired levels. It may be easier to optimize the unrestricted value $n_1$ rather than $\beta(\delta)$ if power is not restricted. Note that we are treating $n_1$ as a proportion of the sample size of a fixed design ($n_{\mathrm{fix}}$) with Type I error $\alpha$ and power $1 - \beta(\delta)$, and thus as a continuous variable rather than as an integer value here. We consider a real value $x_{ai}$ and let

$$\alpha_i(0) = \frac{\alpha \exp(x_{ai})}{1 + \sum_{j=2}^{m} \exp(x_{aj})}, \tag{2.4.1}$$

$i = 2, 3, \ldots, m$. Similarly, we consider a real value $x_{bi}$ and let

$$\beta_i(0) = \frac{(1 - \alpha) \exp(x_{bi})}{1 + \sum_{j=2}^{m} \exp(x_{bj})}, \tag{2.4.2}$$

$i = 2, 3, \ldots, m$. Note that

$$\alpha_1(0) = \frac{\alpha}{1 + \sum_{j=2}^{m} \exp(x_{aj})} \tag{2.4.3}$$

and

$$\beta_1(0) = \frac{1 - \alpha}{1 + \sum_{j=2}^{m} \exp(x_{bj})}. \tag{2.4.4}$$

Finally, we consider a real value $x_{ri}$ and let $r_i = 1 + \exp(x_{ri})$, $i = 2, 3, \ldots, m$. Now our parameter space consists of fixed values $\alpha$ and $\beta(\delta)$ and $3m - 3$ unrestricted parameters: $x_{ai}$, $x_{bi}$, and $x_{ri}$, $i = 2, 3, \ldots, m$. This space is easily mapped to the error

probability parameter space and then to the appropriate boundary cutoffs. A simple optimization function such as the R nlminb function can be used to find a design that minimizes the expected sample size given a fixed Type I error, power and $\delta$ value.

2.5 Results

2.5.1 Stepwise Adaptive Design Characteristics

The fully optimized two-stage design from Lokhnygina and Tsiatis (2008) suggests that, to achieve the minimum expected sample size over a range of alternatives, the second-stage sample size should be an inverted U-shaped function of the first-stage test statistic. Posch et al. (2003) also suggest a similar shape for the optimal second-stage polynomial when minimizing expected sample size averaged over some fixed alternatives, i.e., only upsizing the trial when the treatment effect at the first interim is intermediate and furthest from the stage one boundaries.

In light of the inverted U-shaped curve from the Lokhnygina and Tsiatis (2008) design, we present the stepwise adaptive design, which is an optimal design with two choices of second-stage sample size and $m = 4$. We set the second-stage sample size to one value when the first-stage test statistic is close to either the futility bound or the efficacy bound at the first interim, i.e., $n_2 = n_4$. The other sample size is chosen when the first-stage test statistic falls into an intermediate region away from the first-stage stopping boundaries, i.e., an intermediate treatment effect is observed

that is not particularly close to the null or alternate hypothesis effect size. This feature can further blind the treatment effect at the first interim analysis.

The expected sample size was integrated over a normal prior distribution for $\theta$ with mean $\delta/2$ and standard deviation $\delta/2$. The prior mean might be chosen based on the best knowledge of the treatment effect before the trial started. The prior standard deviation might be chosen to reflect the range of interest. The specific choice of $\delta/2$ was arbitrary. We will show results later on the impact of the choice of the prior mean and standard deviation on the optimization of the trial design. The second-stage sample sizes and the cutoffs for selecting among stage two sample sizes were selected through the optimization algorithm, which minimizes the expected sample size. The first-stage sample size was selected to produce the desired power $1 - \beta(\delta)$.

Figure 2.1 (top) shows the stepwise adaptive design, the fully optimized two-stage adaptive design (Lokhnygina and Tsiatis (2008)) and optimal two-stage group sequential designs (Anderson (2007)). We focus on the proposed stepwise adaptive design first. The top left figure shows the total sample size $N$ for the optimal design, expressed as a percentage of the fixed sample size design, $N_{\mathrm{fix}}$, as a function of the standardized statistic at the first interim analysis, $Z_1$. The top right figure shows the boundary value at the second stage, $Z_2$, as a function of $Z_1$. For error probabilities $\alpha = 0.05$ and $\beta = 0.1$, $N_{\mathrm{fix}} = \big((1.64 + 1.28)/\delta\big)^2$ and the boundary for a one-stage study would be $\Phi^{-1}(0.95) = 1.64$. In this two-stage design, the first interim analysis would be conducted after $0.52 N_{\mathrm{fix}}$ observations. If the standardized test statistic $Z_1$

is less than 0.48 then the trial will stop for futility. If the standardized test statistic $Z_1$ exceeds 2.01, the trial will stop for efficacy. If the standardized test statistic $Z_1$ falls into the region $[0.69, 1.70]$, the final total sample size would be $1.20 N_{\mathrm{fix}}$ and the second-stage boundary would be 1.75. Otherwise, if the standardized test statistic $Z_1$ falls into the remaining part of the continuation region, the final total sample size would be $1.07 N_{\mathrm{fix}}$ and the second-stage boundary would be 1.67.

Figure 2.1 (top) also compares the stepwise adaptive design with the fully optimized two-stage adaptive design (Lokhnygina and Tsiatis (2008)) and optimal two-stage group sequential designs (Anderson (2007)). The stepwise adaptive design gives two choices of second-stage sample size: the total sample size is close to that of a fixed design when the first interim test statistic is close to the futility bound or the efficacy bound, and it is about 20% larger than that of a fixed design when the first interim test statistic is intermediate. The stepwise adaptive design is simplified compared to the fully optimized two-stage adaptive design. Compared to the optimal two-stage group sequential design, the stepwise adaptive design has a sample size and boundary close to the fixed sample size design when the interim test statistic is close to the first-stage boundaries. The maximum sample size and corresponding second-stage boundary from the stepwise adaptive design are slightly higher than those of the group sequential design.

Knowing the sample size adaptation following stage 1 reveals some information about the interim test statistic which, in turn, can be translated into an approximate range for the interim observed treatment effect. Table 2.1 shows examples of the range of possible Z-values that correspond to different known stage 2 sample sizes.

[Figure 2.1, six panels comparing the Stepwise Adaptive, Fully Adaptive and 2 stage GS designs.]
Figure 2.1: Total sample size $N/N_{\mathrm{fix}}$ (top left) and the boundary value (top right) at the second stage for designs optimized for prior $\theta \sim N(\delta/2, (\delta/2)^2)$ with 90% power and 5% Type I error, one-sided; expected sample size (middle left), power (middle right), predictive power (bottom left), probability of maximizing $N$ after first interim analysis (bottom right) over a range of $\theta$ for the design optimized for prior $\theta \sim N(\delta/2, (\delta/2)^2)$.

Table 2.1: How stage 2 sample size knowledge translates into possible stage 1 results by design type for optimal designs with prior $\theta \sim N(0, (\delta/2)^2)$, 90% power and 5% Type I error, one-sided.

Design                                        Example stage two sample size      Possible values of Z_1
                                              (relative to the fixed design)
Stepwise Adaptive Design                      0.68                               (0.69, 1.70)
                                              0.55                               (0.48, 0.69), (1.70, 2.01)
Fully Optimized Adaptive Design               0.71                               1.18
                                              0.49                               0.47, 1.94
Optimal Two-Stage Group Sequential Designs    0.65                               (0.50, 1.99)

Figure 2.1 (middle and bottom) compares the expected sample size, overall power, and predictive power of this stepwise adaptive design with the fully optimized two-stage adaptive design and optimal two-stage group sequential designs. The stepwise adaptive design had nearly identical expected sample size and overall power over a range of alternatives compared to the fully optimized two-stage adaptive design and optimal two-stage group sequential designs. Predictive power is defined as a weighted average of conditional power (conditioning on the first-stage test statistic) with prior $\theta \sim N(\delta/2, (\delta/2)^2)$. The stepwise adaptive design and fully optimized adaptive design have higher predictive power when the first-stage test statistic is close to the upper efficacy bound and lower predictive power when the first-stage test statistic is close

to the lower futility bound compared to the optimal two-stage group sequential design. We also compare the probability of maximizing sample size for the stepwise adaptive design and the optimal two-stage group sequential design. The stepwise adaptive design has a lower probability of requiring the maximum total sample size than the optimal two-stage group sequential design, as shown in Figure 2.1 (bottom right), though the maximum sample size is a bit larger for the stepwise adaptive design.

The designs shown above are based on a prior distribution of $\theta \sim N(\delta/2, (\delta/2)^2)$, which is the situation when the investigator has some prior information and is neutral on the treatment effect between the null and alternative hypotheses. Early Phase II development of experimental drugs might fit this situation. We also explored stepwise adaptive designs that use different prior distributions. Figure 2.2 (top) shows the design with prior $\theta \sim N(0, (\delta/2)^2)$ and Figure 2.2 (middle) shows the design with prior $\theta \sim N(\delta, (\delta/2)^2)$. With prior mean 0, the experimenter does not have much confidence in the treatment effect; the stepwise adaptive design only increases the sample size when the interim statistic looks promising. With prior mean $\delta$, the experimenter has more confidence in the treatment effect; the stepwise adaptive design only increases the sample size when the interim test statistic does not look promising.

We also investigate the impact of a flatter prior distribution on the design. Figure 2.2 (bottom) shows the design with prior $\theta \sim N(\delta/2, (2\delta)^2)$ vs. $\theta \sim N(\delta/2, (\delta/2)^2)$. The stepwise design with a flatter prior has a wider continuation

region and an earlier first interim analysis, which would be conducted after $0.29 N_{\mathrm{fix}}$ observations. This is inconsistent with the common recommendation of conducting the first interim analysis at around 50% information time. This suggests that the time to adapt also depends on how much prior information we have. For many trials with delayed endpoints, the only possible time for adaptation would be at early time points.

[Figure 2.2, panels comparing the Stepwise Adaptive and 2 stage GS designs, and priors with sigma = 0.5*Delta and sigma = 2*Delta.]
Figure 2.2: Total sample size $N/N_{\mathrm{fix}}$ and the boundary value at the second stage for designs optimized for prior $\theta \sim N(0, (\delta/2)^2)$ (top), for prior $\theta \sim N(\delta, (\delta/2)^2)$ (middle), and for prior $\theta \sim N(\delta/2, (2\delta)^2)$ vs. $\theta \sim N(\delta/2, (\delta/2)^2)$ (bottom) with 90% power and 5% Type I error, one-sided.

2.5.2 Stepwise Adaptive Design Compared with Designs Based on Promising Conditional Power

Chen et al. (2004) showed that the conventional test could be performed without inflating the Type I error if one increased the sample size only when interim results were promising, which was defined as conditional power of 50 percent or greater. Gao et al. (2008) and Mehta and Pocock (2011) further extended this idea to a broader range of promising zones in which the sample size may be increased up to an upper bound based on conditional power and the conventional tests may be applied without inflating Type I error.

Define $z_1$ as the first-stage test statistic, $\tilde{n}_2$ as the incremental sample size at the second stage, and $\hat{\delta}_1$ as the observed treatment effect at stage 1. Mehta and Pocock (2011) partitioned the conditional power value, $CP_{\hat{\delta}_1}(z_1, \tilde{n}_2)$, into three zones: unfavorable zone, promising zone and favorable zone. The unfavorable zone is defined by $CP_{\hat{\delta}_1}(z_1, \tilde{n}_2) < CP_{\min}$, where $CP_{\min}$ depends on $n_{\max}/n_2$, $n_1/n_2$ and $1 - \beta$; in this zone the interim result is so disappointing that it is not worth increasing the sample size. The promising zone is defined by $CP_{\min} \le CP_{\hat{\delta}_1}(z_1, \tilde{n}_2) < 1 - \beta$, with results that are not disappointing but not good enough for the conditional power to equal or exceed the unconditional power specified at the design stage. The favorable zone is defined by $CP_{\hat{\delta}_1}(z_1, \tilde{n}_2) \ge 1 - \beta$, in which the interim results are favorable. This approach can be extended to a two-stage group sequential design with possible early stopping at stage one.

We present the stepwise adaptive design with the constraint $n_1/n_2 = 0.5$ and Gao's method on a two-stage group sequential design where $n_{\max}/n_2 = 2$, $n_1/n_2 = 0.5$ and $1 - \beta = 0.9$ in Figure 2.3 (left). The sample size in Gao's adaptive design is up to double the sample size of the two-stage group sequential design when the interim test statistic is in the promising zone. We also compare the second-stage critical values for different designs. Mehta and Pocock (2011) mentioned that the Type I error was preserved even when the conventional test was performed, and suggested using the second-stage boundary of the unfavorable zone/the favorable zone for the promising zone. Figure 2.3 (right) shows the observed treatment effect at the study boundary when the trial is stopped for designs with $\alpha = 0.05$. The observed treatment effect at the boundary of Gao's adaptive design is much smaller than that of the stepwise adaptive design, due to the large sample size increase in the promising zone, even if we use the conventional test. Figure 2.4 shows the power and expected sample size from the stepwise adaptive design and Gao's adaptive design. When we match the power of the stepwise adaptive design with Gao's adaptive design at $0.5\delta$, the power is higher for

the stepwise adaptive design if the true mean is $\delta$, and the expected sample size is generally smaller for the stepwise adaptive design.

[Figure 2.3, two panels comparing the Stepwise Adaptive Design, Gao's Adaptive Design, the Fixed Sample Size design and Gao's AD w/ conventional test, as functions of $z_1$.]
Figure 2.3: Total sample size $N/N_{\mathrm{fix}}$ (left) and observed treatment effect at study boundary (right). The stepwise adaptive design and Gao's adaptive design have $n_1/n_2 = 0.5$, the stepwise adaptive design is optimized for prior $\theta \sim N(\delta/2, (\delta/2)^2)$, and the maximum sample size for Gao's adaptive design can be up to double the sample size for a two-stage group sequential design to have 90% conditional power when the first-stage test statistic falls into the promising zone.

[Figure 2.4, two panels comparing the Stepwise Adaptive Design, Gao's Adaptive Design and the Stepwise AD matching Gao's AD power at 0.5 Delta, as functions of $\theta$.]
Figure 2.4: Power curve (left) and expected sample size (right). The grey line shows the power curve for a stepwise adaptive design which matched the power of Gao's adaptive design at $0.5\delta$.

2.6 Discussion

Lokhnygina and Tsiatis (2008) presented a fully optimized two-stage design that has minimum expected sample size averaged over a range of alternatives. In this paper, we simplified this design and presented a method to create a pre-specified optimal two-stage design with a limited set of stage two sample size possibilities to lessen the information revealed at the interim analysis.

In this paper, we focus on the stepwise adaptive design with two choices of second-stage sample size for the prior distribution $\theta \sim N(\delta/2, (\delta/2)^2)$. We set the second-stage sample size to one value when the first-stage test statistic is close to either the futility bound or the efficacy bound at the first interim analysis, i.e., $n_2 = n_4$, and to a different value when the first-stage test statistic falls into an intermediate region away from the first-stage stopping boundaries, i.e., an intermediate treatment effect is observed that is not particularly close to the null or alternate hypothesis effect size. This feature of the design improves blinding of the interim treatment effect by lessening the information revealed at the interim analysis. Each second-stage sample size corresponds to one range or two ranges of the first interim analysis test statistic, as shown in Table 2.1. If the study proceeds to the second stage with a sample size of $0.68 N_{\mathrm{fix}}$, we know only that the standardized first-stage test statistic is between 0.69 and 1.70. If the study proceeds to the second stage with a sample size of $0.55 N_{\mathrm{fix}}$, we know only that the standardized first-stage test statistic is either between 0.48 and 0.69 or between 1.70 and 2.01. The fully optimized two-stage adaptive design has unlimited choices of second-stage sample size due to its continuous nature and could therefore reveal one or two exact first interim analysis test results given the choice

of second-stage sample size. The optimal two-stage group sequential design has only one choice of second-stage sample size and reveals the least information (it gives only one range for the first interim analysis test statistic). The stepwise adaptive design and the optimal two-stage group sequential design therefore reveal less information about the interim treatment effect than the fully optimized adaptive design.

We have seen that the efficiency loss from the stepwise adaptive design may be minimal compared to the substantially more complicated fully optimized design (Lokhnygina and Tsiatis (2008)). The stepwise adaptive design, the fully optimized adaptive design and optimal two-stage group sequential designs have similar expected sample size and overall power over the range of $\theta$. Advantages of the stepwise adaptive design over the optimal two-stage group sequential design are that the minimum second-stage sample size is much smaller, and that the stepwise adaptive design is less likely to require the maximum sample size.

Notice that the shape of the stepwise adaptive design is not symmetric. This is also true for the fully optimized two-stage adaptive design (Lokhnygina and Tsiatis (2008)). This might be caused by the optimization process, which requires a minimum expected sample size for a given prior. We also constructed a symmetric stepwise adaptive design in which the continuation regions adjacent to the futility bound and to the efficacy bound at the first interim have equal length. We compare the expected sample size for the current stepwise adaptive design with this symmetric stepwise adaptive design. The expected sample size for the current stepwise design relative to a fixed

sample size design is 0.77096 compared to 0.77107 for the symmetric stepwise adaptive design.

Levin et al. (2011) recently presented a completely pre-specified optimal adaptive design. This design is similar to our stepwise adaptive design in that both use step functions. Levin et al. (2011) only considered the symmetric design, optimized the design by assigning half the weight to the null and half the weight to the alternative, and achieved the optimization by adding more steps to the design. Our design focuses on fewer steps and minimizes the expected sample size over a range of alternatives.

Chuang-Stein et al. (2006) pointed out that the interim treatment effect size can be highly variable and potentially too unreliable to be used directly for sample size re-estimation purposes. In general, a sample size re-estimation design based on conditional power is likely not optimized for expected sample size. Jennison and Turnbull (2003) have demonstrated that mid-course sample size modification based on the observed treatment effect comes at the cost of efficiency when compared with group sequential designs.

The stepwise adaptive design is an extension of the standard group sequential design. Like a group sequential design, it is pre-specified at the design stage, and it also provides the opportunity for sample size adaptation with great efficiency. The stepwise adaptive design provides a solution by combining the prior information and the information within a trial.

We have found that our stepwise adaptive design is competitive with fully optimized

two-stage adaptive and with optimal two-stage group sequential designs, but reveals less information about the interim treatment effect than the fully optimized adaptive design and has the potential to increase sample size based on interim results.

Chapter 3 Sample Space Ordering and Inference for Group Sequential/Adaptive Designs

3.1 Introduction

Armitage, McPherson, and Rowe (1969) numerically showed that if significance tests at a fixed level are repeated at interim analyses, the Type I error rate (or $\alpha$) is greatly increased over the nominal level. Simple group sequential methods for a predefined number of equally spaced interim analyses were developed by Pocock (1977) and O'Brien and Fleming (1979) to control the Type I error rate by adjusting the critical values. Wang and Tsiatis (1987) generalized the Pocock (1977) and O'Brien and

Fleming (1979) designs to a class of group sequential tests, also referred to as boundary families. But the boundary family designs assume that the maximum number of analyses, $K$, is fixed in advance and require equally spaced interim analyses. Lan and DeMets (1983) suggested an alternative method to construct discrete sequential boundaries by using $\alpha$-spending functions. The boundary at a decision time is determined by $\alpha(t)$, where $t$ is the timing of the interim analysis, which is also called information time. Information time $t$ is defined as $I_i / I_{\max}$ for $i = 1, \ldots, K$, where $I_i$ is the statistical information at analysis $i$ and $I_{\max}$ represents the maximum planned information at the time of design. Kim and DeMets (1987) and Hwang, Shih, and DeCani (1990) individually extended the method of Lan and DeMets (1983) to a general one-parameter family of $\alpha$-spending functions, $\alpha(t; \gamma) = \alpha\, h_\gamma(t)$, where the parameter $\gamma$ specifies the rate of $\alpha$-spending. The function $h_\gamma(t)$ is increasing in $t \in (0, 1)$ with $h_\gamma(0) = 0$ and $h_\gamma(t) = 1$ for $t \ge 1$. Pampallona, Tsiatis, and Kim (2001) extended the Type I error spending method of Lan and DeMets (1983) by incorporating an analogous Type II error (or $\beta$) spending function for interim analyses to test futility. Anderson and Clark (2010) discussed additional one- and two-parameter spending families. Their two- or three-parameter spending function families provide additional flexibility to customize the shape of spending functions to fit more than one desired critical value. The spending function approach has become common because of its flexibility in accommodating unequally spaced analyses and allowing some leeway in moving, adding or deleting interim analyses as long as this is done without knowledge