Pubh 8482: Sequential Analysis


Pubh 8482: Sequential Analysis
Joseph S. Koopmeiners
Division of Biostatistics
University of Minnesota
Week 7

Course Summary
To this point, we have discussed group sequential testing with a focus on:
- Maintaining the correct type I error rate and power
- Decreasing the expected sample size
These approaches only provide a yes-or-no answer as to whether or not we reject the null hypothesis. Generally, more detail is provided when presenting results.

Four-Number Summary
In general, the following should always be reported when presenting results:
- Point estimate
- Confidence interval
- p-value

Impact of a Group Sequential Design
Implementing a group sequential procedure will change the properties of standard point and interval estimators:
- Group sequential procedures change the sampling distribution of standard estimators
- Confidence intervals derived from normal approximations will no longer have nominal coverage
We will start by considering distribution theory for group sequential designs and then consider the implications for point and interval estimation.

Set-up
Let β be our parameter of interest and assume that the sequence of estimates (β̂_1, ..., β̂_K) follows a multivariate normal distribution with
β̂_k ~ N(β, I_{β,k}^{-1}) for k = 1, ..., K
Cov[β̂_k, β̂_j] = Var[β̂_j] = I_{β,j}^{-1} for k ≤ j

Set-up
Define θ̂_β = β̂ − β_0 and θ_β = β − β_0. For Z_k = θ̂_{β,k} √I_{β,k}, the sequence of test statistics (Z_1, ..., Z_K) follows a multivariate normal distribution with
Z_k ~ N(θ_β √I_{β,k}, 1) for k = 1, ..., K
Cov[Z_k, Z_j] = √(I_{β,k} / I_{β,j}) for k ≤ j
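This canonical joint distribution is easy to check by simulation, because the Z-statistics can be built from independent score increments. A minimal sketch; the information levels and drift below are hypothetical illustration values, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
info = np.array([1.6, 3.2, 4.8, 6.4])   # hypothetical information levels I_k
theta = 0.5                             # hypothetical drift theta_beta

# Independent score increments: S_k - S_{k-1} ~ N(theta * Delta_k, Delta_k),
# then Z_k = S_k / sqrt(I_k) has the canonical joint distribution.
delta = np.diff(info, prepend=0.0)
incr = rng.normal(theta * delta, np.sqrt(delta), size=(200_000, 4))
z = np.cumsum(incr, axis=1) / np.sqrt(info)

# Empirically: E[Z_k] = theta * sqrt(I_k), Var[Z_k] = 1,
# and Cov[Z_1, Z_4] = sqrt(I_1 / I_4) = 0.5
print(z.mean(axis=0))
print(np.cov(z.T)[0, 3])
```

The simulated means, variances, and covariances should match θ_β √I_k, 1, and √(I_k / I_j) up to Monte Carlo error.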

Notation
Let T be the stage at which stopping occurs:
T = min(k : Z_k ∉ C_k)
where C_k is the continuation region at stage k and C_K = ∅ (so the test always stops by stage K).

Notation
Let Z^(k) = (Z_1, ..., Z_k) be the vector of the first k test statistics and, for k = 1, ..., K, define
A_k = {z^(k) : z_i ∈ C_i for i = 1, ..., k − 1, and z_k ∉ C_k}
i.e., A_k is the set of sample paths that terminate at stage k.

Density of (Z_1, ..., Z_k)
The joint density of (Z_1, ..., Z_k) follows a multivariate normal distribution as described above. The joint density of (Z_1, ..., Z_k) can also be written as a product of densities of independent normal random variables by considering the following transformation.

Transformations
Consider the following transformation. Let
y_1 = z_1 √I_{β,1} and Δ_1 = I_{β,1}
and, for i = 2, ..., k,
y_i = z_i √I_{β,i} − z_{i−1} √I_{β,i−1} and Δ_i = I_{β,i} − I_{β,i−1}

Joint Density of (y_1, ..., y_k)
For i = 1, y_1 is normally distributed with
E[y_1] = E[z_1 √I_{β,1}] = θ_β I_{β,1} = θ_β Δ_1
Var[y_1] = Var[z_1 √I_{β,1}] = I_{β,1} = Δ_1
For i = 2, ..., k, y_i is normally distributed with
E[y_i] = E[z_i √I_{β,i} − z_{i−1} √I_{β,i−1}] = θ_β (I_{β,i} − I_{β,i−1}) = θ_β Δ_i
Var[y_i] = Var[z_i √I_{β,i} − z_{i−1} √I_{β,i−1}] = I_{β,i} − I_{β,i−1} = Δ_i

Joint Density of (y_1, ..., y_k)
More importantly, we know that the y_i's are independent:
Cov[y_i, y_j] = 0 for i ≠ j
Recall that the z_i's have independent increments.

Joint Density of (y_1, ..., y_k)
This means that we can write the joint density of (y_1, ..., y_k) as the product of independent normal densities:
f_{T, ȳ_T}(k, ȳ_k | θ_β) = ∏_{i=1}^k (2πΔ_i)^{−1/2} exp(−(y_i − Δ_i θ_β)² / (2Δ_i))
Therefore, the joint density of (z_1, ..., z_k), f_{T,Z}(k, z^(k) | θ_β), can be evaluated by evaluating f_{T, ȳ_T}(k, ȳ_k | θ_β) at the ȳ_k implied by z^(k).
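The product form above can be evaluated directly from an observed z-path. A minimal sketch; the function name and the values in the sanity check are illustrative, not from the slides:

```python
import numpy as np

def joint_density_y(z, info, theta):
    """Evaluate f_{T, y_T}(k, y_k | theta): transform the z-path to the
    independent increments y_i and multiply N(theta*Delta_i, Delta_i) densities."""
    z, info = np.asarray(z, float), np.asarray(info, float)
    delta = np.diff(info, prepend=0.0)           # Delta_i = I_i - I_{i-1}
    y = np.diff(z * np.sqrt(info), prepend=0.0)  # y_i = z_i sqrt(I_i) - z_{i-1} sqrt(I_{i-1})
    return float(np.prod(np.exp(-(y - theta * delta) ** 2 / (2 * delta))
                         / np.sqrt(2 * np.pi * delta)))

# Single-stage sanity check: y_1 = z_1 * sqrt(I_1) ~ N(theta * I_1, I_1)
print(joint_density_y([0.0], [4.0], 0.0))        # equals 1/sqrt(2*pi*4)
```

For k = 1 and θ_β = 0, the density at y_1 = 0 reduces to the N(0, I_1) density at zero, which the sanity check confirms.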

Joint Density of (y_1, ..., y_k): Example
Assume you have the following sequence of test statistics:
z^(k) = (0.73, 0.25, 0.33, 0.10)
and the following sequence of information:
I_{β,k} = (3.53, 5.00, 6.12)

Joint Density of (y_1, ..., y_k): Example
The resulting sequences of y_i's and Δ_i's are
ȳ_k = (0.73, 3.83, 3.27, 1.31)
Δ_k = (3.54, 1.46, 1.12, 0.95)

Joint Density of (y_1, ..., y_k): Example
Therefore, the joint density of z^(k) is
f_{T,Z}(k, (0.73, 0.25, 0.33, 0.10) | θ_β) = f_{T, ȳ_T}(k, (0.73, 3.83, 3.27, 1.31) | θ_β)
= ∏_{i=1}^k (2πΔ_i)^{−1/2} exp(−(y_i − Δ_i θ_β)² / (2Δ_i))
= 2.3 × 10^{−7}

Equivalence of two joint distributions
The preceding argument shows that the joint distributions of z^(k) and ȳ_k are equivalent. Therefore, we can simply study f_{T, ȳ_T} to derive theoretical properties of z^(k).

Joint Density of (y_1, ..., y_k)
We can re-write f_{T, ȳ_T}(k, ȳ_k | θ_β) as
f_{T, ȳ_T}(k, ȳ_k | θ_β) = ∏_{i=1}^k (2πΔ_i)^{−1/2} exp(−(y_i − Δ_i θ_β)² / (2Δ_i))
= [∏_{i=1}^k (2πΔ_i)^{−1/2}] exp(−Σ_{i=1}^k y_i²/(2Δ_i) + θ_β Σ_{i=1}^k y_i − (θ_β²/2) Σ_{i=1}^k Δ_i)
= [∏_{i=1}^k (2πΔ_i)^{−1/2} exp(−y_i²/(2Δ_i))] exp(θ_β z_k √I_{β,k} − θ_β² I_{β,k}/2)
= h(k, ȳ_k, I_1, ..., I_k) exp(θ_β z_k √I_{β,k} − θ_β² I_{β,k}/2)
since Σ_{i=1}^k y_i = z_k √I_{β,k} and Σ_{i=1}^k Δ_i = I_{β,k}.

Joint Density of (y_1, ..., y_k)
There are two primary implications of the previous result:
- By the factorization theorem, (Z_T, T) is sufficient for θ_β
- Z_T / √I_{β,T} is the MLE of θ_β

Implications
Implications of the sufficiency of (Z_T, T) for θ_β:
- The only information about θ_β is contained in the stopping time and the final Z
- That is, it only matters that the trial stopped at stage k with the observed Z_k; the exact path followed to stage k is irrelevant
- This should be somewhat intuitive given that the Z's have independent increments
- The final increment z_k √I_{β,k} − z_{k−1} √I_{β,k−1} is independent of the first k − 1 test statistics

Sub-densities of Z_k
To this point we have considered the joint density of (Z_1, ..., Z_k) and (y_1, ..., y_k). We might also consider the sub-densities of Z_k, f(k, z_k | θ_β). The sub-densities can be found by integrating over all paths that result in terminating at the kth interim analysis.

Sub-densities of Z_k
That is, the kth sub-density, f(k, z_k | θ_β), is defined as
f(k, z_k | θ_β) = ∫_{B_k(ȳ)} h(k, ȳ_k, I_1, ..., I_k) exp(θ_β z_k √I_{β,k} − θ_β² I_{β,k}/2) dy_{k−1} ... dy_1
where B_k(ȳ) is the set of all paths that result in terminating at the kth interim analysis.

Sub-densities of Z_k
Note that if θ_β = 0,
f(k, z_k | 0) = ∫_{B_k(ȳ)} h(k, ȳ_k, I_1, ..., I_k) exp(0) dy_{k−1} ... dy_1
= ∫_{B_k(ȳ)} h(k, ȳ_k, I_1, ..., I_k) dy_{k−1} ... dy_1

Sub-densities of Z_k
This implies that
f(k, z_k | θ_β) = f(k, z_k | 0) exp(θ_β z_k √I_{β,k} − θ_β² I_{β,k}/2)
This is helpful because it allows us to easily calculate the sub-densities at multiple values of θ_β.
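This likelihood-ratio identity can be confirmed numerically from the product form of the joint density, since h(·) does not depend on θ_β. A minimal sketch; the path, information levels, and θ below are hypothetical:

```python
import numpy as np

def f_ty(z, info, theta):
    # joint density of the independent increments y_i implied by the z-path
    info = np.asarray(info, float)
    delta = np.diff(info, prepend=0.0)
    y = np.diff(np.asarray(z, float) * np.sqrt(info), prepend=0.0)
    return float(np.prod(np.exp(-(y - theta * delta) ** 2 / (2 * delta))
                         / np.sqrt(2 * np.pi * delta)))

z, info, theta = [0.9, 0.2, -0.4], [2.0, 3.5, 5.0], 0.7
lhs = f_ty(z, info, theta)
rhs = f_ty(z, info, 0.0) * np.exp(theta * z[-1] * np.sqrt(info[-1])
                                  - theta ** 2 * info[-1] / 2)
print(lhs, rhs)   # identical: the theta-dependence enters only through z_k*sqrt(I_k)
```

The two values agree exactly (up to floating-point error) because the exponential tilting factor captures all dependence on θ_β.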

Defining the sub-densities recursively The previous integral is potentially nasty Luckily, the sub-densities can be defined recursively, which aids in computation

Defining the sub-densities recursively
The general form of the sub-densities is
f(k, z_k | θ_β) = g_k(z | θ_β) if z ∉ C_k, and 0 if z ∈ C_k

Defining the sub-densities recursively
Sub-density at the first interim analysis: at the first interim analysis, Z_1 is normally distributed with mean θ_β √I_1 and variance 1. Therefore,
g_1(z | θ_β) = φ(z − θ_β √I_1)

Defining the sub-densities recursively
For k = 2, ..., K, g_k is defined recursively as
g_k(z | θ_β) = ∫_{C_{k−1}} g_{k−1}(u | θ_β) (√I_k / √Δ_k) φ((z √I_k − u √I_{k−1} − Δ_k θ_β) / √Δ_k) du

Defining the sub-densities recursively
Essentially, each sub-density is the kernel of a normal density multiplied by a factor accounting for the possibility of terminating early. The factor is determined by integrating over all sample paths that result in reaching the kth interim analysis, using the recursive procedure described above.
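The recursion lends itself to simple numerical integration on a grid. The sketch below computes the stopping probabilities Pr(T = k) from the recursive sub-densities; the information levels and continuation regions in the usage line are hypothetical stand-ins, not the O'Brien-Fleming values from the example that follows:

```python
import numpy as np

def npdf(x):
    return np.exp(-0.5 * x * x) / np.sqrt(2.0 * np.pi)

def trap(y, x):
    # trapezoidal rule along the last axis
    return np.sum((y[..., :-1] + y[..., 1:]) * np.diff(x) / 2.0, axis=-1)

def stopping_probs(info, lo, hi, theta, ngrid=2001):
    """Pr(T = k) for continuation regions C_k = (lo[k], hi[k]), k < K;
    stage K always stops. Uses the recursion for g_k."""
    info = np.asarray(info, float)
    K = len(info)
    u = np.linspace(lo[0], hi[0], ngrid)
    g = npdf(u - theta * np.sqrt(info[0]))          # g_1 restricted to C_1
    probs = [1.0 - trap(g, u)]                      # Pr(T = 1)
    for k in range(1, K):
        d = info[k] - info[k - 1]                   # Delta_k
        zs = np.linspace(-12.0, 12.0, ngrid)        # wide grid for g_{k+1}
        kern = npdf((zs[:, None] * np.sqrt(info[k])
                     - u[None, :] * np.sqrt(info[k - 1]) - d * theta) / np.sqrt(d))
        gk = trap(g[None, :] * kern, u) * np.sqrt(info[k] / d)
        total = trap(gk, zs)                        # Pr(T >= k+1)
        if k < K - 1:
            u = np.linspace(lo[k], hi[k], ngrid)
            g = np.interp(u, zs, gk)                # restrict to the next C_k
            probs.append(total - trap(g, u))
        else:
            probs.append(total)                     # final stage always stops
    return np.array(probs)

p = stopping_probs([1.6, 3.2, 4.8, 6.4], [-3.0, -2.5, -2.2], [3.0, 2.5, 2.2], theta=0.0)
print(p, p.sum())   # the sub-density masses sum to 1
```

Each pass integrates the previous sub-density over its continuation region against the normal transition kernel, exactly as in the recursion above.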

Sub-Densities: Example
Consider a group sequential design with:
- O'Brien-Fleming stopping boundaries and α = 0.10
- K = 4
- 90% power to reject assuming that θ_β = δ

Sub-Densities: Example
[Figures: sub-densities of Z_k plotted against z for θ = 0, θ = 0.5δ, θ = δ, and θ = 1.5δ.]

Sub-Densities and stopping times
It should be noted that the sub-densities do not integrate to 1. Integrating each sub-density gives the probability of stopping at that interim analysis:
Pr(T = k | θ_β = θ) = ∫_{z ∉ C_k} f(k, z | θ_β = θ) dz
In contrast, the sum of the K integrals will equal 1.

Sub-Densities and stopping times: Example
For example, if θ_β = 0 and assuming the O'Brien-Fleming design discussed before:
Pr(T = 1) = 0.0006
Pr(T = 2) = 0.0140
Pr(T = 3) = 0.0358
Pr(T = 4) = 0.9496

Sub-Densities and stopping times: Example
If θ_β = δ:
Pr(T = 1) = 0.0239
Pr(T = 2) = 0.3407
Pr(T = 3) = 0.3594
Pr(T = 4) = 0.2760
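Stopping-time probabilities like these can also be approximated by direct simulation of the Z-paths. A minimal sketch with hypothetical information levels and critical values (not the O'Brien-Fleming boundaries that produced the numbers above):

```python
import numpy as np

rng = np.random.default_rng(1)
info = np.array([1.6, 3.2, 4.8, 6.4])      # hypothetical information levels
crit = np.array([4.05, 2.86, 2.34, 2.02])  # hypothetical two-sided critical values
theta = 0.0

# Simulate Z-paths with independent increments and record the first crossing
delta = np.diff(info, prepend=0.0)
incr = rng.normal(theta * delta, np.sqrt(delta), size=(500_000, 4))
z = np.cumsum(incr, axis=1) / np.sqrt(info)

stopped = np.abs(z) >= crit
stopped[:, -1] = True                      # the final analysis always stops
t = stopped.argmax(axis=1) + 1             # stage at which each path stops
probs = np.bincount(t, minlength=5)[1:] / len(t)
print(probs, probs.sum())                  # probabilities over T = 1, ..., 4 sum to 1
```

Simulation is a useful cross-check on the recursive numerical integration, though it converges only at the usual Monte Carlo rate.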

Estimating β
To this point, we have considered distribution theory for a group sequential test of a general parameter β. We are also interested in point and interval estimates of β. In a fixed-sample test, point and interval estimates of β are based on a normal sampling distribution for β̂. We have seen that implementing a group sequential procedure changes the sampling distribution of Z. Group sequential procedures also change the sampling distribution of β̂, and thus change our approach to estimation after a group sequential test.

Sampling distribution of β̂
Previously, we defined the sub-densities of Z_k, f(k, z_k | θ). It should be clear that the overall density is simply
f(z | θ) = Σ_{k=1}^K f(k, z_k | θ)
How do we use this result to derive the sampling density of β̂?

Sampling distribution of β̂
Recall that Z_k = (β̂ − β_0) √I_k. Therefore, the sampling density of β̂ at β̂ = y is
f(y | β) = Σ_{k=1}^K f(k, (y − β_0) √I_k | θ) √I_k
Note that θ = (β − β_0), in which case conditioning on θ is synonymous with conditioning on β.

Sampling distribution of β̂: example
Consider the case where x_1, x_2, ..., x_128 are i.i.d. N(µ, σ² = 20). We want to complete a group sequential test of H_0 : µ = 0. In this case, β̂ = x̄.

Sampling distribution of β̂: example
Consider a group sequential design with:
- O'Brien-Fleming stopping boundaries and α = 0.10
- K = 4
In this case, I_1, ..., I_4 = 1.6, 3.2, 4.8, 6.4.

Density of β̂: Example
[Figures: sampling density of β̂ plotted against β̂ for β = 0, 0.5, and 1.]

Density of β̂: Example
We see that the sampling distribution is substantially different when a group sequential test is used. The sampling distribution is no longer normal and, therefore, interval estimates based on the normal approximation are no longer valid. The difference between the sampling density under the group sequential test and the usual sampling density becomes more dramatic as β moves away from the null hypothesis.

Expected value of β̂
The expected value of β̂ after a group sequential test can be expressed as
E_β[β̂] = β_0 + Σ_{k=1}^K ∫_{z ∉ C_k} (z / √I_k) f(k, z | β) dz
For simplicity, we will now consider a two-stage design with continuation region C_1 = (a, b) in order to illustrate the bias due to a group sequential clinical trial.

Expected value of β̂ after a two-stage design
The expected value of β̂ after a two-stage design can be expressed as
E[β̂] = β_0 + ∫_{−∞}^a (z_1/√I_1) φ(z_1 − θ√I_1) dz_1
+ ∫_b^∞ (z_1/√I_1) φ(z_1 − θ√I_1) dz_1
+ ∫_a^b ∫ (z_2/√I_2) φ(z_1 − θ√I_1) (√I_2/√(I_2 − I_1)) φ((z_2√I_2 − z_1√I_1 − (I_2 − I_1)θ) / √(I_2 − I_1)) dz_2 dz_1

Expected value of β̂ after a two-stage design
At stage 1, z_1 is a truncated normal random variable with mean θ√I_1 and variance 1, and
∫_{−∞}^a (z_1/√I_1) φ(z_1 − θ√I_1) dz_1 = θ Φ(a − θ√I_1) − φ(a − θ√I_1)/√I_1
∫_b^∞ (z_1/√I_1) φ(z_1 − θ√I_1) dz_1 = θ (1 − Φ(b − θ√I_1)) + φ(b − θ√I_1)/√I_1

Expected value of β̂ after a two-stage design
Considering the double integral, we see that
∫ (z_2/√I_2) (√I_2/√(I_2 − I_1)) φ((z_2√I_2 − z_1√I_1 − (I_2 − I_1)θ) / √(I_2 − I_1)) dz_2
is simply the expected value of x/I_2, where x = z_2√I_2 is a normally distributed random variable with mean z_1√I_1 + (I_2 − I_1)θ and variance I_2 − I_1. Therefore,
∫ (z_2/√I_2) (√I_2/√(I_2 − I_1)) φ((z_2√I_2 − z_1√I_1 − (I_2 − I_1)θ) / √(I_2 − I_1)) dz_2 = (z_1√I_1 + (I_2 − I_1)θ)/I_2

Expected value of β̂ after a two-stage design
Therefore,
∫_a^b [(z_1√I_1 + (I_2 − I_1)θ)/I_2] φ(z_1 − θ√I_1) dz_1
= θ (I_1/I_2)(Φ(b − θ√I_1) − Φ(a − θ√I_1)) + (φ(a − θ√I_1) − φ(b − θ√I_1)) √I_1/I_2 + θ ((I_2 − I_1)/I_2)(Φ(b − θ√I_1) − Φ(a − θ√I_1))
= θ (Φ(b − θ√I_1) − Φ(a − θ√I_1)) + (φ(a − θ√I_1) − φ(b − θ√I_1)) √I_1/I_2

Expected value of β̂ after a two-stage design
Summing everything up, we get
E[β̂] = β_0 + θ Φ(a − θ√I_1) − φ(a − θ√I_1)/√I_1
+ θ (1 − Φ(b − θ√I_1)) + φ(b − θ√I_1)/√I_1
+ θ (Φ(b − θ√I_1) − Φ(a − θ√I_1)) + (φ(a − θ√I_1) − φ(b − θ√I_1)) √I_1/I_2
= β_0 + θ + (φ(b − θ√I_1) − φ(a − θ√I_1)) (1/√I_1 − √I_1/I_2)
= β + (φ(b − θ√I_1) − φ(a − θ√I_1)) (I_2 − I_1)/(√I_1 I_2)

Bias of β̂
From the previous slide, we see that the expected value of β̂ is
E[β̂] = β + (φ(b − θ√I_1) − φ(a − θ√I_1)) (I_2 − I_1)/(√I_1 I_2) = β + b(β)
where the bias b(β) depends on:
- β
- a and b
- I_1 and I_2
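The bias function can be coded directly and checked against simulation of the two-stage design. A minimal sketch; the values of β, a, b, I_1, and I_2 in the check are hypothetical illustration values:

```python
import numpy as np

def npdf(x):
    return np.exp(-0.5 * x * x) / np.sqrt(2.0 * np.pi)

def bias(beta, beta0, a, b, i1, i2):
    """b(beta) for a two-stage design with continuation region C_1 = (a, b)."""
    theta = beta - beta0
    return ((npdf(b - theta * np.sqrt(i1)) - npdf(a - theta * np.sqrt(i1)))
            * (i2 - i1) / (np.sqrt(i1) * i2))

# Monte Carlo check that E[beta_hat] = beta + b(beta)
rng = np.random.default_rng(2)
beta, beta0, a, b, i1, i2 = 0.8, 0.0, -2.8, 2.8, 1.6, 3.2
z1 = rng.normal((beta - beta0) * np.sqrt(i1), 1.0, 1_000_000)
y2 = rng.normal((beta - beta0) * (i2 - i1), np.sqrt(i2 - i1), 1_000_000)
z2 = (z1 * np.sqrt(i1) + y2) / np.sqrt(i2)      # stage-2 statistic via increments
stop1 = (z1 <= a) | (z1 >= b)                   # stop at stage 1 if outside C_1
beta_hat = beta0 + np.where(stop1, z1 / np.sqrt(i1), z2 / np.sqrt(i2))
print(beta_hat.mean() - beta, bias(beta, beta0, a, b, i1, i2))  # both ~ equal
```

The empirical mean bias of β̂ should agree with the analytic b(β) up to Monte Carlo error.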

Bias of β̂: Example
Consider a two-stage design with O'Brien-Fleming boundaries with α = 0.05:
a_1 = −2.80
b_1 = 2.80
I_1 = 1.6
I_2 = 3.2

Bias of ˆβ: Example bias 0.15 0.10 0.05 0.00 0.05 0.10 0.15 4 2 0 2 4 beta

Bias of β̂: Example
What if we double the information?
I_1 = 3.2
I_2 = 6.4

Bias of ˆβ: Example bias 0.10 0.05 0.00 0.05 0.10 4 2 0 2 4 beta

Bias of β̂: Example
Our first example considered symmetric bounds; in this case, the bias was naturally symmetric about 0. What if we use asymmetric bounds?
a_1 = 0.38
b_1 = 2.00
Information same as before.

Bias of ˆβ: Example bias 0.15 0.10 0.05 0.00 0.05 0.10 0.15 4 2 0 2 4 beta

Bias of β̂: Example
Again, doubling the information:
I_1 = 3.2
I_2 = 6.4

Bias of ˆβ: Example bias 0.10 0.05 0.00 0.05 0.10 4 2 0 2 4 beta

Bias of β̂: Summary
- Implementing a group sequential procedure results in substantial bias for β̂
- Bias is smallest at the extremes and in the middle of the continuation region, where the study either stops early or continues to full enrollment with high probability
- Bias is symmetric for symmetric bounds and asymmetric for asymmetric bounds

Correcting the Bias
We will consider two estimators for correcting the bias caused by a group sequential design:
- Whitehead's mean adjusted estimator
- The UMVUE suggested by Emerson and Fleming

Whitehead's Mean Adjusted Estimator
Whitehead's mean adjusted estimator, β̂_w, is defined such that
β̂ = β̂_w + b(β̂_w)
That is, Whitehead's mean adjusted estimator is the value of β whose expected estimate under the design equals the observed β̂.

Properties of Whitehead's Mean Adjusted Estimator
- β̂_w must be found by numerical search
- β̂_w is only bias adjusted and not unbiased
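For the two-stage bias function derived earlier, the numerical search is a one-dimensional root-find. A minimal bisection sketch; the observed β̂ and design values are hypothetical:

```python
import numpy as np

def npdf(x):
    return np.exp(-0.5 * x * x) / np.sqrt(2.0 * np.pi)

def bias(beta, beta0, a, b, i1, i2):
    theta = beta - beta0
    return ((npdf(b - theta * np.sqrt(i1)) - npdf(a - theta * np.sqrt(i1)))
            * (i2 - i1) / (np.sqrt(i1) * i2))

def whitehead(beta_hat, beta0, a, b, i1, i2, lo=-10.0, hi=10.0):
    """Solve beta_hat = w + b(w) for w by bisection.
    The map w -> w + b(w) is increasing here since |b'(w)| < 1."""
    f = lambda w: w + bias(w, beta0, a, b, i1, i2) - beta_hat
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

w = whitehead(0.85, 0.0, -2.8, 2.8, 1.6, 3.2)     # hypothetical observed beta_hat
print(w, w + bias(w, 0.0, -2.8, 2.8, 1.6, 3.2))   # second value recovers 0.85
```

When the bias at β̂ is positive, the adjusted estimate β̂_w is pulled back toward β_0, as expected.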

UMVUE
Emerson and Fleming proposed the UMVUE, defined as
β̂_umvue = E[β̂_1 | (T, Z_T)]
where β̂_1 is the estimate of β after stage 1. Note that β̂_1 is an unbiased estimator of β, and we find the UMVUE by the Rao-Blackwell technique.

Properties of the UMVUE
- β̂_umvue has the minimum variance among the class of unbiased estimators
- Unbiasedness is a restrictive property and the set of unbiased estimators is narrow
- This estimator has substantial variance and, in fact, has larger MSE than β̂_w

Estimating β
- Implementing a group sequential design dramatically impacts the sampling distribution of β̂
- This results in substantial bias in β̂, depending on the true value of β
- Unbiased or bias-reduced estimators have been proposed, but we need to be mindful of the bias-variance trade-off when evaluating these estimators