Sequential Bonferroni Methods for Multiple Hypothesis Testing with Strong Control of Familywise Error Rates I and II

Size: px
Start display at page:

Download "Sequential Bonferroni Methods for Multiple Hypothesis Testing with Strong Control of Familywise Error Rates I and II"

Transcription

1 Sequential Bonferroni Methods for Multiple Hypothesis Testing with Strong Control of Familywise Error Rates I and II Shyamal K. De and Michael Baron Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, Texas, USA Abstract: Sequential procedures are developed for simultaneous testing of multiple hypotheses in sequential experiments. Proposed stopping rules and decision rules achieve strong control of both familywise error rates I and II. The optimal procedure is sought that minimizes the expected sample size under these constraints. Bonferroni methods for multiple comparisons are extended to sequential setting and are shown to attain an approximately 50% reduction in the expected sample size compared with the earlier approaches. Asymptotically optimal procedures are derived under Pitman alternative. Keywords: Asymptotic optimality; Familywise error rate; Multiple comparisons; Pitman alternative; Sequential probability ratio test; Stopping rule. Subject Classifications: 62L, 62F03, 62F05, 62H INTRODUCTION The problem of multiple hypothesis testing often arises in sequential experiments such as sequential clinical trials with more than one endpoint see O Brien 1984, Pocock et al. 1987, Jennison and Turnbull 1993, Glaspy et al. 2001, and others, multichannel change-point detection Tartakovsky et al. 2003, Tartakovsky and Veeravalli 2004, acceptance sampling with different criteria of acceptance Baillie 1987, Hamilton and Lesperance 1991, etc. In such experiments, it is necessary to find a statistical answer to each posed question by testing each individual hypothesis instead of combining all the null hypotheses and giving one global answer to the resulting composite hypothesis. This paper develops sequential methods that result in accepting or rejecting each null hypothesis at the final stopping time. Non-sequential methods of multiple comparisons are already well developed. Proposed procedures include Bonferroni and Sidak testing, Holm and Hommel step-down methods controlling familywise error rate, Benjamini-Hochberg step-up and Guo-Sarkar method controlling false discovery rate, and others, see Sidak 1967, Holm 1979, Benjamini and Hochberg 1995, Benjamini et al. 2004, and Sarkar There is a growing demand for sequential analogues of these procedures. Furthermore, sequential tests of single hypotheses can be designed to control both probabilities of Type I and Type II errors. Generalizing to multiple testing, we develop sequential schemes to control both familywise error rates I and II in the strong sense. In the literature of sequential multiple comparisons, two types of testing problems are commonly understood as sequential multiple hypothesis testing. One of them is to compare effects θ 1,..., θ d of d > 1 Address correspondence to Shyamal K. De, Department of Mathematical Sciences, FO 35, The University of Texas at Dallas, 800 West Campbell Road, Richardson, TX , USA; Fax: ; shyamalkd@gmail.com 1

2 treatments with the goal of finding the best treatment, as in Edwards and Hsu 1983, Edwards 1987, Hughes 1993, Betensky 1996, and Jennison and Turnbull 2000, chap. 16. This involves testing of a composite null hypothesis H 0 : θ 1 = = θ d against H A : H 0 not true. The other type of problems is to classify sequentially observed data into one of the desired models, see Armitage 1950, Baum and Veeravalli 1994, Novikov 2009, and Darkhovsky In this case, the parameter of interest θ is classified into one of d alternative hypotheses, namely, H 1 : θ Θ 1 vs,, vs H d : θ Θ d. This paper deals with a different and more general problem of testing d individual hypotheses sequentially Problem Formulation Suppose a sequence of iid random vectors X n, n = 1, 2,...} is observed, where n is a sampled unit and X n = X n 1,..., X n d R d is a set of d measurements, possibly dependent, made on unit n. Let the j th measurement on each sampled unit has a marginal density f j x θ j with respect to a reference measure µ j, j = 1,..., d. While vectors X n are observed sequentially, the d tests are conducted simultaneously H 1 0 : θ 1 Θ 01 vs H 1 A : θ1 Θ A1, H d 0 : θ d Θ 0d vs H d A : θd Θ Ad. Denote the index set of true null hypotheses by J 0 1,..., d} and the set of false null hypotheses by J A = 1,..., d} \ J 0. Further, let ψ j = ψ j X n } equal 1 if H j 0 is rejected and equal 0 otherwise. Then the familywise error rate I FWER I J 0 = P ψ j > 0 J 0 j J0 is defined as the probability to reject at least one true null hypothesis and therefore, commit at least one Type I error, the familywise error rate II FWER II J A = P 1 ψ j > 0 J A, j J A is the probability of accepting at least one false null hypothesis and committing at least one Type II error, and the familywise power is the probability of rejecting all the false null hypotheses, FP = 1 FWER II e.g., Lee 2004, chap. 14. This paper develops stopping rules and multiple decision rules that control both FWER I and FWER II in the strong sense, i.e., FWER I = sup FWER I J 0 α and FWER II = sup FWER II J A β 1.2 J 0 J A for pre-specified levels α and β. This task alone is not difficult to accomplish. For example, one can conduct d separate sequential probability ratio tests, one for each null hypothesis, at levels α/d and β/d each. However, this leaves room for optimization. Our goal is to control both familywise error rates at the 2

3 minimum sampling cost. Namely, an optimization problem of minimizing the expected sample size ET under the strong control of FWER I and FWER II is investigated. A battery of one-sided tests H j 0 : θ j θ j 0 vs H j A : θj > θ j 0, j = 1,..., d}, is considered, which is typical for clinical trials of safety and efficacy. For example, in a case study of early detection of lung cancer Lange et al. 1994, chap. 15, 30,000 men at high risk of lung cancer were recruited sequentially over 12 years for testing three one-sided hypotheses whether intensive screening improves detectability of lung cancer, whether it reduces mortality due to lung cancer, and whether a surgical treatment is more efficient for the reduction of mortality rate than a nonsurgical treatment. In another study, a crossover trial of chronic respiratory diseases Pocock et al. 1987, Tang and Geller 1999 involved 17 patients with asthma or chronic obstructive airways disease to test any harmful effect of an inhaled drug based on four measures of the standard respiratory function Existing Approaches and Our Objective Sequential methods for simultaneous testing of multiple hypotheses have recently received a new wave of attention. Jennison and Turnbull 2000, chap. 15 develop group sequential methods of testing several outcome measures of a clinical trial. Their sampling strategy is to stop at the first time when one of the tests is accepted or rejected. Protecting FWER I conservatively, all the null hypotheses that are not rejected at the stopping time are accepted. While strongly controlling rate of falsely rejected null hypotheses, this testing scheme sacrifices power significantly. Tang and Geller 1999 introduce a closed testing procedure for group sequential clinical trials with multiple endpoints. Their method controls FWER I and yields power that is at least the power obtained by the Jennison-Turnbull procedure. In a recent work by Bartroff and Lai 2010, conventional methods of multiple testing such as Holm s step-down procedure and closed testing technique have been combined to develop a multistage step-down procedure that also controls FWER I. Their method does not assume any special structure of the hypotheses such as closedness or any correlation structure among different test statistics and can be applied to any group sequential, fully sequential, or truncated sequential experiments. All these methods mainly aim at controlling FWER I. To the best of our knowledge, no theory has been developed to control both FWER I and FWER II and also to design a sequential procedure that yields an optimal expected sampling cost under these constraints. The next Section extends Bonferroni methods for multiple comparisons to sequential experiments. Existing stopping rules are evaluated and modified to achieve lower familywise error rates or lower expected sampling cost. In Section 3, asymptotically optimal testing procedures are derived under Pitman alternative. Controlling FWER I and FWER II in the strong sense, they attain the asymptotically optimal rate of the expected sample size. Finite-sample performances of the proposed schemes are evaluated in a simulation study in Section 4. All lengthy proofs are moved to the Appendix. 2. SEQUENTIAL BONFERRONI METHODS In this section, we propose and compare four sequential approaches for testing d hypotheses controlling FWER I and FWER II that are based on the Bonferroni inequality. Assuming the Monotone Likelihood Ratio MLR property of the likelihood ratios and introducing a practical region of indifference θ j 0, θj A for parameter θj, the one-sided testing considered in Section 1 becomes equivalent to the test of H j 0 : θ j = θ j 0 vs H j A : θj = θ j A for all j = 1,..., d. 3

4 Following Neyman-Pearson arguments for the uniformly most powerful test and Wald s sequential probability ratio test, our methods are based on log-likelihood ratios Λ j n = n i=1 ln f jx j i θ j A f j X j i θ j 0 = n i=1 Z j i for j = 1,..., d. 2.1 With stopping boundaries a j and b j, after n observations, we reject H j 0 if Λ j n Λ j n b j, and continue sampling if Λ j n a j, accept H j 0 if b j, a j. The SPRT stopping time for the j th test is therefore T j = inf } n : Λ j n / b j, a j. 2.2 If Type I and Type II errors for the j th test are α j and β j, Wald s approximation gives the approximate stopping boundaries as 1 βj βj a j ln and b j ln. 2.3 α j 1 α j Now, consider any stopping time τ based on the likelihood ratios Λ j n and some stopping boundaries b j, a j such that Type I and Type II errors of the j th test are controlled at levels α j and β j respectively for j = 1,..., d. The strong control of the familywise error rates I and II at levels α and β follows from the Bonferroni inequality. That is, FWER I FWER II P Type I error on j th test J 0 = α j α, P Type II error on j th test J 0 = β j β for any index set J 0 of true null hypotheses Choice of Stopping and Decision Rules In this Subsection, we discuss stopping and decision rules that appeared in the sequential multiple testing literature and propose their improvements. One multiple decision sequential rule is proposed in Jennison and Turnbull 2000, chap Tmin rule. Jennison and Turnbull proposed a Bonferroni adjustment for the sequential testing of multiple hypotheses. According to their procedure, sampling is terminated as soon as at least one test statistics Λ j n leaves the continue-sampling region b j, a j. This stopping rule that we call Tmin is therefore T min = mint 1,..., T d, where stopping times T j are defined in 2.2. If Λ k n falls in its continue-sampling region at time T min for some k j, then H k 0 is considered accepted. For the stopping boundaries 2.3, the Tmin rule controls FWER I conservatively. However, due to an early stopping the Tmin rule fails to control FWER II and, hence, does not yield a high familywise power. In order to control both FWER I and II, sampling must continue beyond T min. 2. Incomplete T rule. For the control of both Type I and Type II error probabilities for testing each parameter θ j, we propose to continue sampling the j th coordinate at n = T j until Λ j n / b j, a j and 4

5 Accept H 1 Reject H 2 T int 5 Accept both a 2 b 2 Λ 2 n b 1 a 1 Reject both 0 Λ 1 n 5 T 5 T min Reject H 1 Accept H 2 Figure 1. Graphical Display of Tmin, T complete and incomplete, and intersection rule, d = 2. to reject H j 0 if and only if Λ j T j a j. Based on Wald s approximation 2.3, this rule yields approximate control of both Type I and Type II error probabilities at α j and β j for the j th test for each j = 1,..., d. This rule that we call incomplete T continues sampling until results for all tests are obtained, T = T 1,..., T d T min. It is incomplete because for all j with T j < T, only a portion of the available data X j 1,..., Xj T j is utilized to reach a decision. Sobel and Wald 1949 used a similar approach for a different sequential problem of classifying the unknown mean of a normal distribution into one of three regions. According to their rule, testing based on the j th likelihood ratio j = 1, 2 stops as soon as it leaves its continue-sampling region. Using the incomplete T rule, the number of used measurements taken on each sampling unit decreases in the course of this sequential scheme, although the cost of each unit e.g., patient is roughly the same. Typically, it is of no extra cost to ask patients to complete the entire questionnaire instead of a portion of it. Moreover, it may be important to include the entire new observed vectors because it may be necessary to retest some of the hypotheses. Indeed, if one or several test statistics Λ j n return to the continue-sampling region after crossing a stopping boundary once, and the corresponding hypothesis is retested based on a larger sample size n, it may further reduce the rate of false discoveries and false non-discoveries. Therefore, we propose an improvement of the incomplete T decision rule that is based on the same stopping time T but utilizes all the available data. 3. Complete T rule. As a plausible improvement over the incomplete T rule, we propose the complete T rule that continues sampling each coordinate of X and testing the corresponding hypothesis until time T. The sample size requirement for both complete and incomplete T rules are the same although the complete rule utilizes T T j additional observations for testing the j th hypothesis. Based on all the available data at time T, different versions of the terminal decision rules can be considered. Following the SPRT, one can accept or reject H j 0 depending on whether Λ j T is below the acceptance boundary b j or above the rejection boundary a j. However, some of the log-likelihood ratios may fall in the continue-sampling region b j, a j at T. An additional boundary c j [b j, a j ] has to be 5

6 pre-chosen to decide on the j th test. In Section 4, complete and incomplete T rules are compared by a simulation study; superiority of the complete rule is established with a certain choice of c j. Alternatively, we propose a randomized terminal decision rule that is based on a sufficient statistics for θ = θ 1,..., θ d at T unlike the incomplete T rule and show that it strongly controls FWER I and FWER II. Theorem 2.1. The randomized decision rule Dj that rejects Hj 0 at time T with probability p j = P Λ j T j a j Λ T, T strongly controls Type I and Type II error probabilities at levels α j and β j respectively. Proof. Let ψ j = I Λ j T j a j be the j th test function for the incomplete T decision rule D j and L θ j, D j = I θ j = θ j 0, reject Hj 0 + I θ j = θ j A, accept Hj 0 be the loss function for j th test. Also, note that Λ n = Λ 1 n,..., Λ d n is a sufficient statistic for θ for any n, and therefore, Λ T, T is a sufficient statistic Govindarajulu 1987, sect Hence, p j = E ψ j Λ T, T is free of θ. By the Rao-Blackwell theorem for randomized rules in Berger 1985, 36, we have EL θ j, Dj = EL θ j, D j. Since the the incomplete T decision rule Dj controls Type I and Type II error probabilities at levels α j and β j, the randomized rule Dj controls error probabilities at the same levels. 4. Intersection rule. To avoid sampling termination in the continue-sampling region of some coordinate and yet to use all coordinates of the already sampled random vectors, we propose the intersection rule. According to it, sampling continues until all log-likelihood ratios leave their respective continue-sampling regions, d T int = inf n : Λ j n / b j, a j. This is the first moment when SPRT-type decisions are obtained for all d tests simultaneously. Clearly, T int T a.s. The difference, however, is likely to be small, and it appears only when a log-likelihood ratio statistic for some j, having crossed the test boundary once, curved back into the continue-sampling region b j, a j. The following lemma gives sufficient conditions for T int to be a proper stopping time. Lemma 2.1. If for each j = 1..., d, the Kullback-Leibler information numbers } Kθ j, θ j = E θ j log f j X j θ j /f j X j θ j are strictly positive and finite for θ j θ j, then P T int < = 1. This lemma follows from the weak law of large numbers; see De and Baron 2012 for the complete proof. Consequently, T is also a proper stopping time. The strong control of FWER I and II by the intersection rule is shown in the following two statements. Lemma 2.2. Let τ be any stopping time with respect to σx 1,..., X n satisfying P Λ j τ b j, a j = 0 for a j > 0, b j < 0 for all j d. with the terminal decision rule of rejecting H j 0 if and only if Λ j τ a j. For such a test P Type I error on j th test = P Λ j τ P Type II error on j th test = P Λ j τ a j H j 0 e a j for all j = 1,..., d, b j H j A eb j for all j = 1,..., d. 6

7 The proof is given in De and Baron Corollary 2.1. With stopping boundaries a j = ln α j and b j = ln β j for j = 1,..., d, the intersection rule strongly controls FWER I and FWER II at levels α = α j and β = β j respectively. Proof. As a result of Lemma 2.2, Type I and Type II error probabilities are controlled at levels α j and β j for a j = ln α j and b j = ln β j. The corollary is established by applying the Bonferroni inequality and choosing α j, β j such that d α j α and d β j β. The three stopping times compared in this subsection are displayed in Fig. 1, based on a sample path of a two-dimensional log-likelihood ratio random walk. As seen in this figure, stopping time T min occurs when one of the coordinates of Λ n crosses the corresponding stopping boundary, T occurs when all the coordinates of Λ n have crossed their boundaries at least once, and finally, T int occurs when all the coordinates are beyond the boundaries simultaneously. In the case of T min and T int, the crossed boundaries determine the decision rules. For the data in Fig. 1, hypothesis H 1 0 is rejected at time T min, but later, both H 1 0 and H 2 0 are accepted at T int. At time T, the incomplete T rule rejects H 1 0 and accepts H 2 0 based on the first crossings. However, since Λ 1 T b 1, a 1 lands in the continue-sampling region, other decision rules can also be chosen, including the complete T rule with the decision boundary c j or the randomized rule D. 3. ASYMPTOTIC OPTIMALITY UNDER PITMAN ALTERNATIVE In this section, we give an asymptotically optimal solution to the problem considered in the previous sections. Namely, we find the optimal allocation of individual error probabilities α j and β j that yields the asymptotically lowest rate of the expected sample size subject to the constraints on familywise error rates I and II in the strong sense. Asymptotic analysis of the considered stopping times is based on the asymptotics of likelihood ratio and its expectation Kullback-Leibler information under Pitman alternative. The rigorous formulation of this optimization problem under Pitman alternative is as follows. Consider an array of sequences of observed random vectors X 11, X 21,, X i1 X 1ν, X 2ν,, X iν where X iν = X 1 iν,..., Xd iν. Observations X j iν in the νth row have a marginal density pmf or pdf f j. θ ν j for all j = 1,..., d. A battery of d simultaneous tests H j : θj ν = θ j 0 vs H j Aν : θj ν = θ j 0 + ϵ jν for j = 1,..., d, is considered, where min j d ϵ jν 0 as ν. The j th log-likelihood ratio statistic for the ν th row is then n Λ j n,ν = ln f j X j iν θj 0 + ϵ jν n i=1 f j X j iν = Z j θj i,ν for j = 1,..., d. 0 i=1 } Based on this statistic, let T jν = inf n : Λ j n,ϵ jν / B jν, A jν be the single-hypothesis Wald s SPRT stopping time for testing H j vs Hj Aν with stopping boundaries A jν = ln 1 β jν β jν and B jν = ln, α jν 1 α jν 7

8 where α jν and β jν are the individual Type I and Type II error probabilities allocated to the j th test. Also, let Tν = T 1ν,..., T dν be the corresponding stopping time for incomplete and complete T rules. In this section, we study the asymptotic behavior of stopping times T 1ν,..., T dν, and Tν as ν and also find the asymptotically optimal allocation of FWER I and FWER II among the d tests in order to minimize the weighted expected stopping time E π Tν = πj 0 E Tν J 0, 3.1 J 0 1,...,d} where πj 0 are weights or prior probabilities assigned to all possible combinations J 0 of true null hypotheses. This problem implies the asymptotic optimization of stopping boundaries B jν, A jν : j = 1,..., d}, or equivalently, the optimal allocation of individual error probabilities α jν, β jν : j = 1,..., d} among d tests. Asymptotic analysis presented here requires the following regularity conditions on the families of distributions f j. θ j : 1 j d }. Regularity Conditions Simplifying notations, we omit index j from θ j, H j, f j, µ j, etc., and consider the following problem. Suppose X 1, X 2,... are iid random variables with density f. θ. Consider testing of H 0 : θ = θ 0 vs H A : θ θ A, where θ A = θ 0 + ϵ. Define a log-likelihood ratio for one variable as Z ϵ = lnfx θ 0 + ϵ/fx θ 0 }. Assume that R1. The density f. θ is identifiable, i.e., θ θ implies P θ fx 1 θ fx 1 θ } > 0. R2. Identity fx θdx = 1 can be differentiated twice under the integral sign. R3. Derivatives l x, θ and l x, θ of log-likelihood lx, θ = ln fx θ are bounded by integrable functions in some neighborhood of θ 0. That is, there are functions L 1 x and L 2 x and δ > 0 such that θ θ 0 < δ implies E θ L j X 1 < for j = 1, 2, and l x, θ 2 L 1 x and l x, θ L 2 x µ-a.s. R4. The moment generating function Ms, ϵ = M Zϵ s = E exps Z ϵ } of Z ϵ exists for some s > 0, and for this s, 3 ϵ 3 Ms, ϵ is bounded for all ϵ within a neighborhood of 0. Conditions R1, R2, and the first part of R3 correspond to conditions A 0, RR3, and R of Borovkov 1997, chap. 2, 16, 26, and 31. Asymptotics of Log-likelihood Ratio Under regularity conditions R1-R4, the log-likelihood ratio Z ϵ is expanded as Z ϵ = ln fx θ 0 + ϵ ln fx θ 0 = ϵl X, θ 0 + ϵ2 2 l X, θ 0 + R 2,ϵ, by the Taylor theorem, where the Lagrange form of the remainder is R 2,ϵ = ϵ3 6 l X, θ and θ [θ0, θ A ]. 8

9 By regularity condition R3, E R 2,ϵ ϵ3 6 E L 2X = Oϵ 3. Therefore, under H 0, E θ0 Z ϵ = ϵe θ0 l X, θ 0 + ϵ2 2 E θ 0 l X, θ 0 + Oϵ 3 = ϵ2 2 Iθ 0 + Oϵ Under H A, using conditions R1-R4, [ E θa Z ϵ = E θa ln fx θ ] A ϵ fx θ A = ϵe θa l X, θ A ϵ2 2 E θ A l X, θ A + Oϵ 3 = ϵ2 2 Iθ A + Oϵ Next, E θ0 Z ϵ = Kθ 0, θ 0 + ϵ and E θa Z ϵ = Kθ A, θ A ϵ, where Kθ 0, θ A = ln fx θ 0 fx θ A fx θ 0dx is the Kullback-Leibler information distance between f. θ 0 and f. θ A. Thus, the first-order Taylor expansion of Z ϵ is Z ϵ = ϵl X, θ 0 + ϵ2 2 l X, θ 1 where θ 1 [θ 0, θ A ]. Therefore, for θ in ϵ-neighborhood of θ 0, E θ Z 2 ϵ = ϵ 2 E θ l X, θ ϵeθ l X, θ 0 l X, θ 1 + ϵ2 4 E θ l X, θ } 2 1 ϵ 2 E θ l X, θ 0 } 2 + ϵ E θ l X, θ 0 2 E θ l X, θ ϵ2 4 E θ L 1 X ϵ 2 E θ l X, θ Oϵ } c 1 ϵ for some constant c 1 > 0. The boundedness of E θ l X, θ 0 2 follows from condition R3 because E θ l X, θ 0 2 E θ l X, θ 2 + θ θ0 E θ l X, θ 2 Eθ l X, θ + ϵeθ l X, θ 2 E θ L1 X + ϵe θ L 1 X E θ L 1 X + ϵe θ L 2 1X, by the Jensen inequality. Let Λ n,ϵ = n i=1 Z i,ϵ be the log-likelihood ratio of n variables. Note that EΛ 2 n,ϵ = n i=1 EZ 2 i,ϵ + i j nc 1 ϵ 2 + nn 1 EZ ϵ 2 nϵ 2 c 1 + n 1kϵ 2, EZ i,ϵ EZ j,ϵ for some positive constant k. The last inequality holds as 3.2 and 3.3 hold. Therefore, EΛ 2 n,ϵ cnϵ 2, 3.5 9

10 for any c c 1 +n 1kϵ 2. Inequalities 3.2, 3.3, 3.4, and 3.5 are frequently used to prove asymptotic results in the later sections. The last auxiliary result establishes the upper bounds for Chernoff entropy ρ ϵ = inf s E e sz ϵ under both null and alternative hypotheses. Its proof is given in the Appendix. Lemma 3.1. There exist positive constant ϵ 0, K 0, and K A such that for all ϵ ϵ 0, i inf E θ 0 e sz ϵ 1 K 0 ϵ 2, 0<s<1 ii inf E θ 1<s<0 A e sz ϵ 1 KA ϵ 2. Asymptotics of SPRT Stopping Time Next, we derive the asymptotic expressions for the single-hypothesis SPRT stopping time T SPRT ϵ = inf n Λ n,ϵ / B, A} for testing H 0 : θ = θ 0 vs H A : θ θ 0 + ϵ with stopping boundaries A and B. If Type I and Type II error probabilities are given as α and β respectively, Wald s approximation for the expected stopping time gives under H 0, and under H A, α ln E θ0 Tϵ SPRT 1 β α + 1 α ln β 1 α, 3.6 Kθ 0, θ 0 + ϵ 1 β ln 1 β E θa Tϵ SPRT α + β ln β 1 α. 3.7 Kθ A, θ A ϵ If the underlying density f satisfies regularity conditions R1-R4, inequalities 3.2 and 3.3 yield, for ϵ = θ A θ 0, Kθ 0, θ A = ϵ2 2 Iθ 0 + Oϵ 3 and Kθ A, θ 0 = ϵ2 2 Iθ A + Oϵ 3 = ϵ2 2 Iθ 0 + Oϵ Therefore, as ϵ 0 under H 0, and under H A, α ln E θ0 Tϵ SPRT 1 β ln E θa Tϵ SPRT 1 β α + 1 α ln β 1 α ϵ 2 Iθ 0 /2 1 β α + β ln β 1 α ϵ 2 Iθ 0 /2 1 + o1 = k 0T SPRT ϵ ϵ o1, o1 = k AT SPRT ϵ ϵ o As seen from 3.9 and 3.10, under both null and alternative hypotheses, ETϵ SPRT as ϵ 0. The next lemma establishes that the stopping time Tϵ SPRT converges to in probability at the rate of ϵ 2. Lemma 3.2. Under regularity conditions R1-R4, 1 P gϵ ϵ2 Tϵ SPRT gϵ 1 as ϵ 0, for any function gϵ such that gϵ as ϵ 0. Corollary 3.1. ϵ 2 T SPRT ϵ is bounded away in probability from 0 and as ϵ 0. The proofs of Lemma 3.2 and Corollary 3.1 are given in Appendix. 10

11 Asymptotics of Weighted Expected Stopping Time T From 3.9 and 3.10, we have E θj T jν 1 ϵ 2 jν for θ j θ j 0, θj A }, as ν, if α jν and β jν are bounded away from 0 for j = 1,..., d. The following lemma provides a lower bound for weighted expected T defined in 3.1. Lemma 3.3. If α jν and β jν are bounded away from 0, there exist constants k j such that } E w Tν j d Proof. Since Tν T jν a.s. for all j d, E w Tν = π J 0 E J0 Tν where J 0 1,...,d} = π j E H j is the marginal prior probability of H j 0 k j ϵ 2 jν 1 + o1. J 0 1,...,d} π J 0 E J0 T jν T jν + 1 π j E j H T jν, 3.11 Aν π j = E w Tν 1 j d J 0 1,...,d}, j J 0 π J 0. Maximizing 3.11 for j = 1,..., d, π j E j H T jν + 1 π j E j H T jν Aν } Next, from 3.9 and 3.10, since α jν and β jν are bounded away from 0, there exist constant k 0j and k Aj such that E j H T jν = k 0j 1 + o1 and E ϵ 2 j jν H T jν = k Aj 1 + o1. The proof is completed by ϵ Aν 2 jν substituting these expressions into As ν, E w Tν converges to at least at the rate of j d ϵ 2 jν. A natural question arises about the best distribution of error probabilities α j and β j, α j = α, β j = β, among d tests. Is equal error spending optimal? If not, is there an asymptotically optimal choice of error probabilities α jν, β jν such that E w Tν attains its lower bound? The answers are given in the following two subsections Case 1: One Test More Difficult than Others In general, by the difficulty of a test we understand the closeness between the tested null and alternative hypotheses that can be measured by Kullback-Leibler distance between the corresponding distributions. In view of 3.8, an equivalent measure is the absolute difference ϵ between the null and alternative parameters. Let the d th test be the most difficult test in order if ϵ dν ϵ jν, that is, ϵ dν = oϵ jν for all j = 1,..., d 1. Suppose the complete T rule rejects H j if and only if Λj Tν,ν c j for some fixed boundary c j [ln 1 β j α j, ln β j 1 α j ]. Define Type I and II error probabilities of j th test at Tν as α j T ν = P Λ j T ν c j H j and βj T ν = P Λ j T < c ν j H j Aν. The following theorem proves that if the d th test is the most difficult in order, then FWER I and FWER II are controlled asymptotically at levels α d and β d respectively. The proof of this theorem is given in Appendix. 11

12 Theorem 3.1. If ϵ dν = oϵ jν for all j = 1,..., d 1, as ν, then α j T ν α d T ν = o1 and β j Tν = o1 for j = 1,..., d 1, = α d + o1 and β d Tν = β d + o1. According to Theorem 3.1, all error probabilities of less difficult tests converge to zero whereas for the most difficult test, they converge to positive constants α d and β d. Hence, the familywise error rates are controlled asymptotically at levels α d and β d instead of α and β. For this reason, equal allocation of error probabilities is not optimal. Indeed, if equal errors α/d and β/d are allocated to all d tests, FWER I and II are controlled at levels lower than required, which has to be at the cost of larger sample size. The following theorem establishes that an optimal Type I and Type II error probability allocation distributes the error probabilities mostly to the most difficult test. Such allocation of error probabilities leads to asymptotically efficient sequential tests that yield asymptotically optimal rate of the expected sample size. Theorem 3.2. Let ϵ dν ϵ jν for j < d, and α ν 0, β ν 0 are chosen so that ln w j α ν = o ϵjν ϵ dν 2 and ln w j β ν = o for w j > 0, d 1 1 w j = 1, and j = 1,..., d 1. Then, error probabilities α dν = α α ν, β dν = β β ν, and, α jν = w j α ν, β jν = w j β ν for j < d, 2 ϵjν are asymptotically optimal as E w Tν attains the asymptotically lower bound E w Tν = k d 1 ϵ 2 + o dν ϵ 2, dν for a constant k d > 0. Proof. The error constraints are satisfied, ϵ dν α jν = α and β jν = β. Using 3.9, we have for all j = 1,..., d 1, ϵ 2 dν E H j T jν = ϵdν ϵ jν 2 α jν ln 1 βjν α jν + 1 α jν ln Iθ j 0 /2 + O ϵ jν βjν 1 α jν 0, as ν, because ϵdν ϵ jν 2 ln βjν 0 from the choice of β jν. Similarly, ϵ 2 dν E H j T jν 0, Aν as ν for j = 1,..., d 1. Thus E w T jν = π j E H j 1 T jν + 1 π j E j H T jν = o Aν ϵ 2 dν for j = 1,..., d 1. 12

13 Using 3.9 and 3.10 for testing H d 0 vs H d, we have where k d = π d α ln E w T dν = π d E H d A T dν + 1 π d E d H T dν = k d Aν ϵ 2 dν 1 β α + 1 α ln β 1 α Iθ d 0 /2 + 1 π d 1 β ln A lower bound for E w Tν is given by } E w Tν π d E d H T dν + 1 π d E d H T dν = k d Aν ϵ 2 dν and an upper bound for E w Tν is obtained as Hence, E w Tν = k d ϵ 2 dν E w Tν + o 1. ϵ 2 dν 3.2. Case 2: Tests of Same Order of Difficulty E w T jν = k d ϵ 2 dν 1 + o ϵ 2. dν 1 + o ϵ 2, dν 1 β α + β ln β 1 α Iθ d 0 / o ϵ 2, dν Here, we consider a situation when all tests have a common order of difficulty, that is, ϵ jν = r j ϵ ν for j = 1,..., d, where r 1,..., r d are constants, and ϵ ν 0 as ν. The following theorem proves that one should not allocate an error probability that is too large or too small to any particular test in this case. Theorem 3.3. An asymptotically optimal allocation of error probabilities α jν, β jν is such that α jν and β jν are bounded away from 0 for all j = 1,..., d and ν = 1, 2,..., and α jν = α, β jν = β. Then k j 1 j d rj 2 + o1 ϵ 2 νe w Tν k j r 2 j + o1, where k j = π j k 0 T jν + 1 π j k A T jν, functions k 0 and k A being defined in 3.9 and Proof. An upper bound for E w Tν is obtained as E w Tν } π j E j H T jν + 1 π j E j H T jν = Aν k j r 2 j ϵ2 ν 1 + o1 }, as α jν, β jν are bounded away from 0. Also, from Lemma 3.3, a lower bound for E w Tν is } E w Tν j d k j r 2 j ϵ2 ν 1 + o1. 13

14 Thus k j 1 j d rj 2 + o1 ϵ 2 νe w Tν k j r 2 j + o1. Theorems 3.2 and 3.3 establish the asymptotically optimal error spending strategy among the d tests that attains the smallest rate of the expected sample size under the specified constraints on FWER I and II. 4. SIMULATION STUDY Sequential procedures for testing multiple hypotheses proposed above and in the literature are compared in this section. To cover all the interesting cases, simultaneous testing of several Normal means µ j and Bernoulli proportions p j is considered under a strong control of FWER I and FWER II at levels α = 0.05 and β = 0.1 respectively. Comparison of Six Schemes with Uniform Error Spending Table 1 compares performance of six multiple testing procedures including the non-sequential Holm method from Holm 1979, the multi-stage step-down procedure from Bartroff and Lai 2010, Tmin, incomplete T, complete T, and intersection rules for the following multiple testing scenarios. A sequence of random vectors X 1, X 2,... R 3 is observed, where X i = X 1 i, X 2 i, X 3 i, X j i Nµ j, 1 for j = 1, 2, and X 3 i Bernoullip. Three tests about parameters µ 1,2 and p are conducted, H 1 : µ 1 = 0 vs µ 1 = 0.5, H 2 : µ 2 = 0 vs µ 2 = 0.5, and H 3 : p = 0.5 vs p = 0.75, with the uniform error spending α j, β j = α/d, β/d, j = 1, 2, 3, and stopping boundaries b j, a j = ln β j, ln α j for the log-likelihood ratio statistics. We consider the following multiple testing scenarios. Multiple testing 1: H 1 0, H2 0, and H3 0 are true, and the coordinates of X i are independent. Multiple testing 2: H 1 0, H2 0, and H3 A are true, and the coordinates of X i are independent. Multiple testing 3a: H 1 0, H2 A, and H3 A are true, and the coordinates of X i are independent. Multiple testing 3b: H 1 0, H2 A, and H3 A are true, and the two normal components X1 i and X 2 i have correlation coefficient For each situation, the estimated actual FWER I and II are presented as percentage, along with the estimated expected sample size ÊT and its standard error. Results are based on 55,000 simulated sequences. Discussion of Results in Table 1 In all the multiple testing scenarios, our proposed incomplete T, complete T, and intersection rule yield uniformly lower error rates than non-sequential Holm and Bartroff-Lai s multi-stage procedure with approximately 50% lower expected sample size. For complete T rule, the decision boundary c j = a j + b j /2 is used for all j = 1,..., d. As expected, complete T rule yields lower error rates than incomplete T rule while using the same sample size. Tmin rule controls FWER I conservatively and certainly saves on a sample size, however, it is at the expense of a high FWER II leading to a significant reduction of familywise power. The estimated expected sample size and FWER I for the Bartroff-Lai procedure are adopted from Bartroff and Lai

15 Table 1. Comparison of six multiple testing schemes for 3 tests. Multiple Tests Scheme FWER I % FWER II % ÊT S.E. of ÊT non-sequential Holm 4 NA 105 NA multiple testing 1 Bartroff-Lai 4.8 NA NA Tmin 0.86 NA incomplete T 3.77 NA complete T 3.37 NA intersection 2.24 NA non-sequential Holm 4.4 NA 105 NA multiple testing 2 Bartroff-Lai 4.2 NA 98.3 NA Tmin incomplete T complete T intersection non-sequential Holm 4.3 NA 105 NA multiple testing 3a Bartroff-Lai 3.2 NA 92.3 NA Tmin incomplete T complete T intersection non-sequential Holm 4.5 NA 105 NA multiple testing 3b Bartroff-Lai 3.6 NA 94.2 NA Tmin incomplete T complete T intersection Asymptotically Optimal Error Spending Consider now the asymptotic setup where one or more parameters under the alternative hypotheses are close to their corresponding null values. Again, parameters µ j and p j of Nµ j, 1 and Bernoullip j distribution are tested with stopping boundaries b j, a j = ln β j, ln α j. The following testing scenarios are compared. In 4a, 4b, 5a, and 5b, one test is more difficult than the others. Ignoring this, tests 4a and 5a continue to use the uniform error allocation, which is no longer optimal, as shown in Section 3. Conversely, tests 4b and 5b spend the error probabilities according to Theorem 3.2, allocating most of them to the most difficult test. In test 6, all four tests are equally difficult, so uniform error allocation is optimal by Theorem 3.3. Multiple testing 4a: Test a battery of d = 4 hypotheses: H 1 : µ 1 = 0 vs µ 1 = 0.1, H 2 : µ 2 = 0 vs µ 2 = 0.5, and H j : p j = 0.5 vs p j = 0.75 for j = 3, 4 with uniform error spending α j, β j = α/d, β/d for j = 1,..., 4. The set of true null hypotheses is J 0 = 1, 3}. Multiple testing 4b: Tests in 4a are conducted with Type I error probabilities 0.04, 0.004, 0.003, and Type II error probabilities 0.08, 0.008, 0.006, 0.006, allocating most of the errors to the most difficult test and ensuring 4 α j = 0.05 and 4 β j = 0.1. Multiple testing 5a: Same tests as in 4a, with J 0 = 3, 4}, i.e., H 3 0 and H 4 0 are true. Multiple testing 5b: Same tests as in 4b, with J 0 = 3, 4}. Multiple testing 6: Test H j : µ j = 0 vs µ j = 0.1 for j = 1,..., 4, with uniform error spending 15

16 α j, β j = α/4, β/4, and J 0 = 3, 4}. For each multiple testing procedure, Table 2 contains the estimates of FWER I and FWER II in percentages, expected stopping time ÊT, expected SPRT stopping times ÊT j for marginal tests H j, j = 1,..., 4, and marginal Type I and Type II error probabilities ˆα j and ˆβ j in percentages. Results are based on 55,000 simulated sequences. Discussion of Results in Table 2 Proposed schemes control FWER I and FWER II at 5% and 10% levels respectively. In all the testing scenarios, incomplete and complete T rule require the same sample size, but the complete T rule yields lower FWER I and FWER II than the incomplete T rule. This reduction in error rates is due to utilizing the full sample for each test and using a sufficient statistic based on T samples. Compared to the incomplete and complete T rule, the intersection rule reduces both error rates even further, but requires greater expected sample size. Equal allocation of error probabilities α/4, β/4 on each test of multiple testing 4a yields much larger expected sample size 731 compared to the expected sample size 478 obtained in multiple testing 4b. This reduction in expected sample size is due to allocating large portions of error probabilities 0.04 and 0.08 to the first test which is the most difficult in this case. Similarly, expected sample size reduces from 850 in multiple testing 5a to 571 in multiple testing 5b as most of the error probabilities are allocated to the most difficult test. These simulation results support Theorem APPENDIX Proof of Lemma 3.1 Taylor expansion of M 0 s, ϵ = E θ0 e sz ϵ yields M 0 s, ϵ = M 0 s, 0 + ϵm 0 s, 0 + ϵ2 2 M 0 s, 0 + R 02,ϵ, 5.1 where M 0 s, 0 = 1 and R 02,ϵ = M 0 s, ϵ ϵ3 6 = Oϵ3 for some 0 ϵ ϵ because M 0 s, ϵ is bounded by regularity condition R4. Differentiating M 0 s, ϵ, obtain [ ϵ E θ 0 e sz ϵ fx θ0 + ϵ s ] [ ] fx θ0 + ϵ s 1 f X θ 0 + ϵ = Eθ0 = E θ0 s. ϵ fx θ 0 fx θ 0 fx θ 0 [ ] Then, from R2, M 0 s, 0 = se f X θ 0 θ0 fx θ 0 = 0. Differentiating M 0 s, ϵ again at ϵ = 0, f M X θ s, 0 = ss 1E θ0 + se θ0 fx θ 0 Therefore 5.1 becomes M 0 s, ϵ = 1 s1 siθ 0 ϵ2 2 + Oϵ3. There exists ϵ 1 such that for all ϵ ϵ 1 and for fixed s 0, 1, f X θ 0 = s1 siθ 0. fx θ 0 s1 siθ 0 + Oϵ Hence, for any ϵ ϵ 1 and K 0 = s1 siθ 0 /4, s1 s Iθ 0. 2 inf M 0s, ϵ 1 K 0 ϵ 2. 0<s<1 16

17 Table 2. Comparing expected sample size and error rates obtained from the three schemes when some of the four tests are difficult. Multiple Tests Scheme FWER I % FWER II % ÊT ÊT 1,..., ÊT 4 ˆαj% ˆβj % incomplete T ˆα1, ˆα3 = 1.16, 1.06 ˆβ 2, ˆβ 4 = 1.89, 1.8 multiple testing 4a 730.7, 36.4, 27.2, 33.7 complete T ˆα1, ˆα3 = 1.16, ˆβ 2, ˆβ 4 = 0.005, Intersection ˆα1, ˆα3 = 1.16, 0 ˆβ 2, ˆβ 4 = 0, incomplete T ˆα1, ˆα3 = 3.54, 0.23 ˆβ 2, ˆβ 4 = 0.64, 0.46 multiple testing 4b 477.1, 46.2, 37.4, 45.5 complete T ˆα1, ˆα3 = 3.54, 0.01 ˆβ 2, ˆβ 4 = 0.04, Intersection ˆα1, ˆα3 = 3.53, ˆβ 2, ˆβ 4 = 0.009, incomplete T ˆα3, ˆα4 = 1.06, 1.07 ˆβ 1, ˆβ 2 = 2.36, 1.88 multiple testing 5a 850.1, 36.4, 27.2, 27.2 complete T ˆα3, ˆα4 = 0, ˆβ 1, ˆβ 2 = 2.36, 0 Intersection ˆα3, ˆα4 = 0, 0 ˆβ 1, ˆβ 2 = 2.36, 0 incomplete T ˆα3, ˆα4 = 0.23, 0.27 ˆβ 1, ˆβ 2 = 7.39, 0.64 multiple testing 5b 570.9, 46.2, 37.4, 37.3 complete T ˆα3, ˆα4 = 0.002, ˆβ 1, ˆβ 2 = 7.39, Intersection ˆα3, ˆα4 = 0, 0 ˆβ 1, ˆβ 2 = 7.37, 0 incomplete T ˆα3, ˆα4 = 1.14, 1.12 ˆβ 1, ˆβ 2 = 2.17, 2.3 multiple testing , 848.3, 729.3, complete T ˆα3, ˆα4 = 0.93, 0.91 ˆβ 1, ˆβ 2 = 1.62, 1.74 Intersection ˆα3, ˆα4 = 0.48, 0.45 ˆβ 1, ˆβ 2 = 0.83,

18 This proves i of Lemma 3.1. Similarly to 5.1, Taylor expansion of M A s, ϵ = E θa e sz ϵ yields M A s, ϵ = M A s, 0 + ϵm A s, 0 + ϵ2 2 M A s, 0 + R A2,ϵ, 5.2 where M A s, 0 = 1 and R A2,ϵ = M A s, ϵ ϵ3 6 = Oϵ3 for some 0 ϵ ϵ, because M A s, ϵ is bounded R4. Along the same steps, we derive M A s, 0 = 0 and M A s, 0 = ss + 1Iθ A. Therefore, 5.2 becomes M A s, ϵ = 1 ϵ2 2 [1 + s siθ A + Oϵ]. There exists ϵ 2 such that for all ϵ ϵ 2 and for fixed s 1, 0, 1 + s siθ A + Oϵ Hence, for any ϵ ϵ 2 and K A = 1 + s siθ A /4, 1 + s s Iθ A. 2 inf M As, ϵ 1 K A ϵ 2. 1<s<0 Both i and ii of Lemma 3.1 hold for all ϵ ϵ 0 = minϵ 1, ϵ 2. This completes the proof. Proof of Lemma 3.2 For an arbitrary ϵ > 0, 1 P gϵ ϵ2 Tϵ SPRT gϵ = 1 P ϵ 2 Tϵ SPRT 1 P ϵ 2 Tϵ SPRT > gϵ. 5.3 gϵ For all i = 1, 2,..., using 3.2 and 3.3, µ ϵ = E Z i,ϵ kϵ 2 for some positive constant k. Let A 0 = mina, B. Then P ϵ 2 Tϵ SPRT 1 = P Tϵ SPRT 1 gϵ ϵ 2 gϵ P P 1 A 0 1 n 1 ϵ 2 gϵ 1 n 1 ϵ 2 gϵ k gϵ 2 Λ n,ϵ A 0 Λ n,ϵ nµ ϵ A 0 k gϵ 1 ϵ 2 gϵ n=1 EZ n,ϵ µ ϵ The last inequality in 5.4 follows from the Kolmogorov imal inequality for independent random variables e.g., Billingsley From 3.4, EZ n,ϵ µ ϵ 2 c 1 ϵ 2 for some c 1 > 0. Therefore, 5.4 implies P ϵ 2 Tϵ SPRT 1 c 1 /gϵ gϵ 2 = o1 as ϵ A 0 k/gϵ 18

19 From the Markov inequality, P ϵ 2 T SPRT ϵ ϵ > gϵ ϵ2 ET SPRT. gϵ Then, using 3.9 and 3.10, ϵ 2 ETϵ SPRT = O1 as ϵ 0. Hence, The proof is completed by applying 5.5 and 5.6 in 5.3. Proof of Corollary 3.1 c 1 δ P ϵ 2 T SPRT ϵ gϵ = o1 as ϵ Consider δ such that A 0 < 1. Such a choice of δ exists because the quadratic form in δ has discriminant kδ 2 c A 1 0kc 1 > 0. Substituting δ for gϵ in the inequality 5.5, obtain and therefore, Thus, ϵ 2 T SPRT ϵ P ϵ 2 T SPRT ϵ P ϵ 2 T SPRT ϵ δ > δ 1 c 1 δ A 0 kδ 2, c 1 δ A 0 kδ 2 > 0. is bounded away from 0 in probability as ϵ 0. Next, applying the Markov inequality, lim λ Hence, ϵ 2 T SPRT ϵ = O P 1 as ϵ 0. lim sup P ϵ 2 Tϵ SPRT > λ lim lim sup ϵ 0 λ ϵ 0 Proof of Theorem 3.1 Let c d = A d = ln 1 βd α d, and B d = ln βd 1 α d. Consider α d T = P ν d H = P H d Λ d Tν Λ d Tν,ν A d, Tν = T dν + P d H,ν A d + o1 = α d + o1, ϵ 2 ETϵ SPRT = 0. λ Λ d Tν,ν A d, T ν T dν because P Tν T dν = o1. The last equality is subject to Wald s approximation ignoring the overshoot for the d th test. Similarly, β d T = P ν d H Aν = P H d Aν Λ d Tν Λ d Tν,ν < A d, Tν = T dν + P d H Aν,ν < A d + o1 = P d H Λ d Aν ν T Λ d Tν,ν B d,ν < A d, Tν T dν + o1 = β d + o1. 19

20 Then, for any sequence of positive integers g ν as ν such that g ν = o ϵ 2 dν and j<d ϵ 2 jν = o g ν, P Tν < g ν P T dν < g ν P Λ d 1 m g ν m,ν A 0d where A 0d = min A d, B d g ν E Z m,ν d 2 A0d Cϵ 2 dν g 2 using 5.4 ν m=1 K d ϵ 2 dν g ν A0d Cϵ 2 dν g ν 2 = o 1, where the last inequality follows from 3.4. Here, C and K d are some positive constants. For all j = 1, 2,..., d 1, α j T P ν j Λ j H T ν,ν c j, Tν g ν + P j H Tν < g ν for any 0 s 1. Note that P H j = P H j m g ν m i=1 e s m g ν e s m i=1 Zj i,ν } Z j i,ν c j m i=1 Zj i,ν m=n e sc j + o1 + o1 is a nonnegative supermartingale. Applying Theorem 1 in Lake 2000, which is analogous to the Doob s imal inequality for nonnegative submartingales, P j H e s m i=1 Zj i,ν e sc j e sc j E j m g ν H e s gν i=1 Z j i,ν An upper bound of α j T ν is obtained by taking infimum over all s, inf e s gν i=1 Z j i,ν α j T ν 0 s 1 e sc j E j H inf = E 0 s 1 H j inf E 0 s 1 H j e s gν i=1 Z j i,ν + o 1 e szj i,ν } gν + o 1 + o 1 1 L j ϵ 2 jν} gν + o 1 for some large ν = o 1. The last inequality is obtained by using Lemma 3.1. A similar approach proves that for all j < d, β j ν 0 as ν. T ACKNOWLEDGEMENTS This research is supported by the National Science Foundation grant DMS Research of the second author is partially supported by the National Security Agency grant H We thank the associate editor and professor Mukhopadhyay for their help and attention to our work. 20

21 REFERENCES Armitage, P Sequential Analysis with More Than Two Alternative Hypotheses, and Its Relation to Discriminant Function Analysis, Journal of Royal Statistical Society, Series B 12: Baillie, D. H Multivariate Acceptance Sampling Some Applications to Defence Procurement, Statistician, 36: Bartroff, J. and Lai, T. L Multistage Tests of Multiple Hypotheses, Communications in Statistics- Theory and Methods, 39: Baum, C. W. and Veeravalli, V. V A Sequential Procedure for Multihypotheses Testing, IEEE Transactions on Information Theory, 40: Benjamini, Y. and Hochberg, Y Controlling the False Discovery Rate: a Practical and Powerful Approach to Multiple Testing, Journal of Royal Statistical Society, Series B 57: Benjamini, Y., Bertz, F. and Sarkar, S Recent Developments in Multiple Comparison Procedures, IMS Lecture Notes - Monograph Series, Beachwood, Ohio: Institute of Mathematical Statistics. Berger, J. O Statistical Decision Theory, New York: Springer-Verlag. Betensky, R. A An O Brien-Fleming Sequential Trial for Comparing Three Treatments, Annals of Statistics, 24: Billingsley, P Probability and Measure, New York: Wiley. Borovkov, A. A Matematicheskaya Statistika Mathematical Statistics, Novosibirsk: Nauka. Darkhovsky, B Optimal Sequential Tests for Testing Two Composite and Multiple Simple Hypotheses, Sequential Analysis, 30: De, S. K. and Baron, M Step-up and Step-down Methods for Testing Multiple Hypotheses in Sequential Experiments, Journal of Statistical Planning and Inference, DOI: /j.jspi Edwards, D Extended-Paulson Sequential Selection, Annals of Statistics, 15: Edwards, D. G. and Hsu, J. C Multiple Comparisons with the Best Treatment, Journal of American Statistical Association, 78: Glaspy, J., Jadeja, J. S., Justice, G., Kessler, J., Richards, D., Schwartzberg, L., Rigas, J., Kuter, D., Harmon, D., Prow, D., Demetri, G., Gordon, D., Arseneau, J., Saven, A., Hynes, H., Boccia, R., O Byrne, J., and Colowick, A. B A Dose-Finding and Safety Study of Novel Erythropoiesis Stimulating Protein NESP for the Treatment of Anemia in Patients Receiving Multicycle Chemotherapy, British Journal of Cancer, 84: Govindarajulu, Z The Sequential Statistical Analysis of Hypothesis Testing, Point and Interval Estimation, and Decision Theory, Columbus, Ohio: American Sciences Press. Hamilton, D. C. and Lesperance, M. L A Consulting Problem Involving Bivariate Acceptance Sampling by Variables, Canadian Journal of Statistics, 19: Holm, S A Simple Sequentially Rejective Multiple Test Procedure, Scandinavian Journal of Statistics, 6: Hughes, M. D Stopping Guidelines for Clinical Trials with Multiple Treatments, Statistics in Medicine, 12: Jennison, C. and Turnbull, B. W Group Sequential Tests for Bivariate Response: Interim Analyses of Clinical Trials with Both Efficacy and Safety Endpoints, Biometrics, 49: Jennison, C. and Turnbull, B. W Group Sequential Methods with Applications to Clinical Trials, Boca Raton: Chapman & Hall. Lake, D. E Minimum Chernoff Entropy and Exponential Bounds for Locating Changes, IEEE Transactions on Information Theory, 46:

22 Lange, N., Ryan, L., Billard, L., Brillinger, D., Conquest, L., and Greenhouse, J Case Studies in Biometry, New York: Wiley. Lee, M.-L. T Analysis of Microarray Gene Expression Data, Boston: Kluwer Academic Publishers. Novikov, A Optimal Sequential Multiple Hypothesis Testing, Kybernetika, 45: O Brien, P. C Procedures for Comparing Samples with Multiple Endpoints, Biometrics, 40: Pocock, S. J., Geller, N. L., and Tsiatis, A. A The Analysis of Multiple Endpoints in Clinical Trials, Biometrics, 43: Sarkar, S. K Step-up Procedures Controlling Generalized FWER and Generalized FDR, Annals of Statistics, 35: Sidak, Z Rectangular Confidence Regions for the Means of Multivariate Normal Distributions, Journal of American Statistical Association, 62: Sobel, M. and Wald, A A Sequential Decision Procedure for Choosing One of Three Hypotheses Concerning the Unknown Mean of a Normal Distribution, Annals of Mathematical Statistics, 20: Tang, D.-I. and Geller, N. L Closed Testing Procedures for Group Sequential Clinical Trials with Multiple Endpoints, Biometrics, 55: Tartakovsky, A. G. and Veeravalli, V. V Change-Point Detection in Multichannel and Distributed Systems with Applications, In N. Mukhopadhyay, S. Datta and S. Chattopadhyay, Eds., Applications of Sequential Methodologies, pages , New York: Marcel Dekker, Inc. Tartakovsky, A. G., Li, X. R., and Yaralov, G Sequential Detection of Targets in Multichannel Systems, IEEE Transactions on Information Theory, 49:

Multiple Testing in Group Sequential Clinical Trials

Multiple Testing in Group Sequential Clinical Trials Multiple Testing in Group Sequential Clinical Trials Tian Zhao Supervisor: Michael Baron Department of Mathematical Sciences University of Texas at Dallas txz122@utdallas.edu 7/2/213 1 Sequential statistics

More information

arxiv: v2 [stat.me] 6 Sep 2013

arxiv: v2 [stat.me] 6 Sep 2013 Sequential Tests of Multiple Hypotheses Controlling Type I and II Familywise Error Rates arxiv:1304.6309v2 [stat.me] 6 Sep 2013 Jay Bartroff and Jinlin Song Department of Mathematics, University of Southern

More information

PRINCIPLES OF OPTIMAL SEQUENTIAL PLANNING

PRINCIPLES OF OPTIMAL SEQUENTIAL PLANNING C. Schmegner and M. Baron. Principles of optimal sequential planning. Sequential Analysis, 23(1), 11 32, 2004. Abraham Wald Memorial Issue PRINCIPLES OF OPTIMAL SEQUENTIAL PLANNING Claudia Schmegner 1

More information

Control of Directional Errors in Fixed Sequence Multiple Testing

Control of Directional Errors in Fixed Sequence Multiple Testing Control of Directional Errors in Fixed Sequence Multiple Testing Anjana Grandhi Department of Mathematical Sciences New Jersey Institute of Technology Newark, NJ 07102-1982 Wenge Guo Department of Mathematical

More information

Sequential tests controlling generalized familywise error rates

Sequential tests controlling generalized familywise error rates Sequential tests controlling generalized familywise error rates Shyamal K. De a, Michael Baron b,c, a School of Mathematical Sciences, National Institute of Science Education and Research, Bhubaneswar,

More information

Multistage Tests of Multiple Hypotheses

Multistage Tests of Multiple Hypotheses Communications in Statistics Theory and Methods, 39: 1597 167, 21 Copyright Taylor & Francis Group, LLC ISSN: 361-926 print/1532-415x online DOI: 1.18/3619282592852 Multistage Tests of Multiple Hypotheses

More information

Testing a secondary endpoint after a group sequential test. Chris Jennison. 9th Annual Adaptive Designs in Clinical Trials

Testing a secondary endpoint after a group sequential test. Chris Jennison. 9th Annual Adaptive Designs in Clinical Trials Testing a secondary endpoint after a group sequential test Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj 9th Annual Adaptive Designs in

More information

MULTISTAGE AND MIXTURE PARALLEL GATEKEEPING PROCEDURES IN CLINICAL TRIALS

MULTISTAGE AND MIXTURE PARALLEL GATEKEEPING PROCEDURES IN CLINICAL TRIALS Journal of Biopharmaceutical Statistics, 21: 726 747, 2011 Copyright Taylor & Francis Group, LLC ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543406.2011.551333 MULTISTAGE AND MIXTURE PARALLEL

More information

Sequential Plans and Risk Evaluation

Sequential Plans and Risk Evaluation C. Schmegner and M. Baron. Sequential Plans and Risk Evaluation. Sequential Analysis, 26(4), 335 354, 2007. Sequential Plans and Risk Evaluation Claudia Schmegner Department of Mathematical Sciences, DePaul

More information

Discussion on Change-Points: From Sequential Detection to Biology and Back by David Siegmund

Discussion on Change-Points: From Sequential Detection to Biology and Back by David Siegmund This article was downloaded by: [Michael Baron] On: 2 February 213, At: 21:2 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954 Registered office: Mortimer

More information

arxiv: v2 [math.st] 20 Jul 2016

arxiv: v2 [math.st] 20 Jul 2016 arxiv:1603.02791v2 [math.st] 20 Jul 2016 Asymptotically optimal, sequential, multiple testing procedures with prior information on the number of signals Y. Song and G. Fellouris Department of Statistics,

More information

CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity

CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity Prof. Kevin E. Thorpe Dept. of Public Health Sciences University of Toronto Objectives 1. Be able to distinguish among the various

More information

Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process

Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Applied Mathematical Sciences, Vol. 4, 2010, no. 62, 3083-3093 Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Julia Bondarenko Helmut-Schmidt University Hamburg University

More information

Step-down FDR Procedures for Large Numbers of Hypotheses

Step-down FDR Procedures for Large Numbers of Hypotheses Step-down FDR Procedures for Large Numbers of Hypotheses Paul N. Somerville University of Central Florida Abstract. Somerville (2004b) developed FDR step-down procedures which were particularly appropriate

More information

The Design of Group Sequential Clinical Trials that Test Multiple Endpoints

The Design of Group Sequential Clinical Trials that Test Multiple Endpoints The Design of Group Sequential Clinical Trials that Test Multiple Endpoints Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Bruce Turnbull

More information

A Gatekeeping Test on a Primary and a Secondary Endpoint in a Group Sequential Design with Multiple Interim Looks

A Gatekeeping Test on a Primary and a Secondary Endpoint in a Group Sequential Design with Multiple Interim Looks A Gatekeeping Test in a Group Sequential Design 1 A Gatekeeping Test on a Primary and a Secondary Endpoint in a Group Sequential Design with Multiple Interim Looks Ajit C. Tamhane Department of Industrial

More information

Applying the Benjamini Hochberg procedure to a set of generalized p-values

Applying the Benjamini Hochberg procedure to a set of generalized p-values U.U.D.M. Report 20:22 Applying the Benjamini Hochberg procedure to a set of generalized p-values Fredrik Jonsson Department of Mathematics Uppsala University Applying the Benjamini Hochberg procedure

More information

Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box Durham, NC 27708, USA

Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box Durham, NC 27708, USA Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box 90251 Durham, NC 27708, USA Summary: Pre-experimental Frequentist error probabilities do not summarize

More information

Statistical Inference

Statistical Inference Statistical Inference Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Week 12. Testing and Kullback-Leibler Divergence 1. Likelihood Ratios Let 1, 2, 2,...

More information

The Design of a Survival Study

The Design of a Survival Study The Design of a Survival Study The design of survival studies are usually based on the logrank test, and sometimes assumes the exponential distribution. As in standard designs, the power depends on The

More information

SAMPLE SIZE RE-ESTIMATION FOR ADAPTIVE SEQUENTIAL DESIGN IN CLINICAL TRIALS

SAMPLE SIZE RE-ESTIMATION FOR ADAPTIVE SEQUENTIAL DESIGN IN CLINICAL TRIALS Journal of Biopharmaceutical Statistics, 18: 1184 1196, 2008 Copyright Taylor & Francis Group, LLC ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543400802369053 SAMPLE SIZE RE-ESTIMATION FOR ADAPTIVE

More information

Statistica Sinica Preprint No: SS R1

Statistica Sinica Preprint No: SS R1 Statistica Sinica Preprint No: SS-2017-0072.R1 Title Control of Directional Errors in Fixed Sequence Multiple Testing Manuscript ID SS-2017-0072.R1 URL http://www.stat.sinica.edu.tw/statistica/ DOI 10.5705/ss.202017.0072

More information

Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing Statistics Journal Club, 36-825 Beau Dabbs and Philipp Burckhardt 9-19-2014 1 Paper

More information

Superchain Procedures in Clinical Trials. George Kordzakhia FDA, CDER, Office of Biostatistics Alex Dmitrienko Quintiles Innovation

Superchain Procedures in Clinical Trials. George Kordzakhia FDA, CDER, Office of Biostatistics Alex Dmitrienko Quintiles Innovation August 01, 2012 Disclaimer: This presentation reflects the views of the author and should not be construed to represent the views or policies of the U.S. Food and Drug Administration Introduction We describe

More information

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Ann Inst Stat Math (2009) 61:773 787 DOI 10.1007/s10463-008-0172-6 Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Taisuke Otsu Received: 1 June 2007 / Revised:

More information

STAT 461/561- Assignments, Year 2015

STAT 461/561- Assignments, Year 2015 STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and

More information

High-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018

High-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018 High-Throughput Sequencing Course Multiple Testing Biostatistics and Bioinformatics Summer 2018 Introduction You have previously considered the significance of a single gene Introduction You have previously

More information

Comparing Adaptive Designs and the. Classical Group Sequential Approach. to Clinical Trial Design

Comparing Adaptive Designs and the. Classical Group Sequential Approach. to Clinical Trial Design Comparing Adaptive Designs and the Classical Group Sequential Approach to Clinical Trial Design Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj

More information

Type I error rate control in adaptive designs for confirmatory clinical trials with treatment selection at interim

Type I error rate control in adaptive designs for confirmatory clinical trials with treatment selection at interim Type I error rate control in adaptive designs for confirmatory clinical trials with treatment selection at interim Frank Bretz Statistical Methodology, Novartis Joint work with Martin Posch (Medical University

More information

INTRODUCTION TO INTERSECTION-UNION TESTS

INTRODUCTION TO INTERSECTION-UNION TESTS INTRODUCTION TO INTERSECTION-UNION TESTS Jimmy A. Doi, Cal Poly State University San Luis Obispo Department of Statistics (jdoi@calpoly.edu Key Words: Intersection-Union Tests; Multiple Comparisons; Acceptance

More information

Modified Simes Critical Values Under Positive Dependence

Modified Simes Critical Values Under Positive Dependence Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

On Generalized Fixed Sequence Procedures for Controlling the FWER

On Generalized Fixed Sequence Procedures for Controlling the FWER Research Article Received XXXX (www.interscience.wiley.com) DOI: 10.1002/sim.0000 On Generalized Fixed Sequence Procedures for Controlling the FWER Zhiying Qiu, a Wenge Guo b and Gavin Lynch c Testing

More information

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping : Decision Theory, Dynamic Programming and Optimal Stopping Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj InSPiRe Conference on Methodology

More information

Group Sequential Designs: Theory, Computation and Optimisation

Group Sequential Designs: Theory, Computation and Optimisation Group Sequential Designs: Theory, Computation and Optimisation Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj 8th International Conference

More information

Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming. and Optimal Stopping

Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming. and Optimal Stopping Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming and Optimal Stopping Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj

More information

Group-Sequential Tests for One Proportion in a Fleming Design

Group-Sequential Tests for One Proportion in a Fleming Design Chapter 126 Group-Sequential Tests for One Proportion in a Fleming Design Introduction This procedure computes power and sample size for the single-arm group-sequential (multiple-stage) designs of Fleming

More information

Hypothesis Test. The opposite of the null hypothesis, called an alternative hypothesis, becomes

Hypothesis Test. The opposite of the null hypothesis, called an alternative hypothesis, becomes Neyman-Pearson paradigm. Suppose that a researcher is interested in whether the new drug works. The process of determining whether the outcome of the experiment points to yes or no is called hypothesis

More information

A Mixture Gatekeeping Procedure Based on the Hommel Test for Clinical Trial Applications

A Mixture Gatekeeping Procedure Based on the Hommel Test for Clinical Trial Applications A Mixture Gatekeeping Procedure Based on the Hommel Test for Clinical Trial Applications Thomas Brechenmacher (Dainippon Sumitomo Pharma Co., Ltd.) Jane Xu (Sunovion Pharmaceuticals Inc.) Alex Dmitrienko

More information

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology Group Sequential Tests for Delayed Responses Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Lisa Hampson Department of Mathematics and Statistics,

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

Adaptive Extensions of a Two-Stage Group Sequential Procedure for Testing a Primary and a Secondary Endpoint (II): Sample Size Re-estimation

Adaptive Extensions of a Two-Stage Group Sequential Procedure for Testing a Primary and a Secondary Endpoint (II): Sample Size Re-estimation Research Article Received XXXX (www.interscience.wiley.com) DOI: 10.100/sim.0000 Adaptive Extensions of a Two-Stage Group Sequential Procedure for Testing a Primary and a Secondary Endpoint (II): Sample

More information

Familywise Error Rate Controlling Procedures for Discrete Data

Familywise Error Rate Controlling Procedures for Discrete Data Familywise Error Rate Controlling Procedures for Discrete Data arxiv:1711.08147v1 [stat.me] 22 Nov 2017 Yalin Zhu Center for Mathematical Sciences, Merck & Co., Inc., West Point, PA, U.S.A. Wenge Guo Department

More information

Multiple Dependent Hypothesis Tests in Geographically Weighted Regression

Multiple Dependent Hypothesis Tests in Geographically Weighted Regression Multiple Dependent Hypothesis Tests in Geographically Weighted Regression Graeme Byrne 1, Martin Charlton 2, and Stewart Fotheringham 3 1 La Trobe University, Bendigo, Victoria Austrlaia Telephone: +61

More information

A Brief Introduction to Intersection-Union Tests. Jimmy Akira Doi. North Carolina State University Department of Statistics

A Brief Introduction to Intersection-Union Tests. Jimmy Akira Doi. North Carolina State University Department of Statistics Introduction A Brief Introduction to Intersection-Union Tests Often, the quality of a product is determined by several parameters. The product is determined to be acceptable if each of the parameters meets

More information

Mixtures of multiple testing procedures for gatekeeping applications in clinical trials

Mixtures of multiple testing procedures for gatekeeping applications in clinical trials Research Article Received 29 January 2010, Accepted 26 May 2010 Published online 18 April 2011 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.4008 Mixtures of multiple testing procedures

More information

Adaptive Designs: Why, How and When?

Adaptive Designs: Why, How and When? Adaptive Designs: Why, How and When? Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj ISBS Conference Shanghai, July 2008 1 Adaptive designs:

More information

Introduction to Bayesian Statistics

Introduction to Bayesian Statistics Bayesian Parameter Estimation Introduction to Bayesian Statistics Harvey Thornburg Center for Computer Research in Music and Acoustics (CCRMA) Department of Music, Stanford University Stanford, California

More information

Dose-response modeling with bivariate binary data under model uncertainty

Dose-response modeling with bivariate binary data under model uncertainty Dose-response modeling with bivariate binary data under model uncertainty Bernhard Klingenberg 1 1 Department of Mathematics and Statistics, Williams College, Williamstown, MA, 01267 and Institute of Statistics,

More information

Power assessment in group sequential design with multiple biomarker subgroups for multiplicity problem

Power assessment in group sequential design with multiple biomarker subgroups for multiplicity problem Power assessment in group sequential design with multiple biomarker subgroups for multiplicity problem Lei Yang, Ph.D. Statistical Scientist, Roche (China) Holding Ltd. Aug 30 th 2018, Shanghai Jiao Tong

More information

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided Let us first identify some classes of hypotheses. simple versus simple H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided H 0 : θ θ 0 versus H 1 : θ > θ 0. (2) two-sided; null on extremes H 0 : θ θ 1 or

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

More information

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function

More information

Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs. Christopher Jennison

Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs. Christopher Jennison Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs Christopher Jennison Department of Mathematical Sciences, University of Bath http://people.bath.ac.uk/mascj

More information

Optimal SPRT and CUSUM Procedures using Compressed Limit Gauges

Optimal SPRT and CUSUM Procedures using Compressed Limit Gauges Optimal SPRT and CUSUM Procedures using Compressed Limit Gauges P. Lee Geyer Stefan H. Steiner 1 Faculty of Business McMaster University Hamilton, Ontario L8S 4M4 Canada Dept. of Statistics and Actuarial

More information

Research Article Degenerate-Generalized Likelihood Ratio Test for One-Sided Composite Hypotheses

Research Article Degenerate-Generalized Likelihood Ratio Test for One-Sided Composite Hypotheses Mathematical Problems in Engineering Volume 2012, Article ID 538342, 11 pages doi:10.1155/2012/538342 Research Article Degenerate-Generalized Likelihood Ratio Test for One-Sided Composite Hypotheses Dongdong

More information

Pubh 8482: Sequential Analysis

Pubh 8482: Sequential Analysis Pubh 8482: Sequential Analysis Joseph S. Koopmeiners Division of Biostatistics University of Minnesota Week 10 Class Summary Last time... We began our discussion of adaptive clinical trials Specifically,

More information

On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses

On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses Gavin Lynch Catchpoint Systems, Inc., 228 Park Ave S 28080 New York, NY 10003, U.S.A. Wenge Guo Department of Mathematical

More information

Some General Types of Tests

Some General Types of Tests Some General Types of Tests We may not be able to find a UMP or UMPU test in a given situation. In that case, we may use test of some general class of tests that often have good asymptotic properties.

More information

Ch. 5 Hypothesis Testing

Ch. 5 Hypothesis Testing Ch. 5 Hypothesis Testing The current framework of hypothesis testing is largely due to the work of Neyman and Pearson in the late 1920s, early 30s, complementing Fisher s work on estimation. As in estimation,

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

40.530: Statistics. Professor Chen Zehua. Singapore University of Design and Technology

40.530: Statistics. Professor Chen Zehua. Singapore University of Design and Technology Singapore University of Design and Technology Lecture 9: Hypothesis testing, uniformly most powerful tests. The Neyman-Pearson framework Let P be the family of distributions of concern. The Neyman-Pearson

More information

Adaptive, graph based multiple testing procedures and a uniform improvement of Bonferroni type tests.

Adaptive, graph based multiple testing procedures and a uniform improvement of Bonferroni type tests. 1/35 Adaptive, graph based multiple testing procedures and a uniform improvement of Bonferroni type tests. Martin Posch Center for Medical Statistics, Informatics and Intelligent Systems Medical University

More information

Multiple Testing. Anjana Grandhi. BARDS, Merck Research Laboratories. Rahway, NJ Wenge Guo. Department of Mathematical Sciences

Multiple Testing. Anjana Grandhi. BARDS, Merck Research Laboratories. Rahway, NJ Wenge Guo. Department of Mathematical Sciences Control of Directional Errors in Fixed Sequence arxiv:1602.02345v2 [math.st] 18 Mar 2017 Multiple Testing Anjana Grandhi BARDS, Merck Research Laboratories Rahway, NJ 07065 Wenge Guo Department of Mathematical

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

On the Inefficiency of the Adaptive Design for Monitoring Clinical Trials

On the Inefficiency of the Adaptive Design for Monitoring Clinical Trials On the Inefficiency of the Adaptive Design for Monitoring Clinical Trials Anastasios A. Tsiatis and Cyrus Mehta http://www.stat.ncsu.edu/ tsiatis/ Inefficiency of Adaptive Designs 1 OUTLINE OF TOPICS Hypothesis

More information

2 Mathematical Model, Sequential Probability Ratio Test, Distortions

2 Mathematical Model, Sequential Probability Ratio Test, Distortions AUSTRIAN JOURNAL OF STATISTICS Volume 34 (2005), Number 2, 153 162 Robust Sequential Testing of Hypotheses on Discrete Probability Distributions Alexey Kharin and Dzmitry Kishylau Belarusian State University,

More information

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring A. Ganguly, S. Mitra, D. Samanta, D. Kundu,2 Abstract Epstein [9] introduced the Type-I hybrid censoring scheme

More information

Test Volume 11, Number 1. June 2002

Test Volume 11, Number 1. June 2002 Sociedad Española de Estadística e Investigación Operativa Test Volume 11, Number 1. June 2002 Optimal confidence sets for testing average bioequivalence Yu-Ling Tseng Department of Applied Math Dong Hwa

More information

A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data

A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data Faming Liang, Chuanhai Liu, and Naisyin Wang Texas A&M University Multiple Hypothesis Testing Introduction

More information

When enough is enough: early stopping of biometrics error rate testing

When enough is enough: early stopping of biometrics error rate testing When enough is enough: early stopping of biometrics error rate testing Michael E. Schuckers Department of Mathematics, Computer Science and Statistics St. Lawrence University and Center for Identification

More information

Adaptive designs beyond p-value combination methods. Ekkehard Glimm, Novartis Pharma EAST user group meeting Basel, 31 May 2013

Adaptive designs beyond p-value combination methods. Ekkehard Glimm, Novartis Pharma EAST user group meeting Basel, 31 May 2013 Adaptive designs beyond p-value combination methods Ekkehard Glimm, Novartis Pharma EAST user group meeting Basel, 31 May 2013 Outline Introduction Combination-p-value method and conditional error function

More information

Controlling Bayes Directional False Discovery Rate in Random Effects Model 1

Controlling Bayes Directional False Discovery Rate in Random Effects Model 1 Controlling Bayes Directional False Discovery Rate in Random Effects Model 1 Sanat K. Sarkar a, Tianhui Zhou b a Temple University, Philadelphia, PA 19122, USA b Wyeth Pharmaceuticals, Collegeville, PA

More information

Multiple Testing. Hoang Tran. Department of Statistics, Florida State University

Multiple Testing. Hoang Tran. Department of Statistics, Florida State University Multiple Testing Hoang Tran Department of Statistics, Florida State University Large-Scale Testing Examples: Microarray data: testing differences in gene expression between two traits/conditions Microbiome

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

A Type of Sample Size Planning for Mean Comparison in Clinical Trials

A Type of Sample Size Planning for Mean Comparison in Clinical Trials Journal of Data Science 13(2015), 115-126 A Type of Sample Size Planning for Mean Comparison in Clinical Trials Junfeng Liu 1 and Dipak K. Dey 2 1 GCE Solutions, Inc. 2 Department of Statistics, University

More information

On weighted Hochberg procedures

On weighted Hochberg procedures Biometrika (2008), 95, 2,pp. 279 294 C 2008 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn018 On weighted Hochberg procedures BY AJIT C. TAMHANE Department of Industrial Engineering

More information

Multiple Endpoints: A Review and New. Developments. Ajit C. Tamhane. (Joint work with Brent R. Logan) Department of IE/MS and Statistics

Multiple Endpoints: A Review and New. Developments. Ajit C. Tamhane. (Joint work with Brent R. Logan) Department of IE/MS and Statistics 1 Multiple Endpoints: A Review and New Developments Ajit C. Tamhane (Joint work with Brent R. Logan) Department of IE/MS and Statistics Northwestern University Evanston, IL 60208 ajit@iems.northwestern.edu

More information

Chapter 4. Theory of Tests. 4.1 Introduction

Chapter 4. Theory of Tests. 4.1 Introduction Chapter 4 Theory of Tests 4.1 Introduction Parametric model: (X, B X, P θ ), P θ P = {P θ θ Θ} where Θ = H 0 +H 1 X = K +A : K: critical region = rejection region / A: acceptance region A decision rule

More information

Hochberg Multiple Test Procedure Under Negative Dependence

Hochberg Multiple Test Procedure Under Negative Dependence Hochberg Multiple Test Procedure Under Negative Dependence Ajit C. Tamhane Northwestern University Joint work with Jiangtao Gou (Northwestern University) IMPACT Symposium, Cary (NC), November 20, 2014

More information

Alpha-Investing. Sequential Control of Expected False Discoveries

Alpha-Investing. Sequential Control of Expected False Discoveries Alpha-Investing Sequential Control of Expected False Discoveries Dean Foster Bob Stine Department of Statistics Wharton School of the University of Pennsylvania www-stat.wharton.upenn.edu/ stine Joint

More information

Multistage Methodologies for Partitioning a Set of Exponential. populations.

Multistage Methodologies for Partitioning a Set of Exponential. populations. Multistage Methodologies for Partitioning a Set of Exponential Populations Department of Mathematics, University of New Orleans, 2000 Lakefront, New Orleans, LA 70148, USA tsolanky@uno.edu Tumulesh K.

More information

Sequential Testing in Reliability and Validity Studies With Repeated Measurements per Subject

Sequential Testing in Reliability and Validity Studies With Repeated Measurements per Subject http://ijsp.ccsenet.org International Journal of Statistics and Probability Vol. 8, No. 1; 019 Sequential Testing in Reliability and Validity Studies With Repeated Measurements per Subject Steven B. Kim

More information

On adaptive procedures controlling the familywise error rate

On adaptive procedures controlling the familywise error rate , pp. 3 On adaptive procedures controlling the familywise error rate By SANAT K. SARKAR Temple University, Philadelphia, PA 922, USA sanat@temple.edu Summary This paper considers the problem of developing

More information

FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES

FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES Sanat K. Sarkar a a Department of Statistics, Temple University, Speakman Hall (006-00), Philadelphia, PA 19122, USA Abstract The concept

More information

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo PROCEDURES CONTROLLING THE k-fdr USING BIVARIATE DISTRIBUTIONS OF THE NULL p-values Sanat K. Sarkar and Wenge Guo Temple University and National Institute of Environmental Health Sciences Abstract: Procedures

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano, 02LEu1 ttd ~Lt~S Testing Statistical Hypotheses Third Edition With 6 Illustrations ~Springer 2 The Probability Background 28 2.1 Probability and Measure 28 2.2 Integration.........

More information

The International Journal of Biostatistics

The International Journal of Biostatistics The International Journal of Biostatistics Volume 7, Issue 1 2011 Article 12 Consonance and the Closure Method in Multiple Testing Joseph P. Romano, Stanford University Azeem Shaikh, University of Chicago

More information

Finding Critical Values with Prefixed Early. Stopping Boundaries and Controlled Type I. Error for A Two-Stage Adaptive Design

Finding Critical Values with Prefixed Early. Stopping Boundaries and Controlled Type I. Error for A Two-Stage Adaptive Design Finding Critical Values with Prefixed Early Stopping Boundaries and Controlled Type I Error for A Two-Stage Adaptive Design Jingjing Chen 1, Sanat K. Sarkar 2, and Frank Bretz 3 September 27, 2009 1 ClinForce-GSK

More information

Adaptive designs to maximize power. trials with multiple treatments

Adaptive designs to maximize power. trials with multiple treatments in clinical trials with multiple treatments Technion - Israel Institute of Technology January 17, 2013 The problem A, B, C are three treatments with unknown probabilities of success, p A, p B, p C. A -

More information

Accuracy and Decision Time for Decentralized Implementations of the Sequential Probability Ratio Test

Accuracy and Decision Time for Decentralized Implementations of the Sequential Probability Ratio Test 21 American Control Conference Marriott Waterfront, Baltimore, MD, USA June 3-July 2, 21 ThA1.3 Accuracy Decision Time for Decentralized Implementations of the Sequential Probability Ratio Test Sra Hala

More information

The SEQDESIGN Procedure

The SEQDESIGN Procedure SAS/STAT 9.2 User s Guide, Second Edition The SEQDESIGN Procedure (Book Excerpt) This document is an individual chapter from the SAS/STAT 9.2 User s Guide, Second Edition. The correct bibliographic citation

More information

QED. Queen s Economics Department Working Paper No Hypothesis Testing for Arbitrary Bounds. Jeffrey Penney Queen s University

QED. Queen s Economics Department Working Paper No Hypothesis Testing for Arbitrary Bounds. Jeffrey Penney Queen s University QED Queen s Economics Department Working Paper No. 1319 Hypothesis Testing for Arbitrary Bounds Jeffrey Penney Queen s University Department of Economics Queen s University 94 University Avenue Kingston,

More information

Statistical Models and Algorithms for Real-Time Anomaly Detection Using Multi-Modal Data

Statistical Models and Algorithms for Real-Time Anomaly Detection Using Multi-Modal Data Statistical Models and Algorithms for Real-Time Anomaly Detection Using Multi-Modal Data Taposh Banerjee University of Texas at San Antonio Joint work with Gene Whipps (US Army Research Laboratory) Prudhvi

More information

BEST TESTS. Abstract. We will discuss the Neymann-Pearson theorem and certain best test where the power function is optimized.

BEST TESTS. Abstract. We will discuss the Neymann-Pearson theorem and certain best test where the power function is optimized. BEST TESTS Abstract. We will discuss the Neymann-Pearson theorem and certain best test where the power function is optimized. 1. Most powerful test Let {f θ } θ Θ be a family of pdfs. We will consider

More information

Topic 3: Hypothesis Testing

Topic 3: Hypothesis Testing CS 8850: Advanced Machine Learning Fall 07 Topic 3: Hypothesis Testing Instructor: Daniel L. Pimentel-Alarcón c Copyright 07 3. Introduction One of the simplest inference problems is that of deciding between

More information

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1, Economics 520 Lecture Note 9: Hypothesis Testing via the Neyman-Pearson Lemma CB 8., 8.3.-8.3.3 Uniformly Most Powerful Tests and the Neyman-Pearson Lemma Let s return to the hypothesis testing problem

More information

Lecture 21: October 19

Lecture 21: October 19 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use

More information

ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE. By Wenge Guo and M. Bhaskara Rao

ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE. By Wenge Guo and M. Bhaskara Rao ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE By Wenge Guo and M. Bhaskara Rao National Institute of Environmental Health Sciences and University of Cincinnati A classical approach for dealing

More information

Lecture 13: p-values and union intersection tests

Lecture 13: p-values and union intersection tests Lecture 13: p-values and union intersection tests p-values After a hypothesis test is done, one method of reporting the result is to report the size α of the test used to reject H 0 or accept H 0. If α

More information