SIMULTANEOUS CONFIDENCE BOUNDS WITH APPLICATIONS TO DRUG STABILITY STUDIES. Xiaojing Lu. A Dissertation


SIMULTANEOUS CONFIDENCE BOUNDS WITH APPLICATIONS TO DRUG STABILITY STUDIES

Xiaojing Lu

A Dissertation Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

August 2006

Committee: John Chen (Advisor), Sub Ramakrishnan (Graduate Faculty Representative), Gabor Szekely, Truc Nguyen

ABSTRACT

John T. Chen, Advisor

The focus of this research was to develop simultaneous confidence bounds for all contrasts of several regression lines with a constrained explanatory variable. The pioneering work of Spurrier provided a set of simultaneous confidence bounds for exact inference on all contrasts of several simple linear regression lines over the entire range $(-\infty, \infty)$, using the same $n$ design points. In many applications, however, the explanatory variable is confined to an interval smaller than $(-\infty, \infty)$. Spurrier clearly stated (JASA, 1999) that the inference problem becomes much more complicated when the explanatory variable is bounded to a given interval. Wei Liu et al. (JASA, 2004) investigated this issue but were unable to solve the problem exactly; instead, they were obliged to rely on simulation-based methods that produce approximate probability points for simultaneous comparisons. A noted criticism of their method is that the results are not exact and the simulations must be repeated for each application.

In this research, a set of simultaneous confidence bounds for all contrasts of several linear regression lines was constructed for the case where the explanatory variable is restricted to a fixed interval $[-x, x]$, where $x$ is a predetermined constant. These results greatly improve those of Spurrier, since restricting the explanatory variable to a smaller interval results in narrower confidence bounds. Further, since the methods of this research are exact, they are superior to the earlier work of Wei Liu et al.

A significant area of this research concerned a certain statistic that plays a crucial role in constructing confidence bounds with a constrained explanatory variable, and a pivotal

quantity that aids in the discovery of critical values for determining the confidence bounds. The pivotal quantity is the maximum value of an associated function, and the statistic is the cut-off point at which the function is maximized. It is of primary importance to find a closed-form expression for the pivotal quantity and to derive its exact distribution. In this research, both of these problems were solved. In addition, the exact distribution of the statistic was found to be standard Cauchy; remarkably, the statistic was also shown to be independent of the pivotal quantity. These results shed surprising new light on long-standing problems in biostatistics.

Applications of this method to drug stability studies were examined. When multiple batches of a drug product are manufactured, it is desirable to pool data from different batches to obtain a single shelf-life for all batches. This research provided a new pooling method that was demonstrated to be more versatile and efficient than the existing pooling procedures.

ACKNOWLEDGMENTS

I want to express my sincere appreciation to my advisor, Dr. John T. Chen, for bringing me to this fertile area of statistics, for his patience and encouragement during the preparation of the manuscript, and for his excellent guidance and support through the completion of this research. I would like to thank all of the members of my committee, Dr. Gabor Szekely, Dr. Truc Nguyen, and Dr. Sub Ramakrishnan, for their advice and helpful comments. I have additionally benefited from insightful suggestions by Dr. Hanfeng Chen. I would also like to extend my gratitude to the staff in the Department of Mathematics and Statistics at Bowling Green State University for their assistance; they include Cyndi Patterson, Marcia Seubert, and Mary Busdeker. Finally, I owe very special thanks to my husband, G. Jay Kerns, who patiently proofread my manuscript, gave many useful comments, and supported me in my educational pursuit.

TABLE OF CONTENTS

CHAPTER 1: INTRODUCTION TO MULTIPLE COMPARISONS
1.1 Introduction
1.2 Simultaneous Statistical Inference
1.3 Multiple Comparison Inference
1.3.1 Multiple Comparison With a Control
1.3.2 Multiple Comparison With the Best
1.3.3 All-pairwise Comparisons
1.3.4 All-contrast Comparisons
1.4 An Introduction to Multiple Comparisons in the General Linear Model
1.5 Stepwise Hypothesis Tests
1.6 Introduction to Drug Stability Studies

CHAPTER 2: SPURRIER'S EXACT CONFIDENCE BOUNDS
2.1 Introduction
2.2 Setting of the Problem
2.3 A Closed-form Expression for the Pivotal Quantity
2.4 The Exact Distribution of the Pivotal Quantity
2.5 Comparison of Scheffé's Bounds and Spurrier's Bounds
2.6 Example

CHAPTER 3: CONFIDENCE BOUNDS OVER A FIXED INTERVAL
3.1 Introduction
3.2 A Closed-form Expression for the Pivotal Quantity
3.3 The Exact Distribution of the Variable V
3.4 Distribution Theory
3.5 The Exact Distribution of the Pivotal Quantity

CHAPTER 4: DRUG STABILITY STUDIES WITHOUT TIME EFFECTS
4.1 Introduction
4.2 FDA's Pooling Procedures
4.3 Pooling Procedures Using MCB
4.4 Examples

CHAPTER 5: DRUG STABILITY STUDIES WITH TIME EFFECTS
5.1 Introduction
5.2 Evaluating Drug Stability Using an Arrhenius Model
5.3 Liu et al.'s Procedure for Pooling Batches
5.4 Pooling Batches Based on Confidence Bounds for All-Contrast Comparisons

REFERENCES

Appendix A: MATLAB PROGRAMS FOR FIGURES 4.1 AND 4.2
A.1 Matlab Program for Figure 4.1
A.2 Matlab Program for Figure 4.2

Appendix B: MATLAB PROGRAM FOR SIMULATING Q AND V

LIST OF TABLES
3.1 Simulation Results for M = 10³ Iterations
4.1 Listing of Data Set 1
4.2 ANCOVA Results for Testing Batch Slopes (Data Set 1)
4.3 Individual Batch Regressions (Data Set 1)
4.4 MCW 75% and 95% CIs (Data Set 1)
4.5 Listing of Data Set 2
4.6 ANCOVA Results for Testing Batch Slopes (Data Set 2)
4.7 Individual Batch Regressions (Data Set 2)
4.8 MCW 75% and 95% CIs (Data Set 2)
4.9 Calculated Shelf-lives for Each Decision Rule
5.1 Potency Assay Results (% of claim)
5.2 Confidence Bounds and Maximum Distances for Different Contrasts

LIST OF FIGURES
3.1 Plots of T against a
3.2 Scatterplot and Chiplot of Q and V
3.3 Scatterplot and Chiplot of Q and V
4.1 Stability Data and Individual Regression Lines (Data Set 1)
4.2 Stability Data and Individual Regression Lines (Data Set 2)
5.1 Assay Results and Individual Regression Lines
5.2 Confidence Bounds for Expected Percent of Label Claim for Batch 1 Minus Expected Percent of Label Claim for Batch 5

CHAPTER 1: INTRODUCTION TO MULTIPLE COMPARISONS

1.1 Introduction

This chapter gives an introduction to different types of multiple comparison methods and to multiple comparisons in the general linear model. We begin (in Section 1.2) with simultaneous statistical inference, which is inference on several treatment means simultaneously, that is, on the treatment means themselves. In Section 1.3, we introduce multiple comparison inference, which is simultaneous inference on certain functions of the differences of the treatment means. That section describes four types of multiple comparison inference: multiple comparison with a control (MCC), multiple comparison with the best (MCB), all-pairwise comparisons (MCA), and all-contrast comparisons (ACC). For each type of multiple comparison inference, a few well-established methods are stated. Section 1.4 gives an introduction to multiple comparisons in the general linear model, which is the focus of this research. Multiple comparison inference, covered in Section 1.3, is appropriate in experiments where the measured response depends on the treatments only. In many situations, however, one or more covariates also have some impact on the response. In such cases, multiple comparisons of treatments in terms of a parametric function are more meaningful than comparisons in terms of their means. Section 1.5 presents stepwise procedures for multiple hypothesis testing; Bonferroni's procedure, Holm's step-down procedure (1979), and Hochberg's step-up procedure (1988) are described there. In Section 1.6, we give a brief introduction to applications of multiple comparison procedures in drug stability studies.

Often, the purpose of a study is to compare several treatment effects by estimating differences or by testing a family of hypotheses. The control of the Type I error rate when simultaneously testing a family of hypotheses is a central issue in the area of multiple comparisons. The traditional familywise error rate (FWER) criterion (Hochberg and Tamhane, 1987), which states that the probability of one or more Type I errors should be kept at or below a pre-specified level α, is often adopted to control the multiplicity effect. A criticism of the FWER criterion is that its stringent requirement on Type I errors lowers the power of detecting real effects. Benjamini and Hochberg (1995) suggested the false discovery rate (FDR) criterion, a different point of view on how the errors in multiple testing could be considered. The FDR criterion is designed to control the expected proportion of errors among the rejected hypotheses, and it has the following two properties in comparison with the FWER: if all null hypotheses are true, the FDR is equivalent to the FWER; otherwise, the FDR is less than or equal to the FWER. As a result, any procedure that controls the FWER also controls the FDR, and if a procedure is designed to control the FDR only, a potential gain in power may be expected.

Multiple comparison procedures are among the most frequently applied statistical methods in practice. For example, consider a pharmaceutical company that compares several batches of a drug for the purpose of estimating the shelf-life of the drug (see Example 1 in Chapter 4). In this example, six batches of the drug are manufactured, and it is desired to pool data from different batches to estimate a single shelf-life for the drug. A multiple comparison procedure is called for in this situation to make correct inferences, in particular, to determine an appropriate expiration date for the drug. We discuss this example in detail in Chapter 4.
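Returning to the FDR criterion above: it is enforced in practice with the Benjamini and Hochberg (1995) step-up rule, which orders the p-values, finds the largest rank $i$ with $P_{(i)} \le (i/n)\alpha$, and rejects the corresponding hypotheses. A minimal sketch follows (in Python rather than this dissertation's MATLAB; the p-values are made-up illustrations):

```python
def benjamini_hochberg(pvalues, alpha):
    """Indices of hypotheses rejected by the Benjamini-Hochberg step-up
    procedure, which controls the FDR at level alpha for independent
    p-values."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    # Find the largest 1-based rank i with p_(i) <= (i / n) * alpha;
    # a step-up rule scans every rank, it does not stop at the first failure.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / n:
            k = rank
    return sorted(order[:k])

# Hypothetical p-values for five hypotheses.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
rejected = benjamini_hochberg(pvals, alpha=0.05)  # rejects hypotheses 0 and 1
```

Note that ranks 3 and 4 fail their thresholds (0.039 > 0.03 and 0.041 > 0.04), so only the two smallest p-values lead to rejections here.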

1.2 Simultaneous Statistical Inference

For simultaneous statistical inference, one useful statistical model is the one-way model, which is given by

$$Y_{ij} = \mu_i + \epsilon_{ij}, \quad i = 1, \ldots, k, \; j = 1, \ldots, n_i, \qquad (1.2.1)$$

where $Y_{ij}$ is the $j$th response for the $i$th treatment, $\mu_1, \mu_2, \ldots, \mu_k$ are the treatment means under the $k$ treatments, and $\epsilon_{11}, \ldots, \epsilon_{kn_k}$ are independent and identically distributed normal errors with mean $0$ and variance $\sigma^2$ unknown. Let

$$\hat\mu_i = \sum_{j=1}^{n_i} Y_{ij}/n_i \quad \text{and} \quad \hat\sigma^2 = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (Y_{ij} - \hat\mu_i)^2/\nu, \quad \text{with } \nu = \sum_{i=1}^{k}(n_i - 1),$$

denote the sample means and the pooled sample variance, respectively.

One type of simultaneous statistical inference is inference on the treatment means $\mu_1, \mu_2, \ldots, \mu_k$ themselves. For example, we might be interested in constructing $k$ simultaneous confidence intervals, $\{\mu_i \in (L_i, U_i), \; i = 1, \ldots, k\}$, at the $100(1-\alpha)\%$ confidence level. It is understood that the fact that each individual two-sided confidence interval for $\mu_i$ has coverage probability $100(1-\alpha)\%$ does not guarantee the overall coverage probability to be $100(1-\alpha)\%$. In order for the inferences to be correct simultaneously with probability $100(1-\alpha)\%$, we must adjust for multiplicity to ensure that each individual inference on $\mu_1, \mu_2, \ldots, \mu_k$ is correct with a probability somewhat higher than $100(1-\alpha)\%$. There are several methods of adjusting for multiplicity, which we describe briefly here.

1. The Studentized maximum modulus (SMM) method. This method is based on the

Studentized maximum modulus statistic

$$\max_{1 \le i \le k} \frac{|\hat\mu_i - \mu_i|}{\hat\sigma/\sqrt{n_i}}, \qquad (1.2.2)$$

and provides exact $100(1-\alpha)\%$ simultaneous confidence intervals for the $\mu_i$:

$$\mu_i \in \hat\mu_i \pm m_{\alpha,k,\nu}\,\hat\sigma/\sqrt{n_i} \quad \text{for } i = 1, \ldots, k, \qquad (1.2.3)$$

where $m_{\alpha,k,\nu}$ is the $1-\alpha$ quantile of the Studentized maximum modulus statistic (1.2.2).

2. The Bonferroni inequality method. This method uses the Bonferroni inequality: for any events $E_1, \ldots, E_p$,

$$P\Big(\bigcap_{m=1}^{p} E_m\Big) \ge 1 - \sum_{m=1}^{p} P(E_m^c),$$

and provides a set of conservative $100(1-\alpha)\%$ simultaneous confidence intervals for $\mu_1, \ldots, \mu_k$:

$$\mu_i \in \hat\mu_i \pm t_{\alpha/(2k),\nu}\,\hat\sigma/\sqrt{n_i} \quad \text{for } i = 1, \ldots, k, \qquad (1.2.4)$$

where $t_{\alpha/(2k),\nu}$ is the $1-\alpha/(2k)$ quantile of the $t$ distribution with $\nu$ degrees of freedom.

3. Scheffé's method. This method is based on the pivotal statistic

$$\frac{\sum_{i=1}^{k} \big(\sqrt{n_i}\,\hat\mu_i - \sqrt{n_i}\,\mu_i\big)^2/k}{\hat\sigma^2}, \qquad (1.2.5)$$

which has an $F$ distribution with $k$ and $\nu$ degrees of freedom. It gives exact $100(1-\alpha)\%$ simultaneous confidence intervals for all linear combinations of $\mu_1, \ldots, \mu_k$:

$$\sum_{i=1}^{k} l_i \mu_i \in \sum_{i=1}^{k} l_i \hat\mu_i \pm \sqrt{k\,F_{\alpha,k,\nu}}\;\hat\sigma\Big(\sum_{i=1}^{k} l_i^2/n_i\Big)^{1/2}, \quad \text{for all } l = (l_1, \ldots, l_k)' \in \mathbb{R}^k, \qquad (1.2.6)$$

where $F_{\alpha,k,\nu}$ is the $1-\alpha$ quantile of an $F$ distribution with $k$ and $\nu$ degrees of freedom.
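As a small numerical illustration of the one-way estimates and the Bonferroni intervals above, the sketch below computes the sample means, the pooled variance, and the intervals. It is plain Python (not this dissertation's MATLAB), and the critical value $t_{\alpha/(2k),\nu}$ is supplied by hand (e.g., from a $t$ table, a hypothetical value in the demo) rather than computed:

```python
import math

def bonferroni_intervals(groups, t_crit):
    """Simultaneous Bonferroni confidence intervals for k treatment means.

    groups : list of lists of responses Y_ij, one inner list per treatment
    t_crit : upper alpha/(2k) t quantile with nu = sum(n_i - 1) degrees of
             freedom, supplied by the user (e.g., from a table)
    Returns (list of (lower, upper) intervals, pooled variance).
    """
    means = [sum(g) / len(g) for g in groups]
    nu = sum(len(g) - 1 for g in groups)
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    sigma2 = ss_within / nu  # pooled sample variance
    return [(m - t_crit * math.sqrt(sigma2 / len(g)),
             m + t_crit * math.sqrt(sigma2 / len(g)))
            for g, m in zip(groups, means)], sigma2

# Two tiny hypothetical treatment groups; t_crit = 2.0 is illustrative only.
intervals, s2 = bonferroni_intervals([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]], t_crit=2.0)
```

Here the pooled variance is 1 with ν = 4, so each interval is the sample mean plus or minus 2·√(1/3).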

1.3 Multiple Comparison Inference

Multiple comparison inference is simultaneous inference on comparisons of the treatment means, a setting different from the simultaneous statistical inference of Section 1.2, which concerns the treatment means themselves. The parameters of interest in multiple comparison methods are functions of contrasts of $\mu_1, \ldots, \mu_k$. According primarily to the parameters of interest, and secondarily to the strength of the inference, multiple comparison methods can be classified into four types: MCC, MCB, MCA, and ACC. These four types are described in the following four subsections.

1.3.1 Multiple Comparison With a Control

Dunnett (1955) pioneered the concept of MCC. He suggested that when a control is present, the primary interest may be the comparison of each new treatment with the control. Suppose treatment $k$ is the control. Then the parameters of interest in MCC are

$$\mu_i - \mu_k, \quad \text{for } i = 1, \ldots, k-1, \qquad (1.3.1)$$

the difference between the mean of each new treatment and the mean of the control. There are two types of MCC: one-sided MCC and two-sided MCC. The proper choice of an MCC method depends on the type of inference desired. If it is desired to infer whether any new treatment is better or worse than the control, one-sided MCC is the better choice. If it is of interest to detect whether the effects of the new treatments and of the control are practically equivalent or different, two-sided MCC is preferred.

For one-sided MCC, Dunnett's method (1955) gives the simultaneous lower confidence

bounds and upper bounds for the difference between each new treatment mean $\mu_i$ and the control mean $\mu_k$, as shown in the following theorem.

Theorem 1.3.1.

$$P\{\mu_i - \mu_k > \hat\mu_i - \hat\mu_k - d\,\hat\sigma\sqrt{2/n} \ \text{for } i = 1, \ldots, k-1\} = P\{\mu_i - \mu_k < \hat\mu_i - \hat\mu_k + d\,\hat\sigma\sqrt{2/n} \ \text{for } i = 1, \ldots, k-1\} = 1 - \alpha, \qquad (1.3.2)$$

where $d$ is the solution to the equation

$$\int_0^{\infty}\int_{-\infty}^{\infty} [\Phi(z + \sqrt{2}\,ds)]^{k-1}\, d\Phi(z)\,\gamma(s)\,ds = 1 - \alpha. \qquad (1.3.3)$$

Here $\Phi$ is the standard normal distribution function and $\gamma$ is the density of $\hat\sigma/\sigma$.

In addition to Dunnett's method, other methods have been developed for one-sided MCC inference, including the step-down method of Naik (1975) and Marcus, Peritz and Gabriel (1976) and the step-up method of Dunnett and Tamhane (1992).

For two-sided MCC, Dunnett's method (1955) provides the simultaneous confidence intervals for the difference between each new treatment mean $\mu_i$ and the control mean $\mu_k$, as given in the following theorem.

Theorem 1.3.2.

$$P\{\mu_i - \mu_k \in \hat\mu_i - \hat\mu_k \pm |d|\,\hat\sigma\sqrt{2/n} \ \text{for } i = 1, \ldots, k-1\} = 1 - \alpha, \qquad (1.3.4)$$

where $|d|$ is the solution to the equation

$$\int_0^{\infty}\int_{-\infty}^{\infty} [\Phi(z + \sqrt{2}\,|d|s) - \Phi(z - \sqrt{2}\,|d|s)]^{k-1}\, d\Phi(z)\,\gamma(s)\,ds = 1 - \alpha. \qquad (1.3.5)$$
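The integral equations defining Dunnett's critical values rarely admit closed forms, but $d$ is straightforward to approximate by simulation. The sketch below (Python, illustrative only, not Dunnett's exact algorithm) treats the balanced case with $\nu = \infty$, so that $\hat\sigma/\sigma \equiv 1$ and the one-sided pivot reduces to $\max_i (Z_i - Z_k)/\sqrt{2}$; its empirical $1-\alpha$ quantile estimates $d$:

```python
import random

def dunnett_d_mc(k, alpha, reps=200_000, seed=1):
    """Monte Carlo approximation to Dunnett's one-sided critical value d
    for comparing k-1 treatments with a control (treatment k), balanced
    design, with nu taken as infinity so that sigma_hat/sigma = 1.
    Simulates max_i (Z_i - Z_k)/sqrt(2) over iid N(0,1) variates and
    returns its empirical (1 - alpha) quantile."""
    rng = random.Random(seed)
    root2 = 2.0 ** 0.5
    stats = []
    for _ in range(reps):
        zs = [rng.gauss(0.0, 1.0) for _ in range(k)]
        zk = zs[-1]  # the control
        stats.append(max((z - zk) / root2 for z in zs[:-1]))
    stats.sort()
    return stats[int((1.0 - alpha) * reps)]

# For k = 2 the pivot is a single N(0, 1) variate, so d should land near
# the 95th percentile of the standard normal, about 1.645.
d2 = dunnett_d_mc(k=2, alpha=0.05, reps=50_000)
```

For larger $k$ the maximum is taken over more comparisons, so the critical value grows; in serious use one would take $\nu$ finite by also simulating $\hat\sigma/\sigma$ from its chi distribution.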

1.3.2 Multiple Comparison With the Best

MCB was first proposed by Hsu (1981, 1982), and it was designed to compare each treatment with the best of the other treatments. Suppose a larger treatment effect implies a better treatment. Then the parameters of primary interest are

$$\mu_i - \max_{j \ne i} \mu_j, \quad \text{for } i = 1, \ldots, k. \qquad (1.3.6)$$

If $\mu_i - \max_{j \ne i} \mu_j < 0$, then treatment $i$ is not the best, since there is another treatment better than it; on the other hand, if $\mu_i - \max_{j \ne i} \mu_j > 0$, then treatment $i$ is the best treatment, since it is better than all other treatments. In case a smaller treatment effect implies a better treatment, the parameters of primary interest are

$$\mu_i - \min_{j \ne i} \mu_j, \quad \text{for } i = 1, \ldots, k, \qquad (1.3.7)$$

and different conclusions follow depending on whether $\mu_i - \min_{j \ne i} \mu_j$ is positive or negative.

MCB inference can be categorized into two types: constrained MCB and unconstrained MCB. The difference between these two types is that for constrained MCB inference the simultaneous confidence intervals on $\mu_i - \min_{j \ne i} \mu_j$ are constrained to contain $0$, while for unconstrained MCB inference those intervals are not constrained to contain $0$. In situations where the magnitude of the difference between the best treatment and those identified to be not the best is not of concern, constrained MCB is preferred, since it achieves sharper inference than unconstrained MCB inference. For constrained MCB, if a confidence

interval for $\mu_i - \min_{j \ne i} \mu_j$ has upper limit $0$, then the $i$th treatment is identified as the best; if it has lower limit $0$, then the $i$th treatment is identified as not the best. The following theorem (Hsu, 1984b) gives a set of $100(1-\alpha)\%$ simultaneous confidence intervals for $\mu_i - \min_{j \ne i} \mu_j$ for constrained MCB inference.

Theorem 1.3.3. Let

$$D_i^- = \Big(\hat\mu_i - \min_{j \ne i} \hat\mu_j - d\,\hat\sigma\sqrt{2/n}\Big)^-, \qquad D_i^+ = \Big(\hat\mu_i - \min_{j \ne i} \hat\mu_j + d\,\hat\sigma\sqrt{2/n}\Big)^+.$$

Then for all $\mu = (\mu_1, \ldots, \mu_k)'$ and $\sigma^2$,

$$P_{\mu,\sigma^2}\Big\{\mu_i - \min_{j \ne i} \mu_j \in [D_i^-, D_i^+] \ \text{for } i = 1, \ldots, k\Big\} \ge 1 - \alpha. \qquad (1.3.8)$$

Here $d$ is the solution to Equation (1.3.3) as given in Section 1.3.1, $x^- = \min\{0, x\}$ and $x^+ = \max\{0, x\}$.

In cases where one desires lower bounds on how much the treatments identified as not the best fall short of the true best, unconstrained MCB is the proper choice for making the desired inference. The following theorem provides a set of confidence intervals that achieve confidence level $1 - \alpha$.

Theorem 1.3.4. For all $\mu = (\mu_1, \ldots, \mu_k)'$ and $\sigma^2$,

$$P_{\mu,\sigma^2}\Big\{\mu_i - \min_{j \ne i} \mu_j \in \hat\mu_i - \min_{j \ne i} \hat\mu_j \pm |q|\,\hat\sigma\sqrt{2/n} \ \text{for } i = 1, \ldots, k\Big\} \ge 1 - \alpha, \qquad (1.3.9)$$

with equality when $\mu_1 = \cdots = \mu_k$. Here $|q|$ is the solution to the equation

$$P\Big\{\frac{|Z_i - Z_j|}{\sqrt{2}\,\hat\sigma/\sigma} \le |q| \ \text{for all } i > j\Big\} = 1 - \alpha, \qquad (1.3.10)$$

where $Z_1, \ldots, Z_k$ are iid standard normal random variables.
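The clipping operations $x^- = \min\{0, x\}$ and $x^+ = \max\{0, x\}$ are what constrain the MCB intervals to contain $0$, and they are easy to mis-read. A small Python sketch (the allowance $d\,\hat\sigma\sqrt{2/n}$ is passed in as one precomputed, hypothetical number):

```python
def constrained_mcb(means, allowance):
    """Constrained MCB intervals [D_i^-, D_i^+] for mu_i - min_{j != i} mu_j.

    means     : sample means mu_hat_1, ..., mu_hat_k
    allowance : the quantity d * sigma_hat * sqrt(2/n), precomputed
    The clipping x^- = min{0, x}, x^+ = max{0, x} forces every interval
    to contain 0.
    """
    out = []
    for i, m in enumerate(means):
        best_other = min(m2 for j, m2 in enumerate(means) if j != i)
        diff = m - best_other
        d_minus = min(0.0, diff - allowance)  # x^- = min{0, x}
        d_plus = max(0.0, diff + allowance)   # x^+ = max{0, x}
        out.append((d_minus, d_plus))
    return out

# Hypothetical sample means and allowance.
ints = constrained_mcb([1.0, 2.0, 4.0], allowance=1.0)
```

With these numbers the first treatment's interval is [-2, 0] (its upper limit is 0), while the other two have lower limit 0; every interval contains 0 by construction.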

1.3.3 All-pairwise Comparisons

For MCA, the parameters of primary interest are

$$\mu_i - \mu_j, \quad \text{for all } i \ne j. \qquad (1.3.11)$$

Several methods are available that provide simultaneous confidence intervals for the all-pairwise differences $\mu_i - \mu_j$, $i \ne j$. Some of these methods are listed here.

1. Tukey's (1953) method. Tukey's method provides the following $100(1-\alpha)\%$ simultaneous confidence intervals for all pairwise differences:

$$\mu_i - \mu_j \in \hat\mu_i - \hat\mu_j \pm |q|\,\hat\sigma\sqrt{2/n} \quad \text{for all } i \ne j, \qquad (1.3.12)$$

where $|q|$ is the solution to the equation

$$P\Big\{\frac{|(\hat\mu_i - \mu_i) - (\hat\mu_j - \mu_j)|}{\hat\sigma\sqrt{2/n}} \le |q| \ \text{for all } i > j\Big\} = 1 - \alpha. \qquad (1.3.13)$$

2. Bofinger's (1985) confident directions method. This method provides the following constrained $100(1-\alpha)\%$ simultaneous confidence intervals for all pairwise differences:

$$\mu_i - \mu_j \in \Big[\big(\hat\mu_i - \hat\mu_j - q\,\hat\sigma\sqrt{2/n}\big)^-,\ \big(\hat\mu_i - \hat\mu_j + q\,\hat\sigma\sqrt{2/n}\big)^+\Big] \quad \text{for all } i \ne j, \qquad (1.3.14)$$

where $q$ is the solution to the equation

$$P\Big\{\frac{(\hat\mu_i - \mu_i) - (\hat\mu_j - \mu_j)}{\hat\sigma\sqrt{2/n}} \le q \ \text{for all } i > j\Big\} = 1 - \alpha, \qquad (1.3.15)$$

$x^- = \min\{0, x\}$ and $x^+ = \max\{0, x\}$. In situations where equalities among the $\mu_i$ are impossible, Bofinger's confident directions method gives sharper inference than deduction from Tukey's simultaneous confidence intervals.

3. Hayter's (1990) one-sided comparisons. Hayter derived the following $100(1-\alpha)\%$ simultaneous lower confidence bounds on $\mu_i - \mu_j$ for all $i > j$:

$$\mu_i - \mu_j > \hat\mu_i - \hat\mu_j - q\,\hat\sigma\sqrt{2/n} \quad \text{for all } i > j, \qquad (1.3.16)$$

where $q$ is the same critical value as in the simultaneous confidence intervals (1.3.14) of Bofinger (1985), that is, $q$ is the solution to Equation (1.3.15). These simultaneous confidence bounds provide sharper inference in situations where it is suspected that $\mu_1 \le \mu_2 \le \cdots \le \mu_k$ and one is primarily interested in lower confidence bounds on $\mu_i - \mu_j$ for all $i > j$.

1.3.4 All-contrast Comparisons

For ACC, the parameters of primary interest are

$$\sum_{i=1}^{k} c_i \mu_i, \quad \text{with } c_1 + c_2 + \cdots + c_k = 0. \qquad (1.3.17)$$

Scheffé derived the following $100(1-\alpha)\%$ simultaneous confidence intervals for all-contrast comparisons:

$$\sum_{i=1}^{k} c_i \mu_i \in \sum_{i=1}^{k} c_i \hat\mu_i \pm \sqrt{(k-1)\,F_{\alpha,k-1,\nu}}\;\hat\sigma\Big(\sum_{i=1}^{k} c_i^2/n_i\Big)^{1/2}. \qquad (1.3.18)$$

In fact, the other three types of multiple comparisons, MCC, MCB and MCA, are all special cases of ACC, and hence can be deduced from ACC. The reason we nevertheless consider specific types of multiple comparisons is that direct inference is sharper than deduced inference. For example, inference on all pairwise differences deduced from Scheffé's method, which is designed for inference on all linear combinations of the means, is weaker than inference given by Tukey's method, which is designed specifically for inference on all pairwise differences of the means. In applications, the proper choice of a multiple comparison method depends

primarily on the type of inference desired, and secondarily on the strength of the inference intended.

1.4 An Introduction to Multiple Comparisons in the General Linear Model

The multiple comparison methods described in Section 1.3 are based on the assumption that independent simple random samples are taken under the treatments and that the measured response depends on the treatment only. In many real-life situations, however, the response of an experiment is affected by the treatment as well as by some covariates. In such cases, comparing the treatments may not be meaningful without adjusting for the effects of the covariates. In experiments where covariates are present, the appropriate statistical model is the general linear model (GLM)

$$Y = X\beta + \epsilon, \qquad (1.4.1)$$

where $Y$ is the $N \times 1$ vector of responses, $X$ is the $N \times p$ design matrix, $\beta$ is the $p \times 1$ vector of parameters, and $\epsilon$ is an $N \times 1$ vector of iid normally distributed errors with mean $0$ and unknown variance $\sigma^2$. The one-way model discussed in Section 1.2 is a GLM, and the analysis of covariance (ANCOVA) model, which will be discussed later and takes the form

$$Y_{ij} = \tau_i + \beta_i X_{ij} + \epsilon_{ij},$$

is also a GLM. In a GLM, it is not reasonable to compare the treatment effects alone without considering the covariate effects, because the average response under the $i$th treatment may depend on the value of a covariate. For example, in the ANCOVA model, if $\beta_i \ne \beta_j$, then whether the average response under the $i$th treatment is larger or smaller than the average response under the $j$th treatment depends on the value of the covariate, $X$.
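This dependence can be made concrete with two fitted lines. The sketch below (Python; the two batches and their data are made up) fits $y = \tau + \beta x$ to each group by least squares and evaluates $(\hat\tau_i + \hat\beta_i x) - (\hat\tau_j + \hat\beta_j x)$ at two covariate values; the sign of the comparison flips as $x$ moves, which is exactly why a bound that holds simultaneously over $x$, rather than a single interval, is needed:

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = tau + beta * x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    beta = sxy / sxx
    tau = ybar - beta * xbar
    return tau, beta

# Two hypothetical batches: batch 1 starts higher but declines faster.
t1, b1 = fit_line([0.0, 1.0, 2.0], [10.0, 8.0, 6.0])  # exactly y = 10 - 2x
t2, b2 = fit_line([0.0, 1.0, 2.0], [9.0, 8.5, 8.0])   # exactly y = 9 - 0.5x

def difference(x):
    """(tau_1 + beta_1 * x) - (tau_2 + beta_2 * x): which line is higher at x."""
    return (t1 + b1 * x) - (t2 + b2 * x)
```

Here difference(0) is positive while difference(2) is negative: the ordering of the two treatments reverses within the covariate range. This sketch gives only the point estimate, without the simultaneous bounds this research constructs.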

Consequently, the parameters of interest in the ANCOVA model are

$$(\tau_i + \beta_i X) - (\tau_j + \beta_j X), \qquad (1.4.2)$$

a linear function of the covariate $X$. The desired inference in such cases is multiple comparison inference in terms of a parametric function of the treatment means rather than in terms of the treatment means themselves, and as a result, a set of simultaneous confidence bounds is more appropriate for making the desired inference than a set of simultaneous confidence intervals. In this research, we focus attention on the ANCOVA model and construct a set of simultaneous confidence bounds for comparisons of several regression lines.

There are also situations where comparisons are to be made of parameters that do not correspond to long-run average treatment effects. For example, in drug stability studies, the parameters $\beta_1, \ldots, \beta_k$ in the ANCOVA model correspond to the degradation rates of batches of drug products, and the parameters of concern are the comparisons of these rates, that is, $\beta_i - \beta_j$. We discuss this situation in detail in Chapter 4.

1.5 Stepwise Hypothesis Tests

In the area of testing multiple hypotheses, the Bonferroni inequality is often used to set an upper bound for the familywise error rate. If $T_1, \ldots, T_n$ is a set of test statistics with corresponding p-values $P_1, \ldots, P_n$ for testing hypotheses $H_1, \ldots, H_n$, the classical Bonferroni multiple test procedure rejects $H \in \{H_1, \ldots, H_n\}$ if any p-value is less than $\alpha/n$. Furthermore, for each $P_i \le \alpha/n$, $i = 1, \ldots, n$, the specific hypothesis $H_i$ is rejected. The Bonferroni inequality,

$$P\Big\{\bigcup_{i=1}^{n} (P_i \le \alpha/n)\Big\} \le \alpha, \quad (0 \le \alpha \le 1), \qquad (1.5.1)$$

ensures that the probability of rejecting at least one hypothesis when all are true is no greater than α. The Bonferroni procedure is simple to apply and requires no distributional assumptions. A disadvantage of this procedure is that it is conservative, especially when the test statistics are highly correlated. Holm (1979) improved Bonferroni's procedure and presented a sequentially rejective Bonferroni procedure. This procedure is a step-down multiple test procedure that is much less conservative but still maintains the FWE at α. Let P_(1) ≤ ... ≤ P_(n) be the ordered p-values and H_(1), ..., H_(n) be the corresponding hypotheses. Let

ℓ(p) = α/(n − p + 1), p = 1, ..., n. (1.5.2)

Then ℓ(p), p = 1, ..., n, is a strictly increasing sequence of constants. Holm's step-down procedure begins by testing whether P_(1) ≤ ℓ(1). If not, all hypotheses are accepted; if so, one rejects H_(1) and continues by checking whether P_(2) ≤ ℓ(2), and so on. In general, Holm's procedure rejects H_(i) when, for all j = 1, ..., i,

P_(j) ≤ ℓ(j). (1.5.3)

Holm (1979) proved that with the cut-off constants defined in (1.5.2), the step-down procedure controls the FWE in the strong sense. Instead of testing sequentially starting with the smallest p-value, Hochberg (1988) suggested starting with the largest p-value. The procedure proposed by Hochberg (1988) is a step-up multiple test procedure. It begins by testing whether P_(n) ≤ ℓ(n). If so, one rejects all hypotheses. If not, one accepts H_(n) and goes on to check whether P_(n−1) ≤ ℓ(n−1). In general, H_(i) is rejected if

P_(j) ≤ ℓ(j) for some j ≥ i. (1.5.4)
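The three procedures can be sketched in a few lines of code. The following is an illustrative implementation (our addition, with hypothetical p-values), using the cut-offs α/(n − p + 1) for the two stepwise methods:

```python
# Illustrative sketch (our addition) of the Bonferroni, Holm, and Hochberg
# procedures; the p-values used below are hypothetical.

def bonferroni_reject(p_values, alpha=0.05):
    """Reject H_i whenever P_i <= alpha/n."""
    n = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / n]

def holm_reject(p_values, alpha=0.05):
    """Step-down: start at the smallest p-value, stop at the first failure."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    rejected = []
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha / (n - rank + 1):
            rejected.append(i)
        else:
            break
    return sorted(rejected)

def hochberg_reject(p_values, alpha=0.05):
    """Step-up: start at the largest p-value; the first success rejects that
    hypothesis and every hypothesis with a smaller p-value."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    for rank in range(n, 0, -1):
        if p_values[order[rank - 1]] <= alpha / (n - rank + 1):
            return sorted(order[:rank])
    return []

pvals = [0.010, 0.020, 0.030, 0.040]
print(bonferroni_reject(pvals))  # [0]: only 0.010 <= 0.05/4
print(holm_reject(pvals))        # [0]: 0.020 > 0.05/3 stops the step-down
print(hochberg_reject(pvals))    # [0, 1, 2, 3]: 0.040 <= 0.05 rejects all
```

This example also illustrates the power ordering discussed in the text: Hochberg rejects every hypothesis that Holm rejects, and possibly more.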

Hochberg's procedure is more powerful than Holm's procedure and still keeps the FWE at α. In addition to Holm's procedure and Hochberg's procedure, there are many alternative procedures in the area of stepwise hypothesis testing. These include Simes' procedure (1986), Hommel's procedure (1988), and Wright's procedure (1992), among others.

1.6 Introduction to Drug Stability Studies

The Food and Drug Administration (FDA) requires that for every drug product on the market, its shelf-life (or expiration date) must be indicated on the container label. The shelf-life of a drug product is defined as the time interval during which the average drug characteristic (e.g., strength and purity) of the drug is expected to remain within approved specifications after manufacture. It is important to provide consumers with assurance that the drug product will retain its identity, strength, potency, dissolution, and purity during the claimed shelf-life period. The manufacturers (drug companies) usually conduct a stability study to ensure that a drug product meets the approved specifications before its expiration date is printed on the package. Drug stability studies are normally designed to characterize the degradation of the drug product over time and to estimate the shelf-life based on the degradation curves. Generally, drug stability studies consist of a random sample of dosage units (e.g., tablets, capsules, vials) from a given batch or several batches placed in a storage room with controlled temperature and humidity conditions. The class of stability studies includes accelerated studies and long-term studies. Accelerated studies are usually conducted under elevated temperature and relative humidity conditions to increase the level of stress. The Arrhenius model is one of the most commonly used statistical models for estimating drug stability parameters in accelerated

drug stability studies. Applications of this model in drug stability studies will be detailed in Section 5.1. For long-term studies, the drug product is stored under room temperature and humidity conditions, and stability testing is performed under regular environmental conditions. Often, multiple batches of a drug product are manufactured, and there may be more or less variation among batches. Consequently, the estimated shelf-life may vary from batch to batch. In practice, it is desirable to combine data from different batches to estimate a single shelf-life for the drug. Several approaches for pooling data have been studied, including the FDA Guideline (1987), multiple comparisons with the worst (Ruberg and Hsu, 1992), MCA of regression lines (Liu et al., 2004), and the ACC of regression lines proposed in this research. The FDA Guideline (1987) suggested testing the equality of the slopes of the regression lines fitted for each batch, assuming the drug characteristic is expected to decrease linearly as time increases. Two batches are claimed to be practically equivalent, and hence can be pooled, if they have similar slopes. In the case of shelf-life estimation, the most negative degradation rate is of interest, and this led Ruberg and Hsu (1992) to propose the approach of multiple comparisons with the worst (MCW). This pooling method compares all batches with the (unknown) worst batch and provides simultaneous confidence interval estimates of slope differences. These simultaneous confidence intervals are then used to make decisions on pooling as many batches as possible with the worst batch. Liu et al. (2004) proposed the approach of MCA of the regression lines. They developed a set of simultaneous confidence bands and suggested a decision rule for pooling batches. The proposed set of simultaneous confidence bands is claimed to be an improvement over a set of simultaneous confidence intervals, since it allows one to assess the difference between two regression lines over a given range of time, rather than at one particular time point. The pooling method that we have

proposed in this research provides a set of simultaneous confidence bands based on the ACC of regression lines. This method is more efficient in applications to drug stability studies than Liu et al.'s method, since the number of comparisons required to reach conclusions can be greatly decreased if we choose appropriate contrasts. In Chapters 4 and 5, we will discuss these pooling methods in detail.
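The common starting point of all of these pooling methods is the per-batch least squares slope, i.e., the estimated degradation rate. The following is a minimal sketch with hypothetical stability data; the inferential step of attaching simultaneous intervals (MCW) or bands (MCA, ACC) to the slope differences is omitted:

```python
# Hypothetical long-term stability data: assay values (% of label claim) at
# several time points (months) for two batches.  The pooling methods all
# start from the per-batch least squares degradation rates estimated here.
def ls_slope(xs, ys):
    """Ordinary least squares slope of ys on xs."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    return sxy / sxx

months = [0, 3, 6, 9, 12]
batches = {
    "A": [100.1, 99.4, 98.8, 98.1, 97.5],   # made-up assay values
    "B": [100.3, 99.8, 99.2, 98.8, 98.3],
}
rates = {name: ls_slope(months, ys) for name, ys in batches.items()}
diff = rates["A"] - rates["B"]  # estimated difference in degradation rates
print(rates, diff)
```

A pooling decision would then ask whether such differences are small enough, simultaneously over all batch comparisons, for the batches to share a single estimated shelf-life.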

CHAPTER 2

SPURRIER'S EXACT CONFIDENCE BOUNDS

2.1 Introduction

Most research in the multiple comparison literature has been devoted to comparing the means of k (≥ 3) groups under the assumption of iid normal errors. However, it is important in some circumstances to compare k (≥ 3) groups based on some parametric function other than the mean. The pioneering work of Spurrier (1999) provides a set of simultaneous bounds for exact inference on all contrasts of several simple linear regression lines over the entire range (−∞, ∞), using the same n design points for each line. In this chapter we discuss how to develop exact simultaneous confidence bounds for all contrasts of three or more regression lines when the explanatory variable takes values on the whole real line. We begin this chapter by describing (in Section 2.2) the setting of the problem. In this section, we introduce a pivotal quantity which is essential for developing the exact confidence bounds in the later sections. Section 2.3 shows how to find a closed-form expression for the pivotal quantity. In Section 2.4, we show how to derive the exact distribution of the pivotal quantity. Section 2.5 compares the exact bounds developed by Spurrier (1999) with the ones developed by Scheffé. An example illustrating the exact confidence bounds is given in Section 2.6.

2.2 Setting of the Problem

The simple linear regression model for the n observations from the ith group is

Y_ij = α_i + β_i x_j + ɛ_ij, (2.2.1)

for i = 1, ..., k and j = 1, ..., n. All error terms are assumed to be iid N(0, σ²). Without loss of generality, assume that the predictor variable values have been centered and scaled such that x′1 = 0 and x′x = 1, where the n-dimensional vectors x and 1 are defined as x = (x_1, ..., x_n)′ and 1 = (1, ..., 1)′. These assumptions are standard in the multiple comparison literature, and it would certainly be useful in theory to extend the results to other functions of the design points; however, no such non-linear extensions have been investigated in the literature, and we will explore this direction in the future. Let α̂_i and β̂_i denote the least squares estimators of α_i and β_i, i = 1, ..., k, respectively. Let σ̂² denote the pooled error mean square with degrees of freedom ν = k(n − 2), and let C denote the set of vectors c = (c_1, ..., c_k)′ such that Σ_{i=1}^{k} c_i = 0. Define

Z_1i = n^{1/2}(α̂_i − α_i)/σ and Z_2i = (β̂_i − β_i)/σ, for i = 1, ..., k. (2.2.2)

Then, under the assumptions that x′1 = 0 and x′x = 1, the Z_1i and Z_2i are iid standard normal for all i = 1, ..., k. Let Z̄_1 and Z̄_2 denote the sample means of the Z_1i's and the Z_2i's, respectively. A 100(1 − α)% simultaneous confidence bound for all contrasts of several regression lines can be obtained from the traditional form of the point estimate plus or minus a probability point times the estimated standard error:

Σ_{i=1}^{k} c_i(α_i + β_i x) ∈ Σ_{i=1}^{k} c_i(α̂_i + β̂_i x) ± b σ̂ [ (1/n + x²) Σ_{i=1}^{k} c_i² ]^{1/2} (2.2.3)

for all c ∈ C and all x ∈ (−∞, ∞), where b is the probability point that depends on k, ν, and α. To determine the probability point b in Equation (2.2.3), Spurrier (1999) defined a random variable T_{c,x}, which is given by

T_{c,x} = Σ_{i=1}^{k} c_i[(α̂_i − α_i) + (β̂_i − β_i)x] / { σ̂ [ (1/n + x²) Σ_{i=1}^{k} c_i² ]^{1/2} }, (2.2.4)

and he stated that, by the union-intersection principle, the probability point b is the solution to the equation

P( |T_{c,x}| ≤ b, for all real x and all c ∈ C ) = 1 − α, (2.2.5)

or, equivalently, the positive solution to the equation

P [ sup_{c ∈ C, x ∈ (−∞, ∞)} (T²_{c,x}) ≤ b² ] = 1 − α. (2.2.6)

Therefore, an appropriate pivotal quantity, sup(T²_{c,x}), is determined by the union-intersection method. In the following two sections, a closed-form expression for this pivotal quantity will be found and then its exact distribution will be derived.

2.3 A Closed-form Expression for the Pivotal Quantity

The pivotal quantity introduced in Section 2.2 will help us to compute the constant b in Equation (2.2.6) for the exact simultaneous confidence bounds in Equation (2.2.3). To compute b, the exact distribution of the pivotal quantity is required. It will be helpful to find a closed-form expression for the quantity before deriving the exact distribution. The following theorem gives a closed-form expression for sup(T²_{c,x}).

Theorem 2.3.1. The pivotal quantity sup(T²_{c,x}) = Q/(σ̂²/σ²), where

Q = { Q_11 + Q_22 + [4R²Q_11Q_22 + (Q_22 − Q_11)²]^{1/2} }/2,

with Q_jj′ = Σ_{i=1}^{k} (Z_ji − Z̄_j)(Z_j′i − Z̄_j′), j, j′ = 1, 2, and R = Q_12/[Q_11Q_22]^{1/2}.

Proof of Theorem 2.3.1. With the definition of the random variable T_{c,x}, write

T²_{c,x} = { Σ_{i=1}^{k} c_i[(α̂_i − α_i) + (β̂_i − β_i)x] }² / { σ̂²(1/n + x²)Σ_{i=1}^{k} c_i² }
= { Σ_{i=1}^{k} c_i[(α̂_i − α_i)/σ + (β̂_i − β_i)x/σ] }² / { (σ̂/σ)²(1/n + x²)Σ_{i=1}^{k} c_i² }
= { Σ_{i=1}^{k} c_i[Z_1i/n^{1/2} + Z_2i x] }² / { (σ̂/σ)²(1/n + x²)Σ_{i=1}^{k} c_i² }, by Equation (2.2.2),
= { Σ_{i=1}^{k} c_i[(Z_1i − Z̄_1)/n^{1/2} + (Z_2i − Z̄_2)x] }² / { (σ̂/σ)²(1/n + x²)Σ_{i=1}^{k} c_i² }, since Σ_{i=1}^{k} c_i = 0.

Holding x fixed, T²_{c,x} is maximized when c_i ∝ (Z_1i − Z̄_1)/n^{1/2} + (Z_2i − Z̄_2)x, by the Cauchy-Schwarz inequality. The argument is as follows. The Cauchy-Schwarz inequality states that for any two random variables X and Y, |E(XY)| ≤ (EX²)^{1/2}(EY²)^{1/2}, which can be re-written as

(E(XY))² ≤ EX² EY². (2.3.1)

This result also applies to numerical sums when there is no explicit reference to an expectation, and hence Equation (2.3.1) can also be written in the form

( Σ_{i=1}^{k} c_ia_i )² ≤ Σ_{i=1}^{k} c_i² Σ_{i=1}^{k} a_i². (2.3.2)

Notice that when x is fixed, the maximization of T²_{c,x} is equivalent to the maximization of the quantity T̃ = { Σ_{i=1}^{k} c_i[(Z_1i − Z̄_1)/n^{1/2} + (Z_2i − Z̄_2)x] }² / Σ_{i=1}^{k} c_i². Let a_i = (Z_1i − Z̄_1)/n^{1/2} + (Z_2i − Z̄_2)x. Then, by Equation (2.3.2), the quantity T̃ attains its maximum Σ_{i=1}^{k} a_i² when c_i = a_i = (Z_1i − Z̄_1)/n^{1/2} + (Z_2i − Z̄_2)x.

Now denote the maximum value of T²_{c,x} for fixed x by T²_x, which is given by

T²_x = Σ_{i=1}^{k} [ (Z_1i − Z̄_1)/n^{1/2} + (Z_2i − Z̄_2)x ]² / { (σ̂/σ)²(1/n + x²) }
= [ n/(1 + nx²) ] Σ_{i=1}^{k} [ (Z_1i − Z̄_1)²/n + 2x(Z_1i − Z̄_1)(Z_2i − Z̄_2)/n^{1/2} + (Z_2i − Z̄_2)²x² ] / (σ̂/σ)²
= { Σ_{i=1}^{k}(Z_1i − Z̄_1)² + 2xn^{1/2} Σ_{i=1}^{k}(Z_1i − Z̄_1)(Z_2i − Z̄_2) + nx² Σ_{i=1}^{k}(Z_2i − Z̄_2)² } / { (1 + nx²)(σ̂/σ)² }.

Let a = 1/(1 + nx²). Then a ∈ [0, 1], 1 − a = nx²/(1 + nx²), and [a(1 − a)]^{1/2} = n^{1/2}x/(1 + nx²). Therefore, with the quantities Q_jj′, j, j′ = 1, 2, defined earlier, we have

T²_x = { aQ_11 + 2[a(1 − a)]^{1/2}Q_12 + (1 − a)Q_22 } / (σ̂/σ)². (2.3.3)

Now we need to maximize T²_x with respect to x or, equivalently, with respect to a ∈ [0, 1]. This can be done by maximizing the numerator of T²_x,

T̃ = aQ_11 + 2[a(1 − a)]^{1/2}Q_12 + (1 − a)Q_22,

with respect to a ∈ [0, 1]. Taking the derivative of T̃ with respect to a and setting it equal to 0, the following equation is obtained:

Q_11 + { (1 − 2a)/[a(1 − a)]^{1/2} } Q_12 − Q_22 = 0. (2.3.4)

Rearranging Equation (2.3.4) gives the equation

(Q_22 − Q_11)/Q_12 = (1 − 2a)/[a(1 − a)]^{1/2}, (2.3.5)

and squaring both sides of Equation (2.3.5) yields

[ (Q_22 − Q_11)/(2Q_12) ]² = (1 − 2a)²/[4a(1 − a)]. (2.3.6)

Now define

V = (Q_22 − Q_11)/(2Q_12), provided Q_12 ≠ 0. (2.3.7)

Substitution of Equation (2.3.7) into Equation (2.3.6) yields

V² = (1 − 2a)²/[4a(1 − a)]. (2.3.8)

Standard arguments show that there are two possible solutions to Equation (2.3.8), given by

a = { 1 ± [V²/(1 + V²)]^{1/2} }/2. (2.3.9)

Notice that 0 < {1 − [V²/(1 + V²)]^{1/2}}/2 < 1/2 and 1/2 < {1 + [V²/(1 + V²)]^{1/2}}/2 < 1. To decide which is the valid solution to Equation (2.3.4), the following two cases are considered:

1. If Q_22 > Q_11, then V > 0 and, from Equation (2.3.5), 1 − 2a > 0, that is, a < 1/2. In this case, the solution {1 − [V²/(1 + V²)]^{1/2}}/2 should be chosen, and hence the solution to Equation (2.3.4) is a = [1 − V/(1 + V²)^{1/2}]/2, since V > 0.

2. If Q_22 < Q_11, then V < 0 and, from Equation (2.3.5), 1 − 2a < 0, that is, a > 1/2. In this case, the solution {1 + [V²/(1 + V²)]^{1/2}}/2 is chosen, and hence the solution to Equation (2.3.4) is again a = [1 − V/(1 + V²)^{1/2}]/2, since V < 0.

With the arguments above, and after checking the second derivative with respect to a, it is concluded that T̃ (and hence T²_x) achieves its maximum at a = [1 − V/(1 + V²)^{1/2}]/2, denoted by A. Now, write

sup(T²_{c,x}) = sup_{a ∈ [0,1]} { aQ_11 + 2[a(1 − a)]^{1/2}Q_12 + (1 − a)Q_22 } / (σ̂/σ)² (2.3.10)
= sup_{a ∈ [0,1]} { Q_11 + Q_22 + (1 − 2a)(Q_22 − Q_11) + 4|Q_12|[a(1 − a)]^{1/2} } / { 2(σ̂/σ)² }
= { Q_11 + Q_22 + [V/(1 + V²)^{1/2}](Q_22 − Q_11) + 2|Q_12|/(1 + V²)^{1/2} } / { 2(σ̂/σ)² },

since when a = [1 − V/(1 + V²)^{1/2}]/2, we have 1 − 2a = V/(1 + V²)^{1/2} and a(1 − a) = 1/[4(1 + V²)]. Continuing,

sup(T²_{c,x}) = { Q_11 + Q_22 + [(Q_22 − Q_11)² + 4Q_12²]/[2|Q_12|(1 + V²)^{1/2}] } / { 2(σ̂/σ)² }
= { Q_11 + Q_22 + [(Q_22 − Q_11)² + 4Q_12²]^{1/2} } / { 2(σ̂/σ)² },

because (1 + V²)^{1/2} = [(Q_22 − Q_11)² + 4Q_12²]^{1/2}/(2|Q_12|) by the definition of V. Hence

sup(T²_{c,x}) = { Q_11 + Q_22 + [(Q_22 − Q_11)² + 4R²Q_11Q_22]^{1/2} } / { 2(σ̂/σ)² } = Q/(σ̂²/σ²), (2.3.11)

where Q = {Q_11 + Q_22 + [4R²Q_11Q_22 + (Q_22 − Q_11)²]^{1/2}}/2 and R = Q_12/[Q_11Q_22]^{1/2}. Notice that in the definition of the random variable V, Q_12 is assumed to be nonzero. However, even when Q_12 = 0, the result still holds. This is because if Q_12 = 0, then Q = [Q_11 + Q_22 + |Q_22 − Q_11|]/2 = max(Q_11, Q_22); and if Q_12 = 0, the numerator of the right side of Equation (2.3.10) is equal to max_{a ∈ [0,1]} [(Q_11 − Q_22)a + Q_22] or, equivalently, to max(Q_11, Q_22), which is the same as Q. The proof is complete.

2.4 The Exact Distribution of the Pivotal Quantity

In Section 2.3, a closed-form expression for the pivotal quantity sup(T²_{c,x}) was found. The next step is to derive the exact distribution of this quantity. Before the derivation, some remarks are collected that will be needed throughout. First, as mentioned in Section 2.2, (Z_11, ..., Z_1k) and (Z_21, ..., Z_2k) are independent sets of k iid standard normal variables under the design constraints. The variable Q_jj defined earlier is the numerator of the sample variance of the jth set, j = 1, 2. The variable

R is the sample correlation coefficient computed on (Z_1i, Z_2i), i = 1, ..., k. Moreover, the Z_ji depend on the original regression data only through the α̂_i's and β̂_i's. Second, the variables Q_11, Q_22, ν(σ̂²/σ²), and R are mutually independent (Anderson (1958) and Graybill (1976)). Furthermore, the first three variables have χ² distributions with k − 1, k − 1, and ν degrees of freedom, respectively, and R has the density

f_R(r) = Γ[(k − 1)/2](1 − r²)^{(k−4)/2} / { Γ[(k − 2)/2]π^{1/2} }, −1 ≤ r ≤ 1. (2.4.1)

The following theorem gives the exact distribution of the quantity sup(T²_{c,x}). Some notation is used in the theorem and in the proof: let F_{ν1,ν2} denote the distribution function of the F distribution with ν1 and ν2 degrees of freedom, and let p_i = Π_{j=1}^{i} (2j − 1) for positive integers i.

Theorem 2.4.1. For odd k ≥ 3,

P[sup(T²_{c,x}) ≤ b²] = (k − 2)F_{k−1,ν}[2b²/(k − 1)] − [(k − 1)/2] · π^{1/2}b^{k−2}Γ[(ν + k − 2)/2] / { Γ[(k − 1)/2]Γ(ν/2)ν^{(k−2)/2}[1 + (b²/ν)]^{(ν+k−2)/2} } · F_{1,ν+k−2}[b²(ν + k − 2)/(b² + ν)] + { Γ[(k − 1)/2]2^{(k−1)/2} }^{−1} Σ_{i=1}^{(k−5)/2} { (k − 3 − 2i)Γ[(k + 2i)/2]/p_{i+1} } F_{k−1+2i,ν}[2b²/(k − 1 + 2i)],

where the summation is defined to be 0 for k = 3 or 5.

For even k ≥ 4,

P[sup(T²_{c,x}) ≤ b²] = { Γ(k/2)2^{(k−2)/2}/p_{(k−2)/2} } [F_{k,ν}(b²/k) − F_{k−2,ν}(b²/(k − 2))] + (1/p_{(k−2)/2}) Σ_{i=1}^{(k−2)/2} { iΓ(k − 2 − i)/[2^{(k−2)/2−i}Γ(k/2 − i)] } F_{2(k−2−i),ν}(b²/(k − i − 2)).

The following lemma will be used in the proof of Theorem 2.4.1; the proof of this lemma will be given after the proof of Theorem 2.4.1 is completed.

Lemma 2.4.1. Let a ≥ −1 be an odd integer. Then

∫_0^z q^{a/2} exp(−q/2) erf[(q/2)^{1/2}] dq = Γ[(a + 2)/2]2^{(a+2)/2} { [G_1(z)]²/2 + (1/π) Σ_{i=1}^{(a+1)/2} [Γ(i)G_{2i}(2z)/p_i] − (2/π)^{1/2} G_1(z) Σ_{i=1}^{(a+1)/2} [z^{i−(1/2)} exp(−z/2)/p_i] },

where G_m denotes the distribution function of the chi-squared distribution with m degrees of freedom, and the summations are defined to be 0 if a = −1.

Proof of Theorem 2.4.1. First we derive the density of Q, f(q). Define W_1 = min(Q_11, Q_22), W_2 = max(Q_11, Q_22), and W_3 = R². It follows from Equation (2.4.1) that W_3 has the density

f_{W_3}(w_3) = Γ[(k − 1)/2](1 − w_3)^{(k−4)/2} / { Γ[(k − 2)/2][πw_3]^{1/2} }, 0 ≤ w_3 ≤ 1.

Also, W_1 and W_2 have the joint density

f_{W_1,W_2}(w_1, w_2) = 2f(w_1)f(w_2) = 2 { 1/[Γ((k − 1)/2)2^{(k−1)/2}] }² exp(−(w_1 + w_2)/2)(w_1w_2)^{(k−3)/2}.

This follows from results on order statistics and from the fact that Q_11 and Q_22 are independent and identically distributed random variables, each following a χ² distribution with k − 1 degrees of freedom. As remarked earlier, Q_11, Q_22, and R are mutually independent. Then Q_11, Q_22, and W_3 are also mutually independent, and hence W_3 is independent of W_1 and W_2. Therefore, the following joint density of (W_1, W_2, W_3) is obtained:

f(w_1, w_2, w_3) = 2 exp(−(w_1 + w_2)/2)(w_1w_2)^{(k−3)/2} Γ[(k − 1)/2](1 − w_3)^{(k−4)/2} / { [Γ((k − 1)/2)2^{(k−1)/2}]² Γ[(k − 2)/2][πw_3]^{1/2} }
= exp(−(w_1 + w_2)/2)(w_1w_2)^{(k−3)/2}(1 − w_3)^{(k−4)/2} / { Γ[(k − 1)/2]Γ[(k − 2)/2]2^{(k−2)}[πw_3]^{1/2} }, (2.4.2)

for 0 ≤ w_1 ≤ w_2 < ∞ and 0 ≤ w_3 ≤ 1. Now the joint density of (W_1, W_2, Q) can be found. Note that

W_3 = (Q − W_1)(Q − W_2)/(W_1W_2).

This is because Q = {Q_11 + Q_22 + [4R²Q_11Q_22 + (Q_22 − Q_11)²]^{1/2}}/2 can be rearranged into the equation (2Q − Q_11 − Q_22)² = 4R²Q_11Q_22 + (Q_22 − Q_11)². Then,

R² = [(2Q − Q_11 − Q_22)² − (Q_22 − Q_11)²]/(4Q_11Q_22)
= [4Q² − 4Q(Q_11 + Q_22) + 4Q_11Q_22]/(4Q_11Q_22)
= (Q − Q_11)(Q − Q_22)/(Q_11Q_22)
= (Q − W_1)(Q − W_2)/(W_1W_2).

Consider the change of variables

W_1 = W_1, W_2 = W_2, W_3 = (Q − W_1)(Q − W_2)/(W_1W_2).

Since the first two variables are unchanged, the Jacobian reduces to

J = ∂w_3/∂q = (2q − w_1 − w_2)/(w_1w_2).

It follows that the joint density of (W_1, W_2, Q) is

f(w_1, w_2, q) = [(2q − w_1 − w_2)/(w_1w_2)] exp(−(w_1 + w_2)/2)(w_1w_2)^{(k−3)/2} { 1 − (q − w_1)(q − w_2)/(w_1w_2) }^{(k−4)/2} / { 2^{(k−2)}Γ[(k − 1)/2]Γ[(k − 2)/2] { π(q − w_1)(q − w_2)/(w_1w_2) }^{1/2} }
= (2q − w_1 − w_2) exp(−(w_1 + w_2)/2)[q(w_1 + w_2 − q)]^{(k−4)/2} / { 2^{(k−2)}Γ[(k − 1)/2]Γ[(k − 2)/2][π(q − w_1)(q − w_2)]^{1/2} }, (2.4.3)

for 0 ≤ w_1 ≤ w_2 ≤ q ≤ w_1 + w_2 < ∞. Now consider the further change of variables

X_1 = (2Q − W_1 − W_2)/Q, X_2 = (Q − W_1)/(2Q − W_1 − W_2).

This leads to the inverse functions

W_1 = Q(1 − X_1X_2), W_2 = Q[1 − X_1(1 − X_2)], Q = Q,

and hence the Jacobian can be computed to be

J = det( ∂(w_1, w_2, q)/∂(x_1, x_2, q) ) = det [ −qx_2  −qx_1  1 − x_1x_2 ; −q(1 − x_2)  qx_1  1 − x_1(1 − x_2) ; 0  0  1 ] = (−qx_2)(qx_1) − (−qx_1)[−q(1 − x_2)] = −q²x_1x_2 − q²x_1(1 − x_2) = −q²x_1,

so that |J| = q²x_1. To obtain the joint density of (X_1, X_2, Q), notice the following equalities:

w_1 + w_2 = 2q − qx_1, q − w_1 = qx_1x_2, q − w_2 = qx_1(1 − x_2).

Also note that x_1 = (2q − w_1 − w_2)/q, and since w_1 ≤ w_2 ≤ q ≤ w_1 + w_2, we have 0 ≤ 2q − w_1 − w_2 ≤ q, so that 0 < x_1 ≤ 1. Furthermore,

x_2 = (q − w_1)/(2q − w_1 − w_2) ≥ (q − w_2)/(2q − w_1 − w_2), since w_1 ≤ w_2,

so that x_2 ≥ 1 − x_2, that is, x_2 ≥ 1/2, and

x_2 = (q − w_1)/(2q − w_1 − w_2) < (q − w_1)/(2q − w_1 − q) = 1, since q > w_2,

so 1/2 ≤ x_2 < 1. Finally, 0 < q < ∞.

Then the joint density of (X_1, X_2, Q) is

f(x_1, x_2, q) = q²x_1 · qx_1 exp(−(2q − qx_1)/2)[q(q − qx_1)]^{(k−4)/2} / { 2^{(k−2)}Γ[(k − 1)/2]Γ[(k − 2)/2][π q²x_1² x_2(1 − x_2)]^{1/2} }
= q^{k−2} exp(−q + qx_1/2) x_1(1 − x_1)^{(k−4)/2} / { 2^{(k−2)}Γ[(k − 1)/2]Γ[(k − 2)/2][πx_2(1 − x_2)]^{1/2} }, (2.4.4)

for 0 < x_1 ≤ 1, 1/2 ≤ x_2 < 1, and 0 < q < ∞. At this point, the density of Q can be found by integrating the joint density of (X_1, X_2, Q) with respect to x_1 and x_2. This yields

f(q) = ∫_0^1 ∫_{1/2}^1 q^{k−2} exp(−q + qx_1/2) x_1(1 − x_1)^{(k−4)/2} / { 2^{(k−2)}Γ[(k − 1)/2]Γ[(k − 2)/2][πx_2(1 − x_2)]^{1/2} } dx_2 dx_1
= π^{1/2} q^{k−2} exp(−q) / { 2^{k−1}Γ[(k − 1)/2]Γ[(k − 2)/2] } ∫_0^1 x_1(1 − x_1)^{(k−4)/2} exp(qx_1/2) dx_1, (2.4.5)

because ∫_{1/2}^1 [x_2(1 − x_2)]^{−1/2} dx_2 = [arcsin(2x_2 − 1)]_{1/2}^{1} = π/2. In order to derive the density of Q, the following two cases are considered: even k ≥ 4 and odd k ≥ 3.

Case 1: k ≥ 4 and k is even. Consider the integral ∫_0^1 x_1(1 − x_1)^{(k−4)/2} exp(qx_1/2) dx_1. Let x_1 = 1 − y, and for

convenience of notation also let n = (k − 4)/2. Then,

∫_0^1 x_1(1 − x_1)^{(k−4)/2} exp(qx_1/2) dx_1 = ∫_0^1 (1 − y)y^n exp(q(1 − y)/2) dy = exp(q/2) [ ∫_0^1 y^n exp(−qy/2) dy − ∫_0^1 y^{n+1} exp(−qy/2) dy ]. (2.4.6)

Note that

∫_0^1 y^{n+1} exp(−qy/2) dy = −(2/q) [ y^{n+1} exp(−qy/2) ]_0^1 + (2/q)(n + 1) ∫_0^1 y^n exp(−qy/2) dy = −(2/q) exp(−q/2) + [2(n + 1)/q] ∫_0^1 y^n exp(−qy/2) dy. (2.4.7)

Substituting Equation (2.4.7) into Equation (2.4.6) yields

∫_0^1 x_1(1 − x_1)^{(k−4)/2} exp(qx_1/2) dx_1 = (2/q) + [1 − 2(n + 1)/q] exp(q/2) ∫_0^1 y^n exp(−qy/2) dy. (2.4.8)

For the term ∫_0^1 y^n exp(−qy/2) dy, it follows after repeated integration by parts and collecting terms that

∫_0^1 y^n exp(−qy/2) dy = −exp(−q/2) Σ_{i=1}^{n+1} 2^i n! / [(n − i + 1)! q^i] + 2^{n+1} n!/q^{n+1}. (2.4.9)
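Equation (2.4.9) can be spot-checked numerically. The following sketch (our addition) compares the closed form with a midpoint-rule evaluation of the integral:

```python
import math

# Spot-check of the repeated-integration-by-parts formula for
# I_n(q) = integral over [0, 1] of y^n * exp(-q*y/2) dy.
def I_closed(n, q):
    """Closed form from (2.4.9)."""
    s = sum(2 ** i * math.factorial(n) / (math.factorial(n - i + 1) * q ** i)
            for i in range(1, n + 2))
    return -math.exp(-q / 2) * s + 2 ** (n + 1) * math.factorial(n) / q ** (n + 1)

def I_numeric(n, q, m=20000):
    """Midpoint-rule approximation of the same integral."""
    h = 1.0 / m
    return h * sum(((j + 0.5) * h) ** n * math.exp(-q * (j + 0.5) * h / 2)
                   for j in range(m))

for n, q in [(0, 1.0), (1, 2.5), (2, 3.0), (3, 0.7)]:
    assert abs(I_closed(n, q) - I_numeric(n, q)) < 1e-6
```

For n = 0 the formula reduces to (2/q)(1 − e^{−q/2}), which is the integral computed directly.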

Substituting Equation (2.4.9) into Equation (2.4.8) yields

∫_0^1 x_1(1 − x_1)^{(k−4)/2} exp(qx_1/2) dx_1 = (2/q) + [(k − 2 − q)/q] Σ_{i=1}^{(k−2)/2} 2^i[(k − 4)/2]! / { [(k − 2)/2 − i]! q^i } + (q − k + 2) exp(q/2)2^{(k−2)/2}[(k − 4)/2]!/q^{k/2}, (2.4.10)

where the last step uses the facts that n = (k − 4)/2 and 1 − 2(n + 1)/q = (q − k + 2)/q. Combining Equations (2.4.5) and (2.4.10), the density of Q is obtained as

f(q) = π^{1/2} q^{k−2} exp(−q) / { 2^{k−1}Γ[(k − 1)/2]Γ[(k − 2)/2] } × { (2/q) + [(k − 2 − q)/q] Σ_{i=1}^{(k−2)/2} 2^i[(k − 4)/2]! / { [(k − 2)/2 − i]! q^i } + (q − k + 2) exp(q/2)2^{(k−2)/2}[(k − 4)/2]!/q^{k/2} }. (2.4.11)

To proceed, notice that when k is even, the following equalities hold:

Γ[(k − 2)/2] = [(k − 4)/2]!, (2.4.12a)
Γ[(k − 1)/2] = π^{1/2} p_{(k−2)/2} / 2^{(k−2)/2}. (2.4.12b)

Substituting Equations (2.4.12a) and (2.4.12b) into Equation (2.4.11), the last term in the bracket contributes

π^{1/2} q^{k−2} exp(−q) / { 2^{k−1}Γ[(k − 1)/2]Γ[(k − 2)/2] } · (q − k + 2) exp(q/2)2^{(k−2)/2}[(k − 4)/2]!/q^{k/2} = exp(−q/2)q^{(k−4)/2}(q − k + 2) / { 2p_{(k−2)/2} }. (2.4.13)

Now consider Equation (2.4.11) again, and use the identity [(k − 2)/2 − i]! = Γ(k/2 − i) together with (k − 2 − q)/q = (k − 2)/q − 1 to write the first two terms in the bracket as

(2/q) + [(k − 2 − q)/q] Σ_{i=1}^{(k−2)/2} 2^iΓ[(k − 2)/2] / [Γ(k/2 − i)q^i]
= (2/q) + Σ_{i=1}^{(k−2)/2} (k − 2)2^iΓ[(k − 2)/2] / [Γ(k/2 − i)q^{i+1}] − Σ_{i=1}^{(k−2)/2} 2^iΓ[(k − 2)/2] / [Γ(k/2 − i)q^i]
= Σ_{i=1}^{(k−2)/2} (k − 2)2^iΓ[(k − 2)/2] / [Γ(k/2 − i)q^{i+1}] − Σ_{i=1}^{(k−2)/2−1} 2^{i+1}Γ[(k − 2)/2] / [Γ(k/2 − i − 1)q^{i+1}] (the i = 1 term of the second sum cancels 2/q; change of dummy variable in the remainder)
= Σ_{i=1}^{(k−2)/2} i2^{i+1}Γ[(k − 2)/2] / [Γ(k/2 − i)q^{i+1}], (2.4.14)

by the fact that Γ(k/2 − i) = (k/2 − i − 1)Γ(k/2 − i − 1). Multiplying (2.4.14) by the factor π^{1/2}q^{k−2}exp(−q)/{2^{k−1}Γ[(k − 1)/2]Γ[(k − 2)/2]} in front of the bracket in (2.4.11), and using Equation (2.4.12b), gives

Σ_{i=1}^{(k−2)/2} i exp(−q)q^{k−3−i} / { 2^{(k−2)/2−i}Γ(k/2 − i)p_{(k−2)/2} }.

Therefore, the density of Q is obtained after combining this with Equation (2.4.13):

f(q) = { exp(−q/2)q^{(k−4)/2}(q − k + 2)/2 + Σ_{i=1}^{(k−2)/2} i exp(−q)q^{k−3−i} / [2^{(k−2)/2−i}Γ(k/2 − i)] } / p_{(k−2)/2}, 0 < q < ∞. (2.4.15)

It follows that the distribution function of Q is

P(Q ≤ q) = ∫_0^q { exp(−x/2)x^{(k−4)/2}(x − k + 2)/2 + Σ_{i=1}^{(k−2)/2} i exp(−x)x^{k−3−i} / [2^{(k−2)/2−i}Γ(k/2 − i)] } / p_{(k−2)/2} dx
= { Γ(k/2)2^{(k−2)/2}[G_k(q) − G_{k−2}(q)] } / p_{(k−2)/2} + (1/p_{(k−2)/2}) Σ_{i=1}^{(k−2)/2} { i/[2^{(k−2)/2−i}Γ(k/2 − i)] } ∫_0^q x^{k−3−i} exp(−x) dx,

where G_k denotes the distribution function of the chi-squared distribution with k degrees of freedom; the first term follows from

∫_0^q exp(−x/2)x^{(k−4)/2}(x − k + 2)/2 dx = −q^{(k−2)/2} exp(−q/2) = Γ(k/2)2^{(k−2)/2}[G_k(q) − G_{k−2}(q)].

Now let y = 2x, so that

∫_0^q x^{k−3−i} exp(−x) dx = (1/2)^{k−2−i} ∫_0^{2q} y^{k−3−i} exp(−y/2) dy = Γ(k − 2 − i)G_{2(k−2−i)}(2q).

Therefore,

P(Q ≤ q) = { Γ(k/2)2^{(k−2)/2}[G_k(q) − G_{k−2}(q)] } / p_{(k−2)/2} + (1/p_{(k−2)/2}) Σ_{i=1}^{(k−2)/2} { iΓ(k − 2 − i)/[2^{(k−2)/2−i}Γ(k/2 − i)] } G_{2(k−2−i)}(2q). (2.4.16)
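The distribution function of Q just derived can be checked by simulation. For k = 4 it reduces (after substituting the chi-squared distribution functions; this simplification is ours) to P(Q ≤ q) = 1 − q e^{−q/2} − e^{−q}, and simulated values of Q, computed from raw normal draws via the closed form of Theorem 2.3.1, should reproduce it:

```python
import math, random

def q_statistic(k, rng):
    """One draw of Q, computed from iid standard normals via Theorem 2.3.1."""
    z1 = [rng.gauss(0, 1) for _ in range(k)]
    z2 = [rng.gauss(0, 1) for _ in range(k)]
    zb1, zb2 = sum(z1) / k, sum(z2) / k
    q11 = sum((z - zb1) ** 2 for z in z1)
    q22 = sum((z - zb2) ** 2 for z in z2)
    q12 = sum((a - zb1) * (b - zb2) for a, b in zip(z1, z2))
    return (q11 + q22 + math.sqrt(4 * q12 ** 2 + (q22 - q11) ** 2)) / 2

rng = random.Random(2024)
q0, draws = 5.0, 50000
empirical = sum(q_statistic(4, rng) <= q0 for _ in range(draws)) / draws
# The k = 4 case of the distribution function of Q derived above:
exact = 1 - q0 * math.exp(-q0 / 2) - math.exp(-q0)
print(round(empirical, 3), round(exact, 3))  # agreement to Monte Carlo error
```

Because the simulation starts from raw normal draws, it exercises the entire chain of the derivation, not just the final algebra.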

Let h(u) denote the density of U = σ̂²/σ². Then, by conditioning on U = u,

P[sup(T²_{c,x}) ≤ b²] = P(Q ≤ b²U) = ∫_0^∞ P(Q ≤ b²u)h(u) du. (2.4.17)

Notice that

∫_0^∞ G_k(b²u)h(u) du = P(Y ≤ b²U), where Y has a chi-squared distribution with k df, U has distribution function H(u), and Y and U are independent,
= P( (Y/k)/(νU/ν) ≤ b²/k ) = F_{k,ν}(b²/k),

because νU has a chi-squared distribution with ν df. Finally, substituting Equation (2.4.16) into Equation (2.4.17) and using the result obtained above that ∫_0^∞ G_k(b²u)h(u) du = F_{k,ν}(b²/k), the following result follows in the case where k ≥ 4 and k is even:

P[sup(T²_{c,x}) ≤ b²] = { Γ(k/2)2^{(k−2)/2}/p_{(k−2)/2} } ∫_0^∞ [G_k(b²u) − G_{k−2}(b²u)]h(u) du + (1/p_{(k−2)/2}) Σ_{i=1}^{(k−2)/2} { iΓ(k − 2 − i)/[2^{(k−2)/2−i}Γ(k/2 − i)] } ∫_0^∞ G_{2(k−2−i)}(2b²u)h(u) du
= { Γ(k/2)2^{(k−2)/2}/p_{(k−2)/2} } [F_{k,ν}(b²/k) − F_{k−2,ν}(b²/(k − 2))] + (1/p_{(k−2)/2}) Σ_{i=1}^{(k−2)/2} { iΓ(k − 2 − i)/[2^{(k−2)/2−i}Γ(k/2 − i)] } F_{2(k−2−i),ν}(b²/(k − i − 2)). (2.4.18)
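The closed form of Theorem 2.3.1 that underlies this distribution can itself be checked numerically: for simulated normals, the formula for Q should match a brute-force maximization over a ∈ [0, 1], with |Q_12| absorbing the sign of x in the supremum over the real line. A sketch of such a check (our addition):

```python
import math, random

# Brute-force check of the closed form in Theorem 2.3.1:
# the supremum over a in [0, 1] of
#     a*Q11 + 2*sqrt(a*(1-a))*|Q12| + (1-a)*Q22
# should equal {Q11 + Q22 + [4*Q12^2 + (Q22 - Q11)^2]^(1/2)}/2.
rng = random.Random(7)
k = 5
z1 = [rng.gauss(0, 1) for _ in range(k)]
z2 = [rng.gauss(0, 1) for _ in range(k)]
zb1, zb2 = sum(z1) / k, sum(z2) / k
q11 = sum((z - zb1) ** 2 for z in z1)
q22 = sum((z - zb2) ** 2 for z in z2)
q12 = sum((a - zb1) * (b - zb2) for a, b in zip(z1, z2))

closed = (q11 + q22 + math.sqrt(4 * q12 ** 2 + (q22 - q11) ** 2)) / 2
grid = max(a * q11 + 2 * math.sqrt(a * (1 - a)) * abs(q12) + (1 - a) * q22
           for a in (i / 200000 for i in range(200001)))
assert grid <= closed + 1e-9  # the closed form is an upper bound ...
assert closed - grid < 1e-4   # ... and it is attained, to grid accuracy
```

Equivalently, the closed form is the largest eigenvalue of the 2 by 2 matrix with diagonal entries Q_11, Q_22 and off-diagonal entries Q_12, which is how the supremum over the unit vector (a^{1/2}, (1 − a)^{1/2}) behaves.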