Reports of the Institute of Biostatistics


Reports of the Institute of Biostatistics, No 01 / 2007
Leibniz University of Hannover, Natural Sciences Faculty
Title: IUT for multiple endpoints
Author: Mario Hasler

1 Introduction

Some of the focus in new drug development has shifted to developing new medicines which may not necessarily be more effective but have other advantages over currently marketed drugs, such as reduced toxicity. One application is to show the safety of a new treatment on multiple endpoints compared to a reference. A rigorous claim is to declare global safety if and only if each endpoint is safe. Two-sided hypotheses are appropriate for most endpoints because the direction of a harmful effect is not known a priori. That is, each endpoint must neither undershoot a given lower limit of the reference nor overshoot a given upper limit of this reference. Because it is often hard to fix uniform absolute safety thresholds jointly for all endpoints, ratios to control (not only differences) shall be considered, too. The equivalence thresholds must be set a priori, but they are relative, e.g. in percent, which gives an easy interpretation. For example, the new treatment will be declared safe if each endpoint neither undershoots a lower limit of 80% of the reference nor overshoots an upper limit of 125% of this reference.

Much work has been done on the assessment of bioequivalence or therapeutic equivalence between two treatments on a univariate endpoint, but there is limited research on the assessment of equivalence on multiple endpoints. The traditional way to treat this problem, the intersection-union test (IUT), is known to be conservative in many situations. Against this background, the question arises whether there are tests without this weak point. In fact, there are some improved tests based on the IUT, but most of them only hold for special cases. On the other hand, different approaches exist, like Hotelling's $T^2$-test, which uses a sum-of-squares test statistic for the differences in the means to show equivalence on multiple endpoints. A short list of recommended literature is: Bloch et al. [2], Berger and Hsu [1], Casella and Berger [3], Hochberg and Tamhane [5], Wu et al. [8].
These tests either do not exploit the complete type I error (they have level $\alpha$, not size $\alpha$) or they are not applicable to ratios. Like the union-intersection test (UIT), for which a multivariate t-distribution can be derived for the global test statistic, the idea was to do the same for the intersection-union test (IUT). The traditional IUT becomes less conservative for high correlations and, hence, is very conservative for low or negative ones. A multivariate approach that takes correlations into account was expected to avoid this handicap and, in this way, to yield a size-$\alpha$ test.

2 Union-intersection and intersection-union method

2.1 Union-intersection method

The union-intersection (UI) method of test construction is useful when the null hypothesis can be conveniently expressed as an intersection of a family of hypotheses, that is,
\[ H_0 = \bigcap_{i=1}^{k} H_{0i}. \]

Suppose that a suitable test is available for each $H_{0i}: \theta \in \Theta_i$ versus $H_{1i}: \theta \in \Theta_i^c$. We can then write
\[ H_0: \theta \in \bigcap_{i=1}^{k} \Theta_i. \]
Say the rejection region for the test of $H_{0i}$ is $\{x : T_i(x) \in R_i\}$. Hence, according to Roy (1953), the rejection region for the union-intersection test of $H_0$ is
\[ \bigcup_{i=1}^{k} \{x : T_i(x) \in R_i\}. \]
This means that the global null hypothesis $H_0$ is rejected if and only if at least one of its component local null hypotheses $H_{0i}$ is rejected. For example, a new drug is tested and said to be hazardous if at least one endpoint is hazardous. Depending on the test direction, the local rejection region for each of the individual tests may be $\{x : T_i(x) > c\}$ with a common $c$ for each individual test. The global rejection region of the UIT is then
\[ \bigcup_{i=1}^{k} \{x : T_i(x) > c\} = \{x : \max_{i=1,\dots,k} T_i(x) > c\}. \]
Thus, the test statistic for testing $H_0$ is
\[ T(x) = \max_{i=1,\dots,k} T_i(x). \]
For the inverse test direction, the local rejection region for each of the $H_{0i}$ is $\{x : T_i(x) < c\}$. Analogous considerations lead to the test statistic
\[ T(x) = \min_{i=1,\dots,k} T_i(x). \]

2.2 Intersection-union method

In contrast to the union-intersection method, the intersection-union (IU) method of test construction is useful if the null hypothesis can be conveniently expressed as a union of a family of hypotheses, that is,
\[ H_0 = \bigcup_{i=1}^{k} H_{0i}. \]

Again, supposing that a suitable test is available for each $H_{0i}: \theta \in \Theta_i$ versus $H_{1i}: \theta \in \Theta_i^c$, we can then write
\[ H_0: \theta \in \bigcup_{i=1}^{k} \Theta_i. \]
The rejection region for the test of $H_{0i}$ is $\{x : T_i(x) \in R_i\}$. Hence, the rejection region for the intersection-union test of $H_0$ is
\[ \bigcap_{i=1}^{k} \{x : T_i(x) \in R_i\}. \]
This means that the global null hypothesis $H_0$ is rejected if and only if each of its component local null hypotheses $H_{0i}$ is rejected. For example, a new drug is tested and said to be safe if each endpoint is safe.

Theorem: Let $\alpha_i$ be the size of the test of $H_{0i}$ with rejection region $R_i$ $(i = 1, \dots, k)$. Then the IUT with rejection region $R = \bigcap_{i=1}^{k} R_i$ is a level-$\alpha$ test, that is, its size is at most $\alpha$, with $\alpha = \max_{i=1,\dots,k} \alpha_i$.

Proof: Let $\theta \in \bigcup_{i=1}^{k} \Theta_i$. Then $\theta \in \Theta_i$ for some $i$ and
\[ P_\theta(X \in R) \le P_\theta(X \in R_i) \le \alpha_i \le \alpha. \]
q.e.d.

Suppose the test direction for which the local rejection region for each of the individual tests is $\{x : T_i(x) > c\}$ with a common $c$ for each individual test. Then the global rejection region of the IUT is
\[ \bigcap_{i=1}^{k} \{x : T_i(x) > c\} = \{x : \min_{i=1,\dots,k} T_i(x) > c\}, \]
and thus the test statistic for testing $H_0$ is
\[ T(x) = \min_{i=1,\dots,k} T_i(x). \]
Again, the inverse test direction leads to the local rejection region $\{x : T_i(x) < c\}$ for each of the $H_{0i}$, and we obtain the test statistic
\[ T(x) = \max_{i=1,\dots,k} T_i(x). \]
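The two combination rules of this section can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the local statistics and the critical value are hypothetical, and the "reject for large values" direction is assumed.

```python
from typing import Sequence


def uit_rejects(local_stats: Sequence[float], c: float) -> bool:
    """Union-intersection rule: reject the global H0 if at least one T_i > c."""
    return max(local_stats) > c


def iut_rejects(local_stats: Sequence[float], c: float) -> bool:
    """Intersection-union rule: reject the global H0 only if every T_i > c."""
    return min(local_stats) > c


if __name__ == "__main__":
    stats = [2.4, 1.1, 3.0]  # hypothetical local test statistics T_1..T_3
    c = 1.7                  # hypothetical common critical value
    print("UIT rejects:", uit_rejects(stats, c))
    print("IUT rejects:", iut_rejects(stats, c))
```

With the same local statistics, the UIT looks at the maximum and the IUT at the minimum, which is why a single unremarkable endpoint is enough to block the IUT.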

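3 Test procedure

3.1 Assumptions

For $i = 1, \dots, k$ and $j = 1, \dots, n_X$, let $X_{ij}$ denote the outcomes for $k$ endpoints of an experimental treatment. Suppose that these random variables follow a $k$-variate normal distribution with mean vector $\mu_X = (\mu_{X1}, \dots, \mu_{Xk})$ and unknown covariance matrix $\Sigma_X$. In the same manner, let the outcomes $Y_{ij}$ $(j = 1, \dots, n_Y)$ of a reference treatment be $k$-variate normally distributed with parameters $\mu_Y = (\mu_{Y1}, \dots, \mu_{Yk})$ and $\Sigma_Y$. Suppose that $X_{ij}$ and $Y_{ij}$ are mutually independent and $\Sigma_X = \Sigma_Y = \Sigma$. In this way, the experimental and the reference treatment are presumed to have the same variation for each single endpoint. Let $\bar X = (\bar X_1, \dots, \bar X_k)$, $\bar Y = (\bar Y_1, \dots, \bar Y_k)$ and $\hat\Sigma_X$, $\hat\Sigma_Y$ be the sample mean vectors and the sample covariance matrices for both treatments, respectively, with
\[ \bar X_i = \frac{1}{n_X} \sum_{j=1}^{n_X} X_{ij}, \qquad \bar Y_i = \frac{1}{n_Y} \sum_{j=1}^{n_Y} Y_{ij}. \]
The pooled sample covariance matrix $\hat\Sigma$ is given by
\[ \hat\Sigma = \frac{(n_X - 1)\hat\Sigma_X + (n_Y - 1)\hat\Sigma_Y}{n_X + n_Y - 2} \]
with the elements
\[ \hat\sigma_{ij} = \widehat{\mathrm{Cov}}_{ij} = \frac{(n_X - 1)\,\widehat{\mathrm{Cov}}(X_i, X_j) + (n_Y - 1)\,\widehat{\mathrm{Cov}}(Y_i, Y_j)}{n_X + n_Y - 2} \qquad (1 \le i, j \le k), \]
where $i$ and $j$ both index endpoints and $\widehat{\mathrm{Cov}}(X_i, X_j)$ and $\widehat{\mathrm{Cov}}(Y_i, Y_j)$ are the estimates for the covariances of the several endpoints. This weighting is not the same as that of Bloch et al. [2]. With this notation, however, the diagonal elements are
\[ \hat\sigma_{ii} = \hat\sigma_i^2 = \frac{(n_X - 1) S_{X_i}^2 + (n_Y - 1) S_{Y_i}^2}{n_X + n_Y - 2} \qquad (i = 1, \dots, k) \]
with
\[ S_{X_i}^2 = \frac{1}{n_X - 1} \sum_{j=1}^{n_X} (X_{ij} - \bar X_i)^2, \qquad S_{Y_i}^2 = \frac{1}{n_Y - 1} \sum_{j=1}^{n_Y} (Y_{ij} - \bar Y_i)^2, \]
which are needed in the following test procedure. From the pooled sample covariance matrix $\hat\Sigma$, we then derive the estimate $\hat R$ of the common correlation matrix of the data.

The objective is to compare the new experimental treatment with the reference and to consider it safe if each endpoint is safe. This calls for an intersection-union test. We first consider the one-sided and later the equivalence test problem.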
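As a rough illustration of the pooling step, the following sketch computes a pooled covariance matrix and the correlation matrix derived from it. It is a minimal sketch, assuming the usual degrees-of-freedom weighting by group sizes; the function names and the toy data are hypothetical, and rows are observations while columns are endpoints.

```python
from math import sqrt


def sample_cov(data):
    """Unbiased sample covariance matrix (rows = observations, columns = endpoints)."""
    n, k = len(data), len(data[0])
    means = [sum(row[i] for row in data) / n for i in range(k)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
             for j in range(k)] for i in range(k)]


def pooled_cov(x, y):
    """Pool the two group covariance matrices, weighting each by its degrees of freedom."""
    n_x, n_y = len(x), len(y)
    s_x, s_y = sample_cov(x), sample_cov(y)
    k = len(s_x)
    return [[((n_x - 1) * s_x[i][j] + (n_y - 1) * s_y[i][j]) / (n_x + n_y - 2)
             for j in range(k)] for i in range(k)]


def cov_to_corr(s):
    """Turn a covariance matrix into the corresponding correlation matrix."""
    d = [sqrt(s[i][i]) for i in range(len(s))]
    return [[s[i][j] / (d[i] * d[j]) for j in range(len(s))] for i in range(len(s))]


if __name__ == "__main__":
    x = [[1.0, 2.0], [2.0, 4.0], [3.0, 7.0]]  # hypothetical experimental group
    y = [[0.0, 1.0], [1.0, 1.0], [2.0, 4.0]]  # hypothetical reference group
    sigma_hat = pooled_cov(x, y)
    print(sigma_hat)
    print(cov_to_corr(sigma_hat))
```

The diagonal of the pooled matrix holds the pooled variances used in the local t-test statistics.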

3.2 Test for differences in means

The new experimental treatment is declared safe if and only if each endpoint does not undershoot a given fixed limit of the reference. This results in the component local tests
\[ H_{0i}: \mu_{Xi} - \mu_{Yi} \le \delta_i \quad \text{vs.} \quad H_{1i}: \mu_{Xi} - \mu_{Yi} > \delta_i \tag{1} \]
with a relevant threshold $\delta_i$. The global null hypothesis of the underlying intersection-union test (IUT) is
\[ H_0 = \bigcup_{i=1}^{k} H_{0i}. \]
Figure 1 shows the parameter space of a test for the case of $k = 2$ endpoints.

Figure 1: Parameter space of the test by differences for non-inferiority with k = 2 endpoints.

The rejection region for the test of $H_{0i}$ is $\{x_i, y_i : T_i(x_i, y_i) > c\}$ with the t-test statistics
\[ T_i = \frac{\bar X_i - \bar Y_i - \delta_i}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}, \tag{2} \]
a common quantile $c$ for each individual test and the pooled estimators $\hat\sigma_i^2$ for $\sigma_i^2$. Here $n_X$ and $n_Y$ denote the sample sizes of the experimental and the reference group. Under the marginal assumption of $H_{0i}$, that is, $\mu_{Xi} - \mu_{Yi} - \delta_i = 0$, the test statistics $T_i$ are t-distributed with $\nu = n_X + n_Y - 2$ degrees of freedom. The global rejection region of the IUT is
\[ \bigcap_{i=1}^{k} \{x_i, y_i : T_i(x_i, y_i) > c\} = \{x, y : \min_{i=1,\dots,k} \{T_i(x_i, y_i)\} > c\}, \]
and thus the test statistic for testing $H_0$ is
\[ T(x, y) = \min_{i=1,\dots,k} \{T_i(x_i, y_i)\}. \tag{3} \]
Under the marginal assumptions of all $H_{0i}$ (the intersection of them), the test statistics $T_i$ approximately follow a joint $k$-variate t-distribution with $\nu$ degrees of freedom and a correlation matrix depending on the correlation matrix $R$ of the data. But because the global null hypothesis is a union, and not an intersection, of its local hypotheses, the margin of this global null hypothesis is not unique, although uniqueness would be necessary for deriving a joint $k$-variate t-distribution under $H_0$. So we do have to take quantiles $c = t_{\nu, 1-\alpha}$ of a univariate t-distribution. The decision rule is to reject $H_0$ and to conclude global safety if
\[ T(x, y) > t_{\nu, 1-\alpha}. \tag{4} \]
If safety is declared if and only if each endpoint does not overshoot a given fixed limit of the reference, the component local tests are
\[ H_{0i}: \mu_{Xi} - \mu_{Yi} \ge \delta_i \quad \text{vs.} \quad H_{1i}: \mu_{Xi} - \mu_{Yi} < \delta_i \tag{5} \]
with a relevant threshold $\delta_i$. Figure 2 shows the parameter space of a test for the case of $k = 2$ endpoints. The rejection region for the test of $H_{0i}$ is $\{x_i, y_i : T_i(x_i, y_i) < c\}$ with the t-test statistics according to Equation (2). The global rejection region of the IUT is
\[ \bigcap_{i=1}^{k} \{x_i, y_i : T_i(x_i, y_i) < c\} = \{x, y : \max_{i=1,\dots,k} \{T_i(x_i, y_i)\} < c\}. \]
The test statistic for testing $H_0$ is now
\[ T(x, y) = \max_{i=1,\dots,k} \{T_i(x_i, y_i)\}. \tag{6} \]
The decision rule now is to reject $H_0$ if $T(x, y) < -t_{\nu, 1-\alpha}$, which corresponds with
\[ T(x, y) < t_{\nu, \alpha}. \tag{7} \]
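The one-sided procedure for differences can be sketched as follows. This is an illustrative sketch, not the report's program: the function names and data are hypothetical, sample sizes may differ between groups, and the univariate t quantile is passed in by the caller (for instance from a table) rather than computed.

```python
from math import sqrt
from statistics import mean, variance


def local_t_stats(x, y, delta):
    """Local statistics T_i = (mean X_i - mean Y_i - delta_i) / (sigma_hat_i * sqrt(1/n_X + 1/n_Y)).

    x, y: lists of observations, each observation a list of k endpoint values.
    """
    n_x, n_y = len(x), len(y)
    stats = []
    for i, d in enumerate(delta):
        xi = [row[i] for row in x]
        yi = [row[i] for row in y]
        pooled_var = ((n_x - 1) * variance(xi) + (n_y - 1) * variance(yi)) / (n_x + n_y - 2)
        se = sqrt(pooled_var) * sqrt(1.0 / n_x + 1.0 / n_y)
        stats.append((mean(xi) - mean(yi) - d) / se)
    return stats


def noninferiority_iut(stats, t_quantile):
    """Reject the global H0 (claim safety) if min T_i > t_{nu, 1-alpha}."""
    return min(stats) > t_quantile


def nonsuperiority_iut(stats, t_quantile):
    """Reject the global H0 if max T_i < -t_{nu, 1-alpha}."""
    return max(stats) < -t_quantile


if __name__ == "__main__":
    x = [[10.0, 20.0], [12.0, 22.0], [11.0, 21.0], [13.0, 23.0]]  # hypothetical data
    y = [[10.0, 20.0], [11.0, 21.0], [12.0, 22.0], [11.0, 21.0]]
    t_stats = local_t_stats(x, y, delta=[-2.0, -4.0])
    # 1.943 is the tabulated 0.95 quantile of the t-distribution with 6 degrees of freedom.
    print(t_stats, noninferiority_iut(t_stats, 1.943))
```

Only univariate quantiles enter the decision, so the correlation between endpoints plays no role in this rule.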

Figure 2: Parameter space of the test by differences for non-superiority with k = 2 endpoints.

Now, the new experimental treatment is declared safe if and only if each endpoint neither undershoots a given fixed lower limit of the reference nor overshoots a given fixed upper limit of the reference. This results in the component local tests for
\[ H_{0i}: \mu_{Xi} - \mu_{Yi} \le \delta_i^{(1)} \ \text{or} \ \mu_{Xi} - \mu_{Yi} \ge \delta_i^{(2)} \quad \text{vs.} \quad H_{1i}: \mu_{Xi} - \mu_{Yi} > \delta_i^{(1)} \ \text{and} \ \mu_{Xi} - \mu_{Yi} < \delta_i^{(2)} \tag{8} \]
with relevant thresholds $\delta_i^{(1)} < \delta_i^{(2)}$. The global null hypothesis of the underlying intersection-union test is
\[ H_0 = \bigcup_{i=1}^{k} H_{0i} = \bigcup_{i=1}^{k} \{H_{0i}^{(1)} \cup H_{0i}^{(2)}\} \]
with $H_{0i}^{(1)}: \mu_{Xi} - \mu_{Yi} \le \delta_i^{(1)}$ and $H_{0i}^{(2)}: \mu_{Xi} - \mu_{Yi} \ge \delta_i^{(2)}$. The global test on equivalence is an IUT because the null hypothesis can be expressed as a union of a family of hypotheses. Each local test is itself an IUT, too, because it is made up of two one-sided tests with contrary directions. In rewriting $H_0$ as
\[ H_0 = \bigcup_{i=1}^{k} H_{0i}^{(1)} \cup \bigcup_{i=1}^{k} H_{0i}^{(2)} = H_0^{(1)} \cup H_0^{(2)}, \]

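we reorganize the test problem. $H_0^{(1)}$ and $H_0^{(2)}$ now represent two one-sided IUTs with contrary directions, as already considered. The test for the global $H_0$ is still an IUT because the null hypothesis is again a union of two hypotheses. Figure 3 shows the parameter space of a test for the case of $k = 2$ endpoints. The rejection region for the test of $H_{0i}$ is
\[ \{x_i, y_i : T_i^{(1)}(x_i, y_i) > c^{(1)}\} \cap \{x_i, y_i : T_i^{(2)}(x_i, y_i) < c^{(2)}\} \]
with the t-test statistics
\[ T_i^{(1)} = \frac{\bar X_i - \bar Y_i - \delta_i^{(1)}}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}, \qquad T_i^{(2)} = \frac{\bar X_i - \bar Y_i - \delta_i^{(2)}}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}, \tag{9} \]
the quantiles $c^{(1)}$ and $c^{(2)}$ for the individual tests and the pooled estimators $\hat\sigma_i^2$ for $\sigma_i^2$. Under the marginal assumptions of $H_{0i}^{(1)}$, the test statistics $T_i^{(1)}$ are t-distributed with $\nu = n_X + n_Y - 2$ degrees of freedom. Under the marginal assumptions of $H_{0i}^{(2)}$, the test statistics $T_i^{(2)}$ are t-distributed with $\nu$ degrees of freedom. From the considerations above, it follows that the rejection region for this IUT is
\[ \{x, y : \min_{i=1,\dots,k} T_i^{(1)}(x_i, y_i) > c^{(1)}\} \cap \{x, y : \max_{i=1,\dots,k} T_i^{(2)}(x_i, y_i) < c^{(2)}\}. \]
We now rewrite the test hypotheses of Equation (8) as follows,
\[ H_{0i}: \mu_{Xi} - \mu_{Yi} \le \delta_i^{(1)} \ \text{or} \ \mu_{Yi} - \mu_{Xi} \le -\delta_i^{(2)} \quad \text{vs.} \quad H_{1i}: \mu_{Xi} - \mu_{Yi} > \delta_i^{(1)} \ \text{and} \ \mu_{Yi} - \mu_{Xi} > -\delta_i^{(2)}. \tag{10} \]
All the considerations above stay the same, but the pair of test statistics according to Equation (9) changes into
\[ T_i^{(1)} = \frac{\bar X_i - \bar Y_i - \delta_i^{(1)}}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}, \qquad \tilde T_i^{(2)} = \frac{\bar Y_i - \bar X_i + \delta_i^{(2)}}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}. \tag{11} \]
The test statistics $T_i^{(2)}$ and $\tilde T_i^{(2)}$ have converse test directions and hence $T_i^{(1)}$ and $\tilde T_i^{(2)}$ have the same one. Herewith, the rejection region can be transformed into
\[ \{x, y : \min_{i=1,\dots,k} T_i^{(1)}(x_i, y_i) > c^{(1)}\} \cap \{x, y : \min_{i=1,\dots,k} \tilde T_i^{(2)}(x_i, y_i) > c^{(2)}\}. \]
As mentioned above, the test for the global $H_0$ is an IUT because the null hypothesis is a union of two hypotheses. But the local null hypotheses $H_{0i}^{(1)}$ and $H_{0i}^{(2)}$ exclude each other: when $H_{0i}^{(1)}$ is true, $H_{0i}^{(2)}$ cannot be. Hence, we cannot assume the marginal assumptions of both $H_{0i}^{(1)}$ and $H_{0i}^{(2)}$ at once. There is no unique margin for the global null hypothesis. The following relations can be shown,
\[ E\big(T_i^{(1)} \mid H_{0i}^{(2)}\big) = \frac{\delta_i^{(2)} - \delta_i^{(1)}}{\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}, \tag{12} \]
\[ E\big(\tilde T_i^{(2)} \mid H_{0i}^{(1)}\big) = \frac{\delta_i^{(2)} - \delta_i^{(1)}}{\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}. \tag{13} \]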

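Figure 3: Parameter space of a test by differences on equivalence for k = 2 endpoints; the alternative hypothesis $H_1$ is an intersection of two one-sided alternative hypotheses $H_1^{(1)}$ and $H_1^{(2)}$.

These relations are easy to see when the test statistics $T_i^{(1)}$ and $\tilde T_i^{(2)}$ are written in terms of each other,
\[ T_i^{(1)} = \frac{\bar X_i - \bar Y_i - \delta_i^{(1)}}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}} = \frac{\bar X_i - \bar Y_i - \delta_i^{(2)} + \delta_i^{(2)} - \delta_i^{(1)}}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}} = -\tilde T_i^{(2)} + \frac{\delta_i^{(2)} - \delta_i^{(1)}}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}, \]
\[ \tilde T_i^{(2)} = \frac{\bar Y_i - \bar X_i + \delta_i^{(2)}}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}} = \frac{\bar Y_i - \bar X_i + \delta_i^{(1)} + \delta_i^{(2)} - \delta_i^{(1)}}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}} = -T_i^{(1)} + \frac{\delta_i^{(2)} - \delta_i^{(1)}}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}. \]
Therefore, under the marginal assumption of $H_{0i}^{(1)}$, the test statistic $\tilde T_i^{(2)}$ follows a non-central univariate t-distribution with $\nu = n_X + n_Y - 2$ degrees of freedom and non-centrality parameter
\[ \theta_i = \frac{\delta_i^{(2)} - \delta_i^{(1)}}{\sigma_i \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}. \tag{14} \]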

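The test statistic $T_i^{(1)}$ follows the same distribution, but under the marginal assumption of $H_{0i}^{(2)}$. For this reason, we need two test statistics for testing $H_0$, namely
\[ T^{(1)}(x, y) = \min_{i=1,\dots,k} \big\{T_i^{(1)}(x_i, y_i)\big\}, \qquad \tilde T^{(2)}(x, y) = \min_{i=1,\dots,k} \big\{\tilde T_i^{(2)}(x_i, y_i)\big\}. \tag{15} \]
Again, under the marginal assumptions of all $H_{0i}^{(1)}$ (the intersection of them), the test statistics $T_i^{(1)}$ approximately follow a joint $k$-variate t-distribution with $\nu = n_X + n_Y - 2$ degrees of freedom and a correlation matrix depending on the correlation matrix $R$ of the data. But for the reasons given above, one cannot derive a joint $k$-variate t-distribution under $H_0$. So we have to take quantiles $c = t_{\nu, 1-\alpha}$ of a univariate t-distribution. The decision rule is to reject $H_0^{(1)}$ if
\[ T^{(1)}(x, y) > t_{\nu, 1-\alpha}. \]
In the same manner, the decision rule is to reject $H_0^{(2)}$ if
\[ \tilde T^{(2)}(x, y) > t_{\nu, 1-\alpha}. \]
Safety can only be concluded if both
\[ T^{(1)}(x, y) > t_{\nu, 1-\alpha} \quad \text{and} \quad \tilde T^{(2)}(x, y) > t_{\nu, 1-\alpha}. \tag{16} \]

3.3 Test for ratios of means

Most of the results of the test for differences in means hold for the case of ratios, too. So (1) changes into
\[ H_{0i}: \frac{\mu_{Xi}}{\mu_{Yi}} \le \psi_i \quad \text{vs.} \quad H_{1i}: \frac{\mu_{Xi}}{\mu_{Yi}} > \psi_i \tag{17} \]
with a relevant threshold $\psi_i$. Figure 4 shows the parameter space of a test for the case of $k = 2$ endpoints. The local ratio-test statistics are
\[ T_i = \frac{\bar X_i - \psi_i \bar Y_i}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^2}{n_Y}}}. \tag{18} \]
The test statistic for testing $H_0$ is
\[ T(x, y) = \min_{i=1,\dots,k} \{T_i(x_i, y_i)\}. \tag{19} \]
The decision rule is to reject $H_0$ and to conclude global safety if
\[ T(x, y) > t_{\nu, 1-\alpha}. \tag{20} \]
Correspondingly, (5) changes into
\[ H_{0i}: \frac{\mu_{Xi}}{\mu_{Yi}} \ge \psi_i \quad \text{vs.} \quad H_{1i}: \frac{\mu_{Xi}}{\mu_{Yi}} < \psi_i. \tag{21} \]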
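The ratio statistic of Equation (18) differs from the difference statistic only in its numerator and in the variance factor. A minimal sketch, under the same assumptions as the setup above (common variance per endpoint, pooled variance estimate); the function name, the data and the 80% threshold used in the demo are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def ratio_t_stats(x, y, psi):
    """Local ratio statistics T_i = (mean X_i - psi_i * mean Y_i) / (sigma_hat_i * sqrt(1/n_X + psi_i^2/n_Y)).

    x, y: lists of observations, each observation a list of k endpoint values.
    """
    n_x, n_y = len(x), len(y)
    stats = []
    for i, p in enumerate(psi):
        xi = [row[i] for row in x]
        yi = [row[i] for row in y]
        pooled_var = ((n_x - 1) * variance(xi) + (n_y - 1) * variance(yi)) / (n_x + n_y - 2)
        se = sqrt(pooled_var) * sqrt(1.0 / n_x + p * p / n_y)
        stats.append((mean(xi) - p * mean(yi)) / se)
    return stats


if __name__ == "__main__":
    x = [[10.0, 20.0], [12.0, 22.0], [11.0, 21.0], [13.0, 23.0]]  # hypothetical data
    y = [[10.0, 20.0], [11.0, 21.0], [12.0, 22.0], [11.0, 21.0]]
    t_stats = ratio_t_stats(x, y, psi=[0.8, 0.8])  # lower limit of 80% of the reference
    # Non-inferiority IUT: compare min(t_stats) with a univariate t quantile.
    print(t_stats, min(t_stats))
```

Because the threshold is relative, the same value of psi can be used for endpoints measured on very different scales.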

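Figure 4: Parameter space of the test by ratios for non-inferiority with k = 2 endpoints.

Figure 5 shows the parameter space of a test for the case of $k = 2$ endpoints. The local ratio-test statistics are the same as in Equation (18). The test statistic for testing $H_0$ is now
\[ T(x, y) = \max_{i=1,\dots,k} \{T_i(x_i, y_i)\}. \tag{22} \]
The decision rule is to reject $H_0$ if $T(x, y) < -t_{\nu, 1-\alpha}$, which corresponds with
\[ T(x, y) < t_{\nu, \alpha}. \tag{23} \]
When the new experimental treatment is declared safe if and only if each endpoint neither undershoots a given relative lower limit of the reference nor overshoots a given relative upper limit of the reference, (8) changes into
\[ H_{0i}: \frac{\mu_{Xi}}{\mu_{Yi}} \le \psi_i^{(1)} \ \text{or} \ \frac{\mu_{Xi}}{\mu_{Yi}} \ge \psi_i^{(2)} \quad \text{vs.} \quad H_{1i}: \frac{\mu_{Xi}}{\mu_{Yi}} > \psi_i^{(1)} \ \text{and} \ \frac{\mu_{Xi}}{\mu_{Yi}} < \psi_i^{(2)} \tag{24} \]
with relevant thresholds $\psi_i^{(1)} < \psi_i^{(2)}$. Figure 6 shows the parameter space of a test for the case of $k = 2$ endpoints. The local ratio-test statistics are
\[ T_i^{(1)} = \frac{\bar X_i - \psi_i^{(1)} \bar Y_i}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(1)2}}{n_Y}}}, \qquad T_i^{(2)} = \frac{\bar X_i - \psi_i^{(2)} \bar Y_i}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(2)2}}{n_Y}}}. \tag{25} \]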

Figure 5: Parameter space of the test by ratios for non-superiority with k = 2 endpoints.

Figure 6: Parameter space of a test by ratios on equivalence for k = 2 endpoints; the alternative hypothesis $H_1$ is an intersection of two one-sided alternative hypotheses $H_1^{(1)}$ and $H_1^{(2)}$.

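They will be rewritten into
\[ T_i^{(1)} = \frac{\bar X_i - \psi_i^{(1)} \bar Y_i}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(1)2}}{n_Y}}}, \qquad \tilde T_i^{(2)} = \frac{\psi_i^{(2)} \bar Y_i - \bar X_i}{\hat\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(2)2}}{n_Y}}}, \tag{26} \]
which now have the same test direction. The following relations can be shown,
\[ E\big(T_i^{(1)} \mid H_{0i}^{(2)}\big) = \frac{\big(1 - \psi_i^{(1)}/\psi_i^{(2)}\big)\,\mu_{Xi}}{\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(1)2}}{n_Y}}} = \frac{\big(\psi_i^{(2)} - \psi_i^{(1)}\big)\,\mu_{Yi}}{\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(1)2}}{n_Y}}}, \tag{27} \]
\[ E\big(\tilde T_i^{(2)} \mid H_{0i}^{(1)}\big) = \frac{\big(\psi_i^{(2)} - \psi_i^{(1)}\big)\,\mu_{Yi}}{\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(2)2}}{n_Y}}} = \frac{\big(\psi_i^{(2)}/\psi_i^{(1)} - 1\big)\,\mu_{Xi}}{\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(2)2}}{n_Y}}}. \tag{28} \]
These relations are again easy to see when the test statistics $T_i^{(1)}$ and $\tilde T_i^{(2)}$ are written in terms of each other. Writing $s_i^{(m)} = \sqrt{1/n_X + \psi_i^{(m)2}/n_Y}$ for $m = 1, 2$,
\[ T_i^{(1)} = \frac{\bar X_i - \psi_i^{(1)} \bar Y_i}{\hat\sigma_i s_i^{(1)}} = \frac{\bar X_i - \psi_i^{(2)} \bar Y_i + \big(\psi_i^{(2)} - \psi_i^{(1)}\big) \bar Y_i}{\hat\sigma_i s_i^{(1)}} = -\frac{s_i^{(2)}}{s_i^{(1)}}\, \tilde T_i^{(2)} + \frac{\big(\psi_i^{(2)} - \psi_i^{(1)}\big) \bar Y_i}{\hat\sigma_i s_i^{(1)}}, \]
\[ \tilde T_i^{(2)} = \frac{\psi_i^{(2)} \bar Y_i - \bar X_i}{\hat\sigma_i s_i^{(2)}} = \frac{\psi_i^{(1)} \bar Y_i - \bar X_i + \big(\psi_i^{(2)} - \psi_i^{(1)}\big) \bar Y_i}{\hat\sigma_i s_i^{(2)}} = -\frac{s_i^{(1)}}{s_i^{(2)}}\, T_i^{(1)} + \frac{\big(\psi_i^{(2)} - \psi_i^{(1)}\big) \bar Y_i}{\hat\sigma_i s_i^{(2)}}. \]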

Therefore, under the marginal assumption of $H_{0i}^{(1)}$, the test statistic $\tilde T_i^{(2)}$ follows a non-central univariate t-distribution with $\nu = n_X + n_Y - 2$ degrees of freedom and non-centrality parameter
\[ \theta_i^{(2)} = \frac{\big(\psi_i^{(2)} - \psi_i^{(1)}\big)\,\mu_{Yi}}{\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(2)2}}{n_Y}}} = \frac{\big(\psi_i^{(2)}/\psi_i^{(1)} - 1\big)\,\mu_{Xi}}{\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(2)2}}{n_Y}}}. \tag{29} \]
Under the marginal assumption of $H_{0i}^{(2)}$, the test statistic $T_i^{(1)}$ follows the same distribution but with non-centrality parameter
\[ \theta_i^{(1)} = \frac{\big(1 - \psi_i^{(1)}/\psi_i^{(2)}\big)\,\mu_{Xi}}{\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(1)2}}{n_Y}}} = \frac{\big(\psi_i^{(2)} - \psi_i^{(1)}\big)\,\mu_{Yi}}{\sigma_i \sqrt{\frac{1}{n_X} + \frac{\psi_i^{(1)2}}{n_Y}}}. \tag{30} \]
For this reason, we now have two test statistics for testing $H_0$ as follows,
\[ T^{(1)}(x, y) = \min_{i=1,\dots,k} \big\{T_i^{(1)}(x_i, y_i)\big\}, \qquad \tilde T^{(2)}(x, y) = \min_{i=1,\dots,k} \big\{\tilde T_i^{(2)}(x_i, y_i)\big\}. \tag{31} \]
The decision rule is to reject $H_0$ and to conclude safety if both
\[ T^{(1)}(x, y) > t_{\nu, 1-\alpha} \quad \text{and} \quad \tilde T^{(2)}(x, y) > t_{\nu, 1-\alpha}. \tag{32} \]

3.4 α-simulations

Simulation studies were performed for 2, 4, 8 and 20, 40, 80 endpoints with several means and variances. For each fixed number of endpoints $k \in \{2, 4, 8, 20, 40, 80\}$, different grades of correlation were considered: maximal negative correlation, correlation 0, correlation 0.5 and maximal correlation. For each fixed $k$ and grade of correlation, the endpoints were equicorrelated, that is, $\rho_{ij} = \rho$ for all $1 \le i \ne j \le k$. Note that the negative correlations in the left column are bounded below by $\rho_{\min} = -\frac{1}{k-1}$. 100000 simulation runs were taken for 2, 4, 8 endpoints and 10000 for 20, 40, 80 endpoints. Each simulation result was obtained using program code in the statistical software R [7] and applying the package mvtnorm by Genz and Bretz [4].

The aim of obtaining a size-$\alpha$ test by using the related quantiles of a $k$-variate t-distribution was not achieved. Indeed, this approach leads to an exact size $\alpha$, but only for the intersection of all margins of the local null hypotheses, $\bigcap_{i=1}^{k} H_{0i}$. Quantiles of a univariate t-distribution result in very conservative decisions for that situation. But for the case of interest (the union $H_0 = \bigcup_{i=1}^{k} H_{0i}$), the univariate method keeps the $\alpha$-level conservatively, while the multivariate one clearly fails. An example is: 4 endpoints with correlation 0, one-sided testing (non-inferiority), balanced sample size 100, coefficient of variation 0.25, $\mu_Y = (0.1, 1, 10, 100)$, $\mu_X = (0.079, 1.0, 10, 100)$, $\delta = (-0.02, -0.20, -2.00, -20.00)$.
That means that $\mu_{X1}$ is inferior (unsafe), while the others are non-inferior (safe). Because not every endpoint is safe, global safety must not be declared. The related type I errors are: 0.37 (multivariate method) and 0.03 (traditional univariate IUT).
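The univariate part of this example can be approximated with a small Monte Carlo sketch. It is not the R/mvtnorm program used for the report and does not cover the multivariate method. Assumptions made here for illustration: endpoints are independent (correlation 0), the standard deviation of endpoint i is the coefficient of variation times $\mu_{Yi}$, sample means and pooled variances are drawn directly from their sampling distributions, and the normal 0.95 quantile stands in for $t_{\nu, 1-\alpha}$ (reasonable only because $\nu = 198$ is large). The function name is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_iut_type1(mu_x, mu_y, delta, cv, n, alpha=0.05, runs=20000, seed=1):
    """Estimate how often the univariate non-inferiority IUT rejects the global H0."""
    rng = random.Random(seed)
    nu = 2 * n - 2
    crit = NormalDist().inv_cdf(1 - alpha)  # large-sample stand-in for t_{nu, 1-alpha}
    rejections = 0
    for _ in range(runs):
        all_reject = True
        for mx, my, d in zip(mu_x, mu_y, delta):
            sigma = cv * my  # assumption: common sd per endpoint, cv relative to the reference mean
            xbar = rng.gauss(mx, sigma / sqrt(n))
            ybar = rng.gauss(my, sigma / sqrt(n))
            # Pooled variance ~ sigma^2 * chi2_nu / nu, with chi2_nu = Gamma(shape nu/2, scale 2).
            s2 = sigma ** 2 * rng.gammavariate(nu / 2, 2.0) / nu
            t_i = (xbar - ybar - d) / (sqrt(s2) * sqrt(2.0 / n))
            if t_i <= crit:
                all_reject = False  # one non-rejecting endpoint blocks the IUT
                break
        rejections += int(all_reject)
    return rejections / runs


if __name__ == "__main__":
    rate = simulate_iut_type1(
        mu_x=[0.079, 1.0, 10.0, 100.0],
        mu_y=[0.1, 1.0, 10.0, 100.0],
        delta=[-0.02, -0.20, -2.00, -20.00],
        cv=0.25,
        n=100,
    )
    print("estimated type I error of the univariate IUT:", rate)
```

Under these assumptions the estimate should stay below the nominal 0.05, in line with the conservative behaviour of the univariate IUT described above.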

4 Discussion

One possibility to show bioequivalence or therapeutic equivalence between two treatments on multiple endpoints is to use an IUT for either differences in means or ratios of them. This yields a global test which rejects if and only if each local test rejects. For example, a new experimental treatment is declared safe if each endpoint is safe, where safety is defined as not undershooting or overshooting a given fixed limit of a reference. The IUT is known to be very conservative in many situations. One reason is that it does not take any correlations into account: each endpoint is tested separately using quantiles or p-values from univariate t-distributions. Another reason is the nature of the margin of the null hypothesis $H_0$. So the aim was to extend the IUT to a multivariate approach, like the UIT, which uses a multivariate t-distribution instead of, say, a Bonferroni adjustment. The conclusion so far is that there is no such easy equivalent multivariate-t approach for the IUT. The one studied here does not keep the $\alpha$-level over the complete space of the null hypotheses.

Another noteworthy fact is that an IUT for showing equivalence between two treatments on multiple endpoints always comes to a global decision: all endpoints together are equivalent or not. If not, omitting the hazardous endpoints does not automatically imply the equivalence of the remaining ones. To demonstrate equivalence on a subset of endpoints (at least 1, ..., k of k) and to identify those, the procedure of Quan et al. [6] is an appropriate solution, for example.

References

[1] R.L. Berger and J.C. Hsu. Bioequivalence trials, intersection-union tests and equivalence confidence sets. Statistical Science, 11(4):283-319, 1996.
[2] D.A. Bloch, T.L. Lai, and P. Tubert-Bitter. One-sided tests in clinical trials with multiple endpoints. Biometrics, 57:1039-1047, 2001.
[3] G. Casella and R.L. Berger. Statistical Inference. Duxbury, Thomson Learning, 2002.
[4] A. Genz, F. Bretz, and R port by T. Hothorn. mvtnorm: Multivariate Normal and t Distribution, 2006. R package version 0.7-5.
[5] Y. Hochberg and A.C. Tamhane. Multiple Comparison Procedures. John Wiley and Sons, Inc., New York, 1987.
[6] Hui Quan, Jim Bolognese, and Weiying Yuan. Assessment of equivalence on multiple endpoints. Statistics in Medicine, 20:3159-3173, 2001.
[7] R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2007. ISBN 3-900051-07-0.
[8] Y. Wu, M.G. Genton, and L.A. Stefanski. A multivariate two-sample mean test for small sample size and missing data. Biometrics, 62:877-885, 2006.