ASYMPTOTIC RESULTS OF A HIGH DIMENSIONAL MANOVA TEST AND POWER COMPARISON WHEN THE DIMENSION IS LARGE COMPARED TO THE SAMPLE SIZE

Similar documents
Asymptotic non-null distributions of test statistics for redundancy in high-dimensional canonical correlation analysis

Estimation of the large covariance matrix with two-step monotone missing data

High-Dimensional Asymptotic Behaviors of Differences between the Log-Determinants of Two Wishart Matrices

Lower Confidence Bound for Process-Yield Index S pk with Autocorrelated Process Data

Hotelling s Two- Sample T 2

CHAPTER 5 STATISTICAL INFERENCE. 1.0 Hypothesis Testing. 2.0 Decision Errors. 3.0 How a Hypothesis is Tested. 4.0 Test for Goodness of Fit

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test)

arxiv: v2 [stat.me] 3 Nov 2014

Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process

1 Extremum Estimators

Estimating function analysis for a class of Tweedie regression models

Introduction to Probability and Statistics

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal

4. Score normalization technical details We now discuss the technical details of the score normalization method.

Research Note REGRESSION ANALYSIS IN MARKOV CHAIN * A. Y. ALAMUTI AND M. R. MESHKANI **

On split sample and randomized confidence intervals for binomial proportions

SOME TESTS CONCERNING THE COVARIANCE MATRIX IN HIGH DIMENSIONAL DATA

CERIAS Tech Report The period of the Bell numbers modulo a prime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education

Consistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis

Ratio Estimators in Simple Random Sampling Using Information on Auxiliary Attribute

Likelihood Ratio Tests in Multivariate Linear Model

Deriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V.

High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

MULTIVARIATE SHEWHART QUALITY CONTROL FOR STANDARD DEVIATION

Uniform Law on the Unit Sphere of a Banach Space

Consistency of test based method for selection of variables in high dimensional two group discriminant analysis

arxiv:cond-mat/ v2 25 Sep 2002

Monte Carlo Studies. Monte Carlo Studies. Sampling Distribution

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley

The power performance of fixed-t panel unit root tests allowing for structural breaks in their deterministic components

Slides Prepared by JOHN S. LOUCKS St. Edward s s University Thomson/South-Western. Slide

Slash Distributions and Applications

CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules

1. INTRODUCTION. Fn 2 = F j F j+1 (1.1)

arxiv: v3 [physics.data-an] 23 May 2011

arxiv: v1 [math.st] 2 Apr 2016

A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-Test Split

A-optimal diallel crosses for test versus control comparisons. Summary. 1. Introduction

Chapter 3. GMM: Selected Topics

An Introduction to Multivariate Statistical Analysis

Analysis of variance, multivariate (MANOVA)

Towards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK

Generating Swerling Random Sequences

arxiv: v1 [physics.data-an] 26 Oct 2012

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK

Statistics II Logistic Regression. So far... Two-way repeated measures ANOVA: an example. RM-ANOVA example: the data after log transform

Generalized Coiflets: A New Family of Orthonormal Wavelets

ON THE SET a x + b g x (mod p) 1 Introduction

Khinchine inequality for slightly dependent random variables

Supplementary Materials for Robust Estimation of the False Discovery Rate

Supplement to the paper Accurate distributions of Mallows C p and its unbiased modifications with applications to shrinkage estimation

MULTIVARIATE ANALYSIS OF VARIANCE UNDER MULTIPLICITY José A. Díaz-García. Comunicación Técnica No I-07-13/ (PE/CIMAT)

HENSEL S LEMMA KEITH CONRAD

pp physics, RWTH, WS 2003/04, T.Hebbeker

Bayesian inference & Markov chain Monte Carlo. Note 1: Many slides for this lecture were kindly provided by Paul Lewis and Mark Holder

Hypothesis Test-Confidence Interval connection

7.2 Inference for comparing means of two populations where the samples are independent

i) the probability of type I error; ii) the 95% con dence interval; iii) the p value; iv) the probability of type II error; v) the power of a test.

Notes on Instrumental Variables Methods

ESTIMATION OF THE RECIPROCAL OF THE MEAN OF THE INVERSE GAUSSIAN DISTRIBUTION WITH PRIOR INFORMATION

An Improved Generalized Estimation Procedure of Current Population Mean in Two-Occasion Successive Sampling

An Inverse Problem for Two Spectra of Complex Finite Jacobi Matrices

An Analysis of TCP over Random Access Satellite Links

On the Properties for Iteration of a Compact Operator with Unstructured Perturbation

Variable Selection and Model Building

On the sphericity test with large-dimensional observations. Citation Electronic Journal of Statistics, 2013, v. 7, p

One-way ANOVA Inference for one-way ANOVA

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)

CONGRUENCE PROPERTIES OF TAYLOR COEFFICIENTS OF MODULAR FORMS

Estimation of Separable Representations in Psychophysical Experiments

Linear Ridge Estimator of High-Dimensional Precision Matrix Using Random Matrix Theory

Spectral Analysis by Stationary Time Series Modeling

Morten Frydenberg Section for Biostatistics Version :Friday, 05 September 2014

Hidden Predictors: A Factor Analysis Primer

A Note on Massless Quantum Free Scalar Fields. with Negative Energy Density

Objectives. 6.1, 7.1 Estimating with confidence (CIS: Chapter 10) CI)

Additive results for the generalized Drazin inverse in a Banach algebra

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek

COMMUNICATION BETWEEN SHAREHOLDERS 1

CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP

STK4900/ Lecture 7. Program

A Time-Varying Threshold STAR Model of Unemployment

AVERAGES OF EULER PRODUCTS, DISTRIBUTION OF SINGULAR SERIES AND THE UBIQUITY OF POISSON DISTRIBUTION

A Unified Framework for GLRT-Based Spectrum Sensing of Signals with Covariance Matrices with Known Eigenvalue Multiplicities

Robustness of multiple comparisons against variance heterogeneity Dijkstra, J.B.

Introduction Model secication tests are a central theme in the econometric literature. The majority of the aroaches fall into two categories. In the r

Understanding and Using Availability

Sign Tests for Weak Principal Directions

Asymptotically Optimal Simulation Allocation under Dependent Sampling

The spectrum of kernel random matrices

MATH 2710: NOTES FOR ANALYSIS

ARTICLE IN PRESS. Physica B

The Poisson Regression Model

Johan Lyhagen Department of Information Science, Uppsala University. Abstract

Developing A Deterioration Probabilistic Model for Rail Wear

Understanding and Using Availability

Chapter 7: Special Distributions

Maximum Likelihood Asymptotic Theory. Eduardo Rossi University of Pavia

Transcription:

J Jaan Statist Soc Vol 34 No 2004 9 26 ASYMPTOTIC RESULTS OF A HIGH DIMENSIONAL MANOVA TEST AND POWER COMPARISON WHEN THE DIMENSION IS LARGE COMPARED TO THE SAMPLE SIZE Yasunori Fujikoshi*, Tetsuto Himeno and Hirofumi Wakaki This aer is concerned with Demster trace criterion for multivariate linear hyothesis which was roosed for high dimensional situation First we derive asymtotic null and nonnull distributions of Demster trace criterion when both the samle size and the dimension tend to infinity Our aroximations are examined through some numerical exeriments Next we comare the ower of Demster trace criterion with the ones of three classical criteria; likelihood ratio criterion, Lawley-Hotelling trace criterion, and Bartlett-Nanda-Pillai trace criterion when the dimension is large comared to the samle size Key words and hrases: Asymtotic null and nonnull distributions, Demster trace criterion, high dimensional situation, MANOVA tests, ower comarison Introduction Let Y be an N observation matrix which is obtained by indeendently observinga dimensional variate y =(y,,y ) for N subjects A multivariate linear model for Y is exressed as () Y = AΘ+E, where A is a known N k design matrix with rank(a) =k, Θ is a k unknown arameter matrix, and E is an N error matrix It is assumed that the rows of E are indeendently distributed as N (0, Σ) For testing (2) H 0 : CΘ =O vs H : CΘ O, let S h and S e be the matrices of sums of squares and roducts due to the hyothesis and the error defined by S h =(C ˆΘ) [ C(A A) C ] C ˆΘ, S e =(Y A ˆΘ) (Y A ˆΘ), resectively, where C is a q k known matrix with rank(c) = q, and ˆΘ = (A A) A Y Then S h and S e are indeendently distributed as a noncentral Received March 3, 2004 Revised May20, 2004 Acceted May24, 2004 *Graduate School of Science, Hiroshima University, -3- Kagamiyama, Higashi-Hiroshima 739-8526, Jaan

20 YASUNORI FUJIKOSHI ET AL Wishart distribution W (q, Σ; MM ) and a central Wishart distribution W (n, Σ), where n = N k, and M is a q matrix such that (3) MM =(CΘ) [ C(A A) C ] CΘ Under the assumtion that n, the followingthree well known statistics have been used: (i) Likelihood Ratio statistic: log( S e / S e + S h ) (ii) Lawley-Hotellingtrace criterion: tr S h Se (iii) Bartlett-Nanda-Pillai trace criterion: tr S h (S e + S h ) When n<, S e becomes singular, and it will be imossible to use the classical statistics For such cases, a non-exact test was first roosed by Demster (958, 960) for one and two samle cases For testing(2), we can write the corresondingstatistic as (iv) Demster trace criterion: (4) T D = (tr S h )/(tr S e ), which may be called Demster trace criterion For the null distribution of T D,it has been roosed to use F -aroximations by Demster (958, 960), Takeda and Goto (999), etc It seems that the F -aroximations are good in some situations with Σ = λi The test was alied by Demster (960) to an examle which consist of 62 biological measurements and 2 subjects, in order to examine whether there is evidence that the 62 items could be used to distinguish between alcoholic or non-alchoholic In general, it is getting imortant to develo multivariate theory for analyzingmultivariate datasets with fewer observations than the dimension, or with large dimension in comarison of the number of samles, in various alied area Note that the criterion has been roosed for high dimensional case, and so, it is imortant to study its asymtotic behavior when the dimension is large comared to the number n In this aer we study its distribution under a high dimensional framework: (5) A : q;fix, n,, c (0, ) n Further, in addition to A we will assume (6) A2 : tr Σk = O(), (k =, 2), for the null case, and in addition to A and A2 (7) A3 : tr Σk Ω=O(), (k =, 2), for the nonnull case, where Ω=Σ /2 Θ C (C(A A) C ) CΘΣ /2

ASYMPTOTICS IN HIGH DIMENSIONAL MANOVA 2 The assumtion (6) will be natural The assumtion (7) corresonds to local alternatives when Σ = λi In Section 2 we derive asymtotic null distribution of T D Our aroximations are numerically examined through some exeriments On the other hand, Wakaki et al (2002) derived asymtotic distributions of the null and nonnull distributions of the three classical test statistics under the assumtion (5) with c< and the assumtions (6) and (7) In Section 3 we comare the ower of Demster trace criterion with the ones of three classical tests Naturally it is seen that T D will be more owerful than the classical tests as becomes near to n 2 Demster s test In this section we derive the limitingnull and nonnull distributions of Demster test statistic 2 Limiting null distribution Under the null hyothesis, S h and S e are indeendently distributed as the central Wishart distributions, W (q, Σ) and W (n, Σ), resectively Let T D = { n trs } h q trs e Let U and V be defined by U = trs h qtrσ, V = trs e ntrσ (2) 2qtrΣ 2 2ntrΣ 2 For our asymtotics, we assume (5) and (6) Then, consideringthe characteristic functions of U and V, it is seen that U and V are asymtotically distributed as the nomal distribution N(0, ) Using(2), we can exand T D as { T D = U 2nq (trσ 2 )/ + q } n(trσ)/ V 2 (trσ 2 )/ + q n(trσ)/ 2q(trΣ = 2 )/ U + o() (trσ)/ Therefore, we obtain the followingtheorem Theorem 2 (6), it holds that Under the asymtotic framework (5) and the assumtion T D σ D d N(0, ), where d denotes convergence in distribution, and σ D = 2q(tr Σ 2 )/ (tr Σ)/

22 YASUNORI FUJIKOSHI ET AL For a ractical situation, we need to use an estimator instead of σ D since Σ is usually unknown Srivastava (2003) has roosed a high dimensional consistent estimator, ie, (n, )-consistent estimator given by ˆσ D = 2q {(trs 2 e )/n 2 (trs e ) 2 /n 3 } / (trs e )/(n) Table 2 Uer 5 ercent oints Σ = diag(,,) Table 22 Actual error robabilities of the first kind when the nominal level is 005 Σ = diag(,,) q n Simu σ D ˆσ D 2 40 40 3589 3290 3354 2 40 80 354 3290 3328 2 80 40 355 3290 339 2 80 80 3459 3290 3339 2 20 20 3427 3290 3304 2 20 200 3406 3290 3337 2 200 20 3399 3290 330 2 200 200 3398 3290 3283 4 40 40 558 4652 4860 4 40 80 5039 4652 4652 4 80 40 4969 4652 4648 4 80 80 490 4652 4665 4 20 20 4835 4652 4648 4 20 200 4803 4652 4567 4 200 20 4798 4652 47 4 200 200 4788 4652 4709 q n σ D ˆσ D 2 40 40 00640 006 2 40 80 0067 00597 2 80 40 00604 00555 2 80 80 00586 00560 2 20 20 00564 00558 2 20 200 00557 00532 2 200 20 0055 00546 2 200 200 0055 00553 4 40 40 0067 00595 4 40 80 00632 00632 4 80 40 00606 00607 4 80 80 00595 0059 4 20 20 00563 00564 4 20 200 00557 00589 4 200 20 00548 00528 4 200 200 00550 00528 Table 23 Maximal value of q =2, 4, 6, 8, 0 n 20 40 60 80 00 20 40 60 80 200 20 8 8 40 6 4 6 0 8 8 60 2 2 6 8 0 0 0 0 80 4 4 8 0 8 0 0 0 00 4 6 6 8 8 0 0 0 20 2 6 8 0 0 0 0 0 40 2 4 4 8 0 0 0 0 0 60 4 4 8 0 0 0 0 0 80 2 4 4 0 0 0 0 0 0 200 2 2 6 8 0 0 0 0 0

ASYMPTOTICS IN HIGH DIMENSIONAL MANOVA 23 We attemted to clear numerical accuracy of the limitingdistribution in Theorem 2 Note that the null distribution of T D deends on Σ through its characteristic roots λ,λ 2,,λ, and hence we may assume that Σ = diag(λ,,λ ) We choosed the values of q, n,, and (λ,,λ ) as follows: q; 2, 4, 6, 8, 0, n, ; 20, 40, 60, 80, 00, 20, 40, 60, 80, 200, (λ,,λ )=(,,) The error robabilities of the first kind are almost monotone decreasingfor n and, monotone increasingfor q, and lager than 005 If n or is smaller than 50, the error robabilities of the first kind are almost lager than 006 If n and are larger than 00, the error robabilities of the first kind are almost smaller than 006 and larger than 005 We show a art of these results on Table 2 and Table 22 Table 2 gives the estimated uer 5 ercent oints based on Monte Carlo simulation, the limitingdistribution in Theorem 2 and the limitingdistribution when σ D is relaced by ˆσ D, resectively Table 22 gives the actual error robabilities of the first kind by usingthe aroximated ercent oints Table 23 gives the maximal values of q in the above range such that the error robability of the first kind is smaller than 006 and larger than 005 It should be noted that our aroximation method underestimates the ercent oint Further, the aroximations are rather imrovingby usingˆσ D instead of σ D However, the aroximations are not very accurate excet the case when n is large in comarison with In order to get more accurate aroximations it is exected to obtain asymtotic exansions u to O(n /2 )oro(n ) under the high dimensional framework (5) and the assumtion (6) with k = 3or k = 4 22 Limiting nonnull distribution Under alternative hyotheses, S e and S h are indeendently distributed as the central and noncentral Wishart distributions, W (n, Σ) and W (q, Σ,MM ), resectively, where M is defined in (3) Let TD = { n trs h q trσω } trs e trσ Let U and V be defined by (22) U = (trs h qtrσ trσω), V = n (trs e ntrσ) For our asymtotics, we assume (5), (6) and (7) Then U and V are asymtotically indeendent and normal More recisely, V 2(tr Σ 2 )/ d N(0, )

24 YASUNORI FUJIKOSHI ET AL Further, usingthe characteristic function of the noncentral Wishart distribution (see, chater 0 of Muirhead (982)), the cumulant generating function of U can be exanded as log E[ex(itU)] = log E [ ( it ex trs h )] it = q ( 2 logdet I 2it ) Σ (qtrσ + trσω) 2 trω + ( 2 trω I 2it ) Σ it (qtrσ + trσω) { } 2it trσ + 2(it)2 trσ 2 2 trω = q 2 { trω + 2it + trσω + 4(it)2 2 it (qtrσ + trσω) + o() { } =(it) 2 q trσ2 +2 trσ2 Ω + o() Using(22) we obtain TD = { nu + q n(trσ)/ + n(trσω)/ V + n(trσ)/ = (trσ)/ U + o() Therefore we obtain the followingtheorem } trσ 2 Ω q trσω } trσ Theorem 22 Under the asymtotic framework (5) and the assumtions (6) and (7), it holds that T D σ D d N(0, ), where σ D = 2q(trΣ 2 )/ + 4(trΣ 2 Ω)/ (trσ)/ In a secial case Ω = 0, we get the limiting null distribution in Theorem 2 3 Power comarison In this section we comare the ower of Demster test with ones of likelihood ratio test, Lawley-Hotellingtest, and Bartlett-Nanda-Pillai test Let δ D = T D T D = trσω trσ,

ASYMPTOTICS IN HIGH DIMENSIONAL MANOVA 25 then the ower of T D with significance level α can be exressed as P D = Pr( T D >σ D z α )=Pr(T D >σ D z α δ D ), where z α is the uer 00α % oint of the standard nomal distribution Using Theorem 22, the asymtotic ower is ( ) lim P D = lim Φ δd σ D z α If the order of trσ k Ω(k =, 2) is larger than, δ D so that the asymtotic ower tends to one, while if the order of trσ k Ω(k =, 2) is smaller than, the asymtotic ower is α since δ D 0 and σd σ D When trσ k Ω=O( )(k =, 2), we obtain the asymtotic result easily, then ( ) lim P δd D =Φ z α σ D (3) =Φ ( σ D trσω 2qtrΣ 2 z α ) On the other hand, let T LR = T H = { m T BNP = ( + m ){ trs hse ( + m S e log (+ S e + S h + q log ) }, m } q, ) {( + m ) } trs h (S e + S h ) q, then under the assumtion trω = O( ), the ower of T G (G=LR, H, BNP) with significance level α can be exressed (see Wakaki et al (2002)) as ( ) lim P δ0 G =Φ σ z α ( ) trω/ (32) =Φ z α, 2q( + r) where r = /m and m = n + q Note that Ω in the case T D is used for trω in the case T G Comaring(3) with (32) and neglectingthe terms of o(), (trω)/ +r (trω)/ +r > trσω trσ 2 P G >P D, < trσω trσ 2 P G <P D

26 YASUNORI FUJIKOSHI ET AL By the definition of Ω, ie, Ω = Σ /2 MM Σ /2 we have trσω (trω/ ) trσ = trmm 2 trσ MM (trσ 2 )/ (vec(m)) vec(m) = (vec(m)) (I q Σ )vec(m) trσ 2 / max λ j j trσ 2 / min λ j j = trσ 2 / min λ j j, max λ j j where λ i s are the characteristic roots of Σ Thus, neglecting the terms of o() we can see that min λ j j R λ = max j λ j > +r P G <P D In articular, () If Σ = ai (a: constant), then P G <P D (2) If R λ is near to one, then P G <P D (3) If R λ is small and is small, then P G >P D (4) If c is close to one, r becomes large since r = /m c/( c), and hence P G <P D Acknowledgements The authors would like to thank two referees for their careful readings and useful comments References Demster, A P (958) A high dimensional two samle significance test, Ann Math Statist, 29, 995 00 Demster, A P (960) A significance test for the searation of two highly multivariate small samles, Biometrics, 6, 4 50 Muirhead, R J (982) Asects of Multivariate Statistical Theory, John Wiley & Sons, New York Srivastava, M (2003) Multivariate analysis with fewer observations than the dimension, submitted for ublication Takeda, J and Goto, M (999) Comarison of highly multivariate small samles Estimation of effective dimensionality in the Demster s aroximation test, Jaanese J Alied Statist, 28, 63 77 (in Jaanese) Wakaki, H, Fujikoshi, Y and Ulyanov, V (2002) Asymtotic exansions of the distributions of MANOVA test statistics when the dimension is large, TR No 02-0, Statistical Research Grou, Hiroshima University