Likelihood Ratio Tests for High-dimensional Normal Distributions


Likelihood Ratio Tests for High-dimensional Normal Distributions

A DISSERTATION SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Fan Yang IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy

Tiefeng Jiang, Adviser

December 2011

© Fan Yang 2011. ALL RIGHTS RESERVED

Acknowledgements

Ever since I first began graduate study at the University of Minnesota, I have been extremely interested in probability theory and the challenges and opportunities that this topic presents in statistics and many other areas of research. During the time that I have been constructing my thesis, I have had the chance to reconsider, revise, and refine many of the original concepts and methodologies foundational to my research. I have also had the opportunity to explore some of these concepts in applied settings, which has helped me to integrate my theoretical exploration in ways that have proven useful and practical. The combined research and writing process has been a rich and enlightening one for me, and over time, I have grown and evolved both in my thinking and in my approach to research.

As every doctoral student can attest, completion of a dissertation cannot be accomplished in isolation. It involves the support and contribution of many people, both inside and outside the department. I would first like to thank my advisor, Professor Tiefeng Jiang. I am deeply indebted to him for his support and commitment, his enthusiastic guidance, and his insight throughout the research and writing process. He could not have been more generous with his time and effort in directing me toward the completion of my thesis. As my research has been largely carried out in the School of Statistics, I would therefore like to profusely thank the faculty, staff, and fellow students whose contributions were always tremendously helpful and illuminating to me.

I would also like to extend my deepest thanks to Boston Scientific Corporation, to my supervisor Mark Balhorn, and to my many co-workers and colleagues for their extraordinary support. Excellence infuses every aspect of Boston Scientific and my affiliation with this company has helped me to challenge myself in an environment where innovation

and creativity are the very foundation. Finally, I should express my sincere gratitude to my best friend Lauren Pacelli for her tireless effort to review and edit my thesis and for her constant encouragement and motivation throughout my entire graduate study.

Dedication

This thesis is dedicated to my parents, Yongyu Yang and Liuqian Zeng, for their strength and encouragement no matter what path I have chosen, and for their unwavering belief in me. It is also dedicated to my wife Lili Yang, who is a great joy in my life and a source of inspiration to me always.

Abstract

For a random sample of size n obtained from p-variate normal distributions, we consider the likelihood ratio tests (LRT) for their means and covariance matrices. Most of these test statistics have been extensively studied in the classical multivariate analysis (see, e.g., [2] and [36]), and their limiting distributions under the null hypothesis were proved to be chi-square under the assumption that n goes to infinity while p remains fixed. In our research, we consider the high-dimensional case where both p and n go to infinity with their ratio p/n → y ∈ (0, 1]. We prove that the likelihood ratio test statistics under this assumption converge in distribution to a normal random variable, and we also give the explicit forms of its mean and variance. We run simulation studies to show that the likelihood ratio tests using these new central limit theorems outperform those using the traditional chi-square approximation for analyzing high-dimensional data.

Contents

Acknowledgements
Dedication
Abstract
List of Tables
List of Figures

1 Background
2 Asymptotic Expansion of Multivariate Gamma Function
3 Testing Covariance Matrix of Normal Distribution Proportional to Identity Matrix (Sphericity Test)
  3.1 Introduction
  3.2 High-dimensional Likelihood Ratio Test for Sphericity
  3.3 Proof of Theorem 3.1
4 Testing Independence of Components of Normal Distribution
  4.1 Introduction
  4.2 High-dimensional LRT for Testing Independence of Components of a Normal Distribution
  4.3 Proof of Theorem 4.1

5 Testing Equality of Multiple Normal Distributions
  5.1 Introduction
  5.2 High-dimensional Likelihood Ratio Test for Equality of Multiple Normal Distributions
  5.3 Proof of Theorem 5.1
6 Testing Equality of Multiple Covariance Matrices of Normal Distributions
  6.1 Introduction
  6.2 High-dimensional Likelihood Ratio Test for Equality of Multiple Covariance Matrices
  6.3 Proof of Theorem 6.1
7 Testing Specified Mean Vector and Covariance Matrix of Normal Distribution
  7.1 Introduction
  7.2 High-dimensional LRT for Testing Specified Mean Vector and Covariance Matrix
  7.3 Proof of Theorem 7.1
8 Testing Complete Independence of Normal Distribution
  8.1 Introduction
  8.2 High-dimensional Likelihood Ratio Test for Complete Independence
  8.3 Proof of Theorem 8.1
9 Conclusion and Discussion

References

List of Tables

3.1 Sizes and Powers of LRT for Sphericity
4.1 Sizes and Powers of LRT for Independent Normal Components
5.1 Sizes and Powers of LRT for Equality of Multiple Normal Distributions
6.1 Sizes and Powers of LRT for Equality of Multiple Covariance Matrices
7.1 Sizes and Powers of LRT for Specified Normal Distribution
8.1 Sizes and Powers of LRT for Complete Independence

List of Figures

3.1 Sizes and Powers of LRT for Sphericity
4.1 Sizes and Powers of LRT for Independent Normal Components
5.1 Sizes and Powers of LRT for Equality of Multiple Normal Distributions
6.1 Sizes and Powers of LRT for Multiple Covariance Matrices
7.1 Sizes and Powers of LRT for Specified Normal Distribution
8.1 Sizes and Powers of LRT for Complete Independence

Chapter 1

Background

In recent years, researchers have become increasingly sophisticated at using technology to expand their capabilities. They are now able to collect massive quantities of data that can produce revelations leading to new insights, more predictive power, better treatment protocols, and faster results at lower costs. However, as technology continues to expand access to ever larger amounts and types of data, down to minute levels, a whole new set of challenges arises concerning how the data is to be managed to maximize its potential usefulness. The data must be identified, reviewed, categorized, evaluated, and understood if it is to become meaningful information. The cost and time of managing data become significant; the collection, organization, and processing of datasets of enormous dimensions therefore pose a critical challenge for both researchers and professionals, one that has become a major focus of statistics, mathematics, and computer science and will remain so far into the foreseeable future.

Traditional statistical theory, particularly in multivariate analysis, did not contemplate the demands of high dimensionality in data analysis, due to technological limitations. Consequently, in many cases, tests of hypotheses and many other modeling procedures found in the classical multivariate analysis textbooks, such as Anderson [2], Muirhead [36], and Siotani et al. [43], were developed under the assumption that the dimension of the dataset, conventionally denoted by p, is a fixed small constant or at least negligible compared with the sample size n. However, this assumption is no longer true for many modern datasets, because their dimensions can be proportionally

large compared with the sample size. Examples of high-dimensional datasets include:

Financial Data. Over the past decades, the global financial market has grown rapidly in size and complexity thanks to innovation in financial engineering and the proliferation of financial derivatives. Today, tens of thousands of financial products, including securities, commodities, currencies, options, swaps, and credit contracts, are traded around the clock, with bid information recorded to financial databases. One big challenge in financial data analysis is to make short-term forecasts for portfolios comprised of a great variety of financial products. This usually involves analyzing a dataset with a high dimension (the number of financial products) but a limited sample size (the number of trading records within a short time period).

Consumer Data. The increasing popularity of search engines and online shopping is fundamentally changing the landscape of consumer behavior study. As browsing, searching, and purchasing activities are recorded and compiled into databases, advertisers and marketing researchers increasingly rely on analyzing these data to correlate consumer actions with product attributes. One such example is the movie rating data published by Netflix for its one-million-dollar prize contest between 2006 and 2009, which aimed to improve the prediction of movie ratings based on consumers' movie preferences. The training data for this competition included ratings from approximately 480,000 customers on 18,000 movies.

Manufacturing Data. Modern statistical process control for lean manufacturing often adopts real-time data collection that is automatically performed by test equipment on production lines.
Data corresponding to a large number of product and process characteristics, from raw material receiving inspection and in-process component monitoring to finished goods testing, are recorded to production databases and analyzed to prevent defects and detect process shifts. In some cases, the number of attributes tested on products is comparable to, or even exceeds, the production volume, leading to datasets with dimension p greater than the sample size n.

Multimedia Data. Multimedia objects, such as images, flash animations, and

audio and video clips, are ubiquitous in contemporary life. One of the challenging problems in multimedia analysis is similarity search, i.e., seeking a collection of multimedia objects from a larger pool that are similar to a given query object. As the similarity between two multimedia objects cannot be measured directly, the comparison usually involves mapping some important primitives of a multimedia object to a numeric vector (called a feature vector) in a high-dimensional space, so that the similarity can be quantified by the distance in that space. Because the quality of the search improves as the dimension of that space goes up, a similarity search often ends up handling high-dimensional datasets.

More examples of high-dimensional data can be found in Donoho [14] and Johnstone [27]. The failure of the traditional multivariate methods for high-dimensional data was observed by Dempster [13] as early as 1958. Dempster identified the issue that the classical Hotelling's T² test for the difference of the means of two p-variate normal distributions becomes inapplicable when the dimension p is greater than the within-sample degrees of freedom (n₁ − 1) + (n₂ − 1), because in this case the Hotelling's T² statistic is undefined. Dempster proposed a so-called non-exact F test based on the ratio of two mean square distances as an alternative solution for this case. Bai and Saranadasa [4] further showed that even when the Hotelling's T² statistic is well defined, Dempster's F test is more powerful than the Hotelling's T² test if the dimension is proportionally close to the within-sample degrees of freedom. Most recently, Bai et al. [3] studied the likelihood ratio test (LRT) for the covariance matrix of a normal distribution and showed that using the traditional chi-square approximation (e.g., see Section 8.4 of [36]) to the limiting distribution of the test statistic will result in a much inflated test size (or alpha error) even with moderate sizes of p and n. Bai et al.
[3] developed corrections to the traditional likelihood ratio test to make it suitable for testing a high-dimensional normal distribution. In Bai's derivation, the dimension p is no longer considered a fixed constant, but rather a variable that goes to infinity along with the sample size n, and the ratio between p and n converges to a constant y, i.e.,

    lim_{n→∞} p_n / n = y ∈ (0, 1).    (1.1)

Jiang et al. [21] further extended Bai's result to cover the case y = 1 and also proposed a new LRT for testing the equality of two covariance matrices of normal distributions in the high-dimensional situation. Besides the likelihood ratio tests, many other traditional hypothesis tests in multivariate analysis have also been revisited in the past decade for high-dimensional cases. Examples include Ledoit and Wolf [29] and Schott [40], [41], [42]. Fujikoshi et al. [16] gave a book-length survey on multivariate methods under the high-dimensional framework where p/n → y > 0.

In this thesis, we study several other likelihood ratio tests for means and covariance matrices of high-dimensional normal distributions. Most of these tests have asymptotic results for their test statistics derived decades ago under the assumption of a large n but a fixed p. Our results supplement these traditional results in providing alternatives for analyzing high-dimensional datasets.

The rest of this thesis is organized as follows. In Chapter 2, we prepare the proof of our main theorems by developing an asymptotic expansion of a multivariate Gamma function. In Chapter 3 through Chapter 8, we prove central limit theorems for six commonly used likelihood ratio test statistics in the high-dimensional case. Using these central limit theorems will allow one to perform the likelihood ratio tests on datasets with dimension p comparable to the sample size n. We also compare the performance (test size and power) of the proposed high-dimensional LRTs against the traditional ones through simulation. In the final chapter, we summarize our results and conclude by offering some open problems for future consideration.

Chapter 2

Asymptotic Expansion of Multivariate Gamma Function

In this chapter, we develop an asymptotic expansion of the multivariate Gamma function. This result plays a pivotal role in later chapters as we derive the limiting distributions of the likelihood ratio test statistics for mean vectors and covariance matrices of high-dimensional normal distributions. This is because the moments of these test statistics can be expressed in a closed form using multivariate Gamma functions. We begin our discussion with the definition of the multivariate Gamma function. Recall the definition of the univariate Gamma function Γ(·) on the complex space:

    Γ(α) := ∫₀^∞ exp(−x) x^{α−1} dx  for α ∈ C with Re(α) > 0.    (2.1)

A multivariate Gamma function, denoted by Γ_p(·), is a generalization of the univariate Gamma function:

DEFINITION 2.1 The multivariate Gamma function of dimension p > 1, denoted by Γ_p(α), is defined as

    Γ_p(α) := ∫_{A>0} exp(−tr A) det(A)^{α−(p+1)/2} dA    (2.2)

for α ∈ C with Re(α) > (p−1)/2, where the integration is over the set of all positive definite symmetric matrices {A : A > 0}.

Apparently, when p = 1, the multivariate Gamma function (2.2) reduces to its univariate form (2.1), i.e., Γ_1(α) = Γ(α). It was also shown (e.g., Theorem 2.1.1 from Muirhead [36]) that a multivariate Gamma function can be expressed as a product of univariate Gamma functions,

    Γ_p(α) = π^{p(p−1)/4} ∏_{i=1}^{p} Γ(α − (i−1)/2)  for Re(α) > (p−1)/2.    (2.3)

This result is more useful from the computational aspect and is sometimes considered an equivalent definition of the multivariate Gamma function. The multivariate Gamma function has wide application in multivariate statistics. For instance, it appears in the expressions of the probability density functions of the Wishart and inverse Wishart distributions (see James [20] for reference). The asymptotic expansion of the univariate Gamma function is best known as the Stirling formula (see, e.g., p. 368 from [17] or eq. (37) on page 204 from [1]):

    log Γ(z) = (z − 1/2) log z − z + (1/2) log(2π) + 1/(12z) + O(1/Re(z)³).    (2.4)

Next, we derive some useful results on the asymptotic expansion of the multivariate Gamma function. Since our applications of these results are mainly statistical, we limit our discussion to the real space R for the sake of simplicity in derivation. We begin with the following lemma:

LEMMA 2.1 Let b := b(x) be a real-valued function defined on (0, ∞). As x → +∞,

    log [Γ(x + b)/Γ(x)] = b log x + (b² − b)/(2x) + c(x),    (2.5)

where Γ(·) is the univariate Gamma function as defined in (2.1) and

    c(x) = O(x^{−1/2}), if b(x) = O(√x) as x → +∞;
    c(x) = O(x^{−2}),   if b(x) = O(1) as x → +∞.

Proof. It follows from the Stirling formula (2.4) that

    log [Γ(x + b)/Γ(x)] = (x + b) log(x + b) − x log x − b − (1/2)[log(x + b) − log x] + (1/12)[1/(x + b) − 1/x] + O(1/x³)    (2.6)

as x → +∞. First, using the fact that log(1 + t) = t − t²/2 + O(t³) as t → 0, we have

    (x + b) log(x + b) − x log x = (x + b)[log x + log(1 + b/x)] − x log x
      = (x + b)[log x + b/x − b²/(2x²) + O(b³/x³)] − x log x
      = b log x + b + b²/(2x) − b³/(2x²) + O(b⁴/x³)
      = b log x + b + b²/(2x) + c₁(x),

where

    c₁(x) = O(x^{−1/2}), if b(x) = O(√x);  c₁(x) = O(x^{−2}), if b(x) = O(1),

as x → +∞. Similarly, as x → +∞,

    log(x + b) − log x = log(1 + b/x) = b/x + O(x^{−1}) if b(x) = O(√x);  b/x + O(x^{−2}) if b(x) = O(1);

    1/(x + b) − 1/x = −b/(x(x + b)) = O(x^{−3/2}) if b(x) = O(√x);  O(x^{−2}) if b(x) = O(1).

Substituting these assertions in (2.6), we have

    log [Γ(x + b)/Γ(x)] = b log x + (b² − b)/(2x) + c(x),

where c(x) = O(x^{−1/2}) if b(x) = O(√x), and c(x) = O(x^{−2}) if b(x) = O(1), as x → +∞.

LEMMA 2.2 Let n > p = p_n. Assume that p/n → y ∈ (0, 1) and t = t_n = O(1) as n → ∞. Then, as n → ∞,

    log ∏_{i=n−p}^{n−1} [Γ(i/2 + t)/Γ(i/2)] = [n log n − p(1 + log 2) − (n − p) log(n − p)] t − (t² − 1.5t) log(1 − y) + o(1).

Proof. Since p/n → y ∈ (0, 1), we have n − p → +∞ as n → ∞. By Lemma 2.1, there exists an integer C₁ such that

    log [Γ(i/2 + t)/Γ(i/2)] = t log(i/2) + (t² − t)/i + ϕ(i)  with  |ϕ(i)| ≤ C₁/i²

for all i ≥ n − p as n is sufficiently large, where here and later in this proof we write t for t_n for short notation. Notice t log(i/2) = t log i − t log 2. Then,

    ∑_{i=n−p}^{n−1} log [Γ(i/2 + t)/Γ(i/2)]
      = −pt log 2 + t ∑_{i=n−p}^{n−1} log i + (t² − t) ∑_{i=n−p}^{n−1} (1/i) + ∑_{i=n−p}^{n−1} ϕ(i)
      = −pt log 2 + (t² − t) ∑_{i=n−p}^{n−1} (1/i) + t log[n!/(n−p)!] − t log[n/(n−p)] + O(1/n)
      = −pt log 2 + (t² − t) ∑_{i=n−p}^{n−1} (1/i) + t log(1 − y) + t log[n!/(n−p)!] + o(1),    (2.7)

since ∑_{i=n−p}^{n−1} ϕ(i) = O(1/n) and −log[n/(n−p)] = log(1 − p/n) → log(1 − y) as n → ∞. Notice that

    ∑_{i=n−p}^{n−1} (1/i) ≤ ∑_{i=n−p}^{n−1} ∫_{i−1}^{i} (1/x) dx = ∫_{n−p−1}^{n−1} (1/x) dx.

By working on the lower bound in a similar way, we also have

    log[n/(n−p)] = ∫_{n−p}^{n} (1/x) dx ≤ ∑_{i=n−p}^{n−1} (1/i) ≤ ∫_{n−p−1}^{n−1} (1/x) dx = log[(n−1)/(n−p−1)].

This implies, by the assumption that p/n → y ∈ (0, 1), that

    ∑_{i=n−p}^{n−1} (1/i) → −log(1 − y)    (2.8)

as n → ∞. Second, by the Stirling formula on factorials (see, e.g., p. 120 from [15]), there are some θ_n, θ'_n ∈ (0, 1) such that

    log[n!/(n−p)!] = log[√(2πn) n^n e^{−n+θ_n/(12n)}] − log[√(2π(n−p)) (n−p)^{n−p} e^{−(n−p)+θ'_n/(12(n−p))}]
      = n log n − (n−p) log(n−p) − p + (1/2) log[n/(n−p)] + o(1)
      = n log n − (n−p) log(n−p) − p − (1/2) log(1 − y) + o(1)

as n → ∞. Joining this with (2.7) and (2.8), we arrive at

    log ∏_{i=n−p}^{n−1} [Γ(i/2 + t)/Γ(i/2)]
      = −pt log 2 − (t² − t) log(1 − y) + t log(1 − y) + tn log n − t(n−p) log(n−p) − pt − (t/2) log(1 − y) + o(1)
      = −pt(1 + log 2) − (t² − 1.5t) log(1 − y) + tn log n − t(n−p) log(n−p) + o(1)

as n → ∞. The proof is then complete.

LEMMA 2.3 Let n > p = p_n and r_n = [−log(1 − p/n)]^{1/2}. Assume that p/n → 1 and t = t_n = O(1/r_n) as n → ∞. Then, as n → ∞,

    log ∏_{i=n−p}^{n−1} [Γ(i/2 + t)/Γ(i/2)] = p(log n − 1 − log 2)t + r_n²[t² − (p − n + 1.5)t] + o(1).    (2.9)

Proof. Obviously, lim_{n→∞} r_n = +∞; hence t_n → 0 as n → ∞, and in particular {t_n; n ≥ 2} is bounded. By Lemma 2.1, there exist integers C₁ and C₂ such that

    log [Γ(i/2 + t)/Γ(i/2)] = t log(i/2) + (t² − t)/i + ϕ(i)  and  |ϕ(i)| ≤ C₁/i²    (2.10)

for all i ≥ C₂. We will use (2.10) to estimate ∏_{i=n−p}^{n−1} Γ(i/2 + t)/Γ(i/2). However, when n − p is small, say, 2 or 3 (which is possible since p/n → 1), the identity (2.10) cannot be applied directly to estimate each term in the product. We next use a truncation to solve the problem, thanks to the fact that Γ(i/2 + t)/Γ(i/2) → 1 as n → ∞ for each fixed i. Fix M ≥ C₂. Write

    a_i = Γ(i/2 + t)/Γ(i/2) for i ≥ 1,  and  γ_n = 1 if n − p ≥ M;  γ_n = ∏_{i=n−p}^{M−1} a_i if n − p < M.

Then,

    ∏_{i=n−p}^{n−1} [Γ(i/2 + t)/Γ(i/2)] = γ_n · ∏_{i=(n−p)∨M}^{n−1} [Γ(i/2 + t)/Γ(i/2)],    (2.11)

where (n−p)∨M := max(n−p, M).

Easily,

    [min_{1≤i≤M} (1 ∧ a_i)]^M ≤ γ_n ≤ [max_{1≤i≤M} (1 ∨ a_i)]^M

for all n ≥ 1. Note that, for each i ≥ 1, a_i → 1 as n → ∞ since lim_{n→∞} t_n = 0. Thus, since M is fixed, the two bounds above go to 1 as n → ∞. Consequently, lim_{n→∞} γ_n = 1. This and (2.11) say that

    ∏_{i=n−p}^{n−1} [Γ(i/2 + t)/Γ(i/2)] ~ ∏_{i=(n−p)∨M}^{n−1} [Γ(i/2 + t)/Γ(i/2)]    (2.12)

as n → ∞. By (2.10), as n is sufficiently large, we know

    log ∏_{i=(n−p)∨M}^{n−1} [Γ(i/2 + t)/Γ(i/2)] = ∑_{i=(n−p)∨M}^{n−1} [ t log(i/2) + (t² − t)/i + ϕ(i) ]

with |ϕ(i)| ≤ C₁/i² for i ≥ C₂. Write t log(i/2) = t log i − t log 2. It follows that

    log ∏_{i=(n−p)∨M}^{n−1} [Γ(i/2 + t)/Γ(i/2)]
      = −t[n − (n−p)∨M] log 2 + t ∑_{i=(n−p)∨M}^{n−1} log i + (t² − t) ∑_{i=(n−p)∨M}^{n−1} (1/i) + ∑_{i=(n−p)∨M}^{n−1} ϕ(i)
      := A_n + B_n + C_n + D_n    (2.13)

as n is sufficiently large. Now we analyze the four terms above separately. By distinguishing the cases n − p > M and n − p ≤ M, we get

    |A_n + pt log 2| ≤ (log 2)|t| · |n − p − M| · I(n − p ≤ M) ≤ (M log 2)|t|.    (2.14)

Now we estimate B_n. By the same argument as in (2.14), we get

    | ∑_{i=(n−p)∨M}^{n−1} h(i) − ∑_{i=n−p}^{n−1} h(i) | ≤ ∑_{i=1}^{M} |h(i)|    (2.15)

for h(x) = log x or h(x) = 1/x on x ∈ (0, ∞). By the Stirling formula (see, e.g., p. 120 from [15]), n! = √(2πn) n^n e^{−n + θ_n/(12n)} with θ_n ∈ (0, 1) for all n ≥ 1. It follows that, for some θ_n, θ'_n ∈ (0, 1),

    ∑_{i=n−p}^{n−1} log i = log[n!/(n−p)!] + log[(n−p)/n]
      = log[√(2πn) n^n e^{−n+θ_n/(12n)}] − log[√(2π(n−p)) (n−p)^{n−p} e^{−(n−p)+θ'_n/(12(n−p))}] + log[(n−p)/n]
      = n log n − (n−p) log(n−p) − p + (1/2) log[n/(n−p)] + log[(n−p)/n] + R_n

with |R_n| ≤ 1 as n is sufficiently large. Recall B_n = t ∑_{i=(n−p)∨M}^{n−1} log i. We know from (2.15) that

    | B_n − t[ n log n − (n−p) log(n−p) − p − (1/2) log(n/(n−p)) ] | ≤ C|t|,    (2.16)

where C here and later stands for a constant and can be different from line to line.

Now we estimate C_n. Recall the identity s_n := ∑_{i=1}^{n} (1/i) = log n + c_n for all n ≥ 1, where lim_{n→∞} c_n = c, the Euler constant. Thus,

    | s_n − s_{n−p} − log[n/(n−p)] | ≤ |c_n| + |c_{n−p}| ≤ C.

Moreover, ∑_{i=n−p+1}^{n} (1/i) = s_n − s_{n−p} and | ∑_{i=n−p}^{n−1} (1/i) − ∑_{i=n−p+1}^{n} (1/i) | ≤ 1. Therefore,

    | ∑_{i=n−p}^{n−1} (1/i) − log[n/(n−p)] | ≤ C.

Consequently, since C_n = (t² − t) ∑_{i=(n−p)∨M}^{n−1} (1/i), we know from (2.15) that

    | C_n − (t² − t) log[n/(n−p)] | ≤ C(t² + |t|).    (2.17)

Finally, it is easy to see from the second fact in (2.10) that

    |D_n| ≤ C₁ ∑_{i=M}^{∞} (1/i²)    (2.18)

for all n. Now, recalling that t = t_n → 0 as n → ∞, we have from (2.12), (2.13), (2.14), (2.16) and (2.17) that, for a fixed integer M > 0,

    A_n + B_n + C_n + D_n
      = −pt log 2 + t[n log n − (n−p) log(n−p)] − tp − (t/2) log[n/(n−p)] + (t² − t) log[n/(n−p)] + D_n + o(1)
      = −pt(1 + log 2) + (t² − 1.5t + nt) log n − [t² − (p − n + 1.5)t] log(n−p) + D_n + o(1)

as n → ∞. Write log(n−p) = log n − r_n². Then the main term above becomes

    E_n := −pt(1 + log 2) + (t² − 1.5t + nt) log n − [t² − (p − n + 1.5)t](log n − r_n²)
         = p(log n − 1 − log 2)t + r_n²[t² − (p − n + 1.5)t].

From (2.18) we have that

    limsup_{n→∞} | A_n + B_n + C_n + D_n − E_n | ≤ C₁ ∑_{i=M}^{∞} (1/i²)

for any M ≥ C₂. Recalling (2.12) and (2.13), letting M → ∞, we eventually obtain the desired conclusion.

For simplicity, we may combine Lemma 2.2 and Lemma 2.3 into one proposition:

PROPOSITION 2.1 Let n > p = p_n and r_n = [−log(1 − p/n)]^{1/2}. Assume that p/n → y ∈ (0, 1] and t = t_n = O(1/r_n) as n → ∞. Then, as n → ∞,

    log ∏_{i=n−p}^{n−1} [Γ(i/2 + t)/Γ(i/2)] = pt(log n − 1 − log 2) + r_n²[t² − (p − n + 1.5)t] + o(1).

Proof. The equality corresponding to the case y = 1 follows from Lemma 2.3. If y ∈ (0, 1), then lim_{n→∞} r_n = [−log(1 − y)]^{1/2}, and hence {t_n : n ≥ 1} is bounded. It follows that

    pt(log n − 1 − log 2) + r_n²[t² − (p − n + 1.5)t]
      = pt(log n − 1 − log 2) + t(p − n) log(1 − p/n) − (t² − 1.5t) log(1 − p/n).

The last term above is identical to −(t² − 1.5t) log(1 − y) + o(1) since p/n → y as n → ∞. Moreover,

    pt log n + t(p − n) log(1 − p/n) = pt log n + t(p − n)[log(n − p) − log n] = nt log n − t(n − p) log(n − p).

The above assertions conclude

    pt(log n − 1 − log 2) + r_n²[t² − (p − n + 1.5)t]
      = nt log n − pt(1 + log 2) − t(n − p) log(n − p) − (t² − 1.5t) log(1 − y) + o(1)

as n → ∞. This is exactly the right-hand side of the expansion in Lemma 2.2.

LEMMA 2.4 Let n > p = p_n and r_n = [−log(1 − p/n)]^{1/2}. Assume p/n → y ∈ (0, 1] and t = t_n = O(1/r_n) as n → ∞. Then

    log [ Γ(n/2 + t) Γ((n−p)/2) / ( Γ(n/2) Γ((n−p)/2 + t) ) ] = r_n² t + o(1)    (2.19)

as n → ∞.

Proof. We prove the lemma by considering two cases.

Case (i): y ∈ (0, 1). In this case, n − p → ∞ and lim_{n→∞} r_n = [−log(1 − y)]^{1/2} ∈ (0, ∞), and hence {t_n} is bounded. By Lemma 2.1,

    log [Γ(n/2 + t)/Γ(n/2)] = t log(n/2) + O(1/n),
    log [Γ((n−p)/2 + t)/Γ((n−p)/2)] = t log((n−p)/2) + O(1/(n−p))

as n → ∞. Subtracting the two assertions, we get that the left-hand side of (2.19) is equal to

    t log(n/2) − t log((n−p)/2) + o(1) = −t log(1 − p/n) + o(1) = r_n² t + o(1)    (2.20)

as n → ∞. So the lemma holds for y ∈ (0, 1).

Case (ii): y = 1. In this case, r_n → +∞ and t_n → 0 as n → ∞. By Lemma 2.1, there exist integers C₁ and C₂ such that

    log [Γ(m/2 + t)/Γ(m/2)] = t log(m/2) + (t² − t)/m + ϕ(m)  and  |ϕ(m)| ≤ C₁/m²    (2.21)

for all m ≥ C₂. For any ε > 0, take an integer M ≥ C₂ such that C₁/M ≤ ε. Set

    A_n = log min_{1≤i≤M} [Γ(i/2 + t)/Γ(i/2)]  and  B_n = log max_{1≤i≤M} [Γ(i/2 + t)/Γ(i/2)].

Thus, A_n ≤ log[Γ((n−p)/2 + t)/Γ((n−p)/2)] ≤ B_n for all n with 1 ≤ n − p ≤ M. Consequently,

    | log [Γ((n−p)/2 + t)/Γ((n−p)/2)] − t log((n−p)/2) | ≤ |A_n| + |B_n| + |t| log(M/2)    (2.22)

for all n with 1 ≤ n − p ≤ M. If n − p > M, noticing lim_{n→∞} t_n = 0, then there exists an integer C₃ ≥ C₂ such that

    | (t² − t)/(n−p) + ϕ(n−p) | ≤ (t² + |t| + C₁)/M ≤ ε

as n ≥ C₃, by the second assertion in (2.21). Consequently, by the first assertion in (2.21),

    | log [Γ((n−p)/2 + t)/Γ((n−p)/2)] − t log((n−p)/2) | ≤ ε    (2.23)

for all n ≥ C₃ with n − p > M. Since lim_{n→∞} t_n = 0, we know A_n → 0 and B_n → 0 as n → ∞. Joining (2.22) with (2.23), we conclude that

    | log [Γ((n−p)/2 + t)/Γ((n−p)/2)] − t log((n−p)/2) | ≤ |A_n| + |B_n| + |t| log(M/2) + ε < 3ε

as n is sufficiently large. This says that the left-hand side above goes to 0 as n → ∞. Equivalently,

    log [Γ((n−p)/2 + t)/Γ((n−p)/2)] = t log((n−p)/2) + o(1)    (2.24)

as n → ∞. By Lemma 2.1 and the fact that lim_{n→∞} t_n = 0,

    log [Γ(n/2 + t)/Γ(n/2)] = t log(n/2) + O(1/n)

as n → ∞. Subtracting (2.24) from this, and then using the same argument as in (2.20), we obtain (2.19).

Based on all the results presented, we complete this chapter with our main result on the asymptotic expansion of the multivariate Gamma function:

PROPOSITION 2.2 Let n > p = p_n and r_n = [−log(1 − p/n)]^{1/2}. Assume p/n → y ∈ (0, 1] and t = t_n = O(1/r_n) as n → ∞. Then, as n → ∞,

    log [Γ_p(n/2 + t)/Γ_p(n/2)] = pt(log n − 1 − log 2) + r_n²[t² − (p − n + 0.5)t] + o(1).

Proof. First, by the product formula (2.3) (the factors π^{p(p−1)/4} cancel in the ratio),

    Γ_p(n/2 + t)/Γ_p(n/2) = ∏_{i=1}^{p} [Γ(n/2 + t − (i−1)/2) / Γ(n/2 − (i−1)/2)] = ∏_{j=n−p+1}^{n} [Γ(j/2 + t)/Γ(j/2)].

It follows that

    Γ_p(n/2 + t)/Γ_p(n/2) = [ Γ(n/2 + t) Γ((n−p)/2) / ( Γ(n/2) Γ((n−p)/2 + t) ) ] · ∏_{j=n−p}^{n−1} [Γ(j/2 + t)/Γ(j/2)].    (2.25)

Now, according to Proposition 2.1, we have

    log ∏_{j=n−p}^{n−1} [Γ(j/2 + t)/Γ(j/2)] = pt(log n − 1 − log 2) + r_n²[t² − (p − n + 1.5)t] + o(1)

as n → ∞. On the other hand, from Lemma 2.4,

    log [ Γ(n/2 + t) Γ((n−p)/2) / ( Γ(n/2) Γ((n−p)/2 + t) ) ] = r_n² t + o(1)

as n → ∞. Combining the last three assertions, we have

    log [Γ_p(n/2 + t)/Γ_p(n/2)] = pt(log n − 1 − log 2) + r_n²[t² − (p − n + 0.5)t] + o(1).

COROLLARY 2.1 Let n > p = p_n and r_n = [−log(1 − p/n)]^{1/2}. Assume p/n → y ∈ (0, 1], s = s_n = O(1/r_n) and t = t_n = O(1/r_n) as n → ∞. Then, as n → ∞,

    log [Γ_p(n/2 + t)/Γ_p(n/2 + s)] = p(t − s)(log n − 1 − log 2) + r_n²[(t² − s²) − (p − n + 0.5)(t − s)] + o(1).

Proof. Simply write

    log [Γ_p(n/2 + t)/Γ_p(n/2 + s)] = log [Γ_p(n/2 + t)/Γ_p(n/2)] − log [Γ_p(n/2 + s)/Γ_p(n/2)].

Then the conclusion is a direct result of Proposition 2.2.
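As a quick numerical plausibility check (our own sketch, not part of the original derivation), the ratio Γ_p(n/2 + t)/Γ_p(n/2) can be evaluated exactly through the product formula (2.3) with log-Gamma functions and compared against the expansion of Proposition 2.2. The function names below are ours, and the agreement tolerance is an assumption about the size of the o(1) term at these values of n and p:

```python
import math

def log_gamma_p_ratio(p, a, t):
    # log[Gamma_p(a + t) / Gamma_p(a)] computed exactly via the product
    # formula (2.3); the pi^{p(p-1)/4} factors cancel in the ratio.
    return sum(math.lgamma(a + t - (i - 1) / 2.0) - math.lgamma(a - (i - 1) / 2.0)
               for i in range(1, p + 1))

def proposition_2_2(n, p, t):
    # Right-hand side of Proposition 2.2, without the o(1) term.
    rn2 = -math.log(1.0 - p / n)  # this is r_n^2
    return p * t * (math.log(n) - 1.0 - math.log(2.0)) + rn2 * (t * t - (p - n + 0.5) * t)

n, p, t = 2000, 1000, 0.2
print(log_gamma_p_ratio(p, n / 2.0, t), proposition_2_2(n, p, t))
# the two values agree up to the o(1) error
```

For p = 1 the proposition reduces to Lemma 2.1 applied to a single Gamma ratio, which gives an even sharper agreement.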

Chapter 3

Testing Covariance Matrix of Normal Distribution Proportional to Identity Matrix (Sphericity Test)

3.1 Introduction

Let x₁, ..., x_n be i.i.d. R^p-valued random vectors from a normal distribution N_p(µ, Σ), where µ ∈ R^p is the mean vector and Σ is the p × p covariance matrix. Consider the hypothesis test

    H₀: Σ = λI_p  vs  H₁: Σ ≠ λI_p,    (3.1)

where λ > 0 is unknown. Denote

    x̄ = (1/n) ∑_{i=1}^{n} x_i,  A = ∑_{i=1}^{n} (x_i − x̄)(x_i − x̄)′,  and  S = A/(n − 1).    (3.2)

The likelihood ratio statistic of the test (3.1) was first derived by Mauchly [34] as

    V_n = det(A) / [tr(A)/p]^p = det(S) / [tr(S)/p]^p.    (3.3)
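For concreteness, (3.3) can be computed directly from a data matrix. The following is a minimal sketch (the function name `log_Vn` is ours, and NumPy is assumed available); a log-determinant is used for numerical stability, since det(A) itself over- or underflows quickly in high dimensions:

```python
import numpy as np

def log_Vn(x):
    # x: n-by-p data matrix; returns log V_n for the statistic in (3.3)
    n, p = x.shape
    xc = x - x.mean(axis=0)               # center each column at the sample mean
    A = xc.T @ xc                         # A = sum_i (x_i - xbar)(x_i - xbar)'
    sign, logdetA = np.linalg.slogdet(A)  # stable log det(A); needs p <= n - 1
    return logdetA - p * np.log(np.trace(A) / p)

rng = np.random.default_rng(0)
x = rng.standard_normal((200, 50))
print(log_Vn(x))  # always <= 0, since det(A)^(1/p) <= tr(A)/p by AM-GM
```

Note that log V_n is invariant under rescaling x → λx, which reflects that the scale λ in H₀ is unknown and irrelevant to the test.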

Notice that the matrices A and S are not of full rank when p > n, and consequently their determinants are equal to zero in this case. This indicates that the likelihood ratio test of (3.1) only exists when p ≤ n. The statistic V_n is commonly known as the ellipticity statistic. Gleser [18] showed that the likelihood ratio test with the rejection region {V_n ≤ c_α} (where c_α is chosen so that the test has a significance level of α) is unbiased. The distribution of the test statistic V_n can be studied through its moments. When the null hypothesis H₀: Σ = λI_p is true, the following result is referenced from page 341 of Muirhead [36]:

    E(V_n^h) = p^{ph} · [ Γ((n−1)p/2) / Γ((n−1)p/2 + ph) ] · [ Γ_p((n−1)/2 + h) / Γ_p((n−1)/2) ]  for h > (p − n)/2.    (3.4)

When p is assumed a fixed integer, the following result, referenced from [36] and [2], gives an explicit expansion of the distribution function of −(n−1)ρ log V_n, where ρ = 1 − (2p² + p + 2)/(6p(n−1)), as M = (n−1)ρ → ∞:

    Pr[ −(n−1)ρ log V_n ≤ x ] = Pr(χ²_f ≤ x) + (γ/M²)[ Pr(χ²_{f+4} ≤ x) − Pr(χ²_f ≤ x) ] + O(M^{−3}),    (3.5)

where f = (p + 2)(p − 1)/2, γ = (n−1)²ρ²ω₂, and ω₂ is given by

    ω₂ = (p − 1)(p − 2)(p + 2)(2p³ + 6p² + 3p + 2) / [288 p² (n−1)² ρ²].    (3.6)

Nagarsenker and Pillai [37] tabulated the lower 5 and 1 percentiles of the asymptotic distribution of V_n under the null hypothesis H₀: Σ = λI_p. A different test for sphericity, other than the likelihood ratio test, was recommended by John [25], who studied the test statistic

    U = (1/p) tr[ ( S/((1/p) tr S) − I_p )² ] = [ (1/p) tr(S²) ] / [ (1/p) tr S ]² − 1.    (3.7)

According to John [25], the test that rejects the null hypothesis when U > c_α, where c_α is determined by the significance level α, is a locally most powerful invariant test for sphericity, and this test is more universal than the aforementioned likelihood ratio test because it can be performed even with p > n. John [26] further showed that under the

null hypothesis of (3.1), the limiting distribution of the test statistic U, as the sample size n goes to infinity while the dimension p remains fixed, is given by

    (np/2) U →_d χ²_{p(p+1)/2 − 1}.    (3.8)

Ledoit and Wolf [29] re-examined the limiting distribution of the test statistic U in the high-dimensional situation where p/n → c ∈ (0, ∞). They proved that under the null hypothesis of (3.1),

    nU − p →_d N(1, 4).    (3.9)

Ledoit and Wolf further argued that, since

    (2/p) χ²_{p(p+1)/2 − 1} − p →_d N(1, 4),    (3.10)

John's n-asymptotic results (assuming p is fixed) for the test statistic U remain valid in practice in the high-dimensional case (i.e., both p and n are large). Most recently, Chen, Zhang and Zhong [9] extended Ledoit and Wolf's asymptotic result to non-normal distributions with certain conditions on their covariance matrices.

3.2 High-dimensional Likelihood Ratio Test for Sphericity

In this section, we focus on the likelihood ratio test for sphericity in the high-dimensional case (p/n → y ∈ (0, 1]) and develop a central limit theorem for the likelihood ratio test statistic log V_n as given in (3.3). The proof of this theorem is deferred until the next section.

THEOREM 3.1 Assume that p := p_n is a sequence of positive integers depending on n such that n > 1 + p for all n ≥ 3 and p/n → y ∈ (0, 1] as n → ∞. Let V_n be defined as in (3.3). Then under H₀: Σ = λI_p (λ unknown), (log V_n − µ_n)/σ_n converges in distribution to N(0, 1) as n → ∞, where

    µ_n = −p − (n − p − 1.5) log(1 − p/(n − 1)),
    σ_n² = −2 [ p/(n − 1) + log(1 − p/(n − 1)) ] > 0.
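In practice, Theorem 3.1 yields a simple one-sided rejection rule: reject H₀ when the normalized statistic (log V_n − µ_n)/σ_n falls below the lower α-quantile of N(0, 1), since small values of V_n are evidence against sphericity. A minimal sketch follows; the function name is ours, and the use of Python's statistics.NormalDist for the normal quantile is an implementation choice, not part of the thesis:

```python
import math
from statistics import NormalDist

def hd_sphericity_test(log_vn, n, p, alpha=0.05):
    # mu_n and sigma_n from Theorem 3.1 (requires n > p + 1)
    ratio = p / (n - 1.0)
    mu_n = -p - (n - p - 1.5) * math.log(1.0 - ratio)
    sigma_n = math.sqrt(-2.0 * (ratio + math.log(1.0 - ratio)))
    z = (log_vn - mu_n) / sigma_n
    z_alpha = NormalDist().inv_cdf(alpha)   # lower-tail critical value, about -1.645 at 0.05
    return z, z < z_alpha

z, reject = hd_sphericity_test(-20.0, 200, 50)
print(z, reject)  # reject is True for this strongly sub-spherical value of log V_n
```

For (n, p) = (200, 50), µ_n ≈ −7.03 and σ_n ≈ 0.28, so a value of log V_n near µ_n is retained while markedly smaller values are rejected.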

We compare the performance of the sphericity test using Theorem 3.1 against the traditional LRT using (3.5). Based on the calculated critical values of the test statistic at α = 0.05, we conduct a simulation study with 10,000 replicates from the normal distribution to obtain the realized size of the two tests (i.e., the probability of rejecting the null hypothesis) for different pairs (p, n). Our results are summarized in Table 3.1 and charted in Figure 3.1.

Table 3.1: Sizes and Powers of LRT for Sphericity. For each pair (p, n) with n = 200 and p ∈ {5, 10, 50, ..., 190, 198}, the table reports the critical value (at α = 0.05), the size, and the power of the traditional LRT and of the high-dimensional LRT. Sizes of the sphericity likelihood ratio tests are computed based on 10,000 independent replications of the tests with n samples drawn from N_p(2e, 0.5 I_p), where e = (1, ..., 1)′ ∈ R^p. The powers are estimated under the alternative hypothesis that Σ = diag(0.5, 0.09, 0.09, ..., 0.09).

It can be seen from Table 3.1 and Figure 3.1 that when p is small, e.g., p = 5, 10, and 50 in our simulation, the traditional LRT using the chi-square approximation of (3.5) has a size that matches the test significance level of 0.05 very well. In these cases, the high-dimensional LRT using Theorem 3.1 also demonstrates a good size, yet slightly greater than 0.05. When p is large, our simulation shows that the traditional LRT using (3.5) rejects H₀ with a much higher probability than 0.05. In particular, when p = 190 and p = 198, the traditional LRT always rejects H₀ in our simulation, leading to a 100% alpha error. However, the high-dimensional LRT using Theorem 3.1 outperforms

the traditional one, with its sizes still very close to 0.05 in these cases. On the power side, our simulation shows that the traditional LRT and the high-dimensional one have comparable powers when p is small. However, when p becomes large, the power of the traditional LRT goes up to 100% due to the failure of the test (100% alpha error) in these cases, while the power of the high-dimensional LRT stays valid. In summary, our simulation indicates that, in regard to the test size (or alpha error), the proposed high-dimensional LRT using Theorem 3.1 shows non-inferiority to the traditional LRT when p is small, yet a significant improvement over the traditional one when p becomes large.

Figure 3.1: Sizes and Powers of LRT for Sphericity

3.3 Proof of Theorem 3.1

Recall that a sufficient condition (see, e.g., page 408 from [6]) for a sequence of random variables {Z_n; n ≥ 1} to converge to Z in distribution as n → ∞ is that

    lim_{n→∞} E e^{tZ_n} = E e^{tZ} < ∞    (3.11)

for all t ∈ (−t_0, t_0), where t_0 > 0 is a constant. Thus, to prove the theorem, it suffices to show that there exists δ_0 > 0 such that

E exp{ ((log V_n − μ_n)/σ_n) s } → e^{s²/2}   (3.12)

as n → ∞ for all |s| < δ_0.

First, by the fact that x + log(1 − x) < 0 for all x ∈ (0, 1), we know that σ_n > 0 for each n and p with n ≥ 3, and

σ_n² → −2[ y + log(1 − y) ] > 0 as n → ∞ if y ∈ (0, 1), and σ_n² → +∞ as n → ∞ if y = 1.

Therefore, let δ_0 := inf{ σ_n : n ≥ 3 } > 0. Fix |s| < δ_0/2 and set t = t_n = s/σ_n. Then {t_n; n ≥ 3} is bounded and |t_n| < 1/2 for all n ≥ 3. By the moment result (3.4),

E e^{t log V_n} = E V_n^t = p^{pt} · ( Γ[(n−1)p/2] / Γ[(n−1)p/2 + pt] ) · ( Γ_p[(n−1)/2 + t] / Γ_p[(n−1)/2] )

for all n ≥ 3. By Lemma 2.1 and the assumption that p/n → y ∈ (0, 1],

log( Γ[(n−1)p/2] / Γ[(n−1)p/2 + pt] ) = −log( Γ[(n−1)p/2 + pt] / Γ[(n−1)p/2] )
   = −pt log( (n−1)p/2 ) − (p²t² − pt)/((n−1)p) + O( 1/((n−1)p) )
   = −pt log( (n−1)p/2 ) − pt²/(n−1) + O(1/n)   (3.13)

as n → ∞. Set r_x := [ −log(1 − p/x) ]^{1/2} for x > p, and notice

t r_{n−1} = (s/σ_n) [ −log(1 − p/(n−1)) ]^{1/2} → { s [−log(1−y)]^{1/2} / ( −2[y + log(1−y)] )^{1/2}, if y ∈ (0, 1); s/√2, if y = 1 }

as n → ∞. Thus, t = O(1/r_{n−1}) as n → ∞. By Proposition 2.2,

log( Γ_p[(n−1)/2 + t] / Γ_p[(n−1)/2] ) = pt[ log((n−1)/2) − 1 ] + r_{n−1}²[ t² − (p − n + 1.5) t ] + o(1)

as n → ∞. This together with (3.13) gives

log E e^{t log V_n} = pt log p + log( Γ[(n−1)p/2] / Γ[(n−1)p/2 + pt] ) + log( Γ_p[(n−1)/2 + t] / Γ_p[(n−1)/2] )
   = pt log p − pt log( (n−1)p/2 ) − pt²/(n−1) + pt[ log((n−1)/2) − 1 ] + r_{n−1}²[ t² − (p − n + 1.5) t ] + o(1)
   = ( r_{n−1}² − p/(n−1) ) t² + [ −p + (n − p − 1.5) r_{n−1}² ] t + o(1)

as n → ∞. Recalling the notation of μ_n, σ_n and t = t_n = s/σ_n, the above says that

log E exp{ (log V_n / σ_n) s } = log E e^{t log V_n} = (σ_n² t²)/2 + μ_n t + o(1) = s²/2 + (μ_n / σ_n) s + o(1)

as n → ∞ for all |s| < δ_0/2. This implies (3.12). The proof is completed.
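As a numerical companion to the simulation above, the critical values of log V_n are straightforward to compute from μ_n and σ_n in Theorem 3.1. The following sketch is our own illustration (the function names are hypothetical, and the hard-coded constant is the standard normal 0.95 quantile); it uses the one-sided rule that sphericity is rejected when log V_n falls below μ_n − z_{0.95} σ_n.

```python
import math

Z_95 = 1.6448536269514722  # standard normal 0.95 quantile

def sphericity_mu_sigma(p, n):
    """mu_n and sigma_n from Theorem 3.1:
    mu_n      = -p - (n - p - 1.5) * log(1 - p/(n-1)),
    sigma_n^2 = -2 * [ p/(n-1) + log(1 - p/(n-1)) ]."""
    q = p / (n - 1.0)
    mu = -p - (n - p - 1.5) * math.log(1.0 - q)
    sigma2 = -2.0 * (q + math.log(1.0 - q))
    return mu, math.sqrt(sigma2)

def critical_value(p, n, z=Z_95):
    # reject H0 (sphericity) at level 0.05 when log V_n < this value
    mu, sigma = sphericity_mu_sigma(p, n)
    return mu - z * sigma

# the (p, n) pairs of Table 3.1
cvs = {p: critical_value(p, 200) for p in (5, 10, 50, 100, 150, 190, 198)}
```

Note that σ_n² > 0 for every n ≥ 3 by the inequality x + log(1 − x) < 0 used in the proof, so the standardization is always well defined.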

Chapter 4

Testing Independence of Components of Normal Distribution

4.1 Introduction

Suppose that a random vector X follows a p-dimensional normal distribution N_p(μ, Σ), where X, μ, and Σ are partitioned as X = (X_1', X_2', ..., X_k')', μ = (μ_1', μ_2', ..., μ_k')' and

Σ = [ Σ_11  Σ_12  ...  Σ_1k
      Σ_21  Σ_22  ...  Σ_2k
      ..................
      Σ_k1  Σ_k2  ...  Σ_kk ],

where X_i and μ_i are p_i-dimensional vectors and Σ_ij is a p_i × p_j matrix, with i, j = 1, ..., k and Σ_{i=1}^k p_i = p. Consider the hypothesis test that the sub-vectors X_1, X_2, ..., X_k are independent:

H_0 : Σ_ij = 0 (i, j = 1, ..., k; i ≠ j).   (4.1)

Let x_1, ..., x_n be n independent random samples of X, let

x̄ = (1/n) Σ_{i=1}^n x_i = (x̄_1', x̄_2', ..., x̄_k')'

be the sample mean vector, partitioned into k sub-vectors with x̄_i of dimension p_i × 1, and let

A = Σ_{i=1}^n (x_i − x̄)(x_i − x̄)' = [ A_11  A_12  ...  A_1k
                                       A_21  A_22  ...  A_2k
                                       ..................
                                       A_k1  A_k2  ...  A_kk ]   (4.2)

with A_ij of dimension p_i × p_j. Wilks [46] showed that the likelihood ratio statistic for testing (4.1) is

Λ_n = ( det(A) / Π_{i=1}^k det(A_ii) )^{n/2},   (4.3)

and the likelihood ratio test rejects H_0 when Λ_n ≤ c_α, where c_α is chosen so that the significance level is equal to α. Notice that this likelihood ratio test only exists when p ≤ n, because otherwise the matrix A is not of full rank and consequently its determinant is equal to zero.

The distribution of the likelihood ratio statistic Λ_n can be studied through its moments. Define

W_n = Λ_n^{2/n} = det(A) / Π_{i=1}^k det(A_ii).   (4.4)

When the null hypothesis H_0 : Σ_ij = 0 (i, j = 1, ..., k; i ≠ j) is true, the following moment result is from Muirhead [36]:

E( W_n^h ) = ( Γ_p[(n−1)/2 + h] / Γ_p[(n−1)/2] ) Π_{i=1}^k ( Γ_{p_i}[(n−1)/2] / Γ_{p_i}[(n−1)/2 + h] )   (4.5)

for h > (p − n)/2. It was also shown in [36] that under the null hypothesis H_0, the test statistic W_n has the same distribution as

Π_{i=2}^k Π_{j=1}^{p_i} V_ij,   (4.6)

where the V_ij are independent random variables and V_ij follows the Beta( (n − p̄_i − j)/2, p̄_i/2 ) distribution with p̄_i = Σ_{l=1}^{i−1} p_l.

Let

f = (1/2)( p² − Σ_{i=1}^k p_i² ),

ρ = 1 − [ 2( p³ − Σ_{i=1}^k p_i³ ) + 9( p² − Σ_{i=1}^k p_i² ) ] / [ 6n( p² − Σ_{i=1}^k p_i² ) ],

ω_2 = (1/(ρn)²) [ ( p⁴ − Σ_{i=1}^k p_i⁴ )/48 − (5/96)( p² − Σ_{i=1}^k p_i² ) − ( p³ − Σ_{i=1}^k p_i³ )² / ( 72( p² − Σ_{i=1}^k p_i² ) ) ],

and γ = (ρn)² ω_2. When n goes to infinity while all p_i remain fixed, the traditional Chi-square approximation to the limiting distribution of Λ_n (see Section 9.5 of Anderson [2]) is

Pr( −2ρ log Λ_n ≤ x ) = Pr( χ²_f ≤ x ) + (γ/M²)[ Pr( χ²_{f+4} ≤ x ) − Pr( χ²_f ≤ x ) ] + O( M^{−3} ),   (4.7)

where M = ρn. The upper 100α% points of the distribution of −2ρ log Λ_n for α = 0.05 and 0.01 were tabulated by Davis and Field [12].

In the high-dimensional case, Schott [41] studied a related hypothesis test for complete independence of a multivariate normal distribution, i.e., the hypothesis that all off-diagonal entries of the covariance matrix Σ are zero. Schott studied the test statistic

T_np = Σ_{i=2}^p Σ_{j=1}^{i−1} r_ij² − p(p−1)/(2n),   (4.8)

where r_ij is the (i, j) entry of the sample correlation matrix. He proved that if complete independence holds and p/n → c ∈ (0, ∞), then T_np converges in distribution to a normal distribution with mean 0 and variance c².
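Evaluating the right-hand side of (4.7) in practice requires χ² quantiles. When no statistics library is at hand, the classical Wilson–Hilferty cube-root approximation gives them to within a few hundredths for moderate degrees of freedom. The sketch below is our own illustration (not the thesis code; the configuration (p_1, p_2, p_3) = (2, 2, 1), n = 200 is one plausible small case), computing f, ρ and the first-order α = 0.05 critical value of −2ρ log Λ_n.

```python
import math

Z_95 = 1.6448536269514722  # standard normal 0.95 quantile

def chi2_quantile_wh(f, z):
    """Wilson-Hilferty: chi2_f quantile ~ f * (1 - 2/(9f) + z*sqrt(2/(9f)))^3."""
    a = 2.0 / (9.0 * f)
    return f * (1.0 - a + z * math.sqrt(a)) ** 3

def box_f_rho(p_list, n):
    # first-order terms of the Box correction for the independence LRT
    p = sum(p_list)
    s2 = p ** 2 - sum(q ** 2 for q in p_list)
    s3 = p ** 3 - sum(q ** 3 for q in p_list)
    f = s2 / 2.0
    rho = 1.0 - (2.0 * s3 + 9.0 * s2) / (6.0 * n * s2)
    return f, rho

f, rho = box_f_rho([2, 2, 1], 200)
cv = chi2_quantile_wh(f, Z_95)  # reject H0 when -2*rho*log(Lambda_n) exceeds cv
```

The second-order γ term in (4.7) shifts this critical value only slightly when n is large relative to p, since it enters at order M^{−2}.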

4.2 High-dimensional LRT for Testing Independence of Components of a Normal Distribution

In this section, we develop a high-dimensional likelihood ratio test for testing independence of the components of a normal distribution. Our proposed test is based on the following central limit theorem for log W_n, a function of the likelihood ratio statistic Λ_n. The proof of this theorem is deferred to the next section.

THEOREM 4.1 Assume that p_i := p_i(n) (i = 1, ..., k) are sequences of positive integers depending on n such that p = p_1 + ... + p_k < n − 1 for all n ≥ 3 and p_i/n → y_i ∈ (0, 1) as n → ∞ for each 1 ≤ i ≤ k. Let W_n be defined as in (4.4). Then under H_0 : Σ_ij = 0 (i, j = 1, ..., k; i ≠ j), (log W_n − μ_n)/σ_n converges in distribution to N(0, 1) as n → ∞, where

μ_n = (p − n + 1.5) log( 1 − p/(n−1) ) − Σ_{i=1}^k (p_i − n + 1.5) log( 1 − p_i/(n−1) );
σ_n² = −2[ log( 1 − p/(n−1) ) − Σ_{i=1}^k log( 1 − p_i/(n−1) ) ] > 0.

We compare the performance of the high-dimensional likelihood ratio test derived from Theorem 4.1 against the traditional LRT using the Chi-square approximation (4.7). Based on the calculated critical values of the test statistic log W_n under both cases at α = 0.05, we run a simulation study with 10,000 replicates from three independent normal distribution components to estimate the size of the test (i.e., the probability of rejecting the null hypothesis, or alpha error) for different pairs of (p, n). Our results are summarized in Table 4.1 and charted in Figure 4.1.

It can be seen from Table 4.1 and Figure 4.1 that the traditional LRT for independence using the Chi-square approximation only performs well in the low-dimensional cases. When the dimension grows large, the size of the test rises significantly, leading to an alpha error much higher than 0.05. On the contrary, the proposed high-dimensional LRT using Theorem 4.1 always returns a good test size regardless of the dimension. This suggests the superiority of the proposed high-dimensional LRT over the traditional one.
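Representation (4.6) gives a way to examine the null distribution of W_n without generating any normal data: log W_n is a sum of independent log-beta variables. The sketch below is our own check (not the thesis simulation code; `betavariate` is the standard-library beta sampler), drawing log W_n for (p_1, p_2, p_3) = (2, 2, 1) and n = 200 and comparing the empirical mean with μ_n from Theorem 4.1.

```python
import math
import random

def sample_log_Wn(p_list, n, rng):
    """Draw log W_n via (4.6): W_n = prod_{i=2}^k prod_{j=1}^{p_i} V_ij,
    V_ij ~ Beta( (n - pbar_i - j)/2, pbar_i/2 ), pbar_i = p_1 + ... + p_{i-1}."""
    log_w = 0.0
    pbar = p_list[0]
    for i in range(1, len(p_list)):
        for j in range(1, p_list[i] + 1):
            v = rng.betavariate(0.5 * (n - pbar - j), 0.5 * pbar)
            log_w += math.log(v)
        pbar += p_list[i]
    return log_w

def mu_sigma(p_list, n):
    # mu_n and sigma_n from Theorem 4.1, with p = p_1 + ... + p_k
    p = sum(p_list)
    lg = lambda q: math.log(1.0 - q / (n - 1.0))
    mu = (p - n + 1.5) * lg(p) - sum((pi - n + 1.5) * lg(pi) for pi in p_list)
    sigma2 = -2.0 * (lg(p) - sum(lg(pi) for pi in p_list))
    return mu, math.sqrt(sigma2)

rng = random.Random(0)
draws = [sample_log_Wn([2, 2, 1], 200, rng) for _ in range(2000)]
mean = sum(draws) / len(draws)
mu, sigma = mu_sigma([2, 2, 1], 200)
```

Every draw is negative (W_n < 1 almost surely), and the empirical mean lands close to μ_n even though p is tiny here.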

Table 4.1: Sizes and Powers of LRT for Independent Normal Components. For each (p_1, p_2, p_3, n) with n = 200, the table reports the critical value of log W_n at α = 0.05, the size, and the power, for both the traditional LRT and the high-dimensional LRT. [Numeric entries not recovered.] Sizes of the likelihood ratio test of (4.1) are computed based on 10,000 independent applications of the tests with n = 200 samples drawn from N_p(0, I_p). The powers are estimated under the alternative hypothesis that all entries of Σ_ij (i, j = 1, ..., k; i ≠ j) are equal to 0.2.

4.3 Proof of Theorem 4.1

For convenience, set m = n − 1. Recall the notation r_x := [ −log(1 − p/x) ]^{1/2} for x > p, and write r_{m,i} := [ −log(1 − p_i/m) ]^{1/2} for 1 ≤ i ≤ k. Then we need to prove that, as n → ∞,

(log W_n − μ_m)/σ_m converges in distribution to N(0, 1),   (4.9)

where

μ_m = −r_m²( p − m + 0.5 ) + Σ_{i=1}^k r_{m,i}²( p_i − m + 0.5 ) and σ_m² = 2( r_m² − Σ_{i=1}^k r_{m,i}² ).

First, by the assumptions in the theorem,

p/m = Σ_{i=1}^k p_i/m → Σ_{i=1}^k y_i := y ∈ (0, 1]   (4.10)

as n → ∞. Secondly, it is known that Π_{i=1}^k (1 − x_i) > 1 − Σ_{i=1}^k x_i for x_i ∈ (0, 1), 1 ≤ i ≤ k; see, e.g., p. 60 of [19]. Taking the logarithm of both sides and then taking x_i = p_i/m,

Figure 4.1: Sizes and Powers of LRT for Independent Normal Components

we see that

σ_m²/2 = r_m² − Σ_{i=1}^k r_{m,i}² = Σ_{i=1}^k log( 1 − p_i/m ) − log( 1 − p/m ) > 0.   (4.11)

Now, by the assumptions and (4.10), it is easy to see that

lim_{n→∞} σ_m² = { −2 log(1 − y) + 2 Σ_{i=1}^k log(1 − y_i), if y < 1; +∞, if y = 1. }

By the same argument as in the last inequality of (4.11), we know the limit above is always positive. Recalling that m = n − 1 > p, we then set δ_0 := inf{ σ_m ; m ≥ 2 } > 0. Fix |s| < δ_0/2. Set t = t_m = s/σ_m. Then {t_m; m ≥ 2} is bounded and |t_m| < 1/2 for all m ≥ 2. In particular, as n → ∞, we have

t = t_m = O( 1/r_{m,i} ), 1 ≤ i ≤ k,   (4.12)

due to the fact that lim_{n→∞} r_{m,i} = [ −log(1 − y_i) ]^{1/2} ∈ (0, ∞) for 1 ≤ i ≤ k. On the other hand, notice

Σ_{i=1}^k r_{m,i}² = −Σ_{i=1}^k log( 1 − p_i/m ) → −Σ_{i=1}^k log( 1 − y_i )

as n → ∞. It follows from (4.10) that

lim_{n→∞} r_m²/σ_m² = { −log(1−y) / ( −2 log(1−y) + 2 Σ_{i=1}^k log(1−y_i) ), if y ∈ (0, 1); 1/2, if y = 1. }

This implies that

t = s/σ_m = O( 1/r_m )   (4.13)

as n → ∞. Now, using the moment result (4.5),

E e^{t log W_n} = E W_n^t = ( Γ_p[m/2 + t] / Γ_p[m/2] ) Π_{i=1}^k ( Γ_{p_i}[m/2] / Γ_{p_i}[m/2 + t] )   (4.14)

since |t| = |t_m| < 1/2. By Lemma 2.2 and (4.13),

log( Γ_p[m/2 + t] / Γ_p[m/2] ) = pt( log m − 1 − log 2 ) + r_m²[ t² − (p − m + 0.5) t ] + o(1)   (4.15)

as n → ∞. Similarly, by Lemma 2.2 and (4.12),

log( Γ_{p_i}[m/2 + t] / Γ_{p_i}[m/2] ) = p_i t( log m − 1 − log 2 ) + r_{m,i}²[ t² − (p_i − m + 0.5) t ] + o(1)   (4.16)

as n → ∞ for 1 ≤ i ≤ k. Therefore, using the identity p = p_1 + ... + p_k, we have

log Π_{i=1}^k ( Γ_{p_i}[m/2 + t] / Γ_{p_i}[m/2] ) = Σ_{i=1}^k log( Γ_{p_i}[m/2 + t] / Γ_{p_i}[m/2] )
   = pt( log m − 1 − log 2 ) + t² Σ_{i=1}^k r_{m,i}² − t Σ_{i=1}^k r_{m,i}²( p_i − m + 0.5 ) + o(1)

as n → ∞. This together with (4.14) and (4.15) gives

log E W_n^t = t²( r_m² − Σ_{i=1}^k r_{m,i}² ) − t[ r_m²( p − m + 0.5 ) − Σ_{i=1}^k r_{m,i}²( p_i − m + 0.5 ) ]
   + o(1) = (σ_m² t²)/2 + μ_m t + o(1) = s²/2 + (μ_m/σ_m) s + o(1)

as n → ∞, by the definitions of μ_m and σ_m as well as the fact t = s/σ_m. We then arrive at

E exp{ ((log W_n − μ_m)/σ_m) s } = e^{−μ_m s/σ_m} E W_n^t → e^{s²/2}

as n → ∞ for all |s| < δ_0/2. This implies (4.9) by the moment generating function method stated in (3.11).
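The positivity step (4.11) rests on the elementary Weierstrass-type inequality Π(1 − x_i) > 1 − Σ x_i. A short numerical sketch (our own illustration, with hypothetical values) makes both the inequality and the resulting σ_m² > 0 concrete.

```python
import math

def prod_vs_sum(xs):
    # Weierstrass-type inequality: prod(1 - x_i) > 1 - sum(x_i) for x_i in (0, 1)
    prod = 1.0
    for x in xs:
        prod *= (1.0 - x)
    return prod, 1.0 - sum(xs)

def sigma_m2(p_list, m):
    # sigma_m^2 / 2 = sum_i log(1 - p_i/m) - log(1 - p/m), cf. (4.11)
    p = sum(p_list)
    half = sum(math.log(1.0 - pi / m) for pi in p_list) - math.log(1.0 - p / m)
    return 2.0 * half

lhs, rhs = prod_vs_sum([0.1, 0.2, 0.3])  # 0.504 vs 0.4
s2 = sigma_m2([20, 40, 20], 199)
```

Any choice of x_i ∈ (0, 1) behaves the same way, which is exactly why σ_m² stays bounded away from zero in the proof.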

Chapter 5

Testing Equality of Multiple Normal Distributions

5.1 Introduction

Let x_{i1}, ..., x_{in_i} be i.i.d. R^p-valued random vectors from k p-variate normal distributions N_p(μ_i, Σ_i) for i = 1, ..., k, where k is a fixed integer. Consider the hypothesis test that these k normal distributions are identical, i.e.,

H_0 : μ_1 = ... = μ_k and Σ_1 = ... = Σ_k.   (5.1)

Define

A = Σ_{i=1}^k n_i ( x̄_i − x̄ )( x̄_i − x̄ )',   (5.2)

B_i = Σ_{j=1}^{n_i} ( x_ij − x̄_i )( x_ij − x̄_i )',   (5.3)

B = Σ_{i=1}^k B_i,   (5.4)

with

x̄_i = (1/n_i) Σ_{j=1}^{n_i} x_ij,  x̄ = (1/n) Σ_{i=1}^k n_i x̄_i,  n = Σ_{i=1}^k n_i.   (5.5)

The likelihood ratio statistic for testing (5.1) was first derived by Wilks [45]:

Λ_n = ( Π_{i=1}^k det(B_i)^{n_i/2} / det(A + B)^{n/2} ) · ( n^{np/2} / Π_{i=1}^k n_i^{n_i p/2} ),   (5.6)

and the likelihood ratio test rejects the null hypothesis H_0 if Λ_n ≤ c_α, where the critical value c_α is determined so that the significance level of the test is equal to α. Note that when p > min_{1≤i≤k} n_i, the matrices B_i (i = 1, 2, ..., k) are not of full rank, and consequently their determinants are equal to zero, as is the likelihood ratio statistic Λ_n. Therefore, this likelihood ratio test of (5.1) only exists when p ≤ min_{1≤i≤k} n_i.

Anderson suggested (see Section 10.3 of [2]) using a modified test statistic Λ*_n, which is identical to the likelihood ratio statistic Λ_n of (5.6) except that each n_i is replaced by n_i − 1 and n is replaced by n − k:

Λ*_n = ( Π_{i=1}^k det(B_i)^{(n_i−1)/2} / det(A + B)^{(n−k)/2} ) · ( (n−k)^{(n−k)p/2} / Π_{i=1}^k (n_i − 1)^{(n_i−1)p/2} ).   (5.7)

However, according to Perlman [39], it is the likelihood ratio statistic Λ_n, not the modified statistic Λ*_n, that gives an unbiased test of (5.1).

The distribution of Λ_n can be studied through its moments. For notational convenience, define

λ_n := Λ_n · ( Π_{i=1}^k n_i^{n_i p/2} / n^{np/2} ) = Π_{i=1}^k det(B_i)^{n_i/2} / det(A + B)^{n/2}.   (5.8)

The general expression for the moments of λ_n was derived in [36] with the use of hypergeometric functions (see also Section 10.4 of [2] for the moment results of the modified statistic Λ*_n). When the null hypothesis (5.1) is true, this expression simplifies considerably to

E( λ_n^h ) = ( Γ_p[(n−1)/2] / Γ_p[ n(1+h)/2 − 1/2 ] ) Π_{i=1}^k ( Γ_p[ n_i(1+h)/2 − 1/2 ] / Γ_p[(n_i−1)/2] )   (5.9)

for h > max_{1≤i≤k}{ p/n_i } − 1.
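The moment formula (5.9) is expressed through the multivariate gamma function Γ_p, which factors into ordinary gamma functions and can therefore be evaluated on the log scale with `math.lgamma`. The sketch below is our own illustration (function names are ours): it implements log Γ_p and the log-moment log E(λ_n^h) from (5.9); at h = 0 the expression collapses to 0, which serves as a sanity check.

```python
import math

def lgamma_p(a, p):
    """log of the multivariate gamma function:
    Gamma_p(a) = pi^{p(p-1)/4} * prod_{j=1}^p Gamma(a - (j-1)/2)."""
    return (p * (p - 1) / 4.0) * math.log(math.pi) + sum(
        math.lgamma(a - (j - 1) / 2.0) for j in range(1, p + 1))

def log_moment(h, p, n_list):
    """log E(lambda_n^h) under H0, per (5.9); requires h > max(p/n_i) - 1."""
    n = sum(n_list)
    val = lgamma_p((n - 1) / 2.0, p) - lgamma_p(n * (1 + h) / 2.0 - 0.5, p)
    for ni in n_list:
        val += lgamma_p(ni * (1 + h) / 2.0 - 0.5, p) - lgamma_p((ni - 1) / 2.0, p)
    return val

m0 = log_moment(0.0, 5, [200, 200, 200])      # exactly 0: E(lambda_n^0) = 1
m_small = log_moment(0.01, 5, [200, 200, 200])
```

Because λ_n < 1 almost surely (Λ_n ≤ 1 and the sample-size factor in (5.8) is below 1), log E(λ_n^h) is negative for small h > 0.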

When the dimension p is considered fixed, the following asymptotic expansion of the distribution function of −2 log Λ_n under the null hypothesis (5.1) is from [36]. Let

f = (1/2) p(k−1)(p+3) and ρ = 1 − [ (2p² + 9p + 11) / ( 6(k−1)(p+3)n ) ] ( Σ_{i=1}^k n/n_i − 1 ).

Then

Pr( −2ρ log Λ_n ≤ x ) = Pr( χ²_f ≤ x ) + (γ/M²)[ Pr( χ²_{f+4} ≤ x ) − Pr( χ²_f ≤ x ) ] + O( M^{−3} ),   (5.10)

where M = ρn and the second-order coefficient γ = (ρn)² ω_2 depends only on p, k and n_1, ..., n_k; its explicit expression is given in [36]. A similar expansion of the distribution function of −2ρ log Λ_n, with a higher-order error term in M, is given in Section 10.5 of Anderson [2]. The modified test statistic Λ*_n was studied more thoroughly by Lee, Chang and Krishnaiah [30], with the upper percentage points of its limiting distribution tabulated.

5.2 High-dimensional Likelihood Ratio Test for Equality of Multiple Normal Distributions

In this section, we develop the likelihood ratio test for testing equality of multiple normal distributions in the high-dimensional case, i.e., p/n_i → y_i ∈ (0, 1]. Our proposed test is based on the following central limit theorem for the likelihood ratio statistic log Λ_n under the null hypothesis (5.1). The proof of this theorem is deferred to the next section.

THEOREM 5.1 Assume that n_i := n_i(p) (i = 1, ..., k) are sequences of positive integers depending on p such that min_{1≤i≤k} n_i > 1 + p for all p ≥ 1 and p/n_i → y_i ∈ (0, 1] as p → ∞ for each 1 ≤ i ≤ k. Let n = n_1 + ... + n_k and Λ_n be defined as in (5.6). Then under H_0 : μ_1 = ... = μ_k and Σ_1 = ... = Σ_k, (log Λ_n − μ_n)/(nσ_n) converges in distribution to N(0, 1) as p → ∞, where

μ_n = (1/4)[ −2kp + n(2n − 2p − 3) log( 1 − p/(n−k) ) − Σ_{i=1}^k n_i(2n_i − 2p − 3) log( 1 − p/(n_i − 1) ) ],
σ_n² = (1/2)[ log( 1 − p/(n−k) ) − Σ_{i=1}^k (n_i/n)² log( 1 − p/(n_i − 1) ) ] > 0.

We compare the performance of the likelihood ratio test of (5.1) using Theorem 5.1 against the traditional Chi-square approximation (5.10). The critical values of the test statistic at the 0.05 significance level are calculated under both cases, and a simulation study with 10,000 replicates from three normal distributions is performed to estimate the actual sizes and powers of the tests (i.e., the probability of rejecting the null hypothesis) for different pairs of (p, n_i). Our results are summarized in Table 5.1 and charted in Figure 5.1.

It can be seen from Table 5.1 and Figure 5.1 that when p is small, e.g., p = 5, 10, and 50 in our simulation, the traditional LRT using the Chi-square approximation (5.10) has a size that matches the significance level of 0.05 very well. In these cases, the proposed high-dimensional LRT using Theorem 5.1 also shows acceptable sizes, yet slightly higher than 0.05. When p is large, our simulation shows that the traditional LRT using (5.10) rejects H_0 with a much higher probability than 0.05. In particular, when p = 150, 190, and 198, the traditional LRT always rejects H_0 in our simulation, leading to a 100% alpha error. However, in these cases, the size of the proposed high-dimensional LRT using Theorem 5.1 is still close to 0.05. On the power side, our simulation shows that the traditional LRT and the high-dimensional one have comparable powers when p is small. However, when p becomes large, the power of

Table 5.1: Sizes and Powers of LRT for Equality of Multiple Normal Distributions (k = 3). For each (p, n_i) with p = 5, 10, 50, 100, 150, 190, 198 and n_i = 200, the table reports the critical value of log Λ_n at α = 0.05, the size, and the power, for both the traditional LRT and the high-dimensional LRT. [Numeric entries not recovered.] Sizes (or alpha errors) are computed based on 10,000 independent applications of the tests with n_i = 200 samples drawn from three normal distributions with zero mean and covariance matrix whose on-diagonal entries equal 1 and off-diagonal entries equal 0.5. The powers were estimated under the alternative hypothesis that μ_1 = (0, ..., 0)', Σ_1 = 0.5·1_p + 0.5·I_p; μ_2 = (0.2, ..., 0.2)', Σ_2 = 0.4·1_p + 0.6·I_p; μ_3 = (0.4, ..., 0.4)', Σ_3 = 0.3·1_p + 0.7·I_p.

the traditional LRT goes up to 100% due to the failure of the test (100% alpha error) in these cases, while the power of the high-dimensional LRT stays valid. In summary, our simulation indicates that the proposed high-dimensional LRT using Theorem 5.1 is non-inferior to the traditional LRT when p is small, and outperforms the traditional one when p becomes large.

5.3 Proof of Theorem 5.1

According to (5.8),

log Λ_n = log λ_n + (1/2) p n log n − (1/2) Σ_{i=1}^k p n_i log n_i.   (5.11)

Evidently, as p → ∞,

p/n = p / Σ_{i=1}^k n_i → ( Σ_{i=1}^k 1/y_i )^{−1} := y ∈ (0, 1).   (5.12)

Recall that r_n = [ −log(1 − p/n) ]^{1/2}. Therefore, as p → ∞,

r_n → [ −log(1 − y) ]^{1/2} ∈ (0, ∞).   (5.13)
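The quantities appearing in Theorem 5.1 are straightforward to evaluate numerically. The sketch below is our own illustration (names hypothetical; the hard-coded constant is the standard normal 0.95 quantile), computing μ_n, nσ_n and the one-sided α = 0.05 critical value μ_n − z_{0.95}·nσ_n of log Λ_n for the k = 3, n_i = 200 settings of Table 5.1.

```python
import math

Z_95 = 1.6448536269514722  # standard normal 0.95 quantile

def equality_mu_nsigma(p, n_list):
    """mu_n and n*sigma_n from Theorem 5.1 for k = len(n_list) populations."""
    k = len(n_list)
    n = sum(n_list)
    lg_all = math.log(1.0 - p / float(n - k))
    mu = -2.0 * k * p + n * (2 * n - 2 * p - 3) * lg_all
    s2 = lg_all
    for ni in n_list:
        lg_i = math.log(1.0 - p / (ni - 1.0))
        mu -= ni * (2 * ni - 2 * p - 3) * lg_i
        s2 -= (float(ni) / n) ** 2 * lg_i
    return mu / 4.0, n * math.sqrt(s2 / 2.0)

# reject H0 at level 0.05 when log Lambda_n falls below the critical value
cvs = {}
for p in (5, 10, 50, 100, 150, 190, 198):
    mu, nsig = equality_mu_nsigma(p, [200, 200, 200])
    cvs[p] = mu - Z_95 * nsig
```

The centering μ_n grows rapidly in magnitude with p, which is why the fixed-p Chi-square calibration breaks down in the high-dimensional regime.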


More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

Chapter 5. Chapter 5 sections

Chapter 5. Chapter 5 sections 1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

An Introduction to Multivariate Statistical Analysis

An Introduction to Multivariate Statistical Analysis An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents

More information

Lecture 2: Review of Basic Probability Theory

Lecture 2: Review of Basic Probability Theory ECE 830 Fall 2010 Statistical Signal Processing instructor: R. Nowak, scribe: R. Nowak Lecture 2: Review of Basic Probability Theory Probabilistic models will be used throughout the course to represent

More information

Testing Block-Diagonal Covariance Structure for High-Dimensional Data

Testing Block-Diagonal Covariance Structure for High-Dimensional Data Testing Block-Diagonal Covariance Structure for High-Dimensional Data MASASHI HYODO 1, NOBUMICHI SHUTOH 2, TAKAHIRO NISHIYAMA 3, AND TATJANA PAVLENKO 4 1 Department of Mathematical Sciences, Graduate School

More information

Testing Some Covariance Structures under a Growth Curve Model in High Dimension

Testing Some Covariance Structures under a Growth Curve Model in High Dimension Department of Mathematics Testing Some Covariance Structures under a Growth Curve Model in High Dimension Muni S. Srivastava and Martin Singull LiTH-MAT-R--2015/03--SE Department of Mathematics Linköping

More information

On the Testing and Estimation of High- Dimensional Covariance Matrices

On the Testing and Estimation of High- Dimensional Covariance Matrices Clemson University TigerPrints All Dissertations Dissertations 1-009 On the Testing and Estimation of High- Dimensional Covariance Matrices Thomas Fisher Clemson University, thomas.j.fisher@gmail.com Follow

More information

On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors

On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors Takashi Seo a, Takahiro Nishiyama b a Department of Mathematical Information Science, Tokyo University

More information

The Multinomial Model

The Multinomial Model The Multinomial Model STA 312: Fall 2012 Contents 1 Multinomial Coefficients 1 2 Multinomial Distribution 2 3 Estimation 4 4 Hypothesis tests 8 5 Power 17 1 Multinomial Coefficients Multinomial coefficient

More information

Mathematical Statistics

Mathematical Statistics Mathematical Statistics MAS 713 Chapter 8 Previous lecture: 1 Bayesian Inference 2 Decision theory 3 Bayesian Vs. Frequentist 4 Loss functions 5 Conjugate priors Any questions? Mathematical Statistics

More information

STAT FINAL EXAM

STAT FINAL EXAM STAT101 2013 FINAL EXAM This exam is 2 hours long. It is closed book but you can use an A-4 size cheat sheet. There are 10 questions. Questions are not of equal weight. You may need a calculator for some

More information

Nonconcave Penalized Likelihood with A Diverging Number of Parameters

Nonconcave Penalized Likelihood with A Diverging Number of Parameters Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized

More information

Week 1 Quantitative Analysis of Financial Markets Distributions A

Week 1 Quantitative Analysis of Financial Markets Distributions A Week 1 Quantitative Analysis of Financial Markets Distributions A Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 October

More information

A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data

A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data Yujun Wu, Marc G. Genton, 1 and Leonard A. Stefanski 2 Department of Biostatistics, School of Public Health, University of Medicine

More information

A Few Special Distributions and Their Properties

A Few Special Distributions and Their Properties A Few Special Distributions and Their Properties Econ 690 Purdue University Justin L. Tobias (Purdue) Distributional Catalog 1 / 20 Special Distributions and Their Associated Properties 1 Uniform Distribution

More information

Empirical properties of large covariance matrices in finance

Empirical properties of large covariance matrices in finance Empirical properties of large covariance matrices in finance Ex: RiskMetrics Group, Geneva Since 2010: Swissquote, Gland December 2009 Covariance and large random matrices Many problems in finance require

More information

TEST FOR INDEPENDENCE OF THE VARIABLES WITH MISSING ELEMENTS IN ONE AND THE SAME COLUMN OF THE EMPIRICAL CORRELATION MATRIX.

TEST FOR INDEPENDENCE OF THE VARIABLES WITH MISSING ELEMENTS IN ONE AND THE SAME COLUMN OF THE EMPIRICAL CORRELATION MATRIX. Serdica Math J 34 (008, 509 530 TEST FOR INDEPENDENCE OF THE VARIABLES WITH MISSING ELEMENTS IN ONE AND THE SAME COLUMN OF THE EMPIRICAL CORRELATION MATRIX Evelina Veleva Communicated by N Yanev Abstract

More information

Detection of structural breaks in multivariate time series

Detection of structural breaks in multivariate time series Detection of structural breaks in multivariate time series Holger Dette, Ruhr-Universität Bochum Philip Preuß, Ruhr-Universität Bochum Ruprecht Puchstein, Ruhr-Universität Bochum January 14, 2014 Outline

More information

On testing the equality of mean vectors in high dimension

On testing the equality of mean vectors in high dimension ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 17, Number 1, June 2013 Available online at www.math.ut.ee/acta/ On testing the equality of mean vectors in high dimension Muni S.

More information

3. (a) (8 points) There is more than one way to correctly express the null hypothesis in matrix form. One way to state the null hypothesis is

3. (a) (8 points) There is more than one way to correctly express the null hypothesis in matrix form. One way to state the null hypothesis is Stat 501 Solutions and Comments on Exam 1 Spring 005-4 0-4 1. (a) (5 points) Y ~ N, -1-4 34 (b) (5 points) X (X,X ) = (5,8) ~ N ( 11.5, 0.9375 ) 3 1 (c) (10 points, for each part) (i), (ii), and (v) are

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Canonical Edps/Soc 584 and Psych 594 Applied Multivariate Statistics Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Canonical Slide

More information

ABOUT PRINCIPAL COMPONENTS UNDER SINGULARITY

ABOUT PRINCIPAL COMPONENTS UNDER SINGULARITY ABOUT PRINCIPAL COMPONENTS UNDER SINGULARITY José A. Díaz-García and Raúl Alberto Pérez-Agamez Comunicación Técnica No I-05-11/08-09-005 (PE/CIMAT) About principal components under singularity José A.

More information

STA 2101/442 Assignment 2 1

STA 2101/442 Assignment 2 1 STA 2101/442 Assignment 2 1 These questions are practice for the midterm and final exam, and are not to be handed in. 1. A polling firm plans to ask a random sample of registered voters in Quebec whether

More information

Optimal primitive sets with restricted primes

Optimal primitive sets with restricted primes Optimal primitive sets with restricted primes arxiv:30.0948v [math.nt] 5 Jan 203 William D. Banks Department of Mathematics University of Missouri Columbia, MO 652 USA bankswd@missouri.edu Greg Martin

More information

Supplemental Material 1 for On Optimal Inference in the Linear IV Model

Supplemental Material 1 for On Optimal Inference in the Linear IV Model Supplemental Material 1 for On Optimal Inference in the Linear IV Model Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Vadim Marmer Vancouver School of Economics University

More information

Analysis of variance, multivariate (MANOVA)

Analysis of variance, multivariate (MANOVA) Analysis of variance, multivariate (MANOVA) Abstract: A designed experiment is set up in which the system studied is under the control of an investigator. The individuals, the treatments, the variables

More information

Statistics 3858 : Maximum Likelihood Estimators

Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Takahiro Nishiyama a,, Masashi Hyodo b, Takashi Seo a, Tatjana Pavlenko c a Department of Mathematical

More information

Chapter 5 continued. Chapter 5 sections

Chapter 5 continued. Chapter 5 sections Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

T 2 Type Test Statistic and Simultaneous Confidence Intervals for Sub-mean Vectors in k-sample Problem

T 2 Type Test Statistic and Simultaneous Confidence Intervals for Sub-mean Vectors in k-sample Problem T Type Test Statistic and Simultaneous Confidence Intervals for Sub-mean Vectors in k-sample Problem Toshiki aito a, Tamae Kawasaki b and Takashi Seo b a Department of Applied Mathematics, Graduate School

More information

EPMC Estimation in Discriminant Analysis when the Dimension and Sample Sizes are Large

EPMC Estimation in Discriminant Analysis when the Dimension and Sample Sizes are Large EPMC Estimation in Discriminant Analysis when the Dimension and Sample Sizes are Large Tetsuji Tonda 1 Tomoyuki Nakagawa and Hirofumi Wakaki Last modified: March 30 016 1 Faculty of Management and Information

More information

Prepared by: Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti

Prepared by: Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti Prepared by: Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti Putra Malaysia Serdang Use in experiment, quasi-experiment

More information

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data Sri Lankan Journal of Applied Statistics (Special Issue) Modern Statistical Methodologies in the Cutting Edge of Science Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations

More information

Lecture 11. Multivariate Normal theory

Lecture 11. Multivariate Normal theory 10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances

More information

Determinants of Partition Matrices

Determinants of Partition Matrices journal of number theory 56, 283297 (1996) article no. 0018 Determinants of Partition Matrices Georg Martin Reinhart Wellesley College Communicated by A. Hildebrand Received February 14, 1994; revised

More information

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y

More information

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 71. Decide in each case whether the hypothesis is simple

More information

Partitioned Covariance Matrices and Partial Correlations. Proposition 1 Let the (p + q) (p + q) covariance matrix C > 0 be partitioned as C = C11 C 12

Partitioned Covariance Matrices and Partial Correlations. Proposition 1 Let the (p + q) (p + q) covariance matrix C > 0 be partitioned as C = C11 C 12 Partitioned Covariance Matrices and Partial Correlations Proposition 1 Let the (p + q (p + q covariance matrix C > 0 be partitioned as ( C11 C C = 12 C 21 C 22 Then the symmetric matrix C > 0 has the following

More information

PCA with random noise. Van Ha Vu. Department of Mathematics Yale University

PCA with random noise. Van Ha Vu. Department of Mathematics Yale University PCA with random noise Van Ha Vu Department of Mathematics Yale University An important problem that appears in various areas of applied mathematics (in particular statistics, computer science and numerical

More information

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

THE EXPONENTIAL DISTRIBUTION ANALOG OF THE GRUBBS WEAVER METHOD

THE EXPONENTIAL DISTRIBUTION ANALOG OF THE GRUBBS WEAVER METHOD THE EXPONENTIAL DISTRIBUTION ANALOG OF THE GRUBBS WEAVER METHOD ANDREW V. SILLS AND CHARLES W. CHAMP Abstract. In Grubbs and Weaver (1947), the authors suggest a minimumvariance unbiased estimator for

More information

1 Data Arrays and Decompositions

1 Data Arrays and Decompositions 1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is

More information

Multivariate Linear Models

Multivariate Linear Models Multivariate Linear Models Stanley Sawyer Washington University November 7, 2001 1. Introduction. Suppose that we have n observations, each of which has d components. For example, we may have d measurements

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

Parameter Estimation and Fitting to Data

Parameter Estimation and Fitting to Data Parameter Estimation and Fitting to Data Parameter estimation Maximum likelihood Least squares Goodness-of-fit Examples Elton S. Smith, Jefferson Lab 1 Parameter estimation Properties of estimators 3 An

More information

Experimental designs for multiple responses with different models

Experimental designs for multiple responses with different models Graduate Theses and Dissertations Graduate College 2015 Experimental designs for multiple responses with different models Wilmina Mary Marget Iowa State University Follow this and additional works at:

More information

Lecture 21. Hypothesis Testing II

Lecture 21. Hypothesis Testing II Lecture 21. Hypothesis Testing II December 7, 2011 In the previous lecture, we dened a few key concepts of hypothesis testing and introduced the framework for parametric hypothesis testing. In the parametric

More information

Near-exact approximations for the likelihood ratio test statistic for testing equality of several variance-covariance matrices

Near-exact approximations for the likelihood ratio test statistic for testing equality of several variance-covariance matrices Near-exact approximations for the likelihood ratio test statistic for testing euality of several variance-covariance matrices Carlos A. Coelho 1 Filipe J. Marues 1 Mathematics Department, Faculty of Sciences

More information