Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions

Tiefeng Jiang and Fan Yang, University of Minnesota

Abstract. For random samples of size n obtained from p-variate normal distributions, we consider the classical likelihood ratio tests (LRT) for their means and covariance matrices in the high-dimensional setting. These test statistics have been extensively studied in multivariate analysis, and their limiting distributions under the null hypothesis were proved to be chi-square distributions as n goes to infinity with p fixed. In this paper, we consider the high-dimensional case where both p and n go to infinity with p/n → y ∈ (0, 1]. We prove that under this assumption the likelihood ratio test statistics converge in distribution to normal distributions with explicit means and variances. We carry out a simulation study showing that likelihood ratio tests based on our central limit theorems outperform those based on the traditional chi-square approximations for analyzing high-dimensional data.

Keywords: likelihood ratio test, central limit theorem, high-dimensional data, multivariate normal distribution, hypothesis test, covariance matrix, mean vector, multivariate Gamma function.

AMS 2000 Subject Classification: Primary 62H15; secondary 62H10.

Affiliations: School of Statistics, University of Minnesota, 224 Church Street S.E., Minneapolis, MN 55455, USA, jiang040@umn.edu; Boston Scientific, 1 Scimed Place, Maple Grove, MN 55311, USA, yangf@bsci.com. The research of Tiefeng Jiang was supported in part by NSF FRG Grant DMS and NSF Grant DMS.

1 Introduction

Traditional statistical theory, particularly multivariate analysis, does not contemplate the demands of high dimensionality in data analysis, a setting that earlier technology rarely produced. Consequently, tests of hypotheses and many other procedures in classical textbooks of multivariate analysis, such as Anderson (1958), Muirhead (1982) and Eaton (1983), are developed under the assumption that the dimension of the data, denoted by p, is a fixed small constant, or at least negligible compared with the sample size n. However, this assumption no longer holds for many modern datasets, whose dimensions can be proportionally large compared with the sample size; financial data, consumer data, modern manufacturing data and multimedia data all share this feature. More examples of high-dimensional data can be found in Donoho (2000) and Johnstone (2001). Recently, Bai et al. (2009) developed corrections to the traditional likelihood ratio test (LRT) to make it suitable for testing, for a high-dimensional normal distribution N_p(μ, Σ), the hypothesis H_0: Σ = I_p vs H_a: Σ ≠ I_p. The test statistic is chosen to be L_n := tr(S) − log|S| − p, where S is the sample covariance matrix of the data. In their derivation, the dimension p is no longer a fixed constant, but a variable that goes to infinity along with the sample size n, with the ratio between p = p_n and n converging to a constant y, i.e., lim_{n→∞} p_n/n = y ∈ (0, 1). Jiang et al. (2012) further extend Bai's result to cover the case y = 1. In this paper, we study several other classical likelihood ratio tests for means and covariance matrices of high-dimensional normal distributions. Most of these tests have asymptotic results for their test statistics that were derived decades ago under the assumption of large n but fixed p.
Our results supplement these traditional results by providing alternatives for analyzing high-dimensional datasets, including the critical case p/n → 1. We briefly introduce these likelihood ratio tests next. In Section 2, for each LRT described below, we first review the existing literature and then give our central limit theorem (CLT) for the case where the dimension and the sample size are comparable. We also provide graphs and tables on the sizes and powers of these CLTs based on our simulation study to show that, when both p and n are large, the traditional chi-square approximation behaves poorly while our CLTs improve the approximation substantially. In Section 2.1, for the normal distribution N_p(μ, Σ), we study the sphericity test H_0: Σ = λI_p vs H_a: Σ ≠ λI_p with λ unspecified. We derive the central limit theorem for the LRT statistic when p/n → y ∈ (0, 1]; its proof is given in Section 5.2. In Section 2.2, we derive the CLT for the LRT statistic in testing that several components of a vector with distribution N_p(μ, Σ) are independent. The proof is presented in Section 5.3.

In Section 2.3, we consider the LRT with H_0: N_p(μ_1, Σ_1) = ⋯ = N_p(μ_k, Σ_k), that is, that several normal distributions are identical. We prove a CLT for the LRT statistic under the assumption p/n_i → y_i ∈ (0, 1], where n_i is the sample size of the data set from N_p(μ_i, Σ_i) for i = 1, 2, …, k. The proof of the theorem appears in Section 5.4. In Section 2.4, the test of the equality of the covariance matrices of several normal distributions is studied, that is, H_0: Σ_1 = ⋯ = Σ_k. The LRT statistic is evaluated under the assumption p/n_i → y_i ∈ (0, 1] for i = 1, …, k. This generalizes the work of Bai et al. (2009) and Jiang et al. (2012) from k = 2 to any k ≥ 2. The proof of our result is given in Section 5.5. In Section 2.5, we investigate the LRT with H_0: μ = 0, Σ = I_p for the population distribution N_p(μ, Σ). With the dimension p and the sample size n satisfying p/n → y ∈ (0, 1], we derive the CLT for the LRT statistic; the corresponding theorem is proved in Section 5.6. In Section 2.6, we study the test that the population correlation matrix of a normal distribution is equal to an identity matrix, that is, that all of the components of a normal vector are independent but not necessarily identically distributed. This differs from the test in Section 2.2 that several (multivariate) components of a normal vector are independent. The proof is presented in Section 5.7. In Sections 3 and 4, we present some simulation results, describe our method of proof and conclude by offering some open problems. One can see that the value of y = lim p/n or y_i = lim p/n_i introduced above is restricted to the range y ≤ 1. In fact, when y > 1, some matrices involved in the LRT statistics do not have full rank, and consequently their determinants are equal to zero; as a result, the LRT statistics are not defined. To our knowledge, the central limit theorems for the LRT statistics mentioned above in the regime p/n → y ∈ (0, 1] are new in the literature. The most similar research is Bai et al. (2009) and Jiang et al. (2012).
The methods of proof in the three papers are different: Random Matrix Theory is used in Bai et al. (2009), and the Selberg integral is used in Jiang et al. (2012); here we obtain the central limit theorems by analyzing the moments of the LRT statistics. The rest of the paper is organized as follows. In Section 2, we give the details for each of the six tests described above. A simulation study on the sizes and powers of these tests is presented in Section 3. A discussion is given in Section 4. The theorems appearing in each section are proved in Section 5. An auxiliary result on complex analysis is proved in the Appendix.

2 Main Results

In this section we present the central limit theorems for the six classical LRT statistics mentioned in the Introduction, one in each of the following six subsections.

2.1 Testing Covariance Matrices of Normal Distributions Proportional to Identity Matrix

For the distribution N_p(μ, Σ), we consider the sphericity test

H_0: Σ = λI_p  vs  H_a: Σ ≠ λI_p    (2.1)

with λ unspecified. Let x_1, …, x_n be i.i.d. R^p-valued random variables with normal distribution N_p(μ, Σ). Recall

x̄ = (1/n) Σ_{i=1}^n x_i  and  S = (1/n) Σ_{i=1}^n (x_i − x̄)(x_i − x̄)'.    (2.2)

The likelihood ratio test statistic for (2.1) was first derived by Mauchly (1940) as

V_n = |S| / (tr(S)/p)^p.    (2.3)

By Theorem 3.1.2 and its corollary in Muirhead (1982), under H_0 in (2.1),

(n/λ) S  and  Z'Z  have the same distribution,    (2.4)

where Z := (z_ij)_{(n−1)×p} and the z_ij's are i.i.d. with distribution N(0, 1). This says that, with probability one, S is not of full rank when p ≥ n, and consequently |S| = 0. Hence the likelihood ratio test of (2.1) only exists when p ≤ n − 1. The statistic V_n is commonly known as the ellipticity statistic. Gleser (1966) shows that the likelihood ratio test with rejection region {V_n ≤ c_α}, where c_α is chosen so that the test has significance level α, is unbiased. A classical asymptotic result states that

−(n − 1) ρ log V_n converges to χ²_f    (2.5)

in distribution as n → ∞ with p fixed, where

ρ = 1 − (2p² + p + 2) / (6(n − 1)p)  and  f = (1/2)(p − 1)(p + 2).    (2.6)

This can be seen from, for example, Muirhead (1982, Section 8.3), the Slutsky lemma and the fact that ρ = ρ_n → 1 as n → ∞ with p fixed. The quantity ρ is a correction term that improves the convergence rate. Now we consider the case when both n and p are large. For clarity in taking limits, let p = p_n, that is, p depends on n.
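As a concrete illustration of how Theorem 1 below can be used in practice, the following sketch (our code, not part of the paper; the function name is ours, and the centering and scaling formulas are transcribed from Theorem 1) computes log V_n of (2.3) and its normalized version from an n × p data matrix:

```python
import numpy as np

def sphericity_clt_stat(x):
    """(log V_n - mu_n) / sigma_n for the sphericity test; x is (n, p), n > p + 1."""
    n, p = x.shape
    xc = x - x.mean(axis=0)                # center the sample
    S = xc.T @ xc / n                      # sample covariance as in (2.2)
    log_Vn = np.linalg.slogdet(S)[1] - p * np.log(np.trace(S) / p)  # log of (2.3)
    c = np.log1p(-p / (n - 1))             # log(1 - p/(n-1))
    mu_n = -p - (n - p - 1.5) * c          # centering from Theorem 1
    sigma2 = -2.0 * (p / (n - 1) + c)      # squared scaling from Theorem 1, > 0
    return (log_Vn - mu_n) / np.sqrt(sigma2)

rng = np.random.default_rng(0)
z = sphericity_clt_stat(rng.standard_normal((100, 60)))   # H_0 holds here
```

Under H_0 the returned value is approximately standard normal even when p/n is close to 1, which is the point of Theorem 1; a level-α test rejects when the statistic falls below −z_α.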

THEOREM 1. Let n > p + 1 for all n ≥ 3 and let V_n be as in (2.3). Assume lim_{n→∞} p/n = y ∈ (0, 1]. Then, under H_0 in (2.1), (log V_n − μ_n)/σ_n converges in distribution to N(0, 1) as n → ∞, where

μ_n = −p − (n − p − 3/2) log(1 − p/(n − 1))  and  σ_n² = −2 [ p/(n − 1) + log(1 − p/(n − 1)) ] > 0.

As discussed below (2.4), the LRT exists as long as n ≥ p + 1; however, we need the slightly stronger condition n > p + 1 because of the definition of σ_n². Though λ in (2.1) is unspecified, the limiting distribution in Theorem 1 is pivotal, that is, it does not depend on λ. This is because λ cancels in the expression for V_n in (2.3): |αS| = α^p |S| and (tr(αS)/p)^p = α^p (tr(S)/p)^p for any α > 0. Simulations were run for the approximation in (2.5) and the CLT in Theorem 1; a summary is given in Figure 1. It is seen from Figure 1 that the approximation in (2.5) becomes poorer as p becomes larger relative to n, while at the same time the CLT in Theorem 1 becomes more precise. In fact, the chi-square approximation in (2.5) is far from reasonable when p is large: the χ² curve and the histogram, which are supposed to match, separate from each other as p increases. See the caption of Figure 1 for more details. The sizes and powers of the tests based on (2.5) and on Theorem 1 are estimated from our simulation and summarized in Table 1 in Section 3, where a further analysis of these results is also presented. Finally, when p ≥ n, the LRT does not exist, as mentioned above. There are recent works choosing other statistics to study the sphericity test (2.1); see, for example, Ledoit and Wolf (2002) and Chen, Zhang and Zhong (2010).

2.2 Testing Independence of Components of Normal Distributions

Let k ≥ 2 and p_1, …, p_k be positive integers. Denote p = p_1 + ⋯ + p_k and let

Σ = (Σ_ij)_{p×p}    (2.7)

be a positive definite matrix, where Σ_ij is a p_i × p_j sub-matrix for all 1 ≤ i, j ≤ k. Let N_p(μ, Σ) be a p-dimensional normal distribution.
We test

H_0: Σ_ij = 0 for all 1 ≤ i < j ≤ k  vs  H_a: H_0 is not true.    (2.8)

In other words, H_0 is equivalent to saying that ξ_1, …, ξ_k are independent, where (ξ_1, …, ξ_k) has the distribution N_p(μ, Σ) and ξ_i ∈ R^{p_i} for 1 ≤ i ≤ k. Let x_1, …, x_N be i.i.d. with distribution N_p(μ, Σ). Set n = N − 1. Let S be the covariance matrix as in (2.2). Now we

partition A := nS in the following way:

A = (A_ij)_{1 ≤ i, j ≤ k},

where A_ij is a p_i × p_j matrix. Wilks (1935) shows that the likelihood ratio statistic for testing (2.8) is given by

Λ_n = ( |A| / Π_{i=1}^k |A_ii| )^{(n+1)/2} := W_n^{(n+1)/2};    (2.9)

see also the corresponding theorem in Muirhead (1982). Notice that W_n = 0 if p > N = n + 1, since the matrix A is then not of full rank. From (2.9), we know that the LRT of level α for testing H_0 in (2.8) has rejection region {Λ_n ≤ c_α} = {W_n ≤ c_α'}. Set

f = (1/2) ( p² − Σ_{i=1}^k p_i² )  and  ρ = 1 − ( 2(p³ − Σ_{i=1}^k p_i³) + 9(p² − Σ_{i=1}^k p_i²) ) / ( 6n(p² − Σ_{i=1}^k p_i²) ).

Figure 1: Comparison between Theorem 1 and (2.5). We choose n = 100 with p = 5, 30, 60, 90. The pictures in the top row show that the χ² curves stay farther away from the histogram of −(n − 1)ρ log V_n as p grows. The bottom row shows that the N(0, 1) curve fits the histogram of (log V_n − μ_n)/σ_n better as p becomes larger.
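In code, W_n of (2.9) can be computed directly from the diagonal blocks of A; the sketch below (our illustration, not from the paper; the block sizes and data are arbitrary) works with log-determinants for numerical stability:

```python
import numpy as np

def log_Wn(x, block_sizes):
    """log W_n = log det(A) - sum_i log det(A_ii), A the centered cross-product matrix.

    x is (N, p) with p = sum(block_sizes); requires N - 1 > p."""
    N, p = x.shape
    assert sum(block_sizes) == p
    xc = x - x.mean(axis=0)
    A = xc.T @ xc                          # A = n S with n = N - 1
    out = np.linalg.slogdet(A)[1]
    j = 0
    for pi in block_sizes:
        out -= np.linalg.slogdet(A[j:j + pi, j:j + pi])[1]   # diagonal block A_ii
        j += pi
    return out

rng = np.random.default_rng(1)
lw = log_Wn(rng.standard_normal((100, 15)), (5, 5, 5))       # H_0: blocks independent
```

By Fischer's inequality, |A| ≤ Π_i |A_ii| for a positive definite A, so log W_n ≤ 0, and strongly negative values are evidence against independence.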

When n goes to infinity while all the p_i's remain fixed, the traditional chi-square approximation to the distribution of Λ_n, quoted from Muirhead (1982), is

−2ρ log Λ_n converges to χ²_f in distribution    (2.10)

as n → ∞. Now we study the case when the p_i's are proportional to n. For convenience in taking limits, we assume that p_i depends on n for each 1 ≤ i ≤ k.

THEOREM 2. Assume n > p + 1 for all n ≥ 3 and p_i/n → y_i ∈ (0, 1) as n → ∞ for each 1 ≤ i ≤ k. Recall W_n as defined in (2.9). Then, under H_0 in (2.8), (log W_n − μ_n)/σ_n converges in distribution to N(0, 1) as n → ∞, where

μ_n = Σ_{i=1}^k r²_{n−1,i} (p_i − n + 3/2) − r²_{n−1} (p − n + 3/2)  and  σ_n² = 2 r²_{n−1} − 2 Σ_{i=1}^k r²_{n−1,i} > 0,

with r_x = [−log(1 − p/x)]^{1/2} for x > p and r_{x,i} = [−log(1 − p_i/x)]^{1/2} for x > p_i, 1 ≤ i ≤ k.

Though H_0 in (2.8) involves the unknown Σ_ii's, the limiting distribution in Theorem 2 is pivotal. This can be seen quickly by transforming y_i = Σ^{−1/2}(x_i − μ) for 1 ≤ i ≤ N. Then y_1, …, y_N are i.i.d. with distribution N_p(0, I_p); putting this into (2.9), the Σ_ii's cancel in the fraction under the null hypothesis. See also the interpretation in terms of group transformations in Muirhead (1982). We simulate the two cases in Figure 2: (i) the classical chi-square approximation (2.10); (ii) the central limit theorem in Theorem 2. The results show that when p becomes large, the classical approximation in (2.10) is poor, whereas (log W_n − μ_n)/σ_n in Theorem 2 fits the standard normal curve very well. In Table 2 in Section 3, we compare the sizes and powers of the two tests under the chosen H_a explained in the caption; see the detailed explanations in that section.

2.3 Testing that Multiple Normal Distributions Are Identical

Given normal distributions N_p(μ_i, Σ_i), i = 1, 2, …, k, we test that they are all identical, that is,

H_0: μ_1 = ⋯ = μ_k, Σ_1 = ⋯ = Σ_k  vs  H_a: H_0 is not true.    (2.11)

Let {y_ij; 1 ≤ i ≤ k, 1 ≤ j ≤ n_i} be independent p-dimensional random vectors, with {y_ij; 1 ≤ j ≤ n_i} i.i.d. from N_p(μ_i, Σ_i) for each i = 1, 2, …, k.
Set

A = Σ_{i=1}^k n_i (ȳ_i − ȳ)(ȳ_i − ȳ)',  B_i = Σ_{j=1}^{n_i} (y_ij − ȳ_i)(y_ij − ȳ_i)'  and  B = Σ_{i=1}^k B_i,

where ȳ_i = (1/n_i) Σ_{j=1}^{n_i} y_ij, ȳ = (1/n) Σ_{i=1}^k n_i ȳ_i and n = Σ_{i=1}^k n_i. The likelihood ratio test statistic for (2.11) was first derived by Wilks (1932):

Λ_n = ( Π_{i=1}^k |B_i|^{n_i/2} / |A + B|^{n/2} ) · ( n^{pn/2} / Π_{i=1}^k n_i^{p n_i/2} );    (2.12)

see also the corresponding theorem in Muirhead (1982). The likelihood ratio test rejects the null hypothesis if Λ_n ≤ c_α, where the critical value c_α is determined so that the significance level of the test is equal to α. Note that when p > n_i, the matrix B_i is not of full rank for i = 1, 2, …, k; consequently its determinant, and hence the likelihood ratio statistic Λ_n, is equal to zero. Therefore, to consider the test (2.11), one needs p ≤ min{n_i; 1 ≤ i ≤ k}. Perlman (1980) shows that the LRT is unbiased for testing H_0. Let

f = (1/2)(k − 1) p (p + 3)  and  ρ = 1 − ( Σ_{i=1}^k 1/n_i − 1/n ) (2p² + 9p + 11) / ( 6(k − 1)(p + 3) ).

Figure 2: Comparison between Theorem 2 and (2.10). We choose k = 3, n = 100 and p = 5, 30, 60, 90 with p_1 : p_2 : p_3 = 2 : 2 : 1. The pictures in the top row show that the histogram of −2ρ log Λ_n gradually moves away from the χ² curve as p grows. The pictures in the bottom row indicate that (log W_n − μ_n)/σ_n and the N(0, 1) curve match better as p becomes larger.
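The statistic (2.12) is most conveniently evaluated on the log scale; the sketch below (our illustration, not from the paper; function name and sample sizes are ours) assembles log Λ_n from the within-group scatter matrices B_i and the total scatter A + B:

```python
import numpy as np

def log_lambda_identical(samples):
    """log of Lambda_n in (2.12) for H_0: the k normal populations are identical.

    samples: list of (n_i, p) arrays with p <= min_i n_i."""
    p = samples[0].shape[1]
    ns = [y.shape[0] for y in samples]
    n = sum(ns)
    grand_mean = np.vstack(samples).mean(axis=0)
    out = 0.5 * p * n * np.log(n)                 # the factor n^{pn/2}
    T = np.zeros((p, p))                          # A + B = scatter about the grand mean
    for ni, y in zip(ns, samples):
        yc = y - y.mean(axis=0)
        out += 0.5 * ni * np.linalg.slogdet(yc.T @ yc)[1]   # |B_i|^{n_i/2}
        out -= 0.5 * p * ni * np.log(ni)                    # 1 / n_i^{p n_i/2}
        d = y - grand_mean
        T += d.T @ d
    return out - 0.5 * n * np.linalg.slogdet(T)[1]          # 1 / |A + B|^{n/2}

rng = np.random.default_rng(2)
ll = log_lambda_identical([rng.standard_normal((100, 10)) for _ in range(3)])
```

Since Λ_n is a genuine likelihood ratio, log Λ_n ≤ 0; Theorem 3 below centers it by μ_n and scales it by nσ_n.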

When the dimension p is fixed, the following asymptotic distribution of log Λ_n under the null hypothesis (2.11) is a corollary of the corresponding theorem in Muirhead (1982):

−2ρ log Λ_n converges to χ²_f    (2.14)

in distribution as min_{1≤i≤k} n_i → ∞. When p grows at the same rate as the n_i's, we have the following theorem.

THEOREM 3. Let n_i = n_i(p) > p + 1 for all p ≥ 1 and lim_{p→∞} p/n_i = y_i ∈ (0, 1] for all 1 ≤ i ≤ k. Let Λ_n be as in (2.12). Then, under H_0 in (2.11), (log Λ_n − μ_n)/(nσ_n) converges in distribution to N(0, 1) as p → ∞, where

μ_n = (1/4) [ 2kp + n(2p − 2n + 3) r²_{n−1} − Σ_{i=1}^k ñ_i (2p − 2n_i + 3) r²_{ñ_i} ],
σ_n² = (1/2) [ Σ_{i=1}^k (n_i/n)² r²_{ñ_i} − r²_{n−1} ] > 0,

with ñ_i = n_i − 1 and r_x = [−log(1 − p/x)]^{1/2} for x > p.

The limiting distribution in Theorem 3 is independent of the μ_i's and Σ_i's. This can be seen by defining z_ij = Σ_1^{−1/2}(y_ij − μ_1); then the z_ij's are i.i.d. with distribution N_p(0, I_p) under the null. It can easily be verified that the μ_i's and Σ_i's cancel from the numerator and the denominator of Λ_n in (2.12), and hence Λ_n depends only on the z_ij's. From the simulation shown in Figure 3, we see that the chi-square curve and the histogram move farther apart as p becomes large, whereas the normal approximation in Theorem 3 becomes better. The sizes and powers are estimated and summarized in Table 3 in Section 3, with more detailed explanations in the same section.

2.4 Testing Equality of Several Covariance Matrices

Let k ≥ 2 be an integer. For 1 ≤ i ≤ k, let x_i1, …, x_in_i be i.i.d. N_p(μ_i, Σ_i)-distributed random vectors. We consider

H_0: Σ_1 = ⋯ = Σ_k  vs  H_a: H_0 is not true.    (2.15)

Denote

x̄_i = (1/n_i) Σ_{j=1}^{n_i} x_ij  and  A_i = Σ_{j=1}^{n_i} (x_ij − x̄_i)(x_ij − x̄_i)',  1 ≤ i ≤ k,

and A = A_1 + ⋯ + A_k and n = n_1 + ⋯ + n_k. Wilks (1932) gives the likelihood ratio test of (2.15) with test statistic

Λ_n = ( Π_{i=1}^k |A_i|^{n_i/2} / |A|^{n/2} ) · ( n^{np/2} / Π_{i=1}^k n_i^{n_i p/2} ),    (2.16)

and the test rejects the null hypothesis H_0 when Λ_n ≤ c_α, where the critical value c_α is determined so that the test has significance level α. Note that A_i does not have full rank when p > n_i for some i = 1, …, k, and hence its determinant is equal to zero, so the test statistic Λ_n is not defined. Therefore, we assume p ≤ n_i for all i = 1, …, k when studying the likelihood ratio test of (2.15). A drawback of this likelihood ratio test is its bias (see Section 8.2.2 of Muirhead 1982). Bartlett (1937) suggests a modified likelihood ratio test statistic Λ*_n obtained by substituting every sample size n_i with its degrees of freedom n_i − 1 and the total sample size n with n − k:

Λ*_n = ( Π_{i=1}^k |A_i|^{(n_i−1)/2} / |A|^{(n−k)/2} ) · ( (n − k)^{(n−k)p/2} / Π_{i=1}^k (n_i − 1)^{(n_i−1)p/2} ).    (2.17)

Figure 3: Comparison between Theorem 3 and (2.14). We choose n_1 = n_2 = n_3 = 100 with p = 5, 30, 60, 90. The pictures in the top row show that the χ² curves gradually move farther away from the histogram of −2ρ log Λ_n as p grows. The pictures in the bottom row show that the N(0, 1) curve fits the histogram of (log Λ_n − μ_n)/(nσ_n) very well as p becomes large.
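Bartlett's modified statistic (2.17) can be sketched in a few lines (our code, not from the paper; the function name and sample sizes are ours), again on the log scale:

```python
import numpy as np

def log_lambda_bartlett(samples):
    """log of Bartlett's modified statistic (2.17) for H_0: Sigma_1 = ... = Sigma_k."""
    k = len(samples)
    p = samples[0].shape[1]
    n = sum(y.shape[0] for y in samples)
    A = np.zeros((p, p))
    out = 0.0
    for y in samples:
        m = y.shape[0] - 1                 # degrees of freedom n_i - 1
        yc = y - y.mean(axis=0)
        Ai = yc.T @ yc                     # within-sample scatter matrix A_i
        A += Ai
        out += 0.5 * m * (np.linalg.slogdet(Ai)[1] - p * np.log(m))
    M = n - k                              # pooled degrees of freedom n - k
    return out - 0.5 * M * (np.linalg.slogdet(A)[1] - p * np.log(M))

rng = np.random.default_rng(3)
lb = log_lambda_bartlett([rng.standard_normal((100, 20)) for _ in range(3)])
```

By concavity of log det on positive definite matrices, the statistic is always ≤ 0, and large negative values are evidence against equality of the covariance matrices.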

The unbiasedness of this modified likelihood ratio test was proved by Sugiura and Nagao (1968) for k = 2 and by Perlman (1980) for general k. Let

f = (1/2) p(p + 1)(k − 1)  and  ρ = 1 − (2p² + 3p − 1) / ( 6(p + 1)(k − 1) ) · ( Σ_{i=1}^k 1/(n_i − 1) − 1/(n − k) ).

Box (1949) shows that when p remains fixed, under the null hypothesis of (2.15),

−2ρ log Λ*_n converges to χ²_f    (2.18)

in distribution as min_{1≤i≤k} n_i → ∞; see also Theorem 8.2.7 of Muirhead (1982). Now suppose p changes with the sample sizes n_i. We have the following CLT.

THEOREM 4. Assume n_i = n_i(p) for all 1 ≤ i ≤ k such that min_{1≤i≤k} n_i > p + 1 and lim_{p→∞} p/n_i = y_i ∈ (0, 1]. Let Λ*_n be as in (2.17). Then, under H_0 in (2.15), (log Λ*_n − μ_n)/((n − k)σ_n) converges in distribution to N(0, 1) as p → ∞, where

μ_n = (1/4) [ (n − k)(2n − 2p − 2k − 1) log(1 − p/(n − k)) − Σ_{i=1}^k (n_i − 1)(2n_i − 2p − 3) log(1 − p/(n_i − 1)) ],
σ_n² = (1/2) [ log(1 − p/(n − k)) − Σ_{i=1}^k ((n_i − 1)/(n − k))² log(1 − p/(n_i − 1)) ] > 0.

The limiting distribution in Theorem 4 is independent of the μ_i's and Σ_i's. This is obvious: let y_ij = Σ_i^{−1/2}(x_ij − μ_i); then the y_ij's are i.i.d. with distribution N_p(0, I_p) under the null, and from the cancelation of the Σ_i's in (2.17) we see that the distribution of Λ*_n is free of the μ_i's and Σ_i's under H_0. Bai et al. (2009) and Jiang et al. (2012) study Theorem 4 for the case k = 2; Theorem 4 generalizes their results to any k. Further, Bai et al. (2009) impose the condition max{y_1, y_2} < 1, which excludes the critical case max{y_1, y_2} = 1; there is no such restriction in Theorem 4. Figure 4 presents our simulation with k = 3. It is interesting to see that the chi-square curve and the histogram almost completely separate when p is large, while at the same time the normal approximation in Theorem 4 becomes very good. In Table 4 in Section 3, we estimate the sizes and powers of the two tests; the analysis is presented in the same section.

2.5 Testing Specified Values for Mean Vector and Covariance Matrix

Let x_1, …, x_n be i.i.d. R^p-valued random vectors from a normal distribution N_p(μ, Σ), where μ ∈ R^p is the mean vector and Σ is the p × p covariance matrix. Consider the

hypothesis test

H_0: μ = μ_0 and Σ = Σ_0  vs  H_a: H_0 is not true,

where μ_0 is a specified vector in R^p and Σ_0 is a specified p × p non-singular matrix. By applying the transformation x̃_i = Σ_0^{−1/2}(x_i − μ_0), this hypothesis test is equivalent to the test of

H_0: μ = 0 and Σ = I_p  vs  H_a: H_0 is not true.    (2.19)

Recall the notation

x̄ = (1/n) Σ_{i=1}^n x_i  and  A = Σ_{i=1}^n (x_i − x̄)(x_i − x̄)'.    (2.20)

The likelihood ratio test of size α for (2.19) rejects H_0 if Λ_n ≤ c_α, where

Λ_n = (e/n)^{np/2} |A|^{n/2} e^{−tr(A)/2} e^{−n x̄'x̄/2};    (2.21)

see, for example, the corresponding theorem in Muirhead (1982). Note that the matrix A does not have full rank when p ≥ n, as discussed below (2.4); therefore its determinant is equal

Figure 4: Comparison between Theorem 4 and (2.18). We chose n_1 = n_2 = n_3 = 100 with p = 5, 30, 60, 90. The pictures in the top row show that the χ² curves move away quickly from the histogram of −2ρ log Λ*_n as p grows. The pictures in the second row show that the N(0, 1) curve fits the histogram of (log Λ*_n − μ_n)/[(n − k)σ_n] better as p becomes larger.

to zero. This indicates that the likelihood ratio test of (2.19) only exists when p < n. Sugiura and Nagao (1968) and Das Gupta (1969) show that this test with rejection region {Λ_n ≤ c_α} is unbiased, where the critical value c_α is chosen so that the test has significance level α. A theorem in Muirhead (1982) implies that when the null hypothesis H_0: μ = 0, Σ = I_p is true,

−2ρ log Λ_n converges to χ²_f    (2.22)

as n → ∞ with p fixed, where

ρ = 1 − (2p² + 9p + 11) / (6n(p + 3))  and  f = (1/2) p(p + 3).

Obviously, ρ = ρ_n → 1 in this case. Davis (1971) improves the above result with a second-order approximation. Nagarsenker and Pillai (1973) study the exact null distribution of log Λ_n by using its moments. Now we state our CLT result when p grows with n.

THEOREM 5. Assume p := p_n is such that n > p + 1 for all n ≥ 3 and lim_{n→∞} p/n = y ∈ (0, 1]. Let Λ_n be defined as in (2.21). Then under H_0: μ = 0 and Σ = I_p, (log Λ_n − μ_n)/(nσ_n) converges in distribution to N(0, 1) as n → ∞, where

μ_n = −(1/4) [ n(2n − 2p − 3) log(1 − p/(n − 1)) + 2(n + 1)p ]  and  σ_n² = −(1/2) [ p/(n − 1) + log(1 − p/(n − 1)) ] > 0.

The simulations shown in Figure 5 confirm that when p is large and proportional to n it is better to use Theorem 5 than the traditional chi-square approximation in (2.22). In Table 5 in Section 3, we study the sizes and powers of the two tests based on the χ² approximation and our CLT; the table is discussed in detail in the same section.

2.6 Testing Complete Independence

In this section, we study the likelihood ratio test of complete independence of the coordinates of a high-dimensional normal random vector. Precisely, let R = (r_ij)_{p×p} be the correlation matrix generated from N_p(μ, Σ) and x = (x_1, …, x_p)' ~ N_p(μ, Σ). The test is

H_0: R = I  vs  H_a: R ≠ I.    (2.23)

The null hypothesis H_0 is equivalent to saying that x_1, …, x_p are independent, i.e., that Σ is diagonal. To study the LRT, we need to understand the determinant of a sample correlation matrix generated by normal random vectors. In fact, we will obtain a conclusion for the class of

spherical distributions, which is more general than the class of normal distributions. Let us first review two notions. Let x = (x_1, …, x_n)' ∈ R^n and y = (y_1, …, y_n)' ∈ R^n. Recall the Pearson correlation coefficient r defined by

r = r_{x,y} = Σ_{i=1}^n (x_i − x̄)(y_i − ȳ) / [ ( Σ_{i=1}^n (x_i − x̄)² )^{1/2} ( Σ_{i=1}^n (y_i − ȳ)² )^{1/2} ],    (2.24)

where x̄ = (1/n) Σ_{i=1}^n x_i and ȳ = (1/n) Σ_{i=1}^n y_i. We say a random vector x ∈ R^n has a spherical distribution if Ox and x have the same probability distribution for every n × n orthogonal matrix O. Examples include the multivariate normal distribution N_n(0, σ² I_n), the ε-contaminated normal distribution (1 − ε) N_n(0, I_n) + ε N_n(0, σ² I_n) with σ > 0 and ε ∈ [0, 1], and the multivariate t distributions; see Muirhead (1982) for more discussion. Let X = (x_ij)_{n×p} = (x_1, …, x_n)' = (y_1, …, y_p) be an n × p matrix such that y_1, …, y_p are independent random vectors with n-variate spherical distributions and P(y_i = 0) = 0 for all 1 ≤ i ≤ p (these distributions may be different). Let r_ij = r_{y_i, y_j}, that is, the Pearson correlation coefficient between y_i and y_j, for 1 ≤ i, j ≤ p. Then

R_n := (r_ij)_{p×p}    (2.25)

Figure 5: Comparison between Theorem 5 and (2.22). We chose n = 100 with p = 5, 30, 60, 90. The pictures in the top row show that the χ² curve gradually moves away from the histogram of −2ρ log Λ_n as p grows, whereas the N(0, 1) curve fits the histogram of (log Λ_n − μ_n)/(nσ_n) better, as shown in the bottom row.

is the sample correlation matrix. It is known that R_n can be written as R_n = U'U, where U is an n × p matrix (see, for example, Jiang 2004a). Thus, R_n does not have full rank, and hence |R_n| = 0, if p > n. By a theorem in Muirhead (1982), the density function of R_n is given by

Constant · |R_n|^{(n−p−2)/2} dR_n.    (2.26)

From the perspective of Random Matrix Theory, the limiting behavior of the largest eigenvalue of R_n and the empirical distribution of the eigenvalues of R_n are investigated in Jiang (2004a). Motivated by the construction of compressed sensing matrices, statistical testing problems, the covariance structures of normal distributions, high-dimensional regression, and a wide range of applications including signal processing, medical imaging and seismology, the largest off-diagonal entries of R_n are studied by Jiang (2004b), Li and Rosalsky (2006), Zhou (2007), Liu, Lin and Shao (2008), Li, Liu and Rosalsky (2009), Li, Qi and Rosalsky (2010) and Cai and Jiang (2011, 2012). Let us now focus on the LRT of (2.23). According to Morrison (2005), the likelihood ratio test rejects the null hypothesis of (2.23) if

|R_n|^{n/2} ≤ c_α,    (2.27)

where c_α is determined so that the test has significance level α. It is also known (see, for example, Bartlett 1954 or Morrison 2005) that when the dimension p remains fixed and the sample size n → ∞,

−(n − 1 − (2p + 5)/6) log |R_n| converges in distribution to χ²_{p(p−1)/2}.    (2.28)

This asymptotic result has been used for testing complete independence of all the coordinates of a normal random vector in traditional multivariate analysis when p is small relative to n. Now we study the LRT statistic when p and n are large and of the same order. First, we give a general CLT for spherical distributions.

THEOREM 6. Let p = p_n satisfy n ≥ p + 5 and lim_{n→∞} p/n = y ∈ (0, 1]. Let X = (y_1, …, y_p) be an n × p matrix such that y_1, …, y_p are independent random vectors with n-variate spherical distributions and P(y_i = 0) = 0 for all 1 ≤ i ≤ p (these distributions may be different). Recall R_n in (2.25).
Then (log |R_n| − μ_n)/σ_n converges in distribution to N(0, 1) as n → ∞, where

μ_n = (p − n + 3/2) log(1 − p/(n − 1)) − ((n − 2)/(n − 1)) p  and  σ_n² = −2 [ p/(n − 1) + log(1 − p/(n − 1)) ].

In the definition of σ_n above, we only need the condition n ≥ p + 2, so the assumption n ≥ p + 5 looks a bit stronger. In fact, we use the stronger condition as a technical assumption in the proof of Lemma 5.10, which involves complex analysis. Notice that when the random vectors x_1, …, x_n are i.i.d. from a p-variate normal distribution N_p(μ, Σ) with complete independence (i.e., Σ is a diagonal matrix, or the correlation matrix R = I_p), we may write X = (x_ij)_{n×p} = (x_1, …, x_n)' = (y_1, …, y_p). Then y_1, …, y_p are independent random vectors from n-variate normal distributions (these normal distributions may differ in their covariance matrices), and it is obvious that in this case P(y_i = 0) = 0 for all 1 ≤ i ≤ p. Therefore, we have the following corollary.

COROLLARY 1. Assume that p := p_n satisfies n ≥ p + 5 and lim_{n→∞} p/n = y ∈ (0, 1]. Let x_1, …, x_n be i.i.d. from N_p(μ, Σ) with Pearson sample correlation matrix R_n as defined in (2.25). Then, under H_0 in (2.23), (log |R_n| − μ_n)/σ_n converges in distribution to N(0, 1) as n → ∞, where

μ_n = (p − n + 3/2) log(1 − p/(n − 1)) − ((n − 2)/(n − 1)) p  and  σ_n² = −2 [ p/(n − 1) + log(1 − p/(n − 1)) ] > 0.

According to Corollary 1, {(log |R_n| − μ_n)/σ_n ≤ −z_α} is a rejection region of asymptotic level α for the LRT of (2.23), where the critical value z_α > 0 satisfies P(N(0, 1) > z_α) = α for α ∈ (0, 1). Figure 6 shows that the chi-square approximation in (2.28) is good when p is small but behaves poorly when p is large, while at the same time the normal approximation in Corollary 1 becomes better. We simulate the sizes and powers of the two tests based on the chi-square approximation in (2.28) and the CLT in Corollary 1 in Table 6 in Section 3; see that section for more analysis. As mentioned earlier, when p > n the LRT statistic log |R_n| is not defined, so one has to choose statistics other than log |R_n| to study (2.23); see, for example, Schott (2005) and Cai and Ma (2012) for recent progress.
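Corollary 1 yields a simple z-test for complete independence. The sketch below (our code, not part of the paper; the function name is ours, and the centering and scaling are transcribed from Corollary 1) computes the normalized statistic from the sample correlation matrix:

```python
import numpy as np

def complete_independence_z(x):
    """(log|R_n| - mu_n) / sigma_n from Corollary 1; x is (n, p) with n >= p + 5."""
    n, p = x.shape
    R = np.corrcoef(x, rowvar=False)       # Pearson sample correlation matrix R_n
    log_detR = np.linalg.slogdet(R)[1]
    c = np.log1p(-p / (n - 1))             # log(1 - p/(n-1))
    mu_n = (p - n + 1.5) * c - (n - 2) / (n - 1) * p
    sigma2 = -2.0 * (p / (n - 1) + c)      # > 0 whenever n >= p + 2
    return (log_detR - mu_n) / np.sqrt(sigma2)

rng = np.random.default_rng(4)
z = complete_independence_z(rng.standard_normal((100, 60)))   # H_0: R = I holds
```

A level-α test rejects H_0 when the statistic falls below −z_α; unlike the χ² approximation (2.28), this remains accurate when p is comparable to n.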
3 Simulation Study: Sizes and Powers

In this section, for each of the six LRTs discussed earlier, we run a simulation with 10,000 iterations to estimate the sizes and powers of the LRTs using the CLT approximation and the classical χ² approximation, and give an analysis for each table. In the following discussion, the notation J_p stands for the p × p matrix whose entries are all equal to 1, and [x] stands for the integer part of x > 0.

(1) Table 1. This table corresponds to the sphericity test that, for N_p(μ, Σ), H_0: Σ = λI_p vs H_a: Σ ≠ λI_p with λ unspecified, studied in Section 2.1. As expected, the χ²

approximation is good when p is small relative to n, but not when p is large. For example, at n = 100 and p = 60, the size (type I error, or alpha error) and power of our normal approximation are reasonable, while the size of the χ² approximation is far too large to be used in practice. It is very interesting to see that our normal approximation is as good as the χ² approximation even when p is small. Moreover, for n = 100 and p = 90, where the ratio y = 0.9 is close to 1, the type I error of the CLT-based test is close to 5% and the power is still decent. Further, the power of the CLT-based test drops as the ratio p/n increases to 1. This makes sense because the convergence rate of the CLT becomes slow; indeed, Theorem 1 shows that σ_n → ∞ as p/n → 1.

(2) Table 2. In this table, we compare the sizes and powers of two tests under the chosen H_a explained in the caption: the classical χ² approximation in (2.10) and the CLT in Theorem 2 for the hypothesis that some components of a normal distribution are independent. We observe from the table that our CLT approximation and the classical χ² approximation are comparable for small values of the p_i's. However, when the p_i's are large (the last two rows of the table), our test remains good whereas the χ² approximation is no longer applicable because of its large size (type I error). The power of the CLT-based test drops when the values of the p_i's become large. This follows from Theorem 2: σ_n → ∞ as p_i/n → 1, and hence the CLT approximation performs less well.

Figure 6: Comparison between Corollary 1 and (2.28). We choose n = 100 with p = 5, 30, 60, 90. The pictures in the first row show that, as p becomes large, the χ² curve fits the histogram of −(n − 1 − (2p + 5)/6) log |R_n| poorly. Those in the second row indicate that the N(0, 1) curve fits the histogram of (log |R_n| − μ_n)/σ_n very well as p becomes large.

(3) Table 3. We create this table for the test that several normal distributions are identical, studied in Section 2.3. It is easily seen that our CLT is good in all cases except p = 5, where the type I error of our test is 0.061, slightly higher than in the classical case. But when p = 60 and n_1 = n_2 = n_3 = 100, the size in the classical case is 0.454, too large to be used. It is worth noting that the power of the CLT-based test becomes smaller as p becomes larger. This is easily understood from Theorem 3: the standard deviation diverges to infinity as p/n → 1; equivalently, the convergence is slower as p gets closer to n.

(4) Table 4. This table relates to the test of the equality of the covariance matrices of k normal distributions studied in Section 2.4. We take k = 3 in our simulations. The sizes and powers of the chi-square approximation and of the CLT in Theorem 4 are summarized in the table. When p = 5 and n_1 = n_2 = n_3 = 100, our CLT approximation gives a reasonable size while the classical χ² approximation is slightly better. However, for the same values of the n_i's, the size of the χ² approximation grows from 0.607 at p = 30 to 1 at p = 90, so it is not recommended in practice. Similar to the previous tests, σ_n → ∞ as p/n → 1, where σ_n is as in Theorem 4. This implies that the convergence of the CLT is slow in this case, so it is not surprising that the power of the CLT-based test in the table decreases as p/n → 1.

(5) Table 5. We generate this table by considering the LRT of H_0: μ = 0, Σ = I_p for the population distribution N_p(μ, Σ); the CLT is developed in Theorem 5. In this table we study the sizes and powers of the two tests based on the χ² approximation and the CLT. At n = 100, p = 5 (p small), the χ² test outperforms ours; the two are equally good at n = 100, p = 30. When p is as large as 60 and 90, our CLT is still good but the χ² approximation is no longer useful.
At the same time, it is easy to spot from the fourth column of the table that the power of the CLT test drops as the ratio p/n becomes large. It is obvious from Theorem 5 that the standard deviation σ_n goes to infinity as the ratio approaches one, which costs precision when the sample size is not large.

6. Table 6. This table is created for the test that all of the components of a normal vector are independent but not necessarily identically distributed, which is studied in Corollary 1. The sizes and powers of the two tests are estimated from simulation using the chi-square approximation in (2.8) and the CLT in Corollary 1 from Section 3 (the H_a is explained in the caption). In all four cases of n = 100 with p = 5, 30, 60 and 90, the performance of our CLT test is good, and it is even comparable with the classical χ² test at the small value p = 5. When p = 60 and 90, the sizes of the χ² test are too big, while those of the CLT test stay around the 5% level. For the CLT test itself, looking at the third and fourth rows of the table, the performance corresponding to y = p/n = 0.6 is better than that corresponding to the high value y = p/n = 0.9, as expected, but the two are quite close. The only difference is the decline of the power as the ratio p/n increases. Again, this is easily seen from Corollary 1: the standard deviation σ_n diverges as p gets close to n.
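Corollary 1 and (2.8) are both built on log R_n, where R_n is the determinant of the sample correlation matrix. A self-contained sketch of that ingredient in pure Python for a small p follows (the helper names are ours; a real implementation would use a linear-algebra library):

```python
import math
import random

def sample_correlation(data):
    """Sample correlation matrix of data given as n rows of length p."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    cov = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
            for j in range(p)] for i in range(p)]
    sd = [cov[i][i] ** 0.5 for i in range(p)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(p)] for i in range(p)]

def log_det(mat):
    """log-determinant of a positive definite matrix via Gaussian elimination
    (pivots of a positive definite matrix are positive, so no row swaps needed)."""
    a = [row[:] for row in mat]
    acc = 0.0
    for k in range(len(a)):
        pivot = a[k][k]
        acc += math.log(pivot)
        for i in range(k + 1, len(a)):
            f = a[i][k] / pivot
            for j in range(k, len(a)):
                a[i][j] -= f * a[k][j]
    return acc

random.seed(1)
# 100 independent observations of a 3-dimensional standard normal vector.
data = [[random.gauss(0, 1) for _ in range(3)] for _ in range(100)]
log_Rn = log_det(sample_correlation(data))
```

By Hadamard's inequality the determinant of a correlation matrix is at most 1, so log R_n is never positive; under independence it is close to 0.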

Table 1: Size and Power of LRT for Sphericity in Section 2.1

                          Size under H_0            Power under H_a
                          CLT      χ² approx.       CLT      χ² approx.
n = 100, p = 5
n = 100, p = 30
n = 100, p = 60
n = 100, p = 90

The sizes (alpha errors) are estimated based on 10,000 simulations from N_p(0, I_p). The powers are estimated under the alternative hypothesis that Σ = diag(1.69, ..., 1.69, 1, ..., 1), where the number of 1.69's on the diagonal is equal to [p/2].

Table 2: Size and Power of LRT for Independence of Three Components in Section 2.2

                                          Size under H_0            Power under H_a
                                          CLT      χ² approx.       CLT      χ² approx.
n = 100, p_1 = 2,  p_2 = 2,  p_3 = 2
n = 100, p_1 = 12, p_2 = 12, p_3 = 12
n = 100, p_1 = 24, p_2 = 24, p_3 = 24
n = 100, p_1 = 36, p_2 = 36, p_3 = 36

The sizes (alpha errors) are estimated based on 10,000 simulations from N_p(0, I_p). The powers are estimated under the alternative hypothesis that Σ = 0.15J_p + 0.85I_p.

Table 3: Size and Power of LRT for Equality of Three Distributions in Section 2.3

                                   Size under H_0            Power under H_a
                                   CLT      χ² approx.       CLT      χ² approx.
n_1 = n_2 = n_3 = 100, p = 5
n_1 = n_2 = n_3 = 100, p = 30
n_1 = n_2 = n_3 = 100, p = 60
n_1 = n_2 = n_3 = 100, p = 90

The sizes (alpha errors) are estimated based on 10,000 simulations from three normal distributions N_p(0, I_p). The powers were estimated under the alternative hypothesis that μ_1 = (0, ..., 0), Σ_1 = 0.5J_p + 0.5I_p; μ_2 = (0.1, ..., 0.1), Σ_2 = 0.6J_p + 0.4I_p; μ_3 = (0.1, ..., 0.1), Σ_3 = 0.5J_p + 0.5I_p.

Table 4: Size and Power of LRT for Equality of Three Covariance Matrices in Section 2.4

                                   Size under H_0            Power under H_a
                                   CLT      χ² approx.       CLT      χ² approx.
n_1 = n_2 = n_3 = 100, p = 5
n_1 = n_2 = n_3 = 100, p = 30
n_1 = n_2 = n_3 = 100, p = 60
n_1 = n_2 = n_3 = 100, p = 90

The sizes (alpha errors) are estimated based on 10,000 simulations from N_p(0, I_p). The powers are estimated under the alternative hypothesis that Σ_1 = I_p, Σ_2 = 1.1I_p, and Σ_3 = 0.9I_p.

Table 5: Size and Power of LRT for Specified Normal Distribution in Section 2.5

                          Size under H_0            Power under H_a
                          CLT      χ² approx.       CLT      χ² approx.
n = 100, p = 5
n = 100, p = 30
n = 100, p = 60
n = 100, p = 90

Sizes (alpha errors) are estimated based on 10,000 simulations from N_p(0, I_p). The powers are estimated under the alternative hypothesis that μ = (0.1, ..., 0.1, 0, ..., 0), where the number of 0.1's is equal to [p/2], and Σ = {σ_ij}, where σ_ij = 1 for i = j, σ_ij = 0.1 for 0 < |i − j| ≤ 3, and σ_ij = 0 for |i − j| > 3.

Table 6: Size and Power of LRT for Complete Independence in Section 2.6

                          Size under H_0            Power under H_a
                          CLT      χ² approx.       CLT      χ² approx.
n = 100, p = 5
n = 100, p = 30
n = 100, p = 60
n = 100, p = 90

Sizes (alpha errors) are estimated based on 10,000 simulations from N_p(0, I_p). The powers are estimated under the alternative hypothesis that the correlation matrix R = (r_ij), where r_ij = 1 for i = j, r_ij = 0.1 for 0 < |i − j| ≤ 3, and r_ij = 0 for |i − j| > 3.
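The banded alternative matrix used for the power estimates in Tables 5 and 6 (unit diagonal, 0.1 within three off-diagonals, 0 beyond) is easy to generate directly. A minimal sketch (the helper name is ours):

```python
def banded_sigma(p, rho=0.1, band=3):
    """Matrix with 1 on the diagonal, rho where 0 < |i - j| <= band, 0 elsewhere."""
    return [[1.0 if i == j else (rho if abs(i - j) <= band else 0.0)
             for j in range(p)] for i in range(p)]

# The 6 x 6 instance; Tables 5 and 6 use p = 5, 30, 60, 90.
sigma = banded_sigma(6)
```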

4 Conclusions and Discussions

In this paper, we consider the likelihood ratio tests for the mean vectors and covariance matrices of high-dimensional normal distributions. Traditionally, these tests were performed by using the chi-square approximation. However, this approximation relies on the theoretical assumption that the sample size n goes to infinity while the dimension p remains fixed. As many modern datasets discussed in Section 1 feature high dimensions, these traditional likelihood ratio tests have been shown to be less accurate for analyzing such datasets.

Motivated by the pioneering work of Bai et al. (2009) and Jiang et al. (2012), who prove two central limit theorems for the likelihood ratio test statistics for testing high-dimensional covariance matrices of normal distributions, we examine in this paper other LRTs that are widely used in multivariate analysis and prove central limit theorems for their test statistics. By using the method developed in Jiang et al. (2012), that is, the asymptotic expansion of the multivariate Gamma function with high dimension p, we are able to derive the central limit theorems without relying on concrete random matrix models as demonstrated in Bai et al. (2009). Our method also has the advantage that the central limit theorems for the critical cases lim p/n = y = 1 (or lim p/n_i = y_i = 1) are all derived, which is not the case in Bai et al. (2009) because of the restriction of their tools from Random Matrix Theory. In real data analysis, as long as n > p + 1 (or n_i > p + 1) in Theorems 1-5, or n ≥ p + 5 in Theorem 6, we simply take y = p/n (or y_i = p/n_i) to use the theorems. As Figures 1-6 and Tables 1-6 show, our CLT approximations are all good even when p is relatively small.
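The multivariate Gamma function at the heart of this method is also convenient to experiment with numerically: its ratio Γ_p(n/2 + t)/Γ_p(n/2) telescopes into ordinary Gamma ratios (identity (5.9) in Section 5), and the π-factor in the definition (5.8) cancels. A sketch confirming the identity in floating point with the standard log-gamma function:

```python
import math

def log_gamma_p(p, z):
    """log of the multivariate Gamma function (5.8):
    Gamma_p(z) = pi^{p(p-1)/4} * prod_{i=1}^{p} Gamma(z - (i-1)/2)."""
    return p * (p - 1) / 4.0 * math.log(math.pi) \
        + sum(math.lgamma(z - (i - 1) / 2.0) for i in range(1, p + 1))

n, p, t = 10, 4, 0.7
# Left side: the Gamma_p ratio; right side: the telescoped product in (5.9).
lhs = log_gamma_p(p, n / 2 + t) - log_gamma_p(p, n / 2)
rhs = sum(math.lgamma(j / 2 + t) - math.lgamma(j / 2)
          for j in range(n - p + 1, n + 1))
```

The two sides agree up to floating-point rounding, which is what makes the one-dimensional expansions of Section 5 applicable to Γ_p.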
The proofs in this paper are based on the analysis of the moments of the LRT statistics: five of the six such moments are from the literature, and the last one is derived by us. The moment method we use here is different from the Random Matrix Theory employed in Bai et al. (2009) and from the Selberg integral used in Jiang et al. (2012). Our research also brings out the following four interesting open problems:

1. All our central limit theorems in this paper are proved under the null hypothesis. As one often wants to assess the power of a test, it is also interesting to study the distribution of the test statistic under an alternative hypothesis. In the traditional case, where p is considered to be fixed while n goes to infinity, the asymptotic distributions of many likelihood ratio statistics under the alternative hypotheses are derived by using the zonal polynomials (see, e.g., Sections 8.2.6 and 8.3.4 from Muirhead (1982)). It can be conjectured that in the high-dimensional case there are new results on the limiting distributions of the test statistics under the alternative hypotheses. However, this is non-trivial and may require more investigation of the high-dimensional zonal polynomials. Some new understanding of the connection between the random matrix theory and the Jack polynomials (the zonal polynomials, the Schur polynomials and the zonal spherical functions are

special cases) is given by Jiang and Matsumoto (2011). A recent work by Bai et al. (2009) studies the high-dimensional LRTs through the random matrix theory. So the connection among the random matrix theory, the LRTs and the Jack polynomials is obvious. We are almost sure that the understanding in Jiang and Matsumoto (2011) will be useful in exploring the LRT statistics under the alternative hypotheses.

2. Except for Theorem 6, where the condition n ≥ p + 5 is imposed due to a technical constraint, all the other five central limit theorems in this paper are proved under the condition n > p + 1 (or n_i > p + 1). This is because, when these conditions fail, the likelihood ratio statistics in those five cases are undefined. This indicates that tests other than the likelihood ratio ones have to be developed for analyzing a dataset with p larger than n. For recent progress, see, for example, Ledoit and Wolf (2002) and Chen et al. (2010) for the sphericity test, Schott (2001, 2007) for testing the equality of multiple covariance matrices, and Srivastava (2005) for testing the covariance matrix of a normal distribution. A power study for the sphericity test is carried out by Onatski et al. Despite the enlightening works mentioned above, hypothesis tests for p > n (or p > n_i) are still an open area with many interesting problems to be solved.

3. In this paper we consider the cases where p and n (or the n_i) are proportional to each other, that is, lim p/n = y ∈ (0, 1] or lim p/n_i = y_i ∈ (0, 1]. In practice, p may be large but not large enough to be at the same scale as n (or the n_i). So it is useful to derive the central limit theorems appearing in this paper under the assumption that p → ∞ with p/n → 0 (or p/n_i → 0).

4. To understand the robustness of the six likelihood tests in this paper, one has to study the limiting behaviors of the LRT statistics without the normality assumptions. This is feasible. For example, in Section 2.2 we test the independence of several components of a normal distribution.
The LRT statistic W_n in (2.9) can be written as the product of some independent random variables, say, the V_i's, with beta distributions (see, e.g., a theorem from Muirhead (1982)). Therefore, it is possible that we can derive the CLT of W_n for general V_i's with the same means and variances as those of the beta distributions. Finally, it is worthwhile to mention that some recent works consider similar problems in the nonparametric setting; see, e.g., Cai et al. (2013), Cai and Ma, Chen et al. (2010), Li and Chen (2012), Qiu and Chen (2012) and Xiao and Wu (2013).
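The beta-product representation just described can be imitated in simulation: log W_n is then a sum of independent log-beta variables, which is exactly the structure a CLT for general V_i's would exploit. A sketch with purely illustrative beta parameters (the actual parameters in the decomposition of W_n depend on n and the p_i's and are not reproduced here):

```python
import math
import random
import statistics

random.seed(3)
# Hypothetical shape parameters (a_i, b_i) for illustration only.
params = [(5.0, 2.0), (6.0, 3.0), (7.0, 4.0)]

def log_w():
    # log of a product of independent beta variables = sum of their logs
    return sum(math.log(random.betavariate(a, b)) for a, b in params)

draws = [log_w() for _ in range(5000)]
mu_hat = statistics.fmean(draws)
sd_hat = statistics.stdev(draws)
# Standardizing the sum is the step a CLT argument would formalize.
standardized = [(x - mu_hat) / sd_hat for x in draws]
```

Since each V_i lies in (0, 1), every draw of log W is negative; only the means and variances of the V_i's enter the standardization, which is why replacing the betas by other laws with matching moments is a plausible route to a robustness result.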

5 Proofs

This section is divided into several subsections; in each of them we prove one of the theorems introduced in Section 1. We first develop some tools. The following is some standard notation. For two sequences of numbers {a_n; n ≥ 1} and {b_n; n ≥ 1}, the notation a_n = O(b_n) as n → ∞ means lim sup_n |a_n/b_n| < ∞. The notation a_n = o(b_n) as n → ∞ means lim_n a_n/b_n = 0. For two functions f(x) and g(x), the notation f(x) = O(g(x)) and f(x) = o(g(x)) as x → x_0 ∈ [−∞, ∞] is interpreted similarly. Throughout the paper, Γ(z) is the Gamma function defined on the complex plane C.

5.1 A Preparation

LEMMA 5.1 Let b := b(x) be a real-valued function defined on (0, ∞). Then, as x → +∞,

    log [Γ(x + b)/Γ(x)] = b log x + (b² − b)/(2x) + c(x),

where

    c(x) = O(x^{−1/2}) if b(x) = O(x^{1/2});   c(x) = O(x^{−1}) if b(x) = O(1).

Further, for any constants d > c, as x → +∞,

    sup_{c ≤ t ≤ d} | log [Γ(x + t)/Γ(x)] − t log x | → 0.

Proof. Recall the Stirling formula (see, e.g., p. 368 from Gamelin (2001), or (37) on p. 204 from Ahlfors (1979)):

    log Γ(x) = (x − 1/2) log x − x + (1/2) log(2π) + 1/(12x) + O(x^{−3})

as x → +∞. We have that

    log [Γ(x + b)/Γ(x)] = (x + b) log(x + b) − x log x − b − (1/2)[log(x + b) − log x] + (1/12)[1/(x + b) − 1/x] + O(x^{−3})   (5.2)

as x → +∞. First, use the fact that log(1 + t) = t − t²/2 + O(t³) as t → 0 to get

    (x + b) log(x + b) − x log x = (x + b)[log x + log(1 + b/x)] − x log x
        = (x + b)[log x + b/x − b²/(2x²) + O(b³/x³)] − x log x
        = b log x + b + b²/(2x) + O(b³/x²) + O(b⁴/x³)
        = b log x + b + b²/(2x) + c_1(x)

as x → +∞, where

    c_1(x) = O(x^{−1/2}) if b(x) = O(x^{1/2});   c_1(x) = O(x^{−1}) if b(x) = O(1).

Similarly, as x → +∞,

    log(x + b) − log x = log(1 + b/x) = O(x^{−1/2}) if b(x) = O(x^{1/2});   O(x^{−1}) if b(x) = O(1),

and

    1/(x + b) − 1/x = −b/(x(x + b)) = O(x^{−3/2}) if b(x) = O(x^{1/2});   O(x^{−2}) if b(x) = O(1).

Substituting these two assertions in (5.2), we have

    log [Γ(x + b)/Γ(x)] = b log x + (b² − b)/(2x) + c(x)   (5.3)

with c(x) = O(x^{−1/2}) if b(x) = O(x^{1/2}) and c(x) = O(x^{−1}) if b(x) = O(1), as x → +∞. For the last part, reviewing the whole proof above, we have from (5.3) that

    log [Γ(x + t)/Γ(x)] = t log x + (t² − t)/(2x) + O(x^{−1})

as x → +∞, uniformly for all c ≤ t ≤ d. This implies the conclusion. ∎

LEMMA 5.2 Given a > 0, define

    η(t) = sup_{x ≥ a} | log [Γ(x + t)/Γ(x)] − t log x |

for all t > −a. Then lim_{t→0} η(t) = 0.

Proof. Let d > c > 0 be two constants. Since Γ(x) > 0 is continuous on (0, ∞), the function g(x) := log Γ(x) is uniformly continuous over the compact interval [c/2, 2d]. It then follows that

    sup_{c ≤ x ≤ d} | log [Γ(x + ϵ)/Γ(x)] | = sup_{c ≤ x ≤ d} | g(x + ϵ) − g(x) | → 0   (5.4)

as ϵ → 0. On the other hand, by the second part of Lemma 5.1, for any ϵ > 0 there exists x_0 > a such that

    sup_{|t| ≤ a} | log [Γ(x + t)/Γ(x)] − t log x | < ϵ

for all x ≥ x_0. Therefore,

    sup_{x ≥ x_0} sup_{|t| ≤ a} | log [Γ(x + t)/Γ(x)] − t log x | ≤ ϵ.

Then,

    η(t) ≤ ϵ + sup_{a ≤ x ≤ x_0} | log [Γ(x + t)/Γ(x)] − t log x |
         ≤ ϵ + (|log a| + log x_0)|t| + sup_{a ≤ x ≤ x_0} | log [Γ(x + t)/Γ(x)] |

for all |t| ≤ a. Consequently, we have from (5.4) that lim sup_{t→0} η(t) ≤ ϵ for all ϵ > 0, which concludes the lemma. ∎

PROPOSITION 5.1 (Proposition 2.1 from Jiang et al. (2012)) Let n > p = p_n and r_n = (−log(1 − p/n))^{1/2}. Assume that p/n → y ∈ (0, 1] and t = t_n = O(1/r_n) as n → ∞. Then, as n → ∞,

    log ∏_{i=n−p}^{n−1} [Γ(i/2 + t)/Γ(i/2)] = pt(log n − 1 − log 2) + r_n²[t² − (p − n + 1.5)t] + o(1).

LEMMA 5.3 Let n > p = p_n and r_n = (−log(1 − p/n))^{1/2}. Assume p/n → y ∈ (0, 1] and t = t_n = O(1/r_n) as n → ∞. Then

    log [ Γ(n/2 + t)/Γ(n/2) · Γ((n − p)/2)/Γ((n − p)/2 + t) ] = r_n² t + o(1)   (5.5)

as n → ∞.

Proof. We prove the lemma by considering two cases.

Case (i): y ∈ (0, 1). In this case, n − p → ∞ and lim_n r_n = (−log(1 − y))^{1/2} ∈ (0, ∞), and hence {t_n} is bounded. By Lemma 5.1,

    log [Γ(n/2 + t)/Γ(n/2)] = t log(n/2) + O(n^{−1}),
    log [Γ((n − p)/2)/Γ((n − p)/2 + t)] = −t log((n − p)/2) + O((n − p)^{−1})

as n → ∞. Adding the two assertions up, we get that the left hand side of (5.5) is equal to

    −t log(1 − p/n) + o(1) = r_n² t + o(1)   (5.6)

as n → ∞. So the lemma holds for y ∈ (0, 1).

Case (ii): y = 1. In this case, r_n → +∞ and t_n → 0 as n → ∞. Recalling Lemma 5.2, and taking a = 1/2 there (since (n − p)/2 ≥ 1/2), we know

    | log [Γ((n − p)/2 + t_n)/Γ((n − p)/2)] − t_n log((n − p)/2) | ≤ η(t_n) → 0

as n → ∞. That is,

    log [Γ((n − p)/2 + t_n)/Γ((n − p)/2)] = t_n log((n − p)/2) + o(1)   (5.7)

as n → ∞. By Lemma 5.1 and the fact that lim_n t_n = 0,

    log [Γ(n/2 + t_n)/Γ(n/2)] = t_n log(n/2) + o(1)

as n → ∞. Adding up the above two terms and then using the same argument as in (5.6), we obtain (5.5). ∎

Define

    Γ_p(z) := π^{p(p−1)/4} ∏_{i=1}^{p} Γ(z − (i − 1)/2)   (5.8)

for a complex number z with Re(z) > (p − 1)/2; see p. 62 from Muirhead (1982).

LEMMA 5.4 Let Γ_p(z) be as in (5.8). Let n > p = p_n and r_n = (−log(1 − p/n))^{1/2}. Assume p/n → y ∈ (0, 1], s = s_n = O(1/r_n) and t = t_n = O(1/r_n) as n → ∞. Then

    log [Γ_p(n/2 + t)/Γ_p(n/2 + s)] = p(t − s)(log n − 1 − log 2) + r_n²[ t² − s² − (p − n + 1/2)(t − s) ] + o(1)

as n → ∞.

Proof. First,

    Γ_p(n/2 + t) = π^{p(p−1)/4} ∏_{i=1}^{p} Γ(n/2 + t − (i − 1)/2) = π^{p(p−1)/4} ∏_{j=n−p}^{n−1} Γ((j + 1)/2 + t).

It follows that

    Γ_p(n/2 + t)/Γ_p(n/2) = ∏_{j=n−p}^{n−1} [Γ((j + 1)/2 + t)/Γ((j + 1)/2)] = ∏_{j=n−p+1}^{n} [Γ(j/2 + t)/Γ(j/2)].   (5.9)

This implies

    Γ_p(n/2 + t)/Γ_p(n/2) = [Γ(n/2 + t)/Γ(n/2)] · [Γ((n − p)/2)/Γ((n − p)/2 + t)] · ∏_{j=n−p}^{n−1} [Γ(j/2 + t)/Γ(j/2)].

Now, by Proposition 5.1,

    log ∏_{j=n−p}^{n−1} [Γ(j/2 + t)/Γ(j/2)] = pt(log n − 1 − log 2) + r_n²[t² − (p − n + 1.5)t] + o(1)

as n → ∞. On the other hand, from Lemma 5.3,

    log [ Γ(n/2 + t)/Γ(n/2) · Γ((n − p)/2)/Γ((n − p)/2 + t) ] = r_n² t + o(1)

as n → ∞. Combining the last three assertions, we have

    log [Γ_p(n/2 + t)/Γ_p(n/2)] = pt(log n − 1 − log 2) + r_n²[t² − (p − n + 1/2)t] + o(1)

as n → ∞. Similarly,

    log [Γ_p(n/2 + s)/Γ_p(n/2)] = ps(log n − 1 − log 2) + r_n²[s² − (p − n + 1/2)s] + o(1)

as n → ∞. Taking the difference of the above two assertions, we obtain the desired conclusion. ∎

5.2 Proof of Theorem 1

LEMMA 5.5 (a corollary from Muirhead (1982)) Assume n > p. Let V_n be as in (2.3). Then, under H_0 in (2.1), we have

    E V_n^h = p^{ph} · [Γ(mp/2)/Γ(mp/2 + ph)] · [Γ_p(m/2 + h)/Γ_p(m/2)]

for h > −1/2, where m = n − 1.

Proof of Theorem 1. Recall that a sequence of random variables {Z_n; n ≥ 1} converges to Z in distribution as n → ∞ if

    lim_n E e^{hZ_n} = E e^{hZ} < ∞

for all h ∈ (−h_0, h_0), where h_0 > 0 is a constant; see, e.g., page 408 from Billingsley (1995). Thus, to prove the theorem, it suffices to show that there exists δ_0 > 0 such that

    E exp{ s (log V_n − μ_n)/σ_n } → e^{s²/2}   (5.11)

as n → ∞ for all |s| < δ_0. Set m = n − 1 and r_x := (−log(1 − p/x))^{1/2} for x > p. By the fact that x + log(1 − x) < 0 for all x ∈ (0, 1), we know that σ_n² > 0 for all n ≥ 3, and lim_n σ_n² = −2[y + log(1 − y)] > 0 for y ∈ (0, 1), while lim_n σ_n² = +∞ for y = 1. Therefore,

    δ_0 := inf{ σ_n; n ≥ 3 } > 0.

Fix |s| < δ_0. Set t = t_n = s/σ_n. Then {t_n; n ≥ 3} is bounded and |t_n| < 1 for all n ≥ 3. By Lemma 5.5,

    E e^{t log V_n} = E V_n^t = p^{pt} · [Γ(mp/2)/Γ(mp/2 + pt)] · [Γ_p(m/2 + t)/Γ_p(m/2)]   (5.12)

for all n ≥ 3. By Lemma 5.1 (for the first case) and the assumption p/m → y ∈ (0, 1],

    log [Γ(mp/2)/Γ(mp/2 + pt)] = −log [Γ(mp/2 + pt)/Γ(mp/2)]
        = −[ pt log(mp/2) + (p²t² − pt)/(mp) + O((mp)^{−1/2}) ]
        = −pt log(mp/2) − (pt − 1)t/m + O(n^{−1})

as n → ∞. Notice

    t² r_m² = −(s²/σ_n²) log(1 − p/m) → s² log(1 − y)/(2[y + log(1 − y)]) if y ∈ (0, 1);   → s²/2 if y = 1   (5.13)

as n → ∞. Thus, t = O(1/r_m) as n → ∞. By Lemma 5.4 (applied with s = 0 there and n replaced by m),

    log [Γ_p(m/2 + t)/Γ_p(m/2)] = pt(log m − 1 − log 2) + r_m²[t² − (p − m + 1/2)t] + o(1)

as n → ∞. This together with (5.12) and (5.13) gives that

    log E e^{t log V_n} = pt log p − pt log(mp/2) − (pt − 1)t/m + pt(log m − 1 − log 2) + r_m²[t² − (p − m + 1/2)t] + o(1)
        = (r_m² − p/m) t² − [ p − 1/m + (p − n + 3/2) r_m² ] t + o(1)

as n → ∞. Reviewing the notation μ_n, σ_n and t = t_n = s/σ_n, the above indicates that

    log E exp{ s (log V_n)/σ_n } = log E e^{t log V_n} = (1/2)σ_n² t² + μ_n t + o(1) = s²/2 + μ_n s/σ_n + o(1)

as n → ∞ for all |s| < δ_0. This implies (5.11). The proof is completed. ∎

5.3 Proof of Theorem 2

LEMMA 5.6 (a theorem from Muirhead (1982)) Let p, n and W_n be as in (2.9). Then, under H_0 in (2.8),

    E W_n^t = [Γ_p((n − 1)/2 + t)/Γ_p((n − 1)/2)] · ∏_{i=1}^{k} [Γ_{p_i}((n − 1)/2)/Γ_{p_i}((n − 1)/2 + t)]   (5.14)

for any t > −1/2, where Γ_p(z) is as in (5.8).

Proof of Theorem 2. For convenience, set m = n − 1. Then we need to prove that

    (log W_n − μ_m)/σ_m converges in distribution to N(0, 1)   (5.15)

as n → ∞, where

    μ_m = −(p − m + 1/2) r_m² + Σ_{i=1}^{k} (p_i − m + 1/2) r_{m,i}²   and   σ_m² = 2[ r_m² − Σ_{i=1}^{k} r_{m,i}² ],

with r_m = (−log(1 − p/m))^{1/2} and r_{m,i} = (−log(1 − p_i/m))^{1/2}.

First, since m = n − 1 > p = p_1 + ... + p_k and lim_n p_i/n = y_i for each 1 ≤ i ≤ k, we know

    p/m = Σ_{i=1}^{k} p_i/m → Σ_{i=1}^{k} y_i =: y ∈ (0, 1]   (5.16)

as n → ∞. Second, it is known that ∏_{i=1}^{k} (1 − x_i) > 1 − Σ_{i=1}^{k} x_i for all x_i ∈ (0, 1), 1 ≤ i ≤ k; see, e.g., p. 60 from Hardy et al. (1952). Taking the logarithm on both sides and then taking x_i = p_i/m, we see that

    (1/2) σ_m² = r_m² − Σ_{i=1}^{k} r_{m,i}² = Σ_{i=1}^{k} log(1 − p_i/m) − log(1 − p/m) > 0

for all m. Now, by the assumptions and (5.16), it is easy to see

    lim σ_m² = 2[ Σ_{i=1}^{k} log(1 − y_i) − log(1 − y) ] if y < 1;   lim σ_m² = +∞ if y = 1.
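The positivity of σ_m² above reduces to the inequality ∏(1 − p_i/m) > 1 − p/m. A quick numerical check of (1/2)σ_m² = r_m² − Σ r_{m,i}² with k = 3 (the chosen m and p_i's below are illustrative, not from the paper's tables):

```python
import math

m = 100
parts = [2, 3, 5]          # p_1, p_2, p_3, so p = 10
p = sum(parts)

rm2 = -math.log(1.0 - p / m)                       # r_m^2
rmi2 = [-math.log(1.0 - pi / m) for pi in parts]   # r_{m,i}^2

# (1/2) * sigma_m^2 = r_m^2 - sum of r_{m,i}^2, positive whenever p_i < p
half_sigma2 = rm2 - sum(rmi2)
```

The quantity is strictly positive, as the Hardy et al. inequality guarantees, but small when the p_i's are small relative to m, which is consistent with σ_m² only diverging as p/m → 1.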

CENTRAL LIMIT THEOREMS FOR CLASSICAL LIKELIHOOD RATIO TESTS FOR HIGH-DIMENSIONAL NORMAL DISTRIBUTIONS. The Annals of Statistics, 2013, Vol. 41, No. 4, 2029-2074. DOI: 10.1214/13-AOS1134. Institute of Mathematical Statistics, 2013.


More information

Lecture 11. Multivariate Normal theory

Lecture 11. Multivariate Normal theory 10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances

More information

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

Hypothesis Testing. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA

Hypothesis Testing. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA Hypothesis Testing Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA An Example Mardia et al. (979, p. ) reprint data from Frets (9) giving the length and breadth (in

More information

Non-parametric Inference and Resampling

Non-parametric Inference and Resampling Non-parametric Inference and Resampling Exercises by David Wozabal (Last update. Juni 010) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend surfing

More information

MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA

MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA J. Japan Statist. Soc. Vol. 37 No. 1 2007 53 86 MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA M. S. Srivastava* In this article, we develop a multivariate theory for analyzing multivariate datasets

More information

Statistical Inference with Monotone Incomplete Multivariate Normal Data

Statistical Inference with Monotone Incomplete Multivariate Normal Data Statistical Inference with Monotone Incomplete Multivariate Normal Data p. 1/4 Statistical Inference with Monotone Incomplete Multivariate Normal Data This talk is based on joint work with my wonderful

More information

Stat 710: Mathematical Statistics Lecture 31

Stat 710: Mathematical Statistics Lecture 31 Stat 710: Mathematical Statistics Lecture 31 Jun Shao Department of Statistics University of Wisconsin Madison, WI 53706, USA Jun Shao (UW-Madison) Stat 710, Lecture 31 April 13, 2009 1 / 13 Lecture 31:

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

A Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when the Covariance Matrices are Unknown but Common

A Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when the Covariance Matrices are Unknown but Common Journal of Statistical Theory and Applications Volume 11, Number 1, 2012, pp. 23-45 ISSN 1538-7887 A Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when

More information

Chapter 6. Order Statistics and Quantiles. 6.1 Extreme Order Statistics

Chapter 6. Order Statistics and Quantiles. 6.1 Extreme Order Statistics Chapter 6 Order Statistics and Quantiles 61 Extreme Order Statistics Suppose we have a finite sample X 1,, X n Conditional on this sample, we define the values X 1),, X n) to be a permutation of X 1,,

More information

DISCUSSION OF INFLUENTIAL FEATURE PCA FOR HIGH DIMENSIONAL CLUSTERING. By T. Tony Cai and Linjun Zhang University of Pennsylvania

DISCUSSION OF INFLUENTIAL FEATURE PCA FOR HIGH DIMENSIONAL CLUSTERING. By T. Tony Cai and Linjun Zhang University of Pennsylvania Submitted to the Annals of Statistics DISCUSSION OF INFLUENTIAL FEATURE PCA FOR HIGH DIMENSIONAL CLUSTERING By T. Tony Cai and Linjun Zhang University of Pennsylvania We would like to congratulate the

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

1 Last time: least-squares problems

1 Last time: least-squares problems MATH Linear algebra (Fall 07) Lecture Last time: least-squares problems Definition. If A is an m n matrix and b R m, then a least-squares solution to the linear system Ax = b is a vector x R n such that

More information

Estimation of large dimensional sparse covariance matrices

Estimation of large dimensional sparse covariance matrices Estimation of large dimensional sparse covariance matrices Department of Statistics UC, Berkeley May 5, 2009 Sample covariance matrix and its eigenvalues Data: n p matrix X n (independent identically distributed)

More information

Department of Statistics

Department of Statistics Research Report Department of Statistics Research Report Department of Statistics No. 05: Testing in multivariate normal models with block circular covariance structures Yuli Liang Dietrich von Rosen Tatjana

More information

A new test for the proportionality of two large-dimensional covariance matrices. Citation Journal of Multivariate Analysis, 2014, v. 131, p.

A new test for the proportionality of two large-dimensional covariance matrices. Citation Journal of Multivariate Analysis, 2014, v. 131, p. Title A new test for the proportionality of two large-dimensional covariance matrices Authors) Liu, B; Xu, L; Zheng, S; Tian, G Citation Journal of Multivariate Analysis, 04, v. 3, p. 93-308 Issued Date

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

STA205 Probability: Week 8 R. Wolpert

STA205 Probability: Week 8 R. Wolpert INFINITE COIN-TOSS AND THE LAWS OF LARGE NUMBERS The traditional interpretation of the probability of an event E is its asymptotic frequency: the limit as n of the fraction of n repeated, similar, and

More information

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1 4 Hypothesis testing 4. Simple hypotheses A computer tries to distinguish between two sources of signals. Both sources emit independent signals with normally distributed intensity, the signals of the first

More information

Bayesian Nonparametric Point Estimation Under a Conjugate Prior

Bayesian Nonparametric Point Estimation Under a Conjugate Prior University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 5-15-2002 Bayesian Nonparametric Point Estimation Under a Conjugate Prior Xuefeng Li University of Pennsylvania Linda

More information

Large Sample Properties of Estimators in the Classical Linear Regression Model

Large Sample Properties of Estimators in the Classical Linear Regression Model Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in

More information

STATISTICS SYLLABUS UNIT I

STATISTICS SYLLABUS UNIT I STATISTICS SYLLABUS UNIT I (Probability Theory) Definition Classical and axiomatic approaches.laws of total and compound probability, conditional probability, Bayes Theorem. Random variable and its distribution

More information

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu

More information

Convergence in Distribution

Convergence in Distribution Convergence in Distribution Undergraduate version of central limit theorem: if X 1,..., X n are iid from a population with mean µ and standard deviation σ then n 1/2 ( X µ)/σ has approximately a normal

More information

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n Chapter 9 Hypothesis Testing 9.1 Wald, Rao, and Likelihood Ratio Tests Suppose we wish to test H 0 : θ = θ 0 against H 1 : θ θ 0. The likelihood-based results of Chapter 8 give rise to several possible

More information

STA 294: Stochastic Processes & Bayesian Nonparametrics

STA 294: Stochastic Processes & Bayesian Nonparametrics MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

More information

Large sample covariance matrices and the T 2 statistic

Large sample covariance matrices and the T 2 statistic Large sample covariance matrices and the T 2 statistic EURANDOM, the Netherlands Joint work with W. Zhou Outline 1 2 Basic setting Let {X ij }, i, j =, be i.i.d. r.v. Write n s j = (X 1j,, X pj ) T and

More information

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data Sri Lankan Journal of Applied Statistics (Special Issue) Modern Statistical Methodologies in the Cutting Edge of Science Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations

More information

Composite Hypotheses and Generalized Likelihood Ratio Tests

Composite Hypotheses and Generalized Likelihood Ratio Tests Composite Hypotheses and Generalized Likelihood Ratio Tests Rebecca Willett, 06 In many real world problems, it is difficult to precisely specify probability distributions. Our models for data may involve

More information

Simulating Uniform- and Triangular- Based Double Power Method Distributions

Simulating Uniform- and Triangular- Based Double Power Method Distributions Journal of Statistical and Econometric Methods, vol.6, no.1, 2017, 1-44 ISSN: 1792-6602 (print), 1792-6939 (online) Scienpress Ltd, 2017 Simulating Uniform- and Triangular- Based Double Power Method Distributions

More information

Probability and Statistics Notes

Probability and Statistics Notes Probability and Statistics Notes Chapter Seven Jesse Crawford Department of Mathematics Tarleton State University Spring 2011 (Tarleton State University) Chapter Seven Notes Spring 2011 1 / 42 Outline

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function

More information

High-dimensional covariance estimation based on Gaussian graphical models

High-dimensional covariance estimation based on Gaussian graphical models High-dimensional covariance estimation based on Gaussian graphical models Shuheng Zhou Department of Statistics, The University of Michigan, Ann Arbor IMA workshop on High Dimensional Phenomena Sept. 26,

More information

Tube formula approach to testing multivariate normality and testing uniformity on the sphere

Tube formula approach to testing multivariate normality and testing uniformity on the sphere Tube formula approach to testing multivariate normality and testing uniformity on the sphere Akimichi Takemura 1 Satoshi Kuriki 2 1 University of Tokyo 2 Institute of Statistical Mathematics December 11,

More information

THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2008, Mr. Ruey S. Tsay. Solutions to Final Exam

THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2008, Mr. Ruey S. Tsay. Solutions to Final Exam THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2008, Mr. Ruey S. Tsay Solutions to Final Exam 1. (13 pts) Consider the monthly log returns, in percentages, of five

More information

High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi

High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi Faculty of Science and Engineering, Chuo University, Kasuga, Bunkyo-ku,

More information

c 2005 Society for Industrial and Applied Mathematics

c 2005 Society for Industrial and Applied Mathematics SIAM J. MATRIX ANAL. APPL. Vol. XX, No. X, pp. XX XX c 005 Society for Industrial and Applied Mathematics DISTRIBUTIONS OF THE EXTREME EIGENVALUES OF THE COMPLEX JACOBI RANDOM MATRIX ENSEMBLE PLAMEN KOEV

More information

Multivariate Statistical Analysis

Multivariate Statistical Analysis Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 9 for Applied Multivariate Analysis Outline Addressing ourliers 1 Addressing ourliers 2 Outliers in Multivariate samples (1) For

More information

Chapter 4. Theory of Tests. 4.1 Introduction

Chapter 4. Theory of Tests. 4.1 Introduction Chapter 4 Theory of Tests 4.1 Introduction Parametric model: (X, B X, P θ ), P θ P = {P θ θ Θ} where Θ = H 0 +H 1 X = K +A : K: critical region = rejection region / A: acceptance region A decision rule

More information

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS Communications in Statistics - Simulation and Computation 33 (2004) 431-446 COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS K. Krishnamoorthy and Yong Lu Department

More information

TEST FOR INDEPENDENCE OF THE VARIABLES WITH MISSING ELEMENTS IN ONE AND THE SAME COLUMN OF THE EMPIRICAL CORRELATION MATRIX.

TEST FOR INDEPENDENCE OF THE VARIABLES WITH MISSING ELEMENTS IN ONE AND THE SAME COLUMN OF THE EMPIRICAL CORRELATION MATRIX. Serdica Math J 34 (008, 509 530 TEST FOR INDEPENDENCE OF THE VARIABLES WITH MISSING ELEMENTS IN ONE AND THE SAME COLUMN OF THE EMPIRICAL CORRELATION MATRIX Evelina Veleva Communicated by N Yanev Abstract

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality.

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality. 88 Chapter 5 Distribution Theory In this chapter, we summarize the distributions related to the normal distribution that occur in linear models. Before turning to this general problem that assumes normal

More information

. Find E(V ) and var(v ).

. Find E(V ) and var(v ). Math 6382/6383: Probability Models and Mathematical Statistics Sample Preliminary Exam Questions 1. A person tosses a fair coin until she obtains 2 heads in a row. She then tosses a fair die the same number

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

STA 2101/442 Assignment 3 1

STA 2101/442 Assignment 3 1 STA 2101/442 Assignment 3 1 These questions are practice for the midterm and final exam, and are not to be handed in. 1. Suppose X 1,..., X n are a random sample from a distribution with mean µ and variance

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

Canonical Correlation Analysis of Longitudinal Data

Canonical Correlation Analysis of Longitudinal Data Biometrics Section JSM 2008 Canonical Correlation Analysis of Longitudinal Data Jayesh Srivastava Dayanand N Naik Abstract Studying the relationship between two sets of variables is an important multivariate

More information

Multivariate Linear Models

Multivariate Linear Models Multivariate Linear Models Stanley Sawyer Washington University November 7, 2001 1. Introduction. Suppose that we have n observations, each of which has d components. For example, we may have d measurements

More information