A new test for the proportionality of two large-dimensional covariance matrices. Citation Journal of Multivariate Analysis, 2014, v. 131, p.

Size: px
Start display at page:

Download "A new test for the proportionality of two large-dimensional covariance matrices. Citation Journal of Multivariate Analysis, 2014, v. 131, p."

Transcription

1 Title A new test for the proportionality of two large-dimensional covariance matrices Authors) Liu, B; Xu, L; Zheng, S; Tian, G Citation Journal of Multivariate Analysis, 04, v. 3, p Issued Date 04 URL Rights NOTICE: this is the author s version of a work that was accepted for publication in Journal of Multivariate Analysis. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Journal of Multivariate Analysis, 04, v. 3, p DOI: 0.06/j.jmva ; This work is licensed under a Creative Commons Attribution-NonCommercial- NoDerivatives 4.0 International License.

2 A new test for the proportionality of two large-dimensional covariance matrices Baisen Liu a, Lin Xu b, Shurong Zheng b, Guo-Liang Tian c a School of Statistics, Dongbei University of Finance and Economics b KLAS and School of Mathematics and Statistics, Northeast Normal University c Department of Statistics and Actuarial Science, The University of Hong Kong Abstract Let x,..., x iid n + N p µ, Σ ) and y,..., y iid n + N p µ, Σ ) be two independent random samples, where n p < n. In this article, we propose a new test for the proportionality of two large p p covariance matrices Σ and Σ. By applying modern random matrix theory, we establish the asymptotic normality property for the proposed test statistic as p, n, n ) together with the ratios p/n y 0, ) and p/n y 0, ) under suitable conditions. We further showed that these conclusions are still valid if normal populations are replaced by general populations with finite fourth moments. Keywords: Covariance matrix, Hypothesis testing, Large-dimensional data, Limiting spectral distribution, Proportionality, Random F -matrices.. Introduction With the rapid development and wide applications of computer techniques, huge data can be collected and stored. This is called as high-dimensional data or large-dimensional data, see Bai and Silverstein []. Many traditional estimation and test tools are no more valid or perform badly for such large-dimensional data, since these traditional methods are often based on the classical central asymptotic theorems assuming a large sample size n and fixed dimension p. In this article, we consider testing proportionality of large-dimensional covariance matrices from two different populations. The proportionality of covariance Corresponding zhengsr@nenu.edu.cn Preprint submitted to Journal of Multivariate Analysis June 0, 04

3 matrices is the simplest form of heteroscedasticity between populations, which has extensive applications in economics, discriminations, etc. As an instance, consider a quantitative genetic experiment, called paternal half-sib design. This experiment is conducted under the hypothesis of equal heritabilities in the two populations, and it corresponds to the hypothesis of proportionality between population covariance matrices which we will discuss in this paper. The goal of this experiment is to model measurements of some quantitative traits in two independent populations of animal offsprings. More detailed description of this experiment can be found in Jensen and Madsen [0]. Other related examples are: Dargahi-Noubary [4] considered the applications of discrimination between two normal populations when covariance matrices are proportional. Nel and Groenewald [4] studied the multivariate Behrens-Fisher problem under the assumption of proportional covariance matrices. Later, Villa and Pérignon [0] investigated the sources of time variation in the covariance matrix of interest rates. In their work, they discussed the similarities among covariance matrices of bond yields including the cases of equality and proportionality. Let x,..., x iid n + N p µ, Σ ) and y,..., y iid n + N p µ, Σ ) be two independent random samples, where n p < n. To describe the proposed new test, we define N k n k + k, ) and consider the joint sufficient statistics: and x N N x i, ˆV x i x)x i x), N ȳ N N y j, ˆV y j ȳ)y j ȳ), ) N for µ, Σ ) and µ, Σ ), respectively, where A denotes the transpose of A. Thus, ˆΣ k ˆV k /n k is the maximum likelihood estimator of Σ i for k,. The null hypothesis of interest is H 0 : Σ σ Σ, ) where σ > 0 is an unknown constant. When σ, Wilks [] suggested a likelihood ratio LR) test statistic Λ N N log ˆV + N log ˆV N log ˆV + ˆV,

4 where N N + N. However, in practice, it is often to use a modified LR test statistic Λ N n log ˆV + n log ˆV n + n log ˆV + ˆV, due to its unbiasedness and monotonicity of the power function see, Srivastava, Khatri and Carter [8]; Sugiura and Nagao [9]). As the sample sizes N, N, the classical central limiting theorem states that under H 0 see [], Sec. 8.) Λ N D χ and Λ D pp+) N χ pp+). 3) Note that the limiting results in 3) are true only when the variable dimension p is assumed to be fixed and p minn, N ). With computer simulations, Bai et al. [] showed that, employing the χ approximation 3) for dimensions like 30 or 40, increases dramatically the size of the test. For dimension and sample sizes p, N, N ) 40, 800, 400), the test size equals.% instead of the nominal 5% level. The result is even worse for the case of p, N, N ) 80, 600, 800) which leads to a 49.5% test size. For the general case of hypothesis ) with unknown constant σ, to our knowledge, the first research was done by Federer [6] who developed a maximum likelihood method for two groups of normal populations with dimension p 3. Kim [] extensively studied the problem of proportionality between covariance matrices, and showed that the solution of the likelihood equations was unique. This result was later published by Guttman et al. [9]. Independently, Rao [5] considered the likelihood ratio test for proportionality of covariance matrices from two normal populations. As an extension, under normality assumptions, Eriksen [5] and Flury [8], independently, discussed maximum likelihood estimation of proportional covariance matrices and provided likelihood ratio tests for testing the hypothesis of proportionality. Furthermore, Schott [6] considered the test problem of proportional covariance matrices from k different groups and obtained a Wald statistic under general conditions without normal population assumptions. However, all of the above work is based on the classical approximating theory which assumes the dimension of data, p to be held fixed. As it will be shown, this classical approximation leads to a test size much higher than the nominal test level in the case of largedimensional data. All such problems and limitations bring up the need for further study on proportionality test of covariance matrices. 3

5 Based on the modern random matrix theory RMT), we propose a new test and further prove that the distribution of the test statistic proposed approximates to a normal distribution as p, n, n ) together with the ratios p/n y 0, ) and p/n y 0, ) under suitable conditions. The proposed new test is valid for both Gaussian data and non-gaussian data with finite fourth moments. The rest of the article is organized as follows. Section briefly reviews some preliminary and useful RMT results. In Section 3, we propose a new test and study its asymptotic normality as p, n, n ). In Section 4, the efficiency of the proposed test is illustrated by simulation studies. Section 5 presents conclusions and some discussions. Some technical derivations are put into the Appendix.. Review of some useful results of random matrix theory We first review several results from RMT, which will be used for the proposed test procedure. For any p p matrix M with real eigenvalues {λ M i } p, the empirical spectral distribution ESD) of M, denoted by Fn M, is defined by F M n x) p [λ M i x], x R. We will consider a random matrix M whose ESD Fn M converges in a sense to be precisely) to a limiting spectral distribution LSD) F M. To make statistical inference about a parameter θ fx)df M x), it is natural to use an estimator ˆθ fx)df M n x) p fλ M i ), which is a so-called linear spectral statistic LSS) of the random matrix M. Let {ξ ki C: i, k,,...} and {η kj C: j, k,,...} be two independent double arrays of i.i.d. complex variables with mean 0 and variance. Writing ξ i ξ i, ξ i,..., ξ pi ) and η j η j, η j,..., η pj ). For any given positive integers n and n, the vectors {ξ,..., ξ n } and {η,..., η n } can be thought as independent samples of size n and n, respectively, from some p-dimensional distribution. Let S and S be the associated sample covariance matrices, i.e., S n ξ i ξ i and S n η j η n n j. 4

6 Then, the following so-called F -matrix generalizes the classical Fisher-statistic for the present p-dimensional case, where n n, n ) and p < n. Define V n S S, 4) y n p n y 0, + ), y n p n y 0, ), 5) y n y n, y n ) and y y, y ). Under suitable moment conditions, the ESD, F Vn n, of V n has a LSD, F y x) which is given by see [], P. 79 or []) F y dx) g y x) [a,b] x)dx + /y ) [y >]δ 0 dx), 6) where δ c ) denotes the Dirac point measure at c, and h y + y y y, a g y x) h) + h), b y ) y ), y ) b x)x a), a < x < b. πxy + y x) Define the empirical process G n p[fn Vn F yn ] and let à be the set of analytic functions in an open region in the complex plane containing the interval [a, b] which is the support of continuous part of the LSD F y defined in 6). The following central limit theorem CLT) of LSS for a high-dimensional F-matrix was established in Zheng [], which will be applied in our work. Theorem. For each p, assume that ξ ij : i,..., p; j,..., n ) and η ij : i,..., p; j,..., n ) are i.i.d. and satisfy Eξ ) Eη ) 0, E ξ E η, E ξ 4, E η 4 <, y n p/n y 0, + ), and y n p/n y 0, ). Furthermore, let f,..., f k Ã. Then, the random vector G n f ),..., G n f k )) weakly converges to a k-dimensional Gaussian 5

7 vector with mean vector mf j ) lim r 4πi ζ + β x y y ) πi h + β y y y ) πi h [ f j zζ)) ζ ζ ζ r + ζ + r ζ + y /hr f j zζ)) ζ + y /h) dζ 3 ] dζ ζ + /h f j zζ)) dζ, 7) ζ + y /h) 3 and covariance function covf j, f l ) f j zζ ))f l zζ )) lim dζ r π ζ ζ ζ rζ ) dζ β xy + β y y ) y ) f j zζ )) 4π h ζ ζ + y /h) dζ f l zζ )) ζ ζ + y /h) dζ, 8) where zζ) y ) [+h +hreζ)], h y + y y y, β x E ξ 4 3, and β y E η 4 3. This CLT allows for both n n, n ) and p approaching infinity. In the next section, based on the above CLT, we will develop a test statistic T n and provide the limiting distribution of T n. 3. Formulation of the new test Let Σ / and Σ / be two p p positive definite matrices such that Σ Σ / ) and Σ Σ / ), respectively. Define ξ i Σ / x i µ ) and η j Σ / y j µ ). Note that all the random vectors {x i }, {y j } and the covariance matrices Σ and Σ depend on the dimension p. However, we do not signify this dependence in notation for ease of statements. After standardizing, the arrays {ξ i } n + and {η j } n + contain i.i.d. variables with mean 0 and variance, for which we can apply Theorem.. Testing the null hypothesis specified by ) is equivalent to testing H 0 : Σ / Σ Σ / σ I p. 9) 6

8 This is, in fact, a sphericity test. Srivastava [7] discussed this sphericity test for a single population and proposed a measure of this sphericity. As stated in Srivastava [7] also see [7]), the above test is invariant for an orthogonal transformation x Gx. This test is also invariant under a scalar transformation x cx. Thus, without loss of generality, we assume that Σ / Σ Σ / diagλ,..., λ p ). It follows from the Cauchy Schwarz inequality that ) λ i p with equality holding if and only if λ λ p λ, for all i,..., p and some constant λ. Thus, following the idea of Srivastava [7], we suggest testing H 0 : Ψ against H : Ψ > with λ i Ψ p λ i /p) p λ i/p). Note that trσ Σ ) p λ i and trσ Σ Σ Σ ) p λ i ; hence, we propose the following test statistic tr ) T n p tr ˆΣ ˆΣ ˆΣ ˆΣ ) [tr ˆΣ p, 0) ˆΣ )] where ˆΣ and ˆΣ are unbiased sample covariance matrices. On the other hand, a scaled distance measure between σ Σ / Σ Σ / and I p can be defined by ) σ Σ / Σ Σ / /p)trσ Σ / Σ Σ / ) I p, p trσ Σ Σ Σ ) [trσ Σ )] p. This scaled distance measure was also discussed by Nagao [3] for the sphericity test of a single population. Thus, it is reasonable to consider the test statistic 0). Define A Σ / CΣ / and B Σ / DΣ /, where C N x i µ )x i µ ) and D N y j µ )y j µ ). n n 7

9 Note that V n AB indeed forms a random F-matrix. Then, we define T n p trcd CD ) [trcd )] p. ) As T n and T n have the same asymptotic distribution and CLT under H 0, in the next, we will consider the CLT of T n. Theorem 3. Suppose that the conditions in Theorem. hold under H 0 and T n is defined in 0), then, under H 0 and as n, n ), with where h y + y y y. v / T n [T n µ Tn ph y ) D ] N0, ), ) µ Tn h + y ) y ) + β x y + β y y, 3) v Tn 4h h + y ) y ) 4, 4) PROOF. For convenience, we define g x) x and g x) x. Let {λ i } p be the eigenvalues of the F-matrix V n. Following Theorem., we have, as n, n ), p /p) p λ j/σ ) g x)f y dx) /p) p λ j/σ D Nµ CLT, Σ CLT ), g x)f y dx) where mg ) ) vg, g, Σ CLT ) vg, g ) vg, g ) vg, g ) ) µ CLT mg ), and F y dx) is defined in 6). Denote δξ) y ) + h + hreξ), then the components of µ CLT are 8

10 given by mg j ) lim r 4πi ξ + β x y y ) πi h + β y y y ) πih g j δξ)) ξ ξ ξ r + ξ + r ξ + y /h ) g j δξ)) ξ + y /h) 3 ξ + /h g j δξ)), j,, ξ + y /h) 3 and the elements of Σ CLT are given by g k δξ ))g l δξ )) vg k, g l ) lim r π ξ ξ ξ rξ ) β xy + β y y ) y ) g k δξ )) 4π h ξ ξ + y /h) g l δξ )) ξ + y /h), k, l {, }, ξ with β x E ξ 4 3 and β y E η 4 3. By the Residue theorem, we have mg ) h y + y + h ) + β xy + 3β y y + h β y y y ) 4 y ) y ), 5) 3 y mg ) y ) + β yy y ), 6) vg, g ) 4h h 4 + 5h + ) y ) 8 + 4β xy + β y y ) + h y ) y ) 6, 7) vg, g ) 4h + h ) y ) 6 + β xy + β y y ) + h y ) y ) 4, 8) vg, g ) h y ) 4 + β xy + β y y y ). 9) Moreover, let b 0 0 x F y dx) and b 0 xf y dx). Then, we have b 0 + h y y ) 3, b y. 0) 9

11 The details of the above derivations are postponed to the Appendix. Define gx, x ) x /x and a 0 /b, b 0 /b 3 ). Applying the Delta method see Casella and Berger [3]), we obtain, as n, n ), [ /p) p p λ ] j/σ ) /p ) p λ j/σ ) b 0 D N a 0µ CLT, a 0Σ ) CLT a b 0, that is T n pb 0 /b ) D N a 0µ CLT, a 0Σ ) CLT a 0. ) Furthermore, substituting the equations 5) 0) into ) and after some manipulations, we obtain T n ph y ) D Nµ Tn, v Tn ), where µ Tn and v Tn are given in 3) and 4). This is equivalent to the statement of ). The proof is completed. Theorem 3. Under the assumptions of Theorem 3. and when Σ and Σ are bounded, β x and β y have consistent estimators ˆβ x and ˆβ y given by ˆβ x and ˆβ y pn ) pn ) N N x ix i N N p k y i y i N N p x ix i ) pn N ) N k N [ N )] x ix i N N x jx j ) x ik N N x jk y i y i ) pn N ) N N where x i x i,..., x ip ) and y i y i,..., y ip ). [ N )] y i y i N N y jy j ) y ik N N y jk 0

12 PROOF. We have x X,..., X p ) Σ ξ, where Σ a ij ), Σ σ kk ) and ξ ξ,..., ξ p ). Hence, EX 4 k) Ea k ξ + + a kp ξ p ) 4 EX kx l ) EX + + X p) where a 4 kj Eξ 4 ) + akj), a kja lj Eξ 4 ) + EX kx l ) a kj EXi 4 ) + k l ) a kj Eξ 4 ) + k ) k a lj ), akj), EX + + X p) Ex x), σ kk k a kj, a kj k σ kk. k That is, p Ex x) p ) σkk Eξ 4 ) + σ kk. p When Σ is bounded, then we have E ˆσ kk σ kk E ˆσ kk + σ kk ˆσ kk σ kk Eˆσ kk + σ kk ) Eˆσ kk σ kk ) o) uniformly for k. Then we have E ˆσ kk p p k k σ kk p E ˆσ kk σkk o). k

13 That is p ˆσ kk p k Let the estimator of p Ex x) p N p N k σ kk P 0. p σ kk) be x ix i ) x ix i x jx j i<j p N ) pn ) x ix i ) i pn N ) ) x ix i, i where pn ) x ix i ) i ) x pn N ) ix i i pn ) x ix i trσ + trσ) i N pn ) trσ) + p trσ N + pn ) x ix i trσ), i x ix i trσ) ) x pn N ) ix i ntrσ + ntrσ i N pn ) trσ) + p trσ x N ix i trσ) i i + [ N x ix i trσ)]. pn N )

14 Then, we have P 0. pn ) x ix i ) i x pn ix i trσ) i ) ) Ex x x) pn N ) ix i trσ) p p i i j x ix i trσ) x jx j trσ)] pn N ) Ex x trσ) p x ix x pn ix i trσ) Ex x trσ) i trσ) x jx j trσ)] i j p pn N ) i Then we obtain the consistent estimator of Eξ 4 ) as follows N pn ) p N x ix i ) x ix i ) pn N ) p k N +, N x ik ) where x i x i,..., x ip ). When Ex) is unknown, we also obtain the consistent estimators of Eξ) as follows: pn ) N x ix i N N p k x ix i ) pn N ) N N [ N )] x ix i N N x jx j ) +. x ik N N x jk 3

15 Then, the consistent estimator of β x Eξ 4 ) 3 is as follows ˆβ x pn ) N x ix i N N p k x ix i ) pn N ) N N [ N )] x ix i N N x jx j ). x ik N N x jk The derivation of ˆβ y is similar. The proof is completed. 4. Simulation studies 4.. Large-dimensional Gaussian data In this section, we perform some simulation studies to evaluate the efficiency of the new test LZ test) proposed in Section 3 for testing hypothesis ) with large-dimensional Gaussian data. For comparison, we also conduct the Bartlett test Bartlett adjusted likelihood ratio test, see Eriksen [5] or Flury [8]) and the Wald test see Schott [6]). We generate x,..., x n + from N p 0, σ I p ) and y,..., y n + from N p 0, Σ ) with Σ ρ i j ) for σ 0.5,.0,.0 and ρ 0.0, 0.4, 0.8, respectively. We intend to test the null hypothesis: H 0 : Σ σ Σ. Notice that ρ 0 corresponds to the realized size Type-I error) of the test. The nominal test level is set as α We simulate 0,000 independent replicates for different values of p, n, n ) under the assumption of p < n and compare the above three test. The simulation results are summarized in Table. It needs to point out that the proposed new test allows for p n, which means that the observations from one of two groups can be fewer than the dimension. Table 4 displays the summary of Gaussian data as well as non-gaussian data for the case of p > n. From Table, we can see that, as the dimension p increases, both Wald test and Bartlett test lead to an increasing test size while the proposed LZ test is so stable. Especially, when the dimension p > 600, both Wald test and Bartlett test are so worse which almost always reject the null hypothesis; in contrast, our proposed test remains even accurate. Figure shows the QQ-plots for the,000 observed values of the LZ test T n with ρ 0.0. The values of p, n, n ) are chosen as 30, 640, 640) and 500, 50, 000). In both cases, the normality result appears to be satisfied by the 4

16 Normal Q Q Plot Normal Q Q Plot Sample Quantiles Sample Quantiles Theoretical Quantiles Theoretical Quantiles Figure : Normal QQ-Plot for T n as in 0) under H 0 based on 000 replicates. The left is for the case of p < n ; the right is for the case of p n. QQ-plots for large p, n, n ) which validates our theoretical asymptotic normality results. We also check the QQ-plots of T n under the alternative hypothesis i.e. ρ > 0). The similar normality property appears to be satisfied, but for the restriction of space, we ignore the QQ-plots at here. 4.. Large-dimensional non-gaussian data In fact, Theorem 3. is valid for general population distributions with finite fourth moments. To see this, two distributions are considered. One is the gamma distribution. We generate x i σz ) i, i,..., n + and y j Γz ) j, j,..., n + with z ) i Z ) i,..., Z) ip ) and z ) j Z ) j,..., Z) jp ) consisting of identical and independent standardized gamma4, 0.5) random variables so that they have mean 0 and variance, and σ σ ) / and Γ p p ρ i j ) / for σ 0.5,.0,.0 and ρ 0.0, 0.4, 0.8, respectively. The nominal test level is set as α We simulate 0,000 independent replications for different values of p, n, n ) and compute the estimated significance level of the LZ test. The simulation Results are summarized in Table. Another multivariate distribution considered is a mixture of multivariate normal distributions. We generate z ) i, i,..., n + and z ) j, j,..., n + with each component is i.i.d. from 0.9N0, ) + 0.N0, 3 ). Let x i σz ) i and 5

17 y j Γz ) j with the same values of σ and Γ as in the case of gamma distribution. The simulation results are summarized in Table 3. From Tables and 3, we can see that, our proposed test is still valid for non- Gaussian data. We also studied the QQ-plots of T n based on,000 replicates under the null and alternative hypotheses for the gamma data and mixture of multivariate normal data. The normality property appears to be satisfied by the QQ-plots for large p, n, n ) More simulations on the LZ test In this section, we continue the simulations of Section 4. and 4. but with another different set of values for p, n, n ). We set p 0, 5, 50, 00, n p, and n p + 5, p + 0,.... The results are displayed in Table 5 and Table 6. Obviously, when p is too small or when p/n is close to, the LZ test is not so efficient. Therefore, we recommend the practitioners to be careful to use the LZ test when p is small or when p/n is close to. 5. Conclusions and future work In this article, we developed a new test for comparing the proportionality of large-dimensional covariance matrices from two different population with finite fourth-moments. Its asymptotic normality is established based on modern random matrix theory as the dimension increases proportionally with the sample sizes. Simulation studies show that our proposed method is valid for both Gaussian data and non-gaussian data. Our proposed method requires that p < n but allows for p n. Generalizing our proposed method to the case of p maxn, n ) will be the future work. 6. Appendix Derivations of 4) Notice that 0 < y, y h given r >, we have ξ <. Applying Cauchy integration formula, for any ) ξ r + ξ + r 0. ) ξ + y /h 6

18 We know that Reξ ξ + ξ)/ and ξ ξ for any ξ with ξ. Then for g x) x and δξ) y ) + h + hreξ), in the formula ) mg ) lim g δξ)) r 4πi ξ ξ r + ξ + r ξ + y /h + β x y y ) g πi h δξ)) ξ + y /h) 3 + β y y y ) πi h ξ ξ ξ + /h g δξ)) ξ + y /h) 3 we have lim g δξ)) r ξ { lim y ) 4 + ξ r ξ ξ r + ξ + r [h + + h + hξ) ] ξ [ h ξ + h + h ) ξ ξ + y /h ) ξ r + ξ + r ] ξ r + ξ + r ξ + y /h πi y ) [h y 4 + y + h )) ] 4πi y ) 4 [h y + y + h )], ) ξ + y /h } { g δξ)) ξ + y /h) h + + h + hξ) 3 y ) 4 [ ξ ξ + y /h) 3 h + ξ ξ + h + ] } h ) ξ ξ + y /h) [ ] 3 h πi y ) h πi y ), 4 ) 7

19 and ξ { ξ + /h g δξ)) ξ + y /h) [h + + h + hξ) ξ + /h) ] 3 y ) 4 ξ ξ + y /h) [ 3 h + ξ ξ + h + ] } h ) ξ + /h) ξ ξ + y /h) [ ] 3 h3 + h 3y ) πi + 0 y ) 4 Thus, we obtain πi h3 + h 3y ) y ) 4. mg ) h y + y + h ) y ) 4 + β xy + 3β y y y ) + h β y y y ) 3. Derivations of 5) For g x) x and δξ) y ) + h + hreξ), in the formula ) mg ) lim g δξ)) r 4πi ξ ξ r + ξ + r ξ + y /h + β x y y ) g πi h δξ)) ξ ξ + y /h) 3 + β y y y ) ξ + /h g δξ)) πi h ξ ξ + y /h) 3 lim + h + hξ + hξ ) r 4πi y ) ξ + β x y + h + hξ + hξ ) πi h ξ ξ + y /h) 3 β y y + + h + hξ + hξ ξ + /h ) πi h y ) ξ ξ + y /h) 3 4πi y ) πi y + 0) + β xy 0 + h y y ) + y β y y ). ξ r + ξ + r ξ + y /h β y y πi h + 0) πi h y ) ) 8

20 Derivations of 6) For g x) x and δξ) y ) + h + hreξ), in the formula, g δξ ))g δξ )) vg, g ) lim r π ξ ξ ξ rξ ) β [ ] xy + β y y ) y ) g δξ )) 4π h ξ + y /h). we have lim r ξ and ξ lim r y ) 8 lim r y ) 8 lim r y ) 8 ξ πi h y ) 8 g δξ ))g δξ )) ξ rξ ) ξ ξ ξ ξ 4π h h 4 + 5h + ) y ) 8, g δξ )) ξ + y /h) y ) 4 ξ [ y ) 4 ξ πi y ) [h + 4 h y ) + 0] πi y ) h + 4 h y ), ξ + h + hξ + hξ ) [ + h + hξ + hξ ) [ ξ ξ + h + hξ + hξ ] ) ξ rξ ) + h + hξ + hξ ) r ξ ξ /r) + h + hξ + hξ ) πi h + h + hξ /r)r [ + h + hξ ) 3 + h + h + hξ ) + h + ] h + hξ ) ξ ξ + h + hξ + hξ ) ξ + y /h) h + + h + hξ ) ξ + y /h) + ξ ] h + h )ξ + h ξ ] ξ + y /h) 9

21 So, we derived that vg, g ) 4h h 4 + 5h + ) y ) 8 + 4β xy + β y y ) + h y ) y ) 6. Derivations of 7) For g x) x and g x) x, we have vg, g ) [ lim g r π δξ )) ξ β xy + β y y ) y ) 4π h Now, we can derive that [ lim g δξ )) r ξ lim r y ) 6 πi lim r y ) 6 and ξ lim r πi h y ) 6 ξ ξ ξ ξ 4π h + h ) y ) 6, g δξ )) ξ + y /h) So, we derived that ξ ξ ] g δξ )) r ξ ξ /r) + h + hξ + hξ ) [ ] g δξ )) r ξ ξ /r) g δξ )) ξ + y /h) g δξ )) ξ + y /h). ξ + h + hξ + hξ ) h + 0) [ h ξ + h + h ) ξ y ) ξ ] ξ + h + hξ + hξ ] r ξ ξ /r) + h + hξ + hξ ξ + y /h) πi h y ). vg, g ) 4h + h ) y ) 6 + β xy + β y y ) + h y ) y ) 4. 0

22 Derivations of 8) For g x) x, we have [ vg, g ) lim g r π δξ )) ξ β [ xy + β y y ) y ) 4π h Since that [ lim g δξ )) r ξ lim r y ) 4 lim r lim r πi y ) 4 πi h y ) 6 4π h y ) 4, ξ ξ ξ ξ ξ ξ ] g δξ )) r ξ ξ /r) [ + h + hξ + hξ ) ] g δξ )) r ξ ξ /r) ] g δξ )) ξ + y /h). ξ + h + hξ + hξ ) h + 0) [ h ξ + h + h ) ξ ] + h + hξ + hξ ] r ξ ξ /r) Then, we have vg, g ) h y ) 4 + β xy + β y y y ). Derivations of 9) Following the definition of F y dx) and using the substitution x y ) + h hcosθ), 0 < θ < π, we have b x)x a) hsinθ hsinθ, dx y ) y ) dθ, x + h hcosθ y ), y + y x h + y hy cosθ y ).

23 Therefore, b 0 b x y ) b x)x a) dx a πxy + y x) 4h π + h hcosθ)sin θ dθ π y ) 3 0 h + y hy cosθ 4h π + h hcosθ)sin θ dθ 4π y ) 3 0 h + y hy cosθ h π y ) 3 ih 4πy y ) 3 ih 4πy y ) 3 z + h hz + z ) h + y hy z + z ) z ) 4z z z z +h z + h )z ) dz z 3 z h +y hy z + ) z /h)z h)z ) z + ) dz z 3 z y /h)z h/y ) dz iz letting z eiθ ) Obviously, there are two poles inside the unit circle: 0 and y /h. Their corresponding residues are Hence, b 0 R0) + h + y)[h + y) y + h )] h y Ry /h) h y) y )y h ). h y ih 4πy y ) 3 πi [R0) + Ry /h)] + h y y ) 3.

24 On the other hand, b b x y ) b x)x a) dx a πxy + y x) 4h π sin θ π y ) 0 h + y hy cosθ dθ 4h π sin θ 4π y ) 0 h + y hy cosθ dθ h π y ) z h + y hy z + z ) z ) 4z h z ) 4πi y y ) z z z h +y hy z + ) dz h z ) 4πi y y ) z z h/y )z y /h) dz z dz iz letting z eiθ ) There are two poles inside the unit circle: 0 and y /h. Their corresponding residues are R0) y + h, Ry /h) y h. hy hy Hence, b Acknowledgments h 4πi y y ) πi [R0) + Ry /h)]. y The authors would like to thank the Editor, an Associate Editor and two referees for their helpful comments and useful suggestions. GL Tian s research was fully supported by a grant HKU 7790M) from the Research Grant Council of the Hong Kong Special Administrative Region. References [] Z.D. Bai, D.D. Jiang, J.F. Yao, S.R. Zheng, Corrections to LRT on largedimensional covariance matrix by RMT., Ann. Statist ) [] Z.D. Bai, J.W. Silverstein, Spectral analysis of large-dimensional random matrices, st ed., Science Press, Beijing, China.,

25 [3] G. Casella, R.L. Berger, Statistical Inference, nd ed., Duxbury Press., 00. [4] G.R. Dargahi-Noubary, An application of discrimination when covariance matrices are proportional, Austral. J. Statist. 3 98) [5] P.S. Eriksen, Proportionality of covariance matrices., Ann. Statist ) [6] W.T. Federer, Testing proportionality of covariance matrices, Ann. Math. Statist. 95) [7] T.J. Fisher, X.Q. Sun, C.M. Gallagher, A new test for sphericity of the covariance matrix for high dimensional data., J. Multivariate Anal. 0 00) [8] B.K. Flury, Proportionality of k covariance matrices., Statist. Prob. Lett ) [9] I. Guttman, D.Y. Kim, I. Olkin, Statistical inference for constants of proportionality, Technical Report 95, Department of Statistics, Stanford University, 983. [0] S.T. Jensen, J. Madsen, Estimation of proportional covariances in the presence of certain linear restrictions, Ann. Statist ) 9 3. [] D.Y. Kim, Statistical inference for constants of proportionality between covariance matrices, Technical Report 59, Department of Statistics, Stanford University, 97. [] R.J. Muirhead, Aspects of Multivariate Statistical Theory. st Edition., John Wiley & Son, New York, 98. [3] H. Nagao, On some test criteria for covariance matrix., Ann. Statist. 973) [4] D.G. Nel, P.C.N. Groenewald, A bayesian approach to the multivariate behrens-fisher problem under the assumption of proportional covariance matrices, Test 993) 4. [5] C.R. Rao, Likelihood ratio tests for relationships between two covariance matrices, in: S. Karlin, T. Amemiya, L.A. Goodman Eds.), Studies in Econometrics, Time Series and Multivariate Statistics, Academic Press, New York, 983, pp

26 [6] J.R. Schott, A test for proportional covariance matrices., Comput. Statist. Data Anal ) [7] M.S. Srivastava, Some tests concerning the covariance matrix in high dimensional data., J. Japan Statist. Soc ) 5 7. [8] M.S. Srivastava, C.G. Khatri, E.M. Carter, On monotonicity of the modified likelihood ratio test for the equality of two covariance., J. Multivariate Anal ) [9] N. Sugiura, H. Nagao, Unbiasedness of some test criteria for the equality of one or two covariance matrices., Ann. Math. Statist ) [0] C. Villa, C. Pérignon, Sources of time variation in the covariance matrix of interest rates, Journal of Business ) [] S.S. Wilks, Sample criteria for testing equality of means, equality of variances, and equality of covariances in a normal multivariate distribution, Ann. Math. Statist ) [] S.R. Zheng, Central limit theorems for linear spectral statistics of large dimensional F-matrices., Annales de l Institut Henri Poincaré Probabilités et Statistiques 48 0)

27 Table : Sizes and powers of the Wald test, Bartlett test, and proposed LZ test based on 0,000 independent replications using real Gaussian variables with y 0.5 and y 0.5. Test sizes are calculated under the null hypothesis H 0 : Σ σ Σ with ρ 0; the powers are estimated under the alternative hypothesis H : Σ σ Σ with ρ > 0, where Σ σ I p and Σ ρ i j ) p p. ρ p, n, n ) Wald Test Bartlett Test LZ Test σ , 80, 80) , 60, 60) , 30, 30) , 640, 640) , 80, 80) , 80, 80) , 60, 60) , 30, 30) , 640, 640) , 80, 80) , 80, 80) , 60, 60) , 30, 30) , 640, 640) , 80, 80)

28 Table : Sizes and powers of the proposed LZ test based on 0,000 independent replications using real Gamma variables with y 0.5 and y 0.5. Test sizes are calculated under the null hypothesis H 0 : Σ σ Σ with ρ 0; the powers are estimated under the alternative hypothesis H : Σ σ Σ with ρ > 0, where Σ σ I p and Σ ρ i j ) p p. p, n, n ) ρ 0.0 ρ 0.4 ρ 0.8 σ , 80, 80) , 60, 60) , 30, 30) , 640, 640) , 80, 80) Table 3: Sizes and powers of the proposed LZ test based on 0,000 independent replications using real mixture of multivariate normal distributions variables with y 0.5 and y 0.5. Test sizes are calculated under the null hypothesis H 0 : Σ σ Σ with ρ 0; the powers are estimated under the alternative hypothesis H : Σ σ Σ with ρ > 0, where Σ σ I p and Σ ρ i j ) p p. p, n, n ) ρ 0.0 ρ 0.4 ρ 0.8 σ , 80, 80) , 60, 60) , 30, 30) , 640, 640) , 80, 80)

29 Table 4: Sizes and powers of the proposed LZ test based on 0,000 independent replications with y 0 and y 0.5. Test sizes are calculated under the null hypothesis H 0 : Σ σ Σ with ρ 0; the powers are estimated under the alternative hypothesis H : Σ σ Σ with ρ > 0, where Σ σ I p and Σ ρ i j ) p p. Multivariate Gaussian data p, n, n ) ρ 0.0 ρ 0.4 ρ 0.8 σ , 5, 00) , 0, 00) , 0, 400) , 50, 000) , 80, 600) Multivariate gamma data p, n, n ) ρ 0.0 ρ 0.4 ρ 0.8 σ , 5, 00) , 0, 00) , 0, 400) , 50, 000) , 80, 600) Mixture of multivariate Gaussian data p, n, n ) ρ 0.0 ρ 0.4 ρ 0.8 σ , 5, 00) , 0, 00) , 0, 400) , 50, 000) , 80, 600)

30 Table 5: Sizes and powers of the proposed LZ test based on 0,000 independent replications using real Gaussian variables with n p and n p + 5, p + 0,.... Test sizes are calculated under the null hypothesis H0: Σ σ Σ with ρ 0; the powers are estimated under the alternative hypothesis H: Σ σ Σ with ρ > 0, where Σ σ Ip and Σ ρ i j )p p. ρ p n n p + k k

31 Table 6: Sizes and powers of the proposed LZ test based on 0,000 independent replications using Gamma variables data with n p and n p + 5, p + 0,.... Test sizes are calculated under the null hypothesis H0: Σ σ Σ with ρ 0; the powers are estimated under the alternative hypothesis H: Σ σ Σ with ρ > 0, where Σ σ Ip and Σ ρ i j )p p. ρ p n n p + k k

On corrections of classical multivariate tests for high-dimensional data

On corrections of classical multivariate tests for high-dimensional data On corrections of classical multivariate tests for high-dimensional data Jian-feng Yao with Zhidong Bai, Dandan Jiang, Shurong Zheng Overview Introduction High-dimensional data and new challenge in statistics

More information

On corrections of classical multivariate tests for high-dimensional data. Jian-feng. Yao Université de Rennes 1, IRMAR

On corrections of classical multivariate tests for high-dimensional data. Jian-feng. Yao Université de Rennes 1, IRMAR Introduction a two sample problem Marčenko-Pastur distributions and one-sample problems Random Fisher matrices and two-sample problems Testing cova On corrections of classical multivariate tests for high-dimensional

More information

TESTS FOR LARGE DIMENSIONAL COVARIANCE STRUCTURE BASED ON RAO S SCORE TEST

TESTS FOR LARGE DIMENSIONAL COVARIANCE STRUCTURE BASED ON RAO S SCORE TEST Submitted to the Annals of Statistics arxiv:. TESTS FOR LARGE DIMENSIONAL COVARIANCE STRUCTURE BASED ON RAO S SCORE TEST By Dandan Jiang Jilin University arxiv:151.398v1 [stat.me] 11 Oct 15 This paper

More information

Statistical Inference On the High-dimensional Gaussian Covarianc

Statistical Inference On the High-dimensional Gaussian Covarianc Statistical Inference On the High-dimensional Gaussian Covariance Matrix Department of Mathematical Sciences, Clemson University June 6, 2011 Outline Introduction Problem Setup Statistical Inference High-Dimensional

More information

Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions

Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions Tiefeng Jiang 1 and Fan Yang 1, University of Minnesota Abstract For random samples of size n obtained

More information

Testing Some Covariance Structures under a Growth Curve Model in High Dimension

Testing Some Covariance Structures under a Growth Curve Model in High Dimension Department of Mathematics Testing Some Covariance Structures under a Growth Curve Model in High Dimension Muni S. Srivastava and Martin Singull LiTH-MAT-R--2015/03--SE Department of Mathematics Linköping

More information

On testing the equality of mean vectors in high dimension

On testing the equality of mean vectors in high dimension ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 17, Number 1, June 2013 Available online at www.math.ut.ee/acta/ On testing the equality of mean vectors in high dimension Muni S.

More information

Large Sample Properties of Estimators in the Classical Linear Regression Model

Large Sample Properties of Estimators in the Classical Linear Regression Model Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in

More information

Statistica Sinica Preprint No: SS

Statistica Sinica Preprint No: SS Statistica Sinica Preprint No: SS-2017-0275 Title Testing homogeneity of high-dimensional covariance matrices Manuscript ID SS-2017-0275 URL http://www.stat.sinica.edu.tw/statistica/ DOI 10.5705/ss.202017.0275

More information

SHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan

SHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan A New Test on High-Dimensional Mean Vector Without Any Assumption on Population Covariance Matrix SHOTA KATAYAMA AND YUTAKA KANO Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama,

More information

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error Journal of Multivariate Analysis 00 (009) 305 3 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva Sphericity test in a GMANOVA MANOVA

More information

Random Matrices and Multivariate Statistical Analysis

Random Matrices and Multivariate Statistical Analysis Random Matrices and Multivariate Statistical Analysis Iain Johnstone, Statistics, Stanford imj@stanford.edu SEA 06@MIT p.1 Agenda Classical multivariate techniques Principal Component Analysis Canonical

More information

On singular values distribution of a matrix large auto-covariance in the ultra-dimensional regime. Title

On singular values distribution of a matrix large auto-covariance in the ultra-dimensional regime. Title itle On singular values distribution of a matrix large auto-covariance in the ultra-dimensional regime Authors Wang, Q; Yao, JJ Citation Random Matrices: heory and Applications, 205, v. 4, p. article no.

More information

HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX

HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX Submitted to the Annals of Statistics HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX By Shurong Zheng, Zhao Chen, Hengjian Cui and Runze Li Northeast Normal University, Fudan

More information

Modified Simes Critical Values Under Positive Dependence

Modified Simes Critical Values Under Positive Dependence Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia

More information

High-dimensional asymptotic expansions for the distributions of canonical correlations

High-dimensional asymptotic expansions for the distributions of canonical correlations Journal of Multivariate Analysis 100 2009) 231 242 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva High-dimensional asymptotic

More information

simple if it completely specifies the density of x

simple if it completely specifies the density of x 3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely

More information

Empirical Likelihood Tests for High-dimensional Data

Empirical Likelihood Tests for High-dimensional Data Empirical Likelihood Tests for High-dimensional Data Department of Statistics and Actuarial Science University of Waterloo, Canada ICSA - Canada Chapter 2013 Symposium Toronto, August 2-3, 2013 Based on

More information

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:. MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss

More information

Large sample covariance matrices and the T 2 statistic

Large sample covariance matrices and the T 2 statistic Large sample covariance matrices and the T 2 statistic EURANDOM, the Netherlands Joint work with W. Zhou Outline 1 2 Basic setting Let {X ij }, i, j =, be i.i.d. r.v. Write n s j = (X 1j,, X pj ) T and

More information

High-dimensional two-sample tests under strongly spiked eigenvalue models

High-dimensional two-sample tests under strongly spiked eigenvalue models 1 High-dimensional two-sample tests under strongly spiked eigenvalue models Makoto Aoshima and Kazuyoshi Yata University of Tsukuba Abstract: We consider a new two-sample test for high-dimensional data

More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

Testing Block-Diagonal Covariance Structure for High-Dimensional Data

Testing Block-Diagonal Covariance Structure for High-Dimensional Data Testing Block-Diagonal Covariance Structure for High-Dimensional Data MASASHI HYODO 1, NOBUMICHI SHUTOH 2, TAKAHIRO NISHIYAMA 3, AND TATJANA PAVLENKO 4 1 Department of Mathematical Sciences, Graduate School

More information

Tests for Covariance Matrices, particularly for High-dimensional Data

Tests for Covariance Matrices, particularly for High-dimensional Data M. Rauf Ahmad Tests for Covariance Matrices, particularly for High-dimensional Data Technical Report Number 09, 200 Department of Statistics University of Munich http://www.stat.uni-muenchen.de Tests for

More information

Lecture 3. Inference about multivariate normal distribution

Lecture 3. Inference about multivariate normal distribution Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates

More information

5 Operations on Multiple Random Variables

5 Operations on Multiple Random Variables EE360 Random Signal analysis Chapter 5: Operations on Multiple Random Variables 5 Operations on Multiple Random Variables Expected value of a function of r.v. s Two r.v. s: ḡ = E[g(X, Y )] = g(x, y)f X,Y

More information

PATTERNED RANDOM MATRICES

PATTERNED RANDOM MATRICES PATTERNED RANDOM MATRICES ARUP BOSE Indian Statistical Institute Kolkata November 15, 2011 Introduction Patterned random matrices. Limiting spectral distribution. Extensions (joint convergence, free limit,

More information

2. Matrix Algebra and Random Vectors

2. Matrix Algebra and Random Vectors 2. Matrix Algebra and Random Vectors 2.1 Introduction Multivariate data can be conveniently display as array of numbers. In general, a rectangular array of numbers with, for instance, n rows and p columns

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

Estimation of the Global Minimum Variance Portfolio in High Dimensions

Estimation of the Global Minimum Variance Portfolio in High Dimensions Estimation of the Global Minimum Variance Portfolio in High Dimensions Taras Bodnar, Nestor Parolya and Wolfgang Schmid 07.FEBRUARY 2014 1 / 25 Outline Introduction Random Matrix Theory: Preliminary Results

More information

High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi

High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi Faculty of Science and Engineering, Chuo University, Kasuga, Bunkyo-ku,

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

Jackknife Empirical Likelihood Test for Equality of Two High Dimensional Means

Jackknife Empirical Likelihood Test for Equality of Two High Dimensional Means Jackknife Empirical Likelihood est for Equality of wo High Dimensional Means Ruodu Wang, Liang Peng and Yongcheng Qi 2 Abstract It has been a long history to test the equality of two multivariate means.

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS Communications in Statistics - Simulation and Computation 33 (2004) 431-446 COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS K. Krishnamoorthy and Yong Lu Department

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

Asymptotic efficiency of simple decisions for the compound decision problem

Asymptotic efficiency of simple decisions for the compound decision problem Asymptotic efficiency of simple decisions for the compound decision problem Eitan Greenshtein and Ya acov Ritov Department of Statistical Sciences Duke University Durham, NC 27708-0251, USA e-mail: eitan.greenshtein@gmail.com

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

Applied Multivariate and Longitudinal Data Analysis

Applied Multivariate and Longitudinal Data Analysis Applied Multivariate and Longitudinal Data Analysis Chapter 2: Inference about the mean vector(s) Ana-Maria Staicu SAS Hall 5220; 919-515-0644; astaicu@ncsu.edu 1 In this chapter we will discuss inference

More information

STAT 512 sp 2018 Summary Sheet

STAT 512 sp 2018 Summary Sheet STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}

More information

On the Testing and Estimation of High- Dimensional Covariance Matrices

On the Testing and Estimation of High- Dimensional Covariance Matrices Clemson University TigerPrints All Dissertations Dissertations 1-009 On the Testing and Estimation of High- Dimensional Covariance Matrices Thomas Fisher Clemson University, thomas.j.fisher@gmail.com Follow

More information

Testing High-Dimensional Covariance Matrices under the Elliptical Distribution and Beyond

Testing High-Dimensional Covariance Matrices under the Elliptical Distribution and Beyond Testing High-Dimensional Covariance Matrices under the Elliptical Distribution and Beyond arxiv:1707.04010v2 [math.st] 17 Apr 2018 Xinxin Yang Department of ISOM, Hong Kong University of Science and Technology

More information

Asymptotic Statistics-VI. Changliang Zou

Asymptotic Statistics-VI. Changliang Zou Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous

More information

parameter space Θ, depending only on X, such that Note: it is not θ that is random, but the set C(X).

parameter space Θ, depending only on X, such that Note: it is not θ that is random, but the set C(X). 4. Interval estimation The goal for interval estimation is to specify the accurary of an estimate. A 1 α confidence set for a parameter θ is a set C(X) in the parameter space Θ, depending only on X, such

More information

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality.

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality. 88 Chapter 5 Distribution Theory In this chapter, we summarize the distributions related to the normal distribution that occur in linear models. Before turning to this general problem that assumes normal

More information

EXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME. Xavier Mestre 1, Pascal Vallet 2

EXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME. Xavier Mestre 1, Pascal Vallet 2 EXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME Xavier Mestre, Pascal Vallet 2 Centre Tecnològic de Telecomunicacions de Catalunya, Castelldefels, Barcelona (Spain) 2 Institut

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Advanced Econometrics I

Advanced Econometrics I Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics

More information

Quick Review on Linear Multiple Regression

Quick Review on Linear Multiple Regression Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,

More information

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015 Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)

More information

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Takahiro Nishiyama a,, Masashi Hyodo b, Takashi Seo a, Tatjana Pavlenko c a Department of Mathematical

More information

Research Article The Laplace Likelihood Ratio Test for Heteroscedasticity

Research Article The Laplace Likelihood Ratio Test for Heteroscedasticity International Mathematics and Mathematical Sciences Volume 2011, Article ID 249564, 7 pages doi:10.1155/2011/249564 Research Article The Laplace Likelihood Ratio Test for Heteroscedasticity J. Martin van

More information

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) 1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

Bootstrapping high dimensional vector: interplay between dependence and dimensionality

Bootstrapping high dimensional vector: interplay between dependence and dimensionality Bootstrapping high dimensional vector: interplay between dependence and dimensionality Xianyang Zhang Joint work with Guang Cheng University of Missouri-Columbia LDHD: Transition Workshop, 2014 Xianyang

More information

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 71. Decide in each case whether the hypothesis is simple

More information

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30 MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD Copyright c 2012 (Iowa State University) Statistics 511 1 / 30 INFORMATION CRITERIA Akaike s Information criterion is given by AIC = 2l(ˆθ) + 2k, where l(ˆθ)

More information

MULTIVARIATE PROBABILITY DISTRIBUTIONS

MULTIVARIATE PROBABILITY DISTRIBUTIONS MULTIVARIATE PROBABILITY DISTRIBUTIONS. PRELIMINARIES.. Example. Consider an experiment that consists of tossing a die and a coin at the same time. We can consider a number of random variables defined

More information

Lecture 15. Hypothesis testing in the linear model

Lecture 15. Hypothesis testing in the linear model 14. Lecture 15. Hypothesis testing in the linear model Lecture 15. Hypothesis testing in the linear model 1 (1 1) Preliminary lemma 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma

More information

Understanding Regressions with Observations Collected at High Frequency over Long Span

Understanding Regressions with Observations Collected at High Frequency over Long Span Understanding Regressions with Observations Collected at High Frequency over Long Span Yoosoon Chang Department of Economics, Indiana University Joon Y. Park Department of Economics, Indiana University

More information

NEW APPROXIMATE INFERENTIAL METHODS FOR THE RELIABILITY PARAMETER IN A STRESS-STRENGTH MODEL: THE NORMAL CASE

NEW APPROXIMATE INFERENTIAL METHODS FOR THE RELIABILITY PARAMETER IN A STRESS-STRENGTH MODEL: THE NORMAL CASE Communications in Statistics-Theory and Methods 33 (4) 1715-1731 NEW APPROXIMATE INFERENTIAL METODS FOR TE RELIABILITY PARAMETER IN A STRESS-STRENGT MODEL: TE NORMAL CASE uizhen Guo and K. Krishnamoorthy

More information

An Introduction to Multivariate Statistical Analysis

An Introduction to Multivariate Statistical Analysis An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents

More information

TAMS39 Lecture 10 Principal Component Analysis Factor Analysis

TAMS39 Lecture 10 Principal Component Analysis Factor Analysis TAMS39 Lecture 10 Principal Component Analysis Factor Analysis Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content - Lecture Principal component analysis

More information

Canonical Correlation Analysis of Longitudinal Data

Canonical Correlation Analysis of Longitudinal Data Biometrics Section JSM 2008 Canonical Correlation Analysis of Longitudinal Data Jayesh Srivastava Dayanand N Naik Abstract Studying the relationship between two sets of variables is an important multivariate

More information

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data Sri Lankan Journal of Applied Statistics (Special Issue) Modern Statistical Methodologies in the Cutting Edge of Science Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations

More information

Asymptotic Statistics-III. Changliang Zou

Asymptotic Statistics-III. Changliang Zou Asymptotic Statistics-III Changliang Zou The multivariate central limit theorem Theorem (Multivariate CLT for iid case) Let X i be iid random p-vectors with mean µ and and covariance matrix Σ. Then n (

More information

DA Freedman Notes on the MLE Fall 2003

DA Freedman Notes on the MLE Fall 2003 DA Freedman Notes on the MLE Fall 2003 The object here is to provide a sketch of the theory of the MLE. Rigorous presentations can be found in the references cited below. Calculus. Let f be a smooth, scalar

More information

CENTRAL LIMIT THEOREMS FOR CLASSICAL LIKELIHOOD RATIO TESTS FOR HIGH-DIMENSIONAL NORMAL DISTRIBUTIONS

CENTRAL LIMIT THEOREMS FOR CLASSICAL LIKELIHOOD RATIO TESTS FOR HIGH-DIMENSIONAL NORMAL DISTRIBUTIONS The Annals of Statistics 013, Vol. 41, No. 4, 09 074 DOI: 10.114/13-AOS1134 Institute of Mathematical Statistics, 013 CENTRAL LIMIT THEOREMS FOR CLASSICAL LIKELIHOOD RATIO TESTS FOR HIGH-DIMENSIONAL NORMAL

More information

Department of Statistics

Department of Statistics Research Report Department of Statistics Research Report Department of Statistics No. 05: Testing in multivariate normal models with block circular covariance structures Yuli Liang Dietrich von Rosen Tatjana

More information

An Introduction to Large Sample covariance matrices and Their Applications

An Introduction to Large Sample covariance matrices and Their Applications An Introduction to Large Sample covariance matrices and Their Applications (June 208) Jianfeng Yao Department of Statistics and Actuarial Science The University of Hong Kong. These notes are designed for

More information

Regression and Statistical Inference

Regression and Statistical Inference Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF

More information

On Selecting Tests for Equality of Two Normal Mean Vectors

On Selecting Tests for Equality of Two Normal Mean Vectors MULTIVARIATE BEHAVIORAL RESEARCH, 41(4), 533 548 Copyright 006, Lawrence Erlbaum Associates, Inc. On Selecting Tests for Equality of Two Normal Mean Vectors K. Krishnamoorthy and Yanping Xia Department

More information

Testing a Normal Covariance Matrix for Small Samples with Monotone Missing Data

Testing a Normal Covariance Matrix for Small Samples with Monotone Missing Data Applied Mathematical Sciences, Vol 3, 009, no 54, 695-70 Testing a Normal Covariance Matrix for Small Samples with Monotone Missing Data Evelina Veleva Rousse University A Kanchev Department of Numerical

More information

Multivariate Regression

Multivariate Regression Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the

More information

An Encompassing Test for Non-Nested Quantile Regression Models

An Encompassing Test for Non-Nested Quantile Regression Models An Encompassing Test for Non-Nested Quantile Regression Models Chung-Ming Kuan Department of Finance National Taiwan University Hsin-Yi Lin Department of Economics National Chengchi University Abstract

More information

Linear Models and Estimation by Least Squares

Linear Models and Estimation by Least Squares Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:

More information

Bootstrap inference for the finite population total under complex sampling designs

Bootstrap inference for the finite population total under complex sampling designs Bootstrap inference for the finite population total under complex sampling designs Zhonglei Wang (Joint work with Dr. Jae Kwang Kim) Center for Survey Statistics and Methodology Iowa State University Jan.

More information

MULTIVARIATE ANALYSIS OF VARIANCE UNDER MULTIPLICITY José A. Díaz-García. Comunicación Técnica No I-07-13/ (PE/CIMAT)

MULTIVARIATE ANALYSIS OF VARIANCE UNDER MULTIPLICITY José A. Díaz-García. Comunicación Técnica No I-07-13/ (PE/CIMAT) MULTIVARIATE ANALYSIS OF VARIANCE UNDER MULTIPLICITY José A. Díaz-García Comunicación Técnica No I-07-13/11-09-2007 (PE/CIMAT) Multivariate analysis of variance under multiplicity José A. Díaz-García Universidad

More information

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017 Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION By Degui Li, Peter C. B. Phillips, and Jiti Gao September 017 COWLES FOUNDATION DISCUSSION PAPER NO.

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Near-exact approximations for the likelihood ratio test statistic for testing equality of several variance-covariance matrices

Near-exact approximations for the likelihood ratio test statistic for testing equality of several variance-covariance matrices Near-exact approximations for the likelihood ratio test statistic for testing euality of several variance-covariance matrices Carlos A. Coelho 1 Filipe J. Marues 1 Mathematics Department, Faculty of Sciences

More information

Basic Concepts in Matrix Algebra

Basic Concepts in Matrix Algebra Basic Concepts in Matrix Algebra An column array of p elements is called a vector of dimension p and is written as x p 1 = x 1 x 2. x p. The transpose of the column vector x p 1 is row vector x = [x 1

More information

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators Estimation theory Parametric estimation Properties of estimators Minimum variance estimator Cramer-Rao bound Maximum likelihood estimators Confidence intervals Bayesian estimation 1 Random Variables Let

More information

Marcia Gumpertz and Sastry G. Pantula Department of Statistics North Carolina State University Raleigh, NC

Marcia Gumpertz and Sastry G. Pantula Department of Statistics North Carolina State University Raleigh, NC A Simple Approach to Inference in Random Coefficient Models March 8, 1988 Marcia Gumpertz and Sastry G. Pantula Department of Statistics North Carolina State University Raleigh, NC 27695-8203 Key Words

More information

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2011

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2011 A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2011 Reading Chapter 5 (continued) Lecture 8 Key points in probability CLT CLT examples Prior vs Likelihood Box & Tiao

More information

Default priors and model parametrization

Default priors and model parametrization 1 / 16 Default priors and model parametrization Nancy Reid O-Bayes09, June 6, 2009 Don Fraser, Elisabeta Marras, Grace Yun-Yi 2 / 16 Well-calibrated priors model f (y; θ), F(y; θ); log-likelihood l(θ)

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

STA 294: Stochastic Processes & Bayesian Nonparametrics

STA 294: Stochastic Processes & Bayesian Nonparametrics MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

More information

Statistical Inference of Covariate-Adjusted Randomized Experiments

Statistical Inference of Covariate-Adjusted Randomized Experiments 1 Statistical Inference of Covariate-Adjusted Randomized Experiments Feifang Hu Department of Statistics George Washington University Joint research with Wei Ma, Yichen Qin and Yang Li Email: feifang@gwu.edu

More information

SIMULTANEOUS ESTIMATION OF SCALE MATRICES IN TWO-SAMPLE PROBLEM UNDER ELLIPTICALLY CONTOURED DISTRIBUTIONS

SIMULTANEOUS ESTIMATION OF SCALE MATRICES IN TWO-SAMPLE PROBLEM UNDER ELLIPTICALLY CONTOURED DISTRIBUTIONS SIMULTANEOUS ESTIMATION OF SCALE MATRICES IN TWO-SAMPLE PROBLEM UNDER ELLIPTICALLY CONTOURED DISTRIBUTIONS Hisayuki Tsukuma and Yoshihiko Konno Abstract Two-sample problems of estimating p p scale matrices

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Answers to Problem Set #4

Answers to Problem Set #4 Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2

More information

K-ANTITHETIC VARIATES IN MONTE CARLO SIMULATION ISSN k-antithetic Variates in Monte Carlo Simulation Abdelaziz Nasroallah, pp.

K-ANTITHETIC VARIATES IN MONTE CARLO SIMULATION ISSN k-antithetic Variates in Monte Carlo Simulation Abdelaziz Nasroallah, pp. K-ANTITHETIC VARIATES IN MONTE CARLO SIMULATION ABDELAZIZ NASROALLAH Abstract. Standard Monte Carlo simulation needs prohibitive time to achieve reasonable estimations. for untractable integrals (i.e.

More information

Size Distortion and Modi cation of Classical Vuong Tests

Size Distortion and Modi cation of Classical Vuong Tests Size Distortion and Modi cation of Classical Vuong Tests Xiaoxia Shi University of Wisconsin at Madison March 2011 X. Shi (UW-Mdsn) H 0 : LR = 0 IUPUI 1 / 30 Vuong Test (Vuong, 1989) Data fx i g n i=1.

More information

On the Efficiencies of Several Generalized Least Squares Estimators in a Seemingly Unrelated Regression Model and a Heteroscedastic Model

On the Efficiencies of Several Generalized Least Squares Estimators in a Seemingly Unrelated Regression Model and a Heteroscedastic Model Journal of Multivariate Analysis 70, 8694 (1999) Article ID jmva.1999.1817, available online at http:www.idealibrary.com on On the Efficiencies of Several Generalized Least Squares Estimators in a Seemingly

More information

Multivariate Statistics

Multivariate Statistics Multivariate Statistics Chapter 2: Multivariate distributions and inference Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid pedro.galeano@uc3m.es Course 2016/2017 Master in Mathematical

More information

Exercises and Answers to Chapter 1

Exercises and Answers to Chapter 1 Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean

More information