High-dimensional asymptotic expansions for the distributions of canonical correlations


Journal of Multivariate Analysis. Contents lists available at ScienceDirect.

Yasunori Fujikoshi, Tetsuro Sakurai
Faculty of Science and Engineering, Chuo University, Kasuga, Bunkyo-ku, Japan

Article history: Received 14 August 2007. Available online 26 April 2008.
AMS 1991 subject classifications: primary 62H10; secondary 62E20.
Keywords: Asymptotic distributions; Canonical correlations; Extended Fisher's z-transformation; High-dimensional framework.

Abstract

This paper examines asymptotic distributions of the canonical correlations between $\mathbf{x}_1$ ($q \times 1$) and $\mathbf{x}_2$ ($p \times 1$) with $q \le p$, based on a sample of size $N = n + 1$. The asymptotic distributions of the canonical correlations have been studied extensively when the dimensions $q$ and $p$ are fixed and the sample size $N$ tends toward infinity. However, these approximations worsen when $q$ or $p$ is large in comparison to $N$. To overcome this weakness, this paper first derives asymptotic distributions of the canonical correlations under a high-dimensional framework such that $q$ is fixed, $m = n - p \to \infty$ and $c = p/n \to c_0 \in [0, 1)$, assuming that $\mathbf{x}_1$ and $\mathbf{x}_2$ have a joint $(q + p)$-variate normal distribution. An extended Fisher's z-transformation is proposed. Then, the asymptotic distributions are improved further by deriving their asymptotic expansions. Numerical simulations revealed that our approximations are more accurate than the classical approximations for a large range of $p$, $q$, and $n$ and the population canonical correlations.

Elsevier Inc. All rights reserved.

1. Introduction

Let $\mathbf{x}_1 = (x_1, \ldots, x_q)'$ and $\mathbf{x}_2 = (x_{q+1}, \ldots, x_{q+p})'$ be two random vectors with a joint $(q + p)$-variate normal distribution with mean vector $\mu = (\mu_1', \mu_2')'$ and covariance matrix
$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$$
where $\Sigma_{12}$ is a $q \times p$ matrix. Without loss of generality, we may assume $q \le p$.
Let $S$ be the sample covariance matrix formed from a sample of size $N = n + 1$ of $\mathbf{x} = (\mathbf{x}_1', \mathbf{x}_2')'$. Corresponding to the partition of $\mathbf{x}$, we partition $S$ as
$$S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}.$$
Let $\rho_1 \ge \cdots \ge \rho_q \ge 0$ and $r_1 > \cdots > r_q > 0$ be the population and sample canonical correlations between $\mathbf{x}_1$ and $\mathbf{x}_2$. Note that $\rho_1^2 \ge \cdots \ge \rho_q^2 \ge 0$ and $r_1^2 > \cdots > r_q^2 > 0$ are the characteristic roots of $\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$ and $S_{11}^{-1}S_{12}S_{22}^{-1}S_{21}$, respectively. This paper is concerned with asymptotic distributions of the canonical correlations and their transformations. Under the large-sample framework

A0: $p$ and $q$ are fixed, $n \to \infty$,

many asymptotic results have been obtained (e.g., see [11,1]). However, these results do not work well when the dimensions $q$ and/or $p$ become large. Such examples can be seen in [13].

Corresponding author: Y. Fujikoshi (fujikoshi_y@yahoo.co.jp). 2008 Elsevier Inc. All rights reserved.

One of the examples is

concerned with a canonical correlation analysis between the $p = 14$ explanatory variables and the $q = 8$ criteria variables based on $N = 128$ observations. In this example we are also interested in a canonical correlation analysis between the $p = 14$ explanatory variables and a subset of the $q = 8$ criteria variables. To overcome this weakness, we first derive the limiting distributions of the canonical correlations and their functions under a high-dimensional framework such that
$$\text{A1}: \quad q \text{ fixed}, \quad p \to \infty, \quad n \to \infty, \quad m = n - p \to \infty, \quad c = p/n \to c_0 \in [0, 1). \tag{1.1}$$
Here, the constant $c$ is assumed to satisfy $0 < c < 1$. Based on the limiting results, we propose an extended Fisher's z-transformation whose asymptotic variance does not depend on the unknown parameters. In addition, the limiting results are improved further by deriving their asymptotic expansions. Note that the classical large-sample limiting results are the same for all $p$ and $q$, while the high-dimensional limiting results depend on $p$ through $c = p/n$. Furthermore, our high-dimensional results reduce to the classical large-sample results when $c$ tends toward 0. In this sense our approximations contain more information than the classical approximations. Moreover, numerical simulations revealed that our approximations are more accurate than the classical approximations over a large range of $p$, $q$, and $n$ and the population canonical correlations.

Some papers have described asymptotic distributions under a high-dimensional framework in which both the dimension and the sample size are large. For example, for the characteristic roots of Wishart matrices, see [2,5]. For some tests of covariance matrices, see [6,9], etc. Raudys and Young [8] gave asymptotic results for linear discriminant functions.

2. Limiting distributions

Since we are interested in the distribution of a function of the characteristic roots of $S_{11}^{-1}S_{12}S_{22}^{-1}S_{21}$, without loss of generality we may assume
$$\Sigma = \begin{pmatrix} I_q & P \\ P' & I_p \end{pmatrix}, \qquad P = (P_1 \;\; O), \qquad P_1 = \mathrm{diag}(\rho_1, \ldots, \rho_q).$$
Let $A = nS$, which is distributed as a Wishart distribution $W_{q+p}(n, \Sigma)$, and partition $A$ as
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},$$
corresponding to the partition of $\mathbf{x}$. In our derivation, we first consider the distribution of $l_\alpha^2$ ($l_1^2 > \cdots > l_q^2$) defined by
$$l_\alpha^2 = r_\alpha^2/(1 - r_\alpha^2), \qquad \alpha = 1, \ldots, q,$$
which are the characteristic roots of $A_{11\cdot 2}^{-1}A_{12}A_{22}^{-1}A_{21}$, instead of $r_\alpha^2$ or $r_\alpha$ ($r_1 > \cdots > r_q$). Here $A_{11\cdot 2} = A_{11} - A_{12}A_{22}^{-1}A_{21}$. The distribution of $r_\alpha^2$ or $r_\alpha$ is treated as the distribution of a function $f(l_\alpha^2)$. For example, $r_\alpha^2$ and $r_\alpha$ are expressed in terms of $l_\alpha$ as $r_\alpha^2 = l_\alpha^2/(1 + l_\alpha^2)$ and $r_\alpha = l_\alpha/\sqrt{1 + l_\alpha^2}$, respectively. The population quantity corresponding to $l_\alpha^2$ is denoted by
$$\gamma_\alpha^2 = \rho_\alpha^2/(1 - \rho_\alpha^2), \qquad \alpha = 1, \ldots, q.$$
We derive the asymptotic distributions of $l_\alpha^2$ and its functions under framework A1 and
$$\text{A2}: \quad \text{the } \alpha\text{th root } \rho_\alpha^2 \text{ of } \Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} \text{ is simple and is not zero.} \tag{2.1}$$
The $l_\alpha^2$'s are the roots of $Q = A_{11\cdot 2}^{-1/2}A_{12}A_{22}^{-1}A_{21}A_{11\cdot 2}^{-1/2}$, and $Q$ can be expanded (see Appendix) as
$$Q = A_{11\cdot 2}^{-1/2}A_{12}A_{22}^{-1}A_{21}A_{11\cdot 2}^{-1/2} = \Theta^2 + O_{1/2},$$
where $\Theta^2 = \mathrm{diag}(\theta_1^2, \ldots, \theta_q^2) = (1 - c)^{-1}(cI_q + \Gamma^2)$, i.e., $\theta_\alpha^2 = (c + \gamma_\alpha^2)/(1 - c)$, $\alpha = 1, \ldots, q$. This shows that $l_\alpha^2$ approaches
$$\theta_\alpha^2 = (1 - c)^{-1}\left\{\rho_\alpha^2/(1 - \rho_\alpha^2) + c\right\}, \tag{2.2}$$
not $\rho_\alpha^2/(1 - \rho_\alpha^2)$, and Theorem 2.1 holds.
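As a small numerical illustration of (2.2), the following sketch evaluates the limit $\theta_\alpha^2$ of $l_\alpha^2$ and the implied limit of $r_\alpha^2$. The function names and the example values are illustrative assumptions, not taken from the paper.

```python
def theta_sq(rho: float, c: float) -> float:
    """Limit of l_alpha^2 under A1, eq. (2.2): (gamma^2 + c) / (1 - c)."""
    if not (0.0 <= c < 1.0 and 0.0 <= rho < 1.0):
        raise ValueError("need 0 <= c < 1 and 0 <= rho < 1")
    gamma_sq = rho ** 2 / (1.0 - rho ** 2)
    return (gamma_sq + c) / (1.0 - c)


def limit_of_r_sq(rho: float, c: float) -> float:
    """Limit of r_alpha^2 = l^2 / (1 + l^2), evaluated at l^2 = theta^2."""
    t = theta_sq(rho, c)
    return t / (1.0 + t)


if __name__ == "__main__":
    # Hypothetical setting: rho = 0.5 and p/n = 0.2.
    print(theta_sq(0.5, 0.2), limit_of_r_sq(0.5, 0.2))
```

With $c = 0$ the first function returns $\gamma_\alpha^2$ itself, which is the classical large-sample limit.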

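Theorem 2.1. Let $l_\alpha^2 = r_\alpha^2/(1 - r_\alpha^2)$, where $r_\alpha$ is the $\alpha$th canonical correlation, and let $h(l_\alpha^2)$ be a function of $l_\alpha^2$ such that the first derivative is continuous in the neighborhood of $l_\alpha^2 = \theta_\alpha^2$ and $h'(\theta_\alpha^2) \ne 0$. Then, under A1 and A2, we have
$$y_\alpha = \sqrt{n}\,(l_\alpha^2 - \theta_\alpha^2) \xrightarrow{d} N(0, \tau_\alpha^2), \qquad z_\alpha = \sqrt{n}\,\{h(l_\alpha^2) - h(\theta_\alpha^2)\} \xrightarrow{d} N\!\left(0, \{h'(\theta_\alpha^2)\}^2\tau_\alpha^2\right),$$
where $\xrightarrow{d}$ denotes convergence in distribution, and the asymptotic variance $\tau_\alpha^2$ of $y_\alpha$ is expressed in terms of $\theta_\alpha^2$ and $\rho_\alpha^2$, respectively, as
$$\tau_\alpha^2 = \frac{2(2 - c)}{1 - c}\,(\theta_\alpha^2 + 1)\left(\theta_\alpha^2 - \frac{c}{2 - c}\right) = \frac{2(2 - c)}{(1 - c)^3}\,\frac{1}{1 - \rho_\alpha^2}\left\{\frac{\rho_\alpha^2}{1 - \rho_\alpha^2} + \frac{c}{2 - c}\right\}. \tag{2.3}$$

Theorem 2.1 is shown by using a perturbation expansion of $l_\alpha^2$ in terms of $U$ and $V$ (see (A.1) and (A.4)). From this expansion, the limiting distribution of $y_\alpha$ is the same as that of
$$q_{\alpha\alpha}^{(1)} = \sqrt{c}\,(1 - c)^{-1}u_{\alpha\alpha} - (1 - c)^{-1/2}\theta_\alpha^2 v_{\alpha\alpha},$$
where the limiting distributions of $u_{\alpha\alpha}$ and $v_{\alpha\alpha}$ are independently normal with mean 0. This yields the limiting distribution of $y_\alpha$.

In rigorous terms of convergence in distribution, the results of Theorem 2.1 should be considered for a high-dimensional framework in which $p = cn$ with a constant $0 < c < 1$, or $c$ should be read as $c_0$. For actual use, however, the statement as given in Theorem 2.1 is more convenient.

Note that $r_\alpha^2 = l_\alpha^2/(1 + l_\alpha^2)$ and $r_\alpha = \sqrt{l_\alpha^2/(1 + l_\alpha^2)}$. Therefore, $r_\alpha^2$ and $r_\alpha$ can be expressed as functions of $l_\alpha^2$ given by $h(x) = x/(1 + x)$ and $h(x) = \sqrt{x/(1 + x)}$, respectively. From Theorem 2.1, we obtain the following results:
$$\sqrt{n}\,(r_\alpha^2 - \tilde\rho_\alpha^2) \xrightarrow{d} N(0, \sigma_\alpha^2), \tag{2.4}$$
$$\sqrt{n}\,(r_\alpha - \tilde\rho_\alpha) \xrightarrow{d} N\!\left(0, \tfrac{1}{4}\sigma_\alpha^2\tilde\rho_\alpha^{-2}\right), \tag{2.5}$$
where
$$\tilde\rho_\alpha = \{\rho_\alpha^2 + c(1 - \rho_\alpha^2)\}^{1/2}, \qquad \sigma_\alpha^2 = 2(1 - c)(1 - \rho_\alpha^2)^2\{2\rho_\alpha^2 + c(1 - 2\rho_\alpha^2)\}.$$
In particular, letting $c = 0$ in (2.4) and (2.5) we have the well-known large-sample results:
$$\sqrt{n}\,(r_\alpha^2 - \rho_\alpha^2) \xrightarrow{d} N\!\left(0, 4\rho_\alpha^2(1 - \rho_\alpha^2)^2\right), \tag{2.6}$$
$$\sqrt{n}\,(r_\alpha - \rho_\alpha) \xrightarrow{d} N\!\left(0, (1 - \rho_\alpha^2)^2\right). \tag{2.7}$$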
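To make the difference between (2.4) and (2.6) concrete, here is a minimal sketch that returns the centering value and asymptotic variance of $\sqrt{n}\,r_\alpha^2$ under the high-dimensional framework. The function name and the sample values of $\rho$ and $c$ are assumptions made for illustration.

```python
def highdim_limit(rho: float, c: float) -> tuple[float, float]:
    """Return (rho_tilde^2, sigma^2) appearing in (2.4)."""
    rho2 = rho * rho
    rho_tilde_sq = rho2 + c * (1.0 - rho2)
    sigma_sq = (2.0 * (1.0 - c) * (1.0 - rho2) ** 2
                * (2.0 * rho2 + c * (1.0 - 2.0 * rho2)))
    return rho_tilde_sq, sigma_sq


if __name__ == "__main__":
    # Hypothetical comparison: c = 0 is the large-sample case (2.6).
    for c in (0.0, 0.2, 0.5):
        print(c, highdim_limit(0.5, c))
```

Setting `c = 0.0` reproduces the classical variance $4\rho_\alpha^2(1 - \rho_\alpha^2)^2$, while a positive `c` shifts the centering value upward from $\rho_\alpha^2$.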
Here, note that the high-dimensional asymptotic results (2.4) and (2.5) depend on $p$ through $c = p/n$, but the large-sample results (2.6) and (2.7) do not depend on $p$ and are the same for all $p$. The numerical accuracy of these approximations is examined in Section 5.

3. Extension of Fisher's z-transformation

It is of interest to find a transformation whose asymptotic variance is parameter-free. In particular, a transformation whose asymptotic variance equals 1 is usually used. This amounts to finding a function $h$ such that the high-dimensional asymptotic variance of $z = \sqrt{n}\,(h(l_\alpha^2) - h(\theta_\alpha^2))$ is equal to 1. From Theorem 2.1, this is equivalent to finding a function $h$ satisfying
$$h'(x)^2\,\frac{2(2 - c)}{1 - c}\,(x + 1)\left(x - \frac{c}{2 - c}\right) = 1 \quad \text{for all } x > c/(2 - c). \tag{3.1}$$

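It is easy to find a solution defined by
$$h(x) = \sqrt{\frac{1 - c}{2(2 - c)}}\,\log\left\{g(x) + \sqrt{g(x)^2 - 1}\right\}, \tag{3.2}$$
where $g(x) = (2 - c)x + 1 - c$. The transformation is defined for $x > c/(2 - c)$, which at $x = l_\alpha^2$ corresponds to $g(l_\alpha^2) > 1$, or equivalently $r_\alpha^2 > c/2$. We shall see that the transformation is an extension of Fisher's z-transformation. Let $u = \{(2 - c)x - c\}/2$, and then
$$g(x) + \sqrt{g(x)^2 - 1} = 1 + 2u + 2\sqrt{u(u + 1)} = \left(\sqrt{1 + u} + \sqrt{u}\right)^2.$$
The last expression is equal to
$$\frac{\sqrt{1 + u} + \sqrt{u}}{\sqrt{1 + u} - \sqrt{u}} = \frac{1 + \sqrt{u/(1 + u)}}{1 - \sqrt{u/(1 + u)}}.$$
Furthermore,
$$\frac{u}{1 + u} = 1 - \frac{1}{x + 1} - \frac{c}{(2 - c)(x + 1)},$$
which is equal to $r_\alpha^2 - c(1 - r_\alpha^2)/(2 - c)$ when $x = l_\alpha^2$. Therefore, we have
$$h(l_\alpha^2) = \sqrt{\frac{1 - c}{2(2 - c)}}\,\log\left\{g(l_\alpha^2) + \sqrt{g(l_\alpha^2)^2 - 1}\right\} = \sqrt{\frac{1 - c}{1 - c/2}}\cdot\frac{1}{2}\log\frac{1 + \sqrt{r_\alpha^2 - c(1 - r_\alpha^2)/(2 - c)}}{1 - \sqrt{r_\alpha^2 - c(1 - r_\alpha^2)/(2 - c)}}. \tag{3.3}$$
Let
$$z = \frac{1}{2}\log\frac{1 + \sqrt{r_\alpha^2 - c(1 - r_\alpha^2)/(2 - c)}}{1 - \sqrt{r_\alpha^2 - c(1 - r_\alpha^2)/(2 - c)}}, \qquad \zeta = \frac{1}{2}\log\frac{1 + \sqrt{\tilde\rho_\alpha^2 - c(1 - \tilde\rho_\alpha^2)/(2 - c)}}{1 - \sqrt{\tilde\rho_\alpha^2 - c(1 - \tilde\rho_\alpha^2)/(2 - c)}}. \tag{3.4}$$
Then, from Theorem 2.1 we have
$$\sqrt{\frac{n(1 - c)}{1 - c/2}}\,(z - \zeta) \xrightarrow{d} N(0, 1). \tag{3.5}$$
Letting $c = 0$ in (3.5), we recover Fisher's z-transformation and its asymptotic normality in the large-sample case:
$$\sqrt{n}\left\{\frac{1}{2}\log\frac{1 + r_\alpha}{1 - r_\alpha} - \frac{1}{2}\log\frac{1 + \rho_\alpha}{1 - \rho_\alpha}\right\} \xrightarrow{d} N(0, 1). \tag{3.6}$$
Results (3.5) and (3.6) can be used to construct a confidence interval for $\rho_\alpha$. If A1 and A2 are satisfied, we have approximately
$$P(u_1 \le \zeta \le u_2) \approx 1 - \delta,$$
where
$$u_1 = z - u\!\left(\tfrac{1}{2}\delta\right)\frac{1}{\sqrt{n(1 - c)/(1 - c/2)}}, \qquad u_2 = z + u\!\left(\tfrac{1}{2}\delta\right)\frac{1}{\sqrt{n(1 - c)/(1 - c/2)}},$$
and $u(\delta)$ is the upper $\delta$ point of $N(0, 1)$. The relationship $u_1 \le \zeta \le u_2$ is converted to the form of an interval as follows:
$$2u_1 \le \log\frac{1 + \sqrt{\tilde\rho_\alpha^2 - c(1 - \tilde\rho_\alpha^2)/(2 - c)}}{1 - \sqrt{\tilde\rho_\alpha^2 - c(1 - \tilde\rho_\alpha^2)/(2 - c)}} \le 2u_2 \iff \tanh(u_1) \le \sqrt{\tilde\rho_\alpha^2 - c(1 - \tilde\rho_\alpha^2)/(2 - c)} \le \tanh(u_2),$$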

which is equivalent to
$$\frac{2 - c}{2(1 - c)}\left\{\tanh^2(u_1) - \frac{c}{2 - c}\right\} \le \rho_\alpha^2 \le \frac{2 - c}{2(1 - c)}\left\{\tanh^2(u_2) - \frac{c}{2 - c}\right\}. \tag{3.7}$$
Letting $c = 0$ in (3.7), we obtain a confidence interval based on the large-sample framework as follows:
$$\tanh(u_1) \le \rho_\alpha \le \tanh(u_2). \tag{3.8}$$
Transformation (3.3) is defined for $r_\alpha^2 \ge c/2$. As a general transformation, it is suggested to replace $r_\alpha^2 - c(1 - r_\alpha^2)/(2 - c)$ by its absolute value $|r_\alpha^2 - c(1 - r_\alpha^2)/(2 - c)|$. This modification was used in Table 3.1. Another modification replaces the quantity by zero when $r_\alpha^2 < c/2$. The two modifications give almost the same results, but the former is more useful. This is examined in the following numerical example.

Table 3.1 gives the actual confidence coefficients for the large-sample confidence interval (3.8) and the high-dimensional confidence interval (3.7) of $\rho_\alpha$ with the 95% confidence coefficient. The simulation with 100,000 repetitions was performed for $\rho_1 = 0.9$, $\rho_2 = 0.5$, $\rho_3 = 0.3$, and $q = 3$. The sample size $N$ and dimension $p$ were chosen as in Table 3.1. The marked values are those obtained by replacing $r_\alpha^2 - c(1 - r_\alpha^2)/(2 - c)$ by its absolute value. The results with the replacement by zero were almost the same, except for $\rho_3 = 0.3$, $N = 50$ and $p = 3, 7$. Table 3.1 shows that the high-dimensional confidence interval is more useful than the large-sample confidence interval over a large range of $N$, $p$ and the population canonical correlations. The high-dimensional confidence interval is also applicable to a moderate sample size with $p = 2$ or 3.

Table 3.1. Actual confidence coefficients for the confidence intervals of $\rho_\alpha$ with the 95% confidence coefficient. Columns: $N$, $p$, and the pair (Large, High) for each of $\rho_1 = 0.9$, $\rho_2 = 0.5$, $\rho_3 = 0.3$. Large and High mean the large-sample confidence interval (3.8) and the high-dimensional confidence interval (3.7), respectively.

Consider two settings, (i) $N = 25$, $p = q = 2$, $\rho_1 = \ldots$, $\rho_2 = 0.054$, and (ii) $N = 37$, $q = 2$, $p = 3$, $\rho_1 = \ldots$, $\rho_2 = \ldots$, which are based on real data in [1, p. 505] and [10, p. 208], respectively. For these settings we have large-sample and high-dimensional confidence intervals with a 95% confidence coefficient, and their actual confidence coefficients were obtained by simulation. Table 3.2 shows that the high-dimensional confidence interval is more useful than the large-sample confidence interval even in a situation where the large-sample asymptotics are suitable.

Table 3.2. Actual confidence coefficients for the large-sample and high-dimensional confidence intervals of $\rho_1$ with the 95% confidence coefficient. Columns: Case, Large, High. Rows: (i), (ii).
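The interval (3.7) can be sketched in a few lines using only the Python standard library. This is a minimal illustration under stated assumptions, not the authors' code: the function name is hypothetical, $c$ is taken as $p/n$, the absolute-value modification is applied inside the square root, and truncating the end points to $[0, 1]$ (and treating a negative $u_1$ as zero) is a choice made here for the sketch.

```python
import math
from statistics import NormalDist


def highdim_ci(r: float, n: int, p: int, delta: float = 0.05) -> tuple[float, float]:
    """Approximate 100(1 - delta)% interval for rho_alpha via (3.5) and (3.7).

    r is a sample canonical correlation (0 <= r < 1), n = N - 1, c = p / n.
    """
    c = p / n
    # Argument of the square root in (3.3); abs() is the modification
    # described in the text for the case r^2 < c/2.
    s = math.sqrt(abs(r * r - c * (1.0 - r * r) / (2.0 - c)))
    z = math.atanh(s)
    scale = math.sqrt(n * (1.0 - c) / (1.0 - c / 2.0))
    half = NormalDist().inv_cdf(1.0 - delta / 2.0) / scale
    u1, u2 = z - half, z + half

    def to_rho(u: float) -> float:
        # Invert (3.7); the truncations are assumptions of this sketch.
        t = math.tanh(max(u, 0.0)) ** 2
        rho_sq = (2.0 - c) / (2.0 * (1.0 - c)) * (t - c / (2.0 - c))
        return math.sqrt(min(max(rho_sq, 0.0), 1.0))

    return to_rho(u1), to_rho(u2)


if __name__ == "__main__":
    # Hypothetical inputs: r = 0.6 observed with N = 100 and p = 20.
    print(highdim_ci(0.6, 99, 20))
```

With `p = 0` the function reduces to the classical Fisher interval (3.8), which gives a convenient sanity check.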

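4. Asymptotic refinement

To refine our asymptotic results, we derive their asymptotic expansions. Let $y_\alpha$ be the random variable in Theorem 2.1. Then (see, e.g., [3,4]) the first four cumulants have the forms
$$\kappa_1(y_\alpha) = E(y_\alpha) = \frac{1}{\sqrt{n}}a_{1\alpha} + O_{3/2}, \qquad \kappa_2(y_\alpha) = \mathrm{Var}(y_\alpha) = a_{2\alpha} + O_1,$$
$$\kappa_3(y_\alpha) = \frac{1}{\sqrt{n}}a_{3\alpha} + O_{3/2}, \qquad \kappa_4(y_\alpha) = O_1, \tag{4.1}$$
where the notation $O_i$ denotes a term of the $i$th order with respect to $(n^{-1}, p^{-1}, m^{-1})$. The coefficients can be expressed as follows (see Appendix):
$$a_{1\alpha} = (1 - c)^{-1}\left[-(q - 1) - (q - 3)\theta_\alpha^2 + c(q - 1)(1 + \theta_\alpha^2) + (1 + \theta_\alpha^2)\{2\theta_\alpha^2 - c(1 + \theta_\alpha^2)\}d_\alpha\right],$$
$$a_{2\alpha} = 2(1 - c)^{-1}(1 + \theta_\alpha^2)\{2\theta_\alpha^2 - c(1 + \theta_\alpha^2)\},$$
$$a_{3\alpha} = 8(1 - c)^{-2}(1 + \theta_\alpha^2)\left\{3\theta_\alpha^2(1 + 2\theta_\alpha^2) + c\{c - 2 + (2c - 7)\theta_\alpha^2 + (c - 5)\theta_\alpha^4\}\right\}, \tag{4.2}$$
where $d_\alpha = \sum_{\beta \ne \alpha}(\theta_\alpha^2 - \theta_\beta^2)^{-1}$. Using the cumulant formulas, it is possible to give an asymptotic expansion of the distribution function of $y_\alpha$. For a general theory, see [3,4]. In the following, however, we give a general result for the distributions of $r_\alpha^2$ and its functions. Consider the transformed variable defined by
$$z_\alpha = \sqrt{n}\,\{h(l_\alpha^2) - h(\theta_\alpha^2)\} \tag{4.3}$$
in Theorem 2.1. Here, it is assumed that $h(x)$ is three times continuously differentiable in the neighborhood of $x = \theta_\alpha^2$. Then, we can expand $z_\alpha$ as
$$z_\alpha = \sqrt{n}\left\{h'(\theta_\alpha^2)(l_\alpha^2 - \theta_\alpha^2) + \tfrac{1}{2}h''(\theta_\alpha^2)(l_\alpha^2 - \theta_\alpha^2)^2 + \cdots\right\} = h'(\theta_\alpha^2)y_\alpha + \frac{1}{\sqrt{n}}\,\frac{1}{2}h''(\theta_\alpha^2)y_\alpha^2 + \cdots.$$
This implies
$$\kappa_1(z_\alpha) = E(z_\alpha) = \frac{1}{\sqrt{n}}b_{1\alpha} + O_{3/2}, \qquad \kappa_2(z_\alpha) = \mathrm{Var}(z_\alpha) = b_{2\alpha} + O_1, \qquad \kappa_3(z_\alpha) = \frac{1}{\sqrt{n}}b_{3\alpha} + O_{3/2}, \tag{4.4}$$
where
$$b_{1\alpha} = h'(\theta_\alpha^2)a_{1\alpha} + \tfrac{1}{2}h''(\theta_\alpha^2)a_{2\alpha}, \qquad b_{2\alpha} = h'(\theta_\alpha^2)^2 a_{2\alpha}, \qquad b_{3\alpha} = h'(\theta_\alpha^2)^3 a_{3\alpha} + 3h'(\theta_\alpha^2)^2 h''(\theta_\alpha^2)a_{2\alpha}^2. \tag{4.5}$$
Here $b_{2\alpha}$ is the asymptotic variance of $z_\alpha$. It is easy to obtain the first two expressions in (4.5). For a derivation of the last expression, see the Appendix.

We are especially interested in obtaining an asymptotic expansion of the distribution function of
$$z = \sqrt{n}\left[r_\alpha^2 - \{\rho_\alpha^2 + c(1 - \rho_\alpha^2)\}\right],$$
which can be expressed as $\sqrt{n}\,\{h(l_\alpha^2) - h(\theta_\alpha^2)\}$ with $h(x) = x/(1 + x)$. Then,
$$h'(\theta_\alpha^2) = (1 + \theta_\alpha^2)^{-2} = (1 - c)^2(1 - \rho_\alpha^2)^2, \qquad h''(\theta_\alpha^2) = -2(1 + \theta_\alpha^2)^{-3} = -2(1 - c)^3(1 - \rho_\alpha^2)^3.$$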
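The passage from the coefficients $a_{i\alpha}$ in (4.2) to the coefficients $b_{i\alpha}$ through (4.5) can be checked numerically. The sketch below is an illustration only; the function names, the 0-based index `alpha`, and the example values are assumptions of this sketch.

```python
def a_coeffs(rhos, alpha, c):
    """Return (theta_alpha^2, a_1, a_2, a_3) following (4.2); alpha is 0-based."""
    q = len(rhos)
    theta = [(r * r / (1.0 - r * r) + c) / (1.0 - c) for r in rhos]
    t = theta[alpha]
    # d_alpha = sum over beta != alpha of 1 / (theta_alpha^2 - theta_beta^2)
    d = sum(1.0 / (t - theta[b]) for b in range(q) if b != alpha)
    core = (1.0 + t) * (2.0 * t - c * (1.0 + t))
    a1 = (-(q - 1) - (q - 3) * t + c * (q - 1) * (1.0 + t) + core * d) / (1.0 - c)
    a2 = 2.0 * core / (1.0 - c)
    a3 = (8.0 * (1.0 + t)
          * (3.0 * t * (1.0 + 2.0 * t)
             + c * (c - 2.0 + (2.0 * c - 7.0) * t + (c - 5.0) * t * t))
          / (1.0 - c) ** 2)
    return t, a1, a2, a3


def b_coeffs_for_r_sq(rhos, alpha, c):
    """Apply (4.5) with h(x) = x / (1 + x), i.e. for the statistic r_alpha^2."""
    t, a1, a2, a3 = a_coeffs(rhos, alpha, c)
    h1 = (1.0 + t) ** -2           # h'(theta^2)
    h2 = -2.0 * (1.0 + t) ** -3    # h''(theta^2)
    b1 = h1 * a1 + 0.5 * h2 * a2
    b2 = h1 ** 2 * a2
    b3 = h1 ** 3 * a3 + 3.0 * h1 ** 2 * h2 * a2 ** 2
    return b1, b2, b3


if __name__ == "__main__":
    # Hypothetical setting matching the simulation design: q = 3, c = 0.2.
    print(b_coeffs_for_r_sq([0.9, 0.5, 0.3], 0, 0.2))
```

The second coefficient returned should agree with $\sigma_\alpha^2$ in (2.4), which is an easy way to confirm that the inputs are wired correctly.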

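Using (4.2), the coefficients in (4.5) can be expressed as
$$b_{1\alpha} = (1 - \rho_\alpha^2)\Big[-(q - 1) + 2(q - 2)\rho_\alpha^2 + c\{2(q - 1) - 2(q - 2)\rho_\alpha^2\} + \{2\rho_\alpha^2 + c(1 - 2\rho_\alpha^2)\}(1 - \rho_\alpha^2)\sum_{\beta \ne \alpha}(\rho_\alpha^2 - \rho_\beta^2)^{-1}\Big],$$
$$b_{2\alpha}\,(= \sigma_\alpha^2) = 2(1 - c)(1 - \rho_\alpha^2)^2\{2\rho_\alpha^2 + c(1 - 2\rho_\alpha^2)\},$$
$$b_{3\alpha} = 8(1 - c)(1 - \rho_\alpha^2)^3\{3\rho_\alpha^2 + c(1 - 3\rho_\alpha^2)\}\{1 - 3\rho_\alpha^2 - c(2 - 3\rho_\alpha^2)\}. \tag{4.6}$$

Theorem 4.1. Let $r_\alpha$ be the $\alpha$th canonical correlation. Then, under A1 and A2, we have
$$P\!\left(\sqrt{n}\,(r_\alpha^2 - \tilde\rho_\alpha^2)/\sigma_\alpha \le x\right) = \Phi(x) - \frac{1}{\sqrt{n}}\,\phi(x)\left\{\frac{b_{1\alpha}}{\sigma_\alpha} + \frac{1}{6}\,\frac{b_{3\alpha}}{\sigma_\alpha^3}(x^2 - 1)\right\} + O_{3/2}, \tag{4.7}$$
where $\Phi$ and $\phi$ are the distribution function and the probability density function of $N(0, 1)$, respectively,
$$\tilde\rho_\alpha = \{\rho_\alpha^2 + c(1 - \rho_\alpha^2)\}^{1/2}, \qquad \sigma_\alpha = \sqrt{b_{2\alpha}},$$
and the coefficients $b_{1\alpha}$, $b_{2\alpha}$ and $b_{3\alpha}$ are given by (4.6). Furthermore, let $h(r_\alpha^2)$ be a function of $r_\alpha^2$ such that the third derivative is continuous in the neighborhood of $r_\alpha^2 = \tilde\rho_\alpha^2$ and $h'(\tilde\rho_\alpha^2) \ne 0$. Then, the distribution function of $\sqrt{n}\,\{h(r_\alpha^2) - h(\tilde\rho_\alpha^2)\}/\tau_\alpha$ can be expanded as in (4.7) with the coefficients $b_{i\alpha}^*$ instead of $b_{i\alpha}$, where $\tau_\alpha = (b_{2\alpha}^*)^{1/2}$ and
$$b_{1\alpha}^* = h'(\tilde\rho_\alpha^2)b_{1\alpha} + \tfrac{1}{2}h''(\tilde\rho_\alpha^2)b_{2\alpha}, \qquad b_{2\alpha}^* = h'(\tilde\rho_\alpha^2)^2 b_{2\alpha}, \qquad b_{3\alpha}^* = h'(\tilde\rho_\alpha^2)^3 b_{3\alpha} + 3h'(\tilde\rho_\alpha^2)^2 h''(\tilde\rho_\alpha^2)b_{2\alpha}^2. \tag{4.8}$$

From Theorem 4.1, the upper percent point of $\sqrt{n}\,(r_\alpha^2 - \tilde\rho_\alpha^2)/\sigma_\alpha$ is given by
$$u + \frac{1}{\sqrt{n}}\left\{\frac{b_{1\alpha}}{\sigma_\alpha} + \frac{1}{6}\,\frac{b_{3\alpha}}{\sigma_\alpha^3}(u^2 - 1)\right\} \tag{4.9}$$
in terms of the upper percent point $u$ of $N(0, 1)$. This expansion is called the Cornish–Fisher expansion. The distribution of $\sqrt{n}\,(r_\alpha - \tilde\rho_\alpha)/\tau_\alpha$ can be obtained from Theorem 4.1 with $h(x) = \sqrt{x}$. In this case,
$$h'(\tilde\rho_\alpha^2) = \tfrac{1}{2}\tilde\rho_\alpha^{-1}, \qquad h''(\tilde\rho_\alpha^2) = -\tfrac{1}{4}\tilde\rho_\alpha^{-3}.$$

Now, we shall see that the large-sample asymptotic expansion (e.g., see [11]) can be obtained by considering a large-sample expansion of the high-dimensional asymptotic expansion (4.7). The statistic can be expanded for large $n$ as
$$\sqrt{n}\,(r_\alpha^2 - \tilde\rho_\alpha^2) = \sqrt{n}\,(r_\alpha^2 - \rho_\alpha^2) - \frac{1}{\sqrt{n}}(1 - \rho_\alpha^2)p + O(n^{-1}).$$
The coefficient $b_{i\alpha}$ can be expanded as $g_{i\alpha} + O(n^{-1})$ when $n$ is large, where
$$g_{1\alpha} = (1 - \rho_\alpha^2)\Big[-(q - 1) + 2(q - 2)\rho_\alpha^2 + 2\rho_\alpha^2(1 - \rho_\alpha^2)\sum_{\beta \ne \alpha}(\rho_\alpha^2 - \rho_\beta^2)^{-1}\Big],$$
$$g_{2\alpha} = 4(1 - \rho_\alpha^2)^2\rho_\alpha^2, \qquad g_{3\alpha} = 24(1 - \rho_\alpha^2)^3\rho_\alpha^2(1 - 3\rho_\alpha^2). \tag{4.10}$$
These equations imply the well-known asymptotic expansion in the large-sample case, in which the mean term picks up the shift $(1 - \rho_\alpha^2)p$ from the recentering displayed above:
$$P\!\left(\sqrt{n}\,(r_\alpha^2 - \rho_\alpha^2)/\sqrt{g_{2\alpha}} \le x\right) = \Phi(x) - \frac{1}{\sqrt{n}}\,\phi(x)\left\{\frac{g_{1\alpha} + (1 - \rho_\alpha^2)p}{\sqrt{g_{2\alpha}}} + \frac{1}{6}\,\frac{g_{3\alpha}}{(\sqrt{g_{2\alpha}})^3}(x^2 - 1)\right\} + O(n^{-3/2}). \tag{4.11}$$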
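The expansion (4.7) and the percent point (4.9) are straightforward to evaluate once the coefficients in (4.6) are available. The following standard-library sketch is illustrative only: the function names and the 0-based index `alpha` are assumptions, and the one-term Edgeworth form with the usual $1/\sqrt{n}$ and $1/6$ factors is the standard one rather than something quoted verbatim from the paper.

```python
import math
from statistics import NormalDist


def b_closed_form(rhos, alpha, c):
    """Coefficients (b_1, b_2, b_3) of (4.6) for r_alpha^2; alpha is 0-based."""
    r2 = [r * r for r in rhos]
    q, x = len(rhos), r2[alpha]
    s = sum(1.0 / (x - r2[b]) for b in range(q) if b != alpha)
    mix = 2.0 * x + c * (1.0 - 2.0 * x)
    b1 = (1.0 - x) * (-(q - 1) + 2.0 * (q - 2) * x
                      + c * (2.0 * (q - 1) - 2.0 * (q - 2) * x)
                      + mix * (1.0 - x) * s)
    b2 = 2.0 * (1.0 - c) * (1.0 - x) ** 2 * mix
    b3 = (8.0 * (1.0 - c) * (1.0 - x) ** 3
          * (3.0 * x + c * (1.0 - 3.0 * x))
          * (1.0 - 3.0 * x - c * (2.0 - 3.0 * x)))
    return b1, b2, b3


def edgeworth_cdf(x, n, b1, b2, b3):
    """One-term Edgeworth approximation in the form of (4.7)."""
    std = NormalDist()
    sigma = math.sqrt(b2)
    corr = b1 / sigma + b3 / (6.0 * sigma ** 3) * (x * x - 1.0)
    return std.cdf(x) - std.pdf(x) * corr / math.sqrt(n)


def cornish_fisher_point(delta, n, b1, b2, b3):
    """Upper 100*delta% point of sqrt(n)(r^2 - rho_tilde^2)/sigma as in (4.9)."""
    u = NormalDist().inv_cdf(1.0 - delta)
    sigma = math.sqrt(b2)
    return u + (b1 / sigma + b3 / (6.0 * sigma ** 3) * (u * u - 1.0)) / math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical setting: q = 3, N = 100, p = 20.
    coeffs = b_closed_form([0.9, 0.5, 0.3], 0, 20 / 99)
    print(cornish_fisher_point(0.05, 99, *coeffs))
```

Evaluating `edgeworth_cdf` at the point returned by `cornish_fisher_point` gives a value close to $1 - \delta$ for large $n$, which reflects that (4.9) inverts (4.7) to the stated order.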

5. Numerical accuracy

In this section, we compare our high-dimensional approximations with the classical approximations based on the asymptotic distribution under a large-sample framework such that $q$ and $p$ are fixed and $n$ tends toward infinity. The numerical accuracy is studied for the upper 5% points of the distribution of $r_\alpha^2$ when $q = 3$. The following values of the population canonical correlations were chosen: $\rho_1 = 0.9$, $\rho_2 = 0.5$, $\rho_3 = 0.3$.

The upper 5% points of the distribution of $r_\alpha^2$ were approximated using the Cornish–Fisher expansion. The high-dimensional expansion is given by (4.9). The approximations using the limiting term only and the expansion up to the term of order $O_{1/2}$ are denoted by $a_{H0}$ and $a_{H1}$, respectively. Similarly, the corresponding approximations in the large-sample case are denoted by $a_{L0}$ and $a_{L1}$; their Cornish–Fisher expansion is obtained from (4.11). The actual probabilities attained at these approximate upper 5% points were estimated by simulation with 100,000 repetitions. These values are given in Tables 5.1–5.3.

From Tables 5.1–5.3, the following tendencies can be seen. The large-sample approximations work well only when $p$ is very small and $N$ is large; in the other cases they worsen. The actual probability for $a_{L0}$ tends to approach 1 as $p$ increases, while that for $a_{L1}$ tends to approach 0 as $p$ increases. Whenever the large-sample approximations work well, the corresponding high-dimensional approximations also work well. The high-dimensional approximations are good even when $p$ is one-half of $n$. The approximation $a_{H1}$ is better than the approximation $a_{H0}$, especially when the population canonical correlation is small. The high-dimensional approximation is good even when $n$ is small.
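The four approximate upper points compared in this section can be computed as follows. This is a hedged sketch rather than the authors' program: the function name is hypothetical, the points are reported on the scale of $r_\alpha^2$ itself, and the large-sample mean term is taken as $g_{1\alpha} + (1 - \rho_\alpha^2)p$, which is an assumption based on the recentering identity stated before (4.10).

```python
import math
from statistics import NormalDist


def upper_points(rhos, alpha, n, p, delta=0.05):
    """Return (a_L0, a_L1, a_H0, a_H1) for r_alpha^2; alpha is 0-based."""
    u = NormalDist().inv_cdf(1.0 - delta)
    r2 = [r * r for r in rhos]
    q, x = len(rhos), r2[alpha]
    s = sum(1.0 / (x - r2[b]) for b in range(q) if b != alpha)

    def coeffs(c):
        # Closed forms (4.6); c = 0 gives g_1, g_2, g_3 of (4.10).
        mix = 2.0 * x + c * (1.0 - 2.0 * x)
        k1 = (1.0 - x) * (-(q - 1) + 2.0 * (q - 2) * x
                          + c * (2.0 * (q - 1) - 2.0 * (q - 2) * x)
                          + mix * (1.0 - x) * s)
        k2 = 2.0 * (1.0 - c) * (1.0 - x) ** 2 * mix
        k3 = (8.0 * (1.0 - c) * (1.0 - x) ** 3
              * (3.0 * x + c * (1.0 - 3.0 * x))
              * (1.0 - 3.0 * x - c * (2.0 - 3.0 * x)))
        return k1, k2, k3

    def point(center, k1, k2, k3, with_correction):
        sd = math.sqrt(k2)
        corr = 0.0
        if with_correction:
            corr = k1 / sd + k3 / (6.0 * sd ** 3) * (u * u - 1.0)
        return center + sd * (u + corr / math.sqrt(n)) / math.sqrt(n)

    c = p / n
    g1, g2, g3 = coeffs(0.0)
    g1 += (1.0 - x) * p  # assumed shift from centering at rho^2, not rho_tilde^2
    b1, b2, b3 = coeffs(c)
    center_h = x + c * (1.0 - x)  # rho_tilde^2
    return (point(x, g1, g2, g3, False), point(x, g1, g2, g3, True),
            point(center_h, b1, b2, b3, False), point(center_h, b1, b2, b3, True))


if __name__ == "__main__":
    # Hypothetical setting from the simulation design: N = 100, p = 20.
    print(upper_points([0.9, 0.5, 0.3], 0, 99, 20))
```

When `p = 0` the high-dimensional and large-sample points coincide, as expected from letting $c \to 0$.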
Table 5.1. Actual probabilities for the approximate upper 5% points of $r_1^2$. Columns: $N$, $p$, $a_{L0}$, $a_{L1}$, $a_{H0}$, $a_{H1}$.

Table 5.2. Actual probabilities for the approximate upper 5% points of $r_2^2$. Columns: $N$, $p$, $a_{L0}$, $a_{L1}$, $a_{H0}$, $a_{H1}$.

Table 5.3. Actual probabilities for the approximate upper 5% points of $r_3^2$. Columns: $N$, $p$, $a_{L0}$, $a_{L1}$, $a_{H0}$, $a_{H1}$.

6. Conclusive remarks and discussion

In this paper we obtained asymptotic expansions as well as the limiting distributions of canonical correlations under the high-dimensional framework (1.1). Simulation experiments (Tables 5.1–5.3) showed that the high-dimensional approximations are better than the large-sample approximations over a large range of $(p, q, N)$, except for $p$ and $q$ less than 3. The high-dimensional asymptotic expansions are useful for the distributions of the smallest canonical correlations. We proposed a transformation (3.3) whose asymptotic variance is parameter-free under a high-dimensional framework. It is an extension of Fisher's z-transformation. The new confidence interval (3.7) of $\rho_\alpha$ based on the transformed statistic was compared with a classical confidence interval based on Fisher's z-transformation. As is seen in Table 3.2, the new confidence interval is more useful than the classical one even in the cases (i) $p = q = 2$, $N = 25$ and (ii) $q = 2$, $p = 3$, $N = 37$.

However, it should be pointed out that the high-dimensional approximations worsen when $q$ is large. One approach to overcoming this fault is to derive asymptotic distributions of canonical correlations under the following high-dimensional framework:
$$q \to \infty, \quad p \to \infty, \quad n \to \infty, \quad m_1 = n - p \to \infty, \quad m_2 = n - q \to \infty, \quad c_1 = p/n \to c_{01}, \quad c_2 = q/n \to c_{02} \in [0, 1).$$
This problem and the extension to a class of elliptical distributions, etc., are left as future research topics.

Acknowledgments

The authors would like to thank two referees for their valuable comments.

Appendix. Derivation of the asymptotic expansions

For our derivation, we use the following distributional reduction on the canonical correlations.

Lemma A.1. When we consider the distribution of a function of the canonical correlations $r_1 > \cdots > r_q$, without loss of generality, we may assume that:
(1) $A_{11\cdot 2}$ is distributed as a central Wishart distribution $W_q(m, I_q)$, where $m = n - p$.
(2) Let $B$ be the first $q \times q$ submatrix of $A_{22}$. Then, given $B$, $A_{12}A_{22}^{-1}A_{21}$ is distributed as a noncentral Wishart distribution $W_q(p, I_q; \Gamma B\Gamma)$, where $\Gamma = \mathrm{diag}(\gamma_1, \ldots, \gamma_q)$ and $\gamma_\alpha = \rho_\alpha/\sqrt{1 - \rho_\alpha^2}$.
(3) $A_{12}A_{22}^{-1}A_{21}$ and $A_{11\cdot 2}$ are independent.
(4) $B$ is distributed as a central Wishart distribution $W_q(n, I_q)$.

The lemma was essentially obtained by Sugiura and Fujikoshi [12], except that (2) and (3) were given under a conditional setup. Note that (3) follows from the independence of $A_{11\cdot 2}$ and $B$. Let
$$U = \sqrt{p}\left\{\frac{1}{p}A_{12}A_{22}^{-1}A_{21} - \left(I_q + \frac{n}{p}\Gamma^2\right)\right\}, \qquad V = \sqrt{m}\left(\frac{1}{m}A_{11\cdot 2} - I_q\right). \tag{A.1}$$
It is well known that the limiting distribution of $V$ is normal. To show that the limiting distribution of $U$ is normal under A1, we consider the characteristic function of $U$. Let $T$ be a real symmetric matrix whose $(i, j)$ element is given by

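$(1 + \delta_{ij})t_{ij}/2$. Here, $\delta_{ij}$ is the Kronecker delta, i.e., $\delta_{ii} = 1$, $\delta_{ij} = 0$ ($i \ne j$). The characteristic function of $U$ can be expressed (e.g., see [7, p. 444]) as
$$C_U(T) = E[\exp(i\,\mathrm{tr}\,TU)] = E_B\left[E\{\exp(i\,\mathrm{tr}\,TU) \mid B\}\right].$$
The conditional characteristic function can be evaluated as
$$C_U(T \mid B) = \mathrm{etr}\left[-\sqrt{p}\,iT\left(I_q + \frac{n}{p}\Gamma^2\right)\right]\left|I_q - \frac{2i}{\sqrt{p}}T\right|^{-p/2}\mathrm{etr}\left\{\frac{i}{\sqrt{p}}\,\Gamma T\left(I_q - \frac{2i}{\sqrt{p}}T\right)^{-1}\Gamma B\right\}.$$
Therefore,
$$C_U(T) = \mathrm{etr}\left[-\sqrt{p}\,iT\left(I_q + \frac{n}{p}\Gamma^2\right)\right]\left|I_q - \frac{2i}{\sqrt{p}}T\right|^{-p/2}\left|I_q - \frac{2i}{\sqrt{p}}\,\Gamma T\left(I_q - \frac{2i}{\sqrt{p}}T\right)^{-1}\Gamma\right|^{-n/2}.$$
Under framework A1 we can expand $\log C_U(T)$ as
$$\log C_U(T) = -\sqrt{p}\,i\,\mathrm{tr}\,T\left(I_q + \frac{n}{p}\Gamma^2\right) + \frac{p}{2}\left\{\frac{2i}{\sqrt{p}}\mathrm{tr}\,T + \frac{1}{2}\left(\frac{2i}{\sqrt{p}}\right)^2\mathrm{tr}\,T^2 + \frac{1}{3}\left(\frac{2i}{\sqrt{p}}\right)^3\mathrm{tr}\,T^3 + \cdots\right\}$$
$$\qquad + \frac{n}{2}\left\{\mathrm{tr}\left[\frac{2i}{\sqrt{p}}\Gamma T\left(I_q - \frac{2i}{\sqrt{p}}T\right)^{-1}\Gamma\right] + \frac{1}{2}\mathrm{tr}\left[\frac{2i}{\sqrt{p}}\Gamma T\left(I_q - \frac{2i}{\sqrt{p}}T\right)^{-1}\Gamma\right]^2 + \frac{1}{3}\mathrm{tr}\left[\frac{2i}{\sqrt{p}}\Gamma T\left(I_q - \frac{2i}{\sqrt{p}}T\right)^{-1}\Gamma\right]^3 + \cdots\right\}.$$
This implies the following lemma.

Lemma A.2. Let $U$ be the random matrix defined by (A.1). Then, under framework A1 we can expand the characteristic function of $U$ as
$$C_U(T) = \exp\left\{i^2\phi_0(T)\right\}\left[1 + \frac{i^3}{\sqrt{n}}\phi_1(T) + O_{3/2}\right], \tag{A.2}$$
where
$$\phi_0(T) = c^{-1}\left\{c\,\mathrm{tr}\,T^2 + 2\,\mathrm{tr}\,\Gamma^2T^2 + \mathrm{tr}\,(\Gamma^2T)^2\right\},$$
$$\phi_1(T) = \tfrac{4}{3}c^{-3/2}\left\{c\,\mathrm{tr}\,T^3 + 3\,\mathrm{tr}\,\Gamma^2T^3 + 3\,\mathrm{tr}\,\Gamma^2T\Gamma^2T^2 + \mathrm{tr}\,(\Gamma^2T)^3\right\}.$$

Note that
$$A_{11\cdot 2}^{-1/2} = \frac{1}{\sqrt{n}}(1 - c)^{-1/2}\left\{I_q + \frac{1}{\sqrt{n}}(1 - c)^{-1/2}V\right\}^{-1/2} = \frac{1}{\sqrt{n}}(1 - c)^{-1/2}\left\{I_q - \frac{1}{2\sqrt{n}}(1 - c)^{-1/2}V + \frac{3}{8n}(1 - c)^{-1}V^2\right\} + O_{3/2}.$$
Similarly, writing $A_{12}A_{22}^{-1}A_{21}$ in terms of $U$ by (A.1), under A1 we have the following perturbation expansion:
$$Q = A_{11\cdot 2}^{-1/2}A_{12}A_{22}^{-1}A_{21}A_{11\cdot 2}^{-1/2} = \Theta^2 + \frac{1}{\sqrt{n}}Q^{(1)} + \frac{1}{n}Q^{(2)} + O_{3/2}, \tag{A.3}$$
where $\Theta^2 = \mathrm{diag}(\theta_1^2, \ldots, \theta_q^2) = (1 - c)^{-1}(cI_q + \Gamma^2)$, i.e., $\theta_\alpha^2 = (c + \gamma_\alpha^2)/(1 - c)$, $\alpha = 1, \ldots, q$, and
$$Q^{(1)} = \frac{\sqrt{c}}{1 - c}U - \frac{1}{2}(1 - c)^{-1/2}\left(\Theta^2V + V\Theta^2\right),$$
$$Q^{(2)} = -\frac{1}{2}(1 - c)^{-3/2}\sqrt{c}\,(UV + VU) + \frac{3}{8}(1 - c)^{-1}\left(\Theta^2V^2 + V^2\Theta^2\right) + \frac{1}{4}(1 - c)^{-1}V\Theta^2V.$$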

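Using the perturbation expansion (A.3) of $Q$ and a general result (e.g., see [11]) for a perturbation expansion of its characteristic roots, we can obtain
$$y_\alpha = \sqrt{n}\,(l_\alpha^2 - \theta_\alpha^2) = q_{\alpha\alpha}^{(1)} + \frac{1}{\sqrt{n}}\left\{q_{\alpha\alpha}^{(2)} + \sum_{\beta \ne \alpha}(\theta_\alpha^2 - \theta_\beta^2)^{-1}\left(q_{\alpha\beta}^{(1)}\right)^2\right\} + O_1,$$
where $Q^{(i)} = (q_{\alpha\beta}^{(i)})$, $i = 1, 2$, and the elements are expressed as
$$q_{\alpha\beta}^{(1)} = \sqrt{c}\,(1 - c)^{-1}u_{\alpha\beta} - \tfrac{1}{2}(1 - c)^{-1/2}(\theta_\alpha^2 + \theta_\beta^2)v_{\alpha\beta},$$
$$q_{\alpha\alpha}^{(2)} = -\sqrt{c}\,(1 - c)^{-3/2}\sum_{\beta=1}^{q}v_{\alpha\beta}u_{\alpha\beta} + \frac{3}{4}(1 - c)^{-1}\theta_\alpha^2\sum_{\beta=1}^{q}v_{\alpha\beta}^2 + \frac{1}{4}(1 - c)^{-1}\sum_{\beta=1}^{q}v_{\alpha\beta}^2\theta_\beta^2. \tag{A.4}$$
The limiting distribution of $y_\alpha$ is the same as that of
$$q_{\alpha\alpha}^{(1)} = \sqrt{c}\,(1 - c)^{-1}u_{\alpha\alpha} - (1 - c)^{-1/2}\theta_\alpha^2 v_{\alpha\alpha}.$$
Since the limiting distributions of $u_{\alpha\alpha}$ and $v_{\alpha\alpha}$ are independently normal with mean 0, the limiting distribution of $y_\alpha$ is normal with mean 0. To compute its asymptotic variance, we use
$$E(v_{\alpha\beta}) = 0, \qquad E(u_{\alpha\beta}) = 0, \qquad E(v_{\alpha\beta}^2) = 1 + \delta_{\alpha\beta}, \qquad E(u_{\alpha\beta}^2) = (1 + \delta_{\alpha\beta})c^{-1}\left\{c + \gamma_\alpha^2 + \gamma_\beta^2 + \gamma_\alpha^2\gamma_\beta^2\right\} + O_1. \tag{A.5}$$
The formulas for $v_{\alpha\beta}$ in (A.5) are easily derived. The last one is obtained by differentiating both sides of (A.2) in Lemma A.2. These imply Theorem 2.1.

To prove Theorem 4.1, it is sufficient to show (4.2) and (4.6). Before deriving them, we derive $\kappa_3(z_\alpha)$ in (4.4). We can write
$$\kappa_3(z_\alpha) = E(z_\alpha^3) - 3E(z_\alpha^2)E(z_\alpha) + 2E(z_\alpha)^3$$
$$= h'(\theta_\alpha^2)^3E(y_\alpha^3) + \frac{3}{2\sqrt{n}}h'(\theta_\alpha^2)^2h''(\theta_\alpha^2)E(y_\alpha^4) - 3\left\{h'(\theta_\alpha^2)^2a_{2\alpha}\right\}\frac{1}{\sqrt{n}}\left\{h'(\theta_\alpha^2)a_{1\alpha} + \frac{1}{2}h''(\theta_\alpha^2)a_{2\alpha}\right\} + O_{3/2}.$$
Furthermore,
$$E(y_\alpha^3) = \kappa_3(y_\alpha) + 3\kappa_1(y_\alpha)\kappa_2(y_\alpha) + \kappa_1(y_\alpha)^3 = \frac{1}{\sqrt{n}}\{a_{3\alpha} + 3a_{1\alpha}a_{2\alpha}\} + O_{3/2},$$
$$E(y_\alpha^4) = \kappa_4(y_\alpha) + 4\kappa_3(y_\alpha)\kappa_1(y_\alpha) + 3\kappa_2(y_\alpha)^2 + 6\kappa_2(y_\alpha)\kappa_1(y_\alpha)^2 + \kappa_1(y_\alpha)^4 = 3a_{2\alpha}^2 + O_1.$$
Substituting these gives the expression for $b_{3\alpha}$ in (4.5). The formulas for $a_{1\alpha}$ and $a_{3\alpha}$ are derived by using (A.4) with the help of moment formulas such as (A.5). To compute $a_{3\alpha}$, however, we need the following moment formulas:
$$E(v_{\alpha\alpha}^3) = 8/\sqrt{m} = 8(1 - c)^{-1/2}/\sqrt{n}, \qquad E(v_{\alpha\alpha}^4) = 12 + O_1,$$
$$E(u_{\alpha\alpha}u_{\alpha\beta}) = O_1 \ (\alpha \ne \beta), \qquad E(v_{\alpha\alpha}^2v_{\alpha\beta}^2) = 2 \ (\alpha \ne \beta),$$
$$E(u_{\alpha\alpha}^3) = 8c^{-3/2}\left\{c + 3\gamma_\alpha^2 + 3\gamma_\alpha^4 + \gamma_\alpha^6\right\}/\sqrt{n} + O_{3/2},$$
$$E(u_{\alpha\alpha}^2u_{\alpha\beta}^2) = 2c^{-2}\left\{c + 2\gamma_\alpha^2 + \gamma_\alpha^4\right\}\left\{c + \gamma_\alpha^2 + \gamma_\beta^2 + \gamma_\alpha^2\gamma_\beta^2\right\} + O_1 \ (\alpha \ne \beta). \tag{A.6}$$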
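For readers who want to see how (A.5) leads to the asymptotic variance (2.3), the following worked computation may help. It is a sketch that assumes the leading term $q_{\alpha\alpha}^{(1)} = \sqrt{c}\,(1 - c)^{-1}u_{\alpha\alpha} - (1 - c)^{-1/2}\theta_\alpha^2 v_{\alpha\alpha}$ implied by (A.3), the independence of $u_{\alpha\alpha}$ and $v_{\alpha\alpha}$, and the identity $\gamma_\alpha^2 = (1 - c)\theta_\alpha^2 - c$.

```latex
\begin{align*}
\operatorname{Var}\bigl(q^{(1)}_{\alpha\alpha}\bigr)
  &\to \frac{c}{(1-c)^2}\,E(u_{\alpha\alpha}^2)
     + \frac{\theta_\alpha^4}{1-c}\,E(v_{\alpha\alpha}^2) \\
  &= \frac{2}{(1-c)^2}\bigl(c + 2\gamma_\alpha^2 + \gamma_\alpha^4\bigr)
     + \frac{2\theta_\alpha^4}{1-c} \\
  &= \frac{2}{1-c}\bigl\{(1-c)\theta_\alpha^4 + 2(1-c)\theta_\alpha^2 - c\bigr\}
     + \frac{2\theta_\alpha^4}{1-c} \\
  &= \frac{2}{1-c}\bigl\{(2-c)\theta_\alpha^4 + 2(1-c)\theta_\alpha^2 - c\bigr\} \\
  &= \frac{2(2-c)}{1-c}\,(\theta_\alpha^2 + 1)
     \Bigl(\theta_\alpha^2 - \frac{c}{2-c}\Bigr) = \tau_\alpha^2 .
\end{align*}
```

The third line uses $c + 2\gamma_\alpha^2 + \gamma_\alpha^4 = (1 - c)\{(1 - c)\theta_\alpha^4 + 2(1 - c)\theta_\alpha^2 - c\}$, and the final value coincides with $a_{2\alpha}$ in (4.2).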
Note that the formulas for $u_{\alpha\beta}$ in (A.6) are obtained by differentiating both sides of (A.2) in Lemma A.2. The formulas (4.6) are obtained by using (4.2) and (4.5) with $h(x) = x/(1 + x)$.

References

[1] T.W. Anderson, An Introduction to Multivariate Statistical Analysis, third ed., John Wiley & Sons, New York.
[2] Z.D. Bai, Methodologies in spectral analysis of large dimensional random matrices: A review, Statist. Sinica.
[3] R.N. Bhattacharya, J.K. Ghosh, On the validity of the formal Edgeworth expansion, Ann. Statist.
[4] Y. Fujikoshi, Asymptotic expansions for the distributions of the sample roots under nonnormality, Biometrika.
[5] I.M. Johnstone, On the distribution of the largest eigenvalue in principal component analysis, Ann. Statist.
[6] O. Ledoit, M. Wolf, Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size, Ann. Statist. 30 (4) (2002).
[7] R.J. Muirhead, Aspects of Multivariate Statistical Theory, John Wiley & Sons, New York, NY.
[8] S. Raudys, D.M. Young, Results in statistical discriminant analysis: A review of the former Soviet Union literature, J. Multivariate Anal.
[9] J.R. Schott, Testing for complete independence in high dimensions, Biometrika.

[10] M. Siotani, An Introduction to Multivariate Analysis, Asakura-shoten, Tokyo, 1990 (in Japanese).
[11] M. Siotani, T. Hayakawa, Y. Fujikoshi, Modern Multivariate Statistical Analysis: A Graduate Course and Handbook, American Sciences Press, Columbus, OH.
[12] N. Sugiura, Y. Fujikoshi, Asymptotic expansions of the non-null distributions of the likelihood ratio criteria for multivariate linear hypothesis and independence, Ann. Math. Statist. 40 (3) (1969).
[13] M.S. Srivastava, Methods of Multivariate Statistics, John Wiley & Sons, New York, 2002.


Persistence and global stability in discrete models of Lotka Volterra type J. Math. Anal. Appl. 330 2007 24 33 www.elsevier.com/locate/jmaa Persistence global stability in discrete models of Lotka Volterra type Yoshiaki Muroya 1 Department of Mathematical Sciences, Waseda University,

More information

ABOUT PRINCIPAL COMPONENTS UNDER SINGULARITY

ABOUT PRINCIPAL COMPONENTS UNDER SINGULARITY ABOUT PRINCIPAL COMPONENTS UNDER SINGULARITY José A. Díaz-García and Raúl Alberto Pérez-Agamez Comunicación Técnica No I-05-11/08-09-005 (PE/CIMAT) About principal components under singularity José A.

More information

On corrections of classical multivariate tests for high-dimensional data

On corrections of classical multivariate tests for high-dimensional data On corrections of classical multivariate tests for high-dimensional data Jian-feng Yao with Zhidong Bai, Dandan Jiang, Shurong Zheng Overview Introduction High-dimensional data and new challenge in statistics

More information

Edgeworth Expansions of Functions of the Sample Covariance Matrix with an Unknown Population

Edgeworth Expansions of Functions of the Sample Covariance Matrix with an Unknown Population Edgeworth Expansions of Functions of the Sample Covariance Matrix with an Unknown Population (Last Modified: April 24, 2008) Hirokazu Yanagihara 1 and Ke-Hai Yuan 2 1 Department of Mathematics, Graduate

More information

On Expected Gaussian Random Determinants

On Expected Gaussian Random Determinants On Expected Gaussian Random Determinants Moo K. Chung 1 Department of Statistics University of Wisconsin-Madison 1210 West Dayton St. Madison, WI 53706 Abstract The expectation of random determinants whose

More information

Supplement to the paper Accurate distributions of Mallows C p and its unbiased modifications with applications to shrinkage estimation

Supplement to the paper Accurate distributions of Mallows C p and its unbiased modifications with applications to shrinkage estimation To aear in Economic Review (Otaru University of Commerce Vol. No..- 017. Sulement to the aer Accurate distributions of Mallows C its unbiased modifications with alications to shrinkage estimation Haruhiko

More information

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data Sri Lankan Journal of Applied Statistics (Special Issue) Modern Statistical Methodologies in the Cutting Edge of Science Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations

More information

Statistical Inference On the High-dimensional Gaussian Covarianc

Statistical Inference On the High-dimensional Gaussian Covarianc Statistical Inference On the High-dimensional Gaussian Covariance Matrix Department of Mathematical Sciences, Clemson University June 6, 2011 Outline Introduction Problem Setup Statistical Inference High-Dimensional

More information

On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors

On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors Takashi Seo a, Takahiro Nishiyama b a Department of Mathematical Information Science, Tokyo University

More information

Fisher information for generalised linear mixed models

Fisher information for generalised linear mixed models Journal of Multivariate Analysis 98 2007 1412 1416 www.elsevier.com/locate/jmva Fisher information for generalised linear mixed models M.P. Wand Department of Statistics, School of Mathematics and Statistics,

More information

Testing Equality of Natural Parameters for Generalized Riesz Distributions

Testing Equality of Natural Parameters for Generalized Riesz Distributions Testing Equality of Natural Parameters for Generalized Riesz Distributions Jesse Crawford Department of Mathematics Tarleton State University jcrawford@tarleton.edu faculty.tarleton.edu/crawford April

More information

On Properties of QIC in Generalized. Estimating Equations. Shinpei Imori

On Properties of QIC in Generalized. Estimating Equations. Shinpei Imori On Properties of QIC in Generalized Estimating Equations Shinpei Imori Graduate School of Engineering Science, Osaka University 1-3 Machikaneyama-cho, Toyonaka, Osaka 560-8531, Japan E-mail: imori.stat@gmail.com

More information

On testing the equality of mean vectors in high dimension

On testing the equality of mean vectors in high dimension ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 17, Number 1, June 2013 Available online at www.math.ut.ee/acta/ On testing the equality of mean vectors in high dimension Muni S.

More information

HIGHER THAN SECOND-ORDER APPROXIMATIONS VIA TWO-STAGE SAMPLING

HIGHER THAN SECOND-ORDER APPROXIMATIONS VIA TWO-STAGE SAMPLING Sankhyā : The Indian Journal of Statistics 1999, Volume 61, Series A, Pt. 2, pp. 254-269 HIGHER THAN SECOND-ORDER APPROXIMATIONS VIA TWO-STAGE SAMPING By NITIS MUKHOPADHYAY University of Connecticut, Storrs

More information

Empirical Likelihood Tests for High-dimensional Data

Empirical Likelihood Tests for High-dimensional Data Empirical Likelihood Tests for High-dimensional Data Department of Statistics and Actuarial Science University of Waterloo, Canada ICSA - Canada Chapter 2013 Symposium Toronto, August 2-3, 2013 Based on

More information

An Unbiased C p Criterion for Multivariate Ridge Regression

An Unbiased C p Criterion for Multivariate Ridge Regression An Unbiased C p Criterion for Multivariate Ridge Regression (Last Modified: March 7, 2008) Hirokazu Yanagihara 1 and Kenichi Satoh 2 1 Department of Mathematics, Graduate School of Science, Hiroshima University

More information

Random Matrices and Multivariate Statistical Analysis

Random Matrices and Multivariate Statistical Analysis Random Matrices and Multivariate Statistical Analysis Iain Johnstone, Statistics, Stanford imj@stanford.edu SEA 06@MIT p.1 Agenda Classical multivariate techniques Principal Component Analysis Canonical

More information

Approximate Distributions of the Likelihood Ratio Statistic in a Structural Equation with Many Instruments

Approximate Distributions of the Likelihood Ratio Statistic in a Structural Equation with Many Instruments CIRJE-F-466 Approximate Distributions of the Likelihood Ratio Statistic in a Structural Equation with Many Instruments Yukitoshi Matsushita CIRJE, Faculty of Economics, University of Tokyo February 2007

More information

HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX

HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX Submitted to the Annals of Statistics HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX By Shurong Zheng, Zhao Chen, Hengjian Cui and Runze Li Northeast Normal University, Fudan

More information

THE EFFICIENCY OF THE ASYMPTOTIC EXPANSION OF THE DISTRIBUTION OF THE CANONICAL VECTOR UNDER NONNORMALITY

THE EFFICIENCY OF THE ASYMPTOTIC EXPANSION OF THE DISTRIBUTION OF THE CANONICAL VECTOR UNDER NONNORMALITY J. Japan Statist. Soc. Vol. 38 No. 3 2008 451 474 THE EFFICIENCY OF THE ASYMPTOTIC EXPANSION OF THE DISTRIBUTION OF THE CANONICAL VECTOR UNDER NONNORMALITY Tomoya Yamada* In canonical correlation analysis,

More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

SHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan

SHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan A New Test on High-Dimensional Mean Vector Without Any Assumption on Population Covariance Matrix SHOTA KATAYAMA AND YUTAKA KANO Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama,

More information

Testing Block-Diagonal Covariance Structure for High-Dimensional Data

Testing Block-Diagonal Covariance Structure for High-Dimensional Data Testing Block-Diagonal Covariance Structure for High-Dimensional Data MASASHI HYODO 1, NOBUMICHI SHUTOH 2, TAKAHIRO NISHIYAMA 3, AND TATJANA PAVLENKO 4 1 Department of Mathematical Sciences, Graduate School

More information

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS Communications in Statistics - Simulation and Computation 33 (2004) 431-446 COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS K. Krishnamoorthy and Yong Lu Department

More information

Likelihood Ratio Tests in Multivariate Linear Model

Likelihood Ratio Tests in Multivariate Linear Model Likelihood Ratio Tests in Multivariate Linear Model Yasunori Fujikoshi Department of Mathematics, Graduate School of Science, Hiroshima University, 1-3-1 Kagamiyama, Higashi Hiroshima, Hiroshima 739-8626,

More information

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives TR-No. 14-06, Hiroshima Statistical Research Group, 1 11 Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives Mariko Yamamura 1, Keisuke Fukui

More information

Random Eigenvalue Problems Revisited

Random Eigenvalue Problems Revisited Random Eigenvalue Problems Revisited S Adhikari Department of Aerospace Engineering, University of Bristol, Bristol, U.K. Email: S.Adhikari@bristol.ac.uk URL: http://www.aer.bris.ac.uk/contact/academic/adhikari/home.html

More information

Modified Simes Critical Values Under Positive Dependence

Modified Simes Critical Values Under Positive Dependence Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia

More information

A note on profile likelihood for exponential tilt mixture models

A note on profile likelihood for exponential tilt mixture models Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential

More information

EXPANSIONS FOR QUANTILES AND MULTIVARIATE MOMENTS OF EXTREMES FOR HEAVY TAILED DISTRIBUTIONS

EXPANSIONS FOR QUANTILES AND MULTIVARIATE MOMENTS OF EXTREMES FOR HEAVY TAILED DISTRIBUTIONS REVSTAT Statistical Journal Volume 15, Number 1, January 2017, 25 43 EXPANSIONS FOR QUANTILES AND MULTIVARIATE MOMENTS OF EXTREMES FOR HEAVY TAILED DISTRIBUTIONS Authors: Christopher Withers Industrial

More information

Random Matrix Eigenvalue Problems in Probabilistic Structural Mechanics

Random Matrix Eigenvalue Problems in Probabilistic Structural Mechanics Random Matrix Eigenvalue Problems in Probabilistic Structural Mechanics S Adhikari Department of Aerospace Engineering, University of Bristol, Bristol, U.K. URL: http://www.aer.bris.ac.uk/contact/academic/adhikari/home.html

More information

Estimation of the large covariance matrix with two-step monotone missing data

Estimation of the large covariance matrix with two-step monotone missing data Estimation of the large covariance matrix with two-ste monotone missing data Masashi Hyodo, Nobumichi Shutoh 2, Takashi Seo, and Tatjana Pavlenko 3 Deartment of Mathematical Information Science, Tokyo

More information

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction Sankhyā : The Indian Journal of Statistics 2007, Volume 69, Part 4, pp. 700-716 c 2007, Indian Statistical Institute More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order

More information

Minimax design criterion for fractional factorial designs

Minimax design criterion for fractional factorial designs Ann Inst Stat Math 205 67:673 685 DOI 0.007/s0463-04-0470-0 Minimax design criterion for fractional factorial designs Yue Yin Julie Zhou Received: 2 November 203 / Revised: 5 March 204 / Published online:

More information

Quadratic forms in skew normal variates

Quadratic forms in skew normal variates J. Math. Anal. Appl. 73 (00) 558 564 www.academicpress.com Quadratic forms in skew normal variates Arjun K. Gupta a,,1 and Wen-Jang Huang b a Department of Mathematics and Statistics, Bowling Green State

More information

Department of Statistics

Department of Statistics Research Report Department of Statistics Research Report Department of Statistics No. 05: Testing in multivariate normal models with block circular covariance structures Yuli Liang Dietrich von Rosen Tatjana

More information

MULTIVARIATE ANALYSIS OF VARIANCE UNDER MULTIPLICITY José A. Díaz-García. Comunicación Técnica No I-07-13/ (PE/CIMAT)

MULTIVARIATE ANALYSIS OF VARIANCE UNDER MULTIPLICITY José A. Díaz-García. Comunicación Técnica No I-07-13/ (PE/CIMAT) MULTIVARIATE ANALYSIS OF VARIANCE UNDER MULTIPLICITY José A. Díaz-García Comunicación Técnica No I-07-13/11-09-2007 (PE/CIMAT) Multivariate analysis of variance under multiplicity José A. Díaz-García Universidad

More information

Canonical Correlation Analysis of Longitudinal Data

Canonical Correlation Analysis of Longitudinal Data Biometrics Section JSM 2008 Canonical Correlation Analysis of Longitudinal Data Jayesh Srivastava Dayanand N Naik Abstract Studying the relationship between two sets of variables is an important multivariate

More information

Truncations of Haar distributed matrices and bivariate Brownian bridge

Truncations of Haar distributed matrices and bivariate Brownian bridge Truncations of Haar distributed matrices and bivariate Brownian bridge C. Donati-Martin Vienne, April 2011 Joint work with Alain Rouault (Versailles) 0-0 G. Chapuy (2007) : σ uniform on S n. Define for

More information

On corrections of classical multivariate tests for high-dimensional data. Jian-feng. Yao Université de Rennes 1, IRMAR

On corrections of classical multivariate tests for high-dimensional data. Jian-feng. Yao Université de Rennes 1, IRMAR Introduction a two sample problem Marčenko-Pastur distributions and one-sample problems Random Fisher matrices and two-sample problems Testing cova On corrections of classical multivariate tests for high-dimensional

More information

EXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME. Xavier Mestre 1, Pascal Vallet 2

EXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME. Xavier Mestre 1, Pascal Vallet 2 EXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME Xavier Mestre, Pascal Vallet 2 Centre Tecnològic de Telecomunicacions de Catalunya, Castelldefels, Barcelona (Spain) 2 Institut

More information

High-dimensional two-sample tests under strongly spiked eigenvalue models

High-dimensional two-sample tests under strongly spiked eigenvalue models 1 High-dimensional two-sample tests under strongly spiked eigenvalue models Makoto Aoshima and Kazuyoshi Yata University of Tsukuba Abstract: We consider a new two-sample test for high-dimensional data

More information

Bias Correction of Cross-Validation Criterion Based on Kullback-Leibler Information under a General Condition

Bias Correction of Cross-Validation Criterion Based on Kullback-Leibler Information under a General Condition Bias Correction of Cross-Validation Criterion Based on Kullback-Leibler Information under a General Condition Hirokazu Yanagihara 1, Tetsuji Tonda 2 and Chieko Matsumoto 3 1 Department of Social Systems

More information

Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions

Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions Tiefeng Jiang 1 and Fan Yang 1, University of Minnesota Abstract For random samples of size n obtained

More information

CONDITIONS FOR ROBUSTNESS TO NONNORMALITY ON TEST STATISTICS IN A GMANOVA MODEL

CONDITIONS FOR ROBUSTNESS TO NONNORMALITY ON TEST STATISTICS IN A GMANOVA MODEL J. Japan Statist. Soc. Vol. 37 No. 1 2007 135 155 CONDITIONS FOR ROBUSTNESS TO NONNORMALITY ON TEST STATISTICS IN A GMANOVA MODEL Hirokazu Yanagihara* This paper presents the conditions for robustness

More information

Non white sample covariance matrices.

Non white sample covariance matrices. Non white sample covariance matrices. S. Péché, Université Grenoble 1, joint work with O. Ledoit, Uni. Zurich 17-21/05/2010, Université Marne la Vallée Workshop Probability and Geometry in High Dimensions

More information

Department of Econometrics and Business Statistics

Department of Econometrics and Business Statistics ISSN 440-77X Australia Department of Econometrics and Business Statistics http://wwwbusecomonasheduau/depts/ebs/pubs/wpapers/ The Asymptotic Distribution of the LIML Estimator in a artially Identified

More information

INFORMATION THEORY AND STATISTICS

INFORMATION THEORY AND STATISTICS INFORMATION THEORY AND STATISTICS Solomon Kullback DOVER PUBLICATIONS, INC. Mineola, New York Contents 1 DEFINITION OF INFORMATION 1 Introduction 1 2 Definition 3 3 Divergence 6 4 Examples 7 5 Problems...''.

More information

MATH5745 Multivariate Methods Lecture 07

MATH5745 Multivariate Methods Lecture 07 MATH5745 Multivariate Methods Lecture 07 Tests of hypothesis on covariance matrix March 16, 2018 MATH5745 Multivariate Methods Lecture 07 March 16, 2018 1 / 39 Test on covariance matrices: Introduction

More information

Effects of data dimension on empirical likelihood

Effects of data dimension on empirical likelihood Biometrika Advance Access published August 8, 9 Biometrika (9), pp. C 9 Biometrika Trust Printed in Great Britain doi:.9/biomet/asp7 Effects of data dimension on empirical likelihood BY SONG XI CHEN Department

More information

On Selecting Tests for Equality of Two Normal Mean Vectors

On Selecting Tests for Equality of Two Normal Mean Vectors MULTIVARIATE BEHAVIORAL RESEARCH, 41(4), 533 548 Copyright 006, Lawrence Erlbaum Associates, Inc. On Selecting Tests for Equality of Two Normal Mean Vectors K. Krishnamoorthy and Yanping Xia Department

More information

Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho

Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho M. de Carvalho and F. J. Marques Abstract We discuss jackknife Euclidean likelihood-based inference methods, with a special focus on the

More information

ACCURATE ASYMPTOTIC ANALYSIS FOR JOHN S TEST IN MULTICHANNEL SIGNAL DETECTION

ACCURATE ASYMPTOTIC ANALYSIS FOR JOHN S TEST IN MULTICHANNEL SIGNAL DETECTION ACCURATE ASYMPTOTIC ANALYSIS FOR JOHN S TEST IN MULTICHANNEL SIGNAL DETECTION Yu-Hang Xiao, Lei Huang, Junhao Xie and H.C. So Department of Electronic and Information Engineering, Harbin Institute of Technology,

More information

The Third International Workshop in Sequential Methodologies

The Third International Workshop in Sequential Methodologies Area C.6.1: Wednesday, June 15, 4:00pm Kazuyoshi Yata Institute of Mathematics, University of Tsukuba, Japan Effective PCA for large p, small n context with sample size determination In recent years, substantial

More information

University of Lisbon, Portugal

University of Lisbon, Portugal Development and comparative study of two near-exact approximations to the distribution of the product of an odd number of independent Beta random variables Luís M. Grilo a,, Carlos A. Coelho b, a Dep.

More information

Lecture 3. Inference about multivariate normal distribution

Lecture 3. Inference about multivariate normal distribution Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates

More information

Elliptically Contoured Distributions

Elliptically Contoured Distributions Elliptically Contoured Distributions Recall: if X N p µ, Σ), then { 1 f X x) = exp 1 } det πσ x µ) Σ 1 x µ) So f X x) depends on x only through x µ) Σ 1 x µ), and is therefore constant on the ellipsoidal

More information

A Note on Generalized Topology

A Note on Generalized Topology International Mathematical Forum, Vol. 6, 2011, no. 1, 19-24 A Note on Generalized Topology Gh. Abbaspour Tabadkan and A. Taghavi Faculty of Mathematics and Computer Sciences Damghan University, Damghan,

More information

Lecture 20: Linear model, the LSE, and UMVUE

Lecture 20: Linear model, the LSE, and UMVUE Lecture 20: Linear model, the LSE, and UMVUE Linear Models One of the most useful statistical models is X i = β τ Z i + ε i, i = 1,...,n, where X i is the ith observation and is often called the ith response;

More information

Tube formula approach to testing multivariate normality and testing uniformity on the sphere

Tube formula approach to testing multivariate normality and testing uniformity on the sphere Tube formula approach to testing multivariate normality and testing uniformity on the sphere Akimichi Takemura 1 Satoshi Kuriki 2 1 University of Tokyo 2 Institute of Statistical Mathematics December 11,

More information

ON COMBINING CORRELATED ESTIMATORS OF THE COMMON MEAN OF A MULTIVARIATE NORMAL DISTRIBUTION

ON COMBINING CORRELATED ESTIMATORS OF THE COMMON MEAN OF A MULTIVARIATE NORMAL DISTRIBUTION ON COMBINING CORRELATED ESTIMATORS OF THE COMMON MEAN OF A MULTIVARIATE NORMAL DISTRIBUTION K. KRISHNAMOORTHY 1 and YONG LU Department of Mathematics, University of Louisiana at Lafayette Lafayette, LA

More information

Testing equality of two mean vectors with unequal sample sizes for populations with correlation

Testing equality of two mean vectors with unequal sample sizes for populations with correlation Testing equality of two mean vectors with unequal sample sizes for populations with correlation Aya Shinozaki Naoya Okamoto 2 and Takashi Seo Department of Mathematical Information Science Tokyo University

More information

Tests for Covariance Matrices, particularly for High-dimensional Data

Tests for Covariance Matrices, particularly for High-dimensional Data M. Rauf Ahmad Tests for Covariance Matrices, particularly for High-dimensional Data Technical Report Number 09, 200 Department of Statistics University of Munich http://www.stat.uni-muenchen.de Tests for

More information

Yimin Wei a,b,,1, Xiezhang Li c,2, Fanbin Bu d, Fuzhen Zhang e. Abstract

Yimin Wei a,b,,1, Xiezhang Li c,2, Fanbin Bu d, Fuzhen Zhang e. Abstract Linear Algebra and its Applications 49 (006) 765 77 wwwelseviercom/locate/laa Relative perturbation bounds for the eigenvalues of diagonalizable and singular matrices Application of perturbation theory

More information

Jackknife Empirical Likelihood Test for Equality of Two High Dimensional Means

Jackknife Empirical Likelihood Test for Equality of Two High Dimensional Means Jackknife Empirical Likelihood est for Equality of wo High Dimensional Means Ruodu Wang, Liang Peng and Yongcheng Qi 2 Abstract It has been a long history to test the equality of two multivariate means.

More information

Decomposition of Parsimonious Independence Model Using Pearson, Kendall and Spearman s Correlations for Two-Way Contingency Tables

Decomposition of Parsimonious Independence Model Using Pearson, Kendall and Spearman s Correlations for Two-Way Contingency Tables International Journal of Statistics and Probability; Vol. 7 No. 3; May 208 ISSN 927-7032 E-ISSN 927-7040 Published by Canadian Center of Science and Education Decomposition of Parsimonious Independence

More information

The LIML Estimator Has Finite Moments! T. W. Anderson. Department of Economics and Department of Statistics. Stanford University, Stanford, CA 94305

The LIML Estimator Has Finite Moments! T. W. Anderson. Department of Economics and Department of Statistics. Stanford University, Stanford, CA 94305 The LIML Estimator Has Finite Moments! T. W. Anderson Department of Economics and Department of Statistics Stanford University, Stanford, CA 9435 March 25, 2 Abstract The Limited Information Maximum Likelihood

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

Multivariate-sign-based high-dimensional tests for sphericity

Multivariate-sign-based high-dimensional tests for sphericity Biometrika (2013, xx, x, pp. 1 8 C 2012 Biometrika Trust Printed in Great Britain Multivariate-sign-based high-dimensional tests for sphericity BY CHANGLIANG ZOU, LIUHUA PENG, LONG FENG AND ZHAOJUN WANG

More information

Univariate Normal Probability Density Function

Univariate Normal Probability Density Function Statistical Distributions Univariate Normal Probability Density Function A random variable, x, is normally distributed if, and only if, its probability density function has the following form: Prob(x θ,

More information

EXPLICIT EXPRESSIONS OF PROJECTORS ON CANONICAL VARIABLES AND DISTANCES BETWEEN CENTROIDS OF GROUPS. Haruo Yanai*

EXPLICIT EXPRESSIONS OF PROJECTORS ON CANONICAL VARIABLES AND DISTANCES BETWEEN CENTROIDS OF GROUPS. Haruo Yanai* J. Japan Statist. Soc. Vol. 11 No. 1 1981 43-53 EXPLICIT EXPRESSIONS OF PROJECTORS ON CANONICAL VARIABLES AND DISTANCES BETWEEN CENTROIDS OF GROUPS Haruo Yanai* Generalized expressions of canonical correlation

More information

Estimation of the Global Minimum Variance Portfolio in High Dimensions

Estimation of the Global Minimum Variance Portfolio in High Dimensions Estimation of the Global Minimum Variance Portfolio in High Dimensions Taras Bodnar, Nestor Parolya and Wolfgang Schmid 07.FEBRUARY 2014 1 / 25 Outline Introduction Random Matrix Theory: Preliminary Results

More information

MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA

MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA J. Japan Statist. Soc. Vol. 37 No. 1 2007 53 86 MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA M. S. Srivastava* In this article, we develop a multivariate theory for analyzing multivariate datasets

More information

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING Vishwanath Mantha Department for Electrical and Computer Engineering Mississippi State University, Mississippi State, MS 39762 mantha@isip.msstate.edu ABSTRACT

More information

Likelihood Ratio Tests for High-dimensional Normal Distributions

Likelihood Ratio Tests for High-dimensional Normal Distributions Likelihood Ratio Tests for High-dimensional Normal Distributions A DISSERTATION SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Fan Yang IN PARTIAL FULFILLMENT OF THE

More information

arxiv: v1 [math.st] 2 Apr 2016

arxiv: v1 [math.st] 2 Apr 2016 NON-ASYMPTOTIC RESULTS FOR CORNISH-FISHER EXPANSIONS V.V. ULYANOV, M. AOSHIMA, AND Y. FUJIKOSHI arxiv:1604.00539v1 [math.st] 2 Apr 2016 Abstract. We get the computable error bounds for generalized Cornish-Fisher

More information

Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression

Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression Working Paper 2013:9 Department of Statistics Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression Ronnie Pingel Working Paper 2013:9 June

More information

A high-dimensional bias-corrected AIC for selecting response variables in multivariate calibration

A high-dimensional bias-corrected AIC for selecting response variables in multivariate calibration A high-dimensional bias-corrected AIC for selecting response variables in multivariate calibration Ryoya Oda, Yoshie Mima, Hirokazu Yanagihara and Yasunori Fujikoshi Department of Mathematics, Graduate

More information

Jackknife Bias Correction of the AIC for Selecting Variables in Canonical Correlation Analysis under Model Misspecification

Jackknife Bias Correction of the AIC for Selecting Variables in Canonical Correlation Analysis under Model Misspecification Jackknife Bias Correction of the AIC for Selecting Variables in Canonical Correlation Analysis under Model Misspecification (Last Modified: May 22, 2012) Yusuke Hashiyama, Hirokazu Yanagihara 1 and Yasunori

More information

Reduced rank regression in cointegrated models

Reduced rank regression in cointegrated models Journal of Econometrics 06 (2002) 203 26 www.elsevier.com/locate/econbase Reduced rank regression in cointegrated models.w. Anderson Department of Statistics, Stanford University, Stanford, CA 94305-4065,

More information

A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data

A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data Yujun Wu, Marc G. Genton, 1 and Leonard A. Stefanski 2 Department of Biostatistics, School of Public Health, University of Medicine

More information

Probability and Stochastic Processes

Probability and Stochastic Processes Probability and Stochastic Processes A Friendly Introduction Electrical and Computer Engineers Third Edition Roy D. Yates Rutgers, The State University of New Jersey David J. Goodman New York University

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Probability Lecture III (August, 2006)

Probability Lecture III (August, 2006) robability Lecture III (August, 2006) 1 Some roperties of Random Vectors and Matrices We generalize univariate notions in this section. Definition 1 Let U = U ij k l, a matrix of random variables. Suppose

More information