Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices
|
|
- Rosemary Sims
- 5 years ago
- Views:
Transcription
1 Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Takahiro Nishiyama a,, Masashi Hyodo b, Takashi Seo a, Tatjana Pavlenko c a Department of Mathematical Information Science, Tokyo University of Science, 1-3, Kagurazaka, Shinjuku-ku, Tokyo, , Japan b Graduate School of Economics, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, , Japan, JSPS Research Fellow c Department of Mathematics, KTH Royal Institute of Technology, SE-100, 44, Stockholm, Sweden Abstract We propose a new test procedure for testing linear hypothesis on the mean vectors of normal populations with unequal covariance matrices when the dimensionality, p exceeds the sample size N, i.e. p/n c <. Our procedure is based on the Dempster trace criterion and is shown to be power and size consistent in high dimensions. The asymptotic null and non-null distributions of the proposed test statistic are established in the high dimensional setting and improved estimator of the critical point of the test is derived using Cornish-Fisher expansion. As a special case, our testing procedure is applied to multivariate Behrens-Fisher problem. We illustrate the relevance and benefits of the proposed approach via Monte-Carlo simulation which show that our new test is comparable to, and in many cases is more powerful than, the tests for equality of means presented in the recent literature. Keywords: Cornish-Fisher transform, Dempster trace criterion, High dimensionality, Multivariate Behrens-Fisher problem, (N, p)-asymptotics. Corresponding author. Tel.: ; Fax: address: nishiyam@rs.kagu.tus.ac.jp (Takahiro Nishiyama )
2 1. Introduction The problem of testing mean vectors is a part of many procedures of multivariate statistical analysis, such as multiple comparisons, MANOVA and classification. The standard testing technique is based on classical Hotelling s T test which is known to have optimal performance properties in a large sample case, i.e assuming that the number of feature variables, p is fixed and is much smaller the sample size, N. However, in many practical applications of modern multivariate statistics (e.g. DNA microarray data) the number of feature p exceeds N, so that a straightforward use of T statistics is impossible due to singularity of the sample covariance matrix. Thus, to cope with this high dimensional situation, it would be desirable to develop new tests for N p, and investigate their asymptotic properties when both N and p are going to infinity; this asymptotic framework is also known as (N, p)-asymptotics. There have been a series of important results on this testing problem. In particular, Bai and Saranadasa (1996) focus on (normal) the two-sample case with equal covariance matrix, and propose to use an estimator of Euclidean norm of the shift vector instead of T statistics; they also establish the asymptotic normality of the test statistics assuming that p and N is of the same order. Later, by using the same approach as Bai and Saranadasa (1996), Aoshima and Yata (011) derive the test for which the significance levels can be controlled with or without the assumption of normality of the data, that is robust for the model assumption. Unlike the above approach, Srivastava (007) suggests F -type test statistics based on the Moore-Penrose inverse of the singular sample covariance matrix, and Srivastava and Du (008) have developed the test procedure based on the Dempster s trace criterion (D-criterion) (Dempster (1958,1960)) under the assumption of variable independence. Also it is important to note, that both these procedures are less restrictive than that of Bai and Saranadasa (1996) in a sense that they allow p grow faster than N according to (N, p)-asymptotic framework. Motivated by the previous literature and as part of effort in developing testing procedures with stable characteristics in high-dimensions, we focus on testing linear hypotheses of mean vectors for high-dimensional data with unequal covariance matrices. Our main objective in this paper is to show that our newly derived test statistics based on the Dempster trace criterion has a number of attractive asymptotic properties and demonstrates good performance in large (N, p). We state the asymptotic distribution of the derived test statistics under both the null and the local alternative hypotheses, and
3 provide the explicit expressions for asymptotic power of the test in terms of δ when N = O(p δ ) and 0 < δ < 1. To further improve the test performance, Cornish-Fisher approximation of the upper 100α percentile of the test is provided in (N, p)-asymptotics. We also apply our new test procedure to the multivariate Behrens-Fisher problem and compare its performance with the above-mentioned testing procedures from resent literature. The rest of the paper is organized as follows. Section provides description of the new test and main asymptotic results. Section 3 considers the application to the Behrens-Fisher problem. In Section 4, the level hypothesis is tested and the attained significance levels of the newly derived test is analyzed for a number of high dimensional scenarios. At last, we provide some concluding remarks. The proofs of theorems and lemmas stated in Section are given in the Appendix A and the Appendix B.. Description of the new test and asymptotic properties Let x (i) 1,..., x (i) N i, i = 1,..., k be N i samples from N p (µ i, Σ i ), respectively. We are interesting in the linear testing the hypothesis H 0 : k β i µ i = 0 vs. H 1 : H 0, (.1) i=1 where β 1,..., β k are given scalars and covariance matrices Σ i s are assumed to be unequal. In this study, we consider Bennett (1951) s transform derived Anderson (003) for k-sample case (see, Bennett (1951), Anderson (003)). For convenience, let N 1 be the smallest. Then, for j = 1,..., N 1, we define y j = β 1 x (1) j + k l= β l N1 N l ( x (l) j 1 N 1 N 1 m=1 x (l) m + 1 N1 N l N l Especially, when N 1 = = N k, y j = k l=1 β lx (l) j. The expected value of y j and the covariance matrix of y l and y m are n=1 x (l) n ). E ( ) k y j = β i µ i, i=1 Cov (y l, y m ) = δ lm ( k i=1 βi N 1 Σ i N i ), 3
4 respectively, where δ lm is Kronecker s delta. We further set k β i µ i µ, i=1 k i=1 β i N 1 N i Σ i Σ, respectively, and note that y 1,..., y N1 are independent and identically distributed as N p (µ, Σ). Thus, we may consider testing H 0 : µ = 0 vs. H 1 : µ 0, (.) which is equivalent to (.1). For testing (.) under the assumption that p > n 1 = N 1 1, we define a new test statistic as follows where T D = N 1y y trs, (.3) y = 1 N 1 y N i, S = 1 N 1 (y 1 n i y)(y i y). 1 i=1 T D is based on Dempster trace criterion and does not require any restrictions on the relation between the dimension and sample size. At first, we derive the null distribution of the statistic (.3) under the following (N, p)-asymptotic framework: (A.1) : p, n 1, i=1 Further, in addition to (A1), we assume that trσ i (A.) : lim p p trσ i (A.3) : lim p p p n 1 c (0, ). a i (0, ), i = 1,..., 4, a i (0, ), i = 1,..., 8. Let T D = { } N1 y y p trs 1. (.4) 4
5 The following theorem provides asymptotic null distribution of T D /σ D where a trσ /p σ D = =. a 1 trσ/p Theorem.1. When the null hypothesis H 0 : µ = 0 is true, the distribution function of T D / σ D can be expanded in the asymptotic framework (A.1) and under assumption (A.3) as P ( ) TD z σ D [ 1 = Φ(z) φ(z) p c 3 h (z) + 1 p {c 4h 3 (z) + c 6 h 5 (z)} + 1 ] c h 1 (z) + O(p 3/ ), (.5) n 1 where σ D is obtained by replacing a 1 and a with their estimators, Φ(z) is the distribution function of the standard normal distribution, c = 1, c 3 = a3 3 a 3, c 4 = a 4, c a 6 = a 3, 9a 3 and h i (z) s (i = 1,..., 6) are the Hermite polynomials given by h 1 (z) = z, h (z) = z 1, h 3 (z) = z 3 3z, h 4 (z) = z 4 6z + 3, h 5 (z) = z 5 10z z, h 6 (z) = z 6 15z z 15. Proof. See, Appendix A.1. In practice, c i s and a i s are unknown. Hence, to use the result of Theorem, we need their estimators that are expected to be good in high-dimension setting. As sample counterparts of a i s, we use their (N, p)-consistent and unbiased estimators derived in Srivastava(005), Srivastava and Yanagihara (010) and Hyodo, Takahashi and Nishiyama (01) as â 1 = trs p, 5
6 â = n 1 p(n 1 1)(n 1 + ) n 1 {trs (trs) â 3 = (n 1 1)(n 1 )(n 1 + )(n 1 + 4) { trs 3 } 3(n 1 + )(n 1 1)â 1 â n 1 p â 3 1, p â 4 = 1 ( trs 4 ) pb 1 â 1 p b â b 0 p 1â pb 3 â n 1 p 3 â 4 1, where n 1 }, b 0 = n 1 (n n 1 + 1n ), b 1 = n 1 (n 1 + 6n 1 + 9), b = n 1 (3n 1 + ), b 3 = n 1 (n 1 + 5n 1 + 7). Type 1 error of the asymptotic test based on using the main term of (.5) can be essentially improved by the corrected estimator of the upper 100α-percentile of T D / σ. This correction is achieved by using Cornish-Fisher expansion. Corollary.. Under the asymptotic framework (A.1) and assumption (A.3), Cornish-Fisher expansion of the estimated upper percentile of T D / σ is derived as ẑ(α) = z α + 1 â3 p 3 (z â 3 α 1) + 1 p { â4 â z α (zα 3) â 3 z 9â 3 α (zα 5) } + 1 n 1 z α + O p (p 3/ ), (.6) where z α is the upper 100α% percentile of the standard normal distribution. Proof. See, Appendix A.. Next, we state the asymptotic distribution of the test statistic T D under the local alternative. We state H L(δ) 1 : Nµ µ = O(p δ ), Nµ Σµ = O(p δ ), 0 < δ < 1, (.7) 6
7 where N = k i=1 N i and assume that N 1 /N i c i (0, ) (i = 1,..., k). Then, under the hypothesis H L(δ) 1, the test statistic can be expressed as TD = N 1y y trs 1 N 1µ µ trσ. (.8) Theorem.3 provides the limiting distribution of TD /σ D, where σd trσ = + 4N 1 µ Σµ, (trσ) under the hypothesis H L(δ) 1. Theorem.3. When the local alternative hypothesis H L(δ) 1 is true, the distribution of TD /σ D is asymptotically normal, i.e. TD = N 1y y/trs 1 N 1 µ µ/trσ d N(0, 1), σd (trσ + 4N 1 µ Σµ)/trΣ in the asymptotic framework (A.1) and under assumption (A.). Proof. See, Appendix A.3. Using the asymptotic distribution under the alternative hypothesis, we are able to describe the (N, p)-asymptotic behavior of the power function of our test statistic which is collaborated in the following theorem. Theorem.4. Let Power α ( T D, δ) = P ( ) TD z α H L(δ) 1, σ D be the power function of T D. Then, in the asymptotic framework (A.1) and assumptions (A.), (i) Power α ( T D, δ) α, if 0 < δ < 1/, ( ) (ii) Power α ( T N1 µ µ D, δ) Φ z α, if δ = 1/, trσ (iii) Power α ( T D, δ) 1, if 1/ < δ < 1. 7
8 Proof. See, Appendix A.4. In words, this theorem claims that the test statistic T D is (N, p)-consistent. 3. New test procedure for multivariate Behrens-Fisher problem In this section, we focus on an important special case of the testing problem (.1). We consider testing equality of mean vectors of two normal populations with unequal covariance matrices, that is, we consider testing the hypothesis H 0 : µ 1 = µ vs. H 1 : µ 1 µ. (3.1) We note that (3.1) is the special case that k =, β 1 = 1 and β = 1 in (.1). This problem is known as the multivariate Behrens-Fisher problem, and many authors have discussed (see, e.g., Bennett (1951), Johnson and Weerahandi (1988) and Yanagihara and Yuan (005)). Also, for highdimensional data, testing equality of mean vectors of two populations have been discussed by Bai and Saranadasa (1996), Srivastava (007), Chen and Qin (010), Aoshima and Yata (011), and so on. Especially, Chen and Qin (010) and Aoshima and Yata (011) gave a test statistic for the multivariate Behrens-Fisher problem in high-dimension setting. We propose a new test statistic for this problem by using the idea stated in Section, that is, we consider the following statistic where, for j = 1..., N 1, y j = x (1) j N1 N x () j + T D = N 1y y trs, 1 N 1 N1 N m=1 x () m 1 N N n=1 x () n. (3.) Then we note that y 1,..., y N1 are independent and identically distributed as N p (µ, Σ) where µ = µ 1 µ, Σ = Σ 1 + N 1 N Σ, (3.3) 8
9 respectively (see, Bennett (1951)), that is, (3.1) is equivalent to the following hypothesis: H 0 : µ = 0 vs. H 1 : µ 0. Therefore from Theorem.1, Corollary. and Theorem.4, we have following corollary. Corollary 3.1. Suppose that T D, â i s, c i s and Power α ( T D, δ) are defined according to (3.) and (3.3). Then, under the asymptotic framework (A.1) and assumption (A.3) the asymptotic distribution of T D under the null hypothesis H 0 : µ 1 = µ and Cornish-Fisher expansion of the upper percentile are derived as follows ( ) TD [ 1 P z = Φ(z) φ(z) p c 3 h (z) σ D + 1 p {c 4h 3 (z) + c 6 h 5 (z)} + 1 ] c h 1 (z) + O(p 3/ ), n 1 ẑ(α) = z α + 1 â3 p 3 (z â 3 α 1) + 1 p { â4 â z α (zα 3) â 3 z 9â 3 α (zα 5) respectively, and for the asymptotic power of T D we have } + 1 n 1 z α + O p (p 3/ ), (3.4) (i) Power α ( T D, δ) α, if 0 < δ < 1/, ( ) (ii) Power α ( T N1 µ µ D, δ) Φ z α, if δ = 1/, trσ (iii) Power α ( T D, δ) 1, if 1/ < δ < Simulation study A simulation study shows the effectiveness of the suggested test statistics in high dimension. We first provide a study justifying accuracy of the approximation for the critical point derived in Corollary. of our testing procedure 9
10 by simulating the Attained Significance Level (ASL), or size of the test. We draw k independent samples of size N i = 10(1 + i) and N i = 0(1 + i), where i = 1,..., k valid p-dimensional normal distributions under the null hypothesis (i.e. (.1)). We replicate this r = 10 5 times, and using T D from (.4) calculate ( ( ) of TD / σ > z α H 0 is true) ASL 1 α TD =, r and ( ( ) of TD / σ > ẑ(α) H 0 is true) ASL α TD =, r denoting the ASL of T D where z α is the upper 100α percentile of the standard normal distribution and ẑ(α) is the corrected value of the upper 100α percentile given by (.6). Tables 1-4 provide the results for a number of high-dimensional scenarios and an assortment of null hypothesis H 0 specified by the choice of β i s for i = 1,..., k. Also, we set up the covariance structures Σ 1 = I, Σ = (0. i j ), Σ 3 = (0.5 i j ) for the case k = 3 and Σ 1 = I, Σ = I, Σ 3 = (0. i j ), Σ 4 = (0.5 i j ) for the case k = 4, respectively. For each table, the simulated size of T D is systematically lower for the suggested corrected percentile ẑ(α), and ẑ(α) is closer to the selected nominal level α. Furthermore, in all the settings of H 0, the size of the test remains essentially the same when both p and k grows thereby validating our asymptotic results. Further, we perform a series of power simulations to investigate consistency of our test and to demonstrate its improved performance under certain alternative hypotheses. From now on we focus on the simulations for the case of k =, representing multivariate Behrens-Fisher problem. We provide examples for two cases of H L(δ) 1, = 5 and = 10 for the following settings of Σ i : Table 5 : Σ 1 = I, Σ = (0.5 i j ) and = 5, Table 6 : Σ 1 = I, Σ = (0.5 i j ) and = 10, Table 7 : Σ 1 = (0. i j ), Σ = (0.5 i j ) and = 5, Table 8 : Σ 1 = (0. i j ), Σ = (0.5 i j ) and = 10, 10
11 where = µ 1 µ, and we recall that in our notations, see (.7), N = O(p δ ). Using the same number of replicates as above, we draw k = 1, independent samples of size N k under H L(δ) 1 and using TD from (.8) calculate the empirical power as ( ) EP α TD, δ = of ( ) TD / σ D > ẑ(α) H L(δ) 1 is true. r To make comparisons of T D with the test statistics defined in (Srivastava (007) (S), Srivastava and Du (008) (SD) and Aoshima and Yata (011) (AY)), the same is repeated for the corresponding tests. Srivastava (007) and Srivastava and Du (008) discussed one and two sample problem, but their procedures for testing equality of two mean vectors are based on the assumption of equal covariance structure. So, in order to adapt these procedures to our approach, we at first consider transformation (3.), and adapted transformed y j (j = 1,..., N 1 ) to their procedures for one sample problem. Aoshima and Yata (011) proposed the test procedure for the multivariate Behrens-Fisher problem with size α and power no less than 1 β when L, where α, β (0, 1/) and L (> 0) are prespecified constant. As they assumed L = o(p 1/ ), we set up β = 0.1 and L = /. Also, procedure by Aoshima and Yata (011) does not need the assumption of normality, but in order to compare it with other procedures we carry out the simulation study under normality. Tables 5 to 8 show that our test statistic appear to be consistent as (N, p). Further, under simulated alternative hypothesis H L(δ) 1 with = 5 our statistic performs better than both S and SD, and is comparable to AY. For the alternative with = 10 our test turns out to be most powerful for almost all the settings of p ans N i. Hence, as increases the newly proposed test appears to dominate that of AY, and both are seen to be consistent as grows. We also see that neither test perform particularly well for = 5, in combination with small N i and large p. Further, a series of simulations is provided to demonstrate the advantage of the correction procedure suggested in (3.4) for the critical point of the test. We simulate the ASL of our test using ẑ(α) from the approximation (3.4), and ASL of AY which uses L z α /(z α + z β ) as the critical point (see for details Aoshima and Yata (011)). Both S and SD use z α. Tables
12 give ASL for the following cases: Table 9 : Σ 1 = I, Σ = (0.5 i j ) N 1 = 10, and N = 0, Table 10 : Σ 1 = I, Σ = (0.5 i j ) N 1 = 0, and N = 30, Table 11 : Σ 1 = (0. i j ), Σ = (0.5 i j ) N 1 = 10, and N = 0, Table 1 : Σ 1 = (0. i j ), Σ = (0.5 i j ) N 1 = 0, and N = 30. Also, AY(.5) and AY(5) denote AY with L =.5 and L = 5, respectively. Tables 9 to 1 show that the our test procedure based on ẑ(α) outperforms all other procedures for both types of alternatives, = 5 and = 10, and for all the settings of p and N i, yielding the value of ASL closest to the nominal level α. Lastly, we study the effect of p on the power of the test. From tables 5 to 1 one can see that for both = 5 and = 10 our test statistic appear to have most stable power when p grows, and this result remain valid for very small sizes. 5. Concluding remarks We have proposed a new test statistic for testing linear hypothesis of mean vectors assuming that covariance matrices are unequal. We also suggested a method for correcting the critical point of the test that leads to better performance in large p and small N i case. Simulations indicate that the newly derived test statistic, T D, in (.4) appear to perform well for a range of settings of H 0 specified by β i s and k >, when ẑ(α) from Corollary. is used as a critical point. For the multivariate Behrens-Fisher problem, our test procedure has a comparable power performance to that of Aoshima and Yata (011), and outperforms both procedures derived in Srivastava (007) and Srivastava and Du (008) for all the high-dimensional settings of p and N i and a number of settings of Σ i s. It is especially important to point out that our procedure performs well for small deviations from H 0, i.e. when = 5. In conclusion, our test can be recommended for testing the mean vectors for both k > case and multivariate Behrens-Fisher problem, when p is much larger than N i, and when a small deviation from H 0 is suspected. 1
13 Appendix A. A.1. Proof of Theorem.1. To derive the asymptotic expansion of the distribution of T D / σ D under null hypothesis we need estimators of a 1 and a. So we consider unbiased and (N, p)-consistent estimators of a 1 and a derived in Srivastava (005) (see, section ). By (N, p) consistency, we obtain T D σ D = N 1y y trs pâ. Let w = n 1 (â a ) which implies that w = O p (1) (see Srivastava (005)) and T D / σ D can be expanded as ( T D 1 = (N 1 y y trs) ) 1/ w σ D pâ n 1 a ( 1 = (N 1 y y trs) 1 1 ) w + O p (n 1 ). pâ n 1 a To further explore the distribution of T D / σ D, we derive the asymptotic expansion of the characteristic function. The following lemma is proved in Appendix B. Lemma A.1. Under the asymptotic framework (A.1), the characteristic function of T D / σ D can be expressed as 1/ (it) n C(t) = I (it) 1 / p Σ pa I p + Σ Ez n 1 pa [g(s, y)], 1,Z where g(s, y) = 1 (it)w n 1 a pa (N 1 y y trs) + O p (n 1 ). We now proceed to show Theorem.1 by analyzing 1/ (it) n (i) I (it) 1 / p Σ pa I p + Σ, (ii) n 1 pa Ez [g(s, y)], 1,Z 13
14 respectively. At first, for (i), the following equality are hold 1/ (it) n log I (it) 1 / p Σ log pa I p + Σ n 1 pa = 1 ( ) (it) trσ + 1 ( ) (it) trσ (it) trσ 3 pa pa 3 pa ( ) { (it) trσ 4 + O(p 3/ ) } n 1 (it) trσ 4 pa n 1 pa (it) trσ + O(p 7/ ) } n 1pa = (it) (it) 3 a (it)4 a 4 pa 3 pa + (it) n 1 + O(n 3/ 1 ). Therefore (i) can be rewritten as 1/ (it) I (it) p Σ pa I p + Σ n 1 pa { (it) (it) 3 a 3 = exp (it)4 a 4 pa 3 pa { } { (it) (it) 3 a 3 = exp (it)4 a 4 pa 3 pa n 1 / } + (it) + O(n 3/ 1 ) n 1 + (it)6 a 3 + (it) 9pa 3 n 1 Secondly, we expand Ez [g(s, y)] in (ii) as follows: 1,Z E (z 1,Z ) [g(s, y)] (it) = 1 E[w(N 1 y y trs) + O p (n 1 )] n 1 a pa (it) [{ n ( 1 tr(σz Z = 1 E ΣZ Z ) n 1 a pa p(n 1 + )(n 1 1) (tr(σz Z )) n 3 1 ) trσ p n 1 } +O(n 3/ 1 ). }( tr(σz 1 z 1) tr(σz Z ) ) + O p (n 1 ) n 1 14 ] (1)
15 (it) [{ 1 ( ) = 1 E n 1 a pa (n 1 + )(n 1 1) tr BZZ BZZ 1 ( ( )) ( ) tr BZ n 1 (n 1 + )(n 1 1) Z trσ }{ tr Az 1z1 1 ( )}] tr BZ n Z + O(n 1 ), () 1 where A = Σ ( I p 1 ( ) 1 (it) (it) Σ), B = Σ I p + Σ, pa n 1 pa respectively. lemma. To calculate the expectation in (), we need the following Lemma A.. Let z 1 and Z be mutually independently and distributed as N p (0, I p ) and N pn1 (0, I p I n1 ), respectively. Then the following expectations are calculated as E[tr(BZZ BZZ )]E[tr(Az 1z 1 )] = tra{n 1 (n 1 + 1)trB + n 1 (tra) }, E[tr(BZZ BZZ )tr(bzz )] = 4n 1 (n 1 + 1)trB 3 + n 1 (n 1 + n 1 + 4) trbtrb + n 1(trB) 3, E[{tr(BZZ )} tr(az 1z 1 )] = tra{n 1 trb + n 1(trB) }, E[{tr(BZZ )} 3 ] = 8n 1 trb 3 + 6n 1trBtrB + n 3 1(trB) 3, E[trΣ tr(az 1z 1 )] = trσ tra, E[trΣ tr(bzz )] = trσ trb. Proof. See, Himeno (007). Now, by applying Lemma A. to (), we obtain: E (z 1,Z ) [g(s, y)] = 1 + O(n 3/ 1 ). Summarizing (1) and (), we obtain the expansion of the characteristic function { (it) } [ (it) 3 a 3 C(t) = exp (it)4 a 4 + (it)6 a ] 3 + (it) + O(n 3/ pa 3 pa 9pa 3 1 ). n 1 15
16 By inverting this characteristic function, we get the following density function of T D / σ D ; f(z) = 1 exp{ itz}c(t)dt π [ = φ(z) 1+ 1 c 3 h 3 (z)+ 1 p p {c 4h 4 (z)+c 6 h 6 (z)}+ 1 ] c h (z) +O(n 3/ 1 ), n 1 where φ(z) is the density function of the standard normal distribution, c = 1, c 3 = a3 3 a 3, c 4 = a 4, c a 6 = a 3, 9a 3 and h i (z) s (i = 1,..., 6) are the Hermite polynomials given by h 1 (z) = z, h (z) = z 1, h 3 (z) = z 3 3z, h 4 (z) = z 4 6z + 3, h 5 (z) = z 5 10z z, h 6 (z) = z 6 15z z 15. Therefore, we obtain Theorem.1. A.. Proof of Corollary.. We prepare following Lemma to derive the Cornish-Fisher expansion of the upper 100α percentiles of T D / σ D. Lemma A.3. Let z(α) be z(α) = z α + 1 p q 1 (z α ) + 1 p q (z α ) + 1 n 1 q 3 (z α ), where z α is the upper 100α% point of the standard normal distribution and q 1 (z α ) = a3 3 (z a 3 α 1), q (z α ) = a 4 z a α (zα 3) a 3 z 9a 3 α (zα 5), q 3 (z α ) = z α. 16
17 Then under the framework (A.1) and assumption (A.3), ( ) TD P z(α) = 1 α + O(p 3/ ). σ D Proof. See, Appendix B.. Now, by replacing a i s in Lemma A.3 with their unbiased and consistent estimators â i s for i = 1,..., 4, we obtain Corollary.. A.3. Proof of Theorem.3. We derive the limiting distribution of the statistic TD /σ D under (A.1), (A.) and H L(δ) 1. TD /σ D is expanded as T D σ D = (trσ/trs)n 1y y trσ N 1 µ µ trσ + 4N 1 µ Σµ 1 = trσ + 4N 1 µ Σµ (z 1Σz 1 + N 1 µ Σ 1/ z 1 trσ) + o p (1). Also, the characteristic function of TD /σ D is calculated as { it(z C(t) = Ez 1 [exp 1Σz 1 + }] N 1 µσ 1/ z 1 trσ) + o(1) trσ + 4N 1 µ Σµ { } { (it)trσ = etr (π) p/ etr 1 ( I p trσ + 4N 1 µ Σµ z 1 (it)σ trσ + 4N 1 µ Σµ ) z 1 z 1 + (it) N 1 µ Σ 1/ z 1 trσ + 4N 1 µ Σµ } dz 1 + o(1). Further, we consider the following transformation z 1 = ( 1/ (it)σ I p z 1 trσ + 4N 1 µ Σµ) ( ) 1/ (it)σ (it) I p N1 Σ trσ + 4N 1 µ Σµ trσ + 4N 1 µ Σµ 1/ µ, 17
18 whose Jacobian is given by I p (it)σ/ trσ + 4N 1 µ Σµ 1/. Also, under (A.), 1/ log I (it)σ p trσ + 4N 1 µ Σµ = (it)trσ trσ + 4N 1 µ Σµ + (it) trσ trσ + 4N 1 µ Σµ + o(1). Therefore, 1/ I (it)σ p trσ + 4N 1 µ Σµ ( ) (it)trσ = exp trσ + 4N 1 µ Σµ + (it) trσ + o(1), trσ + 4N 1 µ Σµ and then we obtain the expansion of the characteristic function ( ) ( (it)trσ (it)trσ C(t) = exp exp trσ + 4N 1 µ Σµ trσ + 4N 1 µ Σµ ) ( ) (it) trσ (it) N 1 µ Σµ + exp + o(1) trσ + 4N 1 µ Σµ trσ + 4N 1 µ Σµ { } (it) = exp + o(1). Therefore, T D /σ D d N(0, 1). A.4. Proof of Theorem.4. Let T = T D T D = p N 1µ µ trσ, then the power of T D with significance level α can be expressed as: Power α ( T D, δ) = P (T D > σ D z α T H L(δ) 1 ). 18
19 Then, by the results of Theorem.3, lim Power α( T D, δ) = lim Φ p p ( T σ D z α By the assumption (A.), Power α ( T D, δ) 1 when 1/ < δ < 1, since T, and Power α ( T D, δ) α when 0 < δ < 1/, since T 0 and σd σ D. When δ = 1/, we obtain lim Power α( T D, δ) = Φ p Appendix B. B.1. Proof of Lemma A.1. ( T σ D z α ) σ D ). ( ) N1 µ µ = Φ z α. trσ The characteristic function of T D / σ D is calculated as [ ( (it) C(t) = E exp T )] [ { } ] D (it) = E exp (N 1 y y trs) g(s, y), σ D pa where g(s, y) = 1 (it)w n 1 a pa (N 1 y y trs) + O p (n 1 ). Let z 1 be a p-dimensional random vector distributed as N p (0, I p ), and Z be a p n 1 random matrix such that vec(z ) is distributed as N pn1 (0, I p I n1 ). Then we note that N 1 y y = tr(σ 1/ z 1 z 1Σ 1/ ), n 1 S = Σ 1/ Z Z Σ 1/, and we can rewrite the characteristic function as [ C(t) = E exp = { ( (it) tr(σ 1/ z 1 z pa 1Σ 1/ ) tr(σ1/ Z Z Σ 1/ ) { ( (π) (n1+1)p/ etr 1 ) } (it) I p Σ z 1 z 1 pa { etr 1 ( ) } 1 I p + Σ Z Z g(s, y)dz 1 dz. n 1 pa 19 n 1 )} ] g(s, y)
20 Here, we consider following transformations z 1 = ( I p ) 1/ ( ) n1 / (it) (it) Σ z pa 1, Z = I p + Σ Z n 1 pa, respectively, and the Jacobians for these transformations are (it) I p Σ pa 1/, (it) I p + Σ n 1 pa n 1 / Therefore the characteristic function can be written as 1/ (it) n C(t) = I (it) 1 / p Σ pa I p + Σ n 1 pa { (π) (n1+1)p/ etr 1 z 1z 1 1 } Z Z g(s, y)dz 1dZ = 1/ (it) I (it) p Σ pa I p + Σ n 1 pa n 1 /. Ez [g(s, y)]. 1,Z B.. Proof of Lemma A.3 Let z(α) be the upper 100α percentile of T D / σ D. We further expand z(α) as z(α) = u + 1 p q 1 (u) + 1 p q (u) + 1 n 1 q 3 (u). Now, from the result of Theorem.1, we derive the following expansion ( ) TD 1 α = P z(α) σ D [ 1 = Φ(z(α)) φ(z(α)) p c 3 h (z(α)) + 1 p {c 4h 3 (z(α)) + c 6 h 5 (z(α))} + 1 ] c h 1 (z(α)) + O(p 3/ ). n 1 0
21 Then, by Taylor expansion of Φ, φ and h i s around u, we obtain ( ) [ TD 1 P z(α) = Φ(u) φ(u) p {q 1 (u) c 3 h (u)} σ D + 1 p {q (u) c 4 h 3 (u) c 6 h 5 (u)} + 1 ] {q 3 (u) c h 1 (u)} + O(p 3/ ). n 1 Therefore, we have q 1 (u) = a3 3 (u 1), a 3 q (u) = a 4 u(u 3) a 3 u(u 5), a 9a 3 q 3 (u) = u. Acknowledgements T. Nishiyama s research was in part supported by the travel grant from The Royal Swedish Academy of Sciences, G.S. Magnuson s fond. M. Hyodo s research was in part supported by Japan Society for the Promotion of Science (JSPS). T. Seo s research was in part supported by Grant-in-Aid for Scientific Research (C)( ). T. Pavlenko s research was in part supported by grant from Swedish Research Council ( ). References [1] Anderson, T. W. (003). An Introduction to Multivariate Statistical Analysis Third Edition. Wiley, New york. [] Aoshima, M. and Yata, K. (011). Two stage procedures for highdimensional data. Sequential Analysis, 30, [3] Bai, Z. and Saranadasa, H. (1996). Effect of high dimension: by an example of a two sample problem. Statistica Sinica, 6,
22 [4] Bennett, B. M. (1951). Note on a solution of the generalized Behrens- Fisher problem. Annals of the Institute of Statistical Mathematics,, [5] Chen, S. X. and Qin, Y. L. (010). A two-sample test for highdimensional data with applications to gene-set testing. The Annals of Statistics, 38, [6] Dempster, A. P. (1958). A high dimensional two sample significance test. Annals of Mathematical Statistics, 9, [7] Dempster, A. P. (1960). A significant test for the separation of two highly multivariate small samples. Biometrics, 16, [8] Himeno, T. (007). Asymptotic expansions of the null distributions for the Dempster trace criterion. Hiroshima Mathematical Journal, 37, [9] Hyodo, M., Takahashi, S. and Nishiyama, T. (01). Multiple comparisons among mean vectors when the dimension is larger than the total sample size. Technical Report No.1-01, Statistical Research Group, Hiroshima University. [10] Johnson, R. A. and Weerahendi, S. (1988). A Baysian solution to the multivariate Behrens-Fisher problem. Journal of the American Statistical Association, 83, [11] Srivastava, M. S. (005). Some tests concerning the covariance matrix in high-dimensional data. Journal of Japan Statistical Society, 35, [1] Srivastava, M. S. (007). Multivariate theory for analyzing high dimensional data. Journal of Japan Statistical Society, 37, [13] Srivastava, M. S. and Du, M. (008). A test for the mean vector with fewer observations than the dimension. Journal of Multivariate Analysis, 99, [14] Srivastava, M. S. and Yanagihara, H. (010). Testing the equality of several covariance matrices with fewer observations than the dimension. Journal of Multivariate Analysis, 101,
23 [15] Yanagihara, H. and Yuan, K. H. (005). Three approximate solutions to the multivariate Behrens-Fisher problem. Communications in Statistics - Simulation and Computation, 34,
24 Table 1: ASL in the case of k = 3 and β 1 = β = β 3 = 1 (i = 1,, 3). N i = 10(1 + i) N i = 0(1 + i) p α z α t ASL 1 α ASL α t ASL 1 α ASL α Table : ASL in the case of k = 3, β 1 = 1 and β = β 3 = 1/ (i = 1,, 3). N i = 10(1 + i) N i = 0(1 + i) p α z α t ASL 1 α ASL α t ASL 1 α ASL α
25 Table 3: ASL in the case of k = 4 and β 1 = β = β 3 = β 4 = 1 (i = 1,..., 4). N i = 10(1 + i) N i = 0(1 + i) p α z α t ASL 1 α ASL α t ASL 1 α ASL α Table 4: ASL in the case of k = 4, β 1 = β 3 = 1 and β = β 4 = 1 (i = 1,..., 4). N i = 10(1 + i) N i = 0(1 + i) p α z α t ASL 1 α ASL α t ASL 1 α ASL α
26 Table 5: Empirical powers with Σ 1 = I, Σ = (0.5 i j ) and = 5. N 1 = 10, N = 0 N 1 = 0, N = 30 p α S SD AY NHSP S SD AY NHSP Table 6: Empirical powers with Σ 1 = I, Σ = (0.5 i j ) and = 10. N 1 = 10, N = 0 N 1 = 0, N = 30 p α S SD AY NHSP S SD AY NHSP
27 Table 7: Empirical powers with Σ 1 = (0. i j ), Σ = (0.5 i j ) and = 5. N 1 = 10, N = 0 N 1 = 0, N = 30 p α S SD AY NHSP S SD AY NHSP Table 8: Empirical powers with Σ 1 = (0. i j ), Σ = (0.5 i j ) and = 10. N 1 = 10, N = 0 N 1 = 0, N = 30 p α S SD AY NHSP S SD AY NHSP
28 Table 9: ASL in the case of Σ 1 = I, Σ = (0.5 i j ). N 1 = 10, N = 0 p α S SD AY(.5) AY(5) NHSP Table 10: ASL in the case of Σ 1 = I, Σ = (0.5 i j ). N 1 = 0, N = 30 p α S SD AY(.5) AY(5) NHSP
29 Table 11: ASL in the case of Σ 1 = (0. i j ), Σ = (0.5 i j ). N 1 = 10, N = 0 p α S SD AY(.5) AY(5) NHSP Table 1: ASL in the case of Σ 1 = (0. i j ), Σ = (0.5 i j ). N 1 = 0, N = 30 p α S SD AY(.5) AY(5) NHSP
Testing Block-Diagonal Covariance Structure for High-Dimensional Data
Testing Block-Diagonal Covariance Structure for High-Dimensional Data MASASHI HYODO 1, NOBUMICHI SHUTOH 2, TAKAHIRO NISHIYAMA 3, AND TATJANA PAVLENKO 4 1 Department of Mathematical Sciences, Graduate School
More informationLecture 3. Inference about multivariate normal distribution
Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates
More informationApproximate interval estimation for EPMC for improved linear discriminant rule under high dimensional frame work
Hiroshima Statistical Research Group: Technical Report Approximate interval estimation for PMC for improved linear discriminant rule under high dimensional frame work Masashi Hyodo, Tomohiro Mitani, Tetsuto
More informationSIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA
SIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA Kazuyuki Koizumi Department of Mathematics, Graduate School of Science Tokyo University of Science 1-3, Kagurazaka,
More informationSHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan
A New Test on High-Dimensional Mean Vector Without Any Assumption on Population Covariance Matrix SHOTA KATAYAMA AND YUTAKA KANO Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama,
More informationOn testing the equality of mean vectors in high dimension
ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 17, Number 1, June 2013 Available online at www.math.ut.ee/acta/ On testing the equality of mean vectors in high dimension Muni S.
More informationTesting equality of two mean vectors with unequal sample sizes for populations with correlation
Testing equality of two mean vectors with unequal sample sizes for populations with correlation Aya Shinozaki Naoya Okamoto 2 and Takashi Seo Department of Mathematical Information Science Tokyo University
More informationTesting Block-Diagonal Covariance Structure for High-Dimensional Data Under Non-normality
Testing Block-Diagonal Covariance Structure for High-Dimensional Data Under Non-normality Yuki Yamada, Masashi Hyodo and Takahiro Nishiyama Department of Mathematical Information Science, Tokyo University
More informationHigh-dimensional two-sample tests under strongly spiked eigenvalue models
1 High-dimensional two-sample tests under strongly spiked eigenvalue models Makoto Aoshima and Kazuyoshi Yata University of Tsukuba Abstract: We consider a new two-sample test for high-dimensional data
More informationConsistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis
Consistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis Yasunori Fujikoshi and Tetsuro Sakurai Department of Mathematics, Graduate School of Science,
More informationExpected probabilities of misclassification in linear discriminant analysis based on 2-Step monotone missing samples
Expected probabilities of misclassification in linear discriminant analysis based on 2-Step monotone missing samples Nobumichi Shutoh, Masashi Hyodo and Takashi Seo 2 Department of Mathematics, Graduate
More informationAsymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data
Sri Lankan Journal of Applied Statistics (Special Issue) Modern Statistical Methodologies in the Cutting Edge of Science Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations
More informationOn the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors
On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors Takashi Seo a, Takahiro Nishiyama b a Department of Mathematical Information Science, Tokyo University
More informationEstimation of the large covariance matrix with two-step monotone missing data
Estimation of the large covariance matrix with two-ste monotone missing data Masashi Hyodo, Nobumichi Shutoh 2, Takashi Seo, and Tatjana Pavlenko 3 Deartment of Mathematical Information Science, Tokyo
More informationHigh-dimensional asymptotic expansions for the distributions of canonical correlations
Journal of Multivariate Analysis 100 2009) 231 242 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva High-dimensional asymptotic
More informationOn the conservative multivariate multiple comparison procedure of correlated mean vectors with a control
On the conservative multivariate multiple comparison procedure of correlated mean vectors with a control Takahiro Nishiyama a a Department of Mathematical Information Science, Tokyo University of Science,
More informationConsistency of test based method for selection of variables in high dimensional two group discriminant analysis
https://doi.org/10.1007/s42081-019-00032-4 ORIGINAL PAPER Consistency of test based method for selection of variables in high dimensional two group discriminant analysis Yasunori Fujikoshi 1 Tetsuro Sakurai
More informationMULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA
J. Japan Statist. Soc. Vol. 37 No. 1 2007 53 86 MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA M. S. Srivastava* In this article, we develop a multivariate theory for analyzing multivariate datasets
More informationStatistical Inference On the High-dimensional Gaussian Covarianc
Statistical Inference On the High-dimensional Gaussian Covariance Matrix Department of Mathematical Sciences, Clemson University June 6, 2011 Outline Introduction Problem Setup Statistical Inference High-Dimensional
More informationConsistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis
Consistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis Tetsuro Sakurai and Yasunori Fujikoshi Center of General Education, Tokyo University
More informationTesting Some Covariance Structures under a Growth Curve Model in High Dimension
Department of Mathematics Testing Some Covariance Structures under a Growth Curve Model in High Dimension Muni S. Srivastava and Martin Singull LiTH-MAT-R--2015/03--SE Department of Mathematics Linköping
More informationStatistica Sinica Preprint No: SS R2
Statistica Sinica Preprint No: SS-2016-0063R2 Title Two-sample tests for high-dimension, strongly spiked eigenvalue models Manuscript ID SS-2016-0063R2 URL http://www.stat.sinica.edu.tw/statistica/ DOI
More informationOn Selecting Tests for Equality of Two Normal Mean Vectors
MULTIVARIATE BEHAVIORAL RESEARCH, 41(4), 533 548 Copyright 006, Lawrence Erlbaum Associates, Inc. On Selecting Tests for Equality of Two Normal Mean Vectors K. Krishnamoorthy and Yanping Xia Department
More informationOn corrections of classical multivariate tests for high-dimensional data
On corrections of classical multivariate tests for high-dimensional data Jian-feng Yao with Zhidong Bai, Dandan Jiang, Shurong Zheng Overview Introduction High-dimensional data and new challenge in statistics
More informationTWO-SAMPLE BEHRENS-FISHER PROBLEM FOR HIGH-DIMENSIONAL DATA
Statistica Sinica (014): Preprint 1 TWO-SAMPLE BEHRENS-FISHER PROBLEM FOR HIGH-DIMENSIONAL DATA Long Feng 1, Changliang Zou 1, Zhaojun Wang 1, Lixing Zhu 1 Nankai University and Hong Kong Baptist University
More informationA new multivariate CUSUM chart using principal components with a revision of Crosier's chart
Title A new multivariate CUSUM chart using principal components with a revision of Crosier's chart Author(s) Chen, J; YANG, H; Yao, JJ Citation Communications in Statistics: Simulation and Computation,
More informationHigh-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi
High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi Faculty of Science and Engineering, Chuo University, Kasuga, Bunkyo-ku,
More informationThe Third International Workshop in Sequential Methodologies
Area C.6.1: Wednesday, June 15, 4:00pm Kazuyoshi Yata Institute of Mathematics, University of Tsukuba, Japan Effective PCA for large p, small n context with sample size determination In recent years, substantial
More informationTests for Covariance Matrices, particularly for High-dimensional Data
M. Rauf Ahmad Tests for Covariance Matrices, particularly for High-dimensional Data Technical Report Number 09, 200 Department of Statistics University of Munich http://www.stat.uni-muenchen.de Tests for
More informationEPMC Estimation in Discriminant Analysis when the Dimension and Sample Sizes are Large
EPMC Estimation in Discriminant Analysis when the Dimension and Sample Sizes are Large Tetsuji Tonda 1 Tomoyuki Nakagawa and Hirofumi Wakaki Last modified: March 30 016 1 Faculty of Management and Information
More informationT 2 Type Test Statistic and Simultaneous Confidence Intervals for Sub-mean Vectors in k-sample Problem
T Type Test Statistic and Simultaneous Confidence Intervals for Sub-mean Vectors in k-sample Problem Toshiki aito a, Tamae Kawasaki b and Takashi Seo b a Department of Applied Mathematics, Graduate School
More informationOn corrections of classical multivariate tests for high-dimensional data. Jian-feng. Yao Université de Rennes 1, IRMAR
Introduction a two sample problem Marčenko-Pastur distributions and one-sample problems Random Fisher matrices and two-sample problems Testing cova On corrections of classical multivariate tests for high-dimensional
More informationTests Using Spatial Median
AUSTRIAN JOURNAL OF STATISTICS Volume 35 (2006), Number 2&3, 331 338 Tests Using Spatial Median Ján Somorčík Comenius University, Bratislava, Slovakia Abstract: The multivariate multi-sample location problem
More informationSOME ASPECTS OF MULTIVARIATE BEHRENS-FISHER PROBLEM
SOME ASPECTS OF MULTIVARIATE BEHRENS-FISHER PROBLEM Junyong Park Bimal Sinha Department of Mathematics/Statistics University of Maryland, Baltimore Abstract In this paper we discuss the well known multivariate
More informationOn the Testing and Estimation of High- Dimensional Covariance Matrices
Clemson University TigerPrints All Dissertations Dissertations 1-009 On the Testing and Estimation of High- Dimensional Covariance Matrices Thomas Fisher Clemson University, thomas.j.fisher@gmail.com Follow
More informationAn Introduction to Multivariate Statistical Analysis
An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents
More informationA Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data
A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data Yujun Wu, Marc G. Genton, 1 and Leonard A. Stefanski 2 Department of Biostatistics, School of Public Health, University of Medicine
More informationHYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX
Submitted to the Annals of Statistics HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX By Shurong Zheng, Zhao Chen, Hengjian Cui and Runze Li Northeast Normal University, Fudan
More informationA Box-Type Approximation for General Two-Sample Repeated Measures - Technical Report -
A Box-Type Approximation for General Two-Sample Repeated Measures - Technical Report - Edgar Brunner and Marius Placzek University of Göttingen, Germany 3. August 0 . Statistical Model and Hypotheses Throughout
More informationJackknife Empirical Likelihood Test for Equality of Two High Dimensional Means
Jackknife Empirical Likelihood est for Equality of wo High Dimensional Means Ruodu Wang, Liang Peng and Yongcheng Qi 2 Abstract It has been a long history to test the equality of two multivariate means.
More informationASYMPTOTIC RESULTS OF A HIGH DIMENSIONAL MANOVA TEST AND POWER COMPARISON WHEN THE DIMENSION IS LARGE COMPARED TO THE SAMPLE SIZE
J Jaan Statist Soc Vol 34 No 2004 9 26 ASYMPTOTIC RESULTS OF A HIGH DIMENSIONAL MANOVA TEST AND POWER COMPARISON WHEN THE DIMENSION IS LARGE COMPARED TO THE SAMPLE SIZE Yasunori Fujikoshi*, Tetsuto Himeno
More informationRobustness of Simultaneous Estimation Methods to Varying Degrees of Correlation Between Pairs of Random Deviates
Global Journal of Mathematical Sciences: Theory and Practical. Volume, Number 3 (00), pp. 5--3 International Research Publication House http://www.irphouse.com Robustness of Simultaneous Estimation Methods
More informationApproximate Distributions of the Likelihood Ratio Statistic in a Structural Equation with Many Instruments
CIRJE-F-466 Approximate Distributions of the Likelihood Ratio Statistic in a Structural Equation with Many Instruments Yukitoshi Matsushita CIRJE, Faculty of Economics, University of Tokyo February 2007
More informationMLES & Multivariate Normal Theory
Merlise Clyde September 6, 2016 Outline Expectations of Quadratic Forms Distribution Linear Transformations Distribution of estimates under normality Properties of MLE s Recap Ŷ = ˆµ is an unbiased estimate
More informationInferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data
Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone
More informationTests of the Present-Value Model of the Current Account: A Note
Tests of the Present-Value Model of the Current Account: A Note Hafedh Bouakez Takashi Kano March 5, 2007 Abstract Using a Monte Carlo approach, we evaluate the small-sample properties of four different
More informationLeast Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions
Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error
More informationOptimal bandwidth selection for the fuzzy regression discontinuity estimator
Optimal bandwidth selection for the fuzzy regression discontinuity estimator Yoichi Arai Hidehiko Ichimura The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP49/5 Optimal
More informationSubmitted to the Brazilian Journal of Probability and Statistics
Submitted to the Brazilian Journal of Probability and Statistics Multivariate normal approximation of the maximum likelihood estimator via the delta method Andreas Anastasiou a and Robert E. Gaunt b a
More informationDepartment of Statistics
Research Report Department of Statistics Research Report Department of Statistics No. 05: Testing in multivariate normal models with block circular covariance structures Yuli Liang Dietrich von Rosen Tatjana
More informationSmooth simultaneous confidence bands for cumulative distribution functions
Journal of Nonparametric Statistics, 2013 Vol. 25, No. 2, 395 407, http://dx.doi.org/10.1080/10485252.2012.759219 Smooth simultaneous confidence bands for cumulative distribution functions Jiangyan Wang
More informationHeteroskedasticity-Robust Inference in Finite Samples
Heteroskedasticity-Robust Inference in Finite Samples Jerry Hausman and Christopher Palmer Massachusetts Institute of Technology December 011 Abstract Since the advent of heteroskedasticity-robust standard
More informationCOMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS
Communications in Statistics - Simulation and Computation 33 (2004) 431-446 COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS K. Krishnamoorthy and Yong Lu Department
More informationApplied Multivariate and Longitudinal Data Analysis
Applied Multivariate and Longitudinal Data Analysis Chapter 2: Inference about the mean vector(s) Ana-Maria Staicu SAS Hall 5220; 919-515-0644; astaicu@ncsu.edu 1 In this chapter we will discuss inference
More informationarxiv: v1 [math.st] 2 Apr 2016
NON-ASYMPTOTIC RESULTS FOR CORNISH-FISHER EXPANSIONS V.V. ULYANOV, M. AOSHIMA, AND Y. FUJIKOSHI arxiv:1604.00539v1 [math.st] 2 Apr 2016 Abstract. We get the computable error bounds for generalized Cornish-Fisher
More informationSmooth nonparametric estimation of a quantile function under right censoring using beta kernels
Smooth nonparametric estimation of a quantile function under right censoring using beta kernels Chanseok Park 1 Department of Mathematical Sciences, Clemson University, Clemson, SC 29634 Short Title: Smooth
More informationA Note on Cochran Test for Homogeneity in Two Ways ANOVA and Meta-Analysis
Open Journal of Statistics, 05, 5, 787-796 Published Online December 05 in SciRes. http://www.scirp.org/journal/ojs http://dx.doi.org/0.436/ojs.05.57078 A Note on Cochran Test for Homogeneity in Two Ways
More informationMultivariate-sign-based high-dimensional tests for sphericity
Biometrika (2013, xx, x, pp. 1 8 C 2012 Biometrika Trust Printed in Great Britain Multivariate-sign-based high-dimensional tests for sphericity BY CHANGLIANG ZOU, LIUHUA PENG, LONG FENG AND ZHAOJUN WANG
More informationON SMALL SAMPLE PROPERTIES OF PERMUTATION TESTS: INDEPENDENCE BETWEEN TWO SAMPLES
ON SMALL SAMPLE PROPERTIES OF PERMUTATION TESTS: INDEPENDENCE BETWEEN TWO SAMPLES Hisashi Tanizaki Graduate School of Economics, Kobe University, Kobe 657-8501, Japan e-mail: tanizaki@kobe-u.ac.jp Abstract:
More informationDistribution Theory. Comparison Between Two Quantiles: The Normal and Exponential Cases
Communications in Statistics Simulation and Computation, 34: 43 5, 005 Copyright Taylor & Francis, Inc. ISSN: 0361-0918 print/153-4141 online DOI: 10.1081/SAC-00055639 Distribution Theory Comparison Between
More informationNEW APPROXIMATE INFERENTIAL METHODS FOR THE RELIABILITY PARAMETER IN A STRESS-STRENGTH MODEL: THE NORMAL CASE
Communications in Statistics-Theory and Methods 33 (4) 1715-1731 NEW APPROXIMATE INFERENTIAL METODS FOR TE RELIABILITY PARAMETER IN A STRESS-STRENGT MODEL: TE NORMAL CASE uizhen Guo and K. Krishnamoorthy
More informationThis paper has been submitted for consideration for publication in Biometrics
BIOMETRICS, 1 10 Supplementary material for Control with Pseudo-Gatekeeping Based on a Possibly Data Driven er of the Hypotheses A. Farcomeni Department of Public Health and Infectious Diseases Sapienza
More informationSTAT 730 Chapter 4: Estimation
STAT 730 Chapter 4: Estimation Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 23 The likelihood We have iid data, at least initially. Each datum
More informationMultistage Methodologies for Partitioning a Set of Exponential. populations.
Multistage Methodologies for Partitioning a Set of Exponential Populations Department of Mathematics, University of New Orleans, 2000 Lakefront, New Orleans, LA 70148, USA tsolanky@uno.edu Tumulesh K.
More informationSpring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM
University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle
More informationComparison of Discrimination Methods for High Dimensional Data
Comparison of Discrimination Methods for High Dimensional Data M.S.Srivastava 1 and T.Kubokawa University of Toronto and University of Tokyo Abstract In microarray experiments, the dimension p of the data
More informationf-divergence Estimation and Two-Sample Homogeneity Test under Semiparametric Density-Ratio Models
IEEE Transactions on Information Theory, vol.58, no.2, pp.708 720, 2012. 1 f-divergence Estimation and Two-Sample Homogeneity Test under Semiparametric Density-Ratio Models Takafumi Kanamori Nagoya University,
More informationTesting Overidentifying Restrictions with Many Instruments and Heteroskedasticity
Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity John C. Chao, Department of Economics, University of Maryland, chao@econ.umd.edu. Jerry A. Hausman, Department of Economics,
More informationThree Essays on Testing for Cross-Sectional Dependence and Specification in Large Panel Data Models
Syracuse University SURFACE Dissertations - ALL SURFACE July 06 Three Essays on Testing for Cross-Sectional Dependence and Specification in Large Panel Data Models Bin Peng Syracuse University Follow this
More informationA Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when the Covariance Matrices are Unknown but Common
Journal of Statistical Theory and Applications Volume 11, Number 1, 2012, pp. 23-45 ISSN 1538-7887 A Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when
More informationInference For High Dimensional M-estimates. Fixed Design Results
: Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and
More informationA sequential hypothesis test based on a generalized Azuma inequality 1
A sequential hypothesis test based on a generalized Azuma inequality 1 Daniël Reijsbergen a,2, Werner Scheinhardt b, Pieter-Tjerk de Boer b a Laboratory for Foundations of Computer Science, University
More informationHypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3
Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest
More informationHypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations
Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate
More informationEconomics 583: Econometric Theory I A Primer on Asymptotics
Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:
More informationThe properties of L p -GMM estimators
The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion
More informationThomas J. Fisher. Research Statement. Preliminary Results
Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these
More informationMore powerful tests for sparse high-dimensional covariances matrices
Accepted Manuscript More powerful tests for sparse high-dimensional covariances matrices Liu-hua Peng, Song Xi Chen, Wen Zhou PII: S0047-259X(16)30004-5 DOI: http://dx.doi.org/10.1016/j.jmva.2016.03.008
More informationAsymptotic Distributions of Instrumental Variables Statistics with Many Instruments
CHAPTER 6 Asymptotic Distributions of Instrumental Variables Statistics with Many Instruments James H. Stock and Motohiro Yogo ABSTRACT This paper extends Staiger and Stock s (1997) weak instrument asymptotic
More informationTHE 'IMPROVED' BROWN AND FORSYTHE TEST FOR MEAN EQUALITY: SOME THINGS CAN'T BE FIXED
THE 'IMPROVED' BROWN AND FORSYTHE TEST FOR MEAN EQUALITY: SOME THINGS CAN'T BE FIXED H. J. Keselman Rand R. Wilcox University of Manitoba University of Southern California Winnipeg, Manitoba Los Angeles,
More informationResearch Article A Nonparametric Two-Sample Wald Test of Equality of Variances
Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner
More informationRobust Wilks' Statistic based on RMCD for One-Way Multivariate Analysis of Variance (MANOVA)
ISSN 2224-584 (Paper) ISSN 2225-522 (Online) Vol.7, No.2, 27 Robust Wils' Statistic based on RMCD for One-Way Multivariate Analysis of Variance (MANOVA) Abdullah A. Ameen and Osama H. Abbas Department
More informationSo far our focus has been on estimation of the parameter vector β in the. y = Xβ + u
Interval estimation and hypothesis tests So far our focus has been on estimation of the parameter vector β in the linear model y i = β 1 x 1i + β 2 x 2i +... + β K x Ki + u i = x iβ + u i for i = 1, 2,...,
More informationPrincipal Component Analysis (PCA) Our starting point consists of T observations from N variables, which will be arranged in an T N matrix R,
Principal Component Analysis (PCA) PCA is a widely used statistical tool for dimension reduction. The objective of PCA is to find common factors, the so called principal components, in form of linear combinations
More informationLong-Run Covariability
Long-Run Covariability Ulrich K. Müller and Mark W. Watson Princeton University October 2016 Motivation Study the long-run covariability/relationship between economic variables great ratios, long-run Phillips
More informationGood Exact Confidence Sets for a Multivariate Normal Mean
University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 997 Good Exact Confidence Sets for a Multivariate Normal Mean Yu-Ling Tseng Lawrence D. Brown University of Pennsylvania
More informationStatistica Sinica Preprint No: SS
Statistica Sinica Preprint No: SS-2017-0275 Title Testing homogeneity of high-dimensional covariance matrices Manuscript ID SS-2017-0275 URL http://www.stat.sinica.edu.tw/statistica/ DOI 10.5705/ss.202017.0275
More informationA Bootstrap Test for Conditional Symmetry
ANNALS OF ECONOMICS AND FINANCE 6, 51 61 005) A Bootstrap Test for Conditional Symmetry Liangjun Su Guanghua School of Management, Peking University E-mail: lsu@gsm.pku.edu.cn and Sainan Jin Guanghua School
More informationLecture 13: Simple Linear Regression in Matrix Format. 1 Expectations and Variances with Vectors and Matrices
Lecture 3: Simple Linear Regression in Matrix Format To move beyond simple regression we need to use matrix algebra We ll start by re-expressing simple linear regression in matrix form Linear algebra is
More informationScaled and adjusted restricted tests in. multi-sample analysis of moment structures. Albert Satorra. Universitat Pompeu Fabra.
Scaled and adjusted restricted tests in multi-sample analysis of moment structures Albert Satorra Universitat Pompeu Fabra July 15, 1999 The author is grateful to Peter Bentler and Bengt Muthen for their
More informationEffects of data dimension on empirical likelihood
Biometrika Advance Access published August 8, 9 Biometrika (9), pp. C 9 Biometrika Trust Printed in Great Britain doi:.9/biomet/asp7 Effects of data dimension on empirical likelihood BY SONG XI CHEN Department
More informationOn a General Two-Sample Test with New Critical Value for Multivariate Feature Selection in Chest X-Ray Image Analysis Problem
Applied Mathematical Sciences, Vol. 9, 2015, no. 147, 7317-7325 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2015.510687 On a General Two-Sample Test with New Critical Value for Multivariate
More informationLinear Systems. Carlo Tomasi. June 12, r = rank(a) b range(a) n r solutions
Linear Systems Carlo Tomasi June, 08 Section characterizes the existence and multiplicity of the solutions of a linear system in terms of the four fundamental spaces associated with the system s matrix
More informationLarge Sample Properties of Estimators in the Classical Linear Regression Model
Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in
More informationA Test of Cointegration Rank Based Title Component Analysis.
A Test of Cointegration Rank Based Title Component Analysis Author(s) Chigira, Hiroaki Citation Issue 2006-01 Date Type Technical Report Text Version publisher URL http://hdl.handle.net/10086/13683 Right
More informationACCURATE ASYMPTOTIC ANALYSIS FOR JOHN S TEST IN MULTICHANNEL SIGNAL DETECTION
ACCURATE ASYMPTOTIC ANALYSIS FOR JOHN S TEST IN MULTICHANNEL SIGNAL DETECTION Yu-Hang Xiao, Lei Huang, Junhao Xie and H.C. So Department of Electronic and Information Engineering, Harbin Institute of Technology,
More informationCentral Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions
Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions Tiefeng Jiang 1 and Fan Yang 1, University of Minnesota Abstract For random samples of size n obtained
More informationMiscellanea Multivariate sign-based high-dimensional tests for sphericity
Biometrika (2014), 101,1,pp. 229 236 doi: 10.1093/biomet/ast040 Printed in Great Britain Advance Access publication 13 November 2013 Miscellanea Multivariate sign-based high-dimensional tests for sphericity
More informationTAMS39 Lecture 2 Multivariate normal distribution
TAMS39 Lecture 2 Multivariate normal distribution Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content Lecture Random vectors Multivariate normal distribution
More information478 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 56, NO. 2, FEBRUARY 2008
478 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 56, NO. 2, FEBRUARY 2008 On Estimation of Covariance Matrices With Kronecker Product Structure Karl Werner, Student Member, IEEE, Magnus Jansson, Member,
More information