Estimation of multivariate 3rd moment for high-dimensional data and its application for testing multivariate normality
|
|
- Aubrey Richard
- 5 years ago
- Views:
Transcription
1 Estimation of multivariate rd moment for high-dimensional data and its application for testing multivariate normality Takayuki Yamada, Tetsuto Himeno Institute for Comprehensive Education, Center of General Education Kagoshima University --0 Korimoto, Kagoshima Japan Faculty of Data Science Shiga University -- Banba, Hikone, Shiga 5-85 Japan Abstract This paper is concerned with the multivariate rd moment and its estimation. Mardia (970) and Srivastava (984) proposed the multivariate skewness and its estimator, independently. However, these estimators can not defined for the case in which the dimension p is larger than the sample size. In this paper, we treat the multivariate rd moment γ which is defined by using Hadamard product of observation vectors, and propose an estimate of γ which is well defined when p >. Based on the estimator, we propose new test for multivariate normality. Simulation results revealed that our proposed test has good accuracy. Keywords: Multivariate rd moment, Hadamard product, testing multivariate normality, (n, p)- asymptotic AMS 00 subject classification: Primary 6H5, Secondary 6E0 Introduction Let x and y be p-variate random vectors which are independently and identically distributed as a p-dimensional distribution F p (µ, Σ) with mean vector µ and covariance matrix Σ. Mardia (970) defined multivariate skewness and kurtosis as β M,p E[{(x µ) Σ (y µ)} ], β M,p E[{(x µ) Σ (x µ)} ], respectively. When p, β,p and β,p are reduced to the ordinal univariate squared skewness and kurtosis, respectively. Let x,..., x be a random sample drown from a population with distribution F p (µ, Σ). Mardia [0] proposed sample counterparts of β M,p and β M,p as respectively, where b M,p b M,p x j {(x i x) S (x j x)}, {(x i x) S (x i x)}, x i, S (x i x)(x i x). Srivastava [8] (as cited in a more recent book by this author [9]) defined other multivariate skewness β S,p and kurtosis β S,p, which are described as follows: Let Γ (γ,..., γ p ) be an orthogonal matrix such that Γ (γ,..., γ p ), where D λ diag(λ,..., λ p ). Let y l γ l x and θ l γ l µ for l,..., p. address:yamada@gm.kagoshima-u.ac.jp
2 Then β S,p and β S,p are given by β S,p p β S,p p { } E[(y l θ l ) ], λ / l E[(y l θ l ) 4 ] λ, l respectively. Srivastava [8] proposed sample counterparts of β S,p and β S,p as b S,p and b S,p, respectively. To describe them, let {/( )}S HD u H, where H (h,..., h p ) is an orthogonal matrix and D u diag(u,..., u p ). With being y li h lx i (l,..., p, i,..., ), b S,p and b S,p are given by b S,p p b S,p p { u / l u l } (y li ȳ l ), (y li ȳ l ) 4, respectively, where ȳ l y li. These sample counterparts of multivariate skewness and kurtosis are applied for testing multivariate normality, i.e., testing null hypothesis that population distribution is multivariate normal. Under multivariate normality, Mardia [0] showed that (/6)b M,p converges in distribution to χ (f M ), chisquared distribution with f M degrees of freedom, and {b M,p p(p + )}/{8p(p + )} / converges in distribution to (0, ). Here, f M p(p + )(p + )/6. Thus, if 6 b M,p χ f M ( α) () non-normality of the data is expected, where χ f M ( α) is the 00( α)% point of χ (f M ). Similarly, the normality is rejected if 8p(p + ) b M,p p(p + ) z α/, where z α/ is the 00( α/)% point of the standard normal distribution. Srivastava [8] showed that {p/6}b S,p converges in distribution to χ (p) under multivariate normality, and showed that (p/4) / (b S,p ) converges in distribution to (0, ). Based on these convergences, Srivastava [8] proposed testing criterion which rejects the normality of the data if and proposed the testing criterion that the normality is rejected if p 4 (b S,p ) z α/. p 6 b S,p χ p( α), () Some other methods to assess the multivariate normality have been studied. For a review of the results, see, e.g., Henze [] and Mecklin and Mundfrom []. In last years, we encounter more and more problems in applications when p is comparable with or even exceeds it. For example, financial data, consumer data, network data and medical data have this feature. When p >, it is impossible to define b M,p, b M,p, b S,p and b S,p, because the sample covariance matrix S becomes singular.
3 Instead of dealing with β M,p or β S,p, Himeno and Yamada [4] has treated κ E[{(x µ) (x µ)}] tr Σ (tr Σ). When Σ equals to identity matrix I p, κ is the same as the another definition of the Mardia [0] s multivariate kurtosis β M, p(p + ). Himeno and Yamada [4] gave the unbiased estimator of κ as the sample counterpart of κ, which is as follows: where ˆκ ( )( ) { tr S + (tr S) ( + )Q}, Q {(x x) (x x)}. It is noted that ˆκ is well defined for the case in which p >. Himeno and Yamada [4] showed that (/8) / p (ˆκ/â ) converges in distribution to (0, ) as and p go to infinity together under the following assumptions: () ( )/p converges to a positive constant as, p go to infinity; () a i tr Σ i /p converges to a positive constant as p, i,..., 4. Here, â is an unbiased estimator of a, which is as follows: â ( )( )( ) {( )( ) tr S + (tr S) ( ) Q}. They proposed testing criterion which rejects the normality of the data if ˆκ 8 p z α/. â It is reported in Himeno and Yamada [4] that the power of this test gets large as and p become large for local alternative hypothesis that population distribution is multivariate t. In this paper, we treat other multivariate rd moment γ instead of dealing with β M,p or β S,p. An estimator of γ is proposed that is well defined for the case in which p >. Based on ˆγ, we propose a new test for assessing multivariate normality. The present paper is organized as follows. In Section, we propose γ, and gave the estimator ˆγ. Asymptotic normality of ˆγ is shown under the asymptotic framework that, p for the case in which population distribution is multivariate normal. We give a testing criterion for multivariate normality based on ˆγ. Some results of small-scale simulation including to see attained significance level and empirical power are reported in Section. We give concluding remarks in Section 4. All technical proofs are relegated to the Appendix. Multivariate rd moment and its estimator Assume that x,..., x are independently and identically distributed (i.i.d.), and assume the following multivariate linear model: x i µ + Σ / ε i, () where ε,..., ε are i.i.d. as a p-dimensional distribution F p (0, I p ). For random vector x µ+σ / ε with ε F p (0, I p ), let γ E[{ (x µ)} p ] p( Σ) p, where the notation i A stands for the Hadamard product of i matrices A, i.e., i A A A A (i times). We note that i A is positive definite, which is guaranteed by Schur s product theorem (cf. Schott [7]). When p, γ is reduced to the univariate skewness. It is hard to obtain analytic form for
4 4 the unbiased estimator of γ. Instead of getting it, we construct the estimator by taking the ratio of unbiased estimator for the numerator of γ divided by for the denominator, which an estimator we proposed is as follows. E[{ ˆγ (x µ)} p ] T, p ( Σ) S p where T 4 * [ { x i P (x j + x k )}] p, S 8 P 6,k * p {(x k x l )(x k x l ) (x α x β )(x α x β ) (x γ x δ )(x γ x δ ) } p. k,l,α,β, γ,δ Here, the notation stands for the sum of all pairs of indices which are not equal. For example,,k j,j i k,k i,k j.. Asymptotic distribution of ˆγ and a test for γ Firstly, we show the asymptotic normality of ˆγ as, p under the assumption that F p (0, I p ) p (0, I p ). It is noted that T is invariant under the location shift. Hence, without loss of generality, we may assume µ 0. It can be described that T 4 { * [Σ / z i P (z j + z k )}] p.,k For deriving the anaclitic form of Var(T ), we decompose T as T + + (a lz i ) ( )( ) ( ),k l z i ) a lz j * a l z i a lz j a lz k { (a l z i ) a la l a lz i } ( ) ( )( ) T + T + T,,k * a l z i a lz j a lz k l z i ) a lz j ( )a la l a lz i where Σ / A (a,..., a p ). After much algebraic calculations, it is shown that Cov(T, T ) 0, Cov(T, T ) 0 and Cov(T, T ) 0, and so σ T Var(T ) Var(T ) + Var(T ) + Var(T ). We obtain analytic forms of Var(T ), Var(T ) and Var(T ), respectively, which are as follows. Var(T ) 6 p( Σ) p, (4) 8 Var(T ) ( ) p( Σ) p, (5) 4 Var(T ) ( )( ) p( Σ) p. (6)
5 5 These proofs are given in Appendix. From (4), (5) and (6), we have It is found that Var(T ) 6 ( )( ) p( Σ) p. Var(T ) σ T From Chebyshev inequality we have ( ) 0 (, p ), Var(T ) σ T 4 0 (, p ). These probability convergences imply that p p T 0 (, p ), T 0 (, p ). σ T σ T σ T (T T ) p 0 (, p ). And we can express that σ T T [ σ T { (a l z i ) a la l a lz i } ] η i σ T. It is shown that E[η i ] 0, and is shown in Appendix that ( ) ηi Var 6 σ T σt p( Σ) p, (7) which converges to as, p. By virtue of the ordinal central limit theorem, we find that the asymptotic distribution of σ T T is (0, ) as, p. Using Slutsly theorem, the asymptotic normality of T can be proved, which is given as the following theorem. Theorem. As, p, σ T T D (0, ). In Appendix, we prove the rate consistency for S/ p( Σ) p under the condition C; C : lim sup p p( Σ + ) p p( Σ) p <, where Σ + ( σ ij ) for Σ (σ ij ). The result is given as the following theorem. Theorem. Suppose that the condition C holds. Then, as, p, S p( Σ) p p. From these theorems (Theorem and Theorem ) and Slutsky s theorem, we find that 6 ˆγ D (0, ) (8) as, p under the condition that C holds. As an application, assessing the assumption of multivariate normality is considered by testing γ 0. The null hypothesis that γ 0 is rejected with significance level α for the case in which /6ˆγ > z α/, where z α is the 00 α% point of the standard normal distribution.
6 6 umerical results. Simulation In order to see the accuracy of the asymptotic normality (8), we tried a numerical simulation. It can be expressed that S A B + C D, (9) P P 4 P 5 P 6 where A C * p (x k x k x l x l x α x α) p, B k,l,α k,l,α,β,γ * p (x k x k x l x α x β x γ) p, D k,l,α,β k,l,α,β,γ,δ * p (x k x k x l x l x α x β) p, * p (x k x l x α x β x γ x δ) p. To compute A, it is needed to calculate triple sum, which is performed by triple-loop calculations. We see that B, C, D and T are needed to perform multi-loop calculations. Generally, it takes a lot time to carry out multi-loop calculations. To save computing time, we shall provide other expressions for A, B, C, D and T. Put Then, it holds that X ij ( i x k )( j x k ) (i, j,..., ), k { X 0i X i0 ( i x k ) p} (i,..., ), s k k ( k x i ) (k,..., ). T ( )( ) s p ( )( ) (s s ) p + ( )( ) ( s ) p, A [ p ( ] X ) X X + X p, B p [ ( X ) + 5X X 6X 4X 0 X X + 4X 0 X +X X + X 0 X 0 ( ] X ) X 0 X 0 X p, C [ p ( X ) 4X X + 4X + 6X 0 X X 4X 0 X 8X X 4X 0 X 0 ( X ) + 8X 0 X 0 X 4X 0 ( X 0 ) X + 4( X 0 ) X + 4X 0 X X 0 4X X 0 X 0 ( X 0 ) X +X 0 X X 0 + ( X 0 ) ( X 0 ) X ] p, D {( s ) p (s s ) p + s p } 6A 8B 9C. One can see that these expressions have no multi-sum expression. Generate the data based on the model (). Firstly, we checked the asymptotic normality of ˆγ given in (8). Make 0 4 samples of the size, each of which is constructed by p-dimensional vectors which are i.i.d. as p (0, Σ). In this study, the following two cases for Σ are treated: () Σ I p ; () Σ has Toeplitz structure such that Σ Σ T D / P D /, where D diag(d,..., d p ) with d i 5+( ) i (p i+)/p, and P (0. i j ). We note that these two cases satisfy the condition C. Figure shows Q-Q plot of /6ˆγ when Σ I p for the case in which (, p) (60, 60), and Figure shows Q-Q plot when Σ Σ T. We can see the goodness of fit of the standard normal distribution to /6ˆγ from these figures. This numerical simulations were carried out for several other dimensions
7 ormal Q Q Plot Sample Quantiles 0 Sample Quantiles 4 4 ormal Q Q Plot Theoretical Quantiles Figure : Q-Q plot of 0γ when Σ I Theoretical Quantiles Figure : Q-Q plot of 0γ when Σ ΣT as well, p 0, 40, 480, and also for other sample sizes, 0, 40, 480, but due to similarity of the graphs, only a selection is reported here. ext, we checked the attained significance level(asl) for testing the null hypothesis that H0 : Fp (0, I p ) p (0, I p ) under the assumption of model () by using the following criterion: /6γ > z α/ reject the null hypothesis that H0 : Fp (0, I p ) p (0, I p ). (0) To compute ASL, Monte Carlo simulation with 04 replication was done for the nominal level α We computed for the case in which 60,0, 40, 480, and p 60, 0, 40, 480, and wrote these values in Table. The settings of Σ treated in this paper are as follows. Case. Σ I p. Case. Σ ΣT. Case. Generate a sparse positive definite covariance matrix Σ by Algorithm. Set sparsity level Algorithm The algorithm for generating sparse covariance matrix Σ : Construct p p matrix R (rij ) as follows. For each i < j, with probability ( s) 0.75, Unif(0, ) Unif(, 0) with probability ( s) 0.75, rij 0 with probability s, where s is the level of sparsity in R. Then, we set rji rij to obtain symmetry. Furthermore, set the diagonal elements of R to equal. : Prepare a p p diagonal matrix D (dij ) such that d,..., dpp are i.i.d. chi-squared distribution with degree of freedom. : Create A DRD. 4: In order to obtain positive definiteness of Σ, calculate the minimum eigenvalue emin of A. Set { A + ( emin + 0.)I p if emin 0, Σ A otherwise. s 0.7. Case 4. Use the same generating method as Case with sparsity level s 0.4. Case 5. Use the same generating method as Case with sparsity level s 0.. For Case -5, we repeat the Monte Carlo simulation 0 times, and wrote the average of these 0 values in the table. The value in parenthesis is the standard error. 7
8 8 Table : Attained significance level of the proposed test p Case Case Case (S.E) Case 4(S.E) Case 5(S.E) (0.00) 0.056(0.006) 0.056(0.004) (0.004) 0.056(0.00) 0.056(0.000) (0.005) 0.056(0.00) 0.056(0.00) (0.005) 0.056(0.00) 0.056(0.00) (0.00) 0.05(0.004) 0.05(0.00) (0.00) 0.05(0.00) 0.05(0.009) (0.00) 0.05(0.00) 0.05(0.004) (0.00) 0.05(0.00) 0.05(0.00) (0.005) 0.05(0.00) 0.05(0.00) (0.005) 0.05(0.000) 0.05(0.000) (0.00) 0.05(0.008) 0.05(0.000) (0.000) 0.05(0.00) 0.05(0.000) (0.00) 0.05(0.00) 0.05(0.00) (0.00) 0.05(0.00) 0.05(0.00) (0.000) 0.05(0.00) 0.05(0.00) (0.00) 0.05(0.004) 0.05(0.00) We can see from Table that ASL of our statistic is little affeected by the sparsity of the covariance matrix. It is observed that the accuracy of the approximation gets better as and p become large. Thirdly, we checked low-dimensional performance of the proposed test. The setting of dimensionality is p 0. We compare ASL of our proposed test with the one of Mardia [0] s test () and the one of Srivastava [8] s test () for the case in which 40, 50, 60, 70, 80, 90, 00 and Σ I 0. Computing methodology is the same as the one in Table. The symbol P denotes the value of ASL for proposed test (0), M denotes for Mardia [0] s test (), and S denotes for Srivastava [8] s test (). We can see that ASL of our proposed test overestimate, whereas ASLs of Mardia [0] s test and Srivastava [8] s test underestimate. Our approximation seems to be good for the case in which 70, whereas other two approximations are not good. The precisions of all three tests gets better as becomes large. ASL P P P P P P P S S S S S S S M M M M M M M Figure : Comparison of ASL when p 0 Finally, we checked the empirical power(emp) of the proposed test by Monte Carlo simulation.
9 9 The setting of the local alternative hypothesis treated in this paper is as follows: For ε (ε,..., ε p ) F p (0, I p ), ν ( ci ) ε i ν (i,..., p), where c,..., c p are i.i.d. as the chi-squared distribution with ν degrees of freedom. We computed the EMP based on 0 repetition for the case in which α 0.05 and ν The settings of and p are the same as the ones in Table. We carried out for all models of Σ written in Table, but due to similarity of the tendencies, Case -5 are omitted here. We can see from Table that EMP are monotone increasing for p and. Table : Empirical power of the proposed test for Case \ p Real data analysis We applied our test to data-set which is used in Kubokawa and Srivastava [6]. The first applied data is colon data which is constructed by 000 (p) genes expression levels on normal colon tissues and 40 tumor colon tissues. We preprocessed the data by applying 0 logarithmic transformation. The second applied data-set is leukemia data which is constructed by 57(p) gene expressions on 47 patients suffering from acute lymphoblastic leukemia (ALL, 47 cases) and 5 patients suffering from acute myeloid leukemia (AML, 5 cases). The data-set are preprocessed by following protocol written in Dudoit et al. []. For the colon data, the p-value of our test is for normal colon tissues and for tumor colon tissues. For the leukemia data, we observed that both the p-value of our test for ALL and for AML are for the leukemia data. These results indicate that the multivariate normality assumption on both sets of data cannot be rejected at the usual significance level 5%. 4 Concluding remarks This paper treats multivariate rd moment γ which is defined by using Hadamard product of observation vectors. When the dimension equals to, γ becomes univariate skewness. An estimator of γ, denoted as ˆγ, which is well defined for p > is proposed. Based on ˆγ, testing criterion for assessing multivariate normality is given. We prepare the expression of the testing statistic without having multi-sum. Calculating time becomes shorten. The performance of the testing criterion, in terms of the attained significance level and the empirical power, is shown through simulations for several setting of sample sizes and dimensions. We apply our proposed test to microarray data sets, some of the most popular are in high-dimensional data. To assess the normality of high-dimensional data, we recommend to use not only our proposed test but also Himeno and Yamada [4] s test based on ˆκ together. Omnibus test using ˆγ and ˆκ, which imitates Jarque and Bera [5] s test, is a future work. A Moment of statistic In this section, analytic forms of Var(T ), Var(T ) and Var(T ) are proposed. We derive them by using these results (Lemma ), proofs of which are tedious but simple, therefore skipped.
10 0 Lemma. Let z be a random vector distributed as p (0, I p ), and a and b be constant vectors. Then, E[(a z) ] a a, E[(a z) 4 ] (a a), E[(a z) (b z) ] (a b) + a ab b, E[(a z) b z] a aa b, E[(a z) 6 ] 5(a a), E[(a z) (b z) ] 6(a b) + 9a aa bb b. Here, we write analytic forms of Var(T ), Var(T ) and Var(T ) in the following lemma. Lemma. For T, T and T, as defined in Section., Proof. In Section, we expressed that where Var(T ) 6 p( Σ) p, 8 Var(T ) ( ) p( Σ) p, 4 Var(T ) ( )( ) p( Σ) p. η i T η i, { (a l z i ) a la l z } lz i. Since η,..., η are i.i.d., it holds that ( Var(T ) Var(η) Var where z p (0, I p ). We find that E[ζ] 0, and so ) {(a lz) a la l a lz}, Var(ζ) E[ζ ] [ E {(a lz) a la l a lz} + * {(a l z) a la l a lz}{(a αz) a αa α a αz} l,α [ {( )(a la l ) } + * {6(a l a l ) + ( )a la l a la α a αa α } l,α 6 (a la l ) + l a α ) 6 p( A ) p 6 p( Σ) p, l,α
11 where the third equality follows from Lemma. Since E[T ] 0, we have Var(T ) E[T ] 9 ( ) E * (a l z i ) 4 (a lz j ) ( )a la l a lz i + * * (a l z i ) 4 (a lz j ) ( )a la l a lz i l,α * (a α z i ) 4 (a αz j ) ( )a αa α a αz i 9 ( ) E l z i ) 4 (a lz j ) + l z i ) (a lz j ) (a lz k ) l,α +( ) (a la l ) (a lz i ) ( )a * la l (a l z i ) (a lz j ) + * l z i ) (a αz i ) a lz j a αz j + l z i ) a lz j a αz j (a αz k ) ( )a * αa α (a l z i ) a lz j a αz j ( )a * la l (a α z i ) a αz j a lz j +( ) a la l a αa α 9 ( ) a lz i a αz i }],k,k [ { ( ) + ( )( ) + ( ) ( ) } (a la l ) + * { ( ){(a la α ) + a la l a αa α }a la α l,α + {( )( ) ( ) ( ) + ( ) }a la l a la α a αa α ] 9 ( ( ) ) 8 ( ) p( A ) p 8 ( ) p( Σ) p, (a la l ) + ( ) l a α ) where the fourth equality follows from Lemma. l,α
12 Since E[T ] 0, we have Var(T ) E[T ] 4 ( ) ( ) E * a l z i a lz j a lz k l,α,k,k + * * a l z i a lz j a lz k * a α z i a αz j a αz k 4 ( ) ( ) E 6 +6,k,k * * a l z i z ia α a lz j z ja α a lz k z ka α l,α,k 4 ( )( ) (a la l ) + 4 ( )( ) p( A ) p 4 ( )( ) p( Σ) p, where the fourth equality follows from Lemma. B Proof of Theorem l z i ) (a lz j ) (a lz k ) l a α ) In this section, we give a proof of Theorem. Before proving it, we prepare three lemmas. The first two lemmas (Lemma and Lemma 4) treat formula about multi-sum, and the last lemma treats the order of the expectations which is used to prove Theorem. Since proofs of Lemma and Lemma 4 are tedious but simple, we skipped here. Lemma. The following equations hold: ij σstσ is σ jt tr{( Σ)Σ} tr DΣ( Σ)ΣD tr D( Σ)Σ( Σ) p( Σ) p,s,t ij σ st σ is σ jt σ it σ js,s,t + p{d ( Σ)} p j s t l,α ij σis,s ij σ is σ jj σjs,s σ ij σ st σ is σ jt σ it σ js tr D( Σ)Σ( Σ) tr D( Σ)Σ( Σ) + p{d ( Σ)} p ii σijσ isσ js,,s ii σijσ isσ js.,s
13 Lemma 4. The following equations hold: ij σis p( Σ) p p{d ( Σ)} p ii σij 6 ij,,s ii σ ij σ is σjs tr DΣ( Σ)ΣD p{d ( Σ)} p ii σij ii σijσ jj,,s ii σijσ isσ js tr D( Σ)Σ( Σ) p{d ( Σ)} p,s ii σij ii σijσ 4 jj. Lemma 5. Let Q klαβγδ a iz k z l a j a i z α z β a j a iz γ z δa j, j where Σ / A (a,..., a p ), and z k, z l, z α, z β, z γ and z δ are i.i.d. as p (0, I p ). Then, E[Q ] O max { p( Σ) p }, σ ij σ st σ is σ jt σ it σ js, () j s t E[Q 4] O max { p( Σ) p }, σ ij σ st σ is σ jt σ it σ js, () j s t E[Q 45] O ( { p( Σ) p } ), () E[Q 456] O ( { p( Σ) p } ). (4) In addition, under the assumption C, E[Q ] O ( { p( Σ) p } ), E[Q 4] O ( { p( Σ) p } ). Proof. Since (), () and (4) can be proved by using the same way to show (), we only show this, and omit other three here. It can be expressed that [ E[Q ] E (a iz z a i a iz z a i a iz z a i ) + i z z a i a iz z a i a iz z a i )(a jz z a j a jz z a j a jz z a j ) + i z z a j a iz z a j a iz z a j ) + 4 i z z a j a iz z a j a iz z a j )(a iz z a s a iz z a s a iz z a s ),s + i z z a j a iz z a j a iz z a j )(a sz z a t a sz z a t a sz z a t ) + 4 +,s,t i z z a i a iz z a i a iz z a i )(a iz z a s a iz z a s a iz z a s ) i z z a i a iz z a i a iz z a i )(a jz z a s a jz z a s a jz z a s )
14 4 {(a ia i ) } * {a i a i a ja j + (a ia j ) } + * {a i a i a ja j + (a ia j ) } i a i a ja s + a ia j a ia s ) +,s i a j a sa t + a ia s a ja t + a ia t a ja s ),s,t + 4 i a i a ia j ) + i a i a ja s + a ia j a ja s ),s 7 σii 6 + ii σjj + 8 ii σijσ jj + 6 ii σijσ 4 jj ij,s + 08 ii σij + 48 ij σis + 6 ii σjs + 6 ii σ ij σ is σjs + 7 ii σijσ isσ js +,s,s ij σst + 8,s,t,s ij σstσ is σ jt + 6,s,t ij σ is σ it σ js σ jt σ st,s,t { p( Σ) p } + 4 σii ii σijσ jj + 6 ii σijσ 4 jj ij + 96 ii σij + 6 ij σis + 6 ii σ ij σ is σjs + 7 ii σ js σijσ is + 8,s ij σstσ is σ jt + 6,s,t,s ij σ is σ it σ js σ jt σ st,s,t { p( Σ) p } + 8 tr{( Σ)Σ} 8 tr DΣ( Σ)ΣD 6 tr D( Σ)Σ( Σ) 8 p( Σ) p + 48 p{d ( Σ)} p + 4 σii ii σijσ jj + 6 ii σijσ 4 jj ij + 96 ii σij + 8 ij σis + 8 ii σ ij σ is σjs + 6 ii σijσ isσ js + 6 j s t,s σ ij σ is σ it σ js σ jt σ st,s,s { p( Σ) p } + 8 tr{( Σ)Σ} 4 p{d ( Σ)} p j s t σ ij σ is σ it σ js σ jt σ st { p( Σ) p } + 8 tr{( Σ)Σ} + 6 j s t,s σ ij σ is σ it σ js σ jt σ st, σii ii σij where the second equality follows from Lemma, the third equality from bottom follows from Lemma and the second equality from bottom follows from Lemma 4. Since Σ is positive definite, we have tr{( Σ)Σ} {tr( Σ)Σ}, where the right-hand side of the inequality can be written as { p( Σ) p }. From this result, it is found that E[Q ] O max { p( Σ) p }, σ ij σ st σ is σ jt σ it σ js. j s t
15 5 ote that σ ij σ st σ is σ jt σ it σ js σ ij σ st ( σ is ) ( σ jt ) σ js σ it From this inequality, we have σ ij σ st σ is σ jt σ it σ js j s t Hence under the assumption C, j s t σ ij σ st σ is σ jt + σsj σ ti σ is σ jt, j s t σ ij σ st σ is σ jt σ it σ js [ tr{( Σ + )Σ + } + tr{( Σ + )Σ + } ] [ p( Σ + ) p ]. σ ij σ st σ is σ jt σ it σ js O ( { p( Σ) p } ), and so E[Q ] O ( { p( Σ) p } ). Proof of Theorem. Since S is unbiased estimator of p( Σ) p, it is sufficient to show that Var(S/ p( Σ) p ) 0 (5) as, p. Since S is invariant under location shift, without loss of generality, we may assume that µ 0. Each of A, B, C and D in (9) can be written as A B C D * p {(Σ / z k z kσ / ) (Σ / z l z lσ / ) (Σ / z α z ασ / )} p, k,l,α k,l,α,β k,l,α,β,γ k,l,α,β,γ,δ From (9), we can express that ( S/ p ( Σ) p ) * p {(Σ / z k z kσ / ) (Σ / z l z lσ / ) (Σ / z α z βσ / )} p, * p {(Σ / z k z kσ / ) (Σ / z l z ασ / ) (Σ / z β z γσ / )} p, * p {(Σ / z k z lσ / ) (Σ / z α z βσ / ) (Σ / z γ z δσ / )} p. [{ } A P p( B Σ) p P 4 p( + C Σ) p P 5 p( D Σ) p P 6 p( Σ) p [ { } ( ) ( ) ( ) ( A B C 4 P p( + Σ) p P 4 p( + Σ) p P 5 p( Σ) p ( ) ( ) ] D + P 6 p(, Σ) p ] )
16 6 where the inequality follows from Jensen inequality. Thus, it is described that ( ) [ [ { } ] ( ) [ ( ) ] S A B Var p( 4 E Σ) p P p( + E Σ) p P 4 p( Σ) p ( ) [ ( ) ] ( ) [ ( ) ]] C D + E P 5 p( + E Σ) p P 6 p(. (6) Σ) p For the first term in the right-hand side of the inequality (6), expanding {.}, and excluding the term whose expectation becomes 0, we have [ { } ] ( ) A E P p( Σ) p P p( E 6 * Q Σ) kkllαα + 8 * Qkkllαα Q kkllββ p k,l,α k,l,α,β +9 * Qkkllαα Q kkββγγ + * Qkkllαα Q ββγγδδ where k,l,α,β,γ k,l,α,β,γ,δ ( ) [6 P E[A ] + 8 P 4 E[A ] + 9 P 5 E[A ]] P { } P 6 + ( P ) ( ) [6 P E[A ] + 8 P 4 E[A ] + 9 P 5 E[A ]] P ( 5 + 0), (7) P E[A ] E[Q ] { p( Σ) p }, E[A ] E[Q Q 44 ] { p( Σ) p }, E[A ] E[Q Q 4455 ] { p( Σ) p }. Using the same calculation method, the second-fourth terms in the right-hand side of the inequality (6) can be written as [ { } ] ( ) B E P 4 p( [ P 4 E[B ] + P 4 E[B ] + 4 P 5 E[B ] + 4 P 5 E[B 4 ] Σ) p P 4 + P 6 E[B 5 ] + P 6 E[B 6 ]], (8) [ { } ] ( ) C E P 5 p( [4 P 5 E[C ] + 8 P 5 E[C ] + 8 P 5 E[C ] + 4 P 5 E[C 4 ] Σ) p P 5 +4 P 6 E[C 5 ] + 8 P 6 E[C 6 ] + 8 P 6 E[C 7 ] + 4 P 6 E[C 8 ]], (9) [ { } ] ( ) D E P 6 p( [6 P 6 E[D ] + 4 P 6 E[D ] + 4 P 6 E[D ] Σ) p P 6 +6 P 6 E[D 4 ]], (0)
17 7 where E[B ] E[Q 4] { p( Σ) p }, E[B ] E[Q 4Q 4 ] { p( Σ) p }, E[B ] E[Q 4Q 554 ] { p( Σ) p }, E[B 4 ] E[Q 4Q 554 ] { p( Σ) p }, E[B 5 ] E[Q 4Q ] { p( Σ) p }, E[B 6 ] E[Q 4Q ] { p( Σ) p }, E[C ] E[Q 45] { p( Σ) p }, E[C ] E[Q 45Q 54 ] { p( Σ) p }, E[C ] E[Q 45Q 45 ] { p( Σ) p }, E[C 4 ] E[Q 45Q 54 ] { p( Σ) p }, E[C 5 ] E[Q 45Q 6645 ] { p( Σ) p }, E[C 6 ] E[Q 45Q 6654 ] { p( Σ) p }, E[C 7 ] E[Q 45Q 6645 ] { p( Σ) p }, E[C 8 ] E[Q 45Q 6654 ] { p( Σ) p }, E[D ] E[Q 456] { p( Σ) p }, E[D ] E[Q 456Q 465 ] { p( Σ) p }, E[D ] E[Q 456Q 465 ] { p( Σ) p }, E[D 4 ] E[Q 456Q 465 ] { p( Σ) p }. From Lemma 5, we have E[A ] O(), E[B ] O(), E[C ] O(), E[D ] O(). It follows from Cauchy-Schwarz inequality that E[A j ] E[A ] (j,, 4), E[B j ] E[B ] (j,..., 6), E[C j ] E[C ] (j,..., 8), E[D j ] E[D ] (j,, 4). From them, we find that E[A j ] O() (j,, 4), E[B j ] O() (j,..., 6), E[C j ] O() (j,..., 8), E[D j ] O() (j,, 4). Combining these results with (7), (8), (9) and (0), it is found that [ { } ] A E P p( O( ), Σ) p [ { } ] B E P 4 p( O( ), Σ) p [ { } ] C E P 5 p( O( 4 ), Σ) p [ { } ] C E P 6 p( O( 6 ), Σ) p and so which converges to 0 as, p. ( ) S Var p( O( ), Σ) p
18 8 References [] Dudoit, S, Fridlyand, J. and Speed, T.P. (00). Comparison of discrimination methods for the classification of tumors using gene expression data. J. Amer. Statist. Assoc., 97, [] Fujikoshi, Y., Ulyanov, V.V. and Shimizu, R. (00). Multivariate Statistics High-Dimensional and Large-Sample Approximations, John Wiley & Sons, Hoboken, J. [] Henze,. (00). Invariant tests for multivariate normality: a critical review. Statist. Papers, 4, [4] Himeno, T. and Yamada, T. (04). Estimations for some functions of covariance matrix in high dimension under non-normality and its applications. J. Multivariate Anal., 0, [5] Jarque, C.M. and Bera, A.K. (987). A test for normality of observations and regression residuals. Internat. Statist. Rev., 55, 6 7. [6] Kubokawa, T. and Srivastava, M.S. (008). Estimation of the precision matrix of a singular Wishart distribution and its application in high-dimensional data. J. Multivariate Anal., 99, [7] Schott, J.R. (005). Matrix Analysis for Statistics, nd ed., John Wiley & Sons, Hoboken, J. [8] Srivastava, M.S. (984). A measure of skewness and kurtosis and a graphical method for assessing multivariate normality. Statist. Prob. Lett.,, [9] Srivastava, M.S. (00). Methods of Multivariate Statistics, John Wiley & Sons, Y. [0] Mardia, K.V. (970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57, [] Mecklin, C.J. and Mundfrom, D.J. (004). An appraisal and bibliography of tests for multivariate normality. Internat. Statist. Rev., 7, 8.
SHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan
A New Test on High-Dimensional Mean Vector Without Any Assumption on Population Covariance Matrix SHOTA KATAYAMA AND YUTAKA KANO Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama,
More informationStatistical Inference On the High-dimensional Gaussian Covarianc
Statistical Inference On the High-dimensional Gaussian Covariance Matrix Department of Mathematical Sciences, Clemson University June 6, 2011 Outline Introduction Problem Setup Statistical Inference High-Dimensional
More informationOn testing the equality of mean vectors in high dimension
ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 17, Number 1, June 2013 Available online at www.math.ut.ee/acta/ On testing the equality of mean vectors in high dimension Muni S.
More informationMULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA
J. Japan Statist. Soc. Vol. 37 No. 1 2007 53 86 MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA M. S. Srivastava* In this article, we develop a multivariate theory for analyzing multivariate datasets
More informationA Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data
A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data Yujun Wu, Marc G. Genton, 1 and Leonard A. Stefanski 2 Department of Biostatistics, School of Public Health, University of Medicine
More informationTesting Some Covariance Structures under a Growth Curve Model in High Dimension
Department of Mathematics Testing Some Covariance Structures under a Growth Curve Model in High Dimension Muni S. Srivastava and Martin Singull LiTH-MAT-R--2015/03--SE Department of Mathematics Linköping
More informationConsistency of test based method for selection of variables in high dimensional two group discriminant analysis
https://doi.org/10.1007/s42081-019-00032-4 ORIGINAL PAPER Consistency of test based method for selection of variables in high dimensional two group discriminant analysis Yasunori Fujikoshi 1 Tetsuro Sakurai
More informationConsistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis
Consistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis Tetsuro Sakurai and Yasunori Fujikoshi Center of General Education, Tokyo University
More informationConsistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis
Consistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis Yasunori Fujikoshi and Tetsuro Sakurai Department of Mathematics, Graduate School of Science,
More informationTesting Block-Diagonal Covariance Structure for High-Dimensional Data
Testing Block-Diagonal Covariance Structure for High-Dimensional Data MASASHI HYODO 1, NOBUMICHI SHUTOH 2, TAKAHIRO NISHIYAMA 3, AND TATJANA PAVLENKO 4 1 Department of Mathematical Sciences, Graduate School
More informationJackknife Empirical Likelihood Test for Equality of Two High Dimensional Means
Jackknife Empirical Likelihood est for Equality of wo High Dimensional Means Ruodu Wang, Liang Peng and Yongcheng Qi 2 Abstract It has been a long history to test the equality of two multivariate means.
More informationOn the Testing and Estimation of High- Dimensional Covariance Matrices
Clemson University TigerPrints All Dissertations Dissertations 1-009 On the Testing and Estimation of High- Dimensional Covariance Matrices Thomas Fisher Clemson University, thomas.j.fisher@gmail.com Follow
More informationCOMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS
Communications in Statistics - Simulation and Computation 33 (2004) 431-446 COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS K. Krishnamoorthy and Yong Lu Department
More informationApproximate interval estimation for EPMC for improved linear discriminant rule under high dimensional frame work
Hiroshima Statistical Research Group: Technical Report Approximate interval estimation for PMC for improved linear discriminant rule under high dimensional frame work Masashi Hyodo, Tomohiro Mitani, Tetsuto
More informationT 2 Type Test Statistic and Simultaneous Confidence Intervals for Sub-mean Vectors in k-sample Problem
T Type Test Statistic and Simultaneous Confidence Intervals for Sub-mean Vectors in k-sample Problem Toshiki aito a, Tamae Kawasaki b and Takashi Seo b a Department of Applied Mathematics, Graduate School
More informationSOME ASPECTS OF MULTIVARIATE BEHRENS-FISHER PROBLEM
SOME ASPECTS OF MULTIVARIATE BEHRENS-FISHER PROBLEM Junyong Park Bimal Sinha Department of Mathematics/Statistics University of Maryland, Baltimore Abstract In this paper we discuss the well known multivariate
More informationTESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST
Econometrics Working Paper EWP0402 ISSN 1485-6441 Department of Economics TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Lauren Bin Dong & David E. A. Giles Department
More informationComparison of Discrimination Methods for High Dimensional Data
Comparison of Discrimination Methods for High Dimensional Data M.S.Srivastava 1 and T.Kubokawa University of Toronto and University of Tokyo Abstract In microarray experiments, the dimension p of the data
More informationMultivariate Normal-Laplace Distribution and Processes
CHAPTER 4 Multivariate Normal-Laplace Distribution and Processes The normal-laplace distribution, which results from the convolution of independent normal and Laplace random variables is introduced by
More informationPart IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015
Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)
More informationEPMC Estimation in Discriminant Analysis when the Dimension and Sample Sizes are Large
EPMC Estimation in Discriminant Analysis when the Dimension and Sample Sizes are Large Tetsuji Tonda 1 Tomoyuki Nakagawa and Hirofumi Wakaki Last modified: March 30 016 1 Faculty of Management and Information
More informationAsymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data
Sri Lankan Journal of Applied Statistics (Special Issue) Modern Statistical Methodologies in the Cutting Edge of Science Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations
More informationSparse Linear Discriminant Analysis With High Dimensional Data
Sparse Linear Discriminant Analysis With High Dimensional Data Jun Shao University of Wisconsin Joint work with Yazhen Wang, Xinwei Deng, Sijian Wang Jun Shao (UW-Madison) Sparse Linear Discriminant Analysis
More informationHigh-dimensional asymptotic expansions for the distributions of canonical correlations
Journal of Multivariate Analysis 100 2009) 231 242 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva High-dimensional asymptotic
More informationOn the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors
On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors Takashi Seo a, Takahiro Nishiyama b a Department of Mathematical Information Science, Tokyo University
More informationEmpirical Likelihood Tests for High-dimensional Data
Empirical Likelihood Tests for High-dimensional Data Department of Statistics and Actuarial Science University of Waterloo, Canada ICSA - Canada Chapter 2013 Symposium Toronto, August 2-3, 2013 Based on
More informationTesting Block-Diagonal Covariance Structure for High-Dimensional Data Under Non-normality
Testing Block-Diagonal Covariance Structure for High-Dimensional Data Under Non-normality Yuki Yamada, Masashi Hyodo and Takahiro Nishiyama Department of Mathematical Information Science, Tokyo University
More informationA Box-Type Approximation for General Two-Sample Repeated Measures - Technical Report -
A Box-Type Approximation for General Two-Sample Repeated Measures - Technical Report - Edgar Brunner and Marius Placzek University of Göttingen, Germany 3. August 0 . Statistical Model and Hypotheses Throughout
More informationHigh-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi
High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi Faculty of Science and Engineering, Chuo University, Kasuga, Bunkyo-ku,
More informationTesting linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices
Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Takahiro Nishiyama a,, Masashi Hyodo b, Takashi Seo a, Tatjana Pavlenko c a Department of Mathematical
More informationTesting for a unit root in an ar(1) model using three and four moment approximations: symmetric distributions
Hong Kong Baptist University HKBU Institutional Repository Department of Economics Journal Articles Department of Economics 1998 Testing for a unit root in an ar(1) model using three and four moment approximations:
More informationAnalysis of variance, multivariate (MANOVA)
Analysis of variance, multivariate (MANOVA) Abstract: A designed experiment is set up in which the system studied is under the control of an investigator. The individuals, the treatments, the variables
More informationarxiv: v1 [math.st] 2 Apr 2016
NON-ASYMPTOTIC RESULTS FOR CORNISH-FISHER EXPANSIONS V.V. ULYANOV, M. AOSHIMA, AND Y. FUJIKOSHI arxiv:1604.00539v1 [math.st] 2 Apr 2016 Abstract. We get the computable error bounds for generalized Cornish-Fisher
More informationThe purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.
Chapter 9 Pearson s chi-square test 9. Null hypothesis asymptotics Let X, X 2, be independent from a multinomial(, p) distribution, where p is a k-vector with nonnegative entries that sum to one. That
More informationOn the Efficiencies of Several Generalized Least Squares Estimators in a Seemingly Unrelated Regression Model and a Heteroscedastic Model
Journal of Multivariate Analysis 70, 8694 (1999) Article ID jmva.1999.1817, available online at http:www.idealibrary.com on On the Efficiencies of Several Generalized Least Squares Estimators in a Seemingly
More informationLecture 11. Multivariate Normal theory
10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances
More informationHYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX
Submitted to the Annals of Statistics HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX By Shurong Zheng, Zhao Chen, Hengjian Cui and Runze Li Northeast Normal University, Fudan
More information17: INFERENCE FOR MULTIPLE REGRESSION. Inference for Individual Regression Coefficients
17: INFERENCE FOR MULTIPLE REGRESSION Inference for Individual Regression Coefficients The results of this section require the assumption that the errors u are normally distributed. Let c i ij denote the
More informationMore Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction
Sankhyā : The Indian Journal of Statistics 2007, Volume 69, Part 4, pp. 700-716 c 2007, Indian Statistical Institute More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order
More informationTesting for Granger Non-causality in Heterogeneous Panels
Testing for Granger on-causality in Heterogeneous Panels Elena-Ivona Dumitrescu Christophe Hurlin December 211 Abstract This paper proposes a very simple test of Granger (1969 non-causality for heterogeneous
More information1 Appendix A: Matrix Algebra
Appendix A: Matrix Algebra. Definitions Matrix A =[ ]=[A] Symmetric matrix: = for all and Diagonal matrix: 6=0if = but =0if 6= Scalar matrix: the diagonal matrix of = Identity matrix: the scalar matrix
More informationA Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when the Covariance Matrices are Unknown but Common
Journal of Statistical Theory and Applications Volume 11, Number 1, 2012, pp. 23-45 ISSN 1538-7887 A Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationAsymptotic Statistics-VI. Changliang Zou
Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous
More informationJournal of Multivariate Analysis. Use of prior information in the consistent estimation of regression coefficients in measurement error models
Journal of Multivariate Analysis 00 (2009) 498 520 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva Use of prior information in
More informationKRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS
Bull. Korean Math. Soc. 5 (24), No. 3, pp. 7 76 http://dx.doi.org/34/bkms.24.5.3.7 KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS Yicheng Hong and Sungchul Lee Abstract. The limiting
More informationInferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data
Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone
More informationConfidence Intervals for the Process Capability Index C p Based on Confidence Intervals for Variance under Non-Normality
Malaysian Journal of Mathematical Sciences 101): 101 115 2016) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Confidence Intervals for the Process Capability
More informationLecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2
Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y
More informationTesting a Normal Covariance Matrix for Small Samples with Monotone Missing Data
Applied Mathematical Sciences, Vol 3, 009, no 54, 695-70 Testing a Normal Covariance Matrix for Small Samples with Monotone Missing Data Evelina Veleva Rousse University A Kanchev Department of Numerical
More informationQuick Review on Linear Multiple Regression
Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,
More informationStatistica Sinica Preprint No: SS R2
Statistica Sinica Preprint No: SS-2016-0063R2 Title Two-sample tests for high-dimension, strongly spiked eigenvalue models Manuscript ID SS-2016-0063R2 URL http://www.stat.sinica.edu.tw/statistica/ DOI
More informationStochastic Design Criteria in Linear Models
AUSTRIAN JOURNAL OF STATISTICS Volume 34 (2005), Number 2, 211 223 Stochastic Design Criteria in Linear Models Alexander Zaigraev N. Copernicus University, Toruń, Poland Abstract: Within the framework
More informationMultivariate Analysis and Likelihood Inference
Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density
More informationMULTIVARIATE ANALYSIS OF VARIANCE
MULTIVARIATE ANALYSIS OF VARIANCE RAJENDER PARSAD AND L.M. BHAR Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 0 0 lmb@iasri.res.in. Introduction In many agricultural experiments,
More informationRobust Backtesting Tests for Value-at-Risk Models
Robust Backtesting Tests for Value-at-Risk Models Jose Olmo City University London (joint work with Juan Carlos Escanciano, Indiana University) Far East and South Asia Meeting of the Econometric Society
More informationSimulating Uniform- and Triangular- Based Double Power Method Distributions
Journal of Statistical and Econometric Methods, vol.6, no.1, 2017, 1-44 ISSN: 1792-6602 (print), 1792-6939 (online) Scienpress Ltd, 2017 Simulating Uniform- and Triangular- Based Double Power Method Distributions
More informationMultivariate Regression
Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the
More informationHANDBOOK OF APPLICABLE MATHEMATICS
HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester
More informationEmpirical Power of Four Statistical Tests in One Way Layout
International Mathematical Forum, Vol. 9, 2014, no. 28, 1347-1356 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/imf.2014.47128 Empirical Power of Four Statistical Tests in One Way Layout Lorenzo
More informationA new test for the proportionality of two large-dimensional covariance matrices. Citation Journal of Multivariate Analysis, 2014, v. 131, p.
Title A new test for the proportionality of two large-dimensional covariance matrices Authors) Liu, B; Xu, L; Zheng, S; Tian, G Citation Journal of Multivariate Analysis, 04, v. 3, p. 93-308 Issued Date
More informationA Bootstrap Test for Causality with Endogenous Lag Length Choice. - theory and application in finance
CESIS Electronic Working Paper Series Paper No. 223 A Bootstrap Test for Causality with Endogenous Lag Length Choice - theory and application in finance R. Scott Hacker and Abdulnasser Hatemi-J April 200
More informationMembership Statistical Test For Shapes With Application to Face Identification
IEEE PROCEEDIGS, VOL., O. Y, MOTH 004 1 Membership Statistical Test For Shapes With Application to Face Identification Eugene Demidenko Abstract We develop a statistical test to determine if shape belongs
More informationIntermediate Econometrics
Intermediate Econometrics Heteroskedasticity Text: Wooldridge, 8 July 17, 2011 Heteroskedasticity Assumption of homoskedasticity, Var(u i x i1,..., x ik ) = E(u 2 i x i1,..., x ik ) = σ 2. That is, the
More informationEstimation of large dimensional sparse covariance matrices
Estimation of large dimensional sparse covariance matrices Department of Statistics UC, Berkeley May 5, 2009 Sample covariance matrix and its eigenvalues Data: n p matrix X n (independent identically distributed)
More informationCONTROL CHARTS FOR MULTIVARIATE NONLINEAR TIME SERIES
REVSTAT Statistical Journal Volume 13, Number, June 015, 131 144 CONTROL CHARTS FOR MULTIVARIATE NONLINEAR TIME SERIES Authors: Robert Garthoff Department of Statistics, European University, Große Scharrnstr.
More informationHypothesis Testing. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA
Hypothesis Testing Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA An Example Mardia et al. (979, p. ) reprint data from Frets (9) giving the length and breadth (in
More informationUnit Roots in White Noise?!
Unit Roots in White Noise?! A.Onatski and H. Uhlig September 26, 2008 Abstract We show that the empirical distribution of the roots of the vector auto-regression of order n fitted to T observations of
More informationVector Autoregressive Model. Vector Autoregressions II. Estimation of Vector Autoregressions II. Estimation of Vector Autoregressions I.
Vector Autoregressive Model Vector Autoregressions II Empirical Macroeconomics - Lect 2 Dr. Ana Beatriz Galvao Queen Mary University of London January 2012 A VAR(p) model of the m 1 vector of time series
More informationRegression and Statistical Inference
Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF
More informationarxiv:math/ v1 [math.pr] 9 Sep 2003
arxiv:math/0309164v1 [math.pr] 9 Sep 003 A NEW TEST FOR THE MULTIVARIATE TWO-SAMPLE PROBLEM BASED ON THE CONCEPT OF MINIMUM ENERGY G. Zech and B. Aslan University of Siegen, Germany August 8, 018 Abstract
More informationEstimating Individual Mahalanobis Distance in High- Dimensional Data
CESIS Electronic Working Paper Series Paper No. 362 Estimating Individual Mahalanobis Distance in High- Dimensional Data Dai D. Holgersson T. Karlsson P. May, 2014 The Royal Institute of technology Centre
More informationHypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations
Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate
More informationCointegration Lecture I: Introduction
1 Cointegration Lecture I: Introduction Julia Giese Nuffield College julia.giese@economics.ox.ac.uk Hilary Term 2008 2 Outline Introduction Estimation of unrestricted VAR Non-stationarity Deterministic
More informationTest for Parameter Change in ARIMA Models
Test for Parameter Change in ARIMA Models Sangyeol Lee 1 Siyun Park 2 Koichi Maekawa 3 and Ken-ichi Kawai 4 Abstract In this paper we consider the problem of testing for parameter changes in ARIMA models
More informationDiscussion of High-dimensional autocovariance matrices and optimal linear prediction,
Electronic Journal of Statistics Vol. 9 (2015) 1 10 ISSN: 1935-7524 DOI: 10.1214/15-EJS1007 Discussion of High-dimensional autocovariance matrices and optimal linear prediction, Xiaohui Chen University
More informationThomas J. Fisher. Research Statement. Preliminary Results
Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these
More informationSIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA
SIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA Kazuyuki Koizumi Department of Mathematics, Graduate School of Science Tokyo University of Science 1-3, Kagurazaka,
More informationOn the conservative multivariate multiple comparison procedure of correlated mean vectors with a control
On the conservative multivariate multiple comparison procedure of correlated mean vectors with a control Takahiro Nishiyama a a Department of Mathematical Information Science, Tokyo University of Science,
More informationLecture Note 1: Probability Theory and Statistics
Univ. of Michigan - NAME 568/EECS 568/ROB 530 Winter 2018 Lecture Note 1: Probability Theory and Statistics Lecturer: Maani Ghaffari Jadidi Date: April 6, 2018 For this and all future notes, if you would
More informationYUN WU. B.S., Gui Zhou University of Finance and Economics, 1996 A REPORT. Submitted in partial fulfillment of the requirements for the degree
A SIMULATION STUDY OF THE ROBUSTNESS OF HOTELLING S T TEST FOR THE MEAN OF A MULTIVARIATE DISTRIBUTION WHEN SAMPLING FROM A MULTIVARIATE SKEW-NORMAL DISTRIBUTION by YUN WU B.S., Gui Zhou University of
More informationChapter 7, continued: MANOVA
Chapter 7, continued: MANOVA The Multivariate Analysis of Variance (MANOVA) technique extends Hotelling T 2 test that compares two mean vectors to the setting in which there are m 2 groups. We wish to
More informationJournal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error
Journal of Multivariate Analysis 00 (009) 305 3 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva Sphericity test in a GMANOVA MANOVA
More information6-1. Canonical Correlation Analysis
6-1. Canonical Correlation Analysis Canonical Correlatin analysis focuses on the correlation between a linear combination of the variable in one set and a linear combination of the variables in another
More informationEmpirical Economic Research, Part II
Based on the text book by Ramanathan: Introductory Econometrics Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna December 7, 2011 Outline Introduction
More informationTESTING FOR EQUAL DISTRIBUTIONS IN HIGH DIMENSION
TESTING FOR EQUAL DISTRIBUTIONS IN HIGH DIMENSION Gábor J. Székely Bowling Green State University Maria L. Rizzo Ohio University October 30, 2004 Abstract We propose a new nonparametric test for equality
More informationReview of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley
Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate
More informationOn corrections of classical multivariate tests for high-dimensional data
On corrections of classical multivariate tests for high-dimensional data Jian-feng Yao with Zhidong Bai, Dandan Jiang, Shurong Zheng Overview Introduction High-dimensional data and new challenge in statistics
More informationApplication of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption
Application of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption Alisa A. Gorbunova and Boris Yu. Lemeshko Novosibirsk State Technical University Department of Applied Mathematics,
More information1 Least Squares Estimation - multiple regression.
Introduction to multiple regression. Fall 2010 1 Least Squares Estimation - multiple regression. Let y = {y 1,, y n } be a n 1 vector of dependent variable observations. Let β = {β 0, β 1 } be the 2 1
More informationMultivariate Time Series
Multivariate Time Series Notation: I do not use boldface (or anything else) to distinguish vectors from scalars. Tsay (and many other writers) do. I denote a multivariate stochastic process in the form
More informationMultivariate Distributions
IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate
More informationOn the Power of Tests for Regime Switching
On the Power of Tests for Regime Switching joint work with Drew Carter and Ben Hansen Douglas G. Steigerwald UC Santa Barbara May 2015 D. Steigerwald (UCSB) Regime Switching May 2015 1 / 42 Motivating
More informationTESTING FOR HOMOGENEITY IN COMBINING OF TWO-ARMED TRIALS WITH NORMALLY DISTRIBUTED RESPONSES
Sankhyā : The Indian Journal of Statistics 2001, Volume 63, Series B, Pt. 3, pp 298-310 TESTING FOR HOMOGENEITY IN COMBINING OF TWO-ARMED TRIALS WITH NORMALLY DISTRIBUTED RESPONSES By JOACHIM HARTUNG and
More informationLarge Sample Properties of Estimators in the Classical Linear Regression Model
Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in
More informationOn Selecting Tests for Equality of Two Normal Mean Vectors
MULTIVARIATE BEHAVIORAL RESEARCH, 41(4), 533 548 Copyright 006, Lawrence Erlbaum Associates, Inc. On Selecting Tests for Equality of Two Normal Mean Vectors K. Krishnamoorthy and Yanping Xia Department
More informationGeneralized Linear Models (GLZ)
Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the
More informationHigh-dimensional two-sample tests under strongly spiked eigenvalue models
1 High-dimensional two-sample tests under strongly spiked eigenvalue models Makoto Aoshima and Kazuyoshi Yata University of Tsukuba Abstract: We consider a new two-sample test for high-dimensional data
More informationRegression Analysis. y t = β 1 x t1 + β 2 x t2 + β k x tk + ϵ t, t = 1,..., T,
Regression Analysis The multiple linear regression model with k explanatory variables assumes that the tth observation of the dependent or endogenous variable y t is described by the linear relationship
More information