Estimation of multivariate 3rd moment for high-dimensional data and its application for testing multivariate normality

Similar documents
SHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan

Statistical Inference On the High-dimensional Gaussian Covarianc

On testing the equality of mean vectors in high dimension

MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA

A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data

Testing Some Covariance Structures under a Growth Curve Model in High Dimension

Consistency of test based method for selection of variables in high dimensional two group discriminant analysis

Consistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis

Consistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis

Testing Block-Diagonal Covariance Structure for High-Dimensional Data

Jackknife Empirical Likelihood Test for Equality of Two High Dimensional Means

On the Testing and Estimation of High- Dimensional Covariance Matrices

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS

Approximate interval estimation for EPMC for improved linear discriminant rule under high dimensional frame work

T 2 Type Test Statistic and Simultaneous Confidence Intervals for Sub-mean Vectors in k-sample Problem

SOME ASPECTS OF MULTIVARIATE BEHRENS-FISHER PROBLEM

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST

Comparison of Discrimination Methods for High Dimensional Data

Multivariate Normal-Laplace Distribution and Processes

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015

EPMC Estimation in Discriminant Analysis when the Dimension and Sample Sizes are Large

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data

Sparse Linear Discriminant Analysis With High Dimensional Data

High-dimensional asymptotic expansions for the distributions of canonical correlations

On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors

Empirical Likelihood Tests for High-dimensional Data

Testing Block-Diagonal Covariance Structure for High-Dimensional Data Under Non-normality

A Box-Type Approximation for General Two-Sample Repeated Measures - Technical Report -

High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices

Testing for a unit root in an ar(1) model using three and four moment approximations: symmetric distributions

Analysis of variance, multivariate (MANOVA)

arxiv: v1 [math.st] 2 Apr 2016

The purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.

On the Efficiencies of Several Generalized Least Squares Estimators in a Seemingly Unrelated Regression Model and a Heteroscedastic Model

Lecture 11. Multivariate Normal theory

HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX

17: INFERENCE FOR MULTIPLE REGRESSION. Inference for Individual Regression Coefficients

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction

Testing for Granger Non-causality in Heterogeneous Panels

1 Appendix A: Matrix Algebra

A Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when the Covariance Matrices are Unknown but Common

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Asymptotic Statistics-VI. Changliang Zou

Journal of Multivariate Analysis. Use of prior information in the consistent estimation of regression coefficients in measurement error models

KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Confidence Intervals for the Process Capability Index C p Based on Confidence Intervals for Variance under Non-Normality

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

Testing a Normal Covariance Matrix for Small Samples with Monotone Missing Data

Quick Review on Linear Multiple Regression

Statistica Sinica Preprint No: SS R2

Stochastic Design Criteria in Linear Models

Multivariate Analysis and Likelihood Inference

MULTIVARIATE ANALYSIS OF VARIANCE

Robust Backtesting Tests for Value-at-Risk Models

Simulating Uniform- and Triangular- Based Double Power Method Distributions

Multivariate Regression

HANDBOOK OF APPLICABLE MATHEMATICS

Empirical Power of Four Statistical Tests in One Way Layout

A new test for the proportionality of two large-dimensional covariance matrices. Citation Journal of Multivariate Analysis, 2014, v. 131, p.

A Bootstrap Test for Causality with Endogenous Lag Length Choice. - theory and application in finance

Membership Statistical Test For Shapes With Application to Face Identification

Intermediate Econometrics

Estimation of large dimensional sparse covariance matrices

CONTROL CHARTS FOR MULTIVARIATE NONLINEAR TIME SERIES

Hypothesis Testing. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA

Unit Roots in White Noise?!

Vector Autoregressive Model. Vector Autoregressions II. Estimation of Vector Autoregressions II. Estimation of Vector Autoregressions I.

Regression and Statistical Inference

arxiv:math/ v1 [math.pr] 9 Sep 2003

Estimating Individual Mahalanobis Distance in High- Dimensional Data

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Cointegration Lecture I: Introduction

Test for Parameter Change in ARIMA Models

Discussion of High-dimensional autocovariance matrices and optimal linear prediction,

Thomas J. Fisher. Research Statement. Preliminary Results

SIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA

On the conservative multivariate multiple comparison procedure of correlated mean vectors with a control

Lecture Note 1: Probability Theory and Statistics

YUN WU. B.S., Gui Zhou University of Finance and Economics, 1996 A REPORT. Submitted in partial fulfillment of the requirements for the degree

Chapter 7, continued: MANOVA

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error

6-1. Canonical Correlation Analysis

Empirical Economic Research, Part II

TESTING FOR EQUAL DISTRIBUTIONS IN HIGH DIMENSION

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

On corrections of classical multivariate tests for high-dimensional data

Application of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption

1 Least Squares Estimation - multiple regression.

Multivariate Time Series

Multivariate Distributions

On the Power of Tests for Regime Switching

TESTING FOR HOMOGENEITY IN COMBINING OF TWO-ARMED TRIALS WITH NORMALLY DISTRIBUTED RESPONSES

Large Sample Properties of Estimators in the Classical Linear Regression Model

On Selecting Tests for Equality of Two Normal Mean Vectors

Generalized Linear Models (GLZ)

High-dimensional two-sample tests under strongly spiked eigenvalue models

Regression Analysis. y t = β 1 x t1 + β 2 x t2 + β k x tk + ϵ t, t = 1,..., T,

Transcription:

Estimation of multivariate rd moment for high-dimensional data and its application for testing multivariate normality Takayuki Yamada, Tetsuto Himeno Institute for Comprehensive Education, Center of General Education Kagoshima University --0 Korimoto, Kagoshima 890-0065 Japan Faculty of Data Science Shiga University -- Banba, Hikone, Shiga 5-85 Japan Abstract This paper is concerned with the multivariate rd moment and its estimation. Mardia (970) and Srivastava (984) proposed the multivariate skewness and its estimator, independently. However, these estimators can not defined for the case in which the dimension p is larger than the sample size. In this paper, we treat the multivariate rd moment γ which is defined by using Hadamard product of observation vectors, and propose an estimate of γ which is well defined when p >. Based on the estimator, we propose new test for multivariate normality. Simulation results revealed that our proposed test has good accuracy. Keywords: Multivariate rd moment, Hadamard product, testing multivariate normality, (n, p)- asymptotic AMS 00 subject classification: Primary 6H5, Secondary 6E0 Introduction Let x and y be p-variate random vectors which are independently and identically distributed as a p-dimensional distribution F p (µ, Σ) with mean vector µ and covariance matrix Σ. Mardia (970) defined multivariate skewness and kurtosis as β M,p E[{(x µ) Σ (y µ)} ], β M,p E[{(x µ) Σ (x µ)} ], respectively. When p, β,p and β,p are reduced to the ordinal univariate squared skewness and kurtosis, respectively. Let x,..., x be a random sample drown from a population with distribution F p (µ, Σ). Mardia [0] proposed sample counterparts of β M,p and β M,p as respectively, where b M,p b M,p x j {(x i x) S (x j x)}, {(x i x) S (x i x)}, x i, S (x i x)(x i x). Srivastava [8] (as cited in a more recent book by this author [9]) defined other multivariate skewness β S,p and kurtosis β S,p, which are described as follows: Let Γ (γ,..., γ p ) be an orthogonal matrix such that Γ (γ,..., γ p ), where D λ diag(λ,..., λ p ). Let y l γ l x and θ l γ l µ for l,..., p. E-mail address:yamada@gm.kagoshima-u.ac.jp

Then β S,p and β S,p are given by β S,p p β S,p p { } E[(y l θ l ) ], λ / l E[(y l θ l ) 4 ] λ, l respectively. Srivastava [8] proposed sample counterparts of β S,p and β S,p as b S,p and b S,p, respectively. To describe them, let {/( )}S HD u H, where H (h,..., h p ) is an orthogonal matrix and D u diag(u,..., u p ). With being y li h lx i (l,..., p, i,..., ), b S,p and b S,p are given by b S,p p b S,p p { u / l u l } (y li ȳ l ), (y li ȳ l ) 4, respectively, where ȳ l y li. These sample counterparts of multivariate skewness and kurtosis are applied for testing multivariate normality, i.e., testing null hypothesis that population distribution is multivariate normal. Under multivariate normality, Mardia [0] showed that (/6)b M,p converges in distribution to χ (f M ), chisquared distribution with f M degrees of freedom, and {b M,p p(p + )}/{8p(p + )} / converges in distribution to (0, ). Here, f M p(p + )(p + )/6. Thus, if 6 b M,p χ f M ( α) () non-normality of the data is expected, where χ f M ( α) is the 00( α)% point of χ (f M ). Similarly, the normality is rejected if 8p(p + ) b M,p p(p + ) z α/, where z α/ is the 00( α/)% point of the standard normal distribution. Srivastava [8] showed that {p/6}b S,p converges in distribution to χ (p) under multivariate normality, and showed that (p/4) / (b S,p ) converges in distribution to (0, ). Based on these convergences, Srivastava [8] proposed testing criterion which rejects the normality of the data if and proposed the testing criterion that the normality is rejected if p 4 (b S,p ) z α/. p 6 b S,p χ p( α), () Some other methods to assess the multivariate normality have been studied. For a review of the results, see, e.g., Henze [] and Mecklin and Mundfrom []. In last years, we encounter more and more problems in applications when p is comparable with or even exceeds it. For example, financial data, consumer data, network data and medical data have this feature. When p >, it is impossible to define b M,p, b M,p, b S,p and b S,p, because the sample covariance matrix S becomes singular.

Instead of dealing with β M,p or β S,p, Himeno and Yamada [4] has treated κ E[{(x µ) (x µ)}] tr Σ (tr Σ). When Σ equals to identity matrix I p, κ is the same as the another definition of the Mardia [0] s multivariate kurtosis β M, p(p + ). Himeno and Yamada [4] gave the unbiased estimator of κ as the sample counterpart of κ, which is as follows: where ˆκ ( )( ) { tr S + (tr S) ( + )Q}, Q {(x x) (x x)}. It is noted that ˆκ is well defined for the case in which p >. Himeno and Yamada [4] showed that (/8) / p (ˆκ/â ) converges in distribution to (0, ) as and p go to infinity together under the following assumptions: () ( )/p converges to a positive constant as, p go to infinity; () a i tr Σ i /p converges to a positive constant as p, i,..., 4. Here, â is an unbiased estimator of a, which is as follows: â ( )( )( ) {( )( ) tr S + (tr S) ( ) Q}. They proposed testing criterion which rejects the normality of the data if ˆκ 8 p z α/. â It is reported in Himeno and Yamada [4] that the power of this test gets large as and p become large for local alternative hypothesis that population distribution is multivariate t. In this paper, we treat other multivariate rd moment γ instead of dealing with β M,p or β S,p. An estimator of γ is proposed that is well defined for the case in which p >. Based on ˆγ, we propose a new test for assessing multivariate normality. The present paper is organized as follows. In Section, we propose γ, and gave the estimator ˆγ. Asymptotic normality of ˆγ is shown under the asymptotic framework that, p for the case in which population distribution is multivariate normal. We give a testing criterion for multivariate normality based on ˆγ. Some results of small-scale simulation including to see attained significance level and empirical power are reported in Section. We give concluding remarks in Section 4. All technical proofs are relegated to the Appendix. Multivariate rd moment and its estimator Assume that x,..., x are independently and identically distributed (i.i.d.), and assume the following multivariate linear model: x i µ + Σ / ε i, () where ε,..., ε are i.i.d. as a p-dimensional distribution F p (0, I p ). For random vector x µ+σ / ε with ε F p (0, I p ), let γ E[{ (x µ)} p ] p( Σ) p, where the notation i A stands for the Hadamard product of i matrices A, i.e., i A A A A (i times). We note that i A is positive definite, which is guaranteed by Schur s product theorem (cf. Schott [7]). When p, γ is reduced to the univariate skewness. It is hard to obtain analytic form for

4 the unbiased estimator of γ. Instead of getting it, we construct the estimator by taking the ratio of unbiased estimator for the numerator of γ divided by for the denominator, which an estimator we proposed is as follows. E[{ ˆγ (x µ)} p ] T, p ( Σ) S p where T 4 * [ { x i P (x j + x k )}] p, S 8 P 6,k * p {(x k x l )(x k x l ) (x α x β )(x α x β ) (x γ x δ )(x γ x δ ) } p. k,l,α,β, γ,δ Here, the notation stands for the sum of all pairs of indices which are not equal. For example,,k j,j i k,k i,k j.. Asymptotic distribution of ˆγ and a test for γ Firstly, we show the asymptotic normality of ˆγ as, p under the assumption that F p (0, I p ) p (0, I p ). It is noted that T is invariant under the location shift. Hence, without loss of generality, we may assume µ 0. It can be described that T 4 { * [Σ / z i P (z j + z k )}] p.,k For deriving the anaclitic form of Var(T ), we decompose T as T + + (a lz i ) ( )( ) ( ),k l z i ) a lz j * a l z i a lz j a lz k { (a l z i ) a la l a lz i } ( ) ( )( ) T + T + T,,k * a l z i a lz j a lz k l z i ) a lz j ( )a la l a lz i where Σ / A (a,..., a p ). After much algebraic calculations, it is shown that Cov(T, T ) 0, Cov(T, T ) 0 and Cov(T, T ) 0, and so σ T Var(T ) Var(T ) + Var(T ) + Var(T ). We obtain analytic forms of Var(T ), Var(T ) and Var(T ), respectively, which are as follows. Var(T ) 6 p( Σ) p, (4) 8 Var(T ) ( ) p( Σ) p, (5) 4 Var(T ) ( )( ) p( Σ) p. (6)

5 These proofs are given in Appendix. From (4), (5) and (6), we have It is found that Var(T ) 6 ( )( ) p( Σ) p. Var(T ) σ T From Chebyshev inequality we have ( ) 0 (, p ), Var(T ) σ T 4 0 (, p ). These probability convergences imply that p p T 0 (, p ), T 0 (, p ). σ T σ T σ T (T T ) p 0 (, p ). And we can express that σ T T [ σ T { (a l z i ) a la l a lz i } ] η i σ T. It is shown that E[η i ] 0, and is shown in Appendix that ( ) ηi Var 6 σ T σt p( Σ) p, (7) which converges to as, p. By virtue of the ordinal central limit theorem, we find that the asymptotic distribution of σ T T is (0, ) as, p. Using Slutsly theorem, the asymptotic normality of T can be proved, which is given as the following theorem. Theorem. As, p, σ T T D (0, ). In Appendix, we prove the rate consistency for S/ p( Σ) p under the condition C; C : lim sup p p( Σ + ) p p( Σ) p <, where Σ + ( σ ij ) for Σ (σ ij ). The result is given as the following theorem. Theorem. Suppose that the condition C holds. Then, as, p, S p( Σ) p p. From these theorems (Theorem and Theorem ) and Slutsky s theorem, we find that 6 ˆγ D (0, ) (8) as, p under the condition that C holds. As an application, assessing the assumption of multivariate normality is considered by testing γ 0. The null hypothesis that γ 0 is rejected with significance level α for the case in which /6ˆγ > z α/, where z α is the 00 α% point of the standard normal distribution.

6 umerical results. Simulation In order to see the accuracy of the asymptotic normality (8), we tried a numerical simulation. It can be expressed that S A B + C D, (9) P P 4 P 5 P 6 where A C * p (x k x k x l x l x α x α) p, B k,l,α k,l,α,β,γ * p (x k x k x l x α x β x γ) p, D k,l,α,β k,l,α,β,γ,δ * p (x k x k x l x l x α x β) p, * p (x k x l x α x β x γ x δ) p. To compute A, it is needed to calculate triple sum, which is performed by triple-loop calculations. We see that B, C, D and T are needed to perform multi-loop calculations. Generally, it takes a lot time to carry out multi-loop calculations. To save computing time, we shall provide other expressions for A, B, C, D and T. Put Then, it holds that X ij ( i x k )( j x k ) (i, j,..., ), k { X 0i X i0 ( i x k ) p} (i,..., ), s k k ( k x i ) (k,..., ). T ( )( ) s p ( )( ) (s s ) p + ( )( ) ( s ) p, A [ p ( ] X ) X X + X p, B p [ ( X ) + 5X X 6X 4X 0 X X + 4X 0 X +X X + X 0 X 0 ( ] X ) X 0 X 0 X p, C [ p ( X ) 4X X + 4X + 6X 0 X X 4X 0 X 8X X 4X 0 X 0 ( X ) + 8X 0 X 0 X 4X 0 ( X 0 ) X + 4( X 0 ) X + 4X 0 X X 0 4X X 0 X 0 ( X 0 ) X +X 0 X X 0 + ( X 0 ) ( X 0 ) X ] p, D {( s ) p (s s ) p + s p } 6A 8B 9C. One can see that these expressions have no multi-sum expression. Generate the data based on the model (). Firstly, we checked the asymptotic normality of ˆγ given in (8). Make 0 4 samples of the size, each of which is constructed by p-dimensional vectors which are i.i.d. as p (0, Σ). In this study, the following two cases for Σ are treated: () Σ I p ; () Σ has Toeplitz structure such that Σ Σ T D / P D /, where D diag(d,..., d p ) with d i 5+( ) i (p i+)/p, and P (0. i j ). We note that these two cases satisfy the condition C. Figure shows Q-Q plot of /6ˆγ when Σ I p for the case in which (, p) (60, 60), and Figure shows Q-Q plot when Σ Σ T. We can see the goodness of fit of the standard normal distribution to /6ˆγ from these figures. This numerical simulations were carried out for several other dimensions

ormal Q Q Plot 4 4 0 Sample Quantiles 0 Sample Quantiles 4 4 ormal Q Q Plot 4 0 4 4 Theoretical Quantiles Figure : Q-Q plot of 0γ when Σ I 0 0 4 Theoretical Quantiles Figure : Q-Q plot of 0γ when Σ ΣT as well, p 0, 40, 480, and also for other sample sizes, 0, 40, 480, but due to similarity of the graphs, only a selection is reported here. ext, we checked the attained significance level(asl) for testing the null hypothesis that H0 : Fp (0, I p ) p (0, I p ) under the assumption of model () by using the following criterion: /6γ > z α/ reject the null hypothesis that H0 : Fp (0, I p ) p (0, I p ). (0) To compute ASL, Monte Carlo simulation with 04 replication was done for the nominal level α 0.05. We computed for the case in which 60,0, 40, 480, and p 60, 0, 40, 480, and wrote these values in Table. The settings of Σ treated in this paper are as follows. Case. Σ I p. Case. Σ ΣT. Case. Generate a sparse positive definite covariance matrix Σ by Algorithm. Set sparsity level Algorithm The algorithm for generating sparse covariance matrix Σ : Construct p p matrix R (rij ) as follows. For each i < j, with probability ( s) 0.75, Unif(0, ) Unif(, 0) with probability ( s) 0.75, rij 0 with probability s, where s is the level of sparsity in R. Then, we set rji rij to obtain symmetry. Furthermore, set the diagonal elements of R to equal. : Prepare a p p diagonal matrix D (dij ) such that d,..., dpp are i.i.d. chi-squared distribution with degree of freedom. : Create A DRD. 4: In order to obtain positive definiteness of Σ, calculate the minimum eigenvalue emin of A. Set { A + ( emin + 0.)I p if emin 0, Σ A otherwise. s 0.7. Case 4. Use the same generating method as Case with sparsity level s 0.4. Case 5. Use the same generating method as Case with sparsity level s 0.. For Case -5, we repeat the Monte Carlo simulation 0 times, and wrote the average of these 0 values in the table. The value in parenthesis is the standard error. 7

8 Table : Attained significance level of the proposed test p Case Case Case (S.E) Case 4(S.E) Case 5(S.E) 60 0.058 0.058 0.056(0.00) 0.056(0.006) 0.056(0.004) 60 0 0.055 0.055 0.056(0.004) 0.056(0.00) 0.056(0.000) 40 0.055 0.05 0.056(0.005) 0.056(0.00) 0.056(0.00) 480 0.056 0.050 0.056(0.005) 0.056(0.00) 0.056(0.00) 60 0.05 0.055 0.05(0.00) 0.05(0.004) 0.05(0.00) 0 0 0.05 0.055 0.05(0.00) 0.05(0.00) 0.05(0.009) 40 0.055 0.05 0.05(0.00) 0.05(0.00) 0.05(0.004) 480 0.054 0.050 0.05(0.00) 0.05(0.00) 0.05(0.00) 60 0.054 0.05 0.05(0.005) 0.05(0.00) 0.05(0.00) 40 0 0.05 0.05 0.05(0.005) 0.05(0.000) 0.05(0.000) 40 0.050 0.049 0.05(0.00) 0.05(0.008) 0.05(0.000) 480 0.049 0.050 0.05(0.000) 0.05(0.00) 0.05(0.000) 60 0.05 0.05 0.05(0.00) 0.05(0.00) 0.05(0.00) 480 0 0.05 0.054 0.05(0.00) 0.05(0.00) 0.05(0.00) 40 0.050 0.049 0.05(0.000) 0.05(0.00) 0.05(0.00) 480 0.049 0.049 0.050(0.00) 0.05(0.004) 0.05(0.00) We can see from Table that ASL of our statistic is little affeected by the sparsity of the covariance matrix. It is observed that the accuracy of the approximation gets better as and p become large. Thirdly, we checked low-dimensional performance of the proposed test. The setting of dimensionality is p 0. We compare ASL of our proposed test with the one of Mardia [0] s test () and the one of Srivastava [8] s test () for the case in which 40, 50, 60, 70, 80, 90, 00 and Σ I 0. Computing methodology is the same as the one in Table. The symbol P denotes the value of ASL for proposed test (0), M denotes for Mardia [0] s test (), and S denotes for Srivastava [8] s test (). We can see that ASL of our proposed test overestimate, whereas ASLs of Mardia [0] s test and Srivastava [8] s test underestimate. Our approximation seems to be good for the case in which 70, whereas other two approximations are not good. The precisions of all three tests gets better as becomes large. ASL 0.00 0.0 0.04 0.06 P P P P P P P S S S S S S S M M M M M M M 40 50 60 70 80 90 00 Figure : Comparison of ASL when p 0 Finally, we checked the empirical power(emp) of the proposed test by Monte Carlo simulation.

9 The setting of the local alternative hypothesis treated in this paper is as follows: For ε (ε,..., ε p ) F p (0, I p ), ν ( ci ) ε i ν (i,..., p), where c,..., c p are i.i.d. as the chi-squared distribution with ν degrees of freedom. We computed the EMP based on 0 repetition for the case in which α 0.05 and ν 0000. The settings of and p are the same as the ones in Table. We carried out for all models of Σ written in Table, but due to similarity of the tendencies, Case -5 are omitted here. We can see from Table that EMP are monotone increasing for p and. Table : Empirical power of the proposed test for Case \ p 60 0 40 480 60 0. 0.7 0.9 0.50 0 0.6 0.9 0.5 0.78 40 0. 0.5 0.78 0.98 480 0.49 0.78 0.98.00. Real data analysis We applied our test to data-set which is used in Kubokawa and Srivastava [6]. The first applied data is colon data which is constructed by 000 (p) genes expression levels on normal colon tissues and 40 tumor colon tissues. We preprocessed the data by applying 0 logarithmic transformation. The second applied data-set is leukemia data which is constructed by 57(p) gene expressions on 47 patients suffering from acute lymphoblastic leukemia (ALL, 47 cases) and 5 patients suffering from acute myeloid leukemia (AML, 5 cases). The data-set are preprocessed by following protocol written in Dudoit et al. []. For the colon data, the p-value of our test is 0.000 for normal colon tissues and 0.000 for tumor colon tissues. For the leukemia data, we observed that both the p-value of our test for ALL and for AML are 0.000 for the leukemia data. These results indicate that the multivariate normality assumption on both sets of data cannot be rejected at the usual significance level 5%. 4 Concluding remarks This paper treats multivariate rd moment γ which is defined by using Hadamard product of observation vectors. When the dimension equals to, γ becomes univariate skewness. An estimator of γ, denoted as ˆγ, which is well defined for p > is proposed. Based on ˆγ, testing criterion for assessing multivariate normality is given. We prepare the expression of the testing statistic without having multi-sum. Calculating time becomes shorten. The performance of the testing criterion, in terms of the attained significance level and the empirical power, is shown through simulations for several setting of sample sizes and dimensions. We apply our proposed test to microarray data sets, some of the most popular are in high-dimensional data. To assess the normality of high-dimensional data, we recommend to use not only our proposed test but also Himeno and Yamada [4] s test based on ˆκ together. Omnibus test using ˆγ and ˆκ, which imitates Jarque and Bera [5] s test, is a future work. A Moment of statistic In this section, analytic forms of Var(T ), Var(T ) and Var(T ) are proposed. We derive them by using these results (Lemma ), proofs of which are tedious but simple, therefore skipped.

0 Lemma. Let z be a random vector distributed as p (0, I p ), and a and b be constant vectors. Then, E[(a z) ] a a, E[(a z) 4 ] (a a), E[(a z) (b z) ] (a b) + a ab b, E[(a z) b z] a aa b, E[(a z) 6 ] 5(a a), E[(a z) (b z) ] 6(a b) + 9a aa bb b. Here, we write analytic forms of Var(T ), Var(T ) and Var(T ) in the following lemma. Lemma. For T, T and T, as defined in Section., Proof. In Section, we expressed that where Var(T ) 6 p( Σ) p, 8 Var(T ) ( ) p( Σ) p, 4 Var(T ) ( )( ) p( Σ) p. η i T η i, { (a l z i ) a la l z } lz i. Since η,..., η are i.i.d., it holds that ( Var(T ) Var(η) Var where z p (0, I p ). We find that E[ζ] 0, and so ) {(a lz) a la l a lz}, Var(ζ) E[ζ ] [ E {(a lz) a la l a lz} + * {(a l z) a la l a lz}{(a αz) a αa α a αz} l,α [ {(5 8 + 9)(a la l ) } + * {6(a l a l ) + (9 9 9 + 9)a la l a la α a αa α } l,α 6 (a la l ) + l a α ) 6 p( A ) p 6 p( Σ) p, l,α

where the third equality follows from Lemma. Since E[T ] 0, we have Var(T ) E[T ] 9 ( ) E * (a l z i ) 4 (a lz j ) ( )a la l a lz i + * * (a l z i ) 4 (a lz j ) ( )a la l a lz i l,α * (a α z i ) 4 (a αz j ) ( )a αa α a αz i 9 ( ) E l z i ) 4 (a lz j ) + l z i ) (a lz j ) (a lz k ) l,α +( ) (a la l ) (a lz i ) ( )a * la l (a l z i ) (a lz j ) + * l z i ) (a αz i ) a lz j a αz j + l z i ) a lz j a αz j (a αz k ) ( )a * αa α (a l z i ) a lz j a αz j ( )a * la l (a α z i ) a αz j a lz j +( ) a la l a αa α 9 ( ) a lz i a αz i }],k,k [ { ( ) + ( )( ) + ( ) ( ) } (a la l ) + * { ( ){(a la α ) + a la l a αa α }a la α l,α + {( )( ) ( ) ( ) + ( ) }a la l a la α a αa α ] 9 ( ( ) ) 8 ( ) p( A ) p 8 ( ) p( Σ) p, (a la l ) + ( ) l a α ) where the fourth equality follows from Lemma. l,α

Since E[T ] 0, we have Var(T ) E[T ] 4 ( ) ( ) E * a l z i a lz j a lz k l,α,k,k + * * a l z i a lz j a lz k * a α z i a αz j a αz k 4 ( ) ( ) E 6 +6,k,k * * a l z i z ia α a lz j z ja α a lz k z ka α l,α,k 4 ( )( ) (a la l ) + 4 ( )( ) p( A ) p 4 ( )( ) p( Σ) p, where the fourth equality follows from Lemma. B Proof of Theorem l z i ) (a lz j ) (a lz k ) l a α ) In this section, we give a proof of Theorem. Before proving it, we prepare three lemmas. The first two lemmas (Lemma and Lemma 4) treat formula about multi-sum, and the last lemma treats the order of the expectations which is used to prove Theorem. Since proofs of Lemma and Lemma 4 are tedious but simple, we skipped here. Lemma. The following equations hold: ij σstσ is σ jt tr{( Σ)Σ} tr DΣ( Σ)ΣD tr D( Σ)Σ( Σ) p( Σ) p,s,t ij σ st σ is σ jt σ it σ js,s,t + p{d ( Σ)} p j s t l,α ij σis,s ij σ is σ jj σjs,s σ ij σ st σ is σ jt σ it σ js tr D( Σ)Σ( Σ) tr D( Σ)Σ( Σ) + p{d ( Σ)} p ii σijσ isσ js,,s ii σijσ isσ js.,s

Lemma 4. The following equations hold: ij σis p( Σ) p p{d ( Σ)} p ii σij 6 ij,,s ii σ ij σ is σjs tr DΣ( Σ)ΣD p{d ( Σ)} p ii σij ii σijσ jj,,s ii σijσ isσ js tr D( Σ)Σ( Σ) p{d ( Σ)} p,s ii σij ii σijσ 4 jj. Lemma 5. Let Q klαβγδ a iz k z l a j a i z α z β a j a iz γ z δa j, j where Σ / A (a,..., a p ), and z k, z l, z α, z β, z γ and z δ are i.i.d. as p (0, I p ). Then, E[Q ] O max { p( Σ) p }, σ ij σ st σ is σ jt σ it σ js, () j s t E[Q 4] O max { p( Σ) p }, σ ij σ st σ is σ jt σ it σ js, () j s t E[Q 45] O ( { p( Σ) p } ), () E[Q 456] O ( { p( Σ) p } ). (4) In addition, under the assumption C, E[Q ] O ( { p( Σ) p } ), E[Q 4] O ( { p( Σ) p } ). Proof. Since (), () and (4) can be proved by using the same way to show (), we only show this, and omit other three here. It can be expressed that [ E[Q ] E (a iz z a i a iz z a i a iz z a i ) + i z z a i a iz z a i a iz z a i )(a jz z a j a jz z a j a jz z a j ) + i z z a j a iz z a j a iz z a j ) + 4 i z z a j a iz z a j a iz z a j )(a iz z a s a iz z a s a iz z a s ),s + i z z a j a iz z a j a iz z a j )(a sz z a t a sz z a t a sz z a t ) + 4 +,s,t i z z a i a iz z a i a iz z a i )(a iz z a s a iz z a s a iz z a s ) i z z a i a iz z a i a iz z a i )(a jz z a s a jz z a s a jz z a s )

4 {(a ia i ) } + + 4 * {a i a i a ja j + (a ia j ) } + * {a i a i a ja j + (a ia j ) } i a i a ja s + a ia j a ia s ) +,s i a j a sa t + a ia s a ja t + a ia t a ja s ),s,t + 4 i a i a ia j ) + i a i a ja s + a ia j a ja s ),s 7 σii 6 + ii σjj + 8 ii σijσ jj + 6 ii σijσ 4 jj + 4 6 ij,s + 08 ii σij + 48 ij σis + 6 ii σjs + 6 ii σ ij σ is σjs + 7 ii σijσ isσ js +,s,s ij σst + 8,s,t,s ij σstσ is σ jt + 6,s,t ij σ is σ it σ js σ jt σ st,s,t { p( Σ) p } + 4 σii 6 + 8 ii σijσ jj + 6 ii σijσ 4 jj + 8 6 ij + 96 ii σij + 6 ij σis + 6 ii σ ij σ is σjs + 7 ii σ js σijσ is + 8,s ij σstσ is σ jt + 6,s,t,s ij σ is σ it σ js σ jt σ st,s,t { p( Σ) p } + 8 tr{( Σ)Σ} 8 tr DΣ( Σ)ΣD 6 tr D( Σ)Σ( Σ) 8 p( Σ) p + 48 p{d ( Σ)} p + 4 σii 6 + 8 ii σijσ jj + 6 ii σijσ 4 jj + 8 6 ij + 96 ii σij + 8 ij σis + 8 ii σ ij σ is σjs + 6 ii σijσ isσ js + 6 j s t,s σ ij σ is σ it σ js σ jt σ st,s,s { p( Σ) p } + 8 tr{( Σ)Σ} 4 p{d ( Σ)} p + 4 + 6 j s t σ ij σ is σ it σ js σ jt σ st { p( Σ) p } + 8 tr{( Σ)Σ} + 6 j s t,s σ ij σ is σ it σ js σ jt σ st, σii 6 + 4 ii σij where the second equality follows from Lemma, the third equality from bottom follows from Lemma and the second equality from bottom follows from Lemma 4. Since Σ is positive definite, we have tr{( Σ)Σ} {tr( Σ)Σ}, where the right-hand side of the inequality can be written as { p( Σ) p }. From this result, it is found that E[Q ] O max { p( Σ) p }, σ ij σ st σ is σ jt σ it σ js. j s t

5 ote that σ ij σ st σ is σ jt σ it σ js σ ij σ st ( σ is ) ( σ jt ) σ js σ it From this inequality, we have σ ij σ st σ is σ jt σ it σ js j s t Hence under the assumption C, j s t σ ij σ st σ is σ jt + σsj σ ti σ is σ jt, j s t σ ij σ st σ is σ jt σ it σ js [ tr{( Σ + )Σ + } + tr{( Σ + )Σ + } ] [ p( Σ + ) p ]. σ ij σ st σ is σ jt σ it σ js O ( { p( Σ) p } ), and so E[Q ] O ( { p( Σ) p } ). Proof of Theorem. Since S is unbiased estimator of p( Σ) p, it is sufficient to show that Var(S/ p( Σ) p ) 0 (5) as, p. Since S is invariant under location shift, without loss of generality, we may assume that µ 0. Each of A, B, C and D in (9) can be written as A B C D * p {(Σ / z k z kσ / ) (Σ / z l z lσ / ) (Σ / z α z ασ / )} p, k,l,α k,l,α,β k,l,α,β,γ k,l,α,β,γ,δ From (9), we can express that ( S/ p ( Σ) p ) * p {(Σ / z k z kσ / ) (Σ / z l z lσ / ) (Σ / z α z βσ / )} p, * p {(Σ / z k z kσ / ) (Σ / z l z ασ / ) (Σ / z β z γσ / )} p, * p {(Σ / z k z lσ / ) (Σ / z α z βσ / ) (Σ / z γ z δσ / )} p. [{ } A P p( B Σ) p P 4 p( + C Σ) p P 5 p( D Σ) p P 6 p( Σ) p [ { } ( ) ( ) ( ) ( A B C 4 P p( + Σ) p P 4 p( + Σ) p P 5 p( Σ) p ( ) ( ) ] D + P 6 p(, Σ) p ] )

6 where the inequality follows from Jensen inequality. Thus, it is described that ( ) [ [ { } ] ( ) [ ( ) ] S A B Var p( 4 E Σ) p P p( + E Σ) p P 4 p( Σ) p ( ) [ ( ) ] ( ) [ ( ) ]] C D + E P 5 p( + E Σ) p P 6 p(. (6) Σ) p For the first term in the right-hand side of the inequality (6), expanding {.}, and excluding the term whose expectation becomes 0, we have [ { } ] ( ) A E P p( Σ) p P p( E 6 * Q Σ) kkllαα + 8 * Qkkllαα Q kkllββ p k,l,α k,l,α,β +9 * Qkkllαα Q kkββγγ + * Qkkllαα Q ββγγδδ where k,l,α,β,γ k,l,α,β,γ,δ ( ) [6 P E[A ] + 8 P 4 E[A ] + 9 P 5 E[A ]] P { } P 6 + ( P ) ( ) [6 P E[A ] + 8 P 4 E[A ] + 9 P 5 E[A ]] P ( 5 + 0), (7) P E[A ] E[Q ] { p( Σ) p }, E[A ] E[Q Q 44 ] { p( Σ) p }, E[A ] E[Q Q 4455 ] { p( Σ) p }. Using the same calculation method, the second-fourth terms in the right-hand side of the inequality (6) can be written as [ { } ] ( ) B E P 4 p( [ P 4 E[B ] + P 4 E[B ] + 4 P 5 E[B ] + 4 P 5 E[B 4 ] Σ) p P 4 + P 6 E[B 5 ] + P 6 E[B 6 ]], (8) [ { } ] ( ) C E P 5 p( [4 P 5 E[C ] + 8 P 5 E[C ] + 8 P 5 E[C ] + 4 P 5 E[C 4 ] Σ) p P 5 +4 P 6 E[C 5 ] + 8 P 6 E[C 6 ] + 8 P 6 E[C 7 ] + 4 P 6 E[C 8 ]], (9) [ { } ] ( ) D E P 6 p( [6 P 6 E[D ] + 4 P 6 E[D ] + 4 P 6 E[D ] Σ) p P 6 +6 P 6 E[D 4 ]], (0)

7 where E[B ] E[Q 4] { p( Σ) p }, E[B ] E[Q 4Q 4 ] { p( Σ) p }, E[B ] E[Q 4Q 554 ] { p( Σ) p }, E[B 4 ] E[Q 4Q 554 ] { p( Σ) p }, E[B 5 ] E[Q 4Q 55664 ] { p( Σ) p }, E[B 6 ] E[Q 4Q 55664 ] { p( Σ) p }, E[C ] E[Q 45] { p( Σ) p }, E[C ] E[Q 45Q 54 ] { p( Σ) p }, E[C ] E[Q 45Q 45 ] { p( Σ) p }, E[C 4 ] E[Q 45Q 54 ] { p( Σ) p }, E[C 5 ] E[Q 45Q 6645 ] { p( Σ) p }, E[C 6 ] E[Q 45Q 6654 ] { p( Σ) p }, E[C 7 ] E[Q 45Q 6645 ] { p( Σ) p }, E[C 8 ] E[Q 45Q 6654 ] { p( Σ) p }, E[D ] E[Q 456] { p( Σ) p }, E[D ] E[Q 456Q 465 ] { p( Σ) p }, E[D ] E[Q 456Q 465 ] { p( Σ) p }, E[D 4 ] E[Q 456Q 465 ] { p( Σ) p }. From Lemma 5, we have E[A ] O(), E[B ] O(), E[C ] O(), E[D ] O(). It follows from Cauchy-Schwarz inequality that E[A j ] E[A ] (j,, 4), E[B j ] E[B ] (j,..., 6), E[C j ] E[C ] (j,..., 8), E[D j ] E[D ] (j,, 4). From them, we find that E[A j ] O() (j,, 4), E[B j ] O() (j,..., 6), E[C j ] O() (j,..., 8), E[D j ] O() (j,, 4). Combining these results with (7), (8), (9) and (0), it is found that [ { } ] A E P p( O( ), Σ) p [ { } ] B E P 4 p( O( ), Σ) p [ { } ] C E P 5 p( O( 4 ), Σ) p [ { } ] C E P 6 p( O( 6 ), Σ) p and so which converges to 0 as, p. ( ) S Var p( O( ), Σ) p

8 References [] Dudoit, S, Fridlyand, J. and Speed, T.P. (00). Comparison of discrimination methods for the classification of tumors using gene expression data. J. Amer. Statist. Assoc., 97, 77 87. [] Fujikoshi, Y., Ulyanov, V.V. and Shimizu, R. (00). Multivariate Statistics High-Dimensional and Large-Sample Approximations, John Wiley & Sons, Hoboken, J. [] Henze,. (00). Invariant tests for multivariate normality: a critical review. Statist. Papers, 4, 467 506. [4] Himeno, T. and Yamada, T. (04). Estimations for some functions of covariance matrix in high dimension under non-normality and its applications. J. Multivariate Anal., 0, 7 44. [5] Jarque, C.M. and Bera, A.K. (987). A test for normality of observations and regression residuals. Internat. Statist. Rev., 55, 6 7. [6] Kubokawa, T. and Srivastava, M.S. (008). Estimation of the precision matrix of a singular Wishart distribution and its application in high-dimensional data. J. Multivariate Anal., 99, 906 98. [7] Schott, J.R. (005). Matrix Analysis for Statistics, nd ed., John Wiley & Sons, Hoboken, J. [8] Srivastava, M.S. (984). A measure of skewness and kurtosis and a graphical method for assessing multivariate normality. Statist. Prob. Lett.,, 6 67. [9] Srivastava, M.S. (00). Methods of Multivariate Statistics, John Wiley & Sons, Y. [0] Mardia, K.V. (970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57, 59 50. [] Mecklin, C.J. and Mundfrom, D.J. (004). An appraisal and bibliography of tests for multivariate normality. Internat. Statist. Rev., 7, 8.