Bootstrapping Analogs of the One Way MANOVA Test


Hasthika S. Rupasinghe Arachchige Don and David J. Olive
Southern Illinois University
July 17, 2017

Abstract

The classical one way MANOVA model is used to test whether the mean measurements are the same or differ across $p$ groups, and assumes that each group has the same population covariance matrix. This paper suggests using the Olive (2017abc) bootstrap technique to develop analogs of the one way MANOVA test. The new tests can have some outlier resistance, and the tests do not need the population covariance matrices to be equal.

KEY WORDS: Behrens Fisher problem, bootstrap, prediction region, coordinatewise median.

David J. Olive is Professor, Hasthika S. Rupasinghe Arachchige Don is PhD student, Department of Mathematics, Southern Illinois University, Carbondale, IL 62901, USA.

1 INTRODUCTION

The multivariate linear model $y_i = B^T x_i + \epsilon_i$ for $i = 1, \ldots, n$ has $m \geq 2$ response variables $Y_1, \ldots, Y_m$ and $p$ predictor variables $x_1, x_2, \ldots, x_p$. The $i$th case is $(x_i^T, y_i^T) = (x_{i1}, x_{i2}, \ldots, x_{ip}, Y_{i1}, \ldots, Y_{im})$. The model is written in matrix form as $Z = XB + E$ where the matrices are defined below. The model has $E(\epsilon_k) = 0$ and $\mathrm{Cov}(\epsilon_k) = \Sigma_\epsilon = (\sigma_{ij})$ for $k = 1, \ldots, n$. Then the $p \times m$ coefficient matrix $B = [\, \beta_1 \; \beta_2 \; \cdots \; \beta_m \,]$ and the $m \times m$ covariance matrix $\Sigma_\epsilon$ are to be estimated, and $E(Z) = XB$ while $E(Y_{ij}) = x_i^T \beta_j$. The $\epsilon_i$ are assumed to be independent and identically distributed (iid).

The univariate linear model corresponds to $m = 1$ response variable, and is written in matrix form as $Y = X\beta + e$. Subscripts are needed for the $m$ univariate linear models $Y_j = X\beta_j + e_j$ for $j = 1, \ldots, m$ where $E(e_j) = 0$. For the multivariate linear model, $\mathrm{Cov}(e_i, e_j) = \sigma_{ij} I_n$ for $i, j = 1, \ldots, m$ where $I_n$ is the $n \times n$ identity matrix.

The $n \times m$ matrix

$$Z = [\, Y_1 \; Y_2 \; \cdots \; Y_m \,] = \begin{bmatrix} y_1^T \\ \vdots \\ y_n^T \end{bmatrix}.$$

The $n \times p$ design matrix $X$ of predictor variables is not necessarily of full rank $p$, and

$$X = [\, v_1 \; v_2 \; \cdots \; v_p \,] = \begin{bmatrix} x_1^T \\ \vdots \\ x_n^T \end{bmatrix}$$

where often $v_1 = 1$. The $p \times m$ matrix

$$B = [\, \beta_1 \; \beta_2 \; \cdots \; \beta_m \,].$$

The $n \times m$ matrix

$$E = [\, e_1 \; e_2 \; \cdots \; e_m \,] = \begin{bmatrix} \epsilon_1^T \\ \vdots \\ \epsilon_n^T \end{bmatrix}.$$

Considering the $i$th row of $Z$, $X$, and $E$ shows that $y_i^T = x_i^T B + \epsilon_i^T$.

The multivariate linear regression model and one way MANOVA model are special cases of the multivariate linear model, but using double subscripts will be useful for describing the one way MANOVA model. Suppose there are independent random samples of size $n_i$ from $p$ different populations (treatments), or $n_i$ cases are randomly assigned to $p$ treatment groups where $n = \sum_{i=1}^p n_i$. Assume that $m$ response variables $y_{ij} = (Y_{ij1}, \ldots, Y_{ijm})^T$ are measured for the $i$th treatment group and the $j$th case (often an individual or thing) in the group. Hence $i = 1, \ldots, p$ and $j = 1, \ldots, n_i$. The $Y_{ijk}$ follow different one way ANOVA models for $k = 1, \ldots, m$. Assume $E(y_{ij}) = \mu_i$ and $\mathrm{Cov}(y_{ij}) =$

$\Sigma_\epsilon$. Hence the $p$ treatments have different mean vectors $\mu_i$, but common covariance matrix $\Sigma_\epsilon$.

The one way MANOVA test is used to test $H_0: \mu_1 = \mu_2 = \cdots = \mu_p$. Often $\mu_i = \mu + \tau_i$, so $H_0$ becomes $H_0: \tau_1 = \cdots = \tau_p$. If $m = 1$, the one way MANOVA model is the one way ANOVA model. MANOVA is useful since it takes into account the correlations between the $m$ response variables. The Hotelling's $T^2$ test that uses a common covariance matrix is a special case of the one way MANOVA model with $p = 2$.

Let $\mu_i = \mu + \tau_i$ where $\sum_{i=1}^p n_i \tau_i = 0$. The $j$th case from the $i$th population or treatment group is $y_{ij} = \mu + \tau_i + \epsilon_{ij}$ where $\epsilon_{ij}$ is an error vector, $i = 1, \ldots, p$ and $j = 1, \ldots, n_i$. Let $\bar{y} = \hat{\mu} = \sum_{i=1}^p \sum_{j=1}^{n_i} y_{ij}/n$ be the overall mean. Let $\bar{y}_i = \sum_{j=1}^{n_i} y_{ij}/n_i = \hat{\mu}_i$ so $\hat{\tau}_i = \bar{y}_i - \bar{y}$. Let the residual vector $\hat{\epsilon}_{ij} = y_{ij} - \bar{y}_i = y_{ij} - \hat{\mu} - \hat{\tau}_i$. Then

$$y_{ij} = \bar{y} + (\bar{y}_i - \bar{y}) + (y_{ij} - \bar{y}_i) = \hat{\mu} + \hat{\tau}_i + \hat{\epsilon}_{ij} = \hat{\mu}_i + \hat{\epsilon}_{ij}.$$

Several $m \times m$ matrices will be useful. Let $S_i$ be the sample covariance matrix corresponding to the $i$th treatment group. Then the within sum of squares and cross products matrix is

$$W = (n_1 - 1)S_1 + \cdots + (n_p - 1)S_p = \sum_{i=1}^p \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)(y_{ij} - \bar{y}_i)^T.$$

Then $\hat{\Sigma}_\epsilon = W/(n - p)$. The treatment or between sum of squares and cross products matrix is

$$B_T = \sum_{i=1}^p n_i (\bar{y}_i - \bar{y})(\bar{y}_i - \bar{y})^T.$$

The total corrected (for the mean) sum of squares and cross products matrix is

$$T = B_T + W = \sum_{i=1}^p \sum_{j=1}^{n_i} (y_{ij} - \bar{y})(y_{ij} - \bar{y})^T.$$

Note that $S = T/(n - 1)$ is the usual sample covariance matrix of the $y_{ij}$ if it is assumed that all $n$ of the $y_{ij}$ are iid so that the $\mu_i \equiv \mu$ for $i = 1, \ldots, p$.

The one way MANOVA model is $y_{ij} = \mu_i + \epsilon_{ij}$ where the $\epsilon_{ij}$ are iid with $E(\epsilon_{ij}) = 0$ and $\mathrm{Cov}(\epsilon_{ij}) = \Sigma_\epsilon$. If all $n$ of the $y_{ij}$ are iid with $E(y_{ij}) = \mu$ and $\mathrm{Cov}(y_{ij}) = \Sigma_\epsilon$, it can be shown that $A/df \xrightarrow{P} \Sigma_\epsilon$ where $A = W$, $B_T$, or $T$ and $df$ is the corresponding degrees of freedom. Let $t_0$ be the test statistic. Often Pillai's trace statistic, the Hotelling Lawley trace statistic, or Wilks' lambda are used. Wilks' lambda is

$$\Lambda = \frac{|W|}{|B_T + W|} = \frac{|W|}{|T|} = \frac{\left| \sum_{i=1}^p \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)(y_{ij} - \bar{y}_i)^T \right|}{\left| \sum_{i=1}^p \sum_{j=1}^{n_i} (y_{ij} - \bar{y})(y_{ij} - \bar{y})^T \right|} = \frac{\left| \sum_{i=1}^p (n_i - 1) S_i \right|}{|(n - 1) S|}.$$

Then $t_0 = -[n - 0.5(m + p - 2)] \log(\Lambda)$ and the test rejects $H_0$ if $t_0 > \chi^2_{m(p-1)}(1 - \alpha)$. See Johnson and Wichern (1988, p. 238).

Following Mardia, Kent, and Bibby (1979, p. 335), let $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m$ be the eigenvalues of $W^{-1} B_T$. Then $1 + \lambda_i$ for $i = 1, \ldots, m$ are the eigenvalues of $W^{-1} T$ and $\Lambda = \prod_{i=1}^m (1 + \lambda_i)^{-1}$. Following Fujikoshi (2002) and Kakizawa (2009), let the Hotelling Lawley trace statistic

$$U = \mathrm{tr}(B_T W^{-1}) = \mathrm{tr}(W^{-1} B_T) = \sum_{i=1}^m \lambda_i,$$

and let Pillai's trace statistic

$$V = \mathrm{tr}(B_T T^{-1}) = \mathrm{tr}(T^{-1} B_T) = \sum_{i=1}^m \frac{\lambda_i}{1 + \lambda_i}.$$

If the $y_{ij} - \mu_i$ are iid with common covariance matrix $\Sigma_\epsilon$, and if $H_0$ is true, then under regularity conditions

$$-[n - 0.5(m + p - 2)] \log(\Lambda) \xrightarrow{D} \chi^2_{m(p-1)}, \quad (n - m - p - 1) U \xrightarrow{D} \chi^2_{m(p-1)}, \quad \text{and} \quad (n - 1) V \xrightarrow{D} \chi^2_{m(p-1)}.$$

Note that the common covariance matrix assumption implies that each of the $p$ treatment groups or populations has the same covariance matrix $\Sigma_i = \Sigma_\epsilon$ for $i = 1, \ldots, p$, an extremely strong assumption. Kakizawa (2009) and Olive, Pelawa Watagoda, and Rupasinghe Arachchige Don (2015) show that similar results hold for the multivariate linear model. The common covariance matrix assumption, $\mathrm{Cov}(\epsilon_k) = \Sigma_\epsilon$ for $k = 1, \ldots, n$, is often reasonable for the multivariate linear regression model.

A useful one way MANOVA model is $Z = XB + E$ where $X$ is a full rank matrix whose first column is $v_1 = 1$ and whose $i$th column $v_i$ is an indicator for group $i - 1$ for $i = 2, \ldots, p$. For example, $v_3 = (0^T, 1^T, 0^T, \ldots, 0^T)^T$ where the $p$ vectors in $v_3$ have lengths $n_1, n_2, \ldots, n_p$, respectively. Then $\hat{\beta}_{1k} = \bar{Y}_{p0k} = \hat{\mu}_{pk}$ for $k = 1, \ldots, m$, and

$$\hat{\beta}_{ik} = \bar{Y}_{i-1,0k} - \bar{Y}_{p0k} = \hat{\mu}_{i-1,k} - \hat{\mu}_{pk}$$

for $k = 1, \ldots, m$ and $i = 2, \ldots, p$. Thus testing $H_0: \mu_1 = \cdots = \mu_p$ is equivalent to testing $H_0: LB = 0$ where $L = [\, 0 \;\; I_{p-1} \,]$. Press (2005, p. 262) uses the above model. Then $y_{ij} = \mu_i + \epsilon_{ij}$ and

$$B = \begin{bmatrix} \mu_p^T \\ (\mu_1 - \mu_p)^T \\ (\mu_2 - \mu_p)^T \\ \vdots \\ (\mu_{p-2} - \mu_p)^T \\ (\mu_{p-1} - \mu_p)^T \end{bmatrix}.$$

Then a test statistic for the one way MANOVA model is $t_0$, computed from $w$ as in Equation (1.1) with $T_i = \hat{\mu}_i = \bar{y}_i$, where it is assumed that $\Sigma_i \equiv \Sigma_\epsilon$ for $i = 1, \ldots, p$.

Large sample theory can be used to derive a better test that does not need the equal population covariance matrix assumption $\Sigma_i \equiv \Sigma_\epsilon$. To simplify the large sample theory, assume $n_i = \pi_i n$ where $0 < \pi_i < 1$ and $\sum_{i=1}^p \pi_i = 1$. Assume $H_0$ is true, and let $\mu_i = \mu$ for $i = 1, \ldots, p$. Suppose $\sqrt{n_i}(T_i - \mu) \xrightarrow{D} N_m(0, \Sigma_i)$, and $\sqrt{n}(T_i - \mu) \xrightarrow{D} N_m\left(0, \frac{\Sigma_i}{\pi_i}\right)$. Let

$$w = \begin{bmatrix} T_1 - T_p \\ T_2 - T_p \\ \vdots \\ T_{p-2} - T_p \\ T_{p-1} - T_p \end{bmatrix}.$$

Then $\sqrt{n}\, w \xrightarrow{D} N_{m(p-1)}(0, \Sigma_w)$ with $\Sigma_w = (\Sigma_{ij})$ where $\Sigma_{ij} = \frac{\Sigma_p}{\pi_p}$ for $i \neq j$, and $\Sigma_{ii} = \frac{\Sigma_i}{\pi_i} + \frac{\Sigma_p}{\pi_p}$ for $i = j$. Hence

$$t_0 = n\, w^T \hat{\Sigma}_w^{-1} w = w^T \left( \frac{\hat{\Sigma}_w}{n} \right)^{-1} w \xrightarrow{D} \chi^2_{m(p-1)} \tag{1.1}$$

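as the $n_i \to \infty$ if $H_0$ is true. Here

$$\frac{\hat{\Sigma}_w}{n} = \begin{bmatrix} \frac{\hat{\Sigma}_1}{n_1} + \frac{\hat{\Sigma}_p}{n_p} & \frac{\hat{\Sigma}_p}{n_p} & \cdots & \frac{\hat{\Sigma}_p}{n_p} \\ \frac{\hat{\Sigma}_p}{n_p} & \frac{\hat{\Sigma}_2}{n_2} + \frac{\hat{\Sigma}_p}{n_p} & \cdots & \frac{\hat{\Sigma}_p}{n_p} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\hat{\Sigma}_p}{n_p} & \frac{\hat{\Sigma}_p}{n_p} & \cdots & \frac{\hat{\Sigma}_{p-1}}{n_{p-1}} + \frac{\hat{\Sigma}_p}{n_p} \end{bmatrix}$$

is a block matrix where the off diagonal block entries equal $\hat{\Sigma}_p/n_p$ and the $i$th diagonal block entry is $\frac{\hat{\Sigma}_i}{n_i} + \frac{\hat{\Sigma}_p}{n_p}$ for $i = 1, \ldots, (p - 1)$. Reject $H_0$ if

$$t_0 > m(p - 1) F_{m(p-1), d_n}(1 - \alpha) \tag{1.2}$$

where $d_n = \min(n_1, \ldots, n_p)$. It may make sense to relabel the groups so that $n_p$ is the largest $n_i$ or $\hat{\Sigma}_p/n_p$ has the smallest generalized variance of the $\hat{\Sigma}_i/n_i$. This test may start to outperform the one way MANOVA test if $n \geq (m + p)^2$ and $n_i \geq 20m$ for $i = 1, \ldots, p$.

Olive (2017b, ch. 10) has the above result where $T_i = \bar{y}_i$ is the sample mean and $\hat{\Sigma}_i = S_i$ is the sample covariance matrix of the $i$th group. Then $\Sigma_i$ is the population covariance matrix of the $i$th group. Rupasinghe Arachchige Don (2017) gives the general result.

If $T = (T_1^T, T_2^T, \ldots, T_p^T)^T$, $\theta = (\mu_1^T, \mu_2^T, \ldots, \mu_p^T)^T$, $c$ is a constant vector, and $A$ is a full rank $r \times mp$ matrix with rank $r$, then a large sample test of the form $H_0: A\theta = c$ versus $H_1: A\theta \neq c$ uses

$$A \sqrt{n}(T - \theta) \xrightarrow{D} N_r\left( 0,\; A\, \mathrm{diag}\left( \frac{\Sigma_1}{\pi_1}, \frac{\Sigma_2}{\pi_2}, \ldots, \frac{\Sigma_p}{\pi_p} \right) A^T \right).$$

When $H_0$ is true, the statistic

$$t_0 = [AT - c]^T \left[ A\, \mathrm{diag}\left( \frac{\hat{\Sigma}_1}{n_1}, \frac{\hat{\Sigma}_2}{n_2}, \ldots, \frac{\hat{\Sigma}_p}{n_p} \right) A^T \right]^{-1} [AT - c] \xrightarrow{D} \chi^2_r.$$

The same statistic was used by Zhang and Liu (2013, p. 138) with $T_i = \bar{y}_i$ and $\hat{\Sigma}_i = S_i$.

Section 2 shows how to get a bootstrap confidence region that can be used to test $H_0$ when $\hat{\Sigma}_w$ is unknown or difficult to estimate. Section 3 gives some simulations and an example.

2 Bootstrapping Hypothesis Tests and the Prediction Region Method

Olive (2017bc) shows that there is a useful relationship between prediction regions and confidence regions. Consider predicting a future $r \times 1$ test vector $x_f$, given past training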

data $x_1, \ldots, x_n$. A large sample $100(1 - \delta)\%$ prediction region is a set $A_n$ such that $P(x_f \in A_n) \to 1 - \delta$ while a large sample $100(1 - \delta)\%$ confidence region for a parameter $\tau$ is a set $A_n$ such that $P(\tau \in A_n) \to 1 - \delta$ as $n \to \infty$. Consider testing $H_0: \tau = c$ versus $H_1: \tau \neq c$ where $c$ is a known $r \times 1$ vector.

Some notation is needed to describe the Olive (2013) prediction region for the multivariate location and dispersion model. Let the $r \times 1$ column vector $T$ be a multivariate location estimator, and let the $r \times r$ symmetric positive definite matrix $C$ be a dispersion estimator. Then the $i$th squared sample Mahalanobis distance is the scalar

$$D_i^2 = D_i^2(T, C) = D_{x_i}^2(T, C) = (x_i - T)^T C^{-1} (x_i - T) \tag{2.1}$$

for each observation $x_i$. Notice that the Euclidean distance of $x_i$ from the estimate of center $T$ is $D_i(T, I_r)$ where $I_r$ is the $r \times r$ identity matrix. The classical Mahalanobis distance uses $(T, C) = (\bar{x}, S)$, the sample mean and sample covariance matrix where

$$\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i \quad \text{and} \quad S = \frac{1}{n - 1} \sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})^T. \tag{2.2}$$

A large sample $100(1 - \delta)\%$ prediction region is the hyperellipsoid

$$\{w : D_w^2(\bar{x}, S) \leq D_{(c)}^2\} = \{w : D_w(\bar{x}, S) \leq D_{(c)}\} \tag{2.3}$$

for appropriate $c$. Using $c = \lceil n(1 - \delta) \rceil$ covers about $100(1 - \delta)\%$ of the training data cases $x_i$, but the prediction region will have coverage lower than the nominal coverage of $1 - \delta$ for moderate $n$. This result is not surprising since empirically statistical methods perform worse on test data. Increasing $c$ will improve the coverage for moderate samples. Let

$$q_n = \min(1 - \delta + 0.05,\; 1 - \delta + r/n) \text{ for } \delta > 0.1, \quad \text{and} \quad q_n = \min(1 - \delta/2,\; 1 - \delta + 10\delta r/n) \text{ otherwise.} \tag{2.4}$$

If $1 - \delta < 0.999$ and $q_n < 1 - \delta + 0.001$, set $q_n = 1 - \delta$. Let $D_{(U_n)}$ be the $100 q_n$th percentile of the $D_i$. Then the Olive (2013) large sample $100(1 - \delta)\%$ nonparametric prediction region for a future value $x_f$ given iid data $x_1, \ldots, x_n$ is

$$\{w : D_w^2(\bar{x}, S) \leq D_{(U_n)}^2\}, \tag{2.5}$$

while the classical large sample $100(1 - \delta)\%$ prediction region is

$$\{w : D_w^2(\bar{x}, S) \leq \chi^2_{r, 1-\delta}\}. \tag{2.6}$$

The Olive (2017abc) prediction region method obtains a confidence region for $\tau$ by applying the nonparametric prediction region (2.5) to the bootstrap sample $T_1^*, \ldots, T_B^*$, and the theory for the method is sketched below. Let $\bar{T}^*$ and $S_T^*$ be the sample mean and sample covariance matrix of the bootstrap sample. Assume $\sqrt{n}(T - \tau) \xrightarrow{D} N_r(0, \Sigma_A)$, and $n S_T^* \xrightarrow{P} \Sigma_A$. See Machado and Parente (2005) for regularity conditions for this assumption.

Following Bickel and Ren (2001), let the vector of parameters $\tau = T(F)$, the statistic $T_n = T(F_n)$, and $T^* = T(F_n^*)$ where $F$ is the cdf of iid $x_1, \ldots, x_n$, $F_n$ is the empirical

cdf, and $F_n^*$ is the empirical cdf of $x_1^*, \ldots, x_n^*$, a sample from $F_n$ using the nonparametric bootstrap. If $\sqrt{n}(F_n - F) \xrightarrow{D} z_F$, a Gaussian random process, and if $T$ is sufficiently smooth (with a Hadamard derivative $\dot{T}(F)$), then $\sqrt{n}(T_n - \tau) \xrightarrow{D} U$ and $\sqrt{n}(T_i^* - T_n) \xrightarrow{D} U$ with $U = \dot{T}(F) z_F$. Olive (2017bc) uses these results to show that if $U \sim N_r(0, \Sigma_A)$, then $\sqrt{n}(\bar{T}^* - T_n) \xrightarrow{D} 0$, $\sqrt{n}(T_i^* - \bar{T}^*) \xrightarrow{D} U$, $\sqrt{n}(\bar{T}^* - \tau) \xrightarrow{D} U$, and that the prediction region method large sample $100(1 - \delta)\%$ confidence region for $\tau$ is

$$\{w : (w - \bar{T}^*)^T [S_T^*]^{-1} (w - \bar{T}^*) \leq D_{(U_B)}^2\} = \{w : D_w^2(\bar{T}^*, S_T^*) \leq D_{(U_B)}^2\} \tag{2.7}$$

where $D_{(U_B)}^2$ is computed from $D_i^2 = (T_i^* - \bar{T}^*)^T [S_T^*]^{-1} (T_i^* - \bar{T}^*)$ for $i = 1, \ldots, B$. Note that the corresponding test for $H_0: \tau = \tau_0$ rejects $H_0$ if $(\bar{T}^* - \tau_0)^T [S_T^*]^{-1} (\bar{T}^* - \tau_0) > D_{(U_B)}^2$. This procedure is basically the one sample Hotelling's $T^2$ test applied to the $T_i^*$ using $S_T^*$ as the estimated covariance matrix and replacing the $\chi^2_{r, 1-\delta}$ cutoff by $D_{(U_B)}^2$.

The prediction region method for testing $H_0: \tau = c$ versus $H_1: \tau \neq c$ is simple. Let $\hat{\tau}$ be a consistent estimator of $\tau$ and make a bootstrap sample $w_i = \hat{\tau}_i^* - c$ for $i = 1, \ldots, B$. Make the nonparametric prediction region (2.7) for the $w_i$ and fail to reject $H_0$ if $0$ is in the prediction region; reject $H_0$ otherwise.

The Bickel and Ren (2001) hypothesis testing method is equivalent to using confidence region (2.7) with $\bar{T}^*$ replaced by $T_n$ and $U_B$ replaced by $B(1 - \delta)$. If region (2.7) or the Bickel and Ren (2001) region is a large sample $100(1 - \delta)\%$ confidence region, then so is the other region if $\sqrt{n}(\bar{T}^* - T_n) \xrightarrow{D} 0$. Hadamard differentiability and asymptotic normality are two of the sufficient conditions for both regions to be large sample confidence regions if $n S_T^* \xrightarrow{P} \Sigma_A$, but Bickel and Ren (2001) showed that their method can work when Hadamard differentiability fails.

The location model with means, medians, and trimmed means is one example where the Bickel and Ren (2001, p. 96) method works. Since the univariate sample mean, sample median, and sample trimmed mean are Hadamard differentiable and asymptotically normal, each coordinate satisfies $\sqrt{n}(T_{in} - \bar{T}_i^*) \xrightarrow{D} 0$ for $i = 1, \ldots, p$. Hence $\sqrt{n}(T_n - \bar{T}^*) \xrightarrow{D} 0$, and (2.7) is a large sample $100(1 - \delta)\%$ confidence region if $T_n$ is the coordinatewise sample mean, median, or trimmed mean. Fréchet differentiability implies Hadamard differentiability, and many statistics are shown to be Hadamard differentiable in Bickel and Ren (2001), Clarke (1986, 2000), Fernholtz (1983), Gill (1989), Ren (1991) and Ren and Sen (1995).

Since the common covariance matrix assumption $\mathrm{Cov}(\epsilon_k) = \Sigma_\epsilon$ for $k = 1, \ldots, n$ is extremely strong, using the prediction region method for testing may be a useful alternative. If $T = (T_1^T, T_2^T, \ldots, T_p^T)^T$, $\theta = (\mu_1^T, \mu_2^T, \ldots, \mu_p^T)^T$, $c$ is a constant vector, and $A$ is a full rank $r \times mp$ matrix with rank $r$, then consider a large sample test of the form $H_0: A\theta = c$ versus $H_1: A\theta \neq c$. Then $\tau = A\theta$, $\hat{\tau} = AT$, and $\hat{\tau}_i^* = AT_i^*$ where $T^* = ((T_1^*)^T, (T_2^*)^T, \ldots, (T_p^*)^T)^T$, and $T_i^* = \hat{\mu}_i^*$.

We will illustrate this method with the one way MANOVA test for $H_0: A\theta = 0$, where $0$ is an $r \times 1$ vector of zeroes with $r = (p - 1)m$. This test is equivalent to $H_0: LB = 0$ where $L$ and $B$ are given in Section 1, and $0$ is a $(p - 1) \times m$ matrix of zeroes. Take a sample of size $n_i$ with replacement from the $n_i$ cases for each group for $i = 1, 2, \ldots, p$. Let $\hat{B}_i^*$ be the $i$th bootstrap estimator of $B$ for $i = 1, \ldots, B$. Let the

$(p - 1)m \times 1$ vector $w_i = \mathrm{vec}(L\hat{B}_i^*) = ((\hat{\mu}_1^* - \hat{\mu}_p^*)^T, \ldots, (\hat{\mu}_{p-1}^* - \hat{\mu}_p^*)^T)_i^T$ for $i = 1, \ldots, B$, where $\mathrm{vec}(A)$ stacks columns of a matrix into a vector. For a robust test use $w_i = AT_i^* = ((T_1^* - T_p^*)^T, \ldots, (T_{p-1}^* - T_p^*)^T)_i^T$ where $T_i$ is a robust location estimator, such as the coordinatewise median or trimmed mean, applied to the cases in the $i$th treatment group. The prediction region method fails to reject $H_0$ if $0$ is in the resulting confidence region.

3 EXAMPLE AND SIMULATIONS

Example. The Cornwell and Trumbull (1994) North Carolina Crime data consists of 630 observations on 24 variables. This data set is available online from ( dockgithubio/rdatasets/datasetshtml). Region is a categorical variable with three categories: Central, West and Other with the number of observations 238, 147, and 245 respectively, and forms the three groups. The $m = 5$ variables are $Y_1$ = wsta = weekly wage of state employees, $Y_2$ = avgsen = average sentence days, $Y_3$ = prbarr = probability of arrest, $Y_4$ = prbconv = probability of conviction, and $Y_5$ = taxpc = tax revenue per capita. There were a few outliers, and boxplots of the variables (not shown) indicated that the sample medians of the three groups were nearly the same for all 5 variables. The variables were highly skewed with different amounts of skew for the three groups. Hence the location measures other than the population coordinatewise median likely do differ. The test with the coordinatewise median had $D_0 = 4.086$ with the cutoff of $4.32$ and failed to reject $H_0$. The classical one way MANOVA test had a p-value of $0.001$ and rejected the null hypothesis.

The simulation used 5000 runs with $B$ bootstrap samples and $p = 3$ groups. We may need $n \geq 40mp$, $n \geq (m + p)^2$, and $n_i \geq 40m$. Olive (2017bc) suggests that the prediction region method can give good results when the number of bootstrap samples $B \geq 50r = 50m(p - 1)$, and the simulation used various values of $B$. The sample mean, coordinatewise median, and coordinatewise 25% trimmed mean were the statistics $T$ used. The classical one way MANOVA Hotelling Lawley test statistic was also used.

Four types of data distributions $w_i$ were considered that were identical for $i = 1, 2$, and $3$. Then $y_1 = \sigma_1 C w_1 + \delta_1 1$, $y_2 = \sigma_2 C w_2 + \delta_2 1$, and $y_3 = \sigma_3 C w_3 + \delta_3 1$ or $y_3 = w_3$, where $1 = (1, \ldots, 1)^T$ is a vector of ones and $C = \mathrm{diag}(1, \sqrt{2}, \ldots, \sqrt{m})$. The $w_i$ distributions were the multivariate normal distribution $N_m(0, I)$, the mixture distribution $0.6 N_m(0, I) + 0.4 N_m(0, 25I)$, the multivariate $t$ distribution with 4 degrees of freedom, and the multivariate lognormal distribution shifted to have a nonzero mean $\mu$ but a population coordinatewise median of $0$. If $\sigma_1 = 1$ and $\delta_i = 0$ for $i = 1, 2, 3$, note that $\mathrm{Cov}(y_2) = \sigma_2^2\, \mathrm{Cov}(y_1)$, and for the first three distributions, $E(y_i) = E(w_i) = 0$. If $y_3 = w_3$ then $\mathrm{Cov}(y_3) = c I_m$ for some constant $c > 0$. If $\sigma_1 = 1$ and $y_3 = \sigma_3 C w_3 + \delta_3 1$, then $\mathrm{Cov}(y_3) = \sigma_3^2\, \mathrm{Cov}(y_1)$.

Adding the same type and proportion of outliers to all three groups often resulted in three distributions that were still similar. Hence outliers were added to the first group but not the second or third, making the covariance structures of the three groups quite different. The outlier proportion was $100\gamma\%$. Let $y_1 = (y_{11}, \ldots, y_{m1})^T$. The five outlier types for group 1 were type 1: a tight cluster at the major axis $(0, \ldots, 0, z)^T$, type 2: a

tight cluster at the minor axis $(z, 0, \ldots, 0)^T$, type 3: $N_m(z1, \mathrm{diag}(1, \ldots, m))$, type 4: $y_{m1}$ replaced by $z$, and type 5: $y_{11}$ replaced by $z$. The quantity $z$ determines how far the outliers are from the clean data.

Let the coverage be the proportion of times that $H_0$ is rejected. We want the coverage near $0.05$ when $H_0$ is true and the coverage close to $1$ for good power when $H_0$ is false. With 5000 runs, an observed coverage inside of $(0.04, 0.06)$ suggests that the true coverage is close to the nominal $0.05$ coverage when $H_0$ is true.

The new tests work well with all the distributions and with the different covariance settings. Tables 1 through 4 show simulation results for two distributions with various covariance settings. We took $\delta_1 = \delta_3 = 0$, and $B$ is the number of bootstrap samples. Balanced and unbalanced designs have also been considered. For Tables 1 and 2, $\Sigma_i \propto \mathrm{diag}(1, 2, \ldots, m)$ for $i = 1, 2, 3$. For Tables 3 and 4, $\sigma_2 = \sigma_3 = 1$, and $\Sigma_3 = cI$ does not have the same shape as $\Sigma_1$ and $\Sigma_2$.

Tables 1 and 3 are for the multivariate normal (MVN) distribution. The classical test works well with multivariate normal data when the covariance matrices are the same, but the type I error tends to be higher than the nominal level when the covariance matrices differ. The classical test can be too conservative when the design is unbalanced. Having an unbalanced design and different covariance matrices was the worst case scenario for the classical test regardless of the data distribution. The bootstrap tests using the mean and coordinatewise trimmed mean usually performed well but occasionally had coverage near $0.07$. Tables 2 and 4 are for the lognormal distribution, where the location measures other than the coordinatewise median differ if $\sigma_2 \neq \sigma_3$ (then coverage near 1 is desired).

Figures 1, 2, and 3 show power curves for the bootstrap tests and for the Zhang and Liu (2013) MANOVA type test (1.2) based on the sample means $\bar{y}_i$ and $S_i$ for the 3 groups. The bootstrap test based on the sample means bootstraps the test (1.2). For these power curves, group $i$ has mean $\mu_i = \delta_i 1$ where $\delta_2 = 2\delta_1$ and $\delta_3 = 3\delta_1$. When $\delta_1$ increases, the distance between the mean vectors increases. The power curves for the bootstrap test based on the sample means and for test (1.2) were always similar. Figure 1 shows the power curve for clean MVN data with a balanced design where the groups have the same covariance matrices. Here the three mean based tests had similar power. The power curve for the classical test was poor for the next two figures. Figure 2 shows clean MVN data with $m = 5$, $\sigma_1 = 1$, $\sigma_2 = 2$, $\sigma_3 = 5$, $n_1 = 200$, $n_2 = 400$, and $n_3 = 600$. Figure 3 used settings similar to Figure 2 with the multivariate $t_4$ distribution, and the coordinatewise trimmed mean had the best power.

Simulations were also done for type I error with contamination using the five types of outliers, and $(\gamma, z) = (0.1, 10)$ or $(0.05, 20)$. In Table 5 with $m = 5$, the test with the coordinatewise median works reasonably well (close to the nominal coverage) for 10% outliers with all the distributions and for all the outlier types with the exception of outlier type 3. All the other tests, including the classical test, failed. Results were similar with $m = 10$, $n_i = 800$, $B = 1000$, and $\gamma = 0.05$. Increasing $z$ as $m$ increases can help, but if $m$ and $\gamma$ are large enough, then the outliers move the coordinatewise median of the first group enough so that the test tends to reject $H_0$.
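The robust bootstrap test described in Section 2 and used in the simulations can be sketched in code. The block below is only an illustration under stated assumptions, not the authors' R functions manbtsim2 or predreg: it is written in Python with the standard library only, and the function names (`coord_median`, `q_cut`, `pr_boot_test`), the default of 200 bootstrap samples, the seed, and the percentile convention (the $\lceil B q_B \rceil$th ordered distance) are hypothetical choices made for this sketch.

```python
import math
import random


def coord_median(rows):
    """Coordinatewise median of a list of equal-length vectors."""
    out = []
    for col in zip(*rows):
        s = sorted(col)
        k = len(s)
        out.append(s[k // 2] if k % 2 else 0.5 * (s[k // 2 - 1] + s[k // 2]))
    return out


def mean_and_cov(rows):
    """Sample mean vector and sample covariance matrix (divisor B - 1)."""
    b, r = len(rows), len(rows[0])
    mean = [sum(col) / b for col in zip(*rows)]
    cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in rows) / (b - 1)
            for j in range(r)] for i in range(r)]
    return mean, cov


def solve(a, v):
    """Solve a x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [list(a[i]) + [v[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def maha(w, center, cov):
    """Mahalanobis distance of w from center, as in (2.1)."""
    d = [wi - ci for wi, ci in zip(w, center)]
    return math.sqrt(sum(di * si for di, si in zip(d, solve(cov, d))))


def q_cut(delta, r, b):
    """Corrected percentile level from (2.4), with n replaced by B."""
    if delta > 0.1:
        q = min(1 - delta + 0.05, 1 - delta + r / b)
    else:
        q = min(1 - delta / 2, 1 - delta + 10 * delta * r / b)
    if 1 - delta < 0.999 and q < 1 - delta + 0.001:
        q = 1 - delta
    return q


def pr_boot_test(groups, stat=coord_median, n_boot=200, delta=0.05, seed=0):
    """Return (D_0, cutoff, reject) for H0: all group locations are equal."""
    rng = random.Random(seed)
    ws = []
    for _ in range(n_boot):
        # Resample n_i cases with replacement within each group.
        t = [stat([rng.choice(g) for _ in g]) for g in groups]
        # Stack the differences T_i* - T_p* for i = 1, ..., p - 1.
        ws.append([a - b for ti in t[:-1] for a, b in zip(ti, t[-1])])
    center, cov = mean_and_cov(ws)
    dists = sorted(maha(w, center, cov) for w in ws)
    q = q_cut(delta, len(center), n_boot)
    # Assumed percentile convention: the ceil(B q)th ordered distance.
    cutoff = dists[min(n_boot, math.ceil(n_boot * q)) - 1]
    d0 = maha([0.0] * len(center), center, cov)  # distance of 0 from center
    return d0, cutoff, d0 > cutoff
```

Calling `pr_boot_test(groups)` with `groups` given as a list of $p$ lists of length-$m$ vectors returns the distance $D_0$ of the zero vector from the bootstrap center, the cutoff $D_{(U_B)}$, and the rejection decision; passing a different `stat` (for example a coordinatewise mean) gives the other bootstrap tests under the same assumptions.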

Table 1: Type I error for clean MVN data with $\Sigma_3 \neq cI$ (columns: $m$, $n_1$, $n_2$, $n_3$, $B$, $\sigma_2$, $\sigma_3$, Median, Mean, TrMn, Class)

Figure 1: Power curve for clean MVN data with $m = 5$, $\sigma_1 = 1$, $\sigma_2 = 1$, $\sigma_3 = 1$, $n_1 = 200$, $n_2 = 200$, and $n_3 = 200$ (curves: Median, Mean, TrMean, ManovaType, Classical; horizontal axis: delta1)

Table 2: Type I error for clean lognormal data with $\Sigma_3 \neq cI$ (columns: $m$, $n_1$, $n_2$, $n_3$, $B$, $\sigma_2$, $\sigma_3$, Median, Mean, TrMn, Class)

Table 3: Type I error for clean MVN data with $\Sigma_3 = cI$ (columns: $m$, $n_1$, $n_2$, $n_3$, $B$, Median, Mean, TrMn, Class)

Table 4: Type I error for clean lognormal data with $\Sigma_3 = cI$ (columns: $m$, $n_1$, $n_2$, $n_3$, $B$, Median, Mean, TrMn, Class)

Figure 2: Power curve for clean MVN data with $m = 5$, $\sigma_1 = 1$, $\sigma_2 = 2$, $\sigma_3 = 5$, $n_1 = 200$, $n_2 = 400$, and $n_3 = 600$ (curves: Median, Mean, TrMean, ManovaType, Classical; horizontal axis: delta1)

Figure 3: Power curve for clean multivariate $t_4$ data with $m = 5$, $\sigma_1 = 1$, $\sigma_2 = 2$, $\sigma_3 = 5$, $n_1 = 200$, $n_2 = 400$, and $n_3 = 600$ (curves: Median, Mean, TrMean, ManovaType, Classical)

Table 5: Type I error with contaminated data: $m = 5$, $\gamma = 0.1$ (columns: Dist, $n_1 = n_2 = n_3$, $B$, outlier, Median, Mean, TrMn, Class)
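When reading the type I error columns of Tables 1 through 5, the $(0.04, 0.06)$ band for the observed coverage can be related to a binomial standard error. The worked computation below is a sketch under assumptions that the source does not state: the 5000 runs are treated as independent Bernoulli trials with rejection probability $0.05$, and the band is compared with three standard errors.

$$\mathrm{SE} = \sqrt{\frac{0.05\,(1 - 0.05)}{5000}} = \sqrt{9.5 \times 10^{-6}} \approx 0.0031, \qquad 0.05 \pm 3\,\mathrm{SE} \approx (0.041,\; 0.059).$$

Under these assumptions, an observed coverage outside $(0.04, 0.06)$ is about three standard errors or more from the nominal level, which is why such entries can be read as evidence that the true type I error differs from $0.05$.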

4 CONCLUSIONS

Bootstrapping different estimators of multivariate location provides an alternative to the one way MANOVA test that assumes the population covariance matrices of the $p$ groups are the same. The bootstrap test and test (1.2) were similar when the sample means $\bar{y}_i$ were used. A larger simulation is in Rupasinghe Arachchige Don (2017).

Rupasinghe Arachchige Don and Pelawa Watagoda (2017) consider bootstrapping analogs of the two sample Hotelling's $T^2$ test, and Konietschke, Bathke, Harrar, and Pauly (2015) suggest a method for bootstrapping the MANOVA model. References for robust one way MANOVA tests are in Finch and French (2013), Todorov and Filzmoser (2010), Van Aelst and Willems (2011), Wilcox (1995), and Zhang and Liu (2013).

The R software was used in the simulation. See R Core Team (2016). Programs are in the Olive (2017b) collection of R functions mpacktxt available from ( siuedu/olive/mpacktxt). The function manbtsim2 was used to simulate the tests of hypotheses, and predreg computes the confidence region given the bootstrap values.

5 References

Bickel, P.J., and Ren, J.J. (2001), "The Bootstrap in Hypothesis Testing," in State of the Art in Probability and Statistics: Festschrift for William R. van Zwet, eds. de Gunst, M., Klaassen, C., and van der Vaart, A., The Institute of Mathematical Statistics, Hayward, CA.

Clarke, B.R. (1986), "Nonsmooth Analysis and Fréchet Differentiability of M Functionals," Probability Theory and Related Fields, 73.

Clarke, B.R. (2000), "A Review of Differentiability in Relation to Robustness With an Application to Seismic Data Analysis," Proceedings of the Indian National Science Academy, A, 66.

Cornwell, C., and Trumbull, W.N. (1994), "Estimating the Economic Model of Crime with Panel Data," Review of Economics and Statistics, 76.

Fernholtz, L.T. (1983), von Mises Calculus for Statistical Functionals, Springer, New York, NY.

Finch, H., and French, B. (2013), "A Monte Carlo Comparison of Robust MANOVA Test Statistics," Journal of Modern Applied Statistical Methods, 12.

Fujikoshi, Y. (2002), "Asymptotic Expansions for the Distributions of Multivariate Basic Statistics and One-Way MANOVA Tests Under Nonnormality," Journal of Statistical Planning and Inference, 108.

Gill, R.D. (1989), "Non- and Semi-Parametric Maximum Likelihood Estimators and the von Mises Method, Part 1," Scandinavian Journal of Statistics, 16.

Johnson, R.A., and Wichern, D.W. (1988), Applied Multivariate Statistical Analysis, 2nd ed., Prentice Hall, Englewood Cliffs, NJ.

Kakizawa, Y. (2009), "Third-Order Power Comparisons for a Class of Tests for Multivariate Linear Hypothesis Under General Distributions," Journal of Multivariate Analysis, 100.

Konietschke, F., Bathke, A.C., Harrar, S.W., and Pauly, M. (2015), "Parametric and Nonparametric Bootstrap Methods for General MANOVA," Journal of Multivariate Analysis, 140.

Machado, J.A.F., and Parente, P. (2005), "Bootstrap Estimation of Covariance Matrices Via the Percentile Method," Econometrics Journal, 8.

Mardia, K.V., Kent, J.T., and Bibby, J.M. (1979), Multivariate Analysis, Academic Press, London, UK.

Olive, D.J. (2013), "Asymptotically Optimal Regression Prediction Intervals and Prediction Regions for Multivariate Data," International Journal of Statistics and Probability, 2.

Olive, D.J. (2017a), "Applications of Hyperellipsoidal Prediction Regions," Statistical Papers, to appear.

Olive, D.J. (2017b), Robust Multivariate Analysis, Springer, New York, NY, to appear.

Olive, D.J. (2017c), "Bootstrapping Hypothesis Tests and Confidence Regions," unpublished manuscript with the bootstrap material from Olive (2017b).

Olive, D.J., Pelawa Watagoda, L.C.R., and Rupasinghe Arachchige Don, H.S. (2015), "Visualizing and Testing the Multivariate Linear Regression Model," International Journal of Statistics and Probability, 4.

Press, S.J. (2005), Applied Multivariate Analysis: Using Bayesian and Frequentist Methods of Inference, 2nd ed., Dover, Mineola, NY.

R Core Team (2016), "R: a Language and Environment for Statistical Computing," R Foundation for Statistical Computing, Vienna, Austria, (www.r-project.org).

Ren, J.-J. (1991), "On Hadamard Differentiability of Extended Statistical Functional," Journal of Multivariate Analysis, 39.

Ren, J.-J., and Sen, P.K. (1995), "Hadamard Differentiability on $D[0,1]^p$," Journal of Multivariate Analysis, 55.

Rupasinghe Arachchige Don, H.S. (2017), "Bootstrapping Analogs of the One Way MANOVA Test," PhD Thesis, Southern Illinois University, at ( siuedu/olive/shasthikaphdpdf).

Rupasinghe Arachchige Don, H.S., and Pelawa Watagoda, L.C.R. (2017), "Bootstrapping Analogs of the Two Sample Hotelling's $T^2$ Test," Communications in Statistics: Theory and Methods, to appear. See preprint at ( stwosamplepdf).

Todorov, V., and Filzmoser, P. (2010), "Robust Statistics for the One-Way MANOVA," Computational Statistics & Data Analysis, 54.

Van Aelst, S., and Willems, G. (2011), "Robust and Efficient One-Way MANOVA Tests," Journal of the American Statistical Association, 106.

Wilcox, R.R. (1995), "Simulation Results on Solutions to the Multivariate Behrens-Fisher Problem via Trimmed Means," The Statistician, 44.

Zhang, J.-T., and Liu, X. (2013), "A Modified Bartlett Test for Heteroscedastic One-Way MANOVA," Metrika, 76.
