A Relationship Between the One-Way MANOVA Test Statistic and the Hotelling Lawley Trace Test Statistic

Size: px

Start display at page:

Download "A Relationship Between the One-Way MANOVA Test Statistic and the Hotelling Lawley Trace Test Statistic"

Neal Houston
5 years ago
Views:

1 Iteratioal Joural of Statistics ad Probability Vol 7, No 6; 2018 A Relatioship Betwee the Oe-Way MANOVA Test Statistic ad the Hotellig Lawley Trace Test Statistic Hasthika S Rupasighe Arachchige Do Correspodece: Hasthika S Rupasighe Arachchige Do, Departmet of Mathematical Scieces, Appalachia State Uiversity, Booe, NC, 28607, USA Received: August 23, 2018 Accepted: September 28, 2018 Olie Published: October 15, 2018 doi:105539/ijspv76p124 URL: Abstract The Oe-Way MANOVA model is a special case of the multivariate liear model, ad this paper shows that the Oe-Way MANOVA test statistic ad the Hotellig Lawley trace test statistic are equivalet if the desig matrix is carefully chose Keywords: ANOVA, Liear Moldel 1 Itroductio We wat to show that the Oe-Way MANOVA test statistic ad the Hotellig Lawley trace test statistic are equivalet for a carefully chose full rak desig matrix First we will describe the MANOVA model, ad the the Oe-Way MANOVA model The otatio i this paper follows that used i Olive (2017) ad closely follows Rupasighe Arachchige Do (2017) 11 MANOVA Multivariate aalysis of variace (MANOVA) is aalogous to a ANOVA, but there is more tha oe depedet variable ANOVA tests for the differece i meas betwee two or more groups, while MANOVA tests for the differece i two or more vectors of meas The multivariate aalysis of variace (MANOVA) model y i = B T x i + ϵ i for i = 1,, has m 2 respose variables Y 1, Y m ad p predictor variables x 1, x 2,, x p The ith case is (x T i, yt i ) = (x i1,, x ip, Y i1,, Y im ) If a costat x i1 = 1 is i the model, the x i1 could be omitted from the case For the MANOVA model predictors are idicator variables Sometimes the trivial predictor 1 is also i the model The MANOVA model i matrix form is Z = XB + E ad has E(ϵ k ) = 0 ad Cov(ϵ k ) = Σϵ = (σ i j ) for k = 1,, Also E(e i ) = 0 while Cov(e i, e j ) = σ i j I for i, j = 1,, m The B ad Σϵ are ukow matrices to be estimated Z = Y 1,1 Y 1,2 Y 1,m Y 2,1 Y 2,2 Y 2,m Y,1 Y,2 Y,m = ( ) Y 1 Y 2 Y m = y T 1 y T The p matrix X is ot ecessarily of full rak p, ad where ofte v 1 = 1 X = x 1,1 x 1,2 x 1,p x 2,1 x 2,2 x 2,p x,1 x,2 x,p = ( ) v 1 v 2 v p = x T 1 x T The p m coefficiet matrix is B = β 1,1 β 1,2 β 1,m β 2,1 β 2,2 β 2,m β p,1 β p,2 β p,m = ( ) β 1 β 2 β m The m error matrix is 124

2 Iteratioal Joural of Statistics ad Probability Vol 7, No 6; 2018 ϵ 1,1 ϵ 1,2 ϵ 1,m ϵ 2,1 ϵ 2,2 ϵ 2,m ϵ,1 ϵ,2 ϵ,m E = = ( ) e 1 e 2 e m = ϵ T 1 ϵ T Each respose variable i a MANOVA model follows a ANOVA model Y j = Xβ j + e j for j = 1,, m, where it is assumed that E(e j ) = 0 ad Cov(e j ) = σ j j I MANOVA models are ofte fit by least squares The least squares estimator ˆB of B is ˆB = ( X T X ) X T Z = ( ˆβ 1 ˆβ 2 ˆβ m ) where ( X T X ) is a geeralized iverse of X T X If X has a full rak the ( X T X ) = ( X T X ) 1 ad ˆB is uique The predicted values or fitted values are Ẑ = X ˆB = ( ) Ŷ 1 Ŷ 2 Ŷ m = Ŷ 1,1 Ŷ 1,2 Ŷ 1,m Ŷ 2,1 Ŷ 2,2 Ŷ 2,m Ŷ,1 Ŷ,2 Ŷ,m The residuals are Ê = Z Ẑ = Z X ˆB Fially, ˆΣϵ = (Z Ẑ)T (Z Ẑ) p = ÊT Ê p 12 Oe Way MANOVA Assume that there are idepedet radom samples of size i from p differet populatios, or i cases are radomly assiged to p treatmet groups Let = p i be the total sample size Also assume that m respose variables y i j = (Y i j1,, Y i jm ) T are measured for the ith treatmet group ad the jth case Assume E(y i j ) = µ i ad Cov(y i j ) = Σϵ The oe way MANOVA is used to test H 0 : µ 1 = µ 2 = = µ p Note that if m = 1 the oe way MANOVA model becomes the oe way ANOVA model Oe might thik that performig m ANOVA tests is sufficiet to test the above hypotheses But the separate ANOVA tests would ot take the correlatio betwee the m variables ito accout O the other had the MANOVA test will take the correlatio ito accout i Let ȳ = p y i j/ be the overall mea Let ȳ i = i y i j/ i Several m m matrices will be useful Let S i be the sample covariace matrix correspodig to the ith treatmet group The the withi sum of squares ad cross products matrix is W = ( 1 1)S ( p 1)S p = p i (y i j y i )(y i j y i ) T The ˆΣϵ = W/( p) The treatmet or betwee sum of squares ad cross products matrix is B T = i (y i y)(y i y) T The total corrected (for the mea) sum of squares ad cross products matrix is T = B T + W = p i (y i j y)(y i j y) T Note that S = T/( 1) is the usual sample covariace matrix of the y i j if it is assumed that all of the y i j are iid so that the µ i µ for i = 1,, p The oe way MANOVA model is y i j = µ i + ϵ i j where ϵ i j are iid with E(ϵ i j ) = 0 ad Cov(ϵ i j ) = Σϵ The summary oe way MANOVA table is show bellow Source matrix df Treatmet or Betwee B T p 1 Residual or Error or Withi W p Total (Corrected) T 1 There are three commoly used test statistics to test the above hypotheses Namely, 125

3 Iteratioal Joural of Statistics ad Probability Vol 7, No 6; Hotellig Lawley trace statistic: U = tr(b T W 1 ) = tr(w 1 B T ) 2 Wilks lambda: Λ = W B T + W 3 Pillai s trace statistic: V = tr(b T T 1 ) = tr(t 1 B T ) If the y i j µ j are iid with commo covariace matrix Σϵ, ad if H 0 is true, the uder regularity coditios Fujikoshi (2002) showed 1 ( m p 1)U D χ 2 m(p 1), 2 [ 05(m + p 2)]log(Λ) D χ 2 m(p 1), ad 3 ( 1)V D χ 2 m(p 1) Note that the commo covariace matrix assumptio implies that each of the p treatmet groups or populatios has the same covariace matrix Σ i = Σϵ for i = 1,, p, a extremely strog assumptio Kakizawa (2009) ad Olive et al (2015) show that similar results hold for the multivariate liear model The commo covariace matrix assumptio, Cov(ϵ k ) = Σϵ for k = 1,,, is ofte reasoable for the multivariate liear regressio model 13 Hotellig Lawley Trace Test Hotellig Lawley trace test statistic Hotellig (1951); Lawley (1938), ad the asymptotic distributio ( m p 1)U D χ 2 m(p 1) by Fujikoshi (2002) are widely used Olive et al (2015) explais the large sample theory of the Wilks Λ, Pillai s trace, ad Hotellig Lawley trace test statistics ad gives two theorems to show that the Hotellig Lawley test geeralizes the usual partial F test for m = 1 respose variable to m 1 respose variables 2 Method 21 A Relatioship Betwee the Oe-Way MANOVA Test ad the Hotellig Lawley Trace Test A alterative method for Oe-Way MANOVA is to use the model Z = XB + E where Y i j = Y i j1 Y i jm = µ i + e i j, ad E[Y ij ] = µ i = for i = 1,, p ad j = 1,, i The X is a full rak where the ith colum of X is a idicator for group i 1 for i = 2,, p X = µ i j1 µ i jm, (1) 126

4 Iteratioal Joural of Statistics ad Probability Vol 7, No 6; 2018 B = The µ T p (µ 1 µ p ) T (µ p 1 µ p ) T ad let L = ( 0 I p 1 ) Note that Y T i j = µ T i + e T i j 1 2 p X T X = p 2 0 p 2 0 p p 1 (2) ad ( X T X ) 1 = 1 p p p p p p 1 (3) The the least squares estimator ˆB of B, ˆB = ȳ T p (ȳ 1 ȳ p ) T (ȳ p 1 ȳ p ) T, ad L ˆB = (ȳ 1 ȳ p ) T (ȳ 2 ȳ p ) T (ȳ p 1 ȳ p ) T The L ( X T X ) 1 L T becomes L ( X T X ) 1 L T = 1 p 1 + p p p p 1 (4) It ca be show that the iverse of the above matrix is [ L ( X T X ) ] 1 1 L T 1 = [ For coveiece, write L ( X T X ) ] 1 1 L T = 1 ( 1 ) p ( 2 ) p 1 1 p 1 2 p 1 p 1 ( p 1 ) p p 1 1 p 1 2 p 1 2 p p 1 The 127

5 Iteratioal Joural of Statistics ad Probability Vol 7, No 6; 2018 ( L ˆB ) [ T L ( X T X ) ] 1 1 ( L T L ˆB ) = 1 p 1 p 1 p 1 i j (ȳ i ȳ p )(ȳ j ȳ p ) T + i (ȳ i ȳ p )(ȳ i ȳ p ) T = H Let X be as i (1) The the multivariate liear regressio Hotellig Lawley test statistic for testig H 0 : LB = 0 versus H 0 : LB 0 has U = tr(w 1 H) Oe-Way MANOVA is used to test H 0 : µ 1 = µ 2 = = µ p The Oe-Way MANOVA Hotellig Lawley test statistic for testig for above hypotheses is U = tr(w 1 B T ) where W = ( p) ˆΣϵ ad B T = i (ȳ i ȳ)(ȳ i ȳ) T Theorem 1 The Oe-Way MANOVA ad the multivariate liear regressio Hotellig Lawley trace test statistics are the same for the desig matrix as i (1) To show that the above two test statistics are equal, it is sufficiet to prove that H = B T First we will prove two special cases ad the give the proof for the theorem Proof Special case I: p = 2 (Two group case) Cosider H H = (ȳ 1 ȳ 2 )(ȳ 1 ȳ 2 ) T + 1 (ȳ 1 ȳ 2 )(ȳ 1 ȳ 2 ) T Sice = 1 + 2, H = 1 ( )(ȳ 1 ȳ 2 )(ȳ 1 ȳ 2 ) T + 1 (ȳ 1 ȳ 2 )(ȳ 1 ȳ 2 ) T H = 1 (ȳ 1 ȳ 2 )(ȳ 1 ȳ 2 ) T (ȳ 1 ȳ 2 )(ȳ 1 ȳ 2 ) T + 1 (ȳ 1 ȳ 2 )(ȳ 1 ȳ 2 ) T H = 1 2 (ȳ 1 ȳ 2 )(ȳ 1 ȳ 2 ) T Now cosider B T with p = 2 Note that ȳ = ( 1 ȳ ȳ 2 )/ ad B T = 1 (ȳ 1 ȳ)(ȳ 1 ȳ) T + 2 (ȳ 2 ȳ)(ȳ 2 ȳ) T B T = 1 (ȳ ȳ 1 2 ȳ 2 )(ȳ 1 1 ȳ 1 2 ȳ 1 ) T + 2 (ȳ ȳ 1 2 ȳ 2 )(ȳ 2 1 ȳ 1 2 ȳ 2 ) T B T = (ȳ 2 1 ȳ 2 )(ȳ 1 ȳ 2 ) T (ȳ 2 1 ȳ 2 )(ȳ 1 ȳ 2 ) T B T = 1 2 (ȳ 1 ȳ 2 )(ȳ 1 ȳ 2 ) T Therefore B T = H whe p = 2 Proof Special case II: i = 1 i = 1,, p H = 1 p 1 p 1 p 1 i j (ȳ i ȳ p )(ȳ j ȳ p ) T + i (ȳ i ȳ p )(ȳ i ȳ p ) T Note that the i, j ruig from 1 through p 1 ad i, j ruig from 1 through p would yield the same H Therefore H ca be writte as H = 1 i j (ȳ i ȳ p )(ȳ j ȳ p ) T + i (ȳ i ȳ p )(ȳ i ȳ p ) T 128

6 Iteratioal Joural of Statistics ad Probability Vol 7, No 6; 2018 Now cosider the double sum i H Note that = 1 p ad 1 = 1 p i j (ȳ i ȳ p )(ȳ j ȳ p ) T = p (ȳi ȳ T j ȳ i ȳ T p ȳ p ȳ T j + ȳ p ȳp) T ) (ȳi ȳ T j + p ȳ i ȳt p + pȳ p ȳ T j p2 ȳ p ȳ T p (5) Now cosider the rest of H, 1 (ȳ i ȳ p )(ȳ i ȳ p ) T = 1 ȳ i ȳ T i 1 ȳ i ȳt p 1 ȳ p ȳ T i + 1pȳ p ȳ T p (6) Therefore by (5) ad (6), it is clear that H = 1 ȳ i ȳ T i 1 p ȳ i ȳ T j (7) Now cosider Let Therefore, B T becomes Ȳ = ȳ T 1 ȳ T 2 ȳ T p B T = 1 (ȳ i ȳ)(ȳ i ȳ) T (8) The B T = 1 [Ȳ T Ȳ 1 ] pȳt 11 T Ȳ B T = 1 ȳ i ȳ T i 1 p ȳ i ȳ T j (9) From (8) ad (9) B T = H Proof Geeral case: H = 1 i j (ȳ i ȳ p )(ȳ j ȳ p ) T + i (ȳ i ȳ p )(ȳ i ȳ p ) T First cosider the double sum i H 1 i j ȳ i ȳ T j i j (ȳ i ȳ p )(ȳ j ȳ p ) T = i j ȳ i ȳ T p + 1 i j ȳ p ȳ T j 1 T ȳpȳ p i j (10) 1 i ȳ i j ȳ T j + 1 i ȳ i j ȳ T p + 1 ȳp i j ȳ T j 1 T ȳpȳ p 2 129

7 Iteratioal Joural of Statistics ad Probability Vol 7, No 6; ȳȳt + 1 i ȳ i ȳ T p + 1 ȳp j ȳ T j ȳ p ȳ T p ȳȳ T + i ȳ i ȳ T p + ȳ p j ȳ T j ȳ p ȳ T p (11) Now cosider the rest of H, i (ȳ i ȳ p )(ȳ i ȳ p ) T = i ȳ i ȳ T i i ȳ i ȳ T p ȳ p i ȳ T i + ȳ p ȳ T p (12) Therefore by (11) ad (12) H = i ȳ i ȳ T i ȳȳ T (13) Now cosider B T = B T = i (ȳ i ȳ)(ȳ i ȳ) T i ȳ i ȳ T i i ȳ i ȳ T ȳ i ȳ T i + ȳȳ T B T = i ȳ i ȳ T i ȳȳ T ȳȳ T + ȳȳ T B T = i ȳ i ȳ T i ȳ i ȳ T (14) i (13) ad (14) proves that H = B T 22 Cell Meas Model We ca get the same result for the cell meas model which is defied for X ad B give below X = , B = µ T 1 µ T p ad L = ( I p 1 1 ) ˆB = ȳ T 1 ȳ T p, L ˆB = The X T X = diag ( ) ( ) 1,, p 1 ad (X T X) = diag 1,, p 1 (ȳ 1 ȳ p ) T (ȳ 2 ȳ p ) T (ȳ p 1 ȳ p ) T 130

8 Iteratioal Joural of Statistics ad Probability Vol 7, No 6; 2018 The L ( X T X ) 1 L T becomes L ( X T X ) 1 L T = 1 p 1 + p p p p 1 (15) Notice that the matrix equatio (15) is exactly same as (4) This is a idicatio that Theorem 1 does ot deped o the full rak desig matrix 3 Coclusios This work mathematically proved that the Oe-Way MANOVA test statistic ad the Hotellig Lawley trace test statistic are i fact the same The proof cosisted of two special cases ad the geeral case This result idicates that oe ca use the Oe-Way MANOVA test statistic ad the Hotellig Lawley trace test statistic alteratively if the desig matrix is carefully chose Ackowledgemets The author thaks Dr David J Olive ad Dr Lasathi C R Pelawa Watagoda for some commets o this paper Refereces Fujikoshi, Y (2002) Asymptotic expasios for the distributios of multivariate basic statistics ad oe-way MANOVA tests uder oormality Joural of Statistical Plaig ad Iferece, 108(1), Hotellig, H (1951) A geeralized T test ad measure of multivariate dispersio I J Neyma (Ed), Proceedigs of the Secod Berkeley Symposium o Mathematical Statistics ad Probability (pp 23-41) Berkeley: Uiversity of Califoria Press Kakizawa, Y (2009) Third-order power comparisos for a class of tests for multivariate liear hypothesis uder geeral distributios Joural of Multivariate Aalysis, 100(3), Lawley, D N (1938) A geeralizatio of Fisher s z-test Biometrika, 30, Olive, D J (2017) Robust Multivariate Aalysis Spriger Iteratioal Publishig Olive, D J, Pelawa, W L C R, & Rupasighe, A D H S (2015) Visualizig ad Testig the Multivariate Liear Regressio Model Iteratioal Joural of Statistics ad Probability, 4(1), Rupasighe Arachchige Do, H S (2017) Bootstrappig Aalogs of the Oe Way MANOVA Test (PhD Thesis), Souther Illiois Uiversity, USA, at ( Copyrights Copyright for this article is retaied by the author(s), with first publicatio rights grated to the joural This is a ope-access article distributed uder the terms ad coditios of the Creative Commos Attributio licese ( 131

Agenda: Recap. Lecture. Chapter 12. Homework. Chapt 12 #1, 2, 3 SAS Problems 3 & 4 by hand. Marquette University MATH 4740/MSCS 5740

Agenda: Recap. Lecture. Chapter 12. Homework. Chapt 12 #1, 2, 3 SAS Problems 3 & 4 by hand. Marquette University MATH 4740/MSCS 5740 Ageda: Recap. Lecture. Chapter Homework. Chapt #,, 3 SAS Problems 3 & 4 by had. Copyright 06 by D.B. Rowe Recap. 6: Statistical Iferece: Procedures for μ -μ 6. Statistical Iferece Cocerig μ -μ Recall yes