Other hypotheses of interest (cont'd)


In addition to the simple null hypothesis of no treatment effects, we might wish to test other hypotheses of the general form (examples follow):

$$H_0: C_{k \times g}\,\beta_{g \times p} = 0 \quad \text{(comparisons among treatments)}$$

$$H_0: \beta_{g \times p}\,M_{p \times q} = 0 \quad \text{(comparisons across traits)}$$

$$H_0: C_{k \times g}\,\beta_{g \times p}\,M_{p \times q} = 0 \quad \text{(a combination of both)}$$

For example, suppose that we measure $p = 2$ traits on each unit in $g = 3$ treatment groups, and we set $\tau_3 = [\tau_{31}\ \ \tau_{32}] = 0$. In this case,

$$\beta = \begin{bmatrix} \mu_1 & \mu_2 \\ \tau_{11} & \tau_{12} \\ \tau_{21} & \tau_{22} \end{bmatrix}$$

Comparisons of treatment means:

$$C\beta = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} \mu_1 & \mu_2 \\ \tau_{11} & \tau_{12} \\ \tau_{21} & \tau_{22} \end{bmatrix} = \begin{bmatrix} \tau_{11} & \tau_{12} \\ \tau_{11} - \tau_{21} & \tau_{12} - \tau_{22} \end{bmatrix}$$

Under the restriction $\tau_{31} = \tau_{32} = 0$, the first row of $\beta$ is the vector of expected responses for units in treatment 3. The first row of $C$ provides the differences between mean responses for treatments 3 and 1 for both traits. The second row of $C$ provides the differences between mean responses for treatments 1 and 2 for both traits.

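Comparison of trait means within each treatment group:

$$\beta M = \begin{bmatrix} \mu_1 & \mu_2 \\ \tau_{11} & \tau_{12} \\ \tau_{21} & \tau_{22} \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} \mu_1 - \mu_2 \\ \tau_{11} - \tau_{12} \\ \tau_{21} - \tau_{22} \end{bmatrix}$$

The first row compares mean responses for trait 1 and trait 2 for units receiving treatment 3. The second row is the trait 1 versus trait 2 difference for units receiving treatment 1, measured relative to the same difference for treatment 3; when the first row is zero, it compares the two trait means for treatment 1 directly. The third row does the same for units receiving treatment 2.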

One might be interested in comparing the effects of the treatments on just the first trait:

$$C\beta M = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} \mu_1 & \mu_2 \\ \tau_{11} & \tau_{12} \\ \tau_{21} & \tau_{22} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} \tau_{11} \\ \tau_{11} - \tau_{21} \end{bmatrix}$$

Row 1 is the difference of the effects of treatments 3 and 1 on mean responses for trait 1. Row 2 is the difference of the effects of treatments 1 and 2 on mean responses for trait 1.

One might be interested in interaction contrasts:

$$C\beta M = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} \mu_1 & \mu_2 \\ \tau_{11} & \tau_{12} \\ \tau_{21} & \tau_{22} \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} \tau_{11} - \tau_{12} \\ (\tau_{11} - \tau_{21}) - (\tau_{12} - \tau_{22}) \end{bmatrix}$$

The first row is the difference in mean responses to treatments 3 and 1 for trait 1 minus the difference in mean responses to treatments 3 and 1 for trait 2. The second row is the difference in mean responses to treatments 1 and 2 for trait 1 minus the difference in mean responses to treatments 1 and 2 for trait 2.
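The four matrix products above can be checked numerically. The following R sketch uses made-up values for β (they are illustrative only and do not come from the slides) together with the C and M matrices of the g = 3, p = 2 example.

```r
# Hypothetical parameter values, chosen only to illustrate the products.
beta <- matrix(c(10, 20,   # mu_1,   mu_2   (treatment 3 means, since tau_3 = 0)
                  3,  1,   # tau_11, tau_12
                  5,  4),  # tau_21, tau_22
               nrow = 3, byrow = TRUE)

C <- matrix(c(0, 1,  0,
              0, 1, -1), nrow = 2, byrow = TRUE)  # contrasts among treatments
M_diff   <- matrix(c(1, -1), ncol = 1)            # trait 1 minus trait 2
M_trait1 <- matrix(c(1,  0), ncol = 1)            # keep trait 1 only

treat_contrasts <- C %*% beta               # rows: tau_1j and tau_1j - tau_2j
trait_contrasts <- beta %*% M_diff          # mu_1 - mu_2, tau_11 - tau_12, tau_21 - tau_22
trait1_only     <- C %*% beta %*% M_trait1  # tau_11 and tau_11 - tau_21
interaction     <- C %*% beta %*% M_diff    # tau_11 - tau_12 and (tau_11 - tau_21) - (tau_12 - tau_22)

print(treat_contrasts)
print(interaction)
```

Each hypothesis of the form CβM = 0 simply asks whether the corresponding product is a zero matrix in the population.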

F approximation to the sampling distribution of Wilks' criterion

C. R. Rao (1951), Bulletin Int. Stat. Inst. 33(2), 177-180.

The approximation coincides with the exact F-distribution for the cases described on a previous slide. It is more accurate than the large-sample chi-square approximation. Similar F approximations are used by SAS for Pillai's trace and the Lawley-Hotelling trace.

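When $H_0: C_{k \times r}\,\beta_{r \times p}\,M_{p \times u} = 0_{k \times u}$ is true,

$$F = \left(\frac{1 - \Lambda^{1/b}}{\Lambda^{1/b}}\right)\left(\frac{ab - c}{uk}\right) \;\overset{\cdot}{\sim}\; F_{(uk,\ ab - c)}$$

where

$$a = (n - r) - \frac{u - k + 1}{2}, \qquad b = \sqrt{\frac{u^2 k^2 - 4}{u^2 + k^2 - 5}}, \qquad c = \frac{uk - 2}{2}$$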
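As a sketch of how this approximation can be evaluated, the R function below follows the formulas for a, b and c directly. The function name `rao_f` and its argument names are hypothetical, and the fallback b = 1 when u² + k² − 5 ≤ 0 is an assumption (a common convention) that is not stated on the slide.

```r
# Rao's F approximation for Wilks' Lambda when testing H0: C beta M = 0.
#   lambda: Wilks' Lambda     n: total sample size   r: rows of beta
#   k:      rows of C         u: columns of M
rao_f <- function(lambda, n, r, k, u) {
  a <- (n - r) - (u - k + 1) / 2
  denom <- u^2 + k^2 - 5
  # Assumption: use b = 1 when the square-root formula is undefined.
  b <- if (denom > 0) sqrt((u^2 * k^2 - 4) / denom) else 1
  c_adj <- (u * k - 2) / 2
  df1 <- u * k
  df2 <- a * b - c_adj
  f <- (1 - lambda^(1 / b)) / lambda^(1 / b) * df2 / df1
  c(F = f, df1 = df1, df2 = df2,
    p.value = pf(f, df1, df2, lower.tail = FALSE))
}

# Overall group effect in the student example (u = p = 4, k = 2, n = 82, r = 3)
res_overall <- rao_f(0.5434483, n = 82, r = 3, k = 2, u = 4)
# Parallel-profiles test in the same example (u = 3)
res_parallel <- rao_f(0.5689034, n = 82, r = 3, k = 2, u = 3)
print(round(res_overall[1:3], 4))
print(round(res_parallel[1:3], 4))
```

The two calls use the Wilks' Lambda values reported in the R output for the student example, so the resulting F statistics and degrees of freedom can be compared with that output.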

Example: One-way MANOVA

Populations: g = 3 types of students
    (k = 1) Technical school students (n_1 = 23)
    (k = 2) Architecture students (n_2 = 38)
    (k = 3) Medical technology students (n_3 = 21)

Response variables: p = 4 test scores
    aptitude test
    mathematics test
    language test
    general knowledge

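One-way MANOVA model:

$$X_{ij} = \begin{bmatrix} X_{ij1} \\ X_{ij2} \\ X_{ij3} \\ X_{ij4} \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ \mu_4 \end{bmatrix} + \begin{bmatrix} \tau_{i1} \\ \tau_{i2} \\ \tau_{i3} \\ \tau_{i4} \end{bmatrix} + \begin{bmatrix} \epsilon_{ij1} \\ \epsilon_{ij2} \\ \epsilon_{ij3} \\ \epsilon_{ij4} \end{bmatrix}$$

where

$$\epsilon_{ij} = \begin{bmatrix} \epsilon_{ij1} \\ \epsilon_{ij2} \\ \epsilon_{ij3} \\ \epsilon_{ij4} \end{bmatrix} \sim \text{NID}(0, \Sigma)$$

and we use the SAS constraints $\tau_{31} = \tau_{32} = \tau_{33} = \tau_{34} = 0$.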

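Test the null hypothesis that the mean vectors for the four traits are the same for all three types of students. Write the one-way MANOVA model in matrix form

$$X_{n \times p} = A_{n \times r}\,\beta_{r \times p} + \epsilon_{n \times p}$$

where

$$\beta_{3 \times 4} = \begin{bmatrix} \mu_1 & \mu_2 & \mu_3 & \mu_4 \\ \tau_{11} & \tau_{12} & \tau_{13} & \tau_{14} \\ \tau_{21} & \tau_{22} & \tau_{23} & \tau_{24} \end{bmatrix}$$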

The null hypothesis that the mean vectors for the four traits are the same for all three types of students can be written as

$$H_0: C\beta = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \mu_1 & \mu_2 & \mu_3 & \mu_4 \\ \tau_{11} & \tau_{12} & \tau_{13} & \tau_{14} \\ \tau_{21} & \tau_{22} & \tau_{23} & \tau_{24} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Here
    k = 2 rows in C
    r = 3 rows in β
    p = 4 response variables
    M = I_{4×4}, so u = p = 4
    n = n_1 + n_2 + n_3 = 23 + 38 + 21 = 82

The value of Wilks' Lambda is

$$\Lambda = \frac{|W|}{|B + W|} = 0.544$$

and

$$a = (82 - 3) - \frac{4 - 2 + 1}{2} = 77.5, \qquad b = \sqrt{\frac{4^2\, 2^2 - 4}{4^2 + 2^2 - 5}} = 2, \qquad c = \frac{(4)(2) - 2}{2} = 3$$

$$F = \left(\frac{1 - \Lambda^{1/b}}{\Lambda^{1/b}}\right)\left(\frac{ab - c}{uk}\right) = 6.76$$

on $(uk,\ ab - c) = (8, 152)$ degrees of freedom, with p-value < .0001.

Conclude that the mean scores are different for at least two types of students for at least one of the four response variables.

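We can also test the null hypothesis of equal response means across populations one response at a time, by running a one-way ANOVA for each of the four response variables. The SAS code below (in morel.sas) produces both the univariate ANOVAs and the multivariate tests:

    PROC GLM DATA=SET1;
      CLASS GROUP;
      /* one-way ANOVA for each of the four responses */
      MODEL X1-X4 = GROUP / SOLUTION;
      /* multivariate tests; print hypothesis and error SSCP matrices */
      MANOVA H=GROUP / PRINTH PRINTE;
    RUN;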

MANOVA Test Criteria and F Approximations for the Hypothesis of No Overall group Effect
H = Type III SSCP Matrix for group
E = Error SSCP Matrix
S=2  M=0.5  N=37

    Statistic                  Value      F Value   Num DF   Den DF    p-value
    Wilks' Lambda              0.543448     6.77       8     152       <.0001
    Pillai's Trace             0.492698     6.29       8     154       <.0001
    Hotelling-Lawley Trace     0.773588     7.29       8     106.27    <.0001
    Roy's Greatest Root        0.675059    12.99       4      77

NOTE: F Statistic for Roy's Greatest Root is an upper bound.
NOTE: F Statistic for Wilks' Lambda is exact.

R code (in morel.r):

    > morel[, 1] <- as.factor(morel[, 1])
    > fit <- manova(as.matrix(morel[, -1]) ~ morel[, 1])
    > summary(fit, test = "Wilks")
               Df   Wilks approx F num Df den Df    Pr(>F)
    morel[, 1]  2 0.54345   6.7736      8    152 1.384e-07 ***
    Residuals  79
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    > summary(fit, test = "Pillai")
               Df Pillai approx F num Df den Df    Pr(>F)
    morel[, 1]  2 0.4927   6.2923      8    154 4.824e-07 ***
    Residuals  79
    ---

Bonferroni t-tests and intervals

We can also examine Bonferroni simultaneous t-tests and confidence intervals. For the difference in treatment means for the i-th response,

$$(\mu_i + \tau_{ki}) - (\mu_i + \tau_{li}) = \tau_{ki} - \tau_{li},$$

we need the variance of $\bar{X}_{ki} - \bar{X}_{li}$:

$$\operatorname{Var}(\hat{\tau}_{ki} - \hat{\tau}_{li}) = \operatorname{Var}(\bar{X}_{ki} - \bar{X}_{li}) = \left(\frac{1}{n_k} + \frac{1}{n_l}\right)\sigma_{ii},$$

where $\sigma_{ii}$ is estimated by $s_{\text{pooled},ii} = \dfrac{w_{ii}}{n - g}$, with $w_{ii}$ the i-th diagonal element of the within-groups SSP matrix $W$.

If we wish to carry out all pairwise comparisons, there will be $pg(g-1)/2$ of them. To maintain a simultaneous type I error level of no more than $\alpha$, we can use the rule:

Reject $H_0: \tau_{ki} = \tau_{li}$ if

$$|t| = \frac{|\bar{X}_{ki} - \bar{X}_{li}|}{\sqrt{s_{\text{pooled},ii}\left(\dfrac{1}{n_k} + \dfrac{1}{n_l}\right)}} > t_{n-g}\!\left(\frac{\alpha}{2m}\right), \qquad \text{where } m = \frac{pg(g-1)}{2}.$$

Bonferroni simultaneous intervals

We have three groups and four variables, for a total of $m = 4 \times 3 \times 2/2 = 12$ comparisons, three for each response variable. From the output:

    w_11 = 55036.1955
    w_22 = 14588.9983
    w_33 = 4759.8655
    w_44 = 7040.6248

with n_1 = 23, n_2 = 38, n_3 = 21 and n − g = 82 − 3 = 79.

A 95% simultaneous confidence interval for the true difference between technical school and architecture students on mathematics, using the group sizes $n_1 = 23$ and $n_2 = 38$, is

$$\bar{x}_{12} - \bar{x}_{22} \pm t_{n-g}\!\left(\frac{\alpha}{2m}\right)\sqrt{\frac{w_{22}}{n-g}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

$$= 47.3913 - 51.1842 \pm t_{79}(0.05/24)\sqrt{\frac{14588.9983}{79}\left(\frac{1}{23} + \frac{1}{38}\right)}$$

$$= -3.7929 \pm 2.951\sqrt{184.671 \times 0.0698} \;\Rightarrow\; (-14.387,\ 6.802).$$

Since the interval includes 0, we conclude that there is insufficient evidence to reject the null hypothesis of equal mean mathematics scores for technical school and architecture students.

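We carry out similar calculations for the other pairs of student types. Comparing technical school to medical technology students, with $n_1 = 23$ and $n_3 = 21$:

$$\bar{x}_{12} - \bar{x}_{32} \pm t_{n-g}\!\left(\frac{\alpha}{2m}\right)\sqrt{\frac{w_{22}}{n-g}\left(\frac{1}{n_1} + \frac{1}{n_3}\right)}$$

$$= 47.3913 - 38.0952 \pm t_{79}(0.05/24)\sqrt{\frac{14588.9983}{79}\left(\frac{1}{23} + \frac{1}{21}\right)}$$

$$= 9.296 \pm 2.951\sqrt{184.671 \times 0.0911} \;\Rightarrow\; (-2.808,\ 21.400).$$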

Architecture versus medical technology students, with $n_2 = 38$ and $n_3 = 21$:

$$\bar{x}_{22} - \bar{x}_{32} \pm t_{n-g}\!\left(\frac{\alpha}{2m}\right)\sqrt{\frac{w_{22}}{n-g}\left(\frac{1}{n_2} + \frac{1}{n_3}\right)}$$

$$= 13.089 \pm 2.951\sqrt{184.671 \times 0.0739} \;\Rightarrow\; (2.185,\ 23.993).$$

Repeat these calculations for each of the other three response variables to construct 12 confidence intervals with simultaneous 95% confidence.
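The three mathematics intervals can be reproduced from summary numbers alone. The R sketch below assumes the group sizes given when the example was introduced (23, 38 and 21, which sum to n = 82), the mathematics means quoted in the interval calculations, and w_22 from the output; the variable names are illustrative.

```r
# Bonferroni simultaneous CIs for pairwise differences in mean mathematics score.
n    <- c(technical = 23, architecture = 38, medtech = 21)   # group sizes
xbar <- c(technical = 47.3913, architecture = 51.1842, medtech = 38.0952)
w22  <- 14588.9983              # diagonal element of W for mathematics

g <- 3; p <- 4; alpha <- 0.05
m     <- p * g * (g - 1) / 2    # number of pairwise comparisons (12)
df    <- sum(n) - g             # error degrees of freedom (79)
s2    <- w22 / df               # pooled variance estimate for mathematics
tcrit <- qt(1 - alpha / (2 * m), df)   # upper alpha/(2m) quantile of t_79

pairs <- combn(names(n), 2)     # each column is one pair of groups
ci <- t(apply(pairs, 2, function(pr) {
  d  <- xbar[pr[1]] - xbar[pr[2]]
  hw <- tcrit * sqrt(s2 * (1 / n[pr[1]] + 1 / n[pr[2]]))  # half-width
  c(diff = unname(d), lower = unname(d - hw), upper = unname(d + hw))
}))
rownames(ci) <- paste(pairs[1, ], pairs[2, ], sep = " - ")
print(round(ci, 3))
```

Replacing `xbar` and `w22` with the values for another response variable gives the three intervals for that response.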

Test for Parallel Profiles

Are the differences in mean scores for each pair of tests consistent across all 3 student groups? Are the differences in mean scores between two student groups consistent across all response variables? Parallel profiles are equivalent to no group × test interaction.

The no group × test interaction null hypothesis is expressed as

$$\begin{aligned}
[(\mu_1 + \tau_{11}) - (\mu_2 + \tau_{12})] - [\mu_1 - \mu_2] &= 0 \\
[(\mu_2 + \tau_{12}) - (\mu_3 + \tau_{13})] - [\mu_2 - \mu_3] &= 0 \\
[(\mu_3 + \tau_{13}) - (\mu_4 + \tau_{14})] - [\mu_3 - \mu_4] &= 0 \\
[(\mu_1 + \tau_{21}) - (\mu_2 + \tau_{22})] - [\mu_1 - \mu_2] &= 0 \\
[(\mu_2 + \tau_{22}) - (\mu_3 + \tau_{23})] - [\mu_2 - \mu_3] &= 0 \\
[(\mu_3 + \tau_{23}) - (\mu_4 + \tau_{24})] - [\mu_3 - \mu_4] &= 0
\end{aligned}$$

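This is equivalent to

$$H_0: C_{k \times r}\,\beta_{r \times p}\,M_{p \times u} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \mu_1 & \mu_2 & \mu_3 & \mu_4 \\ \tau_{11} & \tau_{12} & \tau_{13} & \tau_{14} \\ \tau_{21} & \tau_{22} & \tau_{23} & \tau_{24} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

where
    k = 2 rows in C
    r = 3 rows in β
    p = 4 response variables
    u = 3 columns in M
    n = n_1 + n_2 + n_3 = 23 + 38 + 21 = 82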
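To see why the matrix form matches the six equations, note that each bracketed equation simplifies because the $\mu$ terms cancel, for example $[(\mu_1 + \tau_{11}) - (\mu_2 + \tau_{12})] - [\mu_1 - \mu_2] = \tau_{11} - \tau_{12}$. Multiplying out the product, with $C$ selecting the two $\tau$ rows of $\beta$ and $M$ taking successive differences of adjacent responses, gives the same six quantities:

$$C\beta M = \begin{bmatrix} \tau_{11} & \tau_{12} & \tau_{13} & \tau_{14} \\ \tau_{21} & \tau_{22} & \tau_{23} & \tau_{24} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix} = \begin{bmatrix} \tau_{11} - \tau_{12} & \tau_{12} - \tau_{13} & \tau_{13} - \tau_{14} \\ \tau_{21} - \tau_{22} & \tau_{22} - \tau_{23} & \tau_{23} - \tau_{24} \end{bmatrix}$$

Setting this $2 \times 3$ matrix to zero is therefore the same as imposing all six equations at once.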

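The value of Wilks' Lambda is

$$\Lambda = \frac{|W|}{|B + W|} = 0.569,$$

where $W$ and $B$ are now the error and hypothesis SSP matrices for the transformed responses $XM$, and

$$a = (82 - 3) - \frac{3 - 2 + 1}{2} = 78, \qquad b = \sqrt{\frac{3^2\, 2^2 - 4}{3^2 + 2^2 - 5}} = 2, \qquad c = \frac{(3)(2) - 2}{2} = 2$$

$$F = \left(\frac{1 - \Lambda^{1/b}}{\Lambda^{1/b}}\right)\left(\frac{ab - c}{uk}\right) = 8.36$$

on $(uk,\ ab - c) = (6, 154)$ degrees of freedom, with p-value < .0001.

Conclude that the mean score profiles are not parallel. For at least one pair of response variables, the difference in the mean responses is not the same for all types of students.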

Testing for parallel profiles using R

    > # M is 3 x 4 (filled by column); t(M) is the 4 x 3 matrix of successive differences
    > M <- matrix(c(1, 0, 0, -1, 1, 0, 0, -1, 1, 0, 0, -1), ncol = 4)
    > fitp <- manova((as.matrix(morel[, -1]) %*% t(M)) ~ morel[, 1])
    > summary(fitp, test = "Wilks")
               Df  Wilks approx F num Df den Df    Pr(>F)
    morel[, 1]  2 0.5689   8.3624      6    154 7.461e-08 ***
    Residuals  79
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

MANOVA using the (preferred) car package

    > morel$studentgroup <- as.factor(morel$studentgroup)
    > library(car)
    > fit.lm <- lm(cbind(aptitude, mathematics, language, generalknowledge) ~ studentgroup, data = morel)
    > fit.manova <- Manova(fit.lm)
    > summary(fit.manova)

    Type II MANOVA Tests:

    Sum of squares and products for error:
                      aptitude mathematics  language generalknowledge
    aptitude         55036.195   8140.0376 5569.9774        5490.1278
    mathematics       8140.038  14588.9983 2619.2097        -128.0217
    language          5569.977   2619.2097 4759.8655        -102.4044
    generalknowledge  5490.128   -128.0217 -102.4044        7040.6248

    Term: studentgroup

    Sum of squares and products for the hypothesis:
                      aptitude mathematics  language generalknowledge
    aptitude         24600.207    6832.584 5709.5104       -2040.4327
    mathematics       6832.584    2329.599 1570.1805       -1064.7222
    language          5709.510    1570.181 1325.6955        -455.5712
    generalknowledge -2040.433   -1064.722 -455.5712         743.4849

    Multivariate Tests: studentgroup
                     Df test stat  approx F num Df den Df     Pr(>F)
    Pillai            2 0.4926982  6.292329      8    154 4.8238e-07 ***
    Wilks             2 0.5434483  6.773565      8    152 1.3843e-07 ***
    Hotelling-Lawley  2 0.7735884  7.252392      8    150 4.0901e-08 ***
    Roy               2 0.6750591 12.994889      4     77 3.9151e-08 ***
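The four test statistics are functions of the error and hypothesis SSP matrices alone. The R sketch below recomputes them from the matrices printed in the car output (typed in by hand, so agreement is only up to the rounding of the printed values); the object names `E` and `H` are illustrative.

```r
# Error (E) and hypothesis (H) SSP matrices, copied from the printed output.
vars <- c("aptitude", "mathematics", "language", "generalknowledge")
E <- matrix(c(55036.1955,  8140.0376, 5569.9774, 5490.1278,
               8140.0376, 14588.9983, 2619.2097, -128.0217,
               5569.9774,  2619.2097, 4759.8655, -102.4044,
               5490.1278,  -128.0217, -102.4044, 7040.6248),
            nrow = 4, byrow = TRUE, dimnames = list(vars, vars))
H <- matrix(c(24600.2070,  6832.5840, 5709.5104, -2040.4327,
               6832.5840,  2329.5990, 1570.1805, -1064.7222,
               5709.5104,  1570.1805, 1325.6955,  -455.5712,
              -2040.4327, -1064.7222, -455.5712,   743.4849),
            nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

wilks     <- det(E) / det(E + H)              # |E| / |E + H|
pillai    <- sum(diag(H %*% solve(E + H)))    # trace of H (E + H)^{-1}
hotelling <- sum(diag(H %*% solve(E)))        # trace of H E^{-1}
# Largest eigenvalue of E^{-1} H; Re() guards against tiny imaginary
# parts caused by rounding in the printed matrices.
roy <- max(Re(eigen(solve(E) %*% H, only.values = TRUE)$values))

print(round(c(Wilks = wilks, Pillai = pillai,
              `Hotelling-Lawley` = hotelling, Roy = roy), 4))
```

The results can be compared with the "test stat" column of the multivariate tests table.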

    > fit.manova$SSPE
                      aptitude mathematics  language generalknowledge
    aptitude         55036.195   8140.0376 5569.9774        5490.1278
    mathematics       8140.038  14588.9983 2619.2097        -128.0217
    language          5569.977   2619.2097 4759.8655        -102.4044
    generalknowledge  5490.128   -128.0217 -102.4044        7040.6248
    > # C tests equality of group effects; M takes successive differences of responses
    > C <- matrix(c(0, 1, 0, 0, 1, -1), ncol = 3, byrow = TRUE)
    > M <- matrix(c(1, 0, 0, -1, 1, 0, 0, -1, 1, 0, 0, -1), nrow = 4, byrow = TRUE)
    > newfit <- linearHypothesis(model = fit.lm, hypothesis.matrix = C)
    > print(newfit)

    Multivariate Tests:
                     Df test stat  approx F num Df den Df     Pr(>F)
    Pillai            2 0.4926982  6.292329      8    154 4.8238e-07 ***
    Wilks             2 0.5434483  6.773565      8    152 1.3843e-07 ***
    Hotelling-Lawley  2 0.7735884  7.252392      8    150 4.0901e-08 ***
    Roy               2 0.6750591 12.994889      4     77 3.9151e-08 ***

    > newfit <- linearHypothesis(model = fit.lm, hypothesis.matrix = C, P = M)
    > print(newfit)

    Response transformation matrix:
                     [,1] [,2] [,3]
    aptitude            1    0    0
    mathematics        -1    1    0
    language            0   -1    1
    generalknowledge    0    0   -1

    Sum of squares and products for the hypothesis:
               [,1]     [,2]     [,3]
    [1,] 13264.6375 363.6553 5115.040
    [2,]   363.6553 514.9337  853.636
    [3,]  5115.0403 853.6360 2980.323

    Sum of squares and products for error:
              [,1]      [,2]      [,3]
    [1,] 53345.119 -9399.728 -2667.382
    [2,] -9399.728 14110.444 -2115.038
    [3,] -2667.382 -2115.038 12005.299

    Multivariate Tests:
                     Df test stat  approx F num Df den Df     Pr(>F)
    Pillai            2 0.4539686  7.634505      6    156 3.3709e-07 ***
    Wilks             2 0.5689034  8.362413      6    154 7.4605e-08 ***
    Hotelling-Lawley  2 0.7175639  9.089142      6    152 1.7098e-08 ***
    Roy               2 0.6563063 17.063963      3     78 1.2962e-08 ***
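The 3 × 3 error matrix in this output is the original 4 × 4 error SSP matrix transformed by the response transformation matrix M, that is, M'EM. The R sketch below checks this using the printed matrices (typed in by hand, so equality holds only up to rounding) and then recomputes Wilks' Lambda for the parallel-profiles test; the object names are illustrative.

```r
# 4 x 4 error SSP matrix from the untransformed fit (printed values).
E <- matrix(c(55036.1955,  8140.0376, 5569.9774, 5490.1278,
               8140.0376, 14588.9983, 2619.2097, -128.0217,
               5569.9774,  2619.2097, 4759.8655, -102.4044,
               5490.1278,  -128.0217, -102.4044, 7040.6248),
            nrow = 4, byrow = TRUE)
# Response transformation matrix: successive differences of the four tests.
M <- matrix(c( 1,  0,  0,
              -1,  1,  0,
               0, -1,  1,
               0,  0, -1), nrow = 4, byrow = TRUE)

E_star <- t(M) %*% E %*% M     # error SSP for the transformed responses

# Printed 3 x 3 matrices from the linearHypothesis output.
E_printed <- matrix(c(53345.119, -9399.728, -2667.382,
                      -9399.728, 14110.444, -2115.038,
                      -2667.382, -2115.038, 12005.299), nrow = 3, byrow = TRUE)
H_star <- matrix(c(13264.6375, 363.6553, 5115.0403,
                     363.6553, 514.9337,  853.6360,
                    5115.0403, 853.6360, 2980.3230), nrow = 3, byrow = TRUE)

max_abs_diff   <- max(abs(E_star - E_printed))   # should be rounding error only
wilks_parallel <- det(E_star) / det(E_star + H_star)
print(max_abs_diff)
print(round(wilks_parallel, 4))
```

The same transformation applied to the 4 × 4 hypothesis SSP matrix gives the printed 3 × 3 hypothesis matrix.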