STAT 730 Chapter 5: Hypothesis Testing
Timothy Hanson
Department of Statistics, University of South Carolina
Stat 730: Multivariate Analysis

Likelihood ratio test

def'n: Data X depend on θ. The likelihood ratio test statistic for H_0 : θ ∈ Ω_0 vs. H_1 : θ ∈ Ω_1 is

λ(X) = L*_0 / L*_1 = max_{θ∈Ω_0} L(X; θ) / max_{θ∈Ω_1} L(X; θ).

def'n: The likelihood ratio test (LRT) of size α for testing H_0 vs. H_1 rejects H_0 when λ(X) < c, where c solves sup_{θ∈Ω_0} P_θ(λ(X) < c) = α.

A very important large-sample result for LRTs is

thm: For a LRT, if Ω_1 ⊆ R^q is q-dimensional and Ω_0 ⊂ Ω_1 is r-dimensional with r < q, then under some regularity conditions −2 log λ(X) →_D χ²_{q−r}.
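
As a quick illustration of Wilks' theorem (a sketch, not from the slides): for x_1, ..., x_n iid N(θ, 1) and H_0 : θ = 0, Ω_0 is a point (r = 0) and Ω_1 = R (q = 1), and −2 log λ(X) = n x̄², which is here exactly χ²_1 under H_0.

set.seed(1)
n=100; x=rnorm(n)                          # data generated under H0
loglik=function(theta) sum(dnorm(x,theta,1,log=TRUE))
m2loglam=-2*(loglik(0)-loglik(mean(x)))    # MLE under H1 is xbar
c(m2loglam, n*mean(x)^2)                   # same closed form
1-pchisq(m2loglam,df=1)                    # chi-square p-value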

One-sample normal, H_0 : µ = µ_0

Let x_1, ..., x_n iid N_p(µ, Σ) where (µ, Σ) are unknown. We wish to test H_0 : µ = µ_0. Under H_0, µ̂ = µ_0 and Σ̂ = S + (x̄ − µ_0)(x̄ − µ_0)'. Under H_1, µ̂ = x̄ and Σ̂ = S. We will show in class that

−2 log λ(X) = n log{1 + (x̄ − µ_0)' S^{−1} (x̄ − µ_0)}.

Recall that (n − 1)(x̄ − µ_0)' S^{−1} (x̄ − µ_0) ~ T²(p, n − 1) when H_0 is true, and has a scaled F-distribution.
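
A hand computation of these quantities (a minimal sketch on simulated data; any data with n > p works) shows the link between −2 log λ, T², and the scaled F:

set.seed(1)
n=25; p=2; mu0=c(0,0)
x=matrix(rnorm(n*p),n,p)                      # simulated under H0
xbar=colMeans(x); S=(n-1)*cov(x)/n            # MLE of Sigma (divisor n)
d=xbar-mu0; q=drop(t(d)%*%solve(S)%*%d)
(n-1)*q                                       # Hotelling's T^2 ~ T^2(p,n-1)
n*log(1+q)                                    # -2 log lambda
1-pf(q*(n-p)/p,p,n-p)                         # exact p-value via scaled F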

Head lengths of 1st and 2nd sons

Frets (1921) considers data on the head length and breadth of 1st and 2nd sons from n = 25 families. Let x_i = (x_{i1}, x_{i2})' be the head lengths (mm) of the first and second sons in Frets' data. To test H_0 : µ = (182, 182)' (p. 126) we can use Michail Tsagris' hotel1t2 function in R. Note that your text incorrectly has p = 3 rather than p = 2 in the denominator of the test statistic.

install.packages("calibrate")
library("calibrate")
data(heads)
source("http://www.stat.sc.edu/~hansont/stat730/paketo.r")
colMeans(heads)
hotel1t2(heads[,c(1,3)],c(182,182),R=1) # normal theory

Bootstrapped p-value

The underlying assumption here is normality. If this assumption is suspect and the sample sizes are small, we can sometimes perform a nonparametric bootstrap. The bootstrapped approximation to the sampling distribution under H_0 is simple to obtain: resample the same number(s) of vectors with replacement, forcing H_0 upon the samples, and compute the sample statistic over and over again. This provides a Monte Carlo estimate of the sampling distribution. Michail Tsagris' function automatically obtains a bootstrapped p-value via hotel1t2(heads[,c(1,3)],c(182,182),R=2000). R is the number of bootstrapped samples used to compute the p-value; the larger R is, the more accurate the estimate, but the longer it takes.
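
A sketch of what such a bootstrap does internally (my own illustration, not Tsagris' code): force H_0 by recentering the data at µ_0, then resample rows with replacement and recompute the statistic.

xd=as.matrix(heads[,c(1,3)]); mu0=c(182,182); n=nrow(xd)
T2=function(x,mu0){ d=colMeans(x)-mu0
  (nrow(x)-1)*drop(t(d)%*%solve((nrow(x)-1)*cov(x)/nrow(x))%*%d) }
t0=T2(xd,mu0)
x0=sweep(xd,2,colMeans(xd))+rep(mu0,each=n)   # sample mean is now exactly mu0
tb=replicate(2000,T2(x0[sample(n,replace=TRUE),],mu0))
mean(tb>=t0)                                  # bootstrapped p-value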

One-sample normal, H_0 : Σ = Σ_0

Let x_1, ..., x_n iid N_p(µ, Σ) where (µ, Σ) are unknown. We wish to test H_0 : Σ = Σ_0. Under H_0, µ̂ = x̄ and Σ̂ = Σ_0. Under H_1, µ̂ = x̄ and Σ̂ = S. We will show in class that

−2 log λ(X) = n tr(Σ_0^{−1} S) − n log |Σ_0^{−1} S| − np.

Note that the statistic depends on the eigenvalues of Σ_0^{−1} S. Your book discusses small-sample results. For large samples we can use the usual χ²_{p(p+1)/2} approximation.

cov.equal(heads[,c(1,3)],diag(c(100,100))) # 1st hypothesis, p. 127
cov.equal(heads[,c(1,3)],matrix(c(100,50,50,100),2,2)) # 2nd hypothesis

A parametric bootstrapped p-value can also be obtained here.

Parametric bootstrap

The parametric bootstrap conditions on sufficient statistics under H_0, typically for nuisance parameters, to compute the p-value. Conditioning on a sufficient statistic is a common approach in hypothesis testing (e.g. Fisher's test of homogeneity for 2 × 2 tables). Since the sampling distribution relies on the underlying parametric model, a parametric bootstrap may be sensitive to parametric assumptions, unlike the nonparametric bootstrap. Recall that MLEs are functions of sufficient statistics. The parametric bootstrap proceeds by simulating R samples X*_r = [x*_{r1} ... x*_{rn}]' of size n from the parametric model under the MLE from H_0, and computing λ(X*_r). The p-value is

p̂ = (1/R) Σ_{r=1}^R I{λ(X*_r) < λ(X)}.

Let's see how this works for the test of H_0 : Σ = Σ_0.

Parametric bootstrap

Sigma0=matrix(c(100,50,50,100),2,2)
# Sigma0=matrix(c(100,0,0,100),2,2)
lambda=function(x,Sigma0){
  n=dim(x)[1]; p=dim(x)[2]
  S0inv=solve(Sigma0); S=(n-1)*cov(x)/n
  n*sum(diag(S0inv%*%S))-n*log(det(S0inv%*%S))-n*p  # -2 log lambda
}
ts=lambda(heads[,c(1,3)],Sigma0)
xd=heads[,c(1,3)]; n=dim(xd)[1]; p=dim(xd)[2]
xbar=colMeans(heads[,c(1,3)]); S0root=t(chol(Sigma0)) # MLEs under H0
R=2000; pval=0
for(i in 1:R){
  xd=t(S0root%*%matrix(rnorm(p*n),p,n)+xbar)
  if(lambda(xd,Sigma0)>ts){pval=pval+1}
}
cat("parametric bootstrapped p-value = ",pval/R,"\n")

Note that parametric bootstraps can take a long time to complete depending on the model, hypothesis, and sample size.

Union-intersection test of H_0 : µ = µ_0

Often a multivariate hypothesis can be written as the intersection of univariate hypotheses: H_0 : µ = µ_0 holds iff H_{0a} : a'µ = a'µ_0 holds for every a ∈ R^p. In this sense, H_0 = ∩_{a∈R^p} H_{0a}.

For each a we have a simple univariate hypothesis, based on the univariate data a'x_1, ..., a'x_n iid N(a'µ, a'Σa), and testable using the usual t-test

t_a = (a'x̄ − a'µ_0) / sqrt(a'S_u a / n).

We reject H_{0a} if |t_a| > t_{n−1}(1 − α/2), which means we reject H_0 if any |t_a| > t_{n−1}(1 − α/2), i.e. on the union of all rejection regions, which happens when max_{a∈R^p} |t_a| > t_{n−1}(1 − α/2).

UIT of H_0 : µ = µ_0, continued

Squaring both sides of max_{a∈R^p} |t_a| > t_{n−1}(1 − α/2), the multivariate test statistic can be rewritten

max_{a∈R^p} t²_a = max_{a∈R^p} (n − 1) [a'(x̄ − µ_0)(x̄ − µ_0)'a] / [a'Sa] = (n − 1)(x̄ − µ_0)' S^{−1} (x̄ − µ_0),

the one-sample Hotelling's T². We used a maximization result from Chapter 4 here. For H_0 : µ = µ_0, the LRT and the UIT give the same test statistic.
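
A quick numerical check of this maximization (a sketch on simulated data): random directions a never beat the closed form, and the maximizing direction is proportional to S^{−1}(x̄ − µ_0).

set.seed(1)
n=25; p=2; mu0=c(0,0)
x=matrix(rnorm(n*p),n,p)
d=colMeans(x)-mu0; S=(n-1)*cov(x)/n
(n-1)*drop(t(d)%*%solve(S)%*%d)               # closed-form maximum, T^2
t2a=function(a) (n-1)*drop(t(a)%*%d)^2/drop(t(a)%*%S%*%a)
max(replicate(5000,t2a(rnorm(p))))            # approaches, never exceeds T^2
t2a(solve(S)%*%d)                             # attains it exactly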

H_0 : Rµ = r

Here, R is q × p and r ∈ R^q. From Chapter 4 we have µ̂ = x̄ − SR'[RSR']^{−1}(Rx̄ − r). We can then show

(n − 1)[λ(X)^{−2/n} − 1] = (n − 1)(Rx̄ − r)'(RSR')^{−1}(Rx̄ − r).

Your book shows that (n − 1)[λ(X)^{−2/n} − 1] ~ T²(q, n − 1). This is a special case of the general test discussed in Chapter 6.

If µ = (µ_1', µ_2')', then H_0 : µ_2 = 0 is this type of test, where R = [0 I] and r = 0. H_0 : µ = κµ_0 where µ_0 is given is also a special case; here the k = p − 1 rows of R are orthogonal to µ_0 and r = 0. The UIT gives the same test statistic.
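
For example (a sketch with my own choice of restriction): testing equal mean head lengths for 1st and 2nd sons in the heads data takes R = (1, −1) and r = 0.

xd=as.matrix(heads[,c(1,3)]); n=nrow(xd); q=1
Rmat=matrix(c(1,-1),1,2); r=0
S=(n-1)*cov(xd)/n
d=Rmat%*%colMeans(xd)-r
T2=(n-1)*drop(d%*%solve(Rmat%*%S%*%t(Rmat))%*%t(d))  # ~ T^2(q,n-1)
1-pf(T2*(n-q)/((n-1)*q),q,n-q)                       # scaled F p-value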

Test for sphericity, H_0 : Σ = κI_p

On p. 134, MKB derive

−2 log λ(X) = np log{ (tr S / p) / |S|^{1/p} }.

Asymptotically, this is χ²_{(p−1)(p+2)/2}.

ts=function(x){
  n=dim(x)[1]; p=dim(x)[2]
  S=cov(x)*(n-1)/n
  d=n*p*(log(sum(diag(S))/p)-log(det(S))/p)
  cat("asymptotic p-value = ",1-pchisq(d,0.5*(p-1)*(p+2)),"\n")
}
ts(heads)

A parametric bootstrap can also be employed. R has mauchly.test built in for testing sphericity.

Testing H_0 : Σ_12 = 0, i.e. x_{1i} independent of x_{2i}

Let x_i = (x_{1i}', x_{2i}')', where x_{1i} ∈ R^{p_1}, x_{2i} ∈ R^{p_2}, and p_1 < p_2. We can always rearrange elements via a permutation matrix to test that any subset is independent of another subset. Partition Σ and S accordingly. MKB (p. 135) find

−2 log λ(X) = −n Σ_{i=1}^{p_1} log(1 − λ_i) = −n log |I − S_{11}^{−1} S_{12} S_{22}^{−1} S_{21}|,

where the λ_i are the eigenvalues of S_{11}^{−1} S_{12} S_{22}^{−1} S_{21}.

Exactly, λ^{2/n} ~ Λ(p_1, n − 1 − p_2, p_2), assuming n − 1 ≥ p. When both p_i > 2 we can use

−(n − (p + 3)/2) log |I − S_{11}^{−1} S_{12} S_{22}^{−1} S_{21}| ≈ χ²_{p_1 p_2}.

If p_1 = 1 then −2 log λ = −n log(1 − R²), where R² is the sample multiple correlation coefficient between x_{1i} and x_{2i}, discussed in the next chapter.

The UIT gives the largest eigenvalue λ_1 of S_{11}^{−1} S_{12} S_{22}^{−1} S_{21} as the test statistic, different from the LRT.

Head length/breadth data

Say we want to test that the head lengths/breadths are independent between sons. Let x_i = (x_{i1}, x_{i2}, x_{i3}, x_{i4})' be the 1st son's length & breadth followed by the 2nd son's. We want to test H_0 : Σ = block-diag(Σ_11, Σ_22), where all submatrices are 2 × 2.

S=24*cov(heads)/25
S2=solve(S[1:2,1:2])%*%S[1:2,3:4]%*%solve(S[3:4,3:4])%*%S[3:4,1:2]
ts=det(diag(2)-S2) # ~ WL(p1,n-1-p2,p2) = WL(2,22,2)
# p. 83 => (21/2)*(1-sqrt(ts))/sqrt(ts) ~ F(4,42)
1-pf((21/2)*(1-sqrt(ts))/sqrt(ts),4,42)
ats=-(25-0.5*7)*log(ts) # ~ chisq(4) asymptotically
1-pchisq(ats,4) # asymptotics work great here!

What do we conclude about how head shape is related between 1st and 2nd sons?

H_0 : Σ is diagonal

We may want to test that all variables are independent. Under H_0, µ̂ = x̄ and Σ̂ = diag(s_11, ..., s_pp). Then

−2 log λ(X) = −n log |R|,

where R is the sample correlation matrix. This is approximately χ²_{p(p−1)/2} in large samples. The parametric bootstrap can be used here as well.

1-pchisq(-25*log(det(cov2cor(S))),6)

One-way MANOVA, LRT

We want to test H_0 : µ_1 = ... = µ_k assuming Σ_1 = ... = Σ_k = Σ. Under H_0, µ̂ = x̄ and Σ̂_0 = S from the entire sample. Under H_a, µ̂_i = x̄_i and Σ̂_a = (1/n) Σ_{i=1}^k n_i S_i. Here, n = n_1 + ... + n_k. Then

λ(X)^{2/n} = |Σ̂_a| / |Σ̂_0| = |W| / |T| = |WT^{−1}|,

where T = nS is the total sums of squares and cross products (SSCP) matrix and W = Σ_{i=1}^k n_i S_i is the SSCP for error, or within-groups SSCP. The SSCP for regression is B = T − W, or between-groups SSCP. Then

λ(X)^{2/n} = |W| / |B + W| = 1 / |I_p + W^{−1}B|.
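
A sketch of these SSCP computations in R, using the iris data (which reappears below); T = nS is just (n − 1) times the usual sample covariance.

X=as.matrix(iris[,1:4]); g=iris$Species; n=nrow(X)
Tot=(n-1)*cov(X)                              # total SSCP, T = nS
W=Reduce(`+`,lapply(split(as.data.frame(X),g),
  function(xi)(nrow(xi)-1)*cov(as.matrix(xi)))) # within SSCP, sum of n_i S_i
B=Tot-W                                       # between SSCP
det(W)/det(Tot)                               # = lambda^(2/n), Wilks' lambda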

A bit different than the usual F-test

When p = 1 we have

λ(X)^{2/n} = 1 / |I_p + W^{−1}B| = 1 / (1 + SSR/SSE) = 1 / (1 + ((k − 1)/(n − k)) F),

where F = MSR/MSE. The usual F-statistic is a monotone function of λ(X) and vice-versa. For p = 1, λ(X)^{2/n} is distributed beta.
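
A quick check of this relation (a sketch using one column of iris): λ^{2/n} = SSE/SST matches 1/(1 + ((k − 1)/(n − k))F).

y=iris[,1]; g=iris$Species; n=length(y); k=nlevels(g)
fit=aov(y~g)
Fstat=summary(fit)[[1]]$"F value"[1]
SSE=sum(resid(fit)^2); SST=sum((y-mean(y))^2)
c(1/(1+Fstat*(k-1)/(n-k)), SSE/SST)           # agree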

ANOVA decomposition

Let X (n × p) = [X_1', ..., X_k']' be the k data matrices (d.m.) stacked on top of each other. Recall that for p = 1,

SSE = Σ_{i=1}^k Σ_{j=1}^{n_i} (x_{ij} − x̄_i)² = X'[I_n − P_Z]X ≝ X'C_1X,

where P_Z = Z(Z'Z)^{−1}Z' and Z = block-diag(1_{n_1}, ..., 1_{n_k}). For p > 1 it is the same:

E = Σ_{i=1}^k Σ_{j=1}^{n_i} (x_{ij} − x̄_i)(x_{ij} − x̄_i)' = X'[I_n − P_Z]X = X'C_1X.

Note that P_Z = block-diag((1/n_1) 1_{n_1}1_{n_1}', ..., (1/n_k) 1_{n_k}1_{n_k}').

ANOVA decomposition

Similarly, for p = 1,

SSR = Σ_{i=1}^k n_i (x̄_i − x̄)² = X'[P_Z − P_{1_n}]X ≝ X'C_2X,

where P_{1_n} = (1/n) 1_n 1_n'. This generalizes for p > 1 to

B = Σ_{i=1}^k n_i (x̄_i − x̄)(x̄_i − x̄)' = X'[P_Z − P_{1_n}]X = X'C_2X.

ANOVA decomposition

Finally, for p = 1 we have

SST = Σ_{i=1}^k Σ_{j=1}^{n_i} (x_{ij} − x̄)² = X'[I_n − P_{1_n}]X,

which generalizes for p > 1 to

T = Σ_{i=1}^k Σ_{j=1}^{n_i} (x_{ij} − x̄)(x_{ij} − x̄)' = X'[I_n − P_{1_n}]X.

Now note that

T = X'[I_n − P_{1_n}]X = X'[I_n − P_Z + P_Z − P_{1_n}]X = X'[I_n − P_Z]X + X'[P_Z − P_{1_n}]X = E + B.
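
The projection-matrix identities are easy to verify numerically (a sketch on iris; here Z is the group-indicator matrix, which is block-diagonal once rows are sorted by group, though the projection P_Z is the same either way).

X=as.matrix(iris[,1:4]); g=iris$Species; n=nrow(X)
Z=model.matrix(~g-1)                          # group indicators
PZ=Z%*%solve(t(Z)%*%Z)%*%t(Z); P1=matrix(1/n,n,n)
E=t(X)%*%(diag(n)-PZ)%*%X                     # X'C1X, within SSCP
B=t(X)%*%(PZ-P1)%*%X                          # X'C2X, between SSCP
all.equal(E+B,t(X)%*%(diag(n)-P1)%*%X)        # T = E + B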

One-way MANOVA, LRT

Note that X is a d.m. from N_p(µ, Σ) under H_0 : µ_1 = ... = µ_k. Then W = X'C_1X and B = X'C_2X, where C_1 and C_2 are projection matrices of ranks n − k and k − 1, and C_1C_2 = 0. Using Cochran's theorem and Craig's theorem,

W ~ W_p(Σ, n − k) independent of B ~ W_p(Σ, k − 1),

so then

λ^{2/n} ~ Λ(p, n − k, k − 1)

under H_0, provided n ≥ p + k.

One-way MANOVA, UIT

On p. 139 your book argues that the test statistic for the UIT is the largest eigenvalue of W^{−1}B. The LRT and UIT lead to different test statistics, but they are based on the same matrix.

There are actually four statistics in common use in multivariate regression settings. Let λ_1 > ... > λ_p be the eigenvalues of W^{−1}B. Then Roy's greatest root is λ_1, Wilks' lambda is Π_{i=1}^p 1/(1 + λ_i), the Pillai-Bartlett trace is Σ_{i=1}^p λ_i/(1 + λ_i), and the Hotelling-Lawley trace is Σ_{i=1}^p λ_i.

Note that the one-way model is a special case of the general regression model in Chapter 6. MANOVA is further explored in Chapter 12. Hotelling's two-sample test of H_0 : µ_1 = µ_2 is a special case of MANOVA where k = 2. In this case, Wilks' lambda boils down to

λ^{−2/n} = 1 + (n_1 n_2 / n²)(x̄_1 − x̄_2)' S_u^{−1} (x̄_1 − x̄_2).

The last term differs from Hotelling's two-sample test statistic by a factor of n.
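
A sketch computing all four statistics from the eigenvalues of W^{−1}B for the iris data (W and B as computed earlier); summary(manova(...)) with the test= argument set accordingly reproduces them.

X=as.matrix(iris[,1:4]); g=iris$Species; n=nrow(X)
Tot=(n-1)*cov(X)
W=Reduce(`+`,lapply(split(as.data.frame(X),g),
  function(xi)(nrow(xi)-1)*cov(as.matrix(xi))))
lam=Re(eigen(solve(W)%*%(Tot-W))$values)      # e-values of W^{-1}B
c(roy=max(lam),wilks=prod(1/(1+lam)),
  pillai=sum(lam/(1+lam)),hotlaw=sum(lam))
summary(manova(X~g),test="Wilks")             # compare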

Behrens-Fisher problem

How do we compare two means, H_0 : µ_1 = µ_2, when Σ_1 ≠ Σ_2? A standard LRT approach works but requires numerical optimization to obtain the MLEs; your book describes an iterative procedure. The UIT approach fares better here. On pp. 143-144 a UIT test procedure is described that yields a test statistic approximately distributed Hotelling's T², with degrees of freedom computed using Welch's (1947) approximation. Tsagris instead implements an approach due to James (1954).

Behrens-Fisher problem

An alternative, for k ≥ 2, is to test H_0 : µ_1 = ... = µ_k, Σ_1 = ... = Σ_k (complete homogeneity) vs. the alternative that the means and covariances are all different. This is the hypothesis that the data all come from one population vs. k separate populations. MKB pp. 141-142 show

−2 log λ = n log |(1/n) Σ_{i=1}^k n_i S_i| − Σ_{i=1}^k n_i log |S_i|.

This is asymptotically χ²_{p(k−1)(p+3)/2}. The James (1954) approach can be extended to k ≥ 2; Tsagris implements this in maovjames.

Dental data

library(reshape) # to reshape the data from long to wide
library(car)     # allows for multivariate linear hypotheses
library(heavy)   # has dental data
data(dental)
d2=cast(melt(dental,id=c("Subject","age","Sex")),Subject+Sex~age)
names(d2)[3:6]=c("d8","d10","d12","d14")
# Hotelling's T^2 test, exact p-value and bootstrapped
hotel2t2(d2[d2$Sex=="Male",3:6],d2[d2$Sex=="Female",3:6],R=1)
hotel2t2(d2[d2$Sex=="Male",3:6],d2[d2$Sex=="Female",3:6],R=2000)
# MANOVA gives same p-value as Hotelling's parametric test
f=lm(cbind(d8,d10,d12,d14)~Sex,data=d2)
summary(Anova(f))
# James' tests do not assume the same covariance matrices
james(d2[d2$Sex=="Male",3:6],d2[d2$Sex=="Female",3:6],R=1)
james(d2[d2$Sex=="Male",3:6],d2[d2$Sex=="Female",3:6],R=2000)

Iris data

Here's a one-way MANOVA on the iris data.

library(car)
scatterplotMatrix(~Sepal.Length+Sepal.Width+Petal.Length+Petal.Width|Species,
  data=iris,smooth=FALSE,reg.line=FALSE,ellipse=TRUE,by.groups=TRUE,diagonal="none")
f=lm(cbind(Sepal.Length,Sepal.Width,Petal.Length,Petal.Width)~Species,data=iris)
summary(Anova(f))
f=manova(cbind(Sepal.Length,Sepal.Width,Petal.Length,Petal.Width)~Species,data=iris)
summary(f) # can ask for other tests besides Pillai
maovjames(iris[,1:4],as.numeric(iris$Species),R=1000) # bootstrapped

Homogeneity of covariance

Now we test H_0 : Σ_1 = ... = Σ_k in the general model with differing means and covariances. Your book argues

−2 log λ(X) = n log |S| − Σ_{i=1}^k n_i log |S_i| = Σ_{i=1}^k n_i log |S_i^{−1}S|.

Asymptotically, this has a χ²_{p(p+1)(k−1)/2} distribution. Using Tsagris' functions, try

cov.likel(iris[,1:4],as.numeric(iris$Species))
cov.mtest(iris[,1:4],as.numeric(iris$Species)) # Box's M-test

Your book discusses that the test can be improved in smaller samples (Box, 1949).
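
A hand computation of this statistic on iris (a sketch of what cov.likel reports); S here is the pooled MLE (1/n) Σ n_i S_i.

X=as.matrix(iris[,1:4]); g=iris$Species
n=nrow(X); k=nlevels(g); p=ncol(X)
Si=lapply(split(as.data.frame(X),g),
  function(xi)(nrow(xi)-1)*cov(as.matrix(xi))/nrow(xi)) # group MLEs
ni=sapply(split(as.data.frame(X),g),nrow)
S=Reduce(`+`,Map(`*`,ni,Si))/n                # pooled MLE
m2ll=n*log(det(S))-sum(ni*log(sapply(Si,det)))
1-pchisq(m2ll,p*(p+1)*(k-1)/2)                # asymptotic p-value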

Non-normal data

MKB suggest the use of multivariate skew and kurtosis statistics for assessing multivariate normality (p. 149). They further point out that, broadly, for non-normal data the normal-theory tests on means are sensitive to β_{1,p} whereas tests on covariance are sensitive to β_{2,p}. Both tests, along with some others, are performed in the MVN package. They are also in the psych package.

library(MVN)
cork=read.table("http://www.stat.sc.edu/~hansont/stat730/cork.txt",header=TRUE)
mardiaTest(cork,qqplot=TRUE)

Another option is to examine the sample Mahalanobis distances

D_i² = (x_i − x̄)' S^{−1} (x_i − x̄),

or stratified versions of these for multi-sample situations. These are approximately χ²_p. The mardiaTest function provides a Q-Q plot of the M-distances compared to what is expected under multivariate normality.
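
The Q-Q plot is also easy to draw by hand (a sketch mirroring what mardiaTest's qqplot option displays), using the cork data read in above:

D2=mahalanobis(cork,colMeans(cork),cov(cork)) # sample M-distances
n=nrow(cork); p=ncol(cork)
qqplot(qchisq(ppoints(n),df=p),sort(D2),
  xlab="chi-square quantiles",ylab="ordered D_i^2")
abline(0,1)                                   # reference line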