Lecture 5: Hypothesis tests for more than one sample
Måns Thulin, Department of Mathematics, Uppsala University
thulin@math.uu.se
Multivariate Methods, 8/4 2011
Outline
- Paired comparisons
- Repeated measures
- Comparing mean vectors from two populations
- Comparing mean vectors from more than two populations
- MANOVA
Repetition: Testing $H_0: \mu = \mu_0$

Let $X \sim N_p(\mu, \Sigma)$. When testing the hypothesis $H_0: \mu = \mu_0$, we use Hotelling's $T^2$:

$$T^2 = n(\bar{X} - \mu_0)' S^{-1} (\bar{X} - \mu_0).$$

Under $H_0$,

$$\frac{n - p}{(n - 1)p} T^2 \sim F_{p, n-p}.$$

The $T^2$ test therefore rejects $H_0: \mu = \mu_0$ at level $\alpha$ if

$$T^2 > \frac{(n - 1)p}{n - p} F_{p, n-p}(\alpha).$$

Similarly, the p-value of the test is obtained as $p = P(T^2 > x)$, where $x$ is the observed value of the statistic.
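As a numerical illustration of the test above (not part of the original slides): the sketch below assumes NumPy and SciPy are available; the function name and the simulated data are my own.

```python
import numpy as np
from scipy import stats

def hotelling_t2_one_sample(X, mu0):
    """One-sample Hotelling's T^2 test of H0: mu = mu0.

    X is an (n, p) data matrix; returns T^2 and the p-value from the
    exact F reference distribution (n - p)/((n - 1)p) T^2 ~ F(p, n-p).
    """
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.atleast_2d(np.cov(X, rowvar=False))  # sample covariance, divisor n - 1
    diff = xbar - mu0
    t2 = n * diff @ np.linalg.solve(S, diff)
    f_stat = (n - p) / ((n - 1) * p) * t2
    return t2, stats.f.sf(f_stat, p, n - p)

# Toy usage on simulated bivariate data (seed and sizes are arbitrary)
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
t2, pval = hotelling_t2_one_sample(X, np.zeros(2))
```

For $p = 1$ the statistic reduces to the square of the ordinary two-sided one-sample t statistic, which gives a handy sanity check.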
Paired comparisons

If we wish to study the effect of a treatment, it is often desirable to measure the response variables of interest on a single unit before and after the treatment is applied to that unit. This procedure eliminates unit-to-unit variation.

Examples: pH of lakes before and after chalk was added, health of patients before and after medication, people's view of Uppsala University before and after a nationwide advertising campaign...

Similarly, if we wish to compare two treatments, we can apply both treatments to the same (or identical) experimental units. Such experimental designs are called paired comparisons, since the measurements are made in pairs.
Paired comparisons: Hotelling's $T^2$

Let $X_{j1}$ denote the response to treatment 1 and $X_{j2}$ the response to treatment 2 for experimental unit $j$. If $X_{j1}$ and $X_{j2}$ are multivariate normal, then $D_j = X_{j1} - X_{j2} \sim N_p(\delta, \Sigma_d)$, where $\delta$ is the mean difference between the treatments.

If the treatments are applied independently to $n$ independent units, so that $D_1, \ldots, D_n$ are independent $N_p(\delta, \Sigma_d)$ random vectors, then

$$T^2 = n(\bar{D} - \delta)' S_d^{-1} (\bar{D} - \delta) \sim \frac{(n - 1)p}{n - p} F_{p, n-p}.$$

This is simply the result about Hotelling's $T^2$ from the last lecture. The problem of comparing the two samples $X_{11}, \ldots, X_{1n}$ and $X_{21}, \ldots, X_{2n}$ is reduced to the familiar one-sample problem by looking at the pairwise differences.
Paired comparisons: testing

The hypothesis $H_0: \delta = 0$ is rejected in favour of the alternative $H_1: \delta \neq 0$ if

$$T^2 = n \bar{d}' S_d^{-1} \bar{d} > \frac{(n - 1)p}{n - p} F_{p, n-p}(\alpha),$$

where $d_j = (d_{j1}, d_{j2}, \ldots, d_{jp})'$, $j = 1, \ldots, n$, are the observed differences between the $n$ pairs of measurements.

A confidence region for $\delta$ with confidence level $1 - \alpha$ consists of all $\delta$ such that

$$(\bar{d} - \delta)' S_d^{-1} (\bar{d} - \delta) \leq \frac{(n - 1)p}{n(n - p)} F_{p, n-p}(\alpha).$$

The simultaneous Bonferroni confidence intervals for the individual mean differences $\delta_i$ are given by

$$I_{\delta_i} = \left( \bar{d}_i \pm t_{n-1}\!\left(\frac{\alpha}{2p}\right) \sqrt{s_{d_i}^2 / n} \right).$$
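In code, the paired test is just the one-sample $T^2$ applied to the differences. A minimal sketch, assuming NumPy/SciPy; the function name and simulated before/after data are illustrative only:

```python
import numpy as np
from scipy import stats

def paired_t2(X1, X2):
    """Paired Hotelling's T^2 test of H0: delta = 0.

    X1, X2 are (n, p) matrices of responses to treatments 1 and 2 on
    the same n units; the test is the one-sample T^2 on D = X1 - X2.
    """
    D = X1 - X2
    n, p = D.shape
    dbar = D.mean(axis=0)
    Sd = np.atleast_2d(np.cov(D, rowvar=False))
    t2 = n * dbar @ np.linalg.solve(Sd, dbar)
    f_stat = (n - p) / ((n - 1) * p) * t2
    return t2, stats.f.sf(f_stat, p, n - p)

# Illustrative "before/after" measurements on 12 units, 3 variables
rng = np.random.default_rng(1)
before = rng.normal(size=(12, 3))
after = before + rng.normal(scale=0.5, size=(12, 3))
t2, pval = paired_t2(before, after)
```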
Paired comparisons: contrasts

With a little matrix algebra, it is not necessary to calculate all the differences explicitly. Instead, we can use contrast matrices. See blackboard!

On the other hand, it may be advisable to calculate the differences $d_1, \ldots, d_n$ anyway, in order to assess their normality.

The notion of contrast matrices can also be used for repeated measures designs.
Repeated measures

A situation similar to the one we just studied arises when we wish to compare the effects of $q$ different treatments on a single response variable. Let $X_1, \ldots, X_n$ be i.i.d. $N_q(\mu, \Sigma)$ observations, with $X_j = (X_{j1}, X_{j2}, \ldots, X_{jq})'$, where $X_{ji}$ is the response to the $i$th treatment on the $j$th experimental unit.

Typically, we wish to test the hypothesis that there is no difference between the treatment means. This is stated using contrast matrices. See blackboard!
Repeated measures: testing

When the treatment means are equal, $C_1\mu = C_2\mu = 0$. In fact, $C\mu = 0$ for any contrast matrix $C$. Given $C$, we can compute the observed contrasts $Cx_j$, with mean $C\bar{x}$ and sample covariance $CSC'$. The hypothesis $C\mu = 0$ is tested using

$$T^2 = n(C\bar{x})' (CSC')^{-1} (C\bar{x}) \sim \frac{(q - 1)(n - 1)}{n - q + 1} F_{q-1, n-q+1}$$

under $H_0$. The statistic $T^2$ is independent of the choice of contrast matrix $C$.
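The contrast-based test, and its invariance to the choice of $C$, can be sketched numerically as follows (NumPy/SciPy assumed; the function name and default contrast matrix are my own choices, not from the slides):

```python
import numpy as np
from scipy import stats

def repeated_measures_t2(X, C=None):
    """Test of equal treatment means in a repeated measures design.

    X is (n, q): one row per unit, one column per treatment.  C is a
    (q-1) x q contrast matrix; the default takes successive
    differences mu_i - mu_{i+1}.  Tests H0: C mu = 0.
    """
    n, q = X.shape
    if C is None:
        C = np.eye(q - 1, q) - np.eye(q - 1, q, k=1)  # successive differences
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    Cx = C @ xbar
    t2 = n * Cx @ np.linalg.solve(C @ S @ C.T, Cx)
    f_stat = (n - q + 1) / ((n - 1) * (q - 1)) * t2
    return t2, stats.f.sf(f_stat, q - 1, n - q + 1)

# Illustrative data: 25 units, 4 treatments
rng = np.random.default_rng(2)
X = rng.normal(size=(25, 4))
t2, pval = repeated_measures_t2(X)
```

Any full-rank contrast matrix gives the same $T^2$, which is easy to verify by passing, say, the contrasts $\mu_1 - \mu_i$ instead.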
Comparing mean vectors from two populations

Often we wish to compare the mean vectors of two populations in situations where it isn't possible to use paired comparisons. Assume that we have a $p$-variate sample $X_{11}, X_{12}, \ldots, X_{1n_1}$ from a distribution with mean $\mu_1$ and covariance $\Sigma_1$, and a $p$-variate sample $X_{21}, X_{22}, \ldots, X_{2n_2}$ from a distribution with mean $\mu_2$ and covariance $\Sigma_2$. Furthermore, assume that the two samples are independent.

We wish to test the hypothesis that $\mu_1 - \mu_2 = \delta_0$.
Two populations: Hotelling's $T^2$

In order to construct a test statistic for this hypothesis, we think about how Hotelling's $T^2$ is constructed in the one-sample case. See blackboard!

Result 6.2. If $X_{11}, X_{12}, \ldots, X_{1n_1}$ are i.i.d. $N_p(\mu_1, \Sigma)$ and $X_{21}, X_{22}, \ldots, X_{2n_2}$ are i.i.d. $N_p(\mu_2, \Sigma)$, then

$$T^2 = \left(\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)\right)' \left[\left(\frac{1}{n_1} + \frac{1}{n_2}\right) S_p\right]^{-1} \left(\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)\right) \sim \frac{(n_1 + n_2 - 2)p}{n_1 + n_2 - p - 1} F_{p, n_1+n_2-p-1}.$$

The assumption that the covariance matrices are equal is quite strong! There are $p$ variances and $p(p-1)/2$ distinct covariances in the covariance matrix. On the other hand, the real null hypothesis may be that the distributions, and not just the mean vectors, are equal for the two treatments.
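Result 6.2 translates directly into code. A sketch assuming NumPy/SciPy, testing $H_0: \mu_1 = \mu_2$ with the pooled covariance estimate (function name and simulated samples are illustrative):

```python
import numpy as np
from scipy import stats

def two_sample_t2(X1, X2):
    """Two-sample Hotelling's T^2 with pooled covariance.

    Tests H0: mu1 = mu2 under the assumption Sigma1 = Sigma2,
    using the F reference distribution from Result 6.2.
    """
    n1, p = X1.shape
    n2, _ = X2.shape
    S1 = np.atleast_2d(np.cov(X1, rowvar=False))
    S2 = np.atleast_2d(np.cov(X2, rowvar=False))
    Sp = ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)  # pooled covariance
    diff = X1.mean(axis=0) - X2.mean(axis=0)
    t2 = diff @ np.linalg.solve((1 / n1 + 1 / n2) * Sp, diff)
    df2 = n1 + n2 - p - 1
    f_stat = df2 / ((n1 + n2 - 2) * p) * t2
    return t2, stats.f.sf(f_stat, p, df2)

# Illustrative independent samples of unequal size
rng = np.random.default_rng(3)
X1 = rng.normal(size=(20, 2))
X2 = rng.normal(size=(25, 2))
t2, pval = two_sample_t2(X1, X2)
```

For $p = 1$ this reduces to the classical pooled-variance two-sample t-test.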
Two populations: the Behrens-Fisher problem

The problem of making inferences about the means of two (univariate) normal populations without assuming that the variances are equal is called the Behrens-Fisher problem. Different approaches to this problem have been proposed by Fisher, Behrens, Chapman, Dudewicz and Ahmed, among others.

The most commonly used solution was given by Welch, who proposed a t-test using $s_d^2 = s_1^2/n_1 + s_2^2/n_2$. His statistic is approximately t-distributed, with a complicated expression for the degrees of freedom.

Further reading: Kim, S.-H., Cohen, A.S. (1998): On the Behrens-Fisher problem: a review. Journal of Educational and Behavioral Statistics, 23, pp. 356-377.
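In practice, Welch's test is readily available: SciPy's `ttest_ind` with `equal_var=False` uses exactly $s_d^2 = s_1^2/n_1 + s_2^2/n_2$ together with the approximate (Welch-Satterthwaite) degrees of freedom. The sample sizes and scales below are arbitrary illustration values:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.normal(scale=1.0, size=30)
y = rng.normal(scale=3.0, size=10)          # unequal variances and n

# Welch's approximate t-test vs. the pooled-variance t-test
t_welch, p_welch = stats.ttest_ind(x, y, equal_var=False)
t_pooled, p_pooled = stats.ttest_ind(x, y, equal_var=True)
```

When the variances and sample sizes differ markedly, the two tests can give noticeably different p-values, which is the point of the Behrens-Fisher discussion.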
Two populations: the Behrens-Fisher problem

When comparing mean vectors of two multivariate normal populations with unequal covariance matrices, the problem becomes even more complicated. Some possible solutions are:

- Use that

$$T^2 = \left(\bar{X}_1 - \bar{X}_2 - \delta_0\right)' \left(\frac{1}{n_1} S_1 + \frac{1}{n_2} S_2\right)^{-1} \left(\bar{X}_1 - \bar{X}_2 - \delta_0\right) \sim \chi^2_p$$

approximately under $H_0$ when $n_1 - p$ and $n_2 - p$ are large, even if the data is non-normal.

- Use that, for normal data, $T^2$ above is approximately distributed as

$$\frac{\nu p}{\nu - p + 1} F_{p, \nu - p + 1},$$

where $\nu$ is given by the complicated expression (6-29) in J&W.

- Use a different, more robust, test! (e.g. Tiku and Singh (1982))
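The first, large-sample solution is simple to implement since no pooling is required. A sketch assuming NumPy/SciPy (function name and simulated data are illustrative); note that the second sample is deliberately given a different covariance:

```python
import numpy as np
from scipy import stats

def two_sample_t2_unequal(X1, X2, delta0=None):
    """Large-sample test of H0: mu1 - mu2 = delta0 without assuming
    equal covariance matrices; T^2 is referred to chi^2_p."""
    n1, p = X1.shape
    n2, _ = X2.shape
    if delta0 is None:
        delta0 = np.zeros(p)
    V = (np.atleast_2d(np.cov(X1, rowvar=False)) / n1
         + np.atleast_2d(np.cov(X2, rowvar=False)) / n2)
    diff = X1.mean(axis=0) - X2.mean(axis=0) - delta0
    t2 = diff @ np.linalg.solve(V, diff)
    return t2, stats.chi2.sf(t2, p)

rng = np.random.default_rng(4)
X1 = rng.normal(size=(60, 3))
X2 = 2.0 * rng.normal(size=(80, 3))   # deliberately different covariance
t2, pval = two_sample_t2_unequal(X1, X2)
```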
MANOVA: Multivariate ANalysis Of VAriance

Now let's assume that we have observations from $g$ populations:

Population 1: $X_{11}, X_{12}, \ldots, X_{1n_1}$
Population 2: $X_{21}, X_{22}, \ldots, X_{2n_2}$
...
Population $g$: $X_{g1}, X_{g2}, \ldots, X_{gn_g}$

and that we wish to test the hypothesis that all populations have the same mean. If there are differences, we'd like to be able to say which means differ.
MANOVA: Assumptions

For MANOVA, we make the following assumptions:
- $X_{l1}, X_{l2}, \ldots, X_{ln_l}$ are i.i.d. with mean $\mu_l$, $l = 1, 2, \ldots, g$.
- The samples from different populations are independent.
- All populations have the same covariance matrix $\Sigma$.
- The populations are multivariate normal.

If the sample sizes are large, MANOVA can be used as an approximate method, due to the multivariate central limit theorem.
MANOVA: Model

Linear model:

$$X_{lj} = \mu + \tau_l + e_{lj}, \quad j = 1, 2, \ldots, n_l, \quad l = 1, 2, \ldots, g,$$

where the $e_{lj}$ are independent $N_p(0, \Sigma)$ variables. Here the parameter vector $\mu$ is an overall mean and $\tau_l$ represents the $l$th treatment effect, with

$$\sum_{l=1}^{g} n_l \tau_l = 0.$$

We wish to test

$$H_0: \tau_1 = \tau_2 = \ldots = \tau_g$$

against the hypothesis that at least two effects differ.
MANOVA: Sums of squares and cross products

In analogy with univariate ANOVA, the total sum of squares (and cross products) is partitioned into different sources of variation:

$$\sum_{l=1}^{g} \sum_{j=1}^{n_l} (x_{lj} - \bar{x})(x_{lj} - \bar{x})' = \sum_{l=1}^{g} n_l (\bar{x}_l - \bar{x})(\bar{x}_l - \bar{x})' + \sum_{l=1}^{g} \sum_{j=1}^{n_l} (x_{lj} - \bar{x}_l)(x_{lj} - \bar{x}_l)' = B + W,$$

where $B$ is the treatment (Between) sum of squares and cross products and $W$ is the residual (Within) sum of squares and cross products. $B$ and $W$ are $p \times p$ matrices. The latter can be rewritten as

$$W = (n_1 - 1)S_1 + (n_2 - 1)S_2 + \ldots + (n_g - 1)S_g.$$
MANOVA: Test statistic

In univariate ANOVA, $H_0: \tau_1 = \tau_2 = \ldots = \tau_g$ is tested by studying a suitable rescaling of $SS_{Tr}/SS_{Res}$. This is equivalent to studying $1 + SS_{Tr}/SS_{Res} = (SS_{Res} + SS_{Tr})/SS_{Res}$, which in turn is equivalent to studying $SS_{Res}/(SS_{Tr} + SS_{Res})$.

We would like to construct a similar statistic for MANOVA, but ratios of matrices are not defined. Wilks suggested using the statistic known as Wilks' lambda:

$$\Lambda = \frac{\det W}{\det(B + W)}.$$
MANOVA: Distribution of Wilks' $\Lambda$

What can be said about the distribution of $\Lambda = \det W / \det(B + W)$? Let $N = \sum_{l=1}^{g} n_l$. Then we have the following

Exact results:
- $p = 1$, $g \geq 2$: $\dfrac{N - g}{g - 1} \cdot \dfrac{1 - \Lambda}{\Lambda} \sim F_{g-1, N-g}$
- $p = 2$, $g \geq 2$: $\dfrac{N - g - 1}{g - 1} \cdot \dfrac{1 - \sqrt{\Lambda}}{\sqrt{\Lambda}} \sim F_{2(g-1), 2(N-g-1)}$
- $p \geq 1$, $g = 2$: $\dfrac{N - p - 1}{p} \cdot \dfrac{1 - \Lambda}{\Lambda} \sim F_{p, N-p-1}$
- $p \geq 1$, $g = 3$: $\dfrac{N - p - 2}{p} \cdot \dfrac{1 - \sqrt{\Lambda}}{\sqrt{\Lambda}} \sim F_{2p, 2(N-p-2)}$

Approximate result (for $N$ large):

$$-\left(N - 1 - \frac{p + g}{2}\right) \ln \Lambda \sim \chi^2_{p(g-1)}.$$
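A one-way MANOVA via Wilks' lambda and the large-sample chi-square approximation can be sketched as follows (NumPy/SciPy assumed; function name and simulated groups are my own illustration):

```python
import numpy as np
from scipy import stats

def wilks_lambda(samples):
    """One-way MANOVA via Wilks' lambda.

    `samples` is a list of (n_l, p) arrays, one per population.
    Returns Lambda = det(W)/det(B + W) and the p-value from the
    approximation -(N - 1 - (p + g)/2) ln(Lambda) ~ chi^2_{p(g-1)}.
    """
    g = len(samples)
    p = samples[0].shape[1]
    N = sum(X.shape[0] for X in samples)
    grand_mean = np.vstack(samples).mean(axis=0)
    B = np.zeros((p, p))
    W = np.zeros((p, p))
    for X in samples:
        n_l = X.shape[0]
        d = (X.mean(axis=0) - grand_mean)[:, None]
        B += n_l * d @ d.T                      # between SSCP
        W += (n_l - 1) * np.cov(X, rowvar=False)  # within SSCP
    lam = np.linalg.det(W) / np.linalg.det(B + W)
    chi2_stat = -(N - 1 - (p + g) / 2) * np.log(lam)
    return lam, stats.chi2.sf(chi2_stat, p * (g - 1))

# Three illustrative groups with nearly equal means
rng = np.random.default_rng(5)
groups = [rng.normal(size=(30, 2)), rng.normal(size=(35, 2)),
          rng.normal(loc=0.1, size=(40, 2))]
lam, pval = wilks_lambda(groups)
```

$\Lambda$ lies in $(0, 1]$; values near 1 indicate little between-group variation, small values indicate separated means.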
MANOVA: Other test statistics

Three other test statistics are also common for MANOVA:
- Lawley-Hotelling trace: $\mathrm{tr}(BW^{-1})$
- Pillai trace: $\mathrm{tr}(B(B + W)^{-1})$
- Roy's largest root: maximum eigenvalue of $W(B + W)^{-1}$

For $g = 2$, all four statistics reduce to Hotelling's $T^2$. For large samples, all four are nearly equivalent.
MANOVA: Confidence intervals

Simultaneous confidence intervals for the mean differences are obtained using the Bonferroni approach. Let $N = \sum_{l=1}^{g} n_l$. Then

$$\bar{x}_{ki} - \bar{x}_{li} \pm t_{N-g}\!\left(\frac{\alpha}{pg(g-1)}\right) \sqrt{\frac{w_{ii}}{N - g}\left(\frac{1}{n_k} + \frac{1}{n_l}\right)},$$

where $w_{ii}$ is the $i$th diagonal element of $W$, is a confidence interval for $\tau_{ki} - \tau_{li}$ with confidence level at least $1 - \alpha$.
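The Bonferroni intervals above can be computed for all $pg(g-1)/2$ component-wise pairwise differences at once. A sketch assuming NumPy/SciPy; the function name, the dictionary return format and the simulated groups are my own choices:

```python
import numpy as np
from scipy import stats

def manova_bonferroni_ci(samples, alpha=0.05):
    """Bonferroni simultaneous confidence intervals for the pairwise
    mean differences tau_ki - tau_li in a one-way MANOVA.

    `samples` is a list of (n_l, p) arrays.  Returns a dict mapping
    (k, l, i) -> (lower, upper) for group pairs k < l and components
    i, with overall confidence at least 1 - alpha.
    """
    g = len(samples)
    p = samples[0].shape[1]
    ns = [X.shape[0] for X in samples]
    N = sum(ns)
    # Within SSCP matrix W = sum_l (n_l - 1) S_l
    W = sum((n - 1) * np.atleast_2d(np.cov(X, rowvar=False))
            for n, X in zip(ns, samples))
    # Upper-tail t quantile at level alpha / (p g (g - 1))
    crit = stats.t.ppf(1 - alpha / (p * g * (g - 1)), N - g)
    means = [X.mean(axis=0) for X in samples]
    out = {}
    for k in range(g):
        for l in range(k + 1, g):
            for i in range(p):
                half = crit * np.sqrt(W[i, i] / (N - g)
                                      * (1 / ns[k] + 1 / ns[l]))
                d = means[k][i] - means[l][i]
                out[(k, l, i)] = (d - half, d + half)
    return out

# Two illustrative groups with clearly different means
rng = np.random.default_rng(6)
groups = [rng.normal(size=(30, 2)), rng.normal(loc=4.0, size=(30, 2))]
cis = manova_bonferroni_ci(groups)
```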
Equality of covariance matrices

As previously mentioned, the assumption of equal covariance matrices is quite strong, as there are $p(p+1)/2$ distinct elements in the covariance matrix. There are a few methods to investigate the assumption of equality:
- Visual investigation of the sample covariance matrices.
- Box's M test. Discussed in J&W. Good theoretical properties, but not as good in practice: some authors call this test super-sensitive and say that it isn't usable for $\alpha > 0.01$.
- Bartlett's test or Levene's test for equal variances, applied to the marginals.
Summary
- Paired comparisons
- Repeated measures
- Comparing mean vectors from two populations
- Comparing mean vectors from more than two populations: MANOVA
- Different statistics to choose from
- Equality of covariance matrices